Predictive model for estimating nitrogen density in MD2 pineapple crops from multispectral images and sensors integrated in an IoT platform

Authors:
Chaparro Mesa, Jorge Enrique
Resource type:
Doctoral thesis
Publication date:
2024
Institution:
Universidad de Antioquia
Repository:
Repositorio UdeA
Language:
eng
OAI Identifier:
oai:bibliotecadigital.udea.edu.co:10495/44223
Online access:
https://hdl.handle.net/10495/44223
Keywords:
Forecasting techniques
Image processing
Internet of things (IoT)
Multispectral Imaging
Unmanned Aerial Vehicle (UAV)
Sensors in the crop
http://aims.fao.org/aos/agrovoc/c_e4315b22
Rights
openAccess
License
https://creativecommons.org/licenses/by-nc-sa/4.0/
Description
Summary: Nitrogen is the most important nutritional element during the vegetative growth phase of the pineapple crop; however, the amount present in the soil is insufficient to meet plant demands. In this doctoral research, nine machine learning techniques were validated to estimate total nitrogen (TN) content in MD2 pineapple crops using data from multiple sources. These sources included multispectral images captured by an unmanned aerial vehicle (UAV) and in situ sensors that collected information on ecological and environmental factors, such as pH, temperature, solar radiation, relative humidity, soil moisture, and wind speed and direction. In addition, two kinds of plant information were collected: SPAD values, which indicate leaf chlorophyll content, and TN values, obtained from leaf tissue samples sent to a certified laboratory for analysis. To introduce nitrogen variability, a randomized complete block experimental design was implemented, applying five different treatments in five blocks, each with 12 replications, over a 6-month period in a pineapple crop located in the municipality of Tauramena, Casanare, Colombia. To address the inherent variability of the agricultural and environmental data, dimensionality was reduced using Principal Component Analysis (PCA). Regularization techniques were also applied, including cross-validation, feature selection, boosting methods, L1 (Lasso) and L2 (Ridge) regularization, as well as hyperparameter optimization. These strategies generated more robust and accurate models, among which regression, multilayer perceptron (MLP regressor) and extreme gradient boosting (XGBoost) algorithms stood out. On the first sampling date, XGBoost achieved an R^2 of 86.98%, the highest of the entire experiment. On the second date, MLP achieved an R^2 of 59.11%; on the third date, XGBoost achieved an R^2 of 68.00%; and on the last date, MLP achieved an R^2 of 69.4%.
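The R^2 figures above are reported as percentages. As a minimal sketch of how such a value can be computed, the snippet below applies the standard coefficient of determination, 1 - SS_res / SS_tot, to a handful of observed and predicted TN values. It assumes that a reported 86.98% corresponds to R^2 = 0.8698; the function name `r_squared` and all numbers in the example are hypothetical and are not data from the thesis.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared error of the model's predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: squared deviation of observations from their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical leaf TN values (% dry weight); NOT data from the thesis.
observed_tn = [1.42, 1.55, 1.61, 1.38, 1.70]
predicted_tn = [1.45, 1.50, 1.65, 1.40, 1.66]

r2 = r_squared(observed_tn, predicted_tn)
print(f"R^2 = {r2:.4f} ({100 * r2:.2f}%)")
```

A value of 1 means the predictions match the laboratory measurements exactly, while 0 means the model does no better than always predicting the mean TN.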
These results indicate that integrating data from multiple sources with machine learning models enables nitrogen (N) diagnostics in pineapple crops, especially in real-time applications, and they highlight the potential of machine learning models based on multisensor data fusion for other applications in agriculture. In the implementation of the machine learning models, the total nitrogen content obtained in the laboratory was the response variable. The predictor variables comprised sensor data, SPAD values, and statistical information derived from 16 vegetation indices calculated from the multispectral images. PCA was applied to reduce the dimensionality of the predictor dataset, after which nine regression algorithms were used to estimate leaf nitrogen content in each of the four study periods. This approach yielded close estimates of leaf nitrogen content, with the MLP (Multilayer Perceptron) and XGB (XGBoost) regression algorithms achieving the best performance metrics.
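The workflow described above (predictors reduced with PCA, then a regularized regressor evaluated with cross-validation) might be sketched as follows with scikit-learn. This is an illustration under stated assumptions, not the thesis implementation: the data are synthetic, Ridge (L2) stands in for the nine algorithms that were compared, and the 95% explained-variance threshold, the fold count, and all variable names are arbitrary choices made for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the predictor table (sensor readings, SPAD,
# vegetation-index statistics). These are NOT data from the thesis.
rng = np.random.default_rng(0)
n_samples, n_latent, n_features = 60, 4, 20
latent = rng.normal(size=(n_samples, n_latent))
loadings = rng.normal(size=(n_latent, n_features))
# Correlated predictors: a few latent factors plus measurement noise.
X = latent @ loadings + 0.1 * rng.normal(size=(n_samples, n_features))
# Hypothetical response standing in for laboratory leaf TN.
y = 1.5 + 0.2 * latent[:, 0] - 0.1 * latent[:, 1] + 0.02 * rng.normal(size=n_samples)

# Standardize, keep the components explaining 95% of the variance
# (assumed threshold), then fit an L2-regularized linear regressor.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=0.95, svd_solver="full"),
    Ridge(alpha=1.0),
)

# 5-fold cross-validated R^2; scaling and PCA are re-fitted inside each fold.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="r2")

model.fit(X, y)
n_kept = model.named_steps["pca"].n_components_
print(f"components kept: {n_kept} of {n_features}")
print(f"mean cross-validated R^2: {scores.mean():.3f}")
```

Wrapping the scaler and PCA in the pipeline keeps them from seeing the held-out fold, so the cross-validated R^2 is not inflated by information leaking from the test samples.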