Explainable XGBoost model for crop recommendation in Mizoram using hybrid random forest and particle swarm optimization



Authors

  • V D AMBETH KUMAR Mizoram University, Tanhril, Aizawl, Mizoram 796004, India
  • ZAITINKHUMA THIHLUM Mizoram University, Tanhril, Aizawl, Mizoram 796004, India
  • A K MOHANTY ICAR-Agricultural Technology Application Research Institute, Umiam, Meghalaya

https://doi.org/10.56093/ijas.v95i11.166636

Keywords:

Crop recommendation, Feature selection, Precision agriculture, RF-PSO, SHAP, XGBoost

Abstract

The study was carried out during 2023–2024 using multi-location data from three districts of Mizoram, North-east India (Lawngtlai, Serchhip, and Champhai), to support sustainable agriculture through an XGBoost-based crop recommendation system. A hybrid feature selection approach combining Random Forest (RF) and Particle Swarm Optimization (PSO) was proposed to identify key agronomic features. Class imbalance was addressed using the Synthetic Minority Oversampling Technique (SMOTE), and model performance was evaluated using accuracy, precision, recall, and F1-score. GridSearchCV was employed for hyperparameter optimization, with 5-fold cross-validation applied during training. The XGBoost classifier trained on the hybrid RF + PSO-selected features outperformed the classifiers trained on the full feature set and on the RF top-8 features, a gain attributed to the combined benefits of SMOTE-based class balancing and PSO-driven feature selection. SHAP analysis for the four major crops (rice, maize, moong, and potato) revealed that nitrogen (N) and potassium (K) were the most influential factors shaping crop predictions, followed by phosphorus (P) and soil pH, while rainfall had the least influence, owing to Mizoram’s consistently high and evenly distributed precipitation across its cultivation zones. The proposed approach enhances both accuracy and interpretability, providing a reliable decision-support framework for crop selection tailored to Mizoram’s diverse agro-climatic conditions.
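
The abstract does not spell out how RF and PSO are combined, so the following is only a minimal sketch of one plausible reading: RF feature importances bias the initial swarm, and a binary PSO then searches over feature masks scored by 5-fold cross-validated accuracy. Everything here is an assumption for illustration: the synthetic dataset, the function name `rf_pso_select`, the swarm size, the inertia and acceleration weights, and the decision-tree stand-in used as the fitness classifier (the study itself trains XGBoost and balances classes with SMOTE, both omitted to keep the example dependency-light).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier


def rf_pso_select(X, y, n_particles=8, n_iters=10, seed=0):
    """Return (boolean feature mask, CV accuracy) from an RF-seeded binary PSO.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]

    # Step 1: RF importances, rescaled to [0, 1], bias the initial swarm
    # toward features the forest already considers useful.
    rf = RandomForestClassifier(n_estimators=50, random_state=seed).fit(X, y)
    prior = rf.feature_importances_ / rf.feature_importances_.max()

    def fitness(mask):
        # Mean 5-fold CV accuracy of a stand-in classifier on the selected columns.
        if not mask.any():
            return 0.0
        clf = DecisionTreeClassifier(max_depth=5, random_state=seed)
        return float(cross_val_score(clf, X[:, mask], y, cv=5).mean())

    # Step 2: binary PSO. Positions are boolean masks, velocities are real-valued.
    pos = rng.random((n_particles, n_features)) < (0.25 + 0.5 * prior)
    vel = rng.normal(0.0, 1.0, (n_particles, n_features))
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p) for p in pos])
    gbest = pbest[pbest_fit.argmax()].copy()
    gbest_fit = float(pbest_fit.max())

    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration weights (assumed values)
    for _ in range(n_iters):
        r1 = rng.random(vel.shape)
        r2 = rng.random(vel.shape)
        cur = pos.astype(float)
        vel = (w * vel
               + c1 * r1 * (pbest.astype(float) - cur)
               + c2 * r2 * (gbest.astype(float) - cur))
        vel = np.clip(vel, -4.0, 4.0)
        # Sigmoid transfer: velocity sets the probability that a bit is switched on.
        pos = rng.random(vel.shape) < 1.0 / (1.0 + np.exp(-vel))

        fit = np.array([fitness(p) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        if pbest_fit.max() > gbest_fit:
            gbest = pbest[pbest_fit.argmax()].copy()
            gbest_fit = float(pbest_fit.max())
    return gbest, gbest_fit


# Synthetic stand-in for the soil and climate table (not the study's data).
X, y = make_classification(n_samples=300, n_features=12, n_informative=4,
                           n_redundant=2, n_classes=3,
                           n_clusters_per_class=1, random_state=0)
mask, score = rf_pso_select(X, y)
print("selected feature indices:", np.flatnonzero(mask))
print("CV accuracy on selected features:", round(score, 3))
```

In a fuller pipeline the selected columns would then be passed to the final classifier, with oversampling applied only inside the training folds so that synthetic samples never leak into validation data.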




Submitted

2025-05-14

Published

2025-11-19

Issue

Section

Articles

How to Cite

KUMAR, V. D. A., THIHLUM, Z., & MOHANTY, A. K. (2025). Explainable XGBoost model for crop recommendation in Mizoram using hybrid random forest and particle swarm optimization. The Indian Journal of Agricultural Sciences, 95(11), 1324–1331. https://doi.org/10.56093/ijas.v95i11.166636