Explainable XGBoost model for crop recommendation in Mizoram using hybrid random forest and particle swarm optimization
Keywords:
Crop recommendation, Feature selection, Precision agriculture, RF-PSO, SHAP, XGBoost
Abstract
The study was carried out during 2023–2024 using multi-location data from the three districts of Lawngtlai, Serchhip, and Champhai to support sustainable agriculture in Mizoram, North-east India, through an XGBoost-based crop recommendation system. A hybrid feature selection approach combining Random Forest (RF) and Particle Swarm Optimization (PSO) was proposed to identify key agronomic features. Class imbalance was addressed using the Synthetic Minority Oversampling Technique (SMOTE), and model performance was evaluated using accuracy, precision, recall, and F1-score. GridSearchCV was employed for hyperparameter optimization, with 5-fold cross-validation applied to validate model performance during training. The XGBoost classifier trained on the hybrid RF + PSO-optimized features outperformed the classifiers trained on the full feature set and on the RF top-8 features, owing to the combined benefits of SMOTE-based class balancing and PSO-driven feature selection. The SHAP analysis for the four major crops (rice, maize, moong, and potato) revealed that nitrogen (N) and potassium (K) were the most influential factors shaping crop prediction outcomes, followed by phosphorus (P) and soil pH, while rainfall had the least influence due to Mizoram’s consistently high and evenly distributed precipitation across its cultivation zones. The proposed approach enhances both accuracy and interpretability, providing a reliable decision-support framework for crop selection tailored to Mizoram’s diverse agro-climatic conditions.
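The PSO half of the hybrid feature selection can be illustrated with a minimal binary PSO sketch (sigmoid transfer function, following the Kennedy and Eberhart formulation). This is not the authors' implementation: the abstract does not state the fitness function, swarm size, or coefficients, so `SCORES`, `PENALTY`, `toy_fitness`, and all default parameters below are hypothetical placeholders. In practice the fitness would more plausibly be a cross-validated classifier score on the selected features, and the per-feature scores would come from a fitted Random Forest.

```python
import math
import random

# Hypothetical per-feature relevance scores (placeholders, NOT values from the study).
SCORES = [0.30, 0.25, 0.15, 0.12, 0.08, 0.05, 0.03, 0.02]
PENALTY = 0.06  # hypothetical cost charged for each selected feature


def toy_fitness(mask):
    """Stand-in fitness: reward relevant features, penalize subset size."""
    return sum(s - PENALTY for s, bit in zip(SCORES, mask) if bit)


def binary_pso(fitness, n_features, n_particles=20, n_iters=50,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize fitness(mask) over 0/1 feature masks with binary PSO."""
    rng = random.Random(seed)
    # Positions are 0/1 masks; velocities are real-valued.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)]
           for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    history = [gbest_fit]  # best fitness seen so far, per iteration

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-4.0, min(4.0, v))  # clamp to avoid saturation
                # Sigmoid maps velocity to the probability that the bit is 1.
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
        history.append(gbest_fit)
    return gbest, gbest_fit, history


best_mask, best_fit, history = binary_pso(toy_fitness, n_features=len(SCORES))
print("selected feature indices:", [i for i, b in enumerate(best_mask) if b])
print("best fitness:", round(best_fit, 4))
```

Keeping the fitness as a plain callable means the toy scoring can be swapped for a cross-validated model score without touching the search loop.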
License
Copyright (c) 2025 The Indian Journal of Agricultural Sciences

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
The copyright of the articles published in The Indian Journal of Agricultural Sciences is vested with the Indian Council of Agricultural Research, which reserves the right to enter into any agreement with any organization in India or abroad, for reprography, photocopying, storage and dissemination of information. The Council has no objection to using the material, provided the information is not being utilized for commercial purposes and wherever the information is being used, proper credit is given to ICAR.