Abstract

Ensuring food security is a critical challenge as the global population grows and climate instability increases. This study uses a machine learning approach to predict the yields of corn, soybean, and wheat under these unstable conditions. The models use six years of county-level data for five climate variables: temperature, precipitation, solar radiation, CO2, and soil moisture. An Extreme Gradient Boosting (XGBoost) model was trained for each crop, and the SHAP (SHapley Additive exPlanations) framework was used to interpret the models' predictions.

The corn and soybean models demonstrated high predictive accuracy, achieving R² values of 0.76 and 0.81, respectively, whereas the wheat model reached a lower R² of approximately 0.5. A SHAP analysis of the higher-performing corn and soybean models showed that temperature and precipitation during the summer growing season were the most influential predictors, a finding that aligns with established agronomic principles.
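For readers unfamiliar with the R² values reported above, the following is a minimal sketch of how a coefficient of determination is commonly computed, as 1 minus the ratio of residual to total sum of squares. The function name `r_squared` and the yield figures are hypothetical illustrations, not data or code from the thesis, and the abstract does not state which data split the reported scores were evaluated on.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: error of always predicting the observed mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when observed values are constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical county yields (illustrative only, not from the study).
observed = [150.0, 160.0, 170.0, 180.0]
predicted = [155.0, 158.0, 172.0, 176.0]
score = r_squared(observed, predicted)
```

Under this definition, a score of 1 means the predictions match the observations exactly, and a score of 0 means the model does no better than predicting the mean observed yield for every county.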

Predicting the yield of major food crops is an important step toward ensuring food security, especially under unstable climate conditions. The findings from this study can help governments formulate data-driven agricultural policies and give farmers data-backed insights to consider when creating their planting plans.

Advisor

Pasteur, Drew

Department

Statistical and Data Sciences

Disciplines

Agronomy and Crop Sciences | Data Science | Environmental Sciences

Keywords

Crop Yield Prediction, Food Security, Machine Learning, Remote Sensing, SHAP

Publication Date

2025

Degree Granted

Bachelor of Arts

Document Type

Senior Independent Study Thesis

© Copyright 2025 Yongchan Lee