Abstract:
Soil organic carbon (SOC) density is a critical soil attribute that sustains soil fertility, regulates the terrestrial carbon cycle, and influences regional food security and agricultural management decisions. Previous research on SOC density prediction has relied predominantly on natural environmental variables such as climate, topography, and vegetation, and has often overlooked anthropogenic disturbance, even in regions where human activities substantially alter soil properties. This study focused on the Huang-Huai-Hai Plain (HHH Plain), a key grain-producing area in China characterized by dense populations and intensive agricultural and urban activities. To improve the accuracy of SOC density prediction for cultivated land in this region, we combined 24 environmental covariates (covering climate, soil parent materials, topography, vegetation, land surface thermal conditions, and soil properties) with four human activity variables (population density, built-up volume, road network density, and hourly anthropogenic heat flux) in a random forest (RF) model. Five-fold cross-validation was used to tune the model parameters (finalized as n_estimators=100 and max_depth=4) and to evaluate performance with four metrics: mean absolute error (MAE), root mean square error (RMSE), coefficient of determination (R²), and Lin's concordance correlation coefficient (LCCC). The model using only environmental covariates (Model 1) explained just 35% of the spatial variability in cultivated land SOC density. In contrast, the integrated model (Model 6), which included the environmental covariates and all four human activity variables, substantially improved prediction performance: R² and LCCC increased by 37.14% and 19.67%, respectively, while MAE and RMSE decreased by 8.47% and 9.88%, respectively. The integrated model explained 48% of the variation in SOC density, highlighting the importance of human activity variables. Among all predictors, the highest daytime land surface temperature was the most influential environmental factor, while hourly anthropogenic heat flux was the most important human activity variable, accounting for 8.25% of variable importance and surpassing population density (2.90%), built-up volume (0.58%), and road network density (0.06%). These findings indicate that incorporating human activity factors, particularly hourly anthropogenic heat flux, is essential for improving SOC density prediction accuracy in the HHH Plain. The study provides a scientific basis for regional soil carbon management, sustainable agricultural development, and ecological protection, and offers a reference for similar studies in human-dominated agricultural regions worldwide.
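For readers who want to check the reported figures, the four evaluation metrics and the relative-change arithmetic can be sketched as follows. This is a minimal illustration using commonly used definitions, not the authors' code: the function names and the toy `observed`/`predicted` values are assumptions made for the example, and the study may define R² or LCCC with slightly different conventions (for instance, sample rather than population variance).

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def lccc(y_true, y_pred):
    """Lin's concordance correlation coefficient (population moments)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    var_t = sum((t - mean_t) ** 2 for t in y_true) / n
    var_p = sum((p - mean_p) ** 2 for p in y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred)) / n
    return 2.0 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)


def relative_change(baseline, new):
    """Percentage change of `new` relative to `baseline`."""
    return 100.0 * (new - baseline) / baseline


# Toy values for illustration only; these are NOT data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

print(mae(observed, predicted), rmse(observed, predicted))
print(r2(observed, predicted), lccc(observed, predicted))

# R² of 0.35 (Model 1) versus 0.48 (Model 6), as stated in the abstract:
print(round(relative_change(0.35, 0.48), 2))  # about 37.14
```

The last line reproduces the reported 37.14% gain in R² from the two explained-variance figures given above. The other percentage changes cannot be recomputed in the same way because the abstract does not state the underlying MAE, RMSE, or LCCC values.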