Improvement of soil organic carbon density prediction and mapping based on human activity factors and random forest model

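  • Summary: Soil organic carbon (SOC) density is an important soil attribute that affects food security and agricultural decision-making. Previous studies of SOC density prediction were mostly based on natural environmental variables. In areas of intensive agricultural production, however, human activities also affect soil properties to some degree. Taking the Huang-Huai-Hai Plain as the study area, this study selected 24 environmental variables and four human activity variables (population density, built-up volume, road network density, and anthropogenic heat flux) to examine the importance of human activities for predicting cultivated land SOC density. The results show that environmental covariates alone explained only 35% of the variation in cultivated land SOC density. After the human activity variables were added, the coefficient of determination (R²) and Lin's concordance correlation coefficient (LCCC) increased by 37.14% and 19.67%, respectively, and the mean absolute error (MAE) and root mean square error (RMSE) decreased by 8.47% and 9.88%, respectively, indicating better model performance and prediction accuracy. This suggests that human activities have an important influence on the spatial differentiation of regional SOC density in the Huang-Huai-Hai Plain. Variable importance analysis found that the highest daytime land surface temperature was the most important predictor, followed by the standard deviation of daytime land surface temperature. Among the human activity variables, anthropogenic heat flux was the most important predictor, with an importance share of 8.25%, followed by population density, built-up volume, and road network density.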

     

    Abstract: Soil organic carbon (SOC) density is a critical soil attribute that sustains soil fertility, regulates terrestrial carbon cycles, and strongly affects regional food security and agricultural management decisions. Previous research on SOC density prediction has relied predominantly on natural environmental variables such as climate, topography, and vegetation, and has often overlooked anthropogenic disturbance, especially in intensively human-impacted regions where human activities significantly alter soil properties. This study focused on the Huang-Huai-Hai Plain (HHH Plain), a key grain-producing area in China characterized by dense populations and intensive agricultural and urban activities. To improve the accuracy of SOC density prediction for cultivated land in this region, we combined 24 environmental covariates (covering climate, soil parent materials, topography, vegetation, land surface thermal conditions, and soil properties) with four human activity variables (population density, built-up volume, road network density, and hourly anthropogenic heat flux) in a random forest (RF) model. Five-fold cross-validation was used to tune model parameters (finalized as n_estimators=100 and max_depth=4) and to evaluate performance with four metrics: mean absolute error (MAE), root mean square error (RMSE), coefficient of determination (R²), and Lin's concordance correlation coefficient (LCCC). The model using only environmental covariates (Model 1) explained just 35% of the spatial variability in cultivated land SOC density. In contrast, the integrated model (Model 6), which incorporated the environmental covariates and all four human activity variables, performed substantially better: R² and LCCC increased by 37.14% and 19.67%, respectively, while MAE and RMSE decreased by 8.47% and 9.88%, respectively. This integrated model explained 48% of the variation in SOC density, highlighting the importance of human activity variables. Among all predictors, the highest daytime land surface temperature was the most influential environmental factor, while hourly anthropogenic heat flux was the most important human activity variable, contributing 8.25% of prediction importance, ahead of population density (2.90%), built-up volume (0.58%), and road network density (0.06%). These findings indicate that incorporating human activity factors, particularly hourly anthropogenic heat flux, improves SOC density prediction accuracy in the HHH Plain. The study provides a scientific basis for regional soil carbon management, sustainable agricultural development, and ecological protection, and offers a reference for similar studies in other human-dominated agricultural regions.
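
    The evaluation workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes scikit-learn (whose RandomForestRegressor happens to use the parameter names n_estimators and max_depth quoted above), and it uses synthetic random data in place of the real covariates and SOC density samples. The helper names lins_ccc and evaluate, the sample size, and the synthetic signal are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import KFold, cross_val_predict


def lins_ccc(y_true, y_pred):
    """Lin's concordance correlation coefficient (population-variance form)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    cov = np.mean((y_true - mean_t) * (y_pred - mean_p))
    return 2.0 * cov / (y_true.var() + y_pred.var() + (mean_t - mean_p) ** 2)


def evaluate(X, y, seed=0):
    """Five-fold cross-validated MAE, RMSE, R2, and LCCC for a random forest."""
    # Hyperparameters as reported in the abstract; everything else is default.
    model = RandomForestRegressor(n_estimators=100, max_depth=4, random_state=seed)
    cv = KFold(n_splits=5, shuffle=True, random_state=seed)
    # Each sample is predicted by a model that did not see it during fitting.
    pred = cross_val_predict(model, X, y, cv=cv)
    return {
        "MAE": mean_absolute_error(y, pred),
        "RMSE": float(np.sqrt(mean_squared_error(y, pred))),
        "R2": r2_score(y, pred),
        "LCCC": lins_ccc(y, pred),
    }


# Synthetic stand-in data (assumption): 24 "environmental" columns and
# 4 "human activity" columns, with a target that depends on both groups.
rng = np.random.default_rng(0)
n = 400
X_env = rng.normal(size=(n, 24))
X_human = rng.normal(size=(n, 4))
y = X_env[:, 0] + 0.5 * X_env[:, 1] + 1.5 * X_human[:, 3] + rng.normal(scale=0.5, size=n)

env_only = evaluate(X_env, y)                             # analogous to Model 1
env_plus_human = evaluate(np.hstack([X_env, X_human]), y)  # analogous to Model 6

for name in ("MAE", "RMSE", "R2", "LCCC"):
    change = 100.0 * (env_plus_human[name] - env_only[name]) / env_only[name]
    print(f"{name}: {env_only[name]:.3f} -> {env_plus_human[name]:.3f} ({change:+.2f}%)")
```

    The percentage changes printed by the loop are relative changes of the form (new - old) / old, which matches how the reported figures relate to each other: an R² rising from 0.35 to 0.48 is an increase of about 37.14%. The numbers produced on the synthetic data have no bearing on the study's results.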

     
