Comparison of machine learning for predicting and mapping soil organic carbon in cultivated land in a subtropical complex geomorphic region

    Abstract: Soil organic carbon (SOC) in cultivated land is a key indicator of soil quality and of ecosystem health. Machine learning (ML) models are increasingly used for digital SOC mapping, but the performance of different ML algorithms in predicting and mapping SOC, especially at high spatial resolution, has not been adequately compared. This study developed, evaluated, and compared Support Vector Machine (SVM), Random Forest (RF), and conventional Ordinary Kriging (OK) models for predicting and mapping SOC content in cultivated land in a region of complex terrain and landforms in northeastern Fujian Province. Remote sensing vegetation indices were derived from Sentinel-2 imagery at 10 m spatial resolution and, together with selected terrain and climate factors, served as environmental variables for the SVM and RF models. The RF model built on terrain, remote sensing vegetation, and climate factors performed best (root-mean-square error, RMSE = 2.004; r = 0.897), outperforming the OK model (RMSE = 4.571, r = 0.623) and explaining most of the spatial heterogeneity of SOC, while the SVM model had the lowest prediction accuracy (RMSE = 5.190, r = 0.431). The SOC maps from the three models showed broadly similar spatial patterns, with SOC higher in the west than in the east and higher in the north than in the south; the RF map resolved the spatial variation in the finest detail. SOC in cultivated land predicted with the RF model averaged 15.33 ± 4.07 g·kg⁻¹. Elevation and rainfall were the most important variables for the RF and SVM models, respectively, whereas the remote sensing vegetation indices were less important than elevation.
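The models in the abstract are ranked by two accuracy metrics, RMSE and the Pearson correlation coefficient r between observed and predicted SOC. The sketch below shows how these two metrics are conventionally computed; it is not the authors' code, and the function names and the four sample values are hypothetical placeholders, not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between paired values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Sums of cross-products and squared deviations about the means.
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(ss_o * ss_p)


# Hypothetical SOC values in g/kg, for illustration only (not from the study).
observed_soc = [10.0, 12.0, 14.0, 16.0]
predicted_soc = [11.0, 11.0, 15.0, 15.0]

sample_rmse = rmse(observed_soc, predicted_soc)
sample_r = pearson_r(observed_soc, predicted_soc)
print(f"RMSE = {sample_rmse:.3f}, r = {sample_r:.3f}")
```

A lower RMSE and a higher r indicate closer agreement with the observations, which is the sense in which the RF model is reported as the best of the three.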

     
