Construction of a Soil Erosion Model Based on Machine Learning Theory

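  • Summary: Soil erosion has long been a key and difficult issue among environmental problems. Because many factors influence soil erosion, traditional prediction models suffer from difficult data acquisition, narrow applicability, and long research cycles, so soil erosion cannot be predicted quickly or conveniently. The Support Vector Machine (SVM) is an important machine learning model with advantages such as non-linear mapping, self-learning ability, a global minimum, and insensitivity to changes in the input data, which give it an edge over traditional models when building correlation-based prediction models of soil erosion. This study used rainfall data from the Puyang River Hydrologic Station in Zhuji City, Zhejiang Province, and used the ArcGIS geographic information system to define the catchment upstream of the station as the study area. Rainfall and the geographic data of the study area (slope degree, slope length, soil information, and land use type) were taken as influencing factors and input into the SVM model to predict soil erosion within the catchment. Measured soil erosion data from the hydrologic station served as reference values against which the model outputs were checked, so that the optimal parameter set could be selected within the parameter range. The model with the optimal parameters was then tested with the influencing-factor data and the soil erosion data, and its prediction accuracy reached a maximum of 75%. Rainfall had the greatest influence on soil erosion: the prediction accuracy using rainfall alone was above 70%, whereas that of the remaining factors was about 3.5%. The result is a correlation-based soil erosion prediction model that can predict local soil erosion from hydrologic station rainfall data and geographic information with an accuracy of 75%.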

     

    Abstract: Soil erosion is one of the most important and most difficult environmental problems. Because many factors influence soil erosion, traditional prediction models are limited by difficult data collection, small-scale applicability, and long research cycles. These limitations make the prediction of soil erosion slow and inconvenient. The Support Vector Machine (SVM) is one of the most important machine learning models. Its advantages include non-linear mapping, self-learning ability, a global minimum, and insensitivity to changes in the input data. Compared with traditional prediction models, SVM is therefore well suited to building correlation-based soil erosion models. Rainfall data were obtained from the Puyang River Hydrologic Station in Zhuji City, Zhejiang Province. The study area, delineated in ArcMap, was the catchment upstream of the Puyang River Hydrologic Station. The rainfall data and geographic data (slope length, slope degree, soil type, and land use type) were input into the SVM model as influencing factors of soil erosion. After screening, a total of 4,018 rainfall records were used. The proportions of the different slope degrees and slope lengths in the study area were calculated, and land use types were classified using ERDAS. The processed input data were divided into five groups, four of which were used as training data and the fifth as test data. The training data were input into the SVM model and the results were compared across parameter settings. The parameters that gave the maximum accuracy rate of the predicted results were accepted as optimal. With the optimal parameters fixed, the soil erosion prediction model was tested using the influencing factors and soil erosion data of the test group. The highest accuracy rate of the model exceeded 75%. Among the influencing factors, rainfall had the greatest impact on soil erosion. The accuracy rate of the model reached 70% when only rainfall data were used, whereas it was about 3.5% for the other influencing factors used together. The final result was a correlation-based soil erosion prediction model with a prediction accuracy rate of over 75%. The model can predict soil erosion from rainfall data alone or from rainfall in combination with geographic data. Although the prediction accuracy of the model was relatively low under severe soil erosion, it provides a new, alternative method for predicting soil erosion at large scales and high frequencies.
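
The workflow described in the abstract (split the data into five groups, train on four, pick the parameters with the highest accuracy, then test on the held-out group) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the data are fabricated, the two features (rainfall and slope) and the two erosion classes are hypothetical, and a linear SVM trained by Pegasos-style subgradient descent stands in for the non-linear SVM used in the study. The names `make_synthetic_data`, `split_five_groups`, `train_linear_svm`, and `LAMBDAS` are invented for this example.

```python
import random

def make_synthetic_data(n=200, seed=0):
    """Fabricated stand-in data: [rainfall, slope] -> erosion class (+1 / -1)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        rain = rng.uniform(0.0, 100.0)   # hypothetical rainfall, mm
        slope = rng.uniform(0.0, 30.0)   # hypothetical slope, degrees
        # Assumed toy rule: the class is driven mostly by rainfall, plus noise.
        label = 1 if rain + 0.5 * slope + rng.gauss(0.0, 5.0) > 60.0 else -1
        # Scale both features to [0, 1] so one step size suits both.
        rows.append(([rain / 100.0, slope / 30.0], label))
    return rows

def split_five_groups(rows, seed=0):
    """Shuffle, cut into five equal groups: four for training, one for testing."""
    rng = random.Random(seed)
    rows = rows[:]
    rng.shuffle(rows)
    fold = len(rows) // 5
    return rows[fold:], rows[:fold]

def train_linear_svm(train, lam, epochs=50, seed=0):
    """Linear soft-margin SVM via Pegasos-style stochastic subgradient descent."""
    rng = random.Random(seed)
    w = [0.0] * len(train[0][0])
    b = 0.0
    t = 0
    for _ in range(epochs):
        order = train[:]
        rng.shuffle(order)
        for x, y in order:
            t += 1
            eta = 1.0 / (lam * t)                      # decaying step size
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [(1.0 - eta * lam) * wi for wi in w]   # regularization shrink
            if margin < 1.0:                           # hinge loss is active
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
                b += eta * y
    return w, b

def accuracy(model, rows):
    """Fraction of rows whose predicted class matches the label."""
    w, b = model
    hits = 0
    for x, y in rows:
        pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0.0 else -1
        hits += pred == y
    return hits / len(rows)

# Hypothetical candidate values for the regularization parameter.
LAMBDAS = [0.0001, 0.001, 0.01, 0.1]

rows = make_synthetic_data()
train, test = split_five_groups(rows)

# Keep the parameter that maximizes accuracy on the training groups.
best_lam, best_acc = None, -1.0
for lam in LAMBDAS:
    acc = accuracy(train_linear_svm(train, lam), train)
    if acc > best_acc:
        best_lam, best_acc = lam, acc

# Test the selected model on the held-out fifth group.
model = train_linear_svm(train, best_lam)
test_acc = accuracy(model, test)
print(f"lambda={best_lam}, train accuracy={best_acc:.2f}, test accuracy={test_acc:.2f}")
```

In practice a library implementation with a non-linear kernel would normally replace the hand-written trainer, and the parameter search would usually be scored on a separate validation split rather than on the training groups themselves.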

     
