LI Mengjie, ZHANG Manyin, CUI Lijuan, WANG Henian, GUO Ziliang, LI Wei, WEI Yuanyun, YANG Si, LONG Songyuan. Inversion of Hg content in reed leaf using continuous wavelet transformation and random forest[J]. Chinese Journal of Eco-Agriculture, 2018, 26(11): 1730-1738. DOI: 10.13930/j.cnki.cjea.180131

Inversion of Hg content in reed leaf using continuous wavelet transformation and random forest

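  • Summary: Heavy metal pollution of plants is one of the major eco-environmental problems facing the world today, and hyperspectral technology makes rapid, large-area monitoring of heavy metal content in vegetation possible. Taking the heavy metal mercury (Hg) and the wetland plant reed as study objects, this study combined continuous wavelet transform (CWT) with the random forest (RF) algorithm to build an inversion model of total Hg content in reed leaves. The aim was to find a more accurate inversion model of plant Hg pollution, so that hyperspectral models can in future estimate Hg pollution in wetland plants rapidly and non-destructively and provide methodological support for monitoring wetland ecosystems. The results showed that the bands sensitive to leaf total Hg content were mainly distributed in the visible bands of 419-522 nm, 664-695 nm and 724-876 nm and in the near-infrared bands of 1 450-1 558 nm and 1 972-2 500 nm. After CWT, the absolute value of the correlation coefficient between the wavelet coefficients and leaf total Hg content increased by 0.04-0.18, the goodness of fit (R²) of the resulting inversion models increased by 0.107-0.177, and model accuracy (RMSE) improved by 0.008-0.013. The RF model built on wavelet-transformed continuum-removed spectra (CR-CWT) gave the best inversion accuracy and fit for leaf total Hg content (R²=0.713, RMSE=0.127). In addition, when soil total Hg content was about 20 mg·kg⁻¹, the RF model built on CR-CWT data retrieved leaf total Hg content more accurately and reliably (R²=0.825, RMSE=0.051). Using the RF algorithm to invert heavy metal content in vegetation is therefore feasible in practice, and inversion models built in combination with CWT offer greater reference value for monitoring heavy metal content in vegetation and have broad application prospects.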

     

    Abstract: Heavy metal pollution of plants is one of the most important eco-environmental problems in the world, and rapid, large-scale monitoring of heavy metal content in plants remains an international challenge and a key research topic. With its high spectral resolution, many bands and abundant data, hyperspectral technology could offer rapid and accurate determination of heavy metal pollution in plants. It can detect the absorption, reflection and transmission characteristics of the spectral bands corresponding to phytochemical components and quantify weak spectral differences, allowing large-scale assessment of plant growth and health. However, most studies construct sensitive spectral parameters (e.g., vegetation indices) through simple spectral transformations and continuum removal. Most inversion models are univariate regression, stepwise multiple regression, principal component regression or other empirical or semi-empirical models. Artificial neural networks and support vector machines have also been used, but these models require larger training sets and are prone to over-fitting. This study therefore combined the continuous wavelet transform (CWT) with the random forest (RF) algorithm to seek a more accurate model for inverting heavy metal pollution in plants. CWT characterizes spectral signals more clearly, while RF has strong fitting ability, short iteration time and high computational efficiency on large datasets such as hyperspectral data, which makes it well suited to model construction. The heavy metal mercury (Hg) and the wetland plant reed (Phragmites communis) were used to test the effectiveness of CWT combined with RF. CWT was used to decompose, at different scales, the original spectral reflectance (R), the first-derivative reflectance (FD) and the continuum-removed reflectance (CR). Correlation analysis against leaf total Hg content was then used to determine the sensitive bands of R, FD and CR and of their wavelet-transformed counterparts (R-CWT, FD-CWT and CR-CWT). The sensitive bands and the RF algorithm were then used to establish the inversion model of reed leaf total Hg content. The results showed that the bands sensitive to leaf total Hg content were mainly distributed in the visible regions of 419-522 nm, 664-695 nm and 724-876 nm and in the near-infrared regions of 1 450-1 558 nm and 1 972-2 500 nm. After CWT, the absolute value of the correlation coefficient between the wavelet coefficients and leaf total Hg content increased by 0.04-0.18, the goodness of fit (R²) of the inversion models increased by 0.107-0.177, and their accuracy, measured by RMSE, improved by 0.008-0.013. The RF model that used continuum-removed reflectance after wavelet transformation (CR-CWT) had the best inversion precision and fit (R²=0.713, RMSE=0.127). In addition, the RF model with CR-CWT retrieved leaf total Hg content more accurately and reliably when soil total Hg content was about 20 mg·kg⁻¹ (R²=0.825, RMSE=0.051). It is therefore feasible to use the RF algorithm to retrieve heavy metal content in plants, and inversion models constructed with CWT have greater reference value for monitoring heavy metal content in plants. The approach has broad application prospects and provides methodological support for non-destructive and rapid monitoring of heavy metal pollution in ecosystems.
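The band-selection step described in the abstract (wavelet decomposition at several scales, followed by correlation against leaf Hg content) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not name the mother wavelet or the scales, so the Mexican hat wavelet, the scales 2, 3 and 4, the synthetic spectra and all function names here are assumptions made purely for illustration. The RF regression step is not shown.

```python
import math
import random


def mexican_hat(t):
    """Mexican hat (Ricker) mother wavelet, without its normalisation constant."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)


def cwt_at_scale(signal, scale):
    """Wavelet coefficients of one spectrum at one scale, by direct convolution."""
    n = len(signal)
    half = int(5 * scale)  # the wavelet is close to zero beyond 5 scale widths
    kernel = [mexican_hat(k / scale) / math.sqrt(scale)
              for k in range(-half, half + 1)]
    coeffs = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < n:  # values outside the spectrum are treated as zero
                acc += signal[idx] * w
        coeffs.append(acc)
    return coeffs


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def rank_features(spectra, target, scales, top_k):
    """Rank (scale, band) wavelet features by |r| with the target variable."""
    ranked = []
    for s in scales:
        coeffs = [cwt_at_scale(sp, s) for sp in spectra]  # samples x bands
        for b in range(len(spectra[0])):
            r = pearson([c[b] for c in coeffs], target)
            ranked.append((abs(r), s, b))
    ranked.sort(reverse=True)
    return ranked[:top_k]


# Synthetic data (assumption): an absorption dip at FEATURE_BAND deepens with
# Hg content, on top of a sloping baseline with a random per-sample offset.
random.seed(0)
N_SAMPLES, N_BANDS, FEATURE_BAND = 30, 60, 30
hg = [random.uniform(0.0, 1.0) for _ in range(N_SAMPLES)]
spectra = []
for h in hg:
    offset = random.uniform(0.0, 0.2)
    spectra.append([
        0.5 + 0.002 * b + offset
        - 0.1 * h * math.exp(-0.5 * ((b - FEATURE_BAND) / 3.0) ** 2)
        + random.gauss(0.0, 0.002)
        for b in range(N_BANDS)
    ])

top = rank_features(spectra, hg, scales=[2, 3, 4], top_k=5)
best_abs_r, best_scale, best_band = top[0]
raw_r = pearson([sp[FEATURE_BAND] for sp in spectra], hg)
print("raw |r| at feature band:", round(abs(raw_r), 3))
print("best wavelet feature: scale", best_scale, "band", best_band,
      "|r|", round(best_abs_r, 3))
```

In this toy setting the wavelet coefficients correlate more strongly with the target than the raw reflectance does, because the zero-mean wavelet suppresses the baseline offset. That is consistent in spirit with the increase in correlation reported above, but it says nothing about the magnitude observed in the study. The selected features would then be passed to a random forest regressor (for example scikit-learn's `RandomForestRegressor`) and scored with R² and RMSE.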

     
