
基于FTIR技术和稀疏线性判别分析的秦艽种类鉴别 (indexed in SCI-EXPANDED and EI)

Identification of Gentiana Macrophylla by FTIR Technology and Sparse Linear Discriminant Analysis

Document type: Journal article

Chinese title: 基于FTIR技术和稀疏线性判别分析的秦艽种类鉴别

English title: Identification of Gentiana Macrophylla by FTIR Technology and Sparse Linear Discriminant Analysis

Authors: Li Sihai (李四海)[1]; Yu Xiaohui (余晓晖)[2]; Zhao Lei (赵磊)[2]; Jin Ling (晋玲)[2]

First author: Li Sihai (李四海)

Corresponding author: Jin, L[1]

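Affiliations: [1] School of Information Engineering, Gansu University of Chinese Medicine; [2] Provincial Key Laboratory of Chemistry and Quality of Chinese (Tibetan) Medicines, Gansu Provincial Universities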

First affiliation: School of Information Engineering (Educational Technology Center), Gansu University of Chinese Medicine

Corresponding affiliation: [1] Coll Gansu Prov, Key Lab Chem & Qual Tradit Chinese Med, Lanzhou 730000, Gansu, Peoples R China.

Year: 2018

Volume: 38

Issue: 8

Pages: 2390

Journal (Chinese): 光谱学与光谱分析

Journal (English): Spectroscopy and Spectral Analysis

Indexed in: CSTPCD; EI (accession no. 20204309378567); Scopus; WOS: SCI-EXPANDED (accession no. WOS:000443104600012); PKU Core Journals (2017 edition); CSCD (CSCD2017_2018); PubMed

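Funding: Supported by the sub-project "National Production Regionalization of Gentiana macrophylla" (20603020212) of the central-government major budget adjustment project "Capacity Building for Sustainable Utilization of Valuable Chinese Medicinal Resources"; the National Natural Science Foundation of China (81660577); the Natural Science Foundation of Gansu Province (1506RJZA046); and the Gansu Provincial Administration of Traditional Chinese Medicine (GZK-2013-44).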

Language: Chinese

Chinese keywords: 秦艽; 傅里叶变换红外光谱; 正则化; 稀疏线性判别分析; 变量选择

English keywords: Gentiana macrophylla; FTIR; Regularization; Sparse linear discriminant analysis; Variable selection

Abstract: Fourier transform infrared (FTIR) spectra usually contain a large number of wavelength variables, so their qualitative analysis requires a robust and interpretable classification model. Sparse linear discriminant analysis (SLDA) is a relatively new and effective machine learning algorithm, commonly used for variable selection and discriminant analysis of high-dimensional, small-sample data. By introducing a regularization term into linear discriminant analysis, SLDA performs classifier training and variable selection simultaneously, and the sparsity of the loading coefficients in the different discriminant directions makes the model more interpretable. A total of 94 Qinjiao (秦艽) samples were collected from different producing areas in Gansu: 30 of Gentiana straminea Maxim, 28 of Gentiana officinalis, and 36 of Gentiana macrophylla Pall. FTIR spectra of all samples were acquired. Seventy samples formed the training set and the remaining 24 the test set. An SLDA model was built on the training set, and a grid search over the number of non-zero loading coefficients in each of the two discriminant directions yielded the optimal parameters. The resulting model classified the test samples with 100% accuracy, giving rapid and accurate identification of the three Gentiana species. The results show that, compared with PLS-DA, the SLDA model has advantages in classification accuracy, sparsity, and interpretability, making it a novel and effective method for qualitative spectral analysis.
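The workflow described in the abstract (a 70/24 split of 94 samples in three classes, followed by a grid search over the number of non-zero loadings in each of the two discriminant directions) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic, and a ridge-regularized LDA whose loadings are hard-thresholded to the k largest entries stands in for the regularized SLDA fit. The names lda_directions, fit, predict and accuracy, the ridge value, the 200-variable dimension, and the grid values are all hypothetical.

```python
import itertools

import numpy as np
from scipy.linalg import eigh
from sklearn.model_selection import StratifiedKFold, train_test_split


def lda_directions(X, y, ridge=1e-2):
    """Overall mean and the leading (n_classes - 1) ridge-regularized LDA directions."""
    classes = np.unique(y)
    mean = X.mean(axis=0)
    p = X.shape[1]
    Sw = np.zeros((p, p))  # within-class scatter
    Sb = np.zeros((p, p))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mean, mc - mean)
    # The ridge term keeps Sw positive definite when variables outnumber samples.
    Sw = Sw / len(X) + ridge * np.eye(p)
    _, vecs = eigh(Sb / len(X), Sw)  # eigenvalues come back in ascending order
    return mean, vecs[:, ::-1][:, : len(classes) - 1]


def fit(X, y, ks):
    """Keep only the ks[j] largest-magnitude loadings in direction j (assumed stand-in for SLDA)."""
    mean, W = lda_directions(X, y)
    Ws = np.zeros_like(W)
    for j, k in enumerate(ks):
        keep = np.argsort(np.abs(W[:, j]))[-k:]
        Ws[keep, j] = W[keep, j]
    classes = np.unique(y)
    centroids = np.array([((X[y == c] - mean) @ Ws).mean(axis=0) for c in classes])
    return mean, Ws, classes, centroids


def predict(model, X):
    """Assign each sample to the nearest class centroid in the sparse discriminant space."""
    mean, Ws, classes, centroids = model
    Z = (X - mean) @ Ws
    dist = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dist.argmin(axis=1)]


def accuracy(model, X, y):
    return float(np.mean(predict(model, X) == y))


# Synthetic stand-in for the spectra: 94 samples in three classes of 30, 28 and 36.
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], [30, 28, 36])
X = rng.normal(size=(94, 200))
X[y == 1, :5] += 2.0    # a few informative variables for class 1
X[y == 2, 5:10] += 2.0  # and for class 2

# 70 training samples, the remaining 24 held out as the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=70, stratify=y, random_state=0
)

# Grid search over the number of non-zero loadings in the two directions,
# scored by 5-fold cross-validation on the training set only.
grid = [5, 10, 20, 50]
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
folds = list(cv.split(X_train, y_train))


def cv_accuracy(ks):
    return np.mean(
        [accuracy(fit(X_train[tr], y_train[tr], ks), X_train[va], y_train[va])
         for tr, va in folds]
    )


best_ks = max(itertools.product(grid, grid), key=cv_accuracy)
model = fit(X_train, y_train, best_ks)
test_acc = accuracy(model, X_test, y_test)
print("non-zero loadings per direction:", best_ks, "test accuracy:", test_acc)
```

Selecting the sparsity levels by cross-validation inside the training set keeps the 24 test samples untouched until the final evaluation. A penalized SLDA solver would choose the non-zero loadings during fitting rather than by thresholding afterwards, so the variables selected here should not be read as what SLDA itself would select.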
