4.7 Article

Comparison of a novel PLS1-DA, traditional PLS2-DA and assigned PLS1-DA for classification by molecular spectroscopy

Publisher

ELSEVIER
DOI: 10.1016/j.chemolab.2020.104225

Keywords

Molecular spectroscopy; PCA-EuD-PLS1-DA; PLS2-DA; Assigned PLS1-DA; Classification

Funding

  1. Internationalization Training and Promotion Project of Graduate Students in China Agricultural University

The study introduces a novel PLS-DA strategy and shows that it performs better on multi-class problems, particularly on NIR and Raman spectral datasets. The method achieves high prediction accuracy by selecting representative samples and building qualitative models.
Partial least squares discriminant analysis (PLS-DA) has been highly successful in many research areas, such as quality evaluation of agricultural products and drug analysis. However, PLS-DA takes various forms, which fall mainly into two strategies, PLS1-DA and PLS2-DA, the latter of which is overused. In this work, we propose a novel strategy, principal component analysis Euclidean distance PLS1-DA (PCA-EuD-PLS1-DA), which both selects representative samples for modeling and builds qualitative models. Outliers were detected using leverage. The traditional PLS-DA approaches were compared with the proposed method on six datasets. EuD-PLS1-DA with PCA-partitioned subsets outperformed both PLS2-DA and assigned PLS1-DA on multi-class problems. In particular, all EuD-PLS1-DA models discriminated the tablet classes in the NIR data well and also discriminated the tablet classes in the Raman data, with prediction accuracies above 98% and 80%, respectively. The corresponding prediction accuracies for PLS2-DA were above 56% and 26%, respectively. All RMSECV values for EuD-PLS1-DA were smaller than those of the other classifiers, as were all RMSEP values, which indicates that EuD-PLS1-DA fits the data better than the other methods. When a dataset contains only two classes, all PLS-DA models perform very similarly. Before the results of the PLS-DA variants are compared, the shortcomings of the other two methods are described systematically. The sample selection algorithms K-S, SPXY and PCA were compared by the percentage of samples chosen in each class. These ratios fall mainly in the range of 70-90%, and the ratios of each class's samples selected for modeling are highly similar among K-S, SPXY and PCA. Most importantly, the proposed method can be implemented in most statistical software.
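
The abstract names K-S among the sample selection algorithms but does not describe its steps. As an illustration only, the following is a minimal sketch of the Kennard-Stone rule as it is commonly stated, assuming that "K-S" refers to Kennard-Stone: seed with the two samples that are farthest apart in Euclidean distance, then repeatedly add the sample whose nearest already-selected neighbour is farthest away. The function name `kennard_stone` and the toy `scores` data are hypothetical and are not taken from the paper.

```python
import math


def kennard_stone(samples, n_select):
    """Return indices of n_select samples chosen by the Kennard-Stone rule.

    Illustrative sketch only; not the authors' implementation.
    """
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")

    # Pairwise Euclidean distances between all samples.
    dist = [[math.dist(a, b) for b in samples] for a in samples]

    # Seed the selection with the two samples that are farthest apart.
    i0, j0 = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [i0, j0]
    remaining = [k for k in range(n) if k not in (i0, j0)]

    while len(selected) < n_select:
        # Add the candidate whose nearest selected sample is farthest away.
        best = max(remaining, key=lambda k: min(dist[k][s] for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data, e.g. two score values per sample.
scores = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (0.0, 1.0)]
chosen = kennard_stone(scores, 4)
print(sorted(chosen))
```

On this toy set the four corner points are selected, while the near-duplicate of the first sample and the central point are left out. This is the sense in which such a rule favours samples that span the data space.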
