4.6 Article

Determination of the Geographical Origin of Coffee Beans Using Terahertz Spectroscopy Combined With Machine Learning Methods

Journal

FRONTIERS IN NUTRITION
Volume 8, Issue -, Pages -

Publisher

FRONTIERS MEDIA SA
DOI: 10.3389/fnut.2021.680627

Keywords

THz spectroscopy; machine learning; classification; geographical origin; coffee beans

Funding

  1. National Natural Science Foundation of China [81871396, 81971657, 81671727]
  2. Tianjin Natural Science Foundation [19JCYBJC29100, 19JCTPJC42200]
  3. National Science and Technology Support Program of China [2015BAD19B03]


Geographical origin can greatly affect the quality, taste, and commercial value of coffee. This study explores terahertz spectroscopy combined with machine learning methods for classifying the geographical origin of coffee beans. The results show that the CNN method achieves excellent classification and that variable selection is crucial for building an accurate and robust discrimination model.
Different geographical origins can lead to great variance in coffee quality, taste, and commercial value. Hence, verifying the authenticity of the origin of coffee beans is of great importance for producers and consumers worldwide. In this study, terahertz (THz) spectroscopy combined with machine learning methods was investigated as a fast and non-destructive way to classify the geographical origin of coffee beans. Three popular machine learning methods, convolutional neural network (CNN), linear discriminant analysis (LDA), and support vector machine (SVM), were compared to obtain the best model. The curse of dimensionality makes it difficult for some classification methods to train effective models. Thus, principal component analysis (PCA) and a genetic algorithm (GA) were applied to create a smaller set of features for LDA and SVM. The first nine principal components (PCs) extracted by PCA, with a cumulative contribution rate of 99.9%, and 21 variables selected by GA were used as the inputs of the LDA and SVM models. The results demonstrate that excellent classification (90% accuracy on the prediction set) could be achieved using the CNN method. The results also indicate that variable selection is an important step in creating an accurate and robust discrimination model. The performance of the LDA and SVM algorithms could be improved with spectral features extracted by PCA and GA. GA-SVM achieved 75% accuracy on the prediction set, while SVM and PCA-SVM achieved 50% and 65% accuracy, respectively. These results demonstrate that THz spectroscopy, together with machine learning methods, is an effective approach for classifying the geographical origins of coffee beans. The approach also points to the potential of deep learning for authenticating agricultural products and broadens the application of THz spectroscopy.
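
The abstract reports keeping the first nine PCs because their cumulative contribution rate reached 99.9%. A minimal sketch of that selection rule follows, assuming the per-component variances are already available. The function name `n_components_for_threshold` and the example variances are hypothetical illustrations, not the authors' code or data.

```python
from itertools import accumulate


def n_components_for_threshold(variances, threshold=0.999):
    """Return the smallest k such that the k largest variances account
    for at least `threshold` of the total variance."""
    if not variances:
        raise ValueError("variances must not be empty")
    # Principal components are ordered by decreasing explained variance.
    ordered = sorted(variances, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("total variance must be positive")
    # Walk the running sum until the cumulative contribution rate is reached.
    for k, cumulative in enumerate(accumulate(ordered), start=1):
        if cumulative / total >= threshold:
            return k
    return len(ordered)


# Hypothetical variances for illustration only (not values from the paper).
example_variances = [50.0, 30.0, 15.0, 4.0, 1.0]
k = n_components_for_threshold(example_variances, threshold=0.9)
```

With these illustrative numbers the cumulative rates are 0.50, 0.80, 0.95, 0.99, and 1.00, so a 0.9 threshold keeps three components. The retained component scores would then serve as the reduced feature set for a classifier such as LDA or SVM.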
