4.7 Article

Sparse reproducible machine learning for near infrared hyperspectral imaging: Estimating the tetrahydrocannabinolic acid concentration in Cannabis sativa L.

Journal

INDUSTRIAL CROPS AND PRODUCTS
Volume 192, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.indcrop.2022.116137

Keywords

Cannabis; THCA; NIR; HSI; Regression; PLS; CCA; Sparsity; Reproducibility

This paper presents a reliable real-time technique using proximal near infrared hyperspectral imaging to measure the concentration of tetrahydrocannabinolic acid (THCA) in hemp. The study compares different regression algorithms and finds that regularized partial least squares (RPLS) achieves the best performance. A variation of RPLS with feature selection (PLSFS) is introduced to improve model interpretability and reproducibility.
The concentrations of cannabinoids in hemp are still tightly controlled in New Zealand and around the world, with crops that exceed the legal limit being prohibited from cultivation. Thus, there is a need for high-throughput methods to accurately assess cannabinoid content and to evaluate compliance and harvest readiness in the field. This paper reports a reliable real-time technique to measure the tetrahydrocannabinolic acid (THCA) concentration of Cannabis sativa L. using proximal near infrared (NIR) hyperspectral imaging (HSI). At implementation, scalability can be achieved by introducing sparsity to the model. Sparsity also improves model interpretability and makes the model more robust when fitting noisy HSI data. Model reproducibility was used to assess the quality of the model fit. This work uses linear regression to map NIR HSI images to THCA concentrations measured with high performance liquid chromatography (HPLC). Four regression algorithms that cover different regression strategies were compared: Canonical Correlation Analysis (CCA), Ensemble CCA (EnCCA), Partial Least Squares Regression (PLS), and Regularized PLS (RPLS). The RPLS algorithm achieved the best performance but uses all spectral wavelengths for regression. Thus, a variation of RPLS with feature selection (PLSFS) was introduced to improve model interpretability. The proposed PLSFS method leads to reproducible models while maintaining small feature sets. To our knowledge, this publication reports the first research that has used HSI to estimate THCA concentration.
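
To make the idea of a sparse linear spectral model concrete, the following is a minimal sketch. It is not the authors' RPLS or PLSFS algorithm, whose details are not given in the abstract. It fits a plain single-response PLS regression (NIPALS) on synthetic "spectra", keeps the few bands with the largest absolute coefficients, and refits on those bands only. The data, the magnitude-based selection rule, the band indices, and the helper names `pls1_fit`, `predict`, and `r2_score` are all assumptions made for illustration. Real NIR spectra are strongly collinear, which this toy data does not reproduce.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit single-response PLS with NIPALS; return (coef, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean            # centered data, deflated below
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        w = w / np.linalg.norm(w)              # weight vector
        t = Xr @ w                             # scores
        tt = t @ t
        p = Xr.T @ t / tt                      # X loadings
        c = (yr @ t) / tt                      # y loading
        Xr = Xr - np.outer(t, p)               # deflate X
        yr = yr - c * t                        # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)     # coefficients per band
    return coef, x_mean, y_mean

def predict(X, coef, x_mean, y_mean):
    return (X - x_mean) @ coef + y_mean

def r2_score(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in for (mean spectrum per sample, HPLC reference value).
# Assumption: only bands 10 and 30 carry signal.
rng = np.random.default_rng(0)
n_samples, n_bands = 120, 50
X = rng.normal(size=(n_samples, n_bands))
true_coef = np.zeros(n_bands)
true_coef[[10, 30]] = [2.0, -1.5]
y = X @ true_coef + 0.05 * rng.normal(size=n_samples)

X_train, X_test = X[:90], X[90:]
y_train, y_test = y[:90], y[90:]

# Step 1: dense model that uses every band.
coef_full, _, _ = pls1_fit(X_train, y_train, n_components=3)

# Step 2 (illustrative selection rule): keep the k largest |coefficients|.
k = 5
selected = np.sort(np.argsort(np.abs(coef_full))[-k:])

# Step 3: refit on the selected bands only and score on held-out samples.
coef_s, xm, ym = pls1_fit(X_train[:, selected], y_train, n_components=2)
r2_sparse = r2_score(y_test, predict(X_test[:, selected], coef_s, xm, ym))

print("selected bands:", selected)
print("held-out R^2 of sparse model:", round(float(r2_sparse), 3))
```

Repeating the selection over resampled training sets and checking how often the same bands are chosen would be one simple way to probe the kind of reproducibility the abstract refers to. Whether the paper measures it that way is not stated here.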
