4.6 Article

Multiclass classification of environmental chemical stimuli from unbalanced plant electrophysiological data

期刊

PLOS ONE
卷 18, 期 5, 页码 -

出版社

PUBLIC LIBRARY SCIENCE
DOI: 10.1371/journal.pone.0285321

关键词

-

向作者/读者索取更多资源

In this paper, a statistical analysis pipeline is proposed to deal with a multiclass environmental stimuli classification problem using imbalanced plant electrophysiological data. Fifteen statistical features extracted from the plant electrical signals are used to classify three different environmental chemical stimuli and the performance of eight different classification algorithms is compared. The findings have potential real-world applications in precision agriculture for exploring multiclass classification problems with highly imbalanced datasets and advance existing studies on environmental pollution level monitoring using plant electrophysiological data.
Plant electrophysiological response contains useful signature of its environment and health which can be utilized using suitable statistical analysis for developing an inverse model to classify the stimulus applied to the plant. In this paper, we have presented a statistical analysis pipeline to tackle a multiclass environmental stimuli classification problem with unbalanced plant electrophysiological data. The objective here is to classify three different environmental chemical stimuli, using fifteen statistical features, extracted from the plant electrical signals and compare the performance of eight different classification algorithms. A comparison using reduced dimensional projection of the high dimensional features via principal component analysis (PCA) has also been presented. Since the experimental data is highly unbalanced due to varying length of the experiments, we employ a random under-sampling approach for the two majority classes to create an ensemble of confusion matrices to compare the classification performances. Along with this, three other multi-classification performance metrics commonly used for unbalanced data viz. balanced accuracy, F-1-score and Matthews correlation coefficient have also been analyzed. From the stacked confusion matrices and the derived performance metrics, we choose the best feature-classifier setting in terms of the classification performances carried out in the original high dimensional vs. the reduced feature space, for this highly unbalanced multiclass problem of plant signal classification due to different chemical stress. Difference in the classification performances in the high vs. reduced dimensions are also quantified using the multivariate analysis of variance (MANOVA) hypothesis testing. Our findings have potential real-world applications in precision agriculture for exploring multiclass classification problems with highly unbalanced datasets, employing a combination of existing machine learning algorithms. This work also advances existing studies on environmental pollution level monitoring using plant electrophysiological data.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.6
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据