Journal
IEEE TRANSACTIONS ON SIGNAL PROCESSING
Volume 69, Pages 2625-2638
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TSP.2021.3075150
Keywords
Covariance matrices; Data models; Training data; Standards; Estimation; Training; Analytical models; Robust estimation; covariance matrices; linear discriminant analysis
Funding
- Hong Kong RGC [16202918, C6012-20G]
In this study, a robust version of linear discriminant analysis (LDA) classifiers is proposed to address potential spurious or mislabeled observations in training data. The robust LDA relies on a robust estimate of the covariance matrix of the training data. Results show that when the training data are free of outliers, the use of robust estimators does not degrade the performance of LDA, and that it improves classification accuracy when the data are corrupted by outliers or random noise.
In standard discriminant analysis, data are commonly assumed to follow a Gaussian distribution, a condition which is often violated in practice. In this work, to account for potential spurious or mislabeled observations in the training data, we consider a robust version of regularized linear discriminant analysis (LDA) classifiers. Essential to such a robust version of LDA is the design of a robust discriminant rule which relies on a robust estimate of the covariance matrix of the training data. We propose to use a regularized version of M-estimators of covariance matrices belonging to Maronna's class of estimators. In the regime where both the number of variables and the number of training samples are large, building upon recent results from random matrix theory, we show that when the training data are free from outliers, each classifier within the class of proposed robust classifiers is asymptotically equivalent to traditional, non-robust classifiers. Rather surprisingly, this entails that the use of robust estimators does not degrade the performance of LDA, up to a transformation of the regularization parameter that we precisely characterize. We also demonstrate that the proposed robust classifiers lead to better classification accuracy when the data are corrupted by outliers or random noise. Furthermore, through simulations on the popular MNIST data set and considering different classification tasks, we show that the worse the classification error of traditional methods, the larger the gain to be expected from the use of our proposed method.
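The pipeline the abstract describes — a robust, regularized covariance estimate plugged into the standard LDA discriminant rule — can be sketched as follows. This is a minimal illustration, not the paper's exact estimator: a Tyler-type weight function u(t) = p/t with diagonal shrinkage stands in for the regularized Maronna M-estimator, and the names `regularized_m_estimator`, `lda_discriminant`, `rho`, and `n_iter` are choices made here for the sketch.

```python
import numpy as np

def regularized_m_estimator(X, rho, n_iter=50):
    """Shrinkage fixed-point iteration for a robust covariance estimate.

    Illustrative stand-in for a regularized M-estimator in Maronna's
    class, using Tyler-type weights u(t) = p/t:
        C <- (1 - rho) * (1/n) * sum_i u(x_i^T C^{-1} x_i) x_i x_i^T
             + rho * I
    X is an (n, p) array of centered observations; rho in (0, 1]
    is the regularization (shrinkage) parameter.
    """
    n, p = X.shape
    C = np.eye(p)
    for _ in range(n_iter):
        Cinv = np.linalg.inv(C)
        # Squared Mahalanobis distances d_i = x_i^T C^{-1} x_i.
        d = np.einsum('ij,jk,ik->i', X, Cinv, X)
        w = p / np.maximum(d, 1e-12)  # down-weight outlying samples
        C = (1 - rho) * (X * w[:, None]).T @ X / n + rho * np.eye(p)
    return C

def lda_discriminant(x, mu0, mu1, C):
    """Binary LDA score; classify to class 1 when the score is > 0."""
    w = np.linalg.inv(C) @ (mu1 - mu0)
    b = -0.5 * (mu1 + mu0) @ w
    return x @ w + b
```

A short usage check on synthetic two-class Gaussian data: center each class by its sample mean, pool the residuals for the covariance estimate, then score held-in samples with the discriminant rule; with well-separated means the rule recovers the labels almost perfectly.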