Article

Penalized classification using Fisher's linear discriminant

Publisher

WILEY
DOI: 10.1111/j.1467-9868.2011.00783.x

Keywords

Classification; Feature selection; High dimensional problems; Lasso; Linear discriminant analysis; Supervised learning

Funding

  1. National Science Foundation [DMS-9971405]
  2. National Institutes of Health [N01-HV-28183]
  3. National Heart, Lung, and Blood Institute [R01HL028183] (funding source: NIH RePORTER)
  4. National Institute of Biomedical Imaging and Bioengineering [R01EB001988] (funding source: NIH RePORTER)

We consider the supervised classification setting, in which the data consist of p features measured on n observations, each of which belongs to one of K classes. Linear discriminant analysis (LDA) is a classical method for this problem. However, in the high dimensional setting where p ≫ n, LDA is not appropriate for two reasons. First, the standard estimate for the within-class covariance matrix is singular, and so the usual discriminant rule cannot be applied. Second, when p is large, it is difficult to interpret the classification rule that is obtained from LDA, since it involves all p features. We propose penalized LDA, which is a general approach for penalizing the discriminant vectors in Fisher's discriminant problem in a way that leads to greater interpretability. The discriminant problem is not convex, so we use a minorization-maximization approach to optimize it efficiently when convex penalties are applied to the discriminant vectors. In particular, we consider the use of L1 and fused lasso penalties. Our proposal is equivalent to recasting Fisher's discriminant problem as a biconvex problem. We evaluate the performance of the resulting methods in a simulation study and on three gene expression data sets. We also survey past methods for extending LDA to the high dimensional setting and explore their relationships with our proposal.
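To make the minorization-maximization idea concrete, here is a minimal sketch for a single L1-penalized discriminant vector. It rests on assumptions that the abstract does not state: the within-class covariance is replaced by its diagonal (per-feature variances sigma_j^2), and the penalty is lam * sum_j sigma_j * |beta_j|. Under those assumptions, linearizing the convex term beta' Sigma_b beta around the current iterate gives a closed-form update: soft-threshold Sigma_b beta at lam * sigma_j / 2, divide by sigma_j^2, and rescale so that beta' diag(sigma^2) beta = 1. The function name `penalized_lda_direction` and its parameters are hypothetical, chosen for illustration only.

```python
import numpy as np


def soft_threshold(a, t):
    """Elementwise soft-thresholding: sign(a) * max(|a| - t, 0)."""
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)


def penalized_lda_direction(X, y, lam, n_iter=200, tol=1e-10):
    """Hypothetical helper: one L1-penalized discriminant vector via MM.

    Assumes a diagonal within-class covariance estimate (an illustrative
    choice, not taken from the abstract). Returns (beta, sigma2), where
    sigma2 holds the per-feature within-class variances.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, p = X.shape
    classes, counts = np.unique(y, return_counts=True)

    # Class means (K x p), centred at the overall mean.
    means = np.vstack([X[y == c].mean(axis=0) for c in classes])
    M = means - X.mean(axis=0)
    w = counts / n  # class proportions

    # Diagonal within-class variances (assumed strictly positive).
    resid = X - means[np.searchsorted(classes, y)]
    sigma2 = (resid ** 2).mean(axis=0)
    sigma = np.sqrt(sigma2)

    def sigma_b_times(v):
        # Sigma_b v = M' diag(w) M v, without forming the p x p matrix.
        return M.T @ (w * (M @ v))

    def rescale(v):
        # Scale v so that v' diag(sigma2) v = 1; leave the zero vector alone.
        norm = np.sqrt(np.sum(sigma2 * v ** 2))
        return v / norm if norm > 0 else v

    # Start from the unpenalized direction under the diagonal estimate.
    _, _, vt = np.linalg.svd(np.sqrt(w)[:, None] * M / sigma,
                             full_matrices=False)
    beta = rescale(vt[0] / sigma)

    for _ in range(n_iter):
        # MM step: maximize the linear minorizer minus the L1 penalty.
        b = sigma_b_times(beta)
        new = rescale(soft_threshold(b, lam * sigma / 2.0) / sigma2)
        converged = np.max(np.abs(new - beta)) < tol
        beta = new
        if converged or not beta.any():
            break
    return beta, sigma2
```

Calling `penalized_lda_direction(X, y, lam)` on an n-by-p array `X` with class labels `y` returns a vector whose zero entries mark features dropped from the rule; larger `lam` zeroes more of them, and a large enough value returns the all-zero vector. The fused lasso penalty mentioned in the abstract has no such closed-form step and is not covered by this sketch.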
