Journal
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
Volume 106, Issue 22, Pages 8859-8864
Publisher
NATL ACAD SCIENCES
DOI: 10.1073/pnas.0903931106
Keywords
higher criticism; phase diagram; region of impossibility; region of possibility; threshold feature selection
Funding
- National Science Foundation [DMS-0908613]
- Direct For Mathematical & Physical Scien
- Division Of Mathematical Sciences [0908613] Funding Source: National Science Foundation
We study a two-class classification problem with a large number of features, most of which are useless and only a few of which are useful, and we do not know which are which. The number of features is large compared with the number of training observations. Calibrating the model with four key parameters (the number of features, the size of the training sample, and the fraction and strength of the useful features), we identify a region in parameter space where no trained classifier can reliably separate the two classes on fresh data. The complement of this region, where successful classification is possible, is also briefly discussed.
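The four parameters named in the abstract can be made concrete with a small simulation. The sketch below is illustrative only and is not taken from the paper: it assumes a Gaussian model in which each feature is useful with probability `eps` and shifts the class mean by `tau`, and it uses a simple thresholded-z-score classifier. The function name `simulate_accuracy`, the parameter names `p`, `n`, `eps`, `tau`, and the default `threshold` are all hypothetical choices made for this example.

```python
import math
import random


def simulate_accuracy(p, n, eps, tau, n_test=500, threshold=2.0, seed=0):
    """Estimate test accuracy of a thresholded-feature classifier.

    Assumed model (not from the paper): labels y are +1 or -1, and
    feature j of an observation is y * mu_j + N(0, 1), where mu_j = tau
    with probability eps (useful feature) and 0 otherwise (useless).
    """
    rng = random.Random(seed)
    # Decide which of the p features are useful.
    mu = [tau if rng.random() < eps else 0.0 for _ in range(p)]

    def draw(m):
        # Draw m labelled observations from the assumed model.
        ys = [rng.choice((-1, 1)) for _ in range(m)]
        xs = [[y * mu_j + rng.gauss(0.0, 1.0) for mu_j in mu] for y in ys]
        return xs, ys

    x_train, y_train = draw(n)
    # Per-feature z-score: approximately N(sqrt(n) * mu_j, 1).
    z = [
        sum(y * x[j] for x, y in zip(x_train, y_train)) / math.sqrt(n)
        for j in range(p)
    ]
    # Keep only features whose z-score exceeds the threshold in magnitude,
    # weighting each kept feature by the sign of its z-score.
    w = [(1.0 if zj > 0 else -1.0) if abs(zj) > threshold else 0.0 for zj in z]

    x_test, y_test = draw(n_test)
    correct = 0
    for x, y in zip(x_test, y_test):
        score = sum(wj * xj for wj, xj in zip(w, x))
        pred = 1 if score >= 0 else -1  # ties default to +1
        correct += pred == y
    return correct / n_test


if __name__ == "__main__":
    # Many strong useful features versus few weak ones.
    print(simulate_accuracy(p=50, n=20, eps=0.5, tau=3.0))
    print(simulate_accuracy(p=2000, n=20, eps=0.005, tau=0.3))
```

Under these assumptions, lowering `eps` and `tau` while raising `p` relative to `n` makes the useful features harder to find, which is the regime the abstract describes as a region of impossibility. The fixed threshold here is a placeholder; the keywords suggest the paper considers threshold feature selection and higher criticism, neither of which this sketch implements.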