Article

Hard or Soft Classification? Large-Margin Unified Machines

Journal

Journal of the American Statistical Association
Volume 106, Issue 493, Pages 166-177

Publisher

American Statistical Association
DOI: 10.1198/jasa.2011.tm10319

Keywords

Class probability estimation; DWD; Fisher consistency; Regularization; SVM

Funding

  1. NSF [DMS-0747575, DMS-0645293, DMS-0905561]
  2. NIH [NIH/NCI R01 CA-149569, NIH/NCI P01 CA 142538, NIH/NCI R01 CA-085848]
  3. NSF Directorate for Mathematical & Physical Sciences, Division of Mathematical Sciences [0905561]
  4. NSF Directorate for Mathematical & Physical Sciences, Division of Mathematical Sciences [1347844]


Margin-based classifiers have been popular in both machine learning and statistics for classification problems. Among the many such classifiers, some are hard classifiers while others are soft. Soft classifiers explicitly estimate the class conditional probabilities and then classify based on the estimated probabilities. In contrast, hard classifiers directly target the classification decision boundary without producing probability estimates. These two types of classifiers are based on different philosophies, and each has its own merits. In this article, we propose a novel family of large-margin classifiers, namely large-margin unified machines (LUMs), which covers a broad range of margin-based classifiers including both hard and soft ones. By offering a natural bridge from soft to hard classification, the LUM provides a unified algorithm to fit various classifiers and hence a convenient platform to compare hard and soft classification. Both the theoretical consistency and the numerical performance of LUMs are explored. Our numerical study sheds some light on the choice between hard and soft classifiers in various classification problems.
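The bridge from soft to hard classification described above can be sketched in code. The snippet below is a minimal illustration, assuming the piecewise form of the LUM loss defined in the article: with parameters a > 0 and c >= 0, the loss is 1 - u on the hinge-type piece and decays polynomially beyond the joining point u = c/(1+c); letting c grow large recovers the SVM hinge loss (a hard classifier), while small c gives losses suitable for soft classification. The function name `lum_loss` is illustrative, not from the paper.

```python
def lum_loss(u, a=1.0, c=1.0):
    """LUM loss V(u) for functional margin u = y * f(x).

    Assumed piecewise form (see the article for the authoritative
    definition): a linear hinge-type piece 1 - u for u below c/(1+c),
    and a polynomially decaying tail beyond it. Both pieces equal
    1/(1+c) at the joining point, so the loss is continuous.
    """
    threshold = c / (1.0 + c)
    if u < threshold:
        return 1.0 - u
    # decaying tail; evaluates to 1/(1+c) exactly at the threshold
    return (1.0 / (1.0 + c)) * (a / ((1.0 + c) * u - c + a)) ** a


# Continuity at the joining point (c = 1 gives threshold 0.5):
print(abs(lum_loss(0.5, a=1.0, c=1.0) - 0.5) < 1e-12)  # True

# Large c approximates the SVM hinge loss max(0, 1 - u):
print(abs(lum_loss(-0.5, a=1.0, c=1e8) - 1.5) < 1e-9)  # True
print(lum_loss(2.0, a=1.0, c=1e8) < 1e-6)              # True
```

Varying c thus interpolates within one family: fitting with c near zero yields a soft classifier whose loss supports class probability estimation, while c large reproduces hard SVM-style behavior, which is what makes the family a convenient platform for comparing the two.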


