Missing is useful: Missing values in cost-sensitive decision trees

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (2005)

Journal

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING

Volume 17, Issue 12, Pages 1689-1693

Publisher

IEEE COMPUTER SOC

DOI: 10.1109/TKDE.2005.188

Keywords

induction; knowledge acquisition; machine learning

Abstract

Many real-world data sets for machine learning and data mining contain missing values, and much previous research regards this as a problem and attempts to impute the missing values before training and testing. In this paper, we study this issue in cost-sensitive learning that considers both test costs and misclassification costs. If the values of some attributes (tests) are too expensive to obtain, it would be more cost-effective to leave them missing, similar to skipping expensive and risky tests (missing values) in patient diagnosis (classification). That is, missing is useful, as missing values actually reduce the total cost of tests and misclassifications; therefore, it is not meaningful to impute their values. We discuss and compare several strategies that utilize only known values, and show that missing is useful for cost reduction in cost-sensitive decision tree learning.
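The trade-off described in the abstract can be illustrated with a small calculation: a test is worth performing only if its cost plus the expected misclassification cost after seeing the result is lower than the expected misclassification cost without it. The following is a minimal sketch under stated assumptions, not the paper's algorithm. The function names, the binary-class setting, and all probabilities and costs are hypothetical values chosen for illustration.

```python
def misclassification_cost(p_positive, fp_cost, fn_cost):
    """Expected cost of the cheaper label when P(positive) = p_positive."""
    cost_if_predict_positive = (1.0 - p_positive) * fp_cost
    cost_if_predict_negative = p_positive * fn_cost
    return min(cost_if_predict_positive, cost_if_predict_negative)


def total_cost_with_test(test_cost, outcomes, fp_cost, fn_cost):
    """Test cost plus expected misclassification cost after the test.

    outcomes: list of (P(outcome), P(positive | outcome)) pairs.
    """
    expected = sum(
        p_outcome * misclassification_cost(p_pos, fp_cost, fn_cost)
        for p_outcome, p_pos in outcomes
    )
    return test_cost + expected


def should_skip_test(test_cost, prior, outcomes, fp_cost, fn_cost):
    """True when leaving the value missing gives the lower total cost."""
    without_test = misclassification_cost(prior, fp_cost, fn_cost)
    with_test = total_cost_with_test(test_cost, outcomes, fp_cost, fn_cost)
    return with_test >= without_test


# Hypothetical numbers: prior P(positive) = 0.3, and a test with two
# outcomes whose posteriors average back to the prior (0.4*0.6 + 0.6*0.1).
PRIOR = 0.3
OUTCOMES = [(0.4, 0.6), (0.6, 0.1)]
FP_COST, FN_COST = 100.0, 400.0

cheap = should_skip_test(20.0, PRIOR, OUTCOMES, FP_COST, FN_COST)
expensive = should_skip_test(50.0, PRIOR, OUTCOMES, FP_COST, FN_COST)
print("skip cheap test:", cheap)
print("skip expensive test:", expensive)
```

With these assumed numbers, the same test is worth taking when it is cheap and worth skipping when it is expensive, which is the sense in which a missing value can reduce total cost.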
