Journal
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
Volume 117, Issue 35, Pages 21175-21184
Publisher
NATL ACAD SCIENCES
DOI: 10.1073/pnas.1921562117
Keywords
machine learning; prediction diagnostics; boosting; quantile regression; conditional distribution estimation
A method for decision tree induction is presented. Given a set of predictor variables x = (x_1, x_2, ..., x_p) and two outcome variables y and z associated with each x, the goal is to identify those values of x for which the respective distributions of y | x and z | x, or selected properties of those distributions such as means or quantiles, are most different. Contrast trees provide a lack-of-fit measure for statistical models of such statistics, or for the complete conditional distribution p_y(y | x), as a function of x. They are easily interpreted and can be used as diagnostic tools to reveal and then understand the inaccuracies of models produced by any learning method. A corresponding contrast-boosting strategy is described for remedying any uncovered errors, thereby producing potentially more accurate predictions. This leads to a distribution-boosting strategy for directly estimating the full conditional distribution of y at each x under no assumptions concerning its shape, form, or parametric representation.
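The core idea of contrasting y and z across regions of x can be illustrated with a single split on one predictor. The sketch below is not the paper's algorithm: the function names (`discrepancy`, `best_contrast_split`), the choice of absolute difference in means as the discrepancy, and the rule of scoring a split by its more discrepant side are all assumptions made for illustration. In a diagnostic setting, y might be the observed outcome and z a model's prediction.

```python
from statistics import mean


def discrepancy(y, z):
    """Assumed discrepancy measure: |mean(y) - mean(z)| within a region."""
    return abs(mean(y) - mean(z))


def best_contrast_split(x, y, z, min_leaf=2):
    """Search thresholds on a single predictor x for the split whose more
    discrepant side has the largest |mean(y) - mean(z)|.

    Returns (threshold, left_discrepancy, right_discrepancy), or None if
    no valid split exists. The scoring rule is an illustrative assumption.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    xs = [x[i] for i in order]
    ys = [y[i] for i in order]
    zs = [z[i] for i in order]

    best = None
    best_score = -1.0
    # k is the number of observations sent to the left region.
    for k in range(min_leaf, len(xs) - min_leaf + 1):
        if xs[k - 1] == xs[k]:
            continue  # no threshold can separate equal x values
        d_left = discrepancy(ys[:k], zs[:k])
        d_right = discrepancy(ys[k:], zs[k:])
        score = max(d_left, d_right)
        if score > best_score:
            best_score = score
            best = ((xs[k - 1] + xs[k]) / 2, d_left, d_right)
    return best


# Toy data: z matches y for small x but misses a shift in y for large x.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [1, 1, 1, 1, 5, 5, 5, 5]
z = [1, 1, 1, 1, 1, 1, 1, 1]
print(best_contrast_split(x, y, z))
```

On this toy data the split isolates the region of large x, where z fails to track y. A full contrast tree would apply such splits recursively over all predictors, and could use a discrepancy based on quantiles or whole distributions rather than means.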