Journal
ELECTRONIC JOURNAL OF STATISTICS
Volume 1, Issue -, Pages 519-537
Publisher
INST MATHEMATICAL STATISTICS-IMS
DOI: 10.1214/07-EJS039
Keywords
CART; random forests; maximal subtree
Funding
- National Institutes of Health [HL-072771]
Abstract
We characterize and study variable importance (VIMP) and pairwise variable associations in binary regression trees. A key component involves the node mean squared error for a quantity we refer to as a maximal subtree. The theory naturally extends from single trees to ensembles of trees and applies to methods like random forests. This is useful because, although importance values from random forests are widely used to screen variables (for example, to filter high-throughput genomic data in bioinformatics), very little theory exists about their properties.