Article

Relevance measures for subset variable selection in regression problems based on k-additive mutual information

Journal

COMPUTATIONAL STATISTICS & DATA ANALYSIS
Volume 49, Issue 4, Pages 1205-1227

Publisher

ELSEVIER
DOI: 10.1016/j.csda.2004.07.026

Keywords

subset variable selection; regression; mutual information; Shannon entropy; Möbius representation; k-additive measure

Abstract

In the framework of subset variable selection for regression, relevance measures based on the notion of mutual information are studied. Results on the estimation of this index of stochastic dependence in a continuous setting are first presented. They are grounded in kernel density estimation, which makes the overall estimation of the mutual information quadratic. The behavior of the mutual information as a relevance measure is then empirically studied on several regression problems. The considered problems are artificially generated to contain irrelevant and redundant candidate explanatory variables as well as strongly nonlinear relationships. Next, still in a subset variable selection context, computationally more efficient approximations of the mutual information based on the notion of k-additive truncation are proposed. The 2- and 3-additive truncations appear to be of practical interest as relevance measures. The 2-additive truncation is based on the computation of the approximate relevance of a set of potential predictors from the relevance values of the singletons and pairs it contains. The 3-additive truncation additionally involves the relevance values of the 3-element subsets. The lower the amount of redundancy among the candidate explanatory variables, the better these approximations perform. The sample behavior of the two resulting relevance measures is finally empirically studied on the previously generated nonlinear artificial regression problems. (c) 2004 Elsevier B.V. All rights reserved.
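As an illustration of the 2-additive truncation described in the abstract, the sketch below approximates the relevance I(X_S; Y) of a candidate subset S from the mutual informations of the singletons and pairs it contains, combined through their Möbius terms. This is a reconstruction from the abstract rather than the authors' code: the function names (hist_mi, two_additive_relevance) are invented for the example, and a simple histogram plug-in estimator stands in for the kernel density estimation used in the paper.

```python
# Sketch of the 2-additive truncation of mutual information as a relevance
# measure (reconstructed from the abstract, not the authors' implementation).
# The joint mutual information I(X_S; Y) of a subset S is approximated from
# the Mobius terms of singletons and pairs only:
#   I_2(S; Y) = sum_i m({i}) + sum_{i<j} m({i, j})
# with m({i})    = I(X_i; Y)
# and  m({i, j}) = I(X_i, X_j; Y) - I(X_i; Y) - I(X_j; Y).
# MI terms are estimated here by histogram discretization for brevity,
# instead of the kernel density estimator described in the paper.

from itertools import combinations
import numpy as np

def hist_mi(xs: np.ndarray, y: np.ndarray, bins: int = 8) -> float:
    """Plug-in mutual information I(X; Y) from a multi-dimensional histogram."""
    data = np.column_stack([xs, y])
    counts, _ = np.histogramdd(data, bins=bins)
    p_xy = counts / counts.sum()
    p_x = p_xy.sum(axis=-1, keepdims=True)                          # marginal of the predictors
    p_y = p_xy.sum(axis=tuple(range(xs.shape[1])), keepdims=True)   # marginal of Y
    nz = p_xy > 0
    return float(np.sum(p_xy[nz] * np.log(p_xy[nz] / (p_x * p_y)[nz])))

def two_additive_relevance(X: np.ndarray, y: np.ndarray, subset) -> float:
    """2-additive approximation of I(X_S; Y) from singleton and pair relevances."""
    singles = {i: hist_mi(X[:, [i]], y) for i in subset}
    value = sum(singles.values())
    for i, j in combinations(subset, 2):
        pair_mi = hist_mi(X[:, [i, j]], y)
        value += pair_mi - singles[i] - singles[j]   # Mobius interaction term
    return value

# Toy usage: a nonlinear target with one redundant and one irrelevant predictor.
rng = np.random.default_rng(0)
x1 = rng.normal(size=2000)
X = np.column_stack([x1,
                     x1 + 0.1 * rng.normal(size=2000),   # redundant copy of x1
                     rng.normal(size=2000)])              # irrelevant noise
y = np.sin(x1) + 0.1 * rng.normal(size=2000)
print(two_additive_relevance(X, y, [0, 1]))   # redundant pair
print(two_additive_relevance(X, y, [0, 2]))   # informative + irrelevant
```

On the toy data, the pairwise Möbius term of the redundant pair should be strongly negative, so the 2-additive score stays close to the relevance of either variable alone, while for the informative-plus-irrelevant pair it should roughly equal the singleton relevance of the informative variable, consistent with the abstract's remark that the approximation works best when redundancy among candidate variables is low.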
