Article

Relevance measures for subset variable selection in regression problems based on k-additive mutual information

Journal

COMPUTATIONAL STATISTICS & DATA ANALYSIS
Volume 49, Issue 4, Pages 1205-1227

Publisher

ELSEVIER
DOI: 10.1016/j.csda.2004.07.026

Keywords

subset variable selection; regression; mutual information; Shannon entropy; Möbius representation; k-additive measure

Abstract

In the framework of subset variable selection for regression, relevance measures based on the notion of mutual information are studied. Results on the estimation of this index of stochastic dependence in a continuous setting are first presented. They are grounded in kernel density estimation, which makes the overall estimation of the mutual information quadratic. The behavior of the mutual information as a relevance measure is then empirically studied on several regression problems. The considered problems are artificially generated to contain irrelevant and redundant candidate explanatory variables as well as strongly nonlinear relationships. Next, still in a subset variable selection context, computationally more efficient approximations of the mutual information based on the notion of k-additive truncation are proposed. The 2- and 3-additive truncations appear to be of practical interest as relevance measures. The 2-additive truncation is based on the computation of the approximate relevance of a set of potential predictors from the relevance values of the singletons and pairs it contains. The 3-additive truncation additionally involves the relevance values of the 3-element subsets. The lower the amount of redundancy among the candidate explanatory variables, the better these approximations perform. The sample behavior of the two resulting relevance measures is finally empirically studied on the previously generated nonlinear artificial regression problems. (c) 2004 Elsevier B.V. All rights reserved.
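As an illustration of the 2-additive truncation described in the abstract, the sketch below approximates the relevance I(X_S; Y) of a candidate subset S from the mutual informations of the singletons and pairs it contains, combined through their Möbius terms. This is a reconstruction from the abstract rather than the authors' code: the function names (hist_mi, two_additive_relevance) are invented for the example, and a simple histogram plug-in estimator stands in for the kernel density estimation used in the paper.

```python
# Sketch of the 2-additive truncation of mutual information as a relevance
# measure (reconstructed from the abstract, not the authors' implementation).
# The joint mutual information I(X_S; Y) of a subset S is approximated from
# the Mobius terms of singletons and pairs only:
#   I_2(S; Y) = sum_i m({i}) + sum_{i<j} m({i, j})
# with m({i})    = I(X_i; Y)
# and  m({i, j}) = I(X_i, X_j; Y) - I(X_i; Y) - I(X_j; Y).
# MI terms are estimated here by histogram discretization for brevity,
# instead of the kernel density estimator described in the paper.

from itertools import combinations
import numpy as np

def hist_mi(xs: np.ndarray, y: np.ndarray, bins: int = 8) -> float:
    """Plug-in mutual information I(X; Y) from a multi-dimensional histogram."""
    data = np.column_stack([xs, y])
    counts, _ = np.histogramdd(data, bins=bins)
    p_xy = counts / counts.sum()
    p_x = p_xy.sum(axis=-1, keepdims=True)                          # marginal of the predictors
    p_y = p_xy.sum(axis=tuple(range(xs.shape[1])), keepdims=True)   # marginal of Y
    nz = p_xy > 0
    return float(np.sum(p_xy[nz] * np.log(p_xy[nz] / (p_x * p_y)[nz])))

def two_additive_relevance(X: np.ndarray, y: np.ndarray, subset) -> float:
    """2-additive approximation of I(X_S; Y) from singleton and pair relevances."""
    singles = {i: hist_mi(X[:, [i]], y) for i in subset}
    value = sum(singles.values())
    for i, j in combinations(subset, 2):
        pair_mi = hist_mi(X[:, [i, j]], y)
        value += pair_mi - singles[i] - singles[j]   # Mobius interaction term
    return value

# Toy usage: a nonlinear target with one redundant and one irrelevant predictor.
rng = np.random.default_rng(0)
x1 = rng.normal(size=2000)
X = np.column_stack([x1,
                     x1 + 0.1 * rng.normal(size=2000),   # redundant copy of x1
                     rng.normal(size=2000)])              # irrelevant noise
y = np.sin(x1) + 0.1 * rng.normal(size=2000)
print(two_additive_relevance(X, y, [0, 1]))   # redundant pair
print(two_additive_relevance(X, y, [0, 2]))   # informative + irrelevant
```

On the toy data, the pairwise Möbius term of the redundant pair should be strongly negative, so the 2-additive score stays close to the relevance of either variable alone, while for the informative-plus-irrelevant pair it should roughly equal the singleton relevance of the informative variable, consistent with the abstract's remark that the approximation works best when redundancy among candidate variables is low.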
