Article

Variable selection for linear regression in large databases: exact methods

Journal

APPLIED INTELLIGENCE
Volume 51, Issue 6, Pages 3736-3756

Publisher

SPRINGER
DOI: 10.1007/s10489-020-01927-6

Keywords

Variable selection; Linear regression; Branch & Bound methods; Heuristics

Funding

  1. FEDER funds [BU062U16, COV2000375]
  2. Spanish Ministry of Economy and Competitiveness [ECO2016-76567-C4-2-R, PID2019-104263RB-C44]
  3. Regional Government of Castilla y Leon, Spain [BU329U14, BU071G19]
  4. Regional Government of Castilla y Leon

Abstract

This paper analyzes the variable selection problem in linear regression for large databases and proposes a Branch & Bound method that remains practical on very large databases. Computational experiments show that the method performs well compared with other known methods and with commercial software.
This paper analyzes the variable selection problem in the context of linear regression for large databases. The problem consists of selecting a small subset of independent variables that performs the prediction task optimally. It has a wide range of applications. One important type is the design of composite indicators in areas such as sociology and economics. Other important applications can be found in fields such as chemometrics, genetics, and climate prediction, among many others. For this problem, we propose a Branch & Bound method. Because it is an exact method, it guarantees optimal solutions. We also provide strategies that enable the method to be applied to very large databases (with hundreds of thousands of cases) in moderate computation time. A series of computational experiments shows that our method performs well compared with well-known methods in the literature and with commercial software.
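The abstract does not describe the authors' algorithm in detail, so the following is only a minimal sketch of a generic, textbook-style Branch & Bound for subset selection, not the method proposed in the paper. It assumes the objective is the residual sum of squares (RSS) of an ordinary least-squares fit with an intercept, and that exactly `p` variables are selected. The bound relies on a standard fact: adding variables never increases the RSS, so the RSS of a node's chosen variables plus all of its remaining candidates is a lower bound for every subset in that branch. The names `rss` and `best_subset` are hypothetical helpers introduced for illustration.

```python
import numpy as np


def rss(X, y, cols):
    """RSS of the least-squares fit of y on the given columns plus an intercept."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)


def best_subset(X, y, p):
    """Return (columns, rss) of the size-p subset with minimum RSS."""
    n_vars = X.shape[1]
    best = {"rss": np.inf, "cols": None}

    def search(chosen, start):
        # Leaf: a complete subset of size p, evaluate it exactly.
        if len(chosen) == p:
            value = rss(X, y, chosen)
            if value < best["rss"]:
                best["rss"], best["cols"] = value, chosen
            return
        remaining = range(start, n_vars)
        # Not enough candidates left to reach size p.
        if len(chosen) + len(remaining) < p:
            return
        # Lower bound: fit with every variable still available in this branch.
        bound = rss(X, y, chosen + tuple(remaining))
        if bound >= best["rss"]:
            return  # no subset in this branch can beat the incumbent
        for j in remaining:
            search(chosen + (j,), j + 1)

    search((), 0)
    return best["cols"], best["rss"]


# Illustrative synthetic data (an assumption, not data from the paper).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 8))
y_demo = (2.0 * X_demo[:, 1] - 3.0 * X_demo[:, 4] + 1.5 * X_demo[:, 6]
          + 0.1 * rng.normal(size=200))
print(best_subset(X_demo, y_demo, 3))
```

Because a branch is pruned only when its lower bound cannot improve on the incumbent, the search returns the same subset as exhaustive enumeration while typically fitting far fewer models. This sketch refits each model from scratch and would not scale to the database sizes mentioned in the abstract.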
