Article

Improved high-dimensional regression models with matrix approximations applied to the comparative case studies with support vector machines

Journal

OPTIMIZATION METHODS & SOFTWARE
Volume 37, Issue 5, Pages 1912-1929

Publisher

TAYLOR & FRANCIS LTD
DOI: 10.1080/10556788.2021.2022144

Keywords

Robustness and sensitivity analysis; regression analysis; penalized method; rank-one update; diagonal approximation; metaheuristic algorithm


In regression analysis, high-dimensional data pose challenges such as rank deficiency of the design matrix, outliers, and ill-conditioning. Penalized methods can address these challenges effectively, and penalized mixed-integer nonlinear programming models show promising results for regression on high-dimensional data.
High-dimensional data now appear in many practical applications, such as the biosciences. In regression analysis, the well-known ordinary least-squares estimator may be misleading when the design matrix does not have full rank. Outliers are another common issue, since they may corrupt the normality of the residuals; robust estimators, which are not sensitive to outlying data points, are therefore frequently applied. Ill-conditioning is a further common problem in high-dimensional data, under which applying the least-squares estimator is hardly possible. Estimation methods that tackle these problems are therefore needed. A successful approach in high-dimensional cases is the penalized scheme, which aims to select a subset of effective explanatory variables that best predict the response while setting the remaining parameters to zero. Here, we develop several penalized mixed-integer nonlinear programming models for high-dimensional regression analysis. The matrix approximations used have simple structures, which decreases the computational cost of the models. Moreover, the models can be solved effectively by metaheuristic algorithms. Numerical tests shed light on the performance of the proposed methods on simulated and real-world high-dimensional data sets.
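
To make the idea of a penalized subset-selection scheme concrete, the following is a minimal sketch of a cardinality-penalized least-squares problem solved by exhaustive search on a toy data set. It is an illustration only and is not the paper's mixed-integer models, matrix approximations, or metaheuristic solvers; the function name `best_subset`, the penalty weight `lam`, and the simulated data are all assumptions introduced here.

```python
from itertools import combinations

import numpy as np


def best_subset(X, y, lam):
    """Exhaustively minimise RSS + lam * (number of selected predictors).

    Hypothetical helper for illustration; only feasible for small p,
    since it visits all 2**p candidate supports.
    """
    n, p = X.shape
    # Start from the empty model: every coefficient is zero.
    best_cost, best_support, best_coef = float(y @ y), (), np.zeros(p)
    for k in range(1, p + 1):
        for support in combinations(range(p), k):
            cols = list(support)
            # Least-squares fit restricted to the chosen columns.
            coef, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
            resid = y - X[:, cols] @ coef
            cost = float(resid @ resid) + lam * k
            if cost < best_cost:
                best_cost, best_support = cost, support
                best_coef = np.zeros(p)
                best_coef[cols] = coef
    return best_support, best_coef


# Simulated data (assumption): only the first two predictors matter.
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 6))
beta_true = np.array([3.0, -2.0, 0.0, 0.0, 0.0, 0.0])
y = X @ beta_true + 0.1 * rng.standard_normal(40)

support, beta_hat = best_subset(X, y, lam=1.0)
print("selected predictors:", support)
print("estimated coefficients:", np.round(beta_hat, 3))
```

Exhaustive search grows exponentially in the number of predictors, which is one reason such combinatorial formulations are typically handed to heuristic or metaheuristic solvers when the dimension is large.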
