4.7 Article

A Fitted Sparse-Group Lasso for Genome-Based Evaluations

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TCBB.2022.3156805

Keywords

Biology and genetics; iterative methods; optimization; statistical computing


In life sciences, high-throughput techniques often result in high-dimensional data with a large number of covariates compared to observations, leading to multicollinearity challenges in linear regression analysis. Penalization methods, such as lasso, ridge regression, and group lasso, have been effective in addressing this issue. This study introduces a novel approach that combines lasso and standardized group lasso to achieve meaningful weighting of predicted outcomes, which is particularly important in breeding populations. The method was evaluated through extensive simulation studies and demonstrated improved prediction abilities and accurate localization of simulated features compared to other penalization approaches.
In the life sciences, high-throughput techniques typically produce high-dimensional data in which the number of covariates often far exceeds the number of observations. This inherently introduces multicollinearity, which complicates statistical analysis in a linear regression framework. Penalization methods such as the lasso, ridge regression, the group lasso, and convex combinations thereof, which impose additional constraints on the regression coefficients, have proven effective. In this study, we introduce a novel approach that combines the lasso and the standardized group lasso, leading to a meaningful weighting of the predicted (fitted) outcome, which is of primary importance, e.g., in breeding populations. This fitted sparse-group lasso was implemented as a proximal-averaged gradient descent method and is part of the R package seagull, available on CRAN. To evaluate the novel method, we conducted an extensive simulation study. We simulated genotypes and phenotypes that resemble data from a dairy cattle population. Genotypes at thousands of genomic markers were used as covariates to fit a quantitative response, and markers were grouped according to their proximity on a chromosome. In the majority of simulated scenarios, the new method showed improved prediction ability compared with other penalization approaches and was able to localize the signals of simulated features.
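
To make the idea of penalizing the fitted outcome per group more concrete, the following is a minimal Python sketch, not the seagull implementation. It assumes the objective has the form (1/2n)·||y − Xβ||² + αλ·||β||₁ + (1 − α)λ·Σ_l √p_l·||X_l β_l||₂, where the group term acts on the fitted values X_l β_l rather than on β_l itself. The exact scaling and group weights used in the paper may differ. The function names `fitted_sgl_objective` and `groups_by_position`, the fixed-width window used for grouping, and all parameter names are hypothetical.

```python
import numpy as np


def groups_by_position(positions, window):
    """Assign markers to groups by binning their positions into
    fixed-width windows (an assumed, simplified notion of proximity)."""
    return (np.asarray(positions) // window).astype(int)


def fitted_sgl_objective(y, X, beta, groups, lam, alpha):
    """Evaluate an assumed fitted sparse-group lasso objective.

    loss      = 0.5 / n * ||y - X beta||^2
    lasso     = alpha * lam * ||beta||_1
    group     = (1 - alpha) * lam * sum_l sqrt(p_l) * ||X_l beta_l||_2
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    beta = np.asarray(beta, dtype=float)
    groups = np.asarray(groups)

    n = y.shape[0]
    resid = y - X @ beta
    loss = 0.5 * float(resid @ resid) / n

    l1 = float(np.abs(beta).sum())

    group_pen = 0.0
    for g in np.unique(groups):
        idx = groups == g
        # Penalize the norm of the group's contribution to the fitted values.
        fitted_g = X[:, idx] @ beta[idx]
        group_pen += np.sqrt(idx.sum()) * float(np.linalg.norm(fitted_g))

    return loss + alpha * lam * l1 + (1.0 - alpha) * lam * group_pen


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((20, 6))
    beta = np.array([1.0, 0.0, 0.0, -0.5, 0.0, 0.0])
    y = X @ beta + 0.1 * rng.standard_normal(20)
    groups = groups_by_position([10, 40, 90, 130, 170, 260], window=100)
    print(fitted_sgl_objective(y, X, beta, groups, lam=0.1, alpha=0.5))
```

With `alpha = 1` the sketch reduces to a plain lasso objective, and with `alpha = 0` only the group term on the fitted values remains. Because this group term generally has no closed-form proximal operator when combined with the lasso term, an iterative scheme such as the proximal-averaged gradient descent mentioned in the abstract is a natural choice for the minimization itself, which this sketch does not attempt.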
