Article

Logistic Biplot by Conjugate Gradient Algorithms and Iterated SVD

Journal

MATHEMATICS
Volume 9, Issue 16

Publisher

MDPI
DOI: 10.3390/math9162015

Keywords

binary data; logistic biplot; optimization methods; conjugate gradient algorithm; coordinate descent algorithm; MM algorithm; low rank model; R software


This study proposes fitting the logistic biplot (LB), a recently developed technique that represents the rows and columns of a binary data matrix simultaneously, with nonlinear conjugate gradient (CG) or majorization-minimization (MM) algorithms. It also introduces a cross-validation procedure to select the number of dimensions in the model. A Monte Carlo study comparing the algorithms shows that the cross-validation procedure selects the model successfully across scenarios with different sparsity levels and matrix sizes.
Multivariate binary data are increasingly frequent in practice. Although some adaptations of principal component analysis are used to reduce dimensionality for this kind of data, none of them provides a simultaneous representation of rows and columns (biplot). Recently, a technique named logistic biplot (LB) has been developed to represent the rows and columns of a binary data matrix simultaneously, but the algorithm used to fit the parameters is too computationally demanding to be useful in the presence of sparsity or when the matrix is large. We propose fitting an LB model using nonlinear conjugate gradient (CG) or majorization-minimization (MM) algorithms, and we introduce a cross-validation procedure to select the hyperparameter that represents the number of dimensions in the model. A Monte Carlo study that considers scenarios with several sparsity levels and different dimensions of the binary data set shows that the procedure based on cross-validation is successful in selecting the model for all algorithms studied. The comparison of running times shows that the CG algorithm is more efficient in the presence of sparsity and when the matrix is not very large, while the MM algorithm performs better when the binary matrix is balanced or large. To complement the proposed methods and give practical support, an R package called BiplotML has been written. To complete the study, real binary data on gene expression methylation are used to illustrate the proposed methods.
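
To make the low-rank logistic model concrete, here is a minimal sketch in plain Python. It assumes the usual logistic biplot parametrization, logit(pi_ij) = b0_j + sum_s a_is * b_js, which the abstract does not spell out. It minimizes the Bernoulli negative log-likelihood by ordinary gradient descent, used as a simpler stand-in for the CG and MM algorithms the paper proposes. The names `fit_logistic_biplot`, `k`, `steps`, and `lr` are hypothetical and are not part of the BiplotML package.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def softplus(t):
    """Numerically stable log(1 + exp(t))."""
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def neg_log_lik(X, A, B, b0):
    """Bernoulli negative log-likelihood under the assumed model
    logit(pi_ij) = b0[j] + <A[i], B[j]>."""
    total = 0.0
    for i, row in enumerate(X):
        for j, x in enumerate(row):
            theta = b0[j] + dot(A[i], B[j])
            total += softplus(theta) - x * theta
    return total


def fit_logistic_biplot(X, k=2, steps=200, lr=0.05, seed=0):
    """Illustrative fit by plain gradient descent (not CG, not MM).

    Returns row markers A (n x k), column markers B (p x k),
    column intercepts b0 (length p), and the loss history.
    """
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    A = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(n)]
    B = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(p)]
    b0 = [0.0] * p
    history = [neg_log_lik(X, A, B, b0)]
    for _ in range(steps):
        # R[i][j] = pi_ij - x_ij is the derivative of the loss w.r.t. theta_ij.
        R = [[sigmoid(b0[j] + dot(A[i], B[j])) - X[i][j] for j in range(p)]
             for i in range(n)]
        grad_A = [[sum(R[i][j] * B[j][s] for j in range(p)) for s in range(k)]
                  for i in range(n)]
        grad_B = [[sum(R[i][j] * A[i][s] for i in range(n)) for s in range(k)]
                  for j in range(p)]
        grad_b0 = [sum(R[i][j] for i in range(n)) for j in range(p)]
        A = [[A[i][s] - lr * grad_A[i][s] for s in range(k)] for i in range(n)]
        B = [[B[j][s] - lr * grad_B[j][s] for s in range(k)] for j in range(p)]
        b0 = [b0[j] - lr * grad_b0[j] for j in range(p)]
        history.append(neg_log_lik(X, A, B, b0))
    return A, B, b0, history


if __name__ == "__main__":
    # Small made-up binary matrix, for illustration only.
    X_toy = [
        [1, 1, 0, 0],
        [1, 1, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 1, 1],
        [0, 1, 1, 1],
        [1, 1, 1, 0],
    ]
    A, B, b0, history = fit_logistic_biplot(X_toy, k=2)
    print("initial loss:", history[0], "final loss:", history[-1])
```

In a biplot, the rows of `A` would be plotted as points and the rows of `B` as directions in the same `k`-dimensional space. Choosing `k` by cross-validation, as the abstract describes, could be sketched by hiding some entries of the matrix, fitting on the rest, and comparing prediction error on the hidden entries across candidate values of `k`.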
