Article

Separating common (global and local) and distinct variation in multiple mixed types data sets

Journal

JOURNAL OF CHEMOMETRICS
Volume 34, Issue 1, Pages -

Publisher

WILEY
DOI: 10.1002/cem.3197

Keywords

common and distinct variation; concave penalty; data fusion; mixed data types

Funding

  1. China Scholarship Council [201504910809]

Multiple sets of measurements on the same objects, obtained from different platforms, may reflect partially complementary information about the studied system. The integrative analysis of such data sets offers the opportunity for a deeper understanding of the studied system, but it also introduces new statistical challenges. First, it is difficult to separate the information that is common across all or some of the data sets from the information that is specific to each data set. Second, these data sets are often a mix of quantitative and discrete (binary or categorical) data types, whereas commonly used data fusion methods require all data sets to be quantitative. In this paper, we propose an exponential family simultaneous component analysis (ESCA) model to handle multiple data sets of potentially mixed data types. In addition, a structured sparse pattern of the loading matrix is induced through a nearly unbiased group concave penalty to disentangle the global common, local common, and distinct information of the multiple data sets. A Majorization-Minimization-based algorithm is derived to fit the proposed model. Analytic solutions are derived for updating all the parameters of the model in each iteration, and the algorithm decreases the objective function monotonically at each iteration. For model selection, a missing value-based cross-validation procedure is implemented. The advantages of the proposed method over other approaches are assessed using comprehensive simulations as well as the analysis of real data from a chronic lymphocytic leukaemia (CLL) study.
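To make the idea of a structured sparse loading pattern concrete, the following minimal Python sketch shows how such a pattern could be read once a model has been fitted. It is an illustration under stated assumptions, not the authors' implementation: the function name `classify_components`, the row-wise stacking of the loading matrix by data set, and the zero threshold `tol` are all hypothetical. A component whose loading block is nonzero in every data set is labelled global, in some but not all data sets local, and in exactly one data set distinct.

```python
import numpy as np

def classify_components(loadings, block_sizes, tol=1e-8):
    """Label each component by its group-sparsity pattern (hypothetical helper).

    loadings    : (sum(block_sizes), R) array, rows stacked data set by data set
    block_sizes : number of variables in each data set
    tol         : group norms at or below this count as zero (assumed threshold)
    """
    loadings = np.asarray(loadings, dtype=float)
    if loadings.shape[0] != sum(block_sizes):
        raise ValueError("row count must equal the total number of variables")
    # Split the rows into one block of loadings per data set.
    edges = np.cumsum(block_sizes)[:-1]
    blocks = np.split(loadings, edges, axis=0)
    # Euclidean norm of every (data set, component) loading vector: shape (L, R).
    norms = np.stack([np.linalg.norm(b, axis=0) for b in blocks])
    active = norms > tol
    n_sets = len(block_sizes)
    labels = []
    for count in active.sum(axis=0):
        if count == 0:
            labels.append("empty")      # component dropped from every data set
        elif count == 1:
            labels.append("distinct")   # specific to a single data set
        elif count == n_sets:
            labels.append("global")     # shared by all data sets
        else:
            labels.append("local")      # shared by some, but not all, data sets
    return labels, active

# Toy example: three data sets with 2, 3, and 2 variables, four components.
B = np.zeros((7, 4))
B[:, 0] = 1.0      # nonzero in all three blocks
B[0:5, 1] = 1.0    # nonzero in the first two blocks only
B[5:7, 2] = 1.0    # nonzero in the third block only
labels, active = classify_components(B, [2, 3, 2])
print(labels)  # ['global', 'local', 'distinct', 'empty']
```

In this toy example the zeros are exact by construction. With an estimated loading matrix, the choice of `tol` would depend on how exactly the fitting procedure sets group loadings to zero.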
