Article

Bayesian Joint Matrix Decomposition for Data Integration with Heterogeneous Noise

Journal

IEEE Transactions on Pattern Analysis and Machine Intelligence

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TPAMI.2019.2946370

Keywords

Matrix decomposition; Bayes methods; Data integration; Inference algorithms; Data models; Data mining; Gaussian distribution; Bayesian methods; variational Bayesian inference; maximum a posteriori

Funding

  1. National Natural Science Foundation of China [11661141019, 61621003]
  2. Strategic Priority Research Program of the Chinese Academy of Sciences (CAS) [XDB13040600]
  3. National Ten Thousand Talent Program for Young Top-notch Talents
  4. Key Research Program of the Chinese Academy of Sciences [KFZD-SW-219]
  5. National Key Research and Development Program of China [2017YFC0908405]
  6. CAS Frontier Science Research Key Project for Top Young Scientist [QYZDB-SSW-SYS008]

Abstract
Matrix decomposition is a popular and fundamental approach in machine learning and data mining, and it has been successfully applied in various fields. Most matrix decomposition methods focus on decomposing a data matrix from a single source. However, data often come from different sources with heterogeneous noise. A few matrix decomposition methods have been extended to such multi-view data integration and pattern discovery, but only a few of them explicitly model the heterogeneity of noise across views. To this end, in this article we propose a Bayesian joint matrix decomposition framework (BJMD), which models the heterogeneity of noise with Gaussian distributions in a Bayesian framework. We develop two algorithms to solve this model: one is a variational Bayesian inference algorithm, which makes full use of the posterior distribution; the other is a maximum a posteriori algorithm, which is more scalable and can be easily parallelized. Extensive experiments on synthetic and real-world datasets demonstrate that BJMD is superior or competitive to state-of-the-art methods.
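As a rough illustration of how view-specific Gaussian noise can enter such a joint decomposition, consider the following sketch; the notation (a shared basis W, view-specific coefficient matrices H_v, and per-view noise variances sigma_v^2) is illustrative and is not drawn from the paper itself. Writing X_v for the m-by-n_v data matrix from view v,

\[
X_v = W H_v + E_v, \qquad (E_v)_{ij} \sim \mathcal{N}(0, \sigma_v^2), \qquad v = 1, \dots, V,
\]

so that a maximum a posteriori estimate would minimize an objective of the form

\[
\sum_{v=1}^{V} \left[ \frac{1}{2\sigma_v^2} \lVert X_v - W H_v \rVert_F^2 + \frac{m n_v}{2} \log \sigma_v^2 \right] + \text{prior terms on } W \text{ and } H_v,
\]

where each view's reconstruction error is weighted by its own noise level; this is the basic mechanism by which heterogeneous noise shapes the fit. A variational Bayesian treatment would instead maintain approximate posterior distributions over the factors rather than point estimates.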

