4.5 Article

Generalized Principal Component Analysis: Projection of Saturated Model Parameters

Journal

TECHNOMETRICS
Volume 62, Issue 4, Pages 459-472

Publisher

AMER STATISTICAL ASSOC
DOI: 10.1080/00401706.2019.1668854

Keywords

Binary data; Count data; Dimensionality reduction; Exponential family; Low rank model

Principal component analysis (PCA) is very useful for a wide variety of data analysis tasks, but its implicit connection to the Gaussian distribution can be undesirable for discrete data such as binary and multi-category responses or counts. We generalize PCA to handle various types of data using the generalized linear model framework. In contrast to the existing approach of matrix factorizations for exponential family data, our generalized PCA provides low-rank estimates of the natural parameters by projecting the saturated model parameters. This difference in formulation leads to two favorable properties: the number of parameters does not grow with the sample size, and simple matrix multiplication suffices for computation of the principal component scores on new data. A practical algorithm that can incorporate missing data and case weights is developed for finding the projection matrix. Examples on simulated and real count data show the improvement of generalized PCA over standard PCA for matrix completion, visualization, and collaborative filtering. Supplementary material for this article is available online.
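
The projection idea in the abstract can be illustrated with a minimal Python sketch for Poisson counts. Everything named here is an assumption for illustration, not the article's code: the function names, the truncation constant m that stands in for log(0), and above all the way the projection matrix U is obtained. The article finds U with a dedicated deviance-minimizing algorithm; the sketch swaps in ordinary PCA on the saturated natural parameters only so that it stays short. What it does show is the property stated above: once the column centers and U are fixed, scores for new rows come from a single matrix multiplication.

```python
import numpy as np

def saturated_poisson_params(X, m=5.0):
    """Saturated-model natural parameters for Poisson data: log(x).

    log(0) is -infinity, so zeros are mapped to -m (m is an assumed
    truncation constant, chosen arbitrarily for this sketch).
    """
    X = np.asarray(X, dtype=float)
    theta = np.full(X.shape, -m)
    pos = X > 0
    theta[pos] = np.log(X[pos])
    return theta

def fit_projection(X, k, m=5.0):
    """Return column centers mu and a d x k orthonormal matrix U.

    Stand-in only: U is taken from the SVD of the centered saturated
    parameters (plain PCA), not from deviance minimization.
    """
    theta = saturated_poisson_params(X, m)
    mu = theta.mean(axis=0)
    _, _, Vt = np.linalg.svd(theta - mu, full_matrices=False)
    U = Vt[:k].T
    return mu, U

def scores(X_new, mu, U, m=5.0):
    """Principal component scores for new rows: one matrix product."""
    return (saturated_poisson_params(X_new, m) - mu) @ U

rng = np.random.default_rng(0)
X_train = rng.poisson(3.0, size=(50, 6))   # simulated counts
X_new = rng.poisson(3.0, size=(4, 6))      # rows not seen during fitting

mu, U = fit_projection(X_train, k=2)
Z_new = scores(X_new, mu, U)

# Low-rank estimates of the natural parameters for the new rows.
theta_hat = mu + Z_new @ U.T
```

Note that the fitted quantities are only mu (length d) and U (d by k), so their size does not depend on the number of rows in the training data.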
