Journal
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY
Volume 68, Issue -, Pages 457-476
Publisher
WILEY
DOI: 10.1111/j.1467-9868.2006.00549.x
Keywords
discriminant analysis; gene expression data; regression analysis; sign eigenanalysis; unsupervised learning
The purpose of the paper is to present a new statistical approach to hierarchical cluster analysis with n objects measured on p variables. Motivated by the model of multivariate analysis of variance and the method of maximum likelihood, a clustering problem is formulated as a least squares optimization problem, simultaneously solving for both an n-vector of unknown group membership of objects and a linear clustering function. This formulation is shown to be linked to linear regression analysis and Fisher linear discriminant analysis and includes principal component regression for tackling multicollinearity or rank deficiency, polynomial or B-splines regression for handling non-linearity and various variable selection methods to eliminate irrelevant variables from data analysis. Algorithmic issues are investigated by using sign eigenanalysis.
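The least squares formulation in the abstract can be illustrated for a single two-group split. What follows is a minimal sketch under stated assumptions, not the paper's algorithm: group membership is coded as a vector z with entries ±1, the linear clustering function is Xb on column-centered data, and a simple alternating scheme stands in for the sign eigenanalysis, whose details the abstract does not give. The names `sign_ls_cluster` and `n_iter` are hypothetical. For fixed z the update of b is an ordinary regression of the labels on the variables, which for two groups is known to give a direction proportional to Fisher's linear discriminant; for fixed b the best ±1 labels are the signs of the fitted scores.

```python
import numpy as np

def sign_ls_cluster(X, n_iter=100):
    """Illustrative two-group split: alternately minimize ||z - Xc b||^2
    over b (least squares) and over z in {-1, +1}^n (sign of the score).
    This is an assumed stand-in, not the paper's sign eigenanalysis."""
    Xc = X - X.mean(axis=0)  # centering absorbs the intercept
    # Assumed initialization: sign of the first principal component score
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    z = np.where(U[:, 0] >= 0, 1.0, -1.0)
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        # Regression step: fit the linear clustering function to the labels
        b, *_ = np.linalg.lstsq(Xc, z, rcond=None)
        # Membership step: relabel each object by the sign of its fitted score
        z_new = np.where(Xc @ b >= 0, 1.0, -1.0)
        if np.array_equal(z_new, z):
            break  # labels stopped changing
        z = z_new
    return z, b
```

Applying such a split recursively to each resulting group would give a hierarchy of the kind the abstract refers to. Because the design matrix enters only through a least squares fit, it could in principle be replaced by principal component scores or a spline basis, in line with the extensions the abstract lists.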