Article

Fast Monte Carlo algorithms for matrices II: Computing a low-rank approximation to a matrix

Journal

SIAM JOURNAL ON COMPUTING
Volume 36, Issue 1, Pages 158-183

Publisher

SIAM PUBLICATIONS
DOI: 10.1137/S0097539704442696

Keywords

randomized algorithms; Monte Carlo methods; massive data sets; singular value decomposition

In many applications, the data consist of (or may be naturally formulated as) an m × n matrix A. It is often of interest to find a low-rank approximation to A, i.e., an approximation D to the matrix A of rank not greater than a specified rank k, where k is much smaller than m and n. Methods such as the singular value decomposition (SVD) may be used to find an approximation to A which is the best in a well-defined sense. These methods require memory and time which are superlinear in m and n; for many applications in which the data sets are very large this is prohibitive. Two simple and intuitive algorithms are presented which, when given an m × n matrix A, compute a description of a low-rank approximation D* to A, and which are qualitatively faster than the SVD. Both algorithms have provable bounds for the error matrix A − D*. For any matrix X, let ||X||_F and ||X||_2 denote its Frobenius norm and its spectral norm, respectively. In the first algorithm, c columns of A are randomly chosen. If the m × c matrix C consists of those c columns of A (after appropriate rescaling), then it is shown that, from C^T C, approximations to the top singular values and corresponding singular vectors may be computed. From the computed singular vectors, a description D* of the matrix A may be computed such that rank(D*) ≤ k and such that ||A − D*||_ξ² ≤ min_{D : rank(D) ≤ k} ||A − D||_ξ² + poly(k, 1/c) ||A||_F² holds with high probability for both ξ = 2, F. This algorithm may be implemented without storing the matrix A in random access memory (RAM), provided it can make two passes over the matrix stored in external memory and use O(cm + c²) additional RAM. The second algorithm is similar, except that it further approximates the matrix C by randomly sampling r rows of C to form an r × c matrix W. It thus has additional error, but it can be implemented in three passes over the matrix using only constant additional RAM. To achieve an additional error (beyond the best rank-k approximation) that is at most ε||A||_F², both algorithms take time which is polynomial in k, 1/ε, and log(1/δ), where δ > 0 is a failure probability; the first takes time linear in max(m, n) and the second takes time independent of m and n. Our bounds improve previously published results with respect to the rank parameter k for both the Frobenius and spectral norms. In addition, the proofs for the error bounds use a novel method that makes important use of matrix perturbation theory. The probability distribution over the columns of A and the rescaling are crucial features of the algorithms and must be chosen judiciously.
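As a concrete illustration of the first algorithm, the NumPy sketch below samples columns with probability proportional to their squared Euclidean norms, rescales them so that C C^T is an unbiased estimator of A A^T, and recovers the top-k singular structure from the small c × c matrix C^T C. This is a minimal sketch, not the paper's implementation: the function name is invented, and it works fully in memory, whereas the algorithm as described runs in two passes over a matrix stored externally.

```python
import numpy as np

def linear_time_low_rank(A, k, c, seed=None):
    """Sketch of the first (column-sampling) algorithm.

    Returns H_k (m x k), whose columns approximate the top-k left
    singular vectors of A; the rank-<=k approximation is then
    D* = H_k @ (H_k.T @ A).  Assumes c >= k and k <= rank(C).
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Column-sampling probabilities p_j = |A^(j)|^2 / |A|_F^2
    # (the norm-squared distribution used in this line of work).
    p = (A * A).sum(axis=0)
    p = p / p.sum()
    cols = rng.choice(n, size=c, replace=True, p=p)
    # Rescale each sampled column by 1/sqrt(c * p_j) so that
    # C @ C.T is an unbiased estimator of A @ A.T.
    C = A[:, cols] / np.sqrt(c * p[cols])
    # Top singular values/right singular vectors of C via the small
    # c x c matrix C^T C; this is where the cost savings come from.
    w, V = np.linalg.eigh(C.T @ C)        # eigenvalues in ascending order
    top = np.argsort(w)[::-1][:k]
    sigma = np.sqrt(np.maximum(w[top], 0.0))
    # Left singular vectors of C: h_t = C v_t / sigma_t.
    H_k = (C @ V[:, top]) / sigma
    return H_k
```

The second algorithm would additionally sample r rows of C (with an analogous rescaling) to form the r × c matrix W and extract approximate singular vectors from W instead of C, trading a larger additive error for constant additional RAM across three passes.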

Authors

Petros Drineas, Ravi Kannan, Michael W. Mahoney
