4.7 Article

Partitioning Models for General Medium-Grain Parallel Sparse Tensor Decomposition

期刊

出版社

IEEE COMPUTER SOC
DOI: 10.1109/TPDS.2020.3012624

关键词

Tensile stress; Program processors; Partitioning algorithms; Computational modeling; Sparse matrices; Load modeling; Minimization; sparse tensor; tensor decomposition; canonical polyadic decomposition; communication cost; communication volume; medium-grain partitioning; recursive bipartitioning; hypergraph partitioning; graph partitioning

资金

  1. Scientific and Technological Research Council of Turkey (TUBITAK) [EEEAG-116E043]
  2. U.S. Department of Energy's National Nuclear Security Administration [DE-NA-0003525]

向作者/读者索取更多资源

This article focuses on efficiently parallelizing the canonical polyadic decomposition algorithm for sparse tensors using the alternating least squares method on distributed-memory architectures. A hypergraph model is proposed for medium-grain partitioning without topological constraints, and a net amalgamation operation is introduced to minimize communication volume. A heuristic is presented to achieve slice coherency at (sub)slice level, and a medium-grain tripartite graph model is proposed for faster partitioning at the cost of increased communication volume.
The focus of this article is efficient parallelization of the canonical polyadic decomposition algorithm utilizing the alternating least squares method for sparse tensors on distributed-memory architectures. We propose a hypergraph model for general medium-grain partitioning which does not enforce any topological constraint on the partitioning. The proposed model is based on splitting the given tensor into nonzero-disjoint component tensors. Then a mode-dependent coarse-grain hypergraph is constructed for each component tensor. A net amalgamation operation is proposed to form a composite medium-grain hypergraph from these mode-dependent coarse-grain hypergraphs to correctly encapsulate the minimization of the communication volume. We propose a heuristic which splits the nonzeros of dense slices to obtain sparse slices in component tensors. So we partially attain slice coherency at (sub)slice level since partitioning is performed on (sub)slices instead of individual nonzeros. We also utilize the well-known recursive-bipartitioning framework to improve the quality of the splitting heuristic. Finally, we propose a medium-grain tripartite graph model with the aim of a faster partitioning at the expense of increasing the total communication volume. Parallel experiments conducted on 10 real-world tensors on up to 1024 processors confirm the validity of the proposed hypergraph and graph models.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.7
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据