Article

S-MPEC: Sparse Matrix Multiplication Performance Estimator on a Cloud Environment

Publisher

SPRINGER
DOI: 10.1007/s10586-021-03287-3

Keywords

Cloud computing; Instance recommendation; Sparse matrix multiplication; Apache Spark

In this paper, we propose a model called S-MPEC for predicting and optimizing the latency of sparse matrix multiplication (SPMM) tasks in distributed cloud environments using Apache Spark. By characterizing different distributed SPMM implementation methods and considering the characteristics and hardware specifications of the cloud, we establish an accurate prediction model that recommends the optimal implementation method. The experimental results show that users can expect a 44% reduction in latency compared to native SPMM implementations in Apache Spark.
Sparse matrix multiplication (SPMM) is widely used in machine learning algorithms. As SPMM applications on large-scale datasets become prevalent, executing SPMM jobs on an optimized setup has become very important. Execution environments for distributed SPMM tasks on cloud resources can be set up in diverse ways, depending on the input sparse datasets, the SPMM implementation method, and the choice of cloud instance type. In this paper, we propose S-MPEC, which predicts the latency of various SPMM tasks running on Apache Spark in distributed cloud environments. We first characterize various distributed SPMM implementations on Apache Spark. Considering these characteristics and the hardware specifications of the cloud, we propose unique features to build a gradient boosting (GB) regressor model with Bayesian optimization. The proposed S-MPEC model accurately predicts the latency of an arbitrary SPMM task and recommends an optimal implementation method. A thorough evaluation of the proposed system shows that a user can expect 44% lower latency for SPMM tasks compared with the native SPMM implementations in Apache Spark.
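
The predict-then-recommend idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the features (density, rows in thousands, core count, implementation id), the placeholder implementation ids 0 to 2, and the synthetic latency formula are all assumptions made for illustration. A tiny boosted-stump regressor stands in for a library GB regressor, and the Bayesian optimization step is omitted.

```python
from itertools import product

def fit_stump(X, residuals):
    """Return (feature, threshold, left_value, right_value) minimizing squared error."""
    best = None
    for j in range(len(X[0])):
        levels = sorted({row[j] for row in X})
        for lo, hi in zip(levels, levels[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lmean, rmean = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lmean, rmean)
    return best[1:]

def fit_gb(X, y, rounds=200, lr=0.5):
    """Gradient boosting for squared loss: each stump is fitted to the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        residuals = [t - p for t, p in zip(y, preds)]
        j, thr, lval, rval = fit_stump(X, residuals)
        stumps.append((j, thr, lval, rval))
        preds = [p + lr * (lval if row[j] <= thr else rval)
                 for row, p in zip(X, preds)]
    return base, lr, stumps

def predict(model, row):
    """Predicted latency for one feature row."""
    base, lr, stumps = model
    return base + sum(lr * (lval if row[j] <= thr else rval)
                      for j, thr, lval, rval in stumps)

def recommend(model, density, rows_k, cores, methods):
    """Pick the implementation id with the lowest predicted latency."""
    return min(methods, key=lambda m: predict(model, [density, rows_k, cores, m]))

# Hypothetical implementation ids and a synthetic latency function (assumptions).
METHODS = [0, 1, 2]
OFFSETS = {0: 40.0, 1: 10.0, 2: 25.0}

def synthetic_latency(density, rows_k, cores, method):
    return 200.0 * density + 0.5 * rows_k - 1.0 * cores + OFFSETS[method]

# Training set: a full grid over the hypothetical feature values.
X = [list(p) for p in product([0.01, 0.05, 0.1], [10, 50, 100], [4, 8, 16], METHODS)]
y = [synthetic_latency(*row) for row in X]

model = fit_gb(X, y)
print("recommended implementation id:", recommend(model, 0.05, 50, 8, METHODS))
```

In a real setting the training rows would come from measured Spark job latencies rather than a formula, and the implementation id would more naturally be one-hot encoded than treated as an ordinal number.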
