Article

A MapReduce based parallel SVM for large-scale predicting protein-protein interactions

Journal

NEUROCOMPUTING
Volume 145, Pages 37-43

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2014.05.072

Keywords

Protein-protein interaction; MapReduce; Support vector machine; Protein sequence; Autocorrelation descriptor

Funding

  1. National Natural Science Foundation of China (NSFC) [61102119, 61373086, 61103075, 61170326]
  2. China Postdoctoral Science Foundation [2012M520929]

Protein-protein interactions (PPIs) are crucial to most biochemical processes, including metabolic cycles, DNA transcription and replication, and signaling cascades. Although high-throughput experimental techniques have generated a large amount of PPI data for different species, the number of known interactions is still small compared to the total number of possible PPIs. Furthermore, experimental methods for identifying PPIs are both time-consuming and expensive. Therefore, it is urgent and challenging to develop automated computational methods that predict PPIs efficiently and accurately. In this article, we propose a novel MapReduce-based parallel SVM model for large-scale prediction of protein-protein interactions using only protein sequence information. First, local sequential features, represented by an autocorrelation descriptor, are extracted from protein sequences. Then the MapReduce framework is employed to train support vector machine (SVM) classifiers in a distributed way, significantly reducing training time while maintaining a high level of accuracy. The experimental results demonstrate that the proposed parallel algorithms not only can handle large-scale PPI datasets but also perform well in terms of speedup and accuracy. Consequently, the proposed approach can be considered a promising and powerful new tool for large-scale PPI prediction, with excellent performance and reduced running time. (C) 2014 Elsevier B.V. All rights reserved.
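
The two steps named in the abstract (autocorrelation features from sequences, then distributed SVM training) can be illustrated with a minimal, single-machine sketch. Everything below is an assumption made for illustration and not the authors' implementation: the abstract does not say which autocorrelation variant, physicochemical properties, SVM kernel, or model-combination scheme the paper uses. Here a plain auto-covariance over the Kyte-Doolittle hydropathy scale stands in for the descriptor, a small linear SVM trained by sub-gradient descent stands in for the classifier, and averaging the per-partition weights stands in for the reduce step. All function names are hypothetical.

```python
from functools import reduce

# Kyte-Doolittle hydropathy index; the choice of property scale is an assumption.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def autocovariance(seq, max_lag=3, scale=HYDROPATHY):
    """Auto-covariance of one residue property along a sequence, lags 1..max_lag."""
    values = [scale[aa] for aa in seq]
    n = len(values)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    mean = sum(values) / n
    centered = [v - mean for v in values]
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / (n - lag)
        for lag in range(1, max_lag + 1)
    ]


def pair_features(seq_a, seq_b, max_lag=3):
    """Feature vector for a protein pair: the two descriptors concatenated."""
    return autocovariance(seq_a, max_lag) + autocovariance(seq_b, max_lag)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(samples, epochs=200, lam=0.01, lr=0.1):
    """Full-batch sub-gradient descent on the L2-regularised hinge loss.

    samples is a list of (feature_vector, label) with labels in {-1, +1}.
    """
    dim = len(samples[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [lam * wi for wi in w], 0.0
        for x, y in samples:
            if y * (dot(w, x) + b) < 1:  # sample violates the margin
                grad_w = [g - y * xi / len(samples) for g, xi in zip(grad_w, x)]
                grad_b -= y / len(samples)
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def map_partition(partition):
    """Map step: train one SVM on one data partition."""
    w, b = train_linear_svm(partition)
    return w, b, 1


def reduce_models(acc, model):
    """Reduce step: accumulate weights so they can be averaged afterwards."""
    (w_sum, b_sum, n), (w, b, m) = acc, model
    return [a + c for a, c in zip(w_sum, w)], b_sum + b, n + m


def parallel_train(partitions):
    """Run map and reduce locally; a cluster would run each map on its own worker."""
    w_sum, b_sum, n = reduce(reduce_models, map(map_partition, partitions))
    return [wi / n for wi in w_sum], b_sum / n


def predict(model, x):
    w, b = model
    return 1 if dot(w, x) + b >= 0 else -1
```

In this sketch each partition would hold `(pair_features(seq_a, seq_b), label)` tuples, and `parallel_train` returns a single averaged model. Weight averaging only makes sense for linear models; a kernel SVM would need a different reduce step, for example merging support vectors across partitions.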
