A fast parallel clustering algorithm for molecular simulation trajectories

Journal

JOURNAL OF COMPUTATIONAL CHEMISTRY
Volume 34, Issue 2, Pages 95-104

Publisher

WILEY
DOI: 10.1002/jcc.23110

Keywords

molecular dynamics; clustering; triangle inequality; general-purpose computation on GPU; Markov state models

Funding

  1. Hong Kong Research Grants Council GRF [661011, F-HK29/11T, HKUST2/CRF/10, 619509]
  2. University Grants Council [SBI12SC01]
  3. NIH [R01-GM062868]

We implemented a GPU-powered parallel k-centers algorithm to cluster the conformations of molecular dynamics (MD) simulations. The algorithm is up to two orders of magnitude faster than the CPU implementation. We tested it on four protein MD simulation datasets ranging from the small Alanine Dipeptide to the 370-residue Maltose Binding Protein (MBP). It can group 250,000 conformations of MBP into 4000 clusters within 40 seconds. To achieve this, we effectively parallelized the code on the GPU and utilized the triangle inequality of metric spaces. Furthermore, the algorithm's running time is linear with respect to the number of cluster centers. In addition, we found the triangle inequality to be less effective in higher dimensions and provide a mathematical rationale. Finally, using Alanine Dipeptide as an example, we show a strong correlation between cluster populations resulting from the k-centers algorithm and the underlying density. © 2012 Wiley Periodicals, Inc.
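
To illustrate how the triangle inequality can prune distance computations in k-centers clustering, here is a minimal CPU sketch in Python with NumPy. It is not the authors' GPU implementation. It assumes plain Euclidean distance between points rather than a structural metric such as RMSD, and the function name `k_centers`, its parameters, and the `n_skipped` counter are hypothetical names chosen for illustration. The pruning rule: if a point x is assigned to center c, and the new center c' satisfies d(c, c') >= 2 d(x, c), then d(x, c') >= d(x, c), so the distance from x to c' need not be computed.

```python
import numpy as np

def k_centers(points, k, first=0):
    """Greedy k-centers with triangle-inequality pruning (illustrative sketch).

    Assumes Euclidean distance; returns center indices, per-point assignment
    (position in the center list), distance to the assigned center, and the
    number of point-to-center distance evaluations that were skipped.
    """
    n = len(points)
    centers = [first]
    # Distance of every point to the first center.
    dist = np.linalg.norm(points - points[first], axis=1)
    assign = np.zeros(n, dtype=int)
    n_skipped = 0

    while len(centers) < k:
        # The next center is the point farthest from its current center.
        new = int(np.argmax(dist))
        # Distances from the new center to all existing centers.
        d_cc = np.linalg.norm(points[centers] - points[new], axis=1)
        # Triangle inequality: a point can only move to the new center
        # if d(c, new) < 2 * d(x, c); otherwise skip the computation.
        need = d_cc[assign] < 2.0 * dist
        n_skipped += int(n - need.sum())
        idx = np.nonzero(need)[0]
        d_new = np.linalg.norm(points[idx] - points[new], axis=1)
        closer = d_new < dist[idx]
        dist[idx[closer]] = d_new[closer]
        assign[idx[closer]] = len(centers)
        centers.append(new)

    return np.array(centers), assign, dist, n_skipped


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.normal(size=(5000, 3))
    centers, assign, dist, skipped = k_centers(pts, 50)
    print("cluster radius:", dist.max())
    print("distance evaluations skipped:", skipped)
```

In this sketch each new center costs one pass over the points, so the total work grows linearly with the number of centers, and the pruning reduces how many point-to-center distances are evaluated per pass. On a GPU, the per-point distance evaluations inside each pass are the natural part to run in parallel.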