Article

Locality-sensitive hashing enables efficient and scalable signal classification in high-throughput mass spectrometry raw data

Journal

BMC BIOINFORMATICS
Volume 23, Issue 1, Pages -

Publisher

BMC
DOI: 10.1186/s12859-022-04833-5

Keywords

Mass spectrometry; Locality-sensitive hashing; Signal processing

Funding

  1. Deutsche Forschungsgemeinschaft (DFG) [329350978]
  2. Bundesministerium für Bildung und Forschung (BMBF) [031L0217A/B]
  3. Projekt DEAL

This study shows that locality-sensitive hashing can classify signals in mass spectrometry raw data at scale. Choosing the algorithm parameters appropriately balances false-positive and false-negative rates, and on synthetic data the approach outperformed intensity thresholding. On real data, it reduced the data volume strongly without losing relevant information.
Background: Mass spectrometry is an important experimental technique in the field of proteomics. However, analysis of certain mass spectrometry data faces a combination of two challenges: first, even a single experiment produces a large amount of multi-dimensional raw data and, second, signals of interest are not single peaks but patterns of peaks that span the different dimensions. The rapidly growing amount of mass spectrometry data increases the demand for scalable solutions. Furthermore, existing approaches for signal detection usually rely on strong assumptions concerning the signals' properties.

Results: In this study, it is shown that locality-sensitive hashing enables signal classification in mass spectrometry raw data at scale. Through appropriate choice of algorithm parameters, it is possible to balance false-positive and false-negative rates. On synthetic data, superior performance compared to an intensity thresholding approach was achieved. Real data could be strongly reduced without losing relevant information. Our implementation scaled out up to 32 threads and supports acceleration by GPUs.

Conclusions: Locality-sensitive hashing is a desirable approach for signal classification in mass spectrometry raw data.

Availability: Generated data and code are available at https://github.com/hildebrandtlab/mzBucket. Raw data is available at https://zenodo.org/record/5036526.
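
To illustrate how algorithm parameters can trade false positives against false negatives, the following minimal sketch uses sign random projections, one common locality-sensitive hash family. The abstract does not state which hash family or parameters the authors use, so everything here is an assumption for illustration: the function names, the parameters k (hash bits combined per table) and l (number of tables), and the use of cosine similarity are hypothetical and are not taken from mzBucket.

```python
import math
import random


def make_tables(dim, k, l, seed=0):
    """Draw l tables of k random Gaussian hyperplanes each (hypothetical setup)."""
    rng = random.Random(seed)
    return [[[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(k)]
            for _ in range(l)]


def signature(vec, planes):
    """k-bit key for one table: the sign of the projection onto each hyperplane."""
    return tuple(sum(w * x for w, x in zip(plane, vec)) >= 0.0
                 for plane in planes)


def collide(a, b, tables):
    """Two vectors count as similar if their keys match in at least one table."""
    return any(signature(a, t) == signature(b, t) for t in tables)


def bit_agreement(cos_sim):
    """Probability that one random hyperplane puts both vectors on the same side."""
    clipped = max(-1.0, min(1.0, cos_sim))
    return 1.0 - math.acos(clipped) / math.pi


def collision_probability(cos_sim, k, l):
    """Probability of a match in at least one of l tables of k bits each."""
    p = bit_agreement(cos_sim)
    return 1.0 - (1.0 - p ** k) ** l


if __name__ == "__main__":
    # Larger k suppresses chance collisions of dissimilar windows (fewer false
    # positives); larger l recovers similar windows that a single table would
    # miss (fewer false negatives).
    for k, l in [(4, 1), (16, 1), (16, 8)]:
        similar = collision_probability(0.95, k, l)
        dissimilar = collision_probability(0.20, k, l)
        print(f"k={k:2d} l={l:2d} similar={similar:.3f} dissimilar={dissimilar:.3f}")
```

In this sketch, raising k makes each table stricter, while raising l gives similar vectors more chances to collide, so the two parameters together shape the balance between false-positive and false-negative rates.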
