Article

A learned embedding for efficient joint analysis of millions of mass spectra

Journal

NATURE METHODS
Volume 19, Issue 6, Pages 675-+

Publisher

NATURE PORTFOLIO

Funding

  1. National Institutes of Health [R01 GM121818]

This study proposes a computational method that trains a deep neural network in a supervised fashion on mass spectrometry data to enable large-scale identification of unknown spectra. The network embeds spectra in a lower-dimensional space in which spectra generated by the same peptide lie close to each other, so that groups of similar, unidentified spectra can be detected and identified.
Computational methods that aim to exploit publicly available mass spectrometry repositories rely primarily on unsupervised clustering of spectra. Here we trained a deep neural network in a supervised fashion on the basis of previous assignments of peptides to spectra. The network, called 'GLEAMS', learns to embed spectra in a low-dimensional space in which spectra generated by the same peptide are close to one another. We applied GLEAMS for large-scale spectrum clustering, detecting groups of unidentified, proximal spectra representing the same peptide. We used these clusters to explore the dark proteome of repeatedly observed yet consistently unidentified mass spectra.

GLEAMS, a deep learning-based algorithm, embeds mass spectra such that spectra related to the same peptide are close to each other, enabling unknown spectra to be identified on a massive scale.
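
The idea of an embedding in which same-peptide spectra sit close together can be illustrated with a small sketch. The abstract does not say which training objective or clustering algorithm was used, so everything below is an assumption made for illustration: the pairwise contrastive loss, the `margin` parameter, the distance threshold, and the function names `contrastive_loss` and `cluster_by_distance` are all hypothetical, and the toy 2-D vectors stand in for learned embeddings.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(a, b, same_peptide, margin=1.0):
    """Pairwise contrastive loss (assumed objective, not taken from the text).

    Same-peptide pairs are penalized by their squared distance; other pairs
    are penalized only while they sit closer together than `margin`.
    """
    d = euclidean(a, b)
    if same_peptide:
        return d ** 2
    return max(0.0, margin - d) ** 2


def cluster_by_distance(embeddings, threshold):
    """Group embeddings linked by distances below `threshold` (single linkage)."""
    parent = list(range(len(embeddings)))

    def find(i):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(embeddings)), 2):
        if euclidean(embeddings[i], embeddings[j]) < threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(embeddings)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy 2-D embeddings: indices 0-1 and 2-3 each mimic spectra from one peptide.
toy = [(0.0, 0.0), (0.1, 0.0), (2.0, 2.0), (2.0, 2.1)]
clusters = cluster_by_distance(toy, threshold=0.5)
```

On the toy data, `clusters` groups indices 0 and 1 together and indices 2 and 3 together. The all-pairs loop is quadratic in the number of spectra, so a repository-scale analysis would presumably need an approximate nearest-neighbor index or a similar shortcut; that part is outside this sketch.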
