4.7 Article

Learned Reconstruction of Protein Folding Trajectories from Noisy Single-Molecule Time Series

Journal

JOURNAL OF CHEMICAL THEORY AND COMPUTATION
Volume 19, Issue 14, Pages 4654-4667

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acs.jctc.2c00920

Single-molecule Förster resonance energy transfer (smFRET) is used to track the real-time dynamics of molecules. Takens' Delay Embedding Theorem guarantees that the atomistic dynamics of a system can be represented by a time-delayed embedding of scalar observables. A method called Single-molecule TAkens Reconstruction (STAR) learns the transformation between atomic coordinates and the delay-embedded distances accessible to smFRET. STAR has previously been applied to reconstruct molecular configurations with high accuracy from noise-free simulated data. In this work, the roles of signal-to-noise ratio, data volume, and time resolution in simulated smFRET data are investigated to assess the performance of STAR under experimental conditions.
Single-molecule Förster resonance energy transfer (smFRET) is an experimental methodology to track the real-time dynamics of molecules using fluorescent probes to follow one or more intramolecular distances. These distances provide a low-dimensional representation of the full atomistic dynamics. Under mild technical conditions, Takens' Delay Embedding Theorem guarantees that the full three-dimensional atomistic dynamics of a system are diffeomorphic (i.e., related by a smooth and invertible transformation) to a time-delayed embedding of one or more scalar observables. Appealing to these theoretical guarantees, we employ manifold learning, artificial neural networks, and statistical mechanics to learn from molecular simulation training data the a priori unknown transformation between the atomic coordinates and delay-embedded intramolecular distances accessible to smFRET. This learned transformation may then be used to reconstruct atomistic coordinates from smFRET time series data. We term this approach Single-molecule TAkens Reconstruction (STAR). We have previously applied STAR to reconstruct molecular configurations of a C24H50 polymer chain and the mini-protein Chignolin with accuracies better than 0.2 nm from simulated smFRET data under noise-free and high-time-resolution conditions. In the present work, we investigate the roles of signal-to-noise ratio, data volume, and time resolution in simulated smFRET data to assess the performance of STAR under conditions more representative of experimental realities. We show that STAR can reconstruct the Chignolin and Villin mini-proteins to accuracies of 0.12 and 0.42 nm, respectively, and we place bounds on these conditions for accurate reconstructions. These results demonstrate that it is possible to reconstruct dynamical trajectories of protein folding from time series of noisy, time-binned, experimentally measurable observables, and they lay the foundations for the application of STAR to real experimental data.
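
To make the idea of a time-delayed embedding concrete, the following is a minimal Python sketch, not the authors' implementation. It builds delay vectors from a single scalar distance trace after degrading it by time binning and additive noise, the two experimental effects named in the abstract. The function names `delay_embed` and `bin_and_add_noise`, the sinusoidal toy trace, the Gaussian noise model, and all parameter values (embedding dimension, lag, bin width, noise level) are assumptions chosen for illustration; the abstract does not specify any of them.

```python
import math
import random


def delay_embed(series, dim, lag):
    """Return delay vectors [x(t), x(t - lag), ..., x(t - (dim - 1) * lag)]."""
    if dim < 1 or lag < 1:
        raise ValueError("dim and lag must be positive integers")
    start = (dim - 1) * lag  # first index with a full history available
    return [
        [series[t - k * lag] for k in range(dim)]
        for t in range(start, len(series))
    ]


def bin_and_add_noise(series, bin_width, noise_std, rng):
    """Average consecutive samples into bins, then add Gaussian noise.

    Gaussian noise is an illustrative assumption, not the paper's noise model.
    """
    n_bins = len(series) // bin_width  # any incomplete trailing bin is dropped
    binned = []
    for b in range(n_bins):
        chunk = series[b * bin_width:(b + 1) * bin_width]
        mean = sum(chunk) / bin_width
        binned.append(mean + rng.gauss(0.0, noise_std))
    return binned


# Toy stand-in for an intramolecular distance trace in nm (not real smFRET data).
distance = [1.0 + 0.5 * math.sin(0.05 * t) for t in range(1000)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
observed = bin_and_add_noise(distance, bin_width=10, noise_std=0.05, rng=rng)
embedded = delay_embed(observed, dim=3, lag=5)

print(len(observed), len(embedded), len(embedded[0]))  # prints: 100 90 3
```

Each row of `embedded` is one point in the delay-embedded space. In an approach like STAR, a learned map would take such points to atomistic coordinates; that learning step is outside the scope of this sketch.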
