4.3 Article

SpecDB: A relational database for archiving biomolecular NMR spectral data

Journal

JOURNAL OF MAGNETIC RESONANCE
Volume 342, Issue -, Pages -

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.jmr.2022.107268

Keywords

Biomolecular NMR; Spectrum database; Machine learning; SQL

Funding

  1. National Institutes of Health [R01 GM120574, R35 GM141818]

Ask authors/readers for more resources

NMR is a valuable experimental tool for structural biologists to study biomolecules. The progress of machine learning highlights the importance of large, diverse, and reliable datasets in structural biology. By organizing NMR data and using machine learning methods, long-standing problems in NMR data processing and structure determination can be tackled.
NMR is a valuable experimental tool in the structural biologist's toolkit to elucidate the structures, functions, and motions of biomolecules. The progress of machine learning, particularly in structural biology, reveals the critical importance of large, diverse, and reliable datasets in developing new methods and understanding in structural biology and science more broadly. Biomolecular NMR research groups produce large amounts of data, and there is renewed interest in organizing these data to train new, sophisticated machine learning architectures and to improve biomolecular NMR analysis pipelines. The foundational data type in NMR is the free-induction decay (FID). There are opportunities to build sophisticated machine learning methods to tackle long-standing problems in NMR data processing, resonance assignment, dynamics analysis, and structure determination using NMR FIDs. Our goal in this study is to provide a lightweight, broadly available tool for archiving FID data as it is generated at the spectrometer, and grow a new resource of FID data and associated metadata. This study presents a relational schema for storing and organizing the metadata items that describe an NMR sample and FID data, which we call Spectral Database (SpecDB). SpecDB is implemented in SQLite and includes a Python software library providing a command-line application to create, organize, query, backup, share, and maintain the database. This set of software tools and database schema allow users to store, organize, share, and learn from NMR time domain data. SpecDB is freely available under an open source license at https://github.rpi.edu/ RPIBioinformatics/SpecDB. (C) 2022 The Authors. Published by Elsevier Inc.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.3
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available