4.7 Review

Using metagenomic data to boost protein structure prediction and discovery

Related references

Note: Only a subset of the references is listed here; download the original article for the complete reference information.
Article Biochemistry & Molecular Biology

Improving protein tertiary structure prediction by deep learning and distance prediction in CASP14

Jian Liu et al.

Summary: Significant progress has been made in protein structure prediction by utilizing deep learning and residue-residue distance prediction since CASP13. The MULTICOM predictor in the 2020 CASP14 experiment ranked well in both tertiary structure prediction and inter-domain structure prediction, showing improvement in template-free modeling and overall performance.

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS (2022)

Article Microbiology

Identification of Natural CRISPR Systems and Targets in the Human Microbiome

Philipp C. Muench et al.

Summary: Utilizing metagenomics, this study provides insights into the diversity of CRISPR loci and cas genes in the human microbiome. The taxonomy of spacer sequences mirrors that of their source community, with functional targets enriched for viral elements. Furthermore, the study demonstrates that CRISPR-Cas subtypes exhibit high site and taxon specificity.

CELL HOST & MICROBE (2021)

Review Biochemistry & Molecular Biology

Machine learning in protein structure prediction

Mohammed AlQuraishi

Summary: Prediction of protein structure from sequence has made significant progress in the past two years, driven by the increasing use of neural networks in structure prediction pipelines. These neural networks have optimized the previous energy models and sampling procedures, resulting in algorithms that can now predict protein structures with a median accuracy of 2.1 angstroms.

CURRENT OPINION IN CHEMICAL BIOLOGY (2021)

Article Biochemistry & Molecular Biology

Massive expansion of human gut bacteriophage diversity

Luis F. Camarillo-Guerrero et al.

Summary: The study reveals the diversity of viruses in the human gut and gene flow networks between different bacterial species, as well as the globally distributed viral populations and a highly prevalent phage clade reminiscent of p-crAssphage.

Review Biochemical Research Methods

Metagenomic tools in microbial ecology research

Neslihan Tas et al.

Summary: The ability to directly sequence DNA from the environment has revolutionized microbial ecology, with metagenomics providing new insights into microbial diversity and function. However, challenges remain in annotating functions and assembling genomes in heterogeneous samples. The development of new analysis and sequencing platforms will further enhance our understanding of microbial taxonomy, function, ecology, and evolution in the environment.

CURRENT OPINION IN BIOTECHNOLOGY (2021)

Article Multidisciplinary Sciences

Decoding the link of microbiome niches with homologous sequences enables accurately targeted protein structure prediction

Pengshuo Yang et al.

Summary: The study shows that using deep learning techniques to extract information from metagenome sequences significantly improves the accuracy of template-free protein structure modeling. The Meta-Source model, based on large-scale microbiome sequences, reveals an inherent linkage between microbial niches and protein homologous families. Compared to using combined metagenome datasets, a microbiome-targeted approach with individual Meta-Source biomes requires fewer computational resources and generates more accurate structure models.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2021)

Article Microbiology

Evaluation of CRISPR Diversity in the Human Skin Microbiome for Personal Identification

Kochi Toyomane et al.

Summary: This study found that CRISPR spacer sequences in the human skin microbiome are highly personalized and may be used for personal identification, but their forensic relevance is still unclear.

MSYSTEMS (2021)

Article Fisheries

European eel (Anguilla anguilla) GI tract conserves a unique metagenomics profile in the recirculation aquaculture system (RAS)

Md. Shahdat Hossain et al.

Summary: The study on gut microbial communities in European eels revealed that eels from different farms harbor distinct gut microbiota, potentially influenced by both rearing conditions and host physiology. Additionally, the gut bacterial community of eels was significantly impacted by the microbiota of supplied feed and contiguous water sources.

AQUACULTURE INTERNATIONAL (2021)

Article Microbiology

DOE JGI Metagenome Workflow

Alicia Clum et al.

Summary: The DOE JGI Metagenome Workflow processes metagenomic data sets by performing assembly, annotation, and binning. It has been used on thousands of samples and helps researchers interpret metagenome data and apply the workflow to their own data.

MSYSTEMS (2021)

Article Biochemistry & Molecular Biology

A global metagenomic map of urban microbiomes and antimicrobial resistance

David Danko et al.

Summary: This study establishes a global metagenomic atlas of urban microbial ecosystems, revealing a vast number of unknown microbial species and genetic elements, highlighting the distribution of antibiotic resistance genes in cities, and indicating the influence of geographical and climatic characteristics on urban microbial composition.

Article Multidisciplinary Sciences

Highly accurate protein structure prediction with AlphaFold

John Jumper et al.

Summary: Proteins are essential for life, and accurate prediction of their structures is a crucial research problem. Current experimental methods are time-consuming, highlighting the need for accurate computational approaches to address the gap in structural coverage. Despite recent progress, existing methods fall short of atomic accuracy in protein structure prediction.

NATURE (2021)

Article Biochemistry & Molecular Biology

High-accuracy protein structure prediction in CASP14

Joana Pereira et al.

Summary: The application of state-of-the-art deep-learning approaches to the protein modeling problem expanded the high-accuracy category in CASP14. The assessment evaluates the performance of different groups and introduces new metrics. Alongside the significant progress made by AlphaFold2, the second-best method in CASP14 also outperformed the best method in CASP13, demonstrating the role of community-based benchmarking in the development of the protein structure prediction field.

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS (2021)

Article Biochemistry & Molecular Biology

Protein structure prediction using deep learning distance and hydrogen-bonding restraints in CASP14

Wei Zheng et al.

Summary: In this article, 3D structure prediction results by two top server groups (Zhang-Server and QUARK) in CASP14 were reported, with significant improvements achieved by introducing new components, particularly the newly added spatial restraints from DeepPotential and the well-tuned force field. However, challenges remain in modeling multi-domain proteins and protein domains from oligomer complexes, suggesting the need for further adjustments in deep learning-based predictors to address these issues.

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS (2021)

Review Biochemistry & Molecular Biology

Protein sequence-to-structure learning: Is this the end(-to-end revolution)?

Elodie Laine et al.

Summary: Deep learning has proven to be highly successful in protein structure prediction, particularly in CASP14, where it reached near-experimental accuracy levels. Novel approaches such as geometric learning, pretrained protein language models leveraging attention, equivariant architectures, and the use of large metagenome databases have contributed to this advancement.

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS (2021)

Article Multidisciplinary Sciences

Accurate prediction of protein structures and interactions using a three-track neural network

Minkyung Baek et al.

Summary: Through the three-track network, we achieved accuracies approaching those of DeepMind in CASP14, enabling rapid solution of challenging x-ray crystallography and cryo-electron microscopy structure modeling problems, and providing insights into the functions of proteins with currently unknown structure.

SCIENCE (2021)

Article Microbiology

Integrating Viral Metagenomics into an Ecological Framework

Pacifica Sommers et al.

Summary: Viral metagenomics within an ecological framework focuses on understanding the ecology of uncultured viruses, exploring the interactions between viruses and their surroundings, and providing a structure for studying the diversity, distribution, and interactions of viruses with ecosystems and abiotic factors.

ANNUAL REVIEW OF VIROLOGY, VOL 8 (2021)

Article Microbiology

Metagenomic compendium of 189,680 DNA viruses from the human gut microbiome

Stephen Nayfach et al.

Summary: By mining deposited human stool metagenomes, nearly 190,000 draft-quality DNA virus genomes were recovered to create the Metagenomic Gut Virus catalogue, improving virus detection in stool metagenomes and revealing diverse retroelements with potential involvement in the molecular arms race between phages and their bacterial hosts.

NATURE MICROBIOLOGY (2021)

Review Biochemistry & Molecular Biology

A roadmap for metagenomic enzyme discovery

Serina L. Robinson et al.

Summary: Metagenomics has generated vast amounts of sequencing data revealing the biosynthetic potential of uncultivated microbes. Despite the availability of genome-resolved information on microbial communities from various environments, accurately predicting biocatalytic functions from sequencing data remains challenging, particularly for enzymes involved in secondary metabolism which are crucial for discovering new enzymology.

NATURAL PRODUCT REPORTS (2021)

Article Biochemistry & Molecular Biology

The IMG/M data management and analysis system v.6.0: new tools and advanced capabilities

I-Min A. Chen et al.

Summary: The Integrated Microbial Genomes & Microbiomes system at the DOE's Joint Genome Institute contains annotated genome datasets and metagenome bins, with advanced search functions and a new statistical analysis tool available in IMG v 6.0. The updated web user interface includes a Help page and webinar tutorials to assist users in understanding and utilizing various IMG functions and tools in their research. New datasets have been processed with an extended prokaryotic annotation pipeline v.5, featuring expanded protein family assignments.

NUCLEIC ACIDS RESEARCH (2021)

Article Biochemistry & Molecular Biology

KEGG: integrating viruses and cellular organisms

Minoru Kanehisa et al.

Summary: KEGG is a curated resource integrating eighteen databases categorized into systems, genomic, chemical and health information, providing mapping tools for understanding cellular and organism-level functions from genome sequences and other molecular datasets. The network variation maps in the KEGG database show how different pathogens and environmental factors influence cellular signaling pathways.

NUCLEIC ACIDS RESEARCH (2021)

Article Biochemistry & Molecular Biology

Pfam: The protein families database in 2021

Jaina Mistry et al.

Summary: The Pfam database has recently added a large number of protein families and domains, made revisions for COVID-19 research, and introduced Pfam-B as a supplement. These updates and improvements can help researchers classify protein sequences more effectively and conduct related studies.

NUCLEIC ACIDS RESEARCH (2021)

Article Biochemistry & Molecular Biology

Genomes OnLine Database (GOLD) v.8: overview and updates

Supratim Mukherjee et al.

Summary: The Genomes OnLine Database (GOLD) is a manually curated collection of genome projects and their metadata, with over 1.17 million entries. Users can browse, search, and input project details in GOLD, ensuring accurate metadata documentation for analysis. The database also imports projects from public repositories to maintain a reference dataset for the scientific community.

NUCLEIC ACIDS RESEARCH (2021)

Article Biochemical Research Methods

Protein contact prediction using metagenome sequence data and residual neural networks

Qi Wu et al.

BIOINFORMATICS (2020)

Article Biochemical Research Methods

pydca v1.0: a comprehensive software for direct coupling analysis of RNA and protein sequences

Mehari B. Zerihun et al.

BIOINFORMATICS (2020)

Article Multidisciplinary Sciences

Improved protein structure prediction using potentials from deep learning

Andrew W. Senior et al.

NATURE (2020)

Article Multidisciplinary Sciences

Improved protein structure prediction using predicted interresidue orientations

Jianyi Yang et al.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2020)

Article Critical Care Medicine

Metagenomics Reveals a Core Macrolide Resistome Related to Microbiota in Chronic Respiratory Disease

Micheal Mac Aogain et al.

AMERICAN JOURNAL OF RESPIRATORY AND CRITICAL CARE MEDICINE (2020)

Review Biotechnology & Applied Microbiology

Genome editing with CRISPR-Cas nucleases, base editors, transposases and prime editors

Andrew V. Anzalone et al.

NATURE BIOTECHNOLOGY (2020)

Article Biotechnology & Applied Microbiology

MetaHMM: A webserver for identifying novel genes with specified functions in metagenomic samples

Balazs Szalkai et al.

GENOMICS (2019)

Review Genetics & Heredity

Clinical metagenomics

Charles Y. Chiu et al.

NATURE REVIEWS GENETICS (2019)

Article Biochemical Research Methods

ProteinNet: a standardized data set for machine learning of protein structure

Mohammed AlQuraishi

BMC BIOINFORMATICS (2019)

Article Biochemical Research Methods

Protein-level assembly increases protein sequence recovery from metagenomic samples manyfold

Martin Steinegger et al.

NATURE METHODS (2019)

Article Biochemistry & Molecular Biology

Deep-learning contact-map guided protein structure prediction in CASP13

Wei Zheng et al.

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS (2019)

Article Biochemistry & Molecular Biology

Prediction of interresidue contacts with DeepMetaPSICOV in CASP13

Shaun M. Kandathil et al.

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS (2019)

Article Biochemistry & Molecular Biology

NMR-assisted protein structure prediction with MELDxMD

James C. Robertson et al.

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS (2019)

Article Biochemistry & Molecular Biology

High-accuracy refinement using Rosetta in CASP13

Hahnbeom Park et al.

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS (2019)

Article Biochemistry & Molecular Biology

Ensembling multiple raw coevolutionary features with deep residual neural networks for contact-map prediction in CASP13

Yang Li et al.

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS (2019)

Review Biochemistry & Molecular Biology

Critical assessment of methods of protein structure prediction (CASP)-Round XIII

Andriy Kryshtafovych et al.

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS (2019)

Article Multidisciplinary Sciences

A metagenomic strategy for harnessing the chemical repertoire of the human microbiome

Yuki Sugimoto et al.

SCIENCE (2019)

Article Biochemical Research Methods

PconsC4: fast, accurate and hassle-free contact predictions

Mirco Michel et al.

BIOINFORMATICS (2019)

Review Immunology

The gut microbiome: Relationships with disease and opportunities for therapy

Juliana Durack et al.

JOURNAL OF EXPERIMENTAL MEDICINE (2019)

Article Biochemistry & Molecular Biology

CATH: expanding the horizons of structure-based functional annotations for genome sequences

Ian Sillitoe et al.

NUCLEIC ACIDS RESEARCH (2019)

Article Microbiology

Prediction of the intestinal resistome by a three-dimensional structure-based method

Etienne Ruppe et al.

NATURE MICROBIOLOGY (2019)

Article Biochemical Research Methods

MG-RAST version 4-lessons learned from a decade of low-budget ultra-high-throughput metagenome analysis

Folker Meyer et al.

BRIEFINGS IN BIOINFORMATICS (2019)

Article Biochemistry & Molecular Biology

Improved protein contact predictions with the MetaPSICOV2 server in CASP12

Daniel W. A. Buchan et al.

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS (2018)

Article Biochemistry & Molecular Biology

Critical assessment of methods of protein structure prediction (CASP)-Round XII

John Moult et al.

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS (2018)

Article Biotechnology & Applied Microbiology

Lignolytic-consortium omics analyses reveal novel genomes and pathways involved in lignin modification and valorization

Eduardo C. Moraes et al.

BIOTECHNOLOGY FOR BIOFUELS (2018)

Article Biochemistry & Molecular Biology

HMMER web server: 2018 update

Simon C. Potter et al.

NUCLEIC ACIDS RESEARCH (2018)

Review Physics, Multidisciplinary

Inverse statistical physics of protein sequences: a key issues review

Simona Cocco et al.

REPORTS ON PROGRESS IN PHYSICS (2018)

Article Multidisciplinary Sciences

Clustering huge protein sequence sets in linear time

Martin Steinegger et al.

NATURE COMMUNICATIONS (2018)

Article Biochemistry & Molecular Biology

Enhancing Evolutionary Couplings with Deep Convolutional Neural Networks

Yang Liu et al.

CELL SYSTEMS (2018)

Review Biotechnology & Applied Microbiology

Discovering novel hydrolases from hot environments

Roland Wohlgemuth et al.

BIOTECHNOLOGY ADVANCES (2018)

Article Multidisciplinary Sciences

Data Descriptor: Marine microbial metagenomes sampled across space and time

Steven J. Biller et al.

SCIENTIFIC DATA (2018)

Review Genetics & Heredity

Recent Advances in Function-based Metagenomic Screening

Tanyaradzwa Rodgers Ngara et al.

GENOMICS PROTEOMICS & BIOINFORMATICS (2018)

Article Multidisciplinary Sciences

New CRISPR-Cas systems from uncultivated microbes

David Burstein et al.

NATURE (2017)

Article Biochemistry & Molecular Biology

Uniclust databases of clustered and deeply annotated protein sequences and alignments

Milot Mirdita et al.

NUCLEIC ACIDS RESEARCH (2017)

Article Biochemistry & Molecular Biology

UniProt: the universal protein knowledgebase

Alex Bateman et al.

NUCLEIC ACIDS RESEARCH (2017)

Review Microbiology

Virus taxonomy in the age of metagenomics

Peter Simmonds et al.

NATURE REVIEWS MICROBIOLOGY (2017)

Article Multidisciplinary Sciences

Protein structure determination using metagenome sequence data

Sergey Ovchinnikov et al.

SCIENCE (2017)

Article Biochemical Research Methods

Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model

Sheng Wang et al.

PLOS COMPUTATIONAL BIOLOGY (2017)

Article Multidisciplinary Sciences

MetaCRAST: reference-guided extraction of CRISPR spacers from unassembled metagenomes

Abraham G. Moller et al.

Article Microbiology

Metagenomic Analysis of Bacterial Communities of Antarctic Surface Snow

Anna Lopatina et al.

FRONTIERS IN MICROBIOLOGY (2016)

Article Microbiology

Antibiotic resistance genes across a wide variety of metagenomes

David Fitzpatrick et al.

FEMS MICROBIOLOGY ECOLOGY (2016)

Review Gastroenterology & Hepatology

The gut microbiome in health and in disease

Andrew B. Shreiner et al.

CURRENT OPINION IN GASTROENTEROLOGY (2015)

Article Multidisciplinary Sciences

Structure and function of the global ocean microbiome

Shinichi Sunagawa et al.

SCIENCE (2015)

Article Multidisciplinary Sciences

Gut microbiome development along the colorectal adenoma-carcinoma sequence

Qiang Feng et al.

NATURE COMMUNICATIONS (2015)

Review Genetics & Heredity

Marine Metagenome as A Resource for Novel Enzymes

Amani D. Alma'abadi et al.

GENOMICS PROTEOMICS & BIOINFORMATICS (2015)

Article Biochemical Research Methods

UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches

Baris E. Suzek et al.

BIOINFORMATICS (2015)

Article Biochemical Research Methods

Metavir 2: new tools for viral metagenome comparison and assembled virome analysis

Simon Roux et al.

BMC BIOINFORMATICS (2014)

Article Biochemistry & Molecular Biology

A structural perspective of compensatory evolution

Dmitry N. Ivankov et al.

CURRENT OPINION IN STRUCTURAL BIOLOGY (2014)

Review Microbiology

Unravelling the structural and mechanistic basis of CRISPR-Cas systems

John van der Oost et al.

NATURE REVIEWS MICROBIOLOGY (2014)

Review Environmental Sciences

Marine metagenomics, a valuable tool for enzymes and bioactive compounds discovery

Rosalba Barone et al.

FRONTIERS IN MARINE SCIENCE (2014)

Article Biochemistry & Molecular Biology

Crass: identification and reconstruction of CRISPR from unassembled metagenomic data

Connor T. Skennerton et al.

NUCLEIC ACIDS RESEARCH (2013)

Article Multidisciplinary Sciences

Pediatric Fecal Microbiota Harbor Diverse and Novel Antibiotic Resistance Genes

Aimee M. Moore et al.

PLOS ONE (2013)

Article Multidisciplinary Sciences

Assessing the utility of coevolution-based residue-residue contact predictions in a sequence- and structure-rich era

Hetunandan Kamisetty et al.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2013)

Article Multidisciplinary Sciences

Metagenome-wide analysis of antibiotic resistance genes in a large cohort of human gut microbiota

Yongfei Hu et al.

NATURE COMMUNICATIONS (2013)

Article Biochemical Research Methods

HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment

Michael Remmert et al.

NATURE METHODS (2012)

Article Biochemistry & Molecular Biology

The sequence read archive: explosive growth of sequencing data

Yuichi Kodama et al.

NUCLEIC ACIDS RESEARCH (2012)

Article Multidisciplinary Sciences

The Shared Antibiotic Resistome of Soil Bacteria and Human Pathogens

Kevin J. Forsberg et al.

SCIENCE (2012)

Article Genetics & Heredity

Diverse CRISPRs Evolving in Human Microbiomes

Mina Rho et al.

PLOS GENETICS (2012)

Review Biotechnology & Applied Microbiology

Metagenomic Analyses: Past and Future Trends

Carola Simon et al.

APPLIED AND ENVIRONMENTAL MICROBIOLOGY (2011)

Article Multidisciplinary Sciences

Protein 3D Structure Computed from Evolutionary Sequence Variation

Debora S. Marks et al.

PLOS ONE (2011)

Article Multidisciplinary Sciences

Direct-coupling analysis of residue coevolution captures native contacts across many protein families

Faruck Morcos et al.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2011)

Article Biochemical Research Methods

Accelerated Profile HMM Searches

Sean R. Eddy

PLOS COMPUTATIONAL BIOLOGY (2011)

Article Biochemical Research Methods

Prodigal: prokaryotic gene recognition and translation initiation site identification

Doug Hyatt et al.

BMC BIOINFORMATICS (2010)

Article Multidisciplinary Sciences

Identification of direct residue contacts in protein-protein interaction by message passing

Martin Weigt et al.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2009)

Article Multidisciplinary Sciences

Functional Characterization of the Antibiotic Resistance Reservoir in the Human Microflora

Morten O. A. Sommer et al.

SCIENCE (2009)

Review Biochemistry & Molecular Biology

Sequencing breakthroughs for genomic ecology and evolutionary biology

Matthew E. Hudson

MOLECULAR ECOLOGY RESOURCES (2008)

Article Biochemistry & Molecular Biology

CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats

Ibtissem Grissa et al.

NUCLEIC ACIDS RESEARCH (2007)

Article Biochemical Research Methods

UniRef: comprehensive and non-redundant UniProt reference clusters

Baris E. Suzek et al.

BIOINFORMATICS (2007)

Article Multidisciplinary Sciences

Sampling the antibiotic resistome

VM D'Costa et al.

SCIENCE (2006)

Article Biotechnology & Applied Microbiology

Unusual microbial xylanases from insect guts

Y Brennan et al.

APPLIED AND ENVIRONMENTAL MICROBIOLOGY (2004)

Article Biochemistry & Molecular Biology

The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000

A Bairoch et al.

NUCLEIC ACIDS RESEARCH (2000)