Article

Impact of phylogeny on structural contact inference from protein sequence data

Related references

Note: Only some of the references are listed.
Article Multidisciplinary Sciences

Extracting phylogenetic dimensions of coevolution reveals hidden functional signals

Alexandre Colavin et al.

Summary: This study introduces a background model that separates the coevolution signal between different phylogenetic clades from the signal within a single clade, and demonstrates that coevolution can be measured at multiple timescales within a protein. Applying the nested coevolution (NC) method, the study shows the importance of poorly conserved residues in protein function and improves the accuracy of structural-contact prediction and functional sector detection.

SCIENTIFIC REPORTS (2022)

Article Biochemical Research Methods

Correlations from structure and phylogeny combine constructively in the inference of protein partners from sequences

Andonis Gerardos et al.

Summary: Inferring protein-protein interactions from sequences is an important task in computational biology. This study shows that correlations from structural contacts and from phylogeny can combine constructively to improve partner inference using direct coupling analysis (DCA) or mutual information (MI). Furthermore, the signal from phylogeny can rescue partner inference when the signal from contacts becomes less informative, and restricting the analysis to non-contact pairs of sites preserves inference performance.

PLOS COMPUTATIONAL BIOLOGY (2022)
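
As an illustration of the MI score mentioned in the summary above, the following is a minimal sketch of the plug-in mutual information between two alignment columns. It is not the pipeline used in the cited study: the function name `mutual_information`, the toy columns, and the absence of pseudocounts or sequence reweighting are all assumptions made here for brevity.

```python
import math
from collections import Counter


def mutual_information(col_i, col_j):
    """Plug-in mutual information (in nats) between two alignment columns.

    col_i and col_j are equal-length sequences of residue symbols, one
    symbol per aligned sequence. Hypothetical helper for illustration:
    no pseudocounts and no phylogenetic reweighting are applied.
    """
    if len(col_i) != len(col_j):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_i)
    counts_i = Counter(col_i)                # single-site counts at site i
    counts_j = Counter(col_j)                # single-site counts at site j
    counts_ij = Counter(zip(col_i, col_j))   # joint counts for the pair
    mi = 0.0
    for (a, b), c in counts_ij.items():
        p_ab = c / n
        # p_ab / (p_a * p_b) simplifies to c * n / (counts_i[a] * counts_j[b])
        mi += p_ab * math.log(c * n / (counts_i[a] * counts_j[b]))
    return mi


# Toy columns (assumed data): perfectly covarying vs. independent sites.
covarying = mutual_information("AAGG", "CCTT")
independent = mutual_information("AAGG", "CTCT")
```

In this toy case the perfectly covarying pair scores log 2 and the independent pair scores zero. A high score alone does not say whether the correlation arises from a structural contact or from shared ancestry, which is the distinction the study addresses.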

Article Multidisciplinary Sciences

Protein language models trained on multiple sequence alignments learn phylogenetic relationships

Umberto Lupo et al.

Summary: Self-supervised neural language models with attention have been applied to biological sequence data, advancing structure, function, and mutational effect prediction. This study demonstrates that protein language models can encode detailed phylogenetic relationships and can distinguish correlations caused by structural constraints from those caused by phylogeny.

NATURE COMMUNICATIONS (2022)

Article Physics, Multidisciplinary

Inferring couplings in networks across order-disorder phase transitions

Vudtiwat Ngampruetikorn et al.

Summary: Statistical inference is crucial for scientific research, but how it works is still not fully understood. This study investigates the efficacy of direct coupling analysis (DCA) in inferring pairwise interactions from amino acid sequence data. The results reveal that the accuracy of inference depends strongly on the nature of the data-generating distribution. Notably, DCA is not always superior to traditional local-statistics-based methods when data are limited.

PHYSICAL REVIEW RESEARCH (2022)

Article Multidisciplinary Sciences

Large-scale discovery of protein interactions at residue resolution using co-evolution calculated from genomic sequences

Anna G. Green et al.

Summary: The study used sequence coevolution to predict protein interactions in the E. coli membrane proteome and discovered new interactions. It demonstrated that coevolving residue pairs can be used to generate structural models of protein interactions, which helps in understanding these interactions at residue-level detail.

NATURE COMMUNICATIONS (2021)

Article Multidisciplinary Sciences

Highly accurate protein structure prediction with AlphaFold

John Jumper et al.

Summary: Proteins are essential for life, and accurate prediction of their structures is a crucial research problem. Current experimental methods are time-consuming, highlighting the need for accurate computational approaches to address the gap in structural coverage. Despite recent progress, existing methods fall short of atomic accuracy in protein structure prediction.

NATURE (2021)

Article Biochemical Research Methods

On the effect of phylogenetic correlations in coevolution-based contact prediction in proteins

Edwin Rodriguez Horta et al.

Summary: Coevolution-based contact prediction is widely used in protein structure prediction, but the assumption of independent samples that underlies global statistical modeling is violated by phylogenetic relationships between protein sequences. By randomizing or resampling sequence data, conservation patterns and phylogenetic relations can be preserved while coevolutionary couplings are removed. Phylogeny-induced spurious couplings are smaller than couplings derived from natural sequences, but may still affect the accuracy of predicted structural contacts.

PLOS COMPUTATIONAL BIOLOGY (2021)

Article Biochemistry & Molecular Biology

Pfam: The protein families database in 2021

Jaina Mistry et al.

Summary: The Pfam database has recently added a large number of protein families and domains, made revisions for COVID-19 research, and introduced Pfam-B as a supplement. These updates and improvements can help researchers classify protein sequences more effectively and conduct related studies.

NUCLEIC ACIDS RESEARCH (2021)

Article Multidisciplinary Sciences

Epistatic contributions promote the unification of incompatible models of neutral molecular evolution

Jose Alberto de la Paz et al.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2020)

Article Biochemical Research Methods

Direct coupling analysis of epistasis in allosteric materials

Barbara Bravi et al.

PLOS COMPUTATIONAL BIOLOGY (2020)

Article Multidisciplinary Sciences

An evolution-based model for designing chorismate mutase enzymes

William P. Russ et al.

SCIENCE (2020)

Article Physics, Fluids & Plasmas

Statistical physics of interacting proteins: Impact of dataset size and quality assessed in synthetic sequences

Carlos A. Gandarilla-Perez et al.

PHYSICAL REVIEW E (2020)

Article Physics, Multidisciplinary

Phylogenetic Weighting Does Little to Improve the Accuracy of Evolutionary Coupling Analyses

Adam J. Hockenberry et al.

ENTROPY (2019)

Article Physics, Multidisciplinary

Coevolutionary Analysis of Protein Subfamilies by Sequence Reweighting

Duccio Malinverni et al.

ENTROPY (2019)

Article Physics, Multidisciplinary

Toward Inferring Potts Models for Phylogenetically Correlated Sequence Data

Edwin Rodriguez Horta et al.

ENTROPY (2019)

Article Biochemical Research Methods

Phylogenetic correlations can suffice to infer protein partners from sequences

Guillaume Marmier et al.

PLOS COMPUTATIONAL BIOLOGY (2019)

Article Multidisciplinary Sciences

Protein interaction networks revealed by proteome coevolution

Qian Cong et al.

SCIENCE (2019)

Article Biochemistry & Molecular Biology

How Pairwise Coevolutionary Models Capture the Collective Residue Variability in Proteins?

Matteo Figliuzzi et al.

MOLECULAR BIOLOGY AND EVOLUTION (2018)

Article Multidisciplinary Sciences

Power law tails in phylogenetic systems

Chongli Qin et al.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2018)

Article Biochemical Research Methods

Inferring interaction partners from protein sequences using mutual information

Anne-Florence Bitbol

PLOS COMPUTATIONAL BIOLOGY (2018)

Article Biochemical Research Methods

Synthetic protein alignments by CCMgen quantify noise in residue-residue contact prediction

Susann Vorberg et al.

PLOS COMPUTATIONAL BIOLOGY (2018)

Article Biochemistry & Molecular Biology

Coevolutionary Landscape Inference and the Context-Dependence of Mutations in Beta-Lactamase TEM-1

Matteo Figliuzzi et al.

MOLECULAR BIOLOGY AND EVOLUTION (2016)

Article Biochemical Research Methods

ACE: adaptive cluster expansion for maximum entropy graphical model inference

J. P. Barton et al.

BIOINFORMATICS (2016)

Article Biochemistry & Molecular Biology

Connecting the Sequence-Space of Bacterial Signaling Proteins to Phenotypes Using Coevolutionary Landscapes

R. R. Cheng et al.

MOLECULAR BIOLOGY AND EVOLUTION (2016)

Article Multidisciplinary Sciences

Inferring interaction partners from protein sequences

Anne-Florence Bitbol et al.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2016)

Article Multidisciplinary Sciences

Simultaneous identification of specifically interacting paralogs and interprotein contacts by direct coupling analysis

Thomas Gueudre et al.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2016)

Article Biochemical Research Methods

Evolution-Based Functional Decomposition of Proteins

Olivier Rivoire et al.

PLOS COMPUTATIONAL BIOLOGY (2016)

Article Biochemical Research Methods

Large-Scale Conformational Transitions and Dimerization Are Encoded in the Amino-Acid Sequences of Hsp70 Chaperones

Duccio Malinverni et al.

PLOS COMPUTATIONAL BIOLOGY (2015)

Article Computer Science, Interdisciplinary Applications

Fast pseudolikelihood maximization for direct-coupling analysis of protein structure from many homologous amino-acid sequences

Magnus Ekeberg et al.

JOURNAL OF COMPUTATIONAL PHYSICS (2014)

Article Multidisciplinary Sciences

Toward rationally redesigning bacterial two-component signaling systems using coevolutionary information

Ryan R. Cheng et al.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2014)

Article Physics, Fluids & Plasmas

Improved contact prediction in proteins: Using pseudolikelihoods to infer Potts models

Magnus Ekeberg et al.

PHYSICAL REVIEW E (2013)

Article Multidisciplinary Sciences

Coevolutionary signals across protein lineages help capture multiple protein conformations

Faruck Morcos et al.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2013)

Article Multidisciplinary Sciences

Genomics-aided structure prediction

Joanna I. Sulkowska et al.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2012)

Article Multidisciplinary Sciences

Protein 3D Structure Computed from Evolutionary Sequence Variation

Debora S. Marks et al.

PLOS ONE (2011)

Article Multidisciplinary Sciences

Direct-coupling analysis of residue coevolution captures native contacts across many protein families

Faruck Morcos et al.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2011)

Article Multidisciplinary Sciences

FastTree 2 - Approximately Maximum-Likelihood Trees for Large Alignments

Morgan N. Price et al.

PLOS ONE (2010)

Article Biochemistry & Molecular Biology

Protein Sectors: Evolutionary Units of Three-Dimensional Structure

Najeeb Halabi et al.

Article Multidisciplinary Sciences

Identification of direct residue contacts in protein-protein interaction by message passing

Martin Weigt et al.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2009)