4.8 Article

Transformer-based protein generation with regularized latent space optimization

Related references

Note: Only a subset of the references is listed here; see the original article for the complete reference list.
Article Multidisciplinary Sciences

Learning meaningful representations of protein sequences

Nicki Skafte Detlefsen et al.

Summary: This paper discusses the issue of representation in protein sequence analysis and proposes best practices for ensuring meaningful representations. The research finds that even minor modifications can result in different data representations and biological interpretations, raising the question of what constitutes the most meaningful representation.

NATURE COMMUNICATIONS (2022)

Article Biochemistry & Molecular Biology

DynaMut2: Assessing changes in stability and flexibility upon single and multiple point missense mutations

Carlos H. M. Rodrigues et al.

Summary: DynaMut2 is a web server that combines normal mode analysis (NMA) methods and graph-based signatures to investigate the effects of missense mutations on protein stability and dynamics. It performs well on both single-point and multiple-point missense mutations.

PROTEIN SCIENCE (2021)

Article Multidisciplinary Sciences

Protein sequence design by conformational landscape optimization

Christoffer Norn et al.

Summary: The protein design problem is to find an amino acid sequence that adopts a desired protein structure. This paper approaches it by optimizing over all possible sequences and structures using protein structure prediction and backpropagation. The trRosetta model is more effective than Rosetta single-point energy estimations, and combining the trRosetta and Rosetta models can result in more funneled energy landscapes.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2021)

Article Biochemical Research Methods

Low-N protein engineering with data-efficient deep learning

Surojit Biswas et al.

Summary: This study uses machine learning to build accurate virtual fitness landscapes and to screen millions of sequences via in silico directed evolution, starting from only a small number of functionally assayed mutant sequences. The method helps identify enhanced protein variants quickly and makes efficient use of high-throughput screening resources.

NATURE METHODS (2021)

Article Multidisciplinary Sciences

Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences

Alexander Rives et al.

Summary: The deep contextual language model trained through unsupervised learning on protein sequences contains information about biological properties, has a multiscale structural organization, and can be used to improve predictions for protein mutational effects, secondary structure, and long-range contacts.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2021)

Article Multidisciplinary Sciences

Highly accurate protein structure prediction with AlphaFold

John Jumper et al.

Summary: Proteins are essential for life, and accurate prediction of their structures is a crucial research problem. Current experimental methods are time-consuming, highlighting the need for accurate computational approaches to address the gap in structural coverage. Despite recent progress, existing methods fall short of atomic accuracy in protein structure prediction.

NATURE (2021)

Article Biochemistry & Molecular Biology

Pfam: The protein families database in 2021

Jaina Mistry et al.

Summary: The Pfam database has recently added a large number of protein families and domains, made revisions for COVID-19 research, and introduced Pfam-B as a supplement. These updates and improvements can help researchers classify protein sequences more effectively and conduct related studies.

NUCLEIC ACIDS RESEARCH (2021)

Article Biochemical Research Methods

Antibody complementarity determining region design using high-capacity machine learning

Ge Liu et al.

BIOINFORMATICS (2020)

Review Chemistry, Physical

Engineering new catalytic activities in enzymes

Kai Chen et al.

NATURE CATALYSIS (2020)

Proceedings Paper Computer Science, Artificial Intelligence

Uncovering the Folding Landscape of RNA Secondary Structure Using Deep Graph Embeddings

Egbert Castro et al.

2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) (2020)

Review Biochemical Research Methods

Machine-learning-guided directed evolution for protein engineering

Kevin K. Yang et al.

NATURE METHODS (2019)

Article Biochemical Research Methods

Unified rational protein engineering with sequence-based deep representation learning

Ethan C. Alley et al.

NATURE METHODS (2019)

Article Chemistry, Multidisciplinary

Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules

Rafael Gomez-Bombarelli et al.

ACS CENTRAL SCIENCE (2018)

Article Multidisciplinary Sciences

Local fitness landscape of the green fluorescent protein

Karen S. Sarkisyan et al.

NATURE (2016)

Review Biochemistry & Molecular Biology

Epistasis in protein evolution

Tyler N. Starr et al.

PROTEIN SCIENCE (2016)

Review Cell Biology

Exploring protein fitness landscapes by directed evolution

Philip A. Romero et al.

NATURE REVIEWS MOLECULAR CELL BIOLOGY (2009)