4.3 Article

Convolutional ProteinUnetLM competitive with long short-term memory-based protein secondary structure predictors

Related references

Note: Only some of the references are listed.
Article Biochemistry & Molecular Biology

NetSurfP-3.0: accurate and fast prediction of protein structural features by protein language models and deep learning

Magnus Haraldson Hoie et al.

Summary: Recent advances in machine learning and natural language processing have enabled accurate prediction of protein structures and functions, with NetSurfP-3.0 standing out as a tool with drastically improved runtime and reliable prediction performance.

NUCLEIC ACIDS RESEARCH (2022)

Article Biochemistry & Molecular Biology

Benchmarking the Accuracy of AlphaFold 2 in Loop Structure Prediction

Amy O. Stevens et al.

Summary: The inhibition of protein-protein interactions is a growing strategy in drug development, and protein loop regions are potential drug targets. AlphaFold 2 performs well in predicting protein loop structures, especially for short loops. However, as the length of the loop increases, the accuracy of AlphaFold 2's prediction decreases.

BIOMOLECULES (2022)

Article Computer Science, Artificial Intelligence

ProtTrans: Toward Understanding the Language of Life Through Self-Supervised Learning

Ahmed Elnaggar et al.

Summary: Computational biology and bioinformatics provide valuable data for the development of language models in natural language processing. In this study, six different models were trained on protein sequence data and the resulting embeddings were used for various protein structure prediction tasks, demonstrating their advantages over traditional methods.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2022)

Article Chemistry, Multidisciplinary

ProteinUnet-An efficient alternative to SPIDER3-single for sequence-based prediction of protein secondary structures

Krzysztof Kotowski et al.

Summary: Predicting protein function and structure from sequence remains a challenging problem in bioinformatics. Two methods, ProteinUnet and SPIDER3-Single, were compared in this study, with ProteinUnet showing advantages in terms of parameter efficiency, inference time, and training speed. Additionally, ProteinUnet performed better for short sequences and residues with few local contacts, and the method of loss weighting was effective in improving accuracy for rare secondary structures.

JOURNAL OF COMPUTATIONAL CHEMISTRY (2021)

Article Multidisciplinary Sciences

Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences

Alexander Rives et al.

Summary: The deep contextual language model trained through unsupervised learning on protein sequences contains information about biological properties, has a multiscale structural organization, and can be used to improve predictions for protein mutational effects, secondary structure, and long-range contacts.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2021)

Article Computer Science, Artificial Intelligence

How to design the fair experimental classifier evaluation

Katarzyna Stapor et al.

Summary: Researchers often evaluate the quality of developed algorithms through computer experiments, supported by statistical analysis and experimental protocols. However, there is a concern about the randomness of data folds used for classification and the potential for manipulating experimental results using statistical evaluation methods. The paper highlights the weaknesses of commonly used experimental protocols and discusses the trustworthiness of evaluation methodology, fairness of presented evaluations, and the risk of unethical behavior.

APPLIED SOFT COMPUTING (2021)

Article Biochemistry & Molecular Biology

Critical assessment of methods of protein structure prediction (CASP)-Round XIV

Andriy Kryshtafovych et al.

Summary: CASP is a community experiment aimed at advancing methods for computing three-dimensional protein structure, including rigorous blind testing and evaluation by independent assessors. In the recent CASP14 experiment, deep-learning methods from one research group consistently delivered computed structures rivaling the corresponding experimental ones in accuracy. These results represent a solution to the classical protein-folding problem, at least for single proteins.

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS (2021)

Proceedings Paper Engineering, Biomedical

Matthews Correlation Coefficient Loss for Deep Convolutional Networks: Application to Skin Lesion Segmentation

Kumar Abhishek et al.

Summary: This study proposes a novel metric-based loss function using the Matthews correlation coefficient to optimize deep segmentation models for skin lesion segmentation. Results show that models trained using this loss function outperform those trained using the Dice loss function on three skin lesion image datasets.

2021 IEEE 18TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (ISBI) (2021)

Article Biochemistry & Molecular Biology

The language of proteins: NLP, machine learning & protein sequences

Dan Ofer et al.

Summary: NLP methods have made significant progress in studying proteins, allowing for effective encoding and analysis of protein information. By transforming protein data into text format, a variety of NLP techniques can be applied to address tasks related to proteins.

COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL (2021)

Review Biochemical Research Methods

Protein Secondary Structure Prediction: A Review of Progress and Directions

Tomasz Smolarczyk et al.

CURRENT BIOINFORMATICS (2020)

Article Biochemistry & Molecular Biology

NetSurfP-2.0: Improved prediction of protein structural features by integrated deep learning

Michael Schantz Klausen et al.

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS (2019)

Article Computer Science, Artificial Intelligence

Attention gated networks: Learning to leverage salient regions in medical images

Jo Schlemper et al.

MEDICAL IMAGE ANALYSIS (2019)

Article Biochemical Research Methods

Protein secondary structure prediction: A survey of the state of the art

Qian Jiang et al.

JOURNAL OF MOLECULAR GRAPHICS & MODELLING (2017)

Article Biochemical Research Methods

HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment

Michael Remmert et al.

NATURE METHODS (2012)

Article Biochemical Research Methods

Adjusted Geometric-Mean: A Novel Performance Measure for Imbalanced Bioinformatics Datasets Learning

Rukshan Batuwita et al.

JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY (2012)

Article Mathematical & Computational Biology

Analysis of protein chameleon sequence characteristics

Amine Ghozlane et al.

BIOINFORMATION (2009)

Article Biochemistry & Molecular Biology

MIPS: analysis and annotation of proteins from whole genomes in 2005

H. W. Mewes et al.

NUCLEIC ACIDS RESEARCH (2006)

Article Biochemistry & Molecular Biology

BASys: a web server for automated bacterial genome annotation

GH Van Domselaar et al.

NUCLEIC ACIDS RESEARCH (2005)

Article Multidisciplinary Sciences

Coupled prediction of protein secondary and tertiary structure

J Meiler et al.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2003)

Article Biochemistry & Molecular Biology

PROTINFO: secondary and tertiary protein structure prediction

LH Hung et al.

NUCLEIC ACIDS RESEARCH (2003)

Article Biochemistry & Molecular Biology

PSORT-B: improving protein subcellular localization prediction for Gram-negative bacteria

JL Gardy et al.

NUCLEIC ACIDS RESEARCH (2003)

Article Biochemistry & Molecular Biology

Comparing function and structure between entire proteomes

JF Liu et al.

PROTEIN SCIENCE (2001)

Article Biochemistry & Molecular Biology

Preorganized secondary structure as an important determinant of fast protein folding

JK Myers et al.

NATURE STRUCTURAL BIOLOGY (2001)

Article Biochemical Research Methods

What are the baselines for protein fold recognition?

LJ McGuffin et al.

BIOINFORMATICS (2001)