4.7 Review

A guide to machine learning for biologists

Related references

Note: Only some of the references are listed.
Article Biochemical Research Methods

nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation

Fabian Isensee et al.

Summary: nnU-Net is a deep learning-based image segmentation method that automatically configures itself for diverse biological and medical image segmentation tasks, offering state-of-the-art performance as an out-of-the-box tool.

NATURE METHODS (2021)

Article Biochemical Research Methods

CryoDRGN: reconstruction of heterogeneous cryo-EM structures using neural networks

Ellen D. Zhong et al.

Summary: cryoDRGN is an algorithm that uses the representation power of deep neural networks to directly reconstruct continuous distributions of 3D density maps and map per-particle heterogeneity of single-particle cryo-EM datasets. It can uncover residual heterogeneity in high-resolution datasets and visualize large-scale continuous motions of protein complexes, while also offering interactive tools for dataset visualization and analysis.

NATURE METHODS (2021)

Article Multidisciplinary Sciences

Improved protein structure refinement guided by deep learning based accuracy estimation

Naozumi Hiranuma et al.

Summary: DeepAccNet is a deep learning framework that estimates per-residue accuracy and residue-residue distance signed error in protein models, guiding Rosetta protein structure refinement and demonstrating improved accuracy prediction and refinement compared to other methods.

NATURE COMMUNICATIONS (2021)

Article Engineering, Biomedical

Accelerated antimicrobial discovery via deep generative models and molecular dynamics simulations

Payel Das et al.

Summary: The combination of deep learning and molecular dynamics simulations enables the rapid discovery of antimicrobial peptides with low toxicity and high potency against a variety of Gram-positive and Gram-negative pathogens.

NATURE BIOMEDICAL ENGINEERING (2021)

Article Multidisciplinary Sciences

Highly accurate protein structure prediction with AlphaFold

John Jumper et al.

Summary: Proteins are essential for life, and accurate prediction of their structures is a crucial research problem. Current experimental methods are time-consuming, highlighting the need for accurate computational approaches to address the gap in structural coverage. Despite recent progress, existing methods fall short of atomic accuracy in protein structure prediction.

NATURE (2021)

Article Multidisciplinary Sciences

Structure-based protein function prediction using graph convolutional networks

Vladimir Gligorijevic et al.

Summary: DeepFRI is a graph convolutional network that predicts protein functions from protein structures and from sequence features extracted by a protein language model. It outperforms other methods and scales to large sequence repositories.

NATURE COMMUNICATIONS (2021)

Article Computer Science, Artificial Intelligence

Improved protein structure prediction by deep learning irrespective of co-evolution information

Jinbo Xu et al.

Summary: Computational protein structure prediction has improved significantly through the integration of deep learning and co-evolutionary analysis. A deep ResNet can predict correct protein folds even without co-evolution information, suggesting that it may learn important protein sequence-structure relationships.

NATURE MACHINE INTELLIGENCE (2021)

Article Computer Science, Artificial Intelligence

Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans

Michael Roberts et al.

Summary: Many machine learning-based approaches have been developed for the diagnosis and prognosis of COVID-19 from medical images. However, this systematic review finds that current studies have methodological flaws that undermine their potential clinical utility. Recommendations are provided to address these issues and support higher-quality model development.

NATURE MACHINE INTELLIGENCE (2021)

Article Multidisciplinary Sciences

Improved protein structure prediction using potentials from deep learning

Andrew W. Senior et al.

NATURE (2020)

Article Multidisciplinary Sciences

Improved protein structure prediction using predicted interresidue orientations

Jianyi Yang et al.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2020)

Correction Biochemistry & Molecular Biology

A Deep Learning Approach to Antibiotic Discovery (vol 180, pg 688.e1, 2020)

Jonathan M. Stokes et al.

Article Biochemistry & Molecular Biology

A Generative Neural Network for Maximizing Fitness and Diversity of Synthetic DNA and Protein Sequences

Johannes Linder et al.

CELL SYSTEMS (2020)

Review Biochemistry & Molecular Biology

Machine Learning Applications for Mass Spectrometry-Based Metabolomics

Ulf W. Liebal et al.

METABOLITES (2020)

Article Biochemistry & Molecular Biology

Using deep mutational scanning to benchmark variant effect predictors and identify disease mutations

Benjamin J. Livesey et al.

MOLECULAR SYSTEMS BIOLOGY (2020)

Review Clinical Neurology

Applications of machine learning to diagnosis and treatment of neurodegenerative diseases

Monika A. Myszczynska et al.

NATURE REVIEWS NEUROLOGY (2020)

Article Multidisciplinary Sciences

Deep learning for genomics using Janggu

Wolfgang Kopp et al.

NATURE COMMUNICATIONS (2020)

Article Multidisciplinary Sciences

A deep learning model to predict RNA-Seq expression of tumours from whole slide images

Benoit Schmauch et al.

NATURE COMMUNICATIONS (2020)

Review Biochemistry & Molecular Biology

Enhancing scientific discoveries in molecular biology with deep generative models

Romain Lopez et al.

MOLECULAR SYSTEMS BIOLOGY (2020)

Article Biochemical Research Methods

Predicting 3D genome folding from DNA sequence with Akita

Geoff Fudenberg et al.

NATURE METHODS (2020)

Article Biochemistry & Molecular Biology

Fast and Flexible Protein Design Using Deep Graph Neural Networks

Alexey Strokach et al.

CELL SYSTEMS (2020)

Article Biochemical Research Methods

DeMaSk: a deep mutational scanning substitution matrix and its use for variant impact prediction

Daniel Munro et al.

BIOINFORMATICS (2020)

Article Computer Science, Artificial Intelligence

Shortcut learning in deep neural networks

Robert Geirhos et al.

NATURE MACHINE INTELLIGENCE (2020)

Review Computer Science, Artificial Intelligence

Drug discovery with explainable artificial intelligence

Jose Jimenez-Luna et al.

NATURE MACHINE INTELLIGENCE (2020)

Article Biochemical Research Methods

Protein model quality assessment using 3D oriented convolutional neural networks

Guillaume Pages et al.

BIOINFORMATICS (2019)

Article Biochemical Research Methods

Selene: a PyTorch-based deep learning library for sequence data

Kathleen M. Chen et al.

NATURE METHODS (2019)

Article Biochemical Research Methods

SIRIUS 4: a rapid tool for turning tandem mass spectra into metabolite structure information

Kai Duhrkop et al.

NATURE METHODS (2019)

Letter Biotechnology & Applied Microbiology

The Kipoi repository accelerates community exchange and reuse of predictive models for genomics

Ziga Avsec et al.

NATURE BIOTECHNOLOGY (2019)

Article Biochemistry & Molecular Biology

The PSIPRED Protein Analysis Workbench: 20 years on

Daniel W. A. Buchan et al.

NUCLEIC ACIDS RESEARCH (2019)

Article Biochemistry & Molecular Biology

End-to-End Differentiable Learning of Protein Structure

Mohammed AlQuraishi

CELL SYSTEMS (2019)

Article Biochemical Research Methods

ProteinNet: a standardized data set for machine learning of protein structure

Mohammed AlQuraishi

BMC BIOINFORMATICS (2019)

Review Biochemical Research Methods

Machine-learning-guided directed evolution for protein engineering

Kevin K. Yang et al.

NATURE METHODS (2019)

Review Biochemistry & Molecular Biology

Machine learning approaches and their current application in plant molecular biology: A systematic review

Jose Cleydson F. Silva et al.

PLANT SCIENCE (2019)

Article Multidisciplinary Sciences

Environmental conditions shape the nature of a minimal bacterial genome

Magdalena Antczak et al.

NATURE COMMUNICATIONS (2019)

Article Multidisciplinary Sciences

HyperFoods: Machine intelligent mapping of cancer-beating molecules in foods

Kirill Veselkov et al.

SCIENTIFIC REPORTS (2019)

Article Chemistry, Analytical

Deep Neural Networks for Classification of LC-MS Spectral Peaks

Edward D. Kantz et al.

ANALYTICAL CHEMISTRY (2019)

Article Biochemical Research Methods

HH-suite3 for fast remote homology detection and deep protein annotation

Martin Steinegger et al.

BMC BIOINFORMATICS (2019)

Article Biotechnology & Applied Microbiology

Deep learning enables rapid identification of potent DDR1 kinase inhibitors

Alex Zhavoronkov et al.

NATURE BIOTECHNOLOGY (2019)

Article Biochemical Research Methods

Robust and automated detection of subcellular morphological motifs in 3D microscopy images

Meghan K. Driscoll et al.

NATURE METHODS (2019)

Editorial Material Cell Biology

Setting the standards for machine learning in biology

David T. Jones

NATURE REVIEWS MOLECULAR CELL BIOLOGY (2019)

Article Biochemical Research Methods

Real-time cryo-electron microscopy data preprocessing with Warp

Dimitry Tegunov et al.

NATURE METHODS (2019)

Article Biochemical Research Methods

Unified rational protein engineering with sequence-based deep representation learning

Ethan C. Alley et al.

NATURE METHODS (2019)

Review Biochemistry & Molecular Biology

Critical assessment of methods of protein structure prediction (CASP)-Round XIII

Andriy Kryshtafovych et al.

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS (2019)

Article Multidisciplinary Sciences

The art of using t-SNE for single-cell transcriptomics

Dmitry Kobak et al.

NATURE COMMUNICATIONS (2019)

Article Genetics & Heredity

Gene Expression Value Prediction Based on XGBoost Algorithm

Wei Li et al.

FRONTIERS IN GENETICS (2019)

Article Genetics & Heredity

A primer on deep learning in genomics

James Zou et al.

NATURE GENETICS (2019)

Article Biochemistry & Molecular Biology

CATH: expanding the horizons of structure-based functional annotations for genome sequences

Ian Sillitoe et al.

NUCLEIC ACIDS RESEARCH (2019)

Correction Biochemistry & Molecular Biology

Multi-omic and multi-view clustering algorithms: review and cancer benchmark (vol 46, pg 10546, 2018)

Nimrod Rappoport et al.

NUCLEIC ACIDS RESEARCH (2019)

Article Biotechnology & Applied Microbiology

Visualizing structure and transitions in high-dimensional biological data

Kevin R. Moon et al.

NATURE BIOTECHNOLOGY (2019)

Article Multidisciplinary Sciences

Deep learning for inferring gene relationships from single-cell expression data

Ye Yuan et al.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2019)

Article Biochemical Research Methods

deepNF: deep network fusion for protein function prediction

Vladimir Gligorijevic et al.

BIOINFORMATICS (2018)

Article Biochemical Research Methods

Modeling polypharmacy side effects with graph convolutional networks

Marinka Zitnik et al.

BIOINFORMATICS (2018)

Article Biochemistry & Molecular Biology

Simulations meet machine learning in structural biology

Adria Perez et al.

CURRENT OPINION IN STRUCTURAL BIOLOGY (2018)

Article Chemistry, Medicinal

Recurrent Neural Network Model for Constructive Peptide Design

Alex T. Mueller et al.

JOURNAL OF CHEMICAL INFORMATION AND MODELING (2018)

Review Multidisciplinary Sciences

Opportunities and obstacles for deep learning in biology and medicine

Travers Ching et al.

JOURNAL OF THE ROYAL SOCIETY INTERFACE (2018)

Editorial Material Biochemical Research Methods

Machine learning: supervised methods

Danilo Bzdok et al.

NATURE METHODS (2018)

Article Engineering, Biomedical

Prediction of cardiovascular risk factors from retinal fundus photographs via deep learning

Ryan Poplin et al.

NATURE BIOMEDICAL ENGINEERING (2018)

Article Biotechnology & Applied Microbiology

A universal SNP and small-indel variant caller using deep neural networks

Ryan Poplin et al.

NATURE BIOTECHNOLOGY (2018)

Article Biochemical Research Methods

Inferring single-trial neural population dynamics using sequential auto-encoders

Chethan Pandarinath et al.

NATURE METHODS (2018)

Article Multidisciplinary Sciences

Design of metalloproteins and novel protein folds using variational autoencoders

Joe G. Greener et al.

SCIENTIFIC REPORTS (2018)

Article Multidisciplinary Sciences

Realizing private and practical pharmacological collaboration

Brian Hie et al.

SCIENCE (2018)

Article Chemistry, Multidisciplinary

Improving Scoring-Docking-Screening Powers of Protein-Ligand Scoring Functions using Random Forest

Cheng Wang et al.

JOURNAL OF COMPUTATIONAL CHEMISTRY (2017)

Article Multidisciplinary Sciences

Dermatologist-level classification of skin cancer with deep neural networks

Andre Esteva et al.

NATURE (2017)

Article Biotechnology & Applied Microbiology

Mutation effects predicted from sequence co-variation

Thomas A. Hopf et al.

NATURE BIOTECHNOLOGY (2017)

Letter Biotechnology & Applied Microbiology

MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets

Martin Steinegger et al.

NATURE BIOTECHNOLOGY (2017)

Editorial Material Biochemical Research Methods

Clustering

Naomi Altman et al.

NATURE METHODS (2017)

Editorial Material Biochemical Research Methods

Ten Simple Rules for Developing Usable Software in Computational Biology

Markus List et al.

PLOS COMPUTATIONAL BIOLOGY (2017)

Article Biochemical Research Methods

Correct machine learning on protein sequences: a peer-reviewing perspective

Ian Walsh et al.

BRIEFINGS IN BIOINFORMATICS (2016)

Article Biochemical Research Methods

Convolutional neural network architectures for predicting DNA-protein binding

Haoyang Zeng et al.

BIOINFORMATICS (2016)

Article Biochemical Research Methods

Learning a hierarchical representation of the yeast transcriptomic machinery using an autoencoder model

Lujia Chen et al.

BMC BIOINFORMATICS (2016)

Article Biochemistry & Molecular Biology

Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks

David R. Kelley et al.

GENOME RESEARCH (2016)

Article Biochemistry & Molecular Biology

DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences

Daniel Quang et al.

NUCLEIC ACIDS RESEARCH (2016)

Article Mathematical & Computational Biology

Toward an Integration of Deep Learning and Neuroscience

Adam H. Marblestone et al.

FRONTIERS IN COMPUTATIONAL NEUROSCIENCE (2016)

Article Multidisciplinary Sciences

FFPred 3: feature-based function prediction for all Gene Ontology domains

Domenico Cozzetto et al.

SCIENTIFIC REPORTS (2016)

Article Environmental Sciences

DeepTox: Toxicity Prediction using Deep Learning

Andreas Mayr et al.

FRONTIERS IN ENVIRONMENTAL SCIENCE (2016)

Review Multidisciplinary Sciences

Deep learning

Yann LeCun et al.

NATURE (2015)

Article Biotechnology & Applied Microbiology

Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning

Babak Alipanahi et al.

NATURE BIOTECHNOLOGY (2015)

Review Genetics & Heredity

Machine learning applications in genetics and genomics

Maxwell W. Libbrecht et al.

NATURE REVIEWS GENETICS (2015)

Article Genetics & Heredity

A general framework for estimating the relative pathogenicity of human genetic variants

Martin Kircher et al.

NATURE GENETICS (2014)

Article Biochemistry & Molecular Biology

DUET: a server for predicting effects of mutations on protein stability using an integrated computational approach

Douglas E. V. Pires et al.

NUCLEIC ACIDS RESEARCH (2014)

Article Biochemical Research Methods

Predicting folding free energy changes upon single point mutations

Zhe Zhang et al.

BIOINFORMATICS (2012)

Article Biochemistry & Molecular Biology

Protein sequence comparison and fold recognition: progress and good-practice benchmarking

Johannes Soeding et al.

CURRENT OPINION IN STRUCTURAL BIOLOGY (2011)

Article Computer Science, Artificial Intelligence

Data clustering: 50 years beyond K-means

Anil K. Jain

PATTERN RECOGNITION LETTERS (2010)

Article Biochemical Research Methods

Transmembrane protein topology prediction using support vector machines

Timothy Nugent et al.

BMC BIOINFORMATICS (2009)

Article Computer Science, Interdisciplinary Applications

Building Predictive Models in R Using the caret Package

Max Kuhn

JOURNAL OF STATISTICAL SOFTWARE (2008)

Article Biochemical Research Methods

Support Vector Machines and Kernels for Computational Biology

Asa Ben-Hur et al.

PLOS COMPUTATIONAL BIOLOGY (2008)

Article Biochemical Research Methods

Machine learning and its applications to biology

Adi L. Tarca et al.

PLOS COMPUTATIONAL BIOLOGY (2007)

Article Biotechnology & Applied Microbiology

What is a support vector machine?

William S. Noble

NATURE BIOTECHNOLOGY (2006)

Article Biochemistry & Molecular Biology

nsSNPAnalyzer: identifying disease-associated nonsynonymous single nucleotide polymorphisms

L Bao et al.

NUCLEIC ACIDS RESEARCH (2005)

Article Statistics & Probability

Regularization and variable selection via the elastic net

H Zou et al.

JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY (2005)

Article Chemistry, Multidisciplinary

SPICKER: A clustering approach to identify near-native protein folds

Y Zhang et al.

JOURNAL OF COMPUTATIONAL CHEMISTRY (2004)

Article Biochemical Research Methods

Understanding protein flexibility through dimensionality reduction

ML Teodoro et al.

JOURNAL OF COMPUTATIONAL BIOLOGY (2003)