Positional SHAP (PoSHAP) for Interpretation of machine learning models trained from biological sequences

PLOS COMPUTATIONAL BIOLOGY (2022)

Journal

PLOS COMPUTATIONAL BIOLOGY

Volume 18, Issue 1

Publisher

PUBLIC LIBRARY SCIENCE

DOI: 10.1371/journal.pcbi.1009736

Funding

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS) [R35 GM142502]
National Library of Medicine (NLM) [T15 LM007359]

Automated Summary

Machine learning with deep neural networks is effective for biological predictions, but interpreting models with sequential input data is challenging. This paper introduces a framework called PoSHAP, which utilizes SHAP to interpret models trained from biological sequences. The experiments demonstrate that PoSHAP can reproduce known peptide binding motifs and provide new insights into peptide properties.

Abstract

Machine learning with multi-layered artificial neural networks, also known as deep learning, is effective for making biological predictions. However, model interpretation is challenging, especially for sequential input data used with recurrent neural network architectures. Here, we introduce a framework called Positional SHAP (PoSHAP) to interpret models trained from biological sequences by utilizing SHapley Additive exPlanations (SHAP) to generate positional model interpretations. We demonstrate this using three long short-term memory (LSTM) regression models that predict peptide properties, including binding affinity to major histocompatibility complexes (MHC), and collisional cross section (CCS) measured by ion mobility spectrometry. Interpretation of these models with PoSHAP reproduced MHC class I (rhesus macaque Mamu-A1*001 and human A*11:01) peptide binding motifs, reflected known properties of peptide CCS, and provided new insights into interpositional dependencies of amino acid interactions. PoSHAP should have widespread utility for interpreting a variety of models trained from biological sequences.
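The idea of a positional interpretation can be illustrated with a small sketch. It is not the authors' implementation: it assumes that some SHAP explainer has already produced one attribution value per residue of each peptide, and it simply averages those values for every (amino acid, position) pair. The function name `positional_mean_shap` and the toy peptides and attribution values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def positional_mean_shap(peptides, shap_values):
    """Average per-residue attributions by (amino acid, position).

    peptides:    list of peptide strings.
    shap_values: list of lists, one attribution per residue of each peptide
                 (assumed to come from a SHAP explainer run beforehand).
    Returns a dict mapping (residue, position) to the mean attribution.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for peptide, row in zip(peptides, shap_values):
        if len(peptide) != len(row):
            raise ValueError("each peptide needs one attribution per residue")
        for position, (residue, value) in enumerate(zip(peptide, row)):
            sums[(residue, position)] += value
            counts[(residue, position)] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Toy input (hypothetical values, not data from the paper).
toy_peptides = ["AC", "AD", "CC"]
toy_shap = [[0.2, -0.1], [0.4, 0.3], [-0.6, 0.1]]
table = positional_mean_shap(toy_peptides, toy_shap)
```

Arranged as a grid of residues by positions, a table of this kind is what allows a motif to be read off: a residue with a consistently high mean attribution at one position would suggest that the model favors it there.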
