Article

Deep profiling of protease substrate specificity enabled by dual random and scanned human proteome substrate phage libraries

Publisher

National Academy of Sciences
DOI: 10.1073/pnas.2009279117

Keywords

protease specificity; human proteome; substrate phage library; NGS

Funding

  1. Chan Zuckerberg Initiative
  2. Biohub Investigator Program
  3. National Cancer Institute (NCI) [P41CA196276]
  4. NCI [F32CA236151-02]
  5. NSF Computing and Communication Foundations [1763191]
  6. NIH [R21 MD012867-01, P30AG059307]
  7. Silicon Valley Foundation
  8. Division of Computing and Communication Foundations
  9. Directorate for Computer & Information Science & Engineering [1763191] (funding source: National Science Foundation)

Abstract

Proteolysis is a major posttranslational regulator of biology inside and outside of cells. Broad identification of optimal cleavage sites and natural substrates of proteases is critical for drug discovery and for understanding protease biology. Here, we present a method that couples two genetically encoded substrate phage display libraries with next-generation sequencing (SPD-NGS) and allows up to 10,000-fold deeper sequence coverage of the typical six- to eight-residue protease cleavage sites than state-of-the-art synthetic peptide libraries or proteomics. We applied SPD-NGS to two classes of proteases: the intracellular caspases and the ectodomains of the sheddases ADAMs 10 and 17. The first library (Lib 10AA) allowed us to identify 10^4 to 10^5 unique cleavage sites over a 1,000-fold dynamic range of NGS counts and produced consensus and optimal cleavage motifs based on position-specific scoring matrices. A second SPD-NGS library (Lib hP), which displayed virtually the entire human proteome tiled in contiguous 49-amino-acid sequences with 25-amino-acid overlaps, enabled us to identify candidate human proteome sequences. We identified up to 10^4 natural linear cut sites, depending on the protease, captured most of the examples previously identified by proteomics, and predicted 10- to 100-fold more. Structural bioinformatics was used to facilitate the identification of candidate natural protein substrates. SPD-NGS is rapid, reproducible, simple to perform and analyze, inexpensive, and renewable, with unprecedented depth of coverage for substrate sequences. It is an important tool for protease biologists who want to define protease specificity for the design of specific assays and inhibitors and to identify natural protein substrates.
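
The two computational ideas named in the abstract, tiling a protein into overlapping 49-residue windows and summarizing cleavage sites as a position-specific scoring matrix, can be illustrated with a minimal sketch. This is not the authors' pipeline. The function names `tile_protein` and `build_pssm` are hypothetical, the 24-residue step is simply 49 minus 25, the handling of short proteins and of the C-terminal remainder is an assumption, and the scoring (count-weighted log2 odds against a uniform background with a pseudocount) is one common convention chosen for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def tile_protein(seq, tile_len=49, overlap=25):
    """Split a protein into tiles of tile_len residues; neighbours share `overlap` residues.

    Assumption: proteins shorter than one tile are returned whole, and a final
    tile is anchored at the C-terminus so that no residues are dropped.
    """
    step = tile_len - overlap  # 24 residues for the 49/25 design
    if step <= 0:
        raise ValueError("overlap must be smaller than tile_len")
    if len(seq) <= tile_len:
        return [seq]
    starts = list(range(0, len(seq) - tile_len + 1, step))
    if starts[-1] + tile_len < len(seq):
        starts.append(len(seq) - tile_len)  # cover the C-terminal remainder
    return [seq[s:s + tile_len] for s in starts]


def build_pssm(site_counts, pseudocount=1.0):
    """Build a log2-odds PSSM from {peptide: NGS read count}.

    Assumption: all peptides have the same length, contain only standard
    residues, and are scored against a uniform background frequency of 1/20.
    """
    length = len(next(iter(site_counts)))
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        weights = Counter()
        for peptide, count in site_counts.items():
            weights[peptide[pos]] += count  # weight each residue by read count
        total = sum(weights.values()) + pseudocount * len(AMINO_ACIDS)
        column = {
            aa: math.log2(((weights[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        }
        pssm.append(column)
    return pssm


if __name__ == "__main__":
    # Toy inputs, invented for illustration only (not data from the paper).
    tiles = tile_protein("A" * 100)
    print(len(tiles), [len(t) for t in tiles])
    toy_sites = {"DEVDG": 100, "DEVDS": 50, "AEVDG": 5}
    pssm = build_pssm(toy_sites)
    print(round(pssm[0]["D"], 2), round(pssm[0]["A"], 2))
```

Weighting each residue by its read count lets highly enriched sequences dominate the matrix, and the pseudocount keeps unseen residues from producing an infinite negative score.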
