Journal
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
Volume 117, Issue 41, Pages 25464-25475
Publisher
NATL ACAD SCIENCES
DOI: 10.1073/pnas.2009279117
Keywords
protease specificity; human proteome; substrate phage library; NGS
Funding
- Chan Zuckerberg Initiative
- Biohub Investigator Program
- National Cancer Institute (NCI) [P41CA196276]
- NCI [F32CA236151-02]
- NSF Computing and Communication Foundations [1763191]
- NIH [R21 MD012867-01, P30AG059307]
- Silicon Valley Foundation
Proteolysis is a major posttranslational regulator of biology inside and outside of cells. Broad identification of optimal cleavage sites and natural substrates of proteases is critical for drug discovery and for understanding protease biology. Here, we present a method that employs two genetically encoded substrate phage display libraries coupled with next-generation sequencing (SPD-NGS) and allows up to 10,000-fold deeper sequence coverage of the typical six- to eight-residue protease cleavage sites compared to state-of-the-art synthetic peptide libraries or proteomics. We applied SPD-NGS to two classes of proteases: the intracellular caspases and the ectodomains of the sheddases ADAMs 10 and 17. The first library (Lib 10AA) allowed us to identify 10^4 to 10^5 unique cleavage sites over a 1,000-fold dynamic range of NGS counts and produced consensus and optimal cleavage motifs based on position-specific scoring matrices. A second SPD-NGS library (Lib hP), which displayed virtually the entire human proteome tiled in contiguous 49-amino-acid sequences with 25-amino-acid overlaps, enabled us to identify candidate human proteome sequences. We identified up to 10^4 natural linear cut sites, depending on the protease, captured most of the examples previously identified by proteomics, and predicted 10- to 100-fold more. Structural bioinformatics was used to facilitate the identification of candidate natural protein substrates. SPD-NGS is rapid, reproducible, simple to perform and analyze, inexpensive, and renewable, with unprecedented depth of coverage for substrate sequences. It is an important tool for protease biologists who want to define protease specificity for designing specific assays and inhibitors and to facilitate identification of natural protein substrates.
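The tiling scheme described for Lib hP (49-residue windows with 25-residue overlaps, which implies a stride of 24 residues) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `tile_protein`, the handling of proteins shorter than one tile, and the extra tile added to cover the C-terminus are all assumptions made for illustration.

```python
def tile_protein(sequence, tile_len=49, overlap=25):
    """Split a protein sequence into overlapping tiles.

    Returns a list of (start_index, tile) pairs, 0-based.
    Defaults follow the 49-residue / 25-residue-overlap figures in the
    abstract; everything else here is a hypothetical illustration.
    """
    if not 0 <= overlap < tile_len:
        raise ValueError("overlap must be >= 0 and smaller than tile_len")

    step = tile_len - overlap  # 49 - 25 = 24 residues between tile starts

    # Assumption: a protein shorter than one tile is kept as a single tile.
    if len(sequence) <= tile_len:
        return [(0, sequence)]

    tiles = [
        (start, sequence[start:start + tile_len])
        for start in range(0, len(sequence) - tile_len + 1, step)
    ]

    # Assumption: add one final full-length tile so the C-terminus is
    # covered when the stride does not land exactly on the last window.
    last_start = len(sequence) - tile_len
    if tiles[-1][0] != last_start:
        tiles.append((last_start, sequence[last_start:]))

    return tiles


if __name__ == "__main__":
    # Toy 100-residue sequence, not a real protein.
    for start, tile in tile_protein("ACDEFGHIKLMNPQRSTVWY" * 5):
        print(start, len(tile))
```

With this stride, any cleavage motif of six to eight residues falls entirely within at least one tile, which is presumably the purpose of the overlap.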