A germline knowledge based computational approach for determining antibody complementarity determining regions

Journal

MOLECULAR IMMUNOLOGY
Volume 47, Issue 4, Pages 694-700

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.molimm.2009.10.028

Keywords

Germline; Framework regions (FRs); Complementarity determining regions (CDRs); CDR definition; Kabat; IMGT; Chothia; Somatic insertions; Somatic deletions; Algorithm; JAVA; CDR delimitation

Determination of framework regions (FRs) and complementarity determining regions (CDRs) in an antibody is essential for understanding the underlying biology as well as for antibody engineering and optimization. However, there are no computational algorithms available to delimit an antibody sequence or a library of sequences into FRs and CDRs in a coherent and automatic fashion. Based upon the mapping relationships among mature antibody sequences and their corresponding germline gene segments, a novel computational algorithm has been developed for automatic determination of CDRs. Even though a human can make more than 10^12 different antibody molecules in its preimmune repertoire to fight off invading pathogens, these antibodies are generated from rearrangements of a very limited number of germline variable (V) gene, diversity (D) gene and joining (J) gene segments followed by somatic hypermutation. The framework regions FR1, FR2 and FR3 in mature antibodies are encoded by germline V gene segments, while FR4 is encoded by J gene segments. Since there are only a limited number of germline gene segments, these genes can be pre-delimited to generate a knowledge base of FRs and CDRs. Then, for a given antibody sequence, the algorithm scans each pre-delimited gene in the knowledge base, finds the best matching V and J segments, and accordingly identifies the FRs and CDRs. The described algorithm was stringently tested using nearly 25,000 human antibody sequences from NCBI, and it proved to be very robust. Over 99.7% of antibody sequences can be delimited computationally. Of those delimited sequences, only 0.28% have somatic insertions and deletions in FRs, and their corresponding delimited results need manual checking. Another feature of the algorithm is that it is CDR definition independent, and it can be easily extended to other CDR definitions besides the most widely used Kabat, Chothia and IMGT definitions.
In addition to delimitation of antibody sequences into FRs and CDRs, the described algorithm is good for sequence annotation and sequence quality control by detecting unusual sequence patterns and features. Furthermore, it has been suggested that the algorithm may easily be embedded into other applications, such as to create a gene family specific PSSM (Position Specific Scoring Matrix) for antibody engineering, and to automatically number an antibody sequence. (C) 2010 Elsevier Ltd. All rights reserved.
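
The matching step described in the abstract (scan a knowledge base of pre-delimited germline genes, pick the best matching V and J segments, then transfer their FR/CDR boundaries to the query) can be illustrated with a small sketch. This is not the authors' implementation: it is written in Python rather than Java, all names (`Germline`, `identity`, `delimit`, `V_GENES`, `J_GENES`) are hypothetical, the sequences are short toy strings rather than real germline genes, and a simple ungapped identity count stands in for whatever alignment scoring the paper actually uses. It also assumes no somatic insertions or deletions and treats the whole toy J segment as FR4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Germline:
    """One pre-delimited germline segment in the (toy) knowledge base."""
    name: str
    seq: str
    # Half-open (label, start, end) boundaries on seq. Which boundaries are
    # stored is what encodes the CDR definition (Kabat, Chothia, IMGT, ...).
    regions: tuple = ()


def identity(a: str, b: str) -> int:
    """Count matching positions over the ungapped overlap of two strings."""
    return sum(x == y for x, y in zip(a, b))


def delimit(query: str, v_genes, j_genes):
    """Pick the best V and J segments and transfer their boundaries.

    Assumption: the query starts at the first residue of FR1 and ends at
    the last residue of FR4, with no insertions or deletions.
    """
    # V segments are compared against the start of the query.
    v = max(v_genes, key=lambda g: identity(query, g.seq))
    # J segments are compared against the tail of the query.
    j = max(j_genes, key=lambda g: identity(query[-len(g.seq):], g.seq))

    # FR1..FR3 and CDR1..CDR2 come straight from the V gene's boundaries.
    out = {label: query[s:e] for label, s, e in v.regions}
    fr3_end = v.regions[-1][2]
    fr4_start = len(query) - len(j.seq)
    # CDR3 is whatever lies between the end of FR3 and the start of FR4.
    out["CDR3"] = query[fr3_end:fr4_start]
    out["FR4"] = query[fr4_start:]
    return v.name, j.name, out


# Toy knowledge base: made-up sequences, NOT real germline genes.
V_GENES = [
    Germline("V_A", "QVQLSYWVRQINRFTI",
             (("FR1", 0, 4), ("CDR1", 4, 6), ("FR2", 6, 10),
              ("CDR2", 10, 12), ("FR3", 12, 16))),
    Germline("V_B", "EVKLGFTWIRQTDKATL",
             (("FR1", 0, 4), ("CDR1", 4, 7), ("FR2", 7, 11),
              ("CDR2", 11, 13), ("FR3", 13, 17))),
]
J_GENES = [Germline("J_A", "WGQG"), Germline("J_B", "FGGG")]

if __name__ == "__main__":
    # V_A with one point mutation in CDR1 (Y -> H), a toy CDR3, then J_A.
    query = "QVQLSHWVRQINRFTIARDYWGQG"
    v_name, j_name, regions = delimit(query, V_GENES, J_GENES)
    print(v_name, j_name)
    for label, segment in regions.items():
        print(label, segment)
```

Because the boundaries live in the knowledge base rather than in the matching code, supporting a different CDR definition in this sketch only means storing different `regions` tuples, which mirrors the definition independence the abstract describes. Queries with insertions or deletions in the FRs would need a gapped alignment and, as the abstract notes, manual checking.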
