Prediction of celiac disease associated epitopes and motifs in a protein

FRONTIERS IN IMMUNOLOGY (2023)

Journal

FRONTIERS IN IMMUNOLOGY

Volume 14, Issue -, Pages -

Publisher

FRONTIERS MEDIA SA

DOI: 10.3389/fimmu.2023.1056101

Keywords

celiac disease; gluten immunogenic peptides; HLA-DQ2; DQ8; ensemble method; motif

Category

Immunology

Smart summary

In this study, computational tools were used to predict CD-associated epitopes and motifs in protein-based foods and therapeutics. The analysis showed that proline (P) and glutamine (Q) are highly abundant in CD-associated peptides. Machine-learning-based models and a motif-based approach were developed, and the best models and motifs were integrated into a web server and standalone software package, CDpred.

Abstract

Introduction

Celiac disease (CD) is an autoimmune gastrointestinal disorder that causes immune-mediated enteropathy in response to gluten. Gluten immunogenic peptides can trigger immune responses that damage the small intestine. HLA-DQ2 and HLA-DQ8 are the major alleles that bind to epitopes (antigenic regions) of gluten and induce celiac disease. There is a need to identify CD-associated epitopes in protein-based foods and therapeutics.

Methods

In this study, computational tools were developed to predict CD-associated epitopes and motifs. The datasets used for training, testing, and evaluation contain experimentally validated CD-associated and non-CD-associated peptides. We performed positional analysis to identify the most significant positions of amino acid residues in the peptides and checked the frequency of HLA alleles. We also computed amino acid composition to develop machine-learning-based models, and developed an ensemble method that combines the motif-based approach with the machine-learning-based models.

Results and Discussion

Our analysis supports the existing hypothesis that proline (P) and glutamine (Q) are highly abundant in CD-associated peptides. A model based on the density of P and Q in peptides was developed for predicting CD-associated peptides; it achieves a maximum AUROC of 0.98 on independent data. We discovered motifs (e.g., QPF, QPQ, PYP) that occur specifically in CD-associated peptides. We also developed machine-learning-based models using peptide composition, which achieved a maximum AUROC of 0.99. Finally, the ensemble method, which combines the motif-based approach and the machine-learning-based models, predicts CD-associated motifs with 100% accuracy on an independent dataset not used for training. The best models and motifs have been integrated into a web server and standalone software package, CDpred.
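The two peptide features named in the abstract, amino acid composition and the density of P and Q, can be illustrated with a minimal Python sketch. This is not the CDpred implementation: the function names and the `threshold` value are hypothetical and chosen only for illustration, and the abstract does not state the cutoff the authors used.

```python
from collections import Counter
from typing import Dict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(peptide: str) -> Dict[str, float]:
    """Return the percentage of each standard residue in the peptide."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must be non-empty")
    counts = Counter(peptide)
    return {aa: 100.0 * counts[aa] / len(peptide) for aa in AMINO_ACIDS}


def pq_density(peptide: str) -> float:
    """Return the fraction of residues that are proline (P) or glutamine (Q)."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must be non-empty")
    return (peptide.count("P") + peptide.count("Q")) / len(peptide)


def predict_by_pq_density(peptide: str, threshold: float = 0.4) -> bool:
    """Flag a peptide as a CD-associated candidate when its P+Q density
    reaches the threshold. The default of 0.4 is an assumed, illustrative
    cutoff, not a value taken from the paper."""
    return pq_density(peptide) >= threshold


if __name__ == "__main__":
    example = "PQPQLPYPQ"  # illustrative P/Q-rich peptide string
    print(round(pq_density(example), 3))
    print(amino_acid_composition(example)["P"])
    print(predict_by_pq_density(example))
```

The 20-value composition vector is the kind of fixed-length input a standard classifier can be trained on, while the P+Q density is a single score that can be thresholded directly.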
We hope this server will help the scientific community in the prediction, design, and scanning of CD-associated peptides, as well as CD-associated motifs, in a protein/peptide sequence (https://webs.iiitd.edu.in/raghava/cdpred/).
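Scanning a protein sequence for motifs can be sketched as follows. This is a rough illustration under stated assumptions, not the CDpred scanning code: the motif list contains only the three examples given in the abstract, the window length of 9 is an assumed default, and the function names are hypothetical.

```python
from typing import Iterator, List, Sequence, Tuple

# Example motifs quoted in the abstract; the full CDpred motif set is not listed here.
MOTIFS = ("QPF", "QPQ", "PYP")


def find_motifs(sequence: str, motifs: Sequence[str] = MOTIFS) -> List[Tuple[int, str]]:
    """Return sorted (1-based position, motif) hits, including overlapping ones."""
    sequence = sequence.upper()
    hits = []
    for motif in motifs:
        start = sequence.find(motif)
        while start != -1:
            hits.append((start + 1, motif))
            # Advance by one residue so overlapping occurrences are found.
            start = sequence.find(motif, start + 1)
    return sorted(hits)


def windows(sequence: str, length: int = 9) -> Iterator[Tuple[int, str]]:
    """Yield (1-based position, peptide) for every overlapping window."""
    for i in range(len(sequence) - length + 1):
        yield i + 1, sequence[i:i + length]


def scan(sequence: str, length: int = 9,
         motifs: Sequence[str] = MOTIFS) -> List[Tuple[int, str, List[str]]]:
    """Return the windows that contain at least one motif, with the motifs found.
    The window length is an assumption made for this sketch."""
    results = []
    for pos, peptide in windows(sequence.upper(), length):
        found = [m for m in motifs if m in peptide]
        if found:
            results.append((pos, peptide, found))
    return results


if __name__ == "__main__":
    for hit in find_motifs("AQPQPFA"):
        print(hit)
```

Windows flagged by `scan` could then be passed to a composition-based scorer; how CDpred itself combines motif hits with model scores is not specified in this abstract.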
