4.6 Article

ColGen: An end-to-end deep learning model to predict thermal stability of de novo collagen sequences

Publisher

ELSEVIER
DOI: 10.1016/j.jmbbm.2021.104921

Keywords

Collagen; Machine learning; Melting temperature; Deep learning; Long short-term memory artificial recurrent neural network

Funding

  1. IBM-MIT Watson AI Lab, ARO [W911NF-17-1-0384]
  2. NOR [NIH P41EB027062, U01 EB014976]
  3. ONR [N000141612333, N000142012189, N000141912375]
  4. Ministry of Science and Technology in Taiwan [MOST 109-2222-E-006-005-MY2, MOST 110-2224-E-007-003]
  5. NSF GRFP
  6. U.S. Department of Defense (DOD) [N000142012189]

Collagen is a crucial structural protein in human tissues, and collagen-based biomaterials are commonly used for tissue repair and regeneration, but designing a collagen sequence with a specific melting temperature remains a challenge. This study finds that mutations to glycines, mutations in the middle of a sequence, and short sequence lengths have the greatest impact on the stability of collagen structures.
Collagen is the most abundant structural protein in humans, with dozens of sequence variants accounting for over 30% of the protein in an animal body. The fibrillar and hierarchical arrangements of collagen are critical in providing mechanical properties with high strength and toughness. Because of this ubiquitous role in human tissues, collagen-based biomaterials are commonly used for tissue repair and regeneration, which requires chemical and thermal stability over a range of temperatures during materials preparation ex vivo and subsequent use in vivo. Collagen unfolds from a triple helix to a random coil over a temperature interval, and the midpoint of that interval, the melting temperature Tm, is used as a measure of the thermal stability of the molecule. However, a robust framework for designing a collagen sequence that yields a specific Tm remains a challenge, including with conventional molecular dynamics modeling. Here we propose a de novo framework: a deep learning model, trained on a large data set of collagen sequences and corresponding Tm values, that outputs the Tm value of an input collagen sequence. Using this framework, we can quickly evaluate how mutations and residue order in the primary sequence affect the stability of collagen triple helices. Specifically, we confirm that mutations to glycines, mutations in the middle of a sequence, and short sequence lengths cause the greatest drop in Tm values.
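
The kind of mutation study described in the abstract can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the residue alphabet (including "O" for hydroxyproline), the zero-padding scheme, and the function names `encode`, `single_point_mutants`, and `mutation_scan` are all assumptions, and `predict_tm` stands in for whatever trained sequence-to-Tm model is available.

```python
from typing import Callable, Dict, List, Tuple

# Assumed alphabet: the 20 standard amino acids plus "O" for hydroxyproline.
ALPHABET = "ACDEFGHIKLMNPQRSTVWYO"
# Index 0 is reserved for padding, so residues are numbered from 1.
TOKEN_INDEX: Dict[str, int] = {aa: i + 1 for i, aa in enumerate(ALPHABET)}


def encode(sequence: str, max_len: int) -> List[int]:
    """Map a sequence to integer tokens, right-padded with 0 up to max_len."""
    if len(sequence) > max_len:
        raise ValueError("sequence is longer than max_len")
    tokens = [TOKEN_INDEX[aa] for aa in sequence]
    return tokens + [0] * (max_len - len(tokens))


def single_point_mutants(sequence: str, new_residue: str) -> List[Tuple[int, str]]:
    """Return (position, mutant) pairs, one per residue that differs from new_residue."""
    return [
        (i, sequence[:i] + new_residue + sequence[i + 1:])
        for i, aa in enumerate(sequence)
        if aa != new_residue
    ]


def mutation_scan(
    sequence: str,
    new_residue: str,
    predict_tm: Callable[[str], float],
) -> List[Tuple[int, float]]:
    """Return (position, change in predicted Tm) for every single-point mutant.

    predict_tm is a hypothetical placeholder for a trained model; this
    function only organizes the comparison against the unmutated sequence.
    """
    baseline = predict_tm(sequence)
    return [
        (pos, predict_tm(mutant) - baseline)
        for pos, mutant in single_point_mutants(sequence, new_residue)
    ]
```

Given a trained predictor, sorting the output of `mutation_scan` by the Tm change would show which positions are most sensitive, which is the sort of comparison behind the glycine and mid-sequence findings reported above.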
