Article

Building a BioChemformatics Database


The structural registration of chemically modified macromolecules is vital for the development of biopharmaceuticals. However, registering and searching such complex molecules has so far posed formidable performance challenges, because today's chemistry-oriented databases do not scale well to macromolecules. As a practical consequence, macromolecules tend to be stored in protein databases that focus on protein sequence only, so salient chemistry details are lost. This article describes protein format extensions and the use of pseudoatoms to represent natural amino acids in chemical structures, which together allow high-performance registration and retrieval of large macromolecules. The representations include exact chemical modifications and enable lossless conversion between chemistry and sequence formats. Registration is done in parallel in both sequence and chemistry formats, and users can register and retrieve molecules in either format as they choose, resulting in what we call a BioChemformatics database. Having both sequence and chemistry formats available on demand allows the construction of protein SAR tables with mixed sequence and chemistry information. Likewise, searches may combine sequence and chemistry terms and can be performed in standard vendor applications such as MDL's ISIS/Base or in in-house applications using standard SQL queries.
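The idea of parallel registration and mixed searching can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' actual format or schema: the bracketed notation for modified residues (for example `[K(acetyl)]`), the table and column names, and the helper functions are all hypothetical. Natural residues are kept as single pseudoatom nodes, modified residues are stored as explicit chemistry entries, and a plain SQL query combines a sequence term with a chemistry term.

```python
import sqlite3

# Hypothetical notation: one-letter codes for natural residues, bracketed
# tokens such as [K(acetyl)] for chemically modified residues.
NATURAL = set("ACDEFGHIKLMNPQRSTVWY")


def sequence_to_nodes(seq):
    """Split an extended sequence string into (kind, label) nodes."""
    nodes, i = [], 0
    while i < len(seq):
        if seq[i] == "[":
            end = seq.index("]", i)  # raises ValueError if unbalanced
            nodes.append(("modified", seq[i + 1:end]))
            i = end + 1
        elif seq[i] in NATURAL:
            nodes.append(("pseudoatom", seq[i]))
            i += 1
        else:
            raise ValueError(f"unknown residue {seq[i]!r} at index {i}")
    return nodes


def nodes_to_sequence(nodes):
    """Inverse of sequence_to_nodes, so the round trip is lossless."""
    return "".join(
        label if kind == "pseudoatom" else f"[{label}]" for kind, label in nodes
    )


def make_db():
    """Create an in-memory database with an illustrative two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE molecule (id INTEGER PRIMARY KEY, name TEXT,
                               plain_sequence TEXT);
        CREATE TABLE node (mol_id INTEGER, position INTEGER,
                           kind TEXT, label TEXT);
        """
    )
    return conn


def register(conn, name, seq):
    """Store the sequence view and the node (chemistry) view side by side."""
    nodes = sequence_to_nodes(seq)
    # Modified residues appear as 'X' in the sequence-only view.
    plain = "".join(lab if kind == "pseudoatom" else "X" for kind, lab in nodes)
    cur = conn.execute(
        "INSERT INTO molecule (name, plain_sequence) VALUES (?, ?)", (name, plain)
    )
    conn.executemany(
        "INSERT INTO node (mol_id, position, kind, label) VALUES (?, ?, ?, ?)",
        [(cur.lastrowid, pos, kind, lab)
         for pos, (kind, lab) in enumerate(nodes, start=1)],
    )
    return cur.lastrowid


def search(conn, motif, modification):
    """Find molecules matching a sequence motif AND carrying a modification."""
    rows = conn.execute(
        """
        SELECT DISTINCT m.name
        FROM molecule AS m JOIN node AS n ON n.mol_id = m.id
        WHERE m.plain_sequence LIKE ?
          AND n.kind = 'modified' AND n.label LIKE ?
        ORDER BY m.name
        """,
        (f"%{motif}%", f"%{modification}%"),
    ).fetchall()
    return [row[0] for row in rows]
```

With two made-up peptides registered as `ACD[K(acetyl)]EFG` and `ACDKEFG`, calling `search(conn, "ACD", "acetyl")` would return only the first, since both share the sequence motif but only one carries the modification. A real system would store full chemical structures for the modified parts; the text labels here only stand in for them.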
