4.5 Article

Machine Learning of Toxicological Big Data Enables Read-Across Structure Activity Relationships (RASAR) Outperforming Animal Test Reproducibility

期刊

TOXICOLOGICAL SCIENCES
卷 165, 期 1, 页码 198-212

出版社

OXFORD UNIV PRESS
DOI: 10.1093/toxsci/kfy152

关键词

-

资金

  1. NIEHS [T32 ES007141]
  2. EU-ToxRisk project - European Commission [681002]

向作者/读者索取更多资源

Earlier we created a chemical hazard database via natural language processing of dossiers submitted to the European Chemical Agency with approximately 10 000 chemicals. We identified repeat OECD guideline tests to establish reproducibility of acute oral and dermal toxicity, eye and skin irritation, mutagenicity and skin sensitization. Based on 350-700+ chemicals each, the probability that an OECD guideline animal test would output the same result in a repeat test was 78%-96% (sensitivity 50%-87%). An expanded database with more than 866 000 chemical properties/hazards was used as training data and to model health hazards and chemical properties. The constructed models automate and extend the read-across method of chemical classification. The novel models called RASARs (read-across structure activity relationship) use binary fingerprints and Jaccard distance to define chemical similarity. A large chemical similarity adjacency matrix is constructed from this similarity metric and is used to derive feature vectors for supervised learning. We show results on 9 health hazards from 2 kinds of RASARs-Simple and Data Fusion. The Simple RASAR seeks to duplicate the traditional read-across method, predicting hazard from chemical analogs with known hazard data. The Data Fusion RASAR extends this concept by creating large feature vectors from all available property data rather than only the modeled hazard. Simple RASAR models tested in cross-validation achieve 70%-80% balanced accuracies with constraints on tested compounds. Cross validation of data fusion RASARs show balanced accuracies in the 80%-95% range across 9 health hazards with no constraints on tested compounds.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.5
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据