4.7 Article

Support vector machine: Classifying and predicting mutagenicity of complex mixtures based on pollution profiles

Journal

TOXICOLOGY
Volume 313, Issue 2-3, Pages 151-159

Publisher

ELSEVIER IRELAND LTD
DOI: 10.1016/j.tox.2013.01.016

Keywords

Support vector machine; Complex mixture; Pollution profile; Mutagenicity

Funding

  1. National Key Technology R&D Program in the 11th Five Year Plan [2006BAI19B02]
  2. National Natural Science Foundation of China [30972438, 30771770, 81273035, 81202165]
  3. Key Project of National High-tech R&D Program of China (863 Program) [2008AA062501, 2013AA065204]
  4. Shanghai Municipal Health Bureau Leading Academic Discipline Project [08GWD14]
  5. Dawn Program of Shanghai Education Commission [07SG01]
  6. Non-profit Foundation of National Health Ministry in the 12th Five Year Plan [201302004, 2012BAJ25805]

Powerful, robust in silico approaches offer great promise for classifying and predicting biological effects of complex mixtures and for identifying the constituents of greatest concern. Support vector machine (SVM) methods can deal with high-dimensional data and small sample sizes and can examine multiple interrelationships among samples. In this work, we applied SVM methods to examine pollution profiles and mutagenicity of 60 water samples obtained from 6 cities in China during 2006-2011. Pollutant profiles were characterized in water extracts by gas chromatography-mass spectrometry (GC/MS), and mutagenicity was examined by Ames assays. We encoded feature vectors of GC/MS peaks in the mixtures and used 48 samples as the training set, reserving 12 samples as the test set. The SVM model and regression were constructed from whole pollution profiles that ranked compounds in relation to their correlation to the mutagenicity. Both classification and prediction performance were evaluated. The SVM model based on whole pollution profiles showed lower performance (sensitivity, specificity, accuracy, and correlation coefficient were 69.5-70.7%, 70.6-73.2%, 69.9-72.1%, and 0.55-0.59, respectively) than one based on the compounds with the highest association with mutagenicity. An SVM model with the top 10 compounds had the highest performance (sensitivity, specificity, accuracy, and correlation coefficient were 89.8-90.3%, 90.1-92.1%, 90.1-91.3%, and 0.80-0.82, respectively), with negligible decreases in performance between the training and test sets. SVM can be a powerful, robust classifier of the relationship between pollutants and mutagenicity in complex real-world mixtures. The top 14 compounds have the greatest contribution to mutagenicity and deserve further studies to identify these constituents. © 2013 Elsevier Ireland Ltd. All rights reserved.
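The abstract does not say how compounds were ranked or which correlation coefficient was reported, so the following is only a rough sketch under stated assumptions: binary mutagenicity labels (1 = mutagenic, 0 = not), ranking by absolute Pearson correlation between each peak's intensity and the label, and the usual confusion-matrix definitions of sensitivity, specificity, and accuracy. The function names `rank_features` and `classification_metrics` and the toy data are hypothetical, and no SVM is trained here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_features(profiles, labels, top_k):
    """Return indices of the top_k features by absolute correlation with the labels.

    Assumption: `profiles` is a list of samples, each a list of peak intensities.
    """
    n_features = len(profiles[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in profiles]
        scores.append((abs(pearson(column, labels)), j))
    # Highest score first; ties broken by the lower feature index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:top_k]]


def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        "accuracy": (tp + tn) / len(y_true) if y_true else 0.0,
    }


# Toy data (invented for illustration): 6 samples, 3 peaks each.
profiles = [
    [5, 2, 4],
    [5, 3, 4],
    [5, 2, 1],
    [1, 3, 1],
    [1, 2, 1],
    [1, 3, 1],
]
labels = [1, 1, 1, 0, 0, 0]

top_features = rank_features(profiles, labels, top_k=2)
metrics = classification_metrics(labels, [1, 1, 0, 0, 0, 1])
```

In a workflow like the one described, the selected feature indices would be used to reduce each profile before fitting a classifier on the 48 training samples, and the metrics would be computed separately on the training set and on the 12 held-out samples.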
