Journal
TOXICOLOGY
Volume 313, Issues 2-3, Pages 151-159
Publisher
ELSEVIER IRELAND LTD
DOI: 10.1016/j.tox.2013.01.016
Keywords
Support vector machine; Complex mixture; Pollution profile; Mutagenicity
Funding
- National Key Technology R&D Program in the 11th Five Year Plan [2006BAI19B02]
- National Natural Science Foundation of China [30972438, 30771770, 81273035, 81202165]
- Key Project of National High-tech R&D Program of China (863 Program) [2008AA062501, 2013AA065204]
- Shanghai Municipal Health Bureau Leading Academic Discipline Project [08GWD14]
- Dawn Program of Shanghai Education Commission [07SG01]
- Non-profit Foundation of National Health Ministry in the 12th Five Year Plan [201302004, 2012BAJ25805]
Powerful, robust in silico approaches offer great promise for classifying and predicting the biological effects of complex mixtures and for identifying the constituents of greatest concern. Support vector machine (SVM) methods can handle high-dimensional data and small sample sizes and can examine multiple interrelationships among samples. In this work, we applied SVM methods to examine the pollution profiles and mutagenicity of 60 water samples obtained from 6 cities in China during 2006-2011. Pollutant profiles were characterized in water extracts by gas chromatography-mass spectrometry (GC/MS), and mutagenicity was examined by Ames assays. We encoded feature vectors of GC/MS peaks in the mixtures and used 48 samples as the training set, reserving 12 samples as the test set. SVM classification and regression models were constructed from whole pollution profiles, and compounds were ranked by their correlation with mutagenicity. Both classification and prediction performance were evaluated. The SVM model based on whole pollution profiles showed lower performance (sensitivity, specificity, accuracy, and correlation coefficient were 69.5-70.7%, 70.6-73.2%, 69.9-72.1%, and 0.55-0.59, respectively) than one based on the compounds most strongly associated with mutagenicity. An SVM model with the top 10 compounds had the highest performance (sensitivity, specificity, accuracy, and correlation coefficient were 89.8-90.3%, 90.1-92.1%, 90.1-91.3%, and 0.80-0.82, respectively), with negligible decreases in performance between the training and test sets. SVM can be a powerful, robust classifier of the relationship between pollutants and mutagenicity in complex real-world mixtures. The top 14 compounds have the greatest contribution to mutagenicity and deserve further studies to identify these constituents. © 2013 Elsevier Ireland Ltd. All rights reserved.
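Two steps named in the abstract, ranking compounds by their correlation with mutagenicity and scoring a classifier by sensitivity, specificity, and accuracy, can be sketched in a few lines. This is a minimal illustration and not the authors' pipeline: it does not train an SVM, the use of Pearson correlation is an assumption (the abstract does not say which correlation measure was used), and the function names `pearson`, `rank_compounds`, and `classification_metrics` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_compounds(profiles, response, top_k):
    """Return indices of the top_k peak columns by |correlation| with the response.

    profiles: one row per water sample, one column per GC/MS peak (assumed layout).
    response: one mutagenicity value per sample.
    """
    n_features = len(profiles[0])
    scored = []
    for j in range(n_features):
        column = [row[j] for row in profiles]
        scored.append((abs(pearson(column, response)), j))
    # Strongest association first; ties broken by column index.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scored[:top_k]]


def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy for binary labels (1 = mutagenic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
    }
```

In a workflow like the one described, the indices returned by `rank_compounds` would select the peak columns fed to the classifier, and `classification_metrics` would be computed separately on the training and test predictions to check that performance does not drop between them.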