Journal
CHEMOSPHERE
Volume 63, Issue 1, Pages 99-108
Publisher
PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.chemosphere.2005.07.002
Keywords
prediction outlier diagnostics; quantitative structure-activity relationships; partial least squares (PLS); Pseudokirschneriella subcapitata; Daphnia magna; Lepomis macrochirus
Empirical QSAR models are valid only within the domain in which they were trained and validated. Applying a model to substances outside its domain can lead to grossly erroneous predictions. Partial least squares (PLS) regression provides tools for prediction diagnostics that can be used to decide whether or not a substance is within the model domain, i.e. whether the model prediction can be trusted. QSAR models for four different environmental end-points are used to demonstrate the importance of appropriate training set selection and how the reliability of QSAR predictions can be increased by outlier diagnostics. All models showed consistent results: test set prediction errors were very similar in magnitude to training set estimation errors when prediction outlier diagnostics were used to detect and remove outliers in the prediction data. Test set prediction errors for substances classified as outliers were much larger. The difference in the number of outliers between models with randomly and systematically selected training sets illustrates well the need for representative training data. © 2005 Elsevier Ltd. All rights reserved.
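The abstract does not say which PLS diagnostics were used, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a one-component PLS1 model and two commonly used checks: score-space leverage and the distance of a substance to the model in descriptor space (the X-residual norm). The function names, the toy data, and both cut-offs (three times the mean training leverage, twice the RMS training residual) are hypothetical choices made for illustration.

```python
import math

def fit_pls1(X, y):
    """Fit a one-component PLS1 model (illustrative helper, not the paper's code)."""
    n = len(X)
    x_mean = [sum(col) / n for col in zip(*X)]
    y_mean = sum(y) / n
    Xc = [[v - m for v, m in zip(row, x_mean)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in descriptor space with maximum covariance with y.
    w = [sum(a * b for a, b in zip(col, yc)) for col in zip(*Xc)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    t = [sum(a * b for a, b in zip(row, w)) for row in Xc]  # training scores
    tt = sum(v * v for v in t)
    p = [sum(a * b for a, b in zip(col, t)) / tt for col in zip(*Xc)]  # X loadings
    q = sum(a * b for a, b in zip(t, yc)) / tt  # y loading
    # RMS of training X-residual norms, used to scale the residual cut-off.
    sq = [sum((xv - ti * pj) ** 2 for xv, pj in zip(row, p))
          for row, ti in zip(Xc, t)]
    return {"x_mean": x_mean, "y_mean": y_mean, "w": w, "p": p, "q": q,
            "tt": tt, "n": n, "rms_resid": math.sqrt(sum(sq) / n)}

def diagnose(model, x, lev_factor=3.0, resid_factor=2.0):
    """Predict y for one substance and flag it if it lies outside the model domain.

    Both factors are assumed rule-of-thumb cut-offs, not values from the paper.
    """
    xc = [v - m for v, m in zip(x, model["x_mean"])]
    t_new = sum(a * b for a, b in zip(xc, model["w"]))
    leverage = t_new ** 2 / model["tt"]  # training leverages sum to 1 (mean 1/n)
    residual = math.sqrt(sum((v - t_new * pj) ** 2
                             for v, pj in zip(xc, model["p"])))
    high_leverage = leverage > lev_factor / model["n"]
    high_residual = residual > resid_factor * model["rms_resid"]
    return {"prediction": model["y_mean"] + model["q"] * t_new,
            "leverage": leverage, "residual": residual,
            "high_leverage": high_leverage, "high_residual": high_residual,
            "in_domain": not (high_leverage or high_residual)}

# Toy training data (made up): two correlated descriptors and one response.
X_train = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9], [5.0, 5.1]]
y_train = [1.0, 2.1, 2.9, 4.2, 5.0]
model = fit_pls1(X_train, y_train)

inside = diagnose(model, [3.5, 3.54])      # resembles the training substances
extrapolated = diagnose(model, [20.0, 20.0])  # far beyond the training range
off_pattern = diagnose(model, [4.0, 2.0])  # breaks the descriptor correlation

for name, result in [("inside", inside), ("extrapolated", extrapolated),
                     ("off_pattern", off_pattern)]:
    print(name, result["in_domain"], round(result["prediction"], 2))
```

In this toy setting the second substance is flagged by leverage (an extrapolation along the model direction) and the third by its X-residual (a descriptor pattern the model has never seen), while the first passes both checks. A real application would use more components and statistically derived limits rather than these fixed factors.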