Article

Semi-automated harmonization and selection of chemical data for risk and impact assessment

Journal

CHEMOSPHERE
Volume 302

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.chemosphere.2022.134886

Keywords

Data quality; Uncertainty assessment; Chemical properties; REACH; Partition coefficient

Funding

  1. Safe and Efficient Chemistry by Design (SafeChem) project - Swedish Foundation for Strategic Environmental Research [DIA 2018/11]
  2. European Chemicals Agency [ECHA/2017/445]
  3. Technical University of Denmark [ECHA/2017/445]

To use data for various assessments, a method for data harmonization and selection is necessary. We developed a method for obtaining substance property values for different assessment frameworks. The method aligns and selects appropriate values to derive representative mean values.
Chemical data for thousands of substances are available for safety, risk, life cycle and substitution assessments, as submitted for example under the European Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) Regulation. However, to widely disseminate reported physicochemical properties as well as human and ecological exposure and toxicological data for use in various science and policy fields, systematic methods for data harmonization and selection are necessary. In response to this need, we developed a semi-automated method for deriving appropriate substance property values as input for various assessment frameworks with different requirements for resolution and data quality. Starting with data reported for a given substance and property, we propose a set of aligned data selection and harmonization criteria to obtain a representative mean value and related confidence intervals per chemical-property combination. The proposed method was tested on octanol-water partition coefficients (Kow) for an illustrative set of 20 substances, using data reported under the REACH Regulation as an example data source. Our method is generally applicable to any set of substances, and can assess specific distributions in quality and variability across reported data. Further research can likely extend our method to mine information from text fields and adapt it to data reported or collected from other sources and to other substance properties, improving the reliability of input data for risk and impact assessments.
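
The idea of reducing several reported values to a representative mean with a confidence interval can be illustrated with a minimal Python sketch. This is not the authors' procedure: the abstract does not state the selection criteria or the interval formula, so the reliability score (lower meaning more reliable), the cutoff `max_reliability`, the function name `summarize_log_kow`, and the normal-approximation interval are all assumptions made for illustration.

```python
import math
import statistics
from statistics import NormalDist


def summarize_log_kow(records, max_reliability=2, level=0.95):
    """Select reported log10 Kow values and summarize them.

    records: list of (log_kow, reliability) tuples. The reliability
    score is a hypothetical field where lower means more reliable.
    Returns (mean, lower, upper, n_selected).
    """
    # Selection step (assumed rule): keep values at or below the cutoff.
    selected = [value for value, score in records if score <= max_reliability]
    if len(selected) < 2:
        raise ValueError("need at least two selected values")

    # Harmonized summary: arithmetic mean on the log10 scale.
    mean = statistics.fmean(selected)

    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(selected) / math.sqrt(len(selected))

    # Normal approximation (an assumption; small samples would
    # normally call for a Student t quantile instead).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return mean, mean - z * sem, mean + z * sem, len(selected)


# Invented values for one hypothetical substance, not data from the paper.
example = [(3.0, 1), (3.2, 2), (3.4, 1), (5.0, 4)]
mean, lower, upper, n = summarize_log_kow(example)
print(f"n={n}, mean log Kow={mean:.2f}, CI=({lower:.2f}, {upper:.2f})")
```

Working on the log10 scale keeps the interval symmetric around the mean, which is a common choice for partition coefficients that span several orders of magnitude.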
