Training data distribution significantly impacts the estimation of tissue microstructure with machine learning

MAGNETIC RESONANCE IN MEDICINE (2022)

Journal

MAGNETIC RESONANCE IN MEDICINE

Volume 87, Issue 2, Pages 932-947

Publisher

WILEY

DOI: 10.1002/mrm.29014

Keywords

machine learning; microstructure imaging; model fitting; quantitative MRI; training data distribution

Funding

Biotechnology and Biological Sciences Research Council [BB/M009513/1]
Engineering and Physical Sciences Research Council [EP/M020533/1, EP/N018702/1]
UK Research and Innovation [MR/T020296/1]
NIHR GOSH Biomedical Research Centre
NIHR UCLH Biomedical Research Centre

Automated Summary

This study demonstrates the impact of different training data distributions on the accuracy and precision of parameter estimates using supervised machine learning for quantitative MRI. The research shows that the distribution of training data strongly influences the estimation of model parameters, with high precision obtained through ML potentially masking strong bias. Visual assessment of parameter maps alone is insufficient for evaluating the quality of estimates.

Abstract

Purpose: Supervised machine learning (ML) provides a compelling alternative to traditional model fitting for parameter mapping in quantitative MRI. The aim of this work is to demonstrate and quantify the effect of different training data distributions on the accuracy and precision of parameter estimates when supervised ML is used for fitting.

Methods: We fit a two- and three-compartment biophysical model to diffusion measurements from in-vivo human brain, as well as simulated diffusion data, using both traditional model fitting and supervised ML. For supervised ML, we train several artificial neural networks, as well as random forest regressors, on different distributions of ground truth parameters. We compare the accuracy and precision of parameter estimates obtained from the different estimation approaches using synthetic test data.

Results: When the distribution of parameter combinations in the training set matches those observed in healthy human data sets, we observe high precision, but inaccurate estimates for atypical parameter combinations. In contrast, when training data is sampled uniformly from the entire plausible parameter space, estimates tend to be more accurate for atypical parameter combinations but may have lower precision for typical parameter combinations.

Conclusion: This work highlights that estimation of model parameters using supervised ML depends strongly on the training-set distribution. We show that high precision obtained using ML may mask strong bias, and visual assessment of the parameter maps is not sufficient for evaluating the quality of the estimates.
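The trade-off described in the abstract can be illustrated with a deliberately simplified sketch. It does not reproduce the paper's models or estimators: it assumes a toy mono-exponential signal S(b) = exp(-b·D) in place of the two- and three-compartment models, and a nearest-neighbour regressor in place of the neural networks and random forests. The b-values, noise level, parameter range, and all function names are hypothetical choices made only for this illustration.

```python
import numpy as np

# Hypothetical toy setup (not the paper's models): mono-exponential decay
# S(b) = exp(-b * D) measured at three b-values with Gaussian noise.
B_VALUES = np.array([0.5, 1.0, 2.0])   # assumed b-values, ms/um^2
NOISE_SD = 0.02                        # assumed noise level
D_MIN, D_MAX = 0.1, 3.0                # assumed plausible range of D, um^2/ms


def simulate(d, rng):
    """Return noisy signals, one row per diffusivity in d."""
    clean = np.exp(-np.outer(d, B_VALUES))
    return clean + rng.normal(0.0, NOISE_SD, clean.shape)


def knn_predict(train_x, train_d, test_x, k=20):
    """Predict D as the mean label of the k nearest training signals."""
    d2 = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=-1)
    nearest = np.argpartition(d2, k, axis=1)[:, :k]
    return train_d[nearest].mean(axis=1)


def bias_and_sd(train_d, true_d, rng, n_test=300):
    """Bias and standard deviation of estimates at a single true D."""
    train_x = simulate(train_d, rng)
    test_x = simulate(np.full(n_test, true_d), rng)
    est = knn_predict(train_x, train_d, test_x)
    return est.mean() - true_d, est.std()


rng = np.random.default_rng(0)
n_train = 2000
# "Typical" training set: D concentrated around one assumed typical value.
d_narrow = np.clip(rng.normal(0.7, 0.1, n_train), D_MIN, D_MAX)
# Uniform training set: D spread over the whole assumed plausible range.
d_uniform = rng.uniform(D_MIN, D_MAX, n_train)

atypical_d = 2.5  # far outside the bulk of the narrow training set
bias_narrow, sd_narrow = bias_and_sd(d_narrow, atypical_d, rng)
bias_uniform, sd_uniform = bias_and_sd(d_uniform, atypical_d, rng)

print(f"narrow : bias={bias_narrow:+.3f}, sd={sd_narrow:.3f}")
print(f"uniform: bias={bias_uniform:+.3f}, sd={sd_uniform:.3f}")
```

Under these assumptions, the estimator trained on the narrow distribution should return tightly clustered but strongly underestimated values at the atypical test point, because a nearest-neighbour average cannot leave the range of its training labels. The estimator trained on the uniform distribution should be far less biased there, at the cost of a larger spread. This is only a qualitative analogue of the effect the abstract reports, not a reproduction of the study's results.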
