Article

Different scaling of linear models and deep learning in UKBiobank brain images versus machine-learning datasets

Journal

NATURE COMMUNICATIONS
Volume 11, Issue 1, Pages -

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41467-020-18037-z

Funding

  1. UKBiobank Resource [25163]
  2. Healthy Brains Healthy Lives initiative (Canada First Research Excellence Fund)
  3. CIFAR Artificial Intelligence Chairs program (Canadian Institute for Advanced Research)
  4. Google
  5. NIH [R01AG068563A]
  6. Deutsche Forschungsgemeinschaft (DFG) [BZ2/2-1, BZ2/3-1, BZ2/4-1, IRTG2150]
  7. Amazon AWS Research Grant
  8. START-Program of the Faculty of Medicine [126/16]
  9. Exploratory Research Space, RWTH Aachen [OPSF449]
  10. RWTH Aachen University [rwth0238]
  11. Singapore National Research Foundation (NRF) Fellowship

Recently, deep learning has unlocked unprecedented success in various domains, especially using images, text, and speech. However, deep learning is only beneficial if the data have nonlinear relationships and if they are exploitable at available sample sizes. We systematically profiled the performance of deep, kernel, and linear models as a function of sample size on UKBiobank brain images against established machine learning references. On MNIST and Zalando Fashion, prediction accuracy consistently improves when escalating from linear models to shallow-nonlinear models, and further improves with deep-nonlinear models. In contrast, using structural or functional brain scans, simple linear models perform on par with more complex, highly parameterized models in age/sex prediction across increasing sample sizes. In sum, linear models keep improving as the sample size approaches ~10,000 subjects. Yet, nonlinearities for predicting common phenotypes from typical brain scans remain largely inaccessible to the examined kernel and deep learning methods.

Schulz et al. systematically benchmark performance scaling with increasingly sophisticated prediction algorithms and with increasing sample size in reference machine-learning and biomedical datasets. Complicated nonlinear intervariable relationships remain largely inaccessible for predicting key phenotypes from typical brain scans.
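The idea of profiling model classes as a function of sample size can be illustrated with a small, self-contained sketch. It is not the authors' pipeline: the data are synthetic, and the two models are stand-ins chosen for brevity (a nearest-centroid classifier for the linear model, a 1-nearest-neighbour classifier for the nonlinear one). All function names, sample sizes, and datasets below are assumptions made for illustration. The "blobs" data carry a linearly exploitable signal, whereas the "xor" data carry a purely nonlinear one.

```python
import random

def make_blobs(n, rng):
    """Two Gaussian classes shifted along the diagonal (linearly exploitable)."""
    data = []
    for i in range(n):
        y = i % 2                       # balanced labels
        c = 1.0 if y == 1 else -1.0     # class centre at (c, c)
        data.append(((rng.gauss(c, 1.0), rng.gauss(c, 1.0)), y))
    return data

def make_xor(n, rng):
    """XOR pattern: class 1 where both coordinates share a sign (nonlinear only)."""
    data = []
    for i in range(n):
        y = i % 2
        a, b = rng.uniform(0, 1), rng.uniform(0, 1)
        s = rng.choice((-1, 1))
        x = (s * a, s * b) if y == 1 else (s * a, -s * b)
        data.append((x, y))
    return data

def sq_dist(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def fit_centroid(train):
    """Nearest-centroid classifier; for two classes its boundary is a straight line."""
    cents = {}
    for label in (0, 1):
        pts = [x for x, y in train if y == label]
        cents[label] = (sum(p[0] for p in pts) / len(pts),
                        sum(p[1] for p in pts) / len(pts))
    return lambda x: min(cents, key=lambda c: sq_dist(x, cents[c]))

def fit_1nn(train):
    """1-nearest-neighbour classifier; can follow arbitrary nonlinear boundaries."""
    return lambda x: min(train, key=lambda t: sq_dist(x, t[0]))[1]

def accuracy(predict, test):
    return sum(predict(x) == y for x, y in test) / len(test)

def learning_curve(make_data, sizes, n_test=500, seed=0):
    """Return (n_train, linear accuracy, nonlinear accuracy) for each training size."""
    rng = random.Random(seed)
    test = make_data(n_test, rng)       # one fixed held-out set per dataset
    rows = []
    for n in sizes:
        train = make_data(n, rng)
        rows.append((n,
                     accuracy(fit_centroid(train), test),
                     accuracy(fit_1nn(train), test)))
    return rows

SIZES = (50, 200, 1000)  # illustrative sizes, far smaller than those in the study
curves = {"blobs": learning_curve(make_blobs, SIZES),
          "xor": learning_curve(make_xor, SIZES)}

for name, rows in curves.items():
    for n, acc_linear, acc_nonlinear in rows:
        print(f"{name:5s} n={n:4d} linear={acc_linear:.3f} nonlinear={acc_nonlinear:.3f}")
```

In this toy setting the nonlinear model should pull ahead only on the XOR data, where the signal cannot be captured by a straight boundary. On the blobs data the linear model should do at least as well. This mirrors, in miniature, the contrast the abstract draws between the machine-learning reference datasets and the brain-imaging data.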


