4.4 Article

Quality assessment of anatomical MRI images from generative adversarial networks: Human assessment and image quality metrics

Journal

JOURNAL OF NEUROSCIENCE METHODS
Volume 374, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.jneumeth.2022.109579

Keywords

Generative Adversarial Network; GAN; MRI; Machine learning; Generative models; Ageing; Quality assessment; Deep learning

Funding

  1. European Regional Development Fund (ERDF) via Welsh Government
  2. Guarantors of Brain [G101149]

This study investigates the feasibility of using image quality metrics to assess the visual quality of GAN-generated images. The results show that experts are sensitive to changes in image quality, image quality metrics are sensitive to lower-quality images, and a deep quality assessment model trained on human ratings can capture subtle differences between higher-quality images.
Background: Generative Adversarial Networks (GANs) can synthesize brain images from image or noise input. So far, the gold standard for assessing the quality of the generated images has been human expert ratings. However, due to limitations of human assessment in terms of cost, scalability, and the limited sensitivity of the human eye to more subtle statistical relationships, a more automated approach towards evaluating GANs is required.

New method: We investigated to what extent visual quality can be assessed using image quality metrics and we used group analysis and spatial independent components analysis to verify that the GAN reproduces multivariate statistical relationships found in real data. Reference human data was obtained by recruiting neuroimaging experts to assess real Magnetic Resonance (MR) images and images generated by a GAN. Image quality was manipulated by exporting images at different stages of GAN training.

Results: Experts were sensitive to changes in image quality as evidenced by ratings and reaction times, and the generated images reproduced group effects (age, gender) and spatial correlations moderately well. We also surveyed a number of image quality metrics. Overall, Fréchet Inception Distance (FID), Maximum Mean Discrepancy (MMD) and Naturalness Image Quality Evaluator (NIQE) showed sensitivity to image quality and good correspondence with the human data, especially for lower-quality images (i.e., images from early stages of GAN training). However, only a Deep Quality Assessment (QA) model trained on human ratings was able to reproduce the subtle differences between higher-quality images.

Conclusions: We recommend a combination of group analyses, spatial correlation analyses, and both distortion metrics (FID, MMD, NIQE) and perceptual models (Deep QA) for a comprehensive evaluation and comparison of brain images produced by GANs.
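To illustrate one of the distortion metrics named in the abstract, the following is a minimal sketch of a Maximum Mean Discrepancy (MMD) estimate between a set of real and a set of generated feature vectors. It is not the authors' implementation: the Gaussian (RBF) kernel, the bandwidth `sigma`, the biased estimator, and the function names `rbf_kernel` and `mmd_squared` are all assumptions chosen for illustration, and the abstract does not say which kernel or feature space the study used.

```python
import math


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors.

    The kernel choice and bandwidth are assumptions, not taken from the paper.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd_squared(real, generated, sigma=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples.

    `real` and `generated` are lists of feature vectors, for example
    flattened images or embeddings from a pretrained network.
    """
    def mean_kernel(xs, ys):
        total = sum(rbf_kernel(x, y, sigma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (
        mean_kernel(real, real)
        + mean_kernel(generated, generated)
        - 2.0 * mean_kernel(real, generated)
    )


if __name__ == "__main__":
    # Toy one-dimensional "features": one sample far from the reference
    # set and one sample close to it.
    reference = [[0.0], [0.1]]
    far_sample = [[5.0], [5.1]]
    near_sample = [[0.2], [0.3]]
    print("far: ", mmd_squared(reference, far_sample))
    print("near:", mmd_squared(reference, near_sample))
```

In this toy example, the sample that lies further from the reference set yields the larger MMD value, which is the behaviour one would hope for when comparing images exported early versus late in GAN training.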
