Article

Spectral Representation of Behaviour Primitives for Depression Analysis

Journal

IEEE TRANSACTIONS ON AFFECTIVE COMPUTING
Volume 13, Issue 2, Pages 829-844

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TAFFC.2020.2970712

Keywords

Depression; Videos; Task analysis; Interviews; Feature extraction; Magnetic heads; Neural networks; Automatic depression analysis; Fourier transform; spectral representation; time-frequency analysis; convolution neural networks

Funding

  1. NIHR Nottingham Biomedical Research Centre (BRC)
  2. Horizon Centre for Doctoral Training, School of Computer Science, University of Nottingham
  3. Shenzhen University
  4. National Natural Science Foundation of China [61672357, 91959108]
  5. Science and Technology Funding of Guangdong Province [2018A050501014]

Abstract

This article presents research on a video-based automatic depression analysis system. It proposes extracting multi-scale video-level features and spectral representations, which are fed to Convolution Neural Networks and Artificial Neural Networks for depression analysis. The experiments show that the interview task has a significant impact on depression analysis, that fusing multiple tasks yields higher accuracy, and that longer tasks are more informative than shorter ones.
Depression is a serious mental disorder affecting millions of people all over the world. Traditional clinical diagnosis methods are subjective, complicated, and require extensive participation of clinicians. Recent advances in automatic depression analysis systems promise a future where these shortcomings are addressed by objective, repeatable, and readily available diagnostic tools to aid health professionals in their work. Yet there remain a number of barriers to the development of such tools. One barrier is that existing automatic depression analysis algorithms base their predictions on very brief sequential segments, sometimes as little as one frame. Another barrier is that existing methods do not take into account the context of the measured behaviour. In this article, we extract multi-scale video-level features for video-based automatic depression analysis. We propose to use automatically detected human behaviour primitives as the low-dimensional descriptor for each frame. We also propose two novel spectral representations, i.e., spectral heatmaps and spectral vectors, to represent video-level multi-scale temporal dynamics of expressive behaviour. The constructed spectral representations are fed to Convolution Neural Networks (CNNs) and Artificial Neural Networks (ANNs) for depression analysis. We conducted experiments on the AVEC 2013 and AVEC 2014 benchmark datasets to investigate the influence of interview tasks on depression analysis. In addition to achieving state-of-the-art accuracy in depression severity estimation, we show that the task conducted by the user matters, that fusion of a combination of tasks reaches the highest accuracy, and that longer tasks are more informative than shorter tasks, up to a point.
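To make the idea of a video-level spectral representation concrete, the sketch below is an illustrative reconstruction, not the authors' released code: it converts a per-frame behaviour-primitive time series into a fixed-size amplitude/phase "spectral heatmap" via a Fourier transform over the time axis. The primitive dimensions, the number of retained frequency bins (num_freq), and the zero-padding of short videos are assumptions made for illustration and need not match the paper's exact configuration.

```python
import numpy as np

def spectral_heatmap(primitives, num_freq=64):
    """Build a simple video-level spectral representation from per-frame
    behaviour primitives.

    primitives : array of shape (T, D) -- T frames, D behaviour primitives
                 (e.g., facial action unit intensities, head pose angles).
    num_freq   : number of low-frequency bins kept per primitive, giving a
                 fixed-size output regardless of the video length T
                 (an assumed hyper-parameter, not the paper's value).

    Returns an array of shape (2, D, num_freq): amplitude and phase maps.
    """
    T, D = primitives.shape

    # Remove the per-primitive mean so the zero-frequency bin does not dominate.
    centred = primitives - primitives.mean(axis=0, keepdims=True)

    # Real FFT over the time axis: one spectrum per behaviour primitive.
    spectrum = np.fft.rfft(centred, axis=0)            # shape (T//2 + 1, D)

    # Keep only the first num_freq low-frequency bins; zero-pad short videos.
    kept = np.zeros((num_freq, D), dtype=complex)
    n = min(num_freq, spectrum.shape[0])
    kept[:n] = spectrum[:n]

    amplitude = np.abs(kept).T                         # (D, num_freq)
    phase = np.angle(kept).T                           # (D, num_freq)
    return np.stack([amplitude, phase])                # (2, D, num_freq)

if __name__ == "__main__":
    # Toy example: 900 frames (~30 s at 30 fps) of 17 hypothetical primitives.
    rng = np.random.default_rng(0)
    frames = rng.normal(size=(900, 17))
    heatmap = spectral_heatmap(frames)
    print(heatmap.shape)   # (2, 17, 64) -- fixed size, suitable as a CNN input
```

Because the output size depends only on the number of primitives and retained frequency bins, videos of different lengths map to inputs of identical shape, which is what allows a CNN or ANN to consume whole-video dynamics rather than brief segments.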

