Article

Multimodal Assessment of Schizophrenia Symptom Severity From Linguistic, Acoustic and Visual Cues

Publisher

IEEE (Institute of Electrical and Electronics Engineers)
DOI: 10.1109/TNSRE.2023.3307597

Keywords

Schizophrenia; parameter-efficient fine-tuning; multimodal fusion

Abstract

This study proposes a multimodal assessment model that predicts the severity of schizophrenia symptoms from linguistic, acoustic, and visual behavior. The model combines deep-learning techniques with a multimodal fusion framework to achieve superior assessment performance.

Accurately assessing the condition of a schizophrenia patient normally requires lengthy and frequent interviews with professionally trained clinicians. To reduce the time and labor burden on mental health professionals, this paper proposes a multimodal assessment model that predicts the severity level of each symptom defined in the Scale for the Assessment of Thought, Language, and Communication (TLC) and the Positive and Negative Syndrome Scale (PANSS) from the patient's linguistic, acoustic, and visual behavior. The proposed deep-learning model consists of a multimodal fusion framework and four unimodal transformer-based backbone networks. A second-stage pre-training step is introduced so that each off-the-shelf pre-trained model adapts more effectively to the patterns of schizophrenia data and learns to extract the desired features from the perspective of its own modality. Next, the pre-trained parameters are frozen, and lightweight trainable unimodal modules are inserted and fine-tuned, keeping the number of trainable parameters low while preserving performance. Finally, the four adapted unimodal modules are fused into a single multimodal assessment model through the proposed multimodal fusion framework. For validation, we train and evaluate the proposed model on data from schizophrenia patients recruited from National Taiwan University Hospital; the model achieves an MAE of 0.534 and an MSE of 0.685, outperforming related works in the literature. The experimental results and ablation studies, together with comparisons against other multimodal assessment works, demonstrate not only the superior performance of our approach but also its effectiveness in extracting and integrating information from multiple modalities.
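The abstract does not include code, but the "freeze the backbone, insert lightweight trainable modules" step it describes is the standard adapter style of parameter-efficient fine-tuning. Below is a minimal PyTorch sketch of that idea; the class names, bottleneck size, and layer-wrapping scheme are hypothetical illustrations, not the authors' implementation.

```python
import torch
import torch.nn as nn


class Adapter(nn.Module):
    """Bottleneck adapter: down-project, non-linearity, up-project, residual."""

    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection keeps the frozen backbone's features intact.
        return x + self.up(self.act(self.down(x)))


class AdaptedLayer(nn.Module):
    """Wraps one frozen pre-trained layer with a small trainable adapter."""

    def __init__(self, frozen_layer: nn.Module, hidden_dim: int):
        super().__init__()
        self.layer = frozen_layer
        for p in self.layer.parameters():
            p.requires_grad = False  # freeze the pre-trained weights
        self.adapter = Adapter(hidden_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.adapter(self.layer(x))


# Example: adapt a single frozen encoder layer (hypothetical sizes).
backbone_layer = nn.TransformerEncoderLayer(d_model=768, nhead=12, batch_first=True)
adapted = AdaptedLayer(backbone_layer, hidden_dim=768)
x = torch.randn(2, 50, 768)   # (batch, sequence, hidden)
print(adapted(x).shape)       # torch.Size([2, 50, 768])
```

Only the adapter parameters receive gradients, which is what keeps the trainable-parameter count low while the second-stage pre-trained backbone is reused as-is.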
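The abstract likewise does not specify the fusion architecture, so the following concatenation-plus-MLP late-fusion sketch is only an assumption for illustration: four adapted unimodal encoders produce features that are combined into per-symptom severity predictions, evaluated with the MAE/MSE metrics the paper reports. All names and dimensions are hypothetical.

```python
import torch
import torch.nn as nn


class MultimodalSeverityRegressor(nn.Module):
    """Late fusion of unimodal encoders into a severity regression head."""

    def __init__(self, unimodal_encoders: dict[str, nn.Module],
                 feat_dim: int, num_symptoms: int):
        super().__init__()
        self.encoders = nn.ModuleDict(unimodal_encoders)
        self.head = nn.Sequential(
            nn.Linear(feat_dim * len(unimodal_encoders), 256),
            nn.ReLU(),
            nn.Linear(256, num_symptoms),  # one score per TLC/PANSS item
        )

    def forward(self, inputs: dict[str, torch.Tensor]) -> torch.Tensor:
        # Encode each modality with its adapted backbone, then concatenate.
        feats = [self.encoders[m](inputs[m]) for m in self.encoders]
        return self.head(torch.cat(feats, dim=-1))


def mae_mse(pred: torch.Tensor, target: torch.Tensor):
    """The two metrics reported in the paper (MAE and MSE)."""
    mae = (pred - target).abs().mean().item()
    mse = ((pred - target) ** 2).mean().item()
    return mae, mse


# Toy usage with stand-in encoders (the real ones would be the four
# adapted transformer backbones):
encoders = {m: nn.Linear(32, 128) for m in ["text", "audio", "video", "extra"]}
model = MultimodalSeverityRegressor(encoders, feat_dim=128, num_symptoms=10)
batch = {m: torch.randn(4, 32) for m in encoders}
scores = model(batch)  # shape (4, 10): severity predictions per symptom
```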

