Article

Multimodal learning for fetal distress diagnosis using a multimodal medical information fusion framework

Journal

FRONTIERS IN PHYSIOLOGY
Volume 13, Issue -, Pages -

Publisher

FRONTIERS MEDIA SA
DOI: 10.3389/fphys.2022.1021400

Keywords

fetal heart rate; intelligent cardiotocography classification; fetal distress diagnosis; multimodal learning; ViT; Transformer

Funding

  1. National Natural Science Foundation of China [62071162]

Abstract

Cardiotocography (CTG) monitoring is an important medical diagnostic tool for evaluating fetal well-being in late pregnancy. Intelligent CTG classification based on Fetal Heart Rate (FHR) signals is a challenging research area that can assist obstetricians in making clinical decisions, thereby improving the efficiency and accuracy of pregnancy management. Most existing methods focus on a single modality and therefore suffer from limitations such as incomplete or redundant source-domain feature extraction and poor repeatability. This study focuses on modeling multimodal learning for Fetal Distress Diagnosis (FDD), which faces three major challenges: unaligned multimodalities; failure to learn and fuse the causality and inclusion relations between multimodal biomedical data; and modality sensitivity, that is, difficulty in performing the task when some modalities are absent. To address these three issues, we propose a Multimodal Medical Information Fusion framework named MMIF, in which a Category Constrained-Parallel ViT model (CCPViT) is first proposed to explore multimodal learning tasks and address the misalignment between modalities. Building on CCPViT, a cross-attention-based image-text joint component is introduced to establish a Multimodal Representation Alignment Network model (MRAN), explore the deep-level interactive representation between cross-modal data, and assist multimodal learning. Furthermore, we designed a simple-structured FDD test model based on the highly modality-aligned MMIF, realizing task delegation from multimodal model training (image and text) to unimodal pathological diagnosis (image). Extensive experiments, including model parameter sensitivity analysis, cross-modal alignment assessment, and pathological diagnostic accuracy evaluation, were conducted to demonstrate our models' superior performance and effectiveness.
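The central mechanism the abstract describes, letting tokens from one modality attend to tokens from the other through cross-attention so that image and text representations can be aligned, can be illustrated with a short sketch. The block below is a minimal, hypothetical PyTorch example, assuming pre-computed token embeddings of a shared dimension; the class name CrossModalAttention, all dimensions, and the residual-plus-norm fusion choice are illustrative assumptions, not the authors' published MRAN implementation.

```python
# Illustrative sketch only: a minimal cross-attention fusion block in the
# spirit of a cross-attention-based image-text joint component. Names,
# dimensions, and fusion strategy are hypothetical, not the paper's code.
import torch
import torch.nn as nn


class CrossModalAttention(nn.Module):
    """One modality's tokens (queries) attend to another's (keys/values)."""

    def __init__(self, embed_dim: int = 256, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self, query_tokens: torch.Tensor,
                context_tokens: torch.Tensor) -> torch.Tensor:
        # Queries come from one modality (e.g., clinical text); keys/values
        # from the other (e.g., FHR image patches), so each text token
        # gathers evidence from the image tokens.
        fused, _ = self.attn(query_tokens, context_tokens, context_tokens)
        return self.norm(query_tokens + fused)  # residual connection + norm


if __name__ == "__main__":
    # Hypothetical shapes: 2 samples, 49 image patch tokens, 16 text tokens.
    image_tokens = torch.randn(2, 49, 256)  # e.g., ViT patch embeddings of a CTG image
    text_tokens = torch.randn(2, 16, 256)   # e.g., embedded clinical text
    block = CrossModalAttention()
    aligned = block(text_tokens, image_tokens)
    print(aligned.shape)  # torch.Size([2, 16, 256])
```

In this sketch the text tokens act as queries over the image patch tokens; under the task-delegation idea described in the abstract, such a joint component would be used only during multimodal training, after which the image branch alone could be retained for unimodal pathological diagnosis.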

