Article

Hybrid method to automatically extract medical document tree structure

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.engappai.2023.105922

Keywords

Medical text mining; Section detection; Machine learning; Information extraction; Multimodal features; Electronic medical records


The paper presents an automatic section detection method in the medical field to improve information extraction tasks. Rules were constructed to prepare the training set, and a machine learning model was trained using various features to find titles. Experiments showed that combining these features using a Convolutional Neural Network led to better results in real medical documents.
A huge and rapidly growing quantity of medical documents is available in electronic form. Most of these documents consist of free text in natural language, which can make them difficult to read, ambiguous, or even erroneous. Consequently, critical medical errors can occur when a doctor decides on a treatment. Information extraction from unstructured documents can address this problem. In this paper, we introduce SDM (Section Detection in Medical field), an automatic section detection method for the medical field that improves information extraction tasks by providing more context. We first construct a set of rules to prepare the training set automatically. We then use numerous features, including formatting style, syntactic, and semantic features, to train a machine learning model that finds titles. Finally, a section tree is generated that can be useful for other tasks. As anticipated, our experiments show that merging these features using a Convolutional Neural Network (CNN) leads to better results on real medical documents according to the F1-score measure. Our method thus benefits from the layout information and can provide the document sections in tree form. It is worth noting that the method can be easily applied in other fields, since it does not depend strongly on the document type or language.
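The last step of the pipeline, turning detected titles into a section tree, can be illustrated with a small Python sketch. This is not the authors' implementation: the abstract does not specify the rules, the features, or the CNN, so the title detector here is a stand-in heuristic (a hypothetical regular expression `TITLE_RULE`), and the two-level convention in `title_level` is likewise an assumption made only for illustration. What the sketch does show is the generic stack-based way of nesting sections once each line has been labeled as title or body.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Section:
    """One node of the section tree."""
    title: str
    level: int
    body: list = field(default_factory=list)
    children: list = field(default_factory=list)


# Hypothetical rule (an assumption, not taken from the paper): a title is
# either an all-uppercase line or a short capitalized line ending in a colon.
TITLE_RULE = re.compile(r"^(?:[A-Z][A-Z \-/]{2,}|[A-Z][\w \-/]{1,40}:)$")


def is_title(line: str) -> bool:
    """Stand-in for the trained title classifier described in the abstract."""
    return bool(TITLE_RULE.match(line.strip()))


def title_level(line: str) -> int:
    """Assumed convention: uppercase titles are level 1, all others level 2."""
    return 1 if line.strip().isupper() else 2


def build_tree(lines: list) -> Section:
    """Nest sections by level using a stack of currently open sections."""
    root = Section("ROOT", 0)
    stack = [root]
    for line in lines:
        text = line.strip()
        if not text:
            continue
        if is_title(text):
            level = title_level(text)
            # Close every open section at the same or a deeper level.
            while stack[-1].level >= level:
                stack.pop()
            node = Section(text.rstrip(":"), level)
            stack[-1].children.append(node)
            stack.append(node)
        else:
            # Body text belongs to the innermost open section.
            stack[-1].body.append(text)
    return root


if __name__ == "__main__":
    sample = [
        "HISTORY",
        "Patient reports cough.",
        "Medications:",
        "Aspirin 100 mg daily.",
        "EXAMINATION",
        "Lungs clear.",
    ]
    tree = build_tree(sample)
    for top in tree.children:
        print(top.title, [child.title for child in top.children])
```

In a setting closer to the one described above, `is_title` would be replaced by the prediction of the trained model, and the level would come from layout or formatting cues rather than from letter case.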
