Journal
COMPUTERS & ELECTRICAL ENGINEERING
Volume 94
Publisher
PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.compeleceng.2021.107354
Keywords
Bidirectional Maximal Matching; Hidden Markov model; Medical text segmentation; Dictionary dynamic update
Funding
- National Natural Science Foundation of China [71974069]
The HMM-BiMM algorithm combines the Hidden Markov Model (HMM) with the Bidirectional Maximal Matching (BiMM) algorithm to achieve fast and accurate Chinese word segmentation. Dynamically updating the dictionary further improves segmentation accuracy and efficiency.
This paper presents HMM-BiMM, a new word segmentation algorithm that combines the Hidden Markov Model (HMM) with the Bidirectional Maximal Matching (BiMM) algorithm. Matching against sub-dictionaries makes segmentation fast. After the text is segmented by BiMM, the remaining text, formed by joining the leftover single-character words, is segmented again using the HMM. The words newly identified by the HMM are then used to update the dictionary dynamically, which in turn improves segmentation accuracy. In a comparison with five representative algorithms on real-world clinical symptom text, HMM-BiMM achieves the highest efficiency and accuracy for symptom text segmentation. Specifically, it improves on BiMM by around 3% in precision and around 70% in running time.
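The BiMM stage described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`forward_max_match`, `backward_max_match`, `bimm`, `single_char_runs`), the tie-breaking rule (fewer tokens, then fewer single-character tokens, then the backward result), and the toy dictionary are all assumptions. The HMM pass and the sub-dictionary lookup are not shown; `single_char_runs` only collects the leftover text that such a pass would receive.

```python
def forward_max_match(text, dictionary, max_len):
    """Greedy left-to-right matching, longest candidate first."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_max_match(text, dictionary, max_len):
    """Greedy right-to-left matching, longest candidate first."""
    tokens, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                j -= size
                break
    tokens.reverse()
    return tokens


def bimm(text, dictionary):
    """Run both directions and keep the result with the lower cost.

    Assumed heuristic: fewer tokens wins, then fewer single-character
    tokens; on a full tie the backward result is returned.
    """
    max_len = max((len(w) for w in dictionary), default=1)
    fwd = forward_max_match(text, dictionary, max_len)
    bwd = backward_max_match(text, dictionary, max_len)

    def cost(tokens):
        return (len(tokens), sum(1 for t in tokens if len(t) == 1))

    return fwd if cost(fwd) < cost(bwd) else bwd


def single_char_runs(tokens):
    """Join adjacent single-character tokens into runs of length >= 2.

    These runs stand in for the leftover text that a second-stage
    model (the HMM in the paper) would re-segment.
    """
    runs, current = [], []
    for t in tokens:
        if len(t) == 1:
            current.append(t)
        else:
            if len(current) > 1:
                runs.append("".join(current))
            current = []
    if len(current) > 1:
        runs.append("".join(current))
    return runs


# Hypothetical toy dictionary, for illustration only.
demo_dict = {"研究", "研究生", "生命", "命", "的", "起源"}
print(bimm("研究生命的起源", demo_dict))
```

With this toy dictionary, forward matching yields four tokens with two single characters and backward matching yields four tokens with one, so the sketch keeps the backward result `['研究', '生命', '的', '起源']`. In a full pipeline, any new words found in the runs returned by `single_char_runs` would be added to `dictionary` before the next text is processed, which is one plausible reading of the dynamic dictionary update.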