Article

HMM-BiMM: Hidden Markov Model-based word segmentation via improved Bi-directional Maximal Matching algorithm

Journal

COMPUTERS & ELECTRICAL ENGINEERING
Volume 94

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.compeleceng.2021.107354

Keywords

Bidirectional Maximal Matching; Hidden Markov model; Medical text segmentation; Dictionary dynamic update

Funding

  1. National Natural Science Foundation of China [71974069]

The HMM-BiMM algorithm combines the Hidden Markov Model and Bi-directional Maximal Matching algorithm to achieve fast and accurate Chinese word segmentation. By dynamically updating the dictionary, it further improves the accuracy and efficiency of word segmentation.
This paper presents HMM-BiMM, a new word segmentation algorithm that combines the Hidden Markov Model (HMM) with the Bi-directional Maximal Matching (BiMM) algorithm. Sub-dictionary matching makes segmentation fast. After BiMM segments the text, the leftover single-character words that sit next to one another are joined and segmented again using the HMM. Words newly found by the HMM are used to update the dictionary dynamically, which in turn improves segmentation accuracy. In a comparison with five representative algorithms on real-world clinical (symptom) text, HMM-BiMM achieves the highest efficiency and accuracy for symptom text segmentation. Specifically, relative to BiMM, it improves precision by around 3% and running time by around 70%.
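The first stage described above can be illustrated with a minimal Python sketch of bi-directional maximal matching. It is not the authors' implementation. The function names, the toy dictionaries, and the tie-breaking rule (prefer fewer words, then fewer single-character words) are assumptions chosen for illustration; the abstract does not specify them. The HMM re-segmentation and the dictionary update are not implemented here; `single_char_runs` only collects the spans that such a second stage would receive.

```python
def forward_max_match(text, dictionary, max_len):
    """Greedy longest-match segmentation scanning left to right."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or piece in dictionary:
                words.append(piece)
                i += size
                break
    return words


def backward_max_match(text, dictionary, max_len):
    """Greedy longest-match segmentation scanning right to left."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in dictionary:
                words.append(piece)
                j -= size
                break
    words.reverse()
    return words


def bimm(text, dictionary):
    """Run both scans and keep one result.

    Assumed heuristic: fewer words wins, then fewer single-character words;
    on a full tie the forward result is kept.
    """
    max_len = max((len(w) for w in dictionary), default=1)
    fwd = forward_max_match(text, dictionary, max_len)
    bwd = backward_max_match(text, dictionary, max_len)

    def cost(seg):
        return (len(seg), sum(1 for w in seg if len(w) == 1))

    return fwd if cost(fwd) <= cost(bwd) else bwd


def single_char_runs(words):
    """Join adjacent single-character words into candidate spans.

    These spans are what a second-stage model (the HMM in the abstract)
    would re-segment; that stage is outside this sketch.
    """
    runs, current = [], ""
    for w in words:
        if len(w) == 1:
            current += w
        else:
            if len(current) > 1:
                runs.append(current)
            current = ""
    if len(current) > 1:
        runs.append(current)
    return runs


# Toy data, not taken from the paper.
toy_dict = {"研究", "研究生", "生命", "起源"}
segmented = bimm("研究生命的起源", toy_dict)

symptom_words = bimm("头痛伴恶心", {"头痛"})
leftover = single_char_runs(symptom_words)
```

In the first toy example the forward scan greedily takes "研究生" and strands "命" as a single character, while the backward scan recovers "研究" and "生命", so the assumed tie-break selects the backward result. In the second, "伴恶心" is left as a run of single characters because the toy dictionary lacks "恶心"; under the abstract's description, this is the kind of span the HMM stage would segment and from which new dictionary entries could be drawn.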
