Journal
IEEE ACCESS
Volume 6, Issue -, Pages 70874-70883
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2018.2881280
Keywords
Chinese word segmentation; capsule network; ancient Chinese medical books
Funding
- Beijing Natural Science Foundation [7174328, 4174098]
- National Natural Science Foundation of China [61702047]
- Fundamental Research Funds for the Central Universities [2017RC02]
- Basic Scientific Research Fund of the Chinese Academy of Chinese Medical Science [zz110318]
Neural network models are widely used for the Chinese word segmentation task. The recently proposed capsule architecture addresses some defects of convolutional neural networks. In this paper, we introduce the capsule architecture to Chinese word segmentation for the first time, using capsules as the neural units. Before running the routing algorithm, we apply a sliding capsule window to select among the features extracted by the primary capsule layer. The sliding capsule window is proposed to adapt the capsule architecture to the sequence labeling task. Experimental results show that our capsule-based Chinese word segmentation model achieves performance competitive with previous state-of-the-art methods. Ancient Chinese medical books record a great deal of valuable experience from ancient medical practitioners. However, research on automatic text analysis of ancient Chinese medical documents is only just beginning. Because annotated data for Chinese medicine is lacking, we develop a word segmentation guideline for ancient Chinese medical documents and select 30 ancient Chinese medical books covering 10 genres to build an annotated dataset. With the annotated data, we develop a segmenter for ancient Chinese medical text. Experiments show that our model achieves F-1 measures of 94.9% and 81.4% on Chinese Treebank6.0 and Ancient Chinese Medical Books, respectively.
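The abstract reports F-1 measures but does not say how they are scored. As a minimal sketch, assuming the common span-based convention for word segmentation (a predicted word counts as correct only when both of its boundaries match a gold word), the measure could be computed as below. The function names `word_spans` and `segmentation_f1` and the toy input are hypothetical illustrations, not taken from the paper.

```python
def word_spans(words):
    """Convert a list of words into a set of (start, end) character offsets."""
    spans = set()
    start = 0
    for word in words:
        end = start + len(word)
        spans.add((start, end))
        start = end
    return spans


def segmentation_f1(gold_words, pred_words):
    """Return (precision, recall, f1) for one segmented sentence.

    Assumption: span-based scoring, where a predicted word is correct
    only if its start and end offsets both match a gold word.
    """
    if "".join(gold_words) != "".join(pred_words):
        raise ValueError("gold and predicted segmentations cover different text")
    gold = word_spans(gold_words)
    pred = word_spans(pred_words)
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example with placeholder characters: only the first word matches.
gold = ["ab", "c", "de"]
pred = ["ab", "cd", "e"]
precision, recall, f1 = segmentation_f1(gold, pred)
```

In this toy case one of three predicted words matches one of three gold words, so precision, recall, and F-1 are all one third. Corpus-level scores are usually obtained by summing the correct, predicted, and gold counts over all sentences before dividing, rather than averaging per-sentence values.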