Article

A Hybrid Model for Chinese Spelling Check

Publisher

Association for Computing Machinery
DOI: 10.1145/3047405

Keywords

Chinese spelling check; hybrid model; graph model; conditional random field; rule-based model

Funding

  1. Cai Yuanpei Program [201304490199, 201304490171]
  2. National Natural Science Foundation of China [61170114, 61672343, 61272248]
  3. National Basic Research Program of China [2013CB329401]
  4. Major Basic Research Program of Shanghai Science and Technology Committee [15JC1400103]
  5. Art and Science Interdisciplinary Funds of Shanghai Jiao Tong University [14JCRZ04]
  6. Key Project of the National Society Science Foundation of China [15ZDA041]

Spelling check is more challenging for Chinese than for other languages. This article presents a hybrid model for Chinese spelling check that consists of three components: one graph-based model for generic errors and two independently trained models for specific errors. In the graph model, a directed acyclic graph is generated for each sentence, and a single-source shortest-path algorithm is run on the graph to detect and correct general spelling errors at the same time. Before that step, two types of errors over function words (characters) are resolved by conditional random fields: the confusion between 在 (at; pinyin: zai) and 再 (again, more, then; pinyin: zai), and the confusion among 的 (of; pinyin: de), 地 (-ly, adverb-forming particle; pinyin: de), and 得 (so that, have to; pinyin: de). Finally, a rule-based model is used to resolve pronoun confusion between 她 (she; pinyin: ta) and 他 (he; pinyin: ta), as well as some other common collocation errors. The proposed model is evaluated on the standard datasets released by the SIGHAN Bake-off shared tasks and achieves state-of-the-art results.
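The graph component described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the graph is built or how edges are weighted, so everything below is an assumption made for illustration: the confusion set `CONFUSION`, the toy bigram costs in `BIGRAM_COST`, the constants `DEFAULT_COST` and `SUBST_PENALTY`, and the function name `correct` are all hypothetical. Each sentence position gets one node per candidate character, edges connect adjacent positions, and the cheapest source-to-sink path is taken as the corrected sentence, so detection and correction happen in the same pass.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy resources; a real system would estimate these from corpora.
CONFUSION: Dict[str, List[str]] = {"再": ["在"], "在": ["再"]}
BIGRAM_COST: Dict[Tuple[str, str], float] = {
    ("我", "在"): 0.5,
    ("在", "家"): 0.5,
    ("我", "再"): 3.0,
    ("再", "家"): 4.0,
}
DEFAULT_COST = 2.0   # assumed cost of a bigram missing from the table
SUBST_PENALTY = 1.0  # assumed extra cost for replacing an observed character


def correct(sentence: str) -> str:
    """Return the lowest-cost character sequence through the candidate lattice."""
    if not sentence:
        return sentence
    # One layer of nodes per position: the observed character plus its confusions.
    layers = [[ch] + CONFUSION.get(ch, []) for ch in sentence]
    # best[i][c] = (cheapest cost from the source to candidate c at position i,
    #               predecessor candidate on that path)
    best: List[Dict[str, Tuple[float, Optional[str]]]] = [{} for _ in sentence]
    for cand in layers[0]:
        best[0][cand] = (0.0 if cand == sentence[0] else SUBST_PENALTY, None)
    # Left-to-right position order is a topological order of this DAG,
    # so a single relaxation pass yields the single-source shortest paths.
    for i in range(1, len(sentence)):
        for cand in layers[i]:
            penalty = 0.0 if cand == sentence[i] else SUBST_PENALTY
            best[i][cand] = min(
                (
                    best[i - 1][prev][0]
                    + BIGRAM_COST.get((prev, cand), DEFAULT_COST)
                    + penalty,
                    prev,
                )
                for prev in layers[i - 1]
            )
    # Backtrack from the cheapest node in the final layer.
    last = min(best[-1], key=lambda c: best[-1][c][0])
    path = [last]
    for i in range(len(sentence) - 1, 0, -1):
        last = best[i][last][1]
        path.append(last)
    return "".join(reversed(path))


if __name__ == "__main__":
    print(correct("我再家"))
```

With these toy costs, the path through 在 is cheaper than the path through the observed 再 even after the substitution penalty, so the sketch rewrites the input. In the paper's pipeline the function-word and pronoun confusions are handled by separate components, so this example only illustrates the shortest-path idea, not the authors' actual division of labor.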
