Journal
LANGUAGE RESOURCES AND EVALUATION
Volume 40, Issue 3-4, Pages 311-330
Publisher
SPRINGER
DOI: 10.1007/s10579-007-9031-y
Keywords
morphological parsing; word segmentation; data annotation; unsupervised learning; Asian language processing; Bengali
Unsupervised morphological analysis is the task of segmenting words into prefixes, suffixes and stems without prior knowledge of language-specific morphotactics and morpho-phonological rules. This paper introduces a simple yet highly effective algorithm for unsupervised morphological learning for Bengali, an Indo-Aryan language that is highly inflectional in nature. When evaluated on a set of 4,110 human-segmented Bengali words, our algorithm achieves an F-score of 83%, substantially outperforming Linguistica, one of the most widely used unsupervised morphological parsers, by about 23%.