A survey on Urdu and Urdu like language stemmers and stemming techniques

ARTIFICIAL INTELLIGENCE REVIEW (2018)

Journal

ARTIFICIAL INTELLIGENCE REVIEW

Volume 49, Issue 3, Pages 339-373

Publisher

SPRINGER

DOI: 10.1007/s10462-016-9527-1

Keywords

Stemming; Natural Language Processing; Information Retrieval; Urdu; Suffixes; Stemming Techniques

Category

Computer Science, Artificial Intelligence

Abstract

Stemming is one of the basic steps in natural language processing applications such as information retrieval, part-of-speech tagging, syntactic parsing, and machine translation. It is a morphological process that converts the inflected forms of a word into its root form. Urdu is a morphologically rich language that emerged from several languages; its inflected and multi-gram words include prefixes, suffixes, infixes, co-suffixes, and circumfixes that must be edited to convert the words into their stems. This editing (insertion, deletion, and substitution) makes stemming difficult because of the language's morphological richness and its inclusion of words from foreign languages such as Persian and Arabic. In this paper, we present a comprehensive review of the algorithms and techniques for stemming Urdu text. Considering syntax, morphological similarity, and other common features, we also analyze the stemming approaches used in Urdu-like languages, i.e. Arabic and Persian, and extract the main features, merits, and shortcomings of these approaches. We also discuss stemming errors and the basic difference between stemming and lemmatization, and coin a metric for the classification of stemming algorithms. Finally, we present directions for future work.
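To make the idea of affix removal concrete, the following is a minimal sketch of a rule-based affix-stripping stemmer. It is not taken from the paper and is not one of the surveyed Urdu stemmers: the function name `strip_affixes`, the short English affix lists, and the minimum stem length are all assumptions chosen only for illustration. A real Urdu stemmer would need language-specific affix lists and would also have to handle infixes and substitutions, which this sketch does not attempt.

```python
# Hypothetical affix lists (English, for illustration only; not from the paper).
PREFIXES = ("un", "re")
SUFFIXES = ("ing", "ed", "s")
MIN_STEM_LEN = 3  # assumed guard against stripping very short words


def strip_affixes(word, prefixes=PREFIXES, suffixes=SUFFIXES,
                  min_stem_len=MIN_STEM_LEN):
    """Remove at most one prefix and one suffix, trying longer affixes first."""
    stem = word.lower()

    # Strip the longest matching prefix if enough characters remain.
    for prefix in sorted(prefixes, key=len, reverse=True):
        if stem.startswith(prefix) and len(stem) - len(prefix) >= min_stem_len:
            stem = stem[len(prefix):]
            break

    # Strip the longest matching suffix if enough characters remain.
    for suffix in sorted(suffixes, key=len, reverse=True):
        if stem.endswith(suffix) and len(stem) - len(suffix) >= min_stem_len:
            stem = stem[:-len(suffix)]
            break

    return stem


if __name__ == "__main__":
    for w in ("replayed", "books", "unlocking", "united"):
        print(w, "->", strip_affixes(w))
```

With these assumed lists, `replayed` reduces to `play` and `books` to `book`, while `united` is wrongly reduced to `ited` because its first two letters happen to match a prefix. That last case is an instance of the kind of stemming error the abstract mentions, and it suggests why purely rule-based stripping tends to need exception lists or dictionary checks.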
