Journal
INTERNATIONAL JOURNAL OF SOFTWARE SCIENCE AND COMPUTATIONAL INTELLIGENCE-IJSSCI
Volume 13, Issue 2, Pages 23-36
Publisher
IGI GLOBAL
DOI: 10.4018/IJSSCI.2021040102
Keywords
Character N-Grams; Code-Mixing; LSTM; Machine Learning; Natural Language Processing; Neural Networks; Support Vector Machines; Word Embeddings
With the rise of internet applications and social media platforms, informal text communication has grown rapidly. People from different regions tend to mix their regional language with English in social media text, a trend now common in many multilingual nations and known as code-mixing: the use of multiple languages within a single statement. Named entity recognition (NER) is a well-researched topic in natural language processing (NLP), but present NER systems perform poorly on code-mixed text. This paper proposes three approaches to improve named entity recognizers for code-mixed text. The first is based on machine learning techniques such as support vector machines and other tree-based classifiers; the second is based on neural networks; and the third uses a long short-term memory (LSTM) architecture.
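As a rough illustration of the first approach described in the abstract (character n-gram features with a support vector machine), the sketch below trains a token-level classifier on a toy code-mixed sample. The data, labels, and model choices here are illustrative assumptions, not the paper's actual pipeline or dataset.

```python
# Hypothetical sketch: token-level NER on code-mixed text using
# character n-gram features and a linear SVM. The toy Hindi-English
# tokens and BIO-style tags below are invented for illustration only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy code-mixed training tokens with BIO-style entity tags.
train_tokens = ["Delhi", "mein", "rain", "ho", "rahi", "hai", "Modi", "ne", "kaha"]
train_labels = ["B-LOC", "O", "O", "O", "O", "O", "B-PER", "O", "O"]

# Character n-grams (2-3 chars, word-boundary aware) capture sub-word
# cues that remain useful when languages are mixed within a sentence.
model = make_pipeline(
    CountVectorizer(analyzer="char_wb", ngram_range=(2, 3)),
    LinearSVC(),
)
model.fit(train_tokens, train_labels)

# Predict tags for unseen tokens, one label per token.
predictions = model.predict(["Mumbai", "barish"])
print(list(predictions))
```

A real system would classify each token with context features (neighboring tokens, capitalization, language identification), but this shows the basic shape of a character n-gram SVM tagger.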