Article

RNN based machine translation and transliteration for Twitter data

Journal

INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY
Volume 23, Issue 3, Pages 499-504

Publisher

SPRINGER
DOI: 10.1007/s10772-020-09724-9

Keywords

Long short-term memory (LSTM); Recurrent neural network (RNN); Sequence-to-sequence; Python; Translation; Transliteration; Twitter; Machine translation (MT); BLEU; Tensorflow

This work analyzes code-switched social media data and transliterates and translates it into English using a special kind of recurrent neural network (RNN), the Long Short-Term Memory (LSTM) network. The LSTM is implemented in TensorFlow. Twitter data is stored in MongoDB for easy handling and processing. A Python script parses the data into its fields, and regular expressions are used to clean it. The LSTM model is trained on 1 M items of data and is then used to transliterate and translate the Twitter data. Translating and transliterating social media data makes the content available in a language understood by the majority of the population. As a result, any content that is anti-social or a threat to law and order can be easily verified and blocked at the source.
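The regular-expression cleaning step mentioned in the abstract might look roughly like the sketch below. It is only an illustration under assumptions: the abstract does not give the actual patterns or field names, so the `text` field, the helper names `clean_tweet` and `extract_text`, and the choice to strip retweet markers, URLs, mentions and hash signs are all hypothetical.

```python
import re

# Hypothetical cleaning patterns; the paper's actual expressions are not given.
RT_RE = re.compile(r"^RT\s+", re.IGNORECASE)   # leading retweet marker
URL_RE = re.compile(r"https?://\S+|www\.\S+")  # web links
MENTION_RE = re.compile(r"@\w+:?")             # user mentions, optional colon
WS_RE = re.compile(r"\s+")                     # runs of whitespace


def clean_tweet(text: str) -> str:
    """Strip retweet markers, URLs, mentions and '#' signs, then tidy spaces."""
    text = RT_RE.sub("", text)
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", " ")  # keep the hashtag word, drop the symbol
    return WS_RE.sub(" ", text).strip()


def extract_text(tweet: dict) -> str:
    """Pull the (assumed) 'text' field from a tweet document and clean it."""
    return clean_tweet(tweet.get("text", ""))
```

For example, `clean_tweet("RT @user: kal milte hain #party https://t.co/abc")` yields `"kal milte hain party"`. In a setup like the one described, `extract_text` would be applied to each document read from the MongoDB collection before the text is passed to the LSTM model.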
