☆ 4.4 Article

A discourse-aware neural network-based text model for document-level text classification

JOURNAL OF INFORMATION SCIENCE (2018)

Journal

JOURNAL OF INFORMATION SCIENCE

Volume 44, Issue 6, Pages 715-735

Publisher

SAGE PUBLICATIONS LTD

DOI: 10.1177/0165551517743644

Keywords

Deep learning; discourse analysis; neural network; sarcasm detection; sentiment analysis; text classification; text model

Categories

Computer Science, Information Systems; Information Science & Library Science

Funding

ICT R&D program of MSICT/IITP [2013-0-00179]

Abstract

Capturing semantics scattered across an entire text is one of the important issues for Natural Language Processing (NLP) tasks. It is particularly critical for long texts embodying a flow of themes. This article proposes a new text modelling method that can handle thematic flows of text with Deep Neural Networks (DNNs) in such a way that discourse information and distributed representations of text are incorporated. Unlike previous DNN-based document models, the proposed model enables discourse-aware analysis of text and composition of sentence-level distributed representations guided by the discourse structure. More specifically, our method identifies Elementary Discourse Units (EDUs) and their discourse relations in a given document by applying Rhetorical Structure Theory (RST)-based discourse analysis. The result is fed into a tree-structured neural network that reflects the discourse information, including the structure of the document and the discourse roles and relation types. We evaluate the document model on two document-level text classification tasks, sentiment analysis and sarcasm detection, with comparisons against reference systems that also utilise discourse information. We also conduct additional experiments to evaluate the impact of neural network types and adopted discourse factors on modelling documents vis-a-vis the two classification tasks. Furthermore, we investigate the effects of various learning methods and input units on the quality of the proposed discourse-aware document model.
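
To make the idea of composing EDU representations along a discourse tree more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a binary RST-style tree in which every internal node has one nucleus child, one satellite child and a relation label. The vector dimension, the relation names, the random untrained weights and the names `Node` and `DiscourseComposer` are all hypothetical choices made for illustration.

```python
import math
import random
from dataclasses import dataclass
from typing import Dict, List, Optional

DIM = 4  # assumed embedding size, chosen only for illustration


@dataclass
class Node:
    """A node of a (hypothetical) binary discourse tree.

    A leaf carries an EDU vector; an internal node carries a relation
    label plus a nucleus child and a satellite child.
    """
    vector: Optional[List[float]] = None
    relation: Optional[str] = None
    nucleus: Optional["Node"] = None
    satellite: Optional["Node"] = None


def make_matrix(rng: random.Random, rows: int, cols: int) -> List[List[float]]:
    # Small random weights; a real model would learn these by training.
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m: List[List[float]], v: List[float]) -> List[float]:
    return [sum(w * x for w, x in zip(row, v)) for row in m]


class DiscourseComposer:
    """Recursively composes EDU vectors bottom-up along the tree."""

    def __init__(self, relations: List[str], dim: int = DIM, seed: int = 0):
        rng = random.Random(seed)
        # Assumption: one weight matrix per (relation type, discourse role).
        self.w_nuc: Dict[str, List[List[float]]] = {
            r: make_matrix(rng, dim, dim) for r in relations
        }
        self.w_sat: Dict[str, List[List[float]]] = {
            r: make_matrix(rng, dim, dim) for r in relations
        }

    def compose(self, node: Node) -> List[float]:
        if node.vector is not None:  # leaf: return the EDU vector as-is
            return node.vector
        nuc = matvec(self.w_nuc[node.relation], self.compose(node.nucleus))
        sat = matvec(self.w_sat[node.relation], self.compose(node.satellite))
        # Combine both children and squash with tanh.
        return [math.tanh(a + b) for a, b in zip(nuc, sat)]


# Usage: three toy EDU vectors joined by two assumed relation types.
edu1 = Node(vector=[0.1, 0.2, 0.3, 0.4])
edu2 = Node(vector=[0.5, -0.1, 0.0, 0.2])
edu3 = Node(vector=[-0.3, 0.4, 0.1, -0.2])
inner = Node(relation="elaboration", nucleus=edu1, satellite=edu2)
tree = Node(relation="contrast", nucleus=inner, satellite=edu3)

composer = DiscourseComposer(["elaboration", "contrast"])
doc_vector = composer.compose(tree)
print(doc_vector)
```

In a full pipeline, the root vector would be passed to a classifier for the target task (for example sentiment analysis or sarcasm detection), and the EDU vectors and the tree would come from a sentence encoder and an RST parser rather than being written by hand.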
