Article

A sentence structure-based approach to unsupervised author identification

JOURNAL OF INTELLIGENT INFORMATION SYSTEMS (2016)

Journal

JOURNAL OF INTELLIGENT INFORMATION SYSTEMS

Volume 46, Issue 1, Pages 1-19

Publisher

SPRINGER

DOI: 10.1007/s10844-014-0349-9

Keywords

Author identification; Natural language processing; Clustering

Categories

Computer Science, Artificial Intelligence; Computer Science, Information Systems

Funding

Italian PON project Puglia@Service [PON02_00563_3489339]

Abstract

Assessing whether two documents were written by the same author is a crucial task, especially in the Internet age, with possible applications to philology and forensics. The problem has been tackled in the literature using frequency-based approaches, numeric techniques, or writing style analysis. Focusing on the last of these, this paper proposes a novel technique that takes into account the structure of sentences, on the assumption that it is closely related to the author's writing style. Specifically, a text (or collection of texts) in natural language written by a given author is translated into a set of First-Order Logic descriptions, and a model of the author's writing habits is obtained by clustering these descriptions. Then, if the models of a known author and of an unknown one overlap, it can be concluded that they are the same person. Among the advantages of this approach, it needs no training phase and also performs well on short texts and/or small collections.
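The pipeline outlined in the abstract (describe each sentence, cluster the descriptions into an author model, test two models for overlap) can be illustrated with a minimal sketch. It is not the paper's method: each sentence is assumed to be already reduced to a flat set of structural feature strings instead of a First-Order Logic description, clustering is a greedy single pass over Jaccard similarity, and "overlap" is taken to mean that some pair of cluster seeds is similar enough. All names, the feature strings, and the 0.5 threshold are hypothetical choices made for illustration.

```python
from typing import FrozenSet, List

# Assumption: a flat feature set stands in for a First-Order Logic description.
Description = FrozenSet[str]
Model = List[List[Description]]


def jaccard(a: Description, b: Description) -> float:
    """Similarity of two feature sets: 1.0 if identical, 0.0 if disjoint."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster(descriptions: List[Description], threshold: float = 0.5) -> Model:
    """Greedy single-pass clustering: a description joins the first cluster
    whose seed (first member) is similar enough, else it starts a new one."""
    clusters: Model = []
    for d in descriptions:
        for c in clusters:
            if jaccard(d, c[0]) >= threshold:
                c.append(d)
                break
        else:
            clusters.append([d])
    return clusters


def models_overlap(a: Model, b: Model, threshold: float = 0.5) -> bool:
    """Hypothetical overlap test: some pair of cluster seeds is similar."""
    return any(jaccard(ca[0], cb[0]) >= threshold for ca in a for cb in b)


def same_author(known: List[Description], unknown: List[Description],
                threshold: float = 0.5) -> bool:
    """Build one model per author and report whether the models overlap."""
    return models_overlap(cluster(known, threshold),
                          cluster(unknown, threshold), threshold)


# Toy data with invented feature names, for illustration only.
known = [
    frozenset({"subj(np)", "verb(past)", "obj(np)"}),
    frozenset({"subj(np)", "verb(past)", "obj(np)", "adv"}),
]
unknown_similar = [frozenset({"subj(np)", "verb(past)", "obj(np)", "pp"})]
unknown_different = [frozenset({"imperative", "verb(base)"})]
```

With this toy data, `same_author(known, unknown_similar)` holds while `same_author(known, unknown_different)` does not. No training step is involved, which mirrors the unsupervised character claimed in the abstract, though the actual descriptions and clustering procedure in the paper are richer than this stand-in.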
