Text characterization based on recurrence networks

Journal

INFORMATION SCIENCES
Volume 641

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2023.119124

This article introduces the "recurrence network", a method for analyzing text narratives at multiple scales. Applied to a set of 300 books, recurrence networks proved effective at distinguishing between meaningful and meaningless texts, as well as between literary genres.
Several complex systems exhibit intricate properties that occur at multiple scales, and multi-scale characterizations of such systems are used in various applications. In particular, texts have a hierarchical structure, which can be approached with multi-scale concepts and methods. Here, we adopt an extension of the multi-scale, mesoscopic approach (hereafter referred to as a recurrence network) to represent text narratives, in which only the recurrent relationships among tagged parts of speech (subject, verb and direct object) are considered to establish connections among sequential pieces of text. The texts were then characterized using two scale-dependent, complementary measurements: accessibility and symmetry. To evaluate the potential of these concepts, we approached the problems of distinguishing between meaningful and meaningless texts and between literary genres (namely, fiction and non-fiction). A set of 300 books was considered and compared using the above approaches. The recurrence network characterization was able to discriminate, to some extent, between real and meaningless texts and between the two genres assessed. Thus, our results indicate that recurrence networks are able to capture subtleties in book plots, suggesting that a similar methodology can be used in related networked applications.
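The construction described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each sequential piece of text has already been reduced to its set of tagged words (subjects, verbs and direct objects), that two pieces are connected whenever they share at least one such word, and that accessibility is computed with a common random-walk formulation (the exponential of the Shannon entropy of the h-step transition probabilities). The function names and the toy segments are hypothetical, and the symmetry measurement is not covered.

```python
from collections import defaultdict
from math import exp, log


def build_recurrence_network(segments):
    """Build an undirected network over sequential text pieces.

    segments: list of sets, each holding the tagged words (subject, verb,
    direct object) of one piece of text. Assumption: two pieces are linked
    when at least one tagged word recurs in both.
    """
    adj = {i: set() for i in range(len(segments))}
    for i in range(len(segments)):
        for j in range(i + 1, len(segments)):
            if segments[i] & segments[j]:  # a tagged word recurs
                adj[i].add(j)
                adj[j].add(i)
    return adj


def transition_probs(adj, start, h):
    """Probability of a plain random walk being at each node after h steps."""
    probs = {start: 1.0}
    for _ in range(h):
        nxt = defaultdict(float)
        for node, p in probs.items():
            neighbours = adj[node]
            if not neighbours:
                nxt[node] += p  # isolated node: the walker stays put
                continue
            for nb in neighbours:
                nxt[nb] += p / len(neighbours)
        probs = dict(nxt)
    return probs


def accessibility(adj, start, h):
    """Effective number of nodes reachable from `start` in h steps.

    Assumed formulation: exp of the Shannon entropy of the h-step
    transition probabilities, so it ranges from 1 to the number of nodes.
    """
    probs = transition_probs(adj, start, h)
    entropy = -sum(p * log(p) for p in probs.values() if p > 0)
    return exp(entropy)


# Toy, hand-tagged segments (hypothetical data, not from the 300 books).
segments = [
    {"alice", "see", "rabbit"},
    {"rabbit", "run", "hole"},
    {"alice", "follow", "rabbit"},
    {"queen", "shout", "order"},
]
network = build_recurrence_network(segments)
for node in network:
    print(node, sorted(network[node]), accessibility(network, node, h=2))
```

Varying `h` gives the scale-dependent view mentioned in the abstract: small values probe the immediate neighbourhood of a piece of text, while larger values reflect how widely its recurrent words spread through the narrative.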
