Journal
FORENSIC SCIENCE INTERNATIONAL
Volume 325, Issue -, Pages -
Publisher
ELSEVIER IRELAND LTD
DOI: 10.1016/j.forsciint.2021.110824
Keywords
Linguistics; Questioned documents; Digital forensics; Computational linguistics; Authorship attribution; Authorship verification
Funding
- National Science Foundation [1814602]
- Directorate for Computer and Information Science and Engineering, Division of Computer and Network Systems [1814602] (funding source: National Science Foundation)
The paper introduces a computer program that compares a known document with an anonymous or disputed one to determine whether both were written by the same author, and validates its accuracy through a series of controlled experiments involving English-language blogs. The system achieved a measured accuracy of 77% across more than 32,000 different document pairs, addressing a key problem in forensic linguistics.
Being able to identify the author of an anonymous or disputed document is an important task in forensic science. This can be treated as a form of pattern evidence based on writing style, but the subjective analysis of writing style may have all the well-known problems of other forms of subjective pattern evidence. In this paper, we demonstrate a computer program to address these issues. This program analyzes a pair of documents (a known document and a questioned document) to determine if they were written by the same author. More importantly, this paper also validates the accuracy of this program through a large-scale series of controlled experiments involving English-language blogs. Across more than 32,000 different document pairs, the system achieved a measured accuracy of 77%. This paper concludes that this system not only addresses a key problem in forensic linguistics, but also provides the repeatability, reproducibility, and measured accuracy levels that are key to the advancement of forensic science. © 2021 Elsevier B.V. All rights reserved.