Article

NLP-based digital forensic investigation platform for online communications

Journal

COMPUTERS & SECURITY
Volume 104

Publisher

ELSEVIER ADVANCED TECHNOLOGY
DOI: 10.1016/j.cose.2021.102210

Keywords

Digital investigation; Criminal investigation; Email forensics; Social network forensics; NLP-based forensics

Funding

  1. National Key R&D Plan of China [2017YFA0604500]
  2. National Sci-Tech Support Plan of China [2014BAH02F00]
  3. National Natural Science Foundation of China [61701190]
  4. Youth Science Foundation of Jilin Province of China [20180520021JH]
  5. Key Technology Innovation Cooperation Project of Government [SXGJSF2017-4]
  6. Key scientific and technological R&D Plan of Jilin Province of China [20180201103GX]
  7. Project of Jilin Province Development and Reform Commission [2019FGWTZC001]
  8. National Science Foundation CREST [HRD-1736209]
  9. Cloud Technology Endowed Professorship


Digital investigations play a growing role in criminal investigations and civil litigation as more communication moves online. This paper presents a Natural Language Processing (NLP)-based digital investigation platform and reports that, on a real-world dataset, it outperforms two existing approaches, LogAnalysis and SIIMCO.
Digital (forensic) investigations will be increasingly important in both criminal investigations and civil litigation (e.g., corporate espionage and intellectual property theft) as more of our communications take place in cyberspace (e.g., e-mail and social media platforms). In this paper, we present our proposed Natural Language Processing (NLP)-based digital investigation platform. The platform comprises four phases: data collection and representation, vectorization, feature selection, and classifier generation and evaluation. We then demonstrate the potential of our proposed approach using a real-world dataset. The findings indicate that it outperforms two competing approaches, namely LogAnalysis (published in Expert Systems with Applications, 2014) and SIIMCO (published in IEEE Transactions on Information Forensics and Security, 2016). Specifically, our proposed approach achieves an F1-score of 0.65 and a precision of 0.83, whilst LogAnalysis and SIIMCO achieve F1-scores of 0.51 and 0.59 and precisions of 0.49 and 0.58, respectively. © 2021 Elsevier Ltd. All rights reserved.
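
The abstract names four phases but does not say which algorithms fill them. As a rough illustration only, the toy sketch below walks through the same four phases under assumed choices that are not taken from the paper: bag-of-words counts for vectorization, a mean-difference score for feature selection, and a nearest-centroid classifier. The messages, labels, and all function names are invented for this example.

```python
from collections import Counter

def tokenize(text):
    """Representation: lowercase a message and keep alphabetic tokens."""
    return [tok for tok in text.lower().split() if tok.isalpha()]

def build_vocab(texts):
    """Collect the sorted set of terms seen in the training messages."""
    return sorted({tok for text in texts for tok in tokenize(text)})

def vectorize(text, vocab):
    """Vectorization (assumed: raw term counts over the vocabulary)."""
    counts = Counter(tokenize(text))
    return [counts[term] for term in vocab]

def select_features(vectors, labels, k):
    """Feature selection (assumed criterion): keep the k columns whose
    mean count differs most between the two classes."""
    pos = [v for v, y in zip(vectors, labels) if y == 1]
    neg = [v for v, y in zip(vectors, labels) if y == 0]
    def score(j):
        return abs(sum(v[j] for v in pos) / len(pos)
                   - sum(v[j] for v in neg) / len(neg))
    ranked = sorted(range(len(vectors[0])), key=lambda j: (-score(j), j))
    return sorted(ranked[:k])

def project(vector, keep):
    """Restrict a vector to the selected feature columns."""
    return [vector[j] for j in keep]

def train_centroids(vectors, labels):
    """Classifier generation (assumed: one mean vector per class)."""
    centroids = {}
    for label in set(labels):
        rows = [v for v, y in zip(vectors, labels) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(vector, centroids):
    """Assign the label of the nearest centroid (squared Euclidean distance)."""
    def dist(center):
        return sum((a - b) ** 2 for a, b in zip(vector, center))
    return min(sorted(centroids), key=lambda label: dist(centroids[label]))

def precision_recall_f1(y_true, y_pred):
    """Evaluation: precision, recall, and F1 for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Invented toy data: 1 = message of investigative interest, 0 = benign.
train_texts = [
    "transfer the stolen files tonight",
    "delete the logs after the transfer",
    "lunch at noon tomorrow",
    "see you at the meeting tomorrow",
]
train_labels = [1, 1, 0, 0]
test_texts = ["delete the stolen files", "meeting at noon"]
test_labels = [1, 0]

vocab = build_vocab(train_texts)
train_vectors = [vectorize(t, vocab) for t in train_texts]
keep = select_features(train_vectors, train_labels, k=4)
centroids = train_centroids([project(v, keep) for v in train_vectors],
                            train_labels)
predictions = [predict(project(vectorize(t, vocab), keep), centroids)
               for t in test_texts]
precision, recall, f1 = precision_recall_f1(test_labels, predictions)
print(predictions, precision, recall, f1)
```

The perfect scores on this tiny invented set say nothing about the platform itself. The point is only how the F1-score and precision quoted in the abstract are computed from a classifier's predictions.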
