Article

Group incremental adaptive clustering based on neural network and rough set theory for crime report categorization

Journal

NEUROCOMPUTING
Volume 459, Pages 465-480

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2019.10.109

Keywords

Crime report categorization; Group incremental clustering; Neural network; Adaptive resonance theory; Rough set theory; Unsupervised learning

Funding

  1. National Natural Science Foundation of China [61976120]
  2. Natural Science Foundation of Jiangsu Province [BK20191445]
  3. Six Talent Peaks Project of Jiangsu Province [XYDXXJS-048]
  4. Qing Lan Project of Jiangsu Province


The proposed method integrates neural network and rough set theory for clustering crime reports by identifying named entities and selecting phrases to describe each report. The phrases are vectorized and clustered using a graph-based algorithm, with an adaptive resonance theory neural network used to generate clusters. This approach adapts to dynamic data environments and has been validated with various crime report datasets, demonstrating its effectiveness compared to other clustering algorithms.
Explosively growing online text reports are mostly unstructured in nature. Many state-of-the-art techniques involving supervised, unsupervised or semi-supervised approaches have been developed in recent years for automatic clustering of these reports. Annotation of online crime reports is a challenging task, as various types of crime reports are frequently generated over time. To the best of the authors' knowledge, this is the first attempt at group incremental adaptive clustering of crime reports integrating neural network and rough set theory. The proposed work initially identifies the named entities and selects only the context words within a pair of entities as a phrase. Thus every report is described by a collection of phrases. The phrases are vectorized using GloVe and a graph-based clustering algorithm is applied to cluster all the collected phrases. The phrases within a cluster are considered to be of a similar type, called paraphrases, and each report is represented by a binary vector of dimension equal to the number of clusters obtained. If a phrase of the report lies in a cluster then a '1' is set at the corresponding position of the binary vector; otherwise it is set to '0'. Next, an adaptive resonance theory neural network is applied to the binary vector representation of the crime reports to generate a set of clusters of crime reports. When a new group of reports is available, the reports are transformed into binary form in a similar way and rough set theory is applied to them. It puts many reports into existing clusters, and for the remaining reports adaptive resonance theory is applied again to modify the existing clusters and possibly generate new clusters. Thus, in a dynamic environment where data are generated gradually over time, the proposed group incremental clustering algorithm adapts to provide the updated set of clusters.
The method has been applied on various crime report datasets and validated with the help of several cluster validation indices. The method is also compared with some state-of-the-art clustering algorithms to express its effectiveness and statistical significance in the domain of crime corpora. (c) 2019 Elsevier B.V. All rights reserved.
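
The binary encoding and adaptive resonance theory steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a textbook fast-learning ART1 variant, the names `encode_report` and `art1_cluster` and the parameters `vigilance` and `beta` are hypothetical, and the phrase-cluster ids are made-up toy data. The rough set step is omitted; the incremental call simply reuses the existing prototypes.

```python
import numpy as np

def encode_report(phrase_cluster_ids, n_phrase_clusters):
    """Binary vector with a 1 at every phrase cluster the report touches."""
    v = np.zeros(n_phrase_clusters, dtype=bool)
    for k in phrase_cluster_ids:
        v[k] = True
    return v

def art1_cluster(vectors, vigilance=0.6, beta=1.0, prototypes=None):
    """Simplified fast-learning ART1 (an assumption, not the paper's exact network).

    Passing `prototypes` from an earlier call lets a new group of reports
    update existing clusters or open new ones.
    """
    prototypes = [] if prototypes is None else [p.copy() for p in prototypes]
    labels = []
    for x in vectors:
        size_x = x.sum()

        # Choice function: rank existing prototypes by normalised overlap.
        def choice(j):
            w = prototypes[j]
            return np.logical_and(x, w).sum() / (beta + w.sum())

        order = sorted(range(len(prototypes)), key=choice, reverse=True)
        for j in order:
            overlap = np.logical_and(x, prototypes[j])
            # Vigilance test: enough of the input must match the prototype.
            if size_x > 0 and overlap.sum() / size_x >= vigilance:
                prototypes[j] = overlap   # fast learning: keep shared bits only
                labels.append(j)
                break
        else:
            # No prototype resonated, so this report starts a new cluster.
            prototypes.append(x.copy())
            labels.append(len(prototypes) - 1)
    return labels, prototypes

# Toy data: each report is the set of phrase-cluster ids its phrases fell into.
first_group = [{0, 1}, {0, 1, 2}, {3, 4}, {3}]
X = [encode_report(r, 5) for r in first_group]
labels, protos = art1_cluster(X)
print(labels)  # [0, 0, 1, 1]

# A later group of reports reuses the stored prototypes.
new_group = [encode_report(r, 5) for r in [{0, 1}, {2}]]
new_labels, protos = art1_cluster(new_group, prototypes=protos)
print(new_labels)  # [0, 2]
```

In this sketch a higher `vigilance` yields more, tighter clusters, while a lower value merges reports more readily; the value 0.6 is arbitrary.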
