Article

Scalable Detection of Anomalous Patterns With Connectivity Constraints

Publisher

TAYLOR & FRANCIS INC
DOI: 10.1080/10618600.2014.960926

Keywords

Biosurveillance; Event detection; Graph mining; Scan statistics; Spatial scan statistic

Funding

  1. National Science Foundation [IIS-0916345, IIS-0911032, IIS-0953330]
  2. NSF [GRFP-0946825]
  3. AT&T Labs Fellowship
  4. Division of Information & Intelligent Systems
  5. Directorate for Computer & Information Science & Engineering [0953330] (funding source: National Science Foundation)

We present GraphScan, a novel method for detecting arbitrarily shaped connected clusters in graph or network data. Given a graph structure, data observed at each node, and a score function defining the anomalousness of a set of nodes, GraphScan can efficiently and exactly identify the most anomalous (highest-scoring) connected subgraph. Kulldorff's spatial scan, which searches over circles consisting of a center location and its k - 1 nearest neighbors, has been extended to include connectivity constraints by FlexScan. However, FlexScan performs an exhaustive search over connected subsets and is computationally infeasible for k > 30. Alternatively, the upper level set (ULS) scan scales well to large graphs but is not guaranteed to find the highest-scoring subset. We demonstrate that GraphScan is able to scale to graphs an order of magnitude larger than FlexScan, while guaranteeing that the highest-scoring subgraph will be identified. We evaluate GraphScan, Kulldorff's spatial scan (searching over circles) and ULS in two different settings of public health surveillance. The first examines detection power using simulated disease outbreaks injected into real-world Emergency Department data. GraphScan improved detection power by identifying connected, irregularly shaped spatial clusters while requiring less than 4.3 sec of computation time per day of data. The second scenario uses contaminant plumes spreading through a water distribution system to evaluate the spatial accuracy of the methods. GraphScan improved spatial accuracy using data generated from noisy, binary sensors in the network while requiring less than 0.22 sec of computation time per hour of data.
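The problem statement in the abstract (graph, per-node data, score function, highest-scoring connected subset) can be made concrete with a naive baseline. The sketch below is not GraphScan and is not taken from the paper: it is an exhaustive search over all connected subsets, which is the kind of search the abstract describes as infeasible beyond small neighborhoods. The score is assumed to be an expectation-based Poisson statistic, F(S) = C log(C/B) + B - C when C > B and 0 otherwise, where C and B are the total observed count and total expected baseline in S. The abstract does not specify a score function, so this choice, like the function names and the toy data, is an illustrative assumption.

```python
from itertools import combinations
from math import log


def is_connected(nodes, adjacency):
    """Return True if `nodes` induces a connected subgraph of `adjacency`."""
    nodes = set(nodes)
    if not nodes:
        return False
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adjacency[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def poisson_score(nodes, counts, baselines):
    """Assumed score: expectation-based Poisson statistic for a node set."""
    c = sum(counts[n] for n in nodes)      # total observed count C
    b = sum(baselines[n] for n in nodes)   # total expected baseline B
    if c <= b:
        return 0.0
    return c * log(c / b) + b - c


def best_connected_subset(adjacency, counts, baselines):
    """Brute force over every subset; exponential in the number of nodes."""
    best_nodes, best_score = frozenset(), 0.0
    all_nodes = sorted(adjacency)
    for size in range(1, len(all_nodes) + 1):
        for subset in combinations(all_nodes, size):
            if not is_connected(subset, adjacency):
                continue
            score = poisson_score(subset, counts, baselines)
            if score > best_score:
                best_nodes, best_score = frozenset(subset), score
    return best_nodes, best_score


# Hypothetical toy data: a path graph 0-1-2-3-4 with elevated counts at 1 and 3.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
counts = {0: 10, 1: 30, 2: 10, 3: 28, 4: 10}
baselines = {n: 10.0 for n in adjacency}

best_nodes, best_score = best_connected_subset(adjacency, counts, baselines)
print(sorted(best_nodes), round(best_score, 3))
```

On this toy graph the two elevated nodes are not adjacent, so the connectivity constraint forces the unremarkable node 2 into the best subset, which still outscores either elevated node alone. The search enumerates all 2^n subsets, which is why an exhaustive approach stops being practical well before graphs of realistic size.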
