4.7 Article

Stat-DSM: Statistically Discriminative Sub-Trajectory Mining With Multiple Testing Correction

Journal

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
Volume 34, Issue 3, Pages 1477-1488

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TKDE.2020.2994344

Keywords

Trajectory mining; significant pattern mining; statistical testing; multiple hypothesis testing

Funding

  1. MEXT KAKENHI [17H00758, 16H06538]
  2. JST CREST [JPMJCR1502]
  3. RIKEN Center for Advanced Intelligence Project
  4. JST

This study proposes a novel statistical approach, called Stat-DSM, to evaluate the statistical significance of discriminative sub-trajectory mining results. The proposed method properly controls the statistical significance of the extracted sub-trajectories and addresses the computational and statistical challenges of massive trajectory datasets.
We propose a novel statistical approach for evaluating the statistical significance (reliability) of the results of discriminative sub-trajectory mining, which we call Statistically Discriminative Sub-trajectory Mining (Stat-DSM). Given two groups of trajectories, the goal of Stat-DSM is to extract moving patterns, in the form of sub-trajectories, that occur statistically significantly more often in one group than in the other. An advantage of the proposed method is that the statistical significance of the extracted sub-trajectories is properly controlled, in the sense that the probability of finding a falsely discriminative sub-trajectory is smaller than a specified significance threshold alpha (e.g., 0.05). This guarantee is crucial when the method is used in scientific or social science studies under noisy environments. Finding such statistically discriminative sub-trajectories in a massive trajectory dataset is both computationally and statistically challenging. In the Stat-DSM method, we address these difficulties by introducing a tree representation of sub-trajectories and applying an efficient permutation-based statistical inference method to the tree. To the best of our knowledge, Stat-DSM is the first method that provides a statistical approach to quantifying the reliability of discriminative sub-trajectory mining results. We illustrate the effectiveness and scalability of the Stat-DSM method by applying it to a real-world dataset containing 1,000,000 trajectories.
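The abstract does not spell out the tree-based algorithm, so the following is only a minimal sketch of the general idea of permutation-based multiple testing correction, not the authors' implementation. It assumes the candidate sub-trajectories have already been enumerated into a small binary occurrence matrix (pattern by trajectory), tests each pattern with a one-sided Fisher exact test, and adjusts the p-values with a generic min-p (Westfall-Young style) permutation procedure. The function names `fisher_one_sided` and `significant_patterns`, the toy data, and the brute-force enumeration are all illustrative assumptions; whether Stat-DSM uses exactly this test or procedure is not stated in the abstract.

```python
import random
from math import comb


def fisher_one_sided(k, n_pos, x, n):
    """Upper-tail hypergeometric probability P(X >= k).

    n     : total number of trajectories
    n_pos : number of trajectories in the group of interest
    x     : number of trajectories containing the pattern
    k     : number of those that belong to the group of interest
    """
    upper = min(x, n_pos)
    tail = sum(comb(n_pos, i) * comb(n - n_pos, x - i)
               for i in range(k, upper + 1))
    return tail / comb(n, x)


def significant_patterns(occurrence, labels, alpha=0.05, n_perm=1000, seed=0):
    """Return (indices of significant patterns, adjusted p-values).

    occurrence[j][i] is 1 if pattern j occurs in trajectory i, else 0.
    labels[i] is 1 for the group of interest, 0 for the other group.
    """
    rng = random.Random(seed)
    n, n_pos = len(labels), sum(labels)
    supports = [sum(row) for row in occurrence]  # independent of the labels

    def p_values(lab):
        return [
            fisher_one_sided(sum(o for o, y in zip(row, lab) if y == 1),
                             n_pos, x, n)
            for row, x in zip(occurrence, supports)
        ]

    observed = p_values(labels)

    # Null distribution of the smallest p-value over all patterns,
    # obtained by shuffling the group labels.
    shuffled = list(labels)
    min_p = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        min_p.append(min(p_values(shuffled)))

    # Adjusted p-value: how often a permuted dataset yields a p-value
    # at least as small as the observed one (with the usual +1 correction).
    adjusted = [(1 + sum(m <= p for m in min_p)) / (1 + n_perm)
                for p in observed]
    significant = [j for j, a in enumerate(adjusted) if a <= alpha]
    return significant, adjusted


# Toy usage: 20 trajectories, the first 10 in the group of interest.
labels = [1] * 10 + [0] * 10
occurrence = [
    [1] * 10 + [0] * 10,  # pattern 0: only in the group of interest
    [1, 0] * 10,          # pattern 1: evenly spread over both groups
]
found, adjusted = significant_patterns(occurrence, labels, n_perm=500, seed=1)
print(found, adjusted)
```

Because the adjustment is based on the minimum p-value across all patterns in each permuted dataset, reporting the patterns whose adjusted p-value is at most alpha keeps the chance of any false discovery at roughly alpha. The brute-force loop over every pattern in every permutation is what becomes infeasible at scale, which is the difficulty the abstract says the tree representation is meant to address.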
