4.4 Article

Solving the data sparsity problem in destination prediction

Journal

VLDB JOURNAL
Volume 24, Issue 2, Pages 219-243

Publisher

SPRINGER
DOI: 10.1007/s00778-014-0369-7

Keywords

Trajectory mining; Destination prediction; Markov model; Bayes' rule

Funding

  1. Australian Research Council (ARC) [DP130104587, FT120100832]

Abstract

Destination prediction is an essential task for many emerging location-based applications, such as recommending sightseeing places and targeting advertisements according to destination. A common approach is to derive the probability of a location being the destination from historical trajectories. However, almost all existing techniques rely on extra information such as road networks, proprietary travel planners, statistics requested from government, and personal driving habits. Such information is, in most circumstances, unavailable or very costly to obtain. We therefore approach destination prediction using only a historical trajectory dataset. This approach encounters the data sparsity problem: the available historical trajectories are far from enough to cover all possible query trajectories, which considerably limits the number of query trajectories for which a destination can be predicted. We propose a novel method named Sub-Trajectory Synthesis (SubSyn) to address this problem. SubSyn first decomposes historical trajectories into sub-trajectories comprising two adjacent locations, and then connects the sub-trajectories into synthesised trajectories. This process effectively expands the historical trajectory dataset to contain many more trajectories. Experiments on real datasets show that SubSyn can predict destinations for up to ten times more query trajectories than a baseline prediction algorithm. Furthermore, the running time of the SubSyn-training algorithm is almost negligible for a large set of 1.9 million trajectories, and the SubSyn-prediction algorithm consistently runs over two orders of magnitude faster than the baseline prediction algorithm.
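
The decompose-and-connect idea can be illustrated with a small Python sketch. This is not the authors' SubSyn implementation; it is a simplified illustration that assumes trajectories are lists of hashable location labels and uses a first-order Markov transition table. The function names (`transition_probs`, `baseline_destinations`, `can_synthesise`, `step_distribution`) and the toy data are hypothetical. It shows a query that no single historical trajectory contains, but that can still be covered by chaining two-location sub-trajectories.

```python
from collections import defaultdict


def transition_probs(trajectories):
    """Split trajectories into adjacent-location pairs and normalise the counts."""
    counts = defaultdict(lambda: defaultdict(int))
    for traj in trajectories:
        for a, b in zip(traj, traj[1:]):
            counts[a][b] += 1
    return {
        a: {b: n / sum(nexts.values()) for b, n in nexts.items()}
        for a, nexts in counts.items()
    }


def contains(traj, query):
    """True if query occurs as a contiguous run inside traj."""
    n = len(query)
    return any(traj[i:i + n] == query for i in range(len(traj) - n + 1))


def baseline_destinations(trajectories, query):
    """Naive matching: end points of historical trajectories containing the query."""
    return {t[-1] for t in trajectories if contains(t, query)}


def can_synthesise(probs, query):
    """True if every adjacent pair in the query was seen in some trajectory."""
    return all(b in probs.get(a, {}) for a, b in zip(query, query[1:]))


def step_distribution(probs, start, steps):
    """Distribution over locations after exactly `steps` Markov transitions."""
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        for a, p in dist.items():
            for b, q in probs.get(a, {}).items():
                nxt[b] += p * q
        dist = dict(nxt)
    return dist


# Toy data (assumed for illustration only).
history = [["A", "B", "C"], ["B", "D", "E"]]
query = ["A", "B", "D"]
probs = transition_probs(history)

matched = baseline_destinations(history, query)   # no trajectory contains A, B, D
covered = can_synthesise(probs, query)            # A->B and B->D were both observed
next_locations = step_distribution(probs, query[-1], 1)
```

In this toy case the naive matcher finds no historical trajectory for the query, while the pairwise transition table covers it and suggests where the trip may continue. A full method would also need to turn these transition probabilities into destination probabilities, for example via Bayes' rule as the keywords suggest; that step is omitted here.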

