Solving the data sparsity problem in destination prediction

VLDB JOURNAL (2015)

Journal

VLDB JOURNAL

Volume 24, Issue 2, Pages 219-243

Publisher

SPRINGER

DOI: 10.1007/s00778-014-0369-7

Keywords

Trajectory mining; Destination prediction; Markov model; Bayes' rule

Funding

Australian Research Council (ARC) [DP130104587, FT120100832]

Abstract

Destination prediction is an essential task for many emerging location-based applications, such as recommending sightseeing places and targeting advertising according to destinations. A common approach to destination prediction is to derive the probability of a location being the destination from historical trajectories. However, almost all existing techniques rely on extra information such as the road network, a proprietary travel planner, statistics requested from government, or personal driving habits. Such extra information is, in most circumstances, unavailable or very costly to obtain. We therefore approach the task of destination prediction using only a historical trajectory dataset. This approach, however, encounters the data sparsity problem: the available historical trajectories are far from enough to cover all possible query trajectories, which considerably limits the number of query trajectories for which a destination can be predicted. We propose a novel method named Sub-Trajectory Synthesis (SubSyn) to address the data sparsity problem. SubSyn first decomposes historical trajectories into sub-trajectories comprising two adjacent locations, and then connects the sub-trajectories into synthesised trajectories. This process effectively expands the historical trajectory dataset to contain many more trajectories. Experiments on real datasets show that SubSyn can predict destinations for up to ten times more query trajectories than a baseline prediction algorithm. Furthermore, the running time of the SubSyn-training algorithm is almost negligible for a large set of 1.9 million trajectories, and the SubSyn-prediction algorithm consistently runs over two orders of magnitude faster than the baseline prediction algorithm.
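
The sparsity problem and the idea of synthesising trajectories from two-location sub-trajectories can be illustrated with a minimal sketch. This is not the SubSyn algorithm from the paper: it assumes locations are plain labels, models movement as a first-order Markov chain estimated from adjacent pairs, and omits the Bayes' rule step that would turn path probabilities into destination probabilities. The function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(trajectories):
    """Estimate first-order transition probabilities from adjacent-location pairs."""
    counts = defaultdict(Counter)
    for traj in trajectories:
        # Decompose each trajectory into sub-trajectories of two adjacent locations.
        for a, b in zip(traj, traj[1:]):
            counts[a][b] += 1
    return {
        a: {b: n / sum(nexts.values()) for b, n in nexts.items()}
        for a, nexts in counts.items()
    }


def matches_history(query, trajectories):
    """Exact-match coverage: the query must occur as a contiguous run in some trajectory."""
    k = len(query)
    return any(
        traj[i:i + k] == query
        for traj in trajectories
        for i in range(len(traj) - k + 1)
    )


def synthesised_probability(query, probs):
    """Probability of the query path under the Markov chain (0.0 if a step was never seen)."""
    p = 1.0
    for a, b in zip(query, query[1:]):
        p *= probs.get(a, {}).get(b, 0.0)
    return p


# Toy data (hypothetical): no historical trajectory contains A -> B -> D.
history = [["A", "B", "C"], ["B", "D"], ["C", "D", "E"]]
probs = transition_probabilities(history)
query = ["A", "B", "D"]

covered_by_exact_match = matches_history(query, history)
path_probability = synthesised_probability(query, probs)
```

In this toy dataset the query A, B, D matches no historical trajectory, so an exact-match approach has nothing to base a prediction on. Chaining the observed pairs A to B and B to D still gives the path a non-zero probability, which is the sense in which connecting sub-trajectories expands the usable dataset.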
