Article

Trip Purpose Imputation Using GPS Trajectories with Machine Learning

Journal

Publisher

MDPI
DOI: 10.3390/ijgi10110775

Keywords

class noise; data mining; ensemble filter; hierarchical clustering; machine learning; random forest; trip purpose

Funding

  1. Swiss Innovation Agency (Innosuisse)
  2. Federal Department of the Environment, Transport, Energy and Communications (DETEC)

The study demonstrates that GPS trajectory information alone, without personal information, performs well for inferring trip purposes, which makes the approach broadly applicable when available information is limited. The ensemble filter is an effective tool: it increases accuracy, especially for minority classes, and reduces the uncertainty caused by blindly trusting participants' labeling of activities. A model trained on a small subset of citizens' GPS trajectories can effectively be applied to a larger GPS trajectory sample.
We studied trip purpose imputation using data mining and machine learning techniques on a dataset of GPS-based trajectories gathered in Switzerland. With a large number of labeled activities in eight categories, we explored location information using hierarchical clustering and achieved a classification accuracy of 86.7% with a random forest approach as a baseline. The contributions of this study are as follows. First, using information from GPS trajectories exclusively, without personal information, causes only a negligible decrease in accuracy (0.9%), which indicates that our data mining steps perform well and that our imputation scheme is widely applicable when available information is limited. Second, we investigate how model performance depends on the geographical location, the number of participants, and the duration of the survey, to provide a reference when comparing classification accuracy. Third, we show the ensemble filter to be an excellent tool in this research field, not only because of the increased accuracy (93.6%), especially for minority classes, but also because it reduces the uncertainty of blindly trusting participants' labeling of activities, which is vulnerable to class noise due to the large survey response burden. Finally, the trip purpose derivation accuracy across participants reaches 74.8%, which is significant and suggests that a model trained on the GPS trajectories of a small subset of citizens can be effectively applied to a larger GPS trajectory sample.
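The abstract does not spell out how the ensemble filter is built, so the following is only a minimal sketch of the general idea behind a majority-vote ensemble filter for class noise: several learners are trained on cross-validation folds, and a labeled activity is flagged when most of them disagree with its label. The learners used here (k-nearest-neighbour and nearest-centroid), the two-dimensional toy features, the fold assignment, and all function names are assumptions chosen for illustration; they are not the authors' implementation, which is based on a random forest.

```python
import math
from collections import Counter


def knn(k):
    """Return a k-nearest-neighbour learner (illustrative stand-in)."""
    def predict(train_X, train_y, x):
        nearest = sorted(zip(train_X, train_y),
                         key=lambda pair: math.dist(pair[0], x))[:k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]
    return predict


def nearest_centroid(train_X, train_y, x):
    """Predict the class whose mean feature vector is closest to x."""
    sums, counts = {}, Counter(train_y)
    for xi, yi in zip(train_X, train_y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, value in enumerate(xi):
            acc[j] += value
    return min(
        sums,
        key=lambda c: math.dist([s / counts[c] for s in sums[c]], x),
    )


def ensemble_filter(X, y, learners, n_folds=5):
    """Return indices whose label is contradicted by a majority of learners.

    Each instance is judged only by models trained on the other folds,
    so its own (possibly noisy) label cannot vouch for itself.
    """
    flagged = []
    for fold in range(n_folds):
        train_idx = [i for i in range(len(X)) if i % n_folds != fold]
        test_idx = [i for i in range(len(X)) if i % n_folds == fold]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        for i in test_idx:
            wrong = sum(
                learn(train_X, train_y, X[i]) != y[i] for learn in learners
            )
            if wrong > len(learners) / 2:  # majority vote
                flagged.append(i)
    return sorted(flagged)


# Toy data (hypothetical): two well-separated activity clusters.
X = [(i, i) for i in range(10)] + [(100 + i, 100 + i) for i in range(10)]
y = ["home"] * 10 + ["work"] * 10
y[3] = "work"  # simulate one mislabeled activity (class noise)

learners = [knn(1), knn(3), nearest_centroid]
noisy = ensemble_filter(X, y, learners)
print(noisy)
```

In a filtering workflow of this kind, the flagged instances would be removed or relabeled before the final classifier is trained. A consensus rule (flag only when every learner disagrees) is a more conservative alternative to the majority vote used in this sketch.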
