Article

Robust causal dependence mining in big data network and its application to traffic flow predictions

Journal

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.trc.2015.03.003

Keywords

Big Data; Traffic flow prediction; Causal dependence; Lasso regression; Robust

Funding

  1. National Science and Technology Support Program [2013BAG18B00]
  2. National Natural Science Foundation of China [51278280]
  3. National Basic Research Program of China (973 Project) [2012CB725405]
  4. Tsinghua University [20131089307]

In this paper, we focus on a special problem in transportation studies that concerns the so-called Big Data challenge: how to build concise yet accurate traffic flow prediction models from the massive data collected by different sensors. The size of the data, the hidden causal dependence, and the complexity of traffic time series are some of the obstacles to making reliable forecasts at a reasonable cost, both time-wise and computation-wise. To better prepare the data for traffic modeling, we introduce a multi-step strategy that processes the raw Big Data into compact time series better suited for regression and causality analysis. First, we use Granger causality to define and determine the potential dependence among the data, producing a much more condensed set of time series that are also highly dependent. Next, we deploy a decomposition algorithm to separate the daily-similar trend and nonstationary burst components of the traffic flow time series yielded by the Granger test. The decomposition results are then treated with two rounds of Lasso regression: the standard Lasso method is first used to quickly filter out most of the irrelevant data, followed by a robust Lasso method that further removes the disturbance caused by the burst components and recovers the strongest dependence among the remaining data. Test results show that the proposed method significantly reduces the cost of building prediction models. Moreover, the obtained causal dependence graph reveals the relationship between the structure of road networks and the correlations among traffic time series. All these findings are useful for building better traffic flow prediction models. (C) 2015 Elsevier Ltd. All rights reserved.
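
To make the first Lasso screening round more concrete, the following is a minimal sketch and not the authors' implementation. It assumes synthetic data, a single one-step lag, and a plain coordinate-descent solver for the standard Lasso objective (1/2n)·||y − Xw||² + λ·||w||₁. The Granger test, the trend/burst decomposition, and the robust Lasso round described in the abstract are omitted. All names (`lasso_cd`, `sensors`, `target`) and the value of `lam` are hypothetical choices made for illustration.

```python
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator used in Lasso coordinate descent."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    resid = list(y)  # residual y - Xw; equals y while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes w[j].
            rho = sum(X[i][j] * (resid[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = new_wj
    return w


# Synthetic example (assumption): four candidate sensors, of which only
# sensor 1 drives the target flow, with a one-step delay.
random.seed(0)
T = 300
sensors = [[random.gauss(0, 1) for _ in range(T)] for _ in range(4)]
target = [0.0] + [
    0.8 * sensors[1][t - 1] + 0.1 * random.gauss(0, 1) for t in range(1, T)
]

# Lagged design matrix: predict target[t] from every sensor at time t - 1.
X = [[s[t - 1] for s in sensors] for t in range(1, T)]
y = target[1:]

w = lasso_cd(X, y, lam=0.1)
selected = [j for j, wj in enumerate(w) if abs(wj) > 1e-8]
print("coefficients:", [round(wj, 3) for wj in w])
print("selected sensors:", selected)
```

In this sketch, the L1 penalty drives the coefficients of sensors unrelated to the target to exactly zero, which is the sense in which a Lasso round can "filter out most of the irrelevant data". The surviving coefficient is shrunk toward zero by the penalty, so in practice a screening round like this would be followed by a refit, or, as in the abstract, by a second and more robust estimation step.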
