Article

Runtime Optimizations for Tree-based Machine Learning Models

Journal

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
Volume 26, Issue 9, Pages 2281-2292

Publisher

IEEE Computer Society
DOI: 10.1109/TKDE.2013.73

Keywords

Web search; general information storage and retrieval; information technology and systems; scalability and efficiency; learning to rank; regression trees

Funding

  1. U.S. NSF [IIS-0916043, IIS-1144034, IIS-1218043]
  2. Dutch National Program COMMIT

Abstract

Tree-based models have proven to be an effective solution for web ranking as well as other machine learning problems in diverse domains. This paper focuses on optimizing the runtime performance of applying such models to make predictions, specifically using gradient-boosted regression trees for learning to rank. Although exceedingly simple conceptually, most implementations of tree-based models do not efficiently utilize modern superscalar processors. By laying out data structures in memory in a more cache-conscious fashion, removing branches from the execution flow using a technique called predication, and micro-batching predictions using a technique called vectorization, we are able to better exploit modern processor architectures. Experiments on synthetic data and on three standard learning-to-rank datasets show that our approach is significantly faster than standard implementations.
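The two techniques named in the abstract, predication and vectorization, can be sketched in miniature. The array layout and function names below are illustrative assumptions, not the paper's implementation:

```python
def predict_predicated(features, thresholds, leaf_values, x, depth):
    """Branch-free descent of a complete binary tree stored in flat arrays.

    Node i's children sit at 2*i+1 and 2*i+2, so the tree needs no
    pointers and lays out contiguously in memory (cache-conscious).
    The comparison result is used as an arithmetic offset (predication),
    leaving no data-dependent branch in the hot loop.
    """
    i = 0
    for _ in range(depth):
        # comparison yields 0 or 1; no if/else on the traversal path
        i = 2 * i + 1 + int(x[features[i]] > thresholds[i])
    return leaf_values[i]


def predict_batch(features, thresholds, leaf_values, X, depth):
    """Micro-batched ("vectorized") variant: advance all instances one
    tree level at a time, grouping the memory accesses for each level."""
    idx = [0] * len(X)
    for _ in range(depth):
        idx = [2 * i + 1 + (x[features[i]] > thresholds[i])
               for i, x in zip(idx, X)]
    return [leaf_values[i] for i in idx]
```

For a depth-2 tree, `features` and `thresholds` hold the three internal nodes and `leaf_values[3..6]` hold the leaf outputs; both functions produce the same predictions, differing only in how the traversal is scheduled.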

