Article

Retroscope: Retrospective Monitoring of Distributed Systems

Journal

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS
Volume 30, Issue 11, Pages 2582-2594

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TPDS.2019.2911944

Keywords

Distributed debugging; distributed applications; tracing; query languages

Funding

  1. National Science Foundation (NSF) [XPS-1533870, XPS-1533802]

Retroscope is a comprehensive, lightweight distributed monitoring tool that enables users to query and reconstruct past consistent global states of the system. Retroscope achieves this by augmenting the system with Hybrid Logical Clocks (HLC) and by streaming HLC-stamped event logs for storage and processing; these HLC timestamps are then used to construct global (or nonlocal) snapshots upon request. Retroscope provides a rich querying language (RQL) to facilitate searching for global predicates across past consistent states. The search advances through global states in small incremental steps, which greatly reduces the computation needed to construct consistent states. The Retroscope search algorithm is embarrassingly parallel and can employ many worker processes (each processing up to 150,000 consistent snapshots per second) to handle a single query. We evaluate Retroscope's monitoring capabilities in two case studies: Chord and Apache ZooKeeper.
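
To make the two ideas in the abstract concrete, here is a minimal Python sketch of HLC timestamping and of an incremental predicate search over HLC-stamped events. It is an illustration under stated assumptions, not Retroscope's implementation or RQL: the names `HLC`, `Event`, and `first_match` are hypothetical, the physical clock is injected as a plain function, and the event log is assumed to fit in memory. The clock follows the standard HLC update rules, and the search applies one timestamp's worth of events at a time instead of rebuilding each state from scratch.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import Callable, Dict, Iterable, Optional, Tuple

Timestamp = Tuple[int, int]  # (l, c): logical-physical part and counter


class HLC:
    """Hypothetical Hybrid Logical Clock following the standard update rules."""

    def __init__(self, physical_clock: Callable[[], int]) -> None:
        self.pt = physical_clock  # injected so the sketch is deterministic
        self.l = 0
        self.c = 0

    def now(self) -> Timestamp:
        """Timestamp a local or send event."""
        prev = self.l
        self.l = max(prev, self.pt())
        self.c = self.c + 1 if self.l == prev else 0
        return (self.l, self.c)

    def update(self, remote: Timestamp) -> Timestamp:
        """Timestamp a receive event carrying the sender's timestamp."""
        lm, cm = remote
        prev = self.l
        self.l = max(prev, lm, self.pt())
        if self.l == prev and self.l == lm:
            self.c = max(self.c, cm) + 1
        elif self.l == prev:
            self.c += 1
        elif self.l == lm:
            self.c = cm + 1
        else:
            self.c = 0
        return (self.l, self.c)


@dataclass(frozen=True)
class Event:
    """Hypothetical log record: a node set `key` to `value` at HLC time `ts`."""
    ts: Timestamp
    node: str
    key: str
    value: int


State = Dict[Tuple[str, str], int]


def first_match(
    events: Iterable[Event], predicate: Callable[[State], bool]
) -> Optional[Tuple[Timestamp, State]]:
    """Step through cuts in HLC order and return the first one satisfying
    `predicate`. Each cut holds every event with timestamp <= T, which is
    consistent because HLC timestamps respect happened-before."""
    state: State = {}
    ordered = sorted(events, key=lambda e: e.ts)
    for ts, group in groupby(ordered, key=lambda e: e.ts):
        for ev in group:  # incremental step: apply only the new events
            state[(ev.node, ev.key)] = ev.value
        if predicate(state):
            return ts, dict(state)
    return None


# Usage: node "a" writes, then node "b" writes after receiving a's message.
a, b = HLC(lambda: 10), HLC(lambda: 5)
t1 = a.now()
t2 = a.now()
t3 = b.update(t2)
log = [Event(t3, "b", "x", 1), Event(t1, "a", "x", 1)]
hit = first_match(log, lambda s: sum(s.values()) >= 2)
```

Because the state is carried forward from one cut to the next, the cost of each step is proportional to the number of new events rather than to the size of the whole log. A parallel version could split the time range among workers, each starting from a stored snapshot; that part is not sketched here.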
