Article

Optically Disaggregated Data Centers With Minimal Remote Memory Latency: Technologies, Architectures, and Resource Allocation [Invited]

Journal

JOURNAL OF OPTICAL COMMUNICATIONS AND NETWORKING
Volume 10, Issue 2, Pages A270-A285

Publisher

Optica Publishing Group
DOI: 10.1364/JOCN.10.00A270

Keywords

Hybrid OCS/EPS; Memory, accelerator, and storage disaggregation; On-board silicon photonic transceivers; Reconfigurable and function embedded architecture

Funding

  1. EU [687632]
  2. Huber+Suhner Polatis
  3. Luxtera
  4. Engineering and Physical Sciences Research Council (EPSRC) [EP/J017582/1]

Abstract

Disaggregated rack-scale data centers have been proposed as the only promising avenue for breaking the barrier of fixed CPU-to-memory proportionality imposed by conventional server-centric systems with direct-attached main-tray memory. However, memory disaggregation places stringent requirements on the network in terms of latency, energy efficiency, bandwidth, and bandwidth density. This paper identifies the requirements and key performance indicators of a network capable of disaggregating IT resources and summarizes the progress and importance of optical interconnects in meeting them. Crucially, it proposes a rack- and cluster-scale architecture that supports the disaggregation of CPU, memory, storage, and/or accelerator blocks. Optical circuit switching forms the core of this architecture, while the end points (IT resources) are equipped with on-chip programmable hybrid electrical packet/circuit switches. The architecture offers a dynamically reconfigurable physical topology from which virtual topologies, each embedded with a set of functions, can be formed. The latency overhead of disaggregated DDR4 (parallel) and hybrid memory cube (serial) memory elements is analyzed on both the conventional and the proposed architectures. A set of resource allocation algorithms is introduced to (1) optimally select disaggregated IT resources with the lowest possible latency, (2) pool them together by means of a virtual network interconnect, and (3) compose virtual disaggregated servers. Simulation results show up to a 34% increase in resource utilization over traditional data centers while highlighting the importance of placement and locality among compute, memory, and storage resources. In particular, the network-aware, locality-based resource allocation algorithm achieves memory-transaction round-trip latencies as low as 15 ns, 95 ns, and 315 ns on 63%, 22%, and 15% of the allocated virtual machines (VMs), respectively, while utilizing 100% of the CPU resources. Furthermore, a formulation to parameterize and evaluate the additional financial cost incurred by disaggregation is reported; it shows that the more diverse the VM requests, the higher the net financial gain. Finally, an experiment using silicon photonic midboard optics and an optical circuit switch demonstrates forward-error-correction-free 10^-12 bit-error-rate performance on up to five-tier scale-out networks.
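
To illustrate the kind of locality-based allocation the abstract describes, the sketch below is a hypothetical, greatly simplified greedy allocator written in Python. It is not the authors' algorithm: the latency tiers, pool names, capacities, and the `MemoryPool`/`allocate_memory` identifiers are illustrative assumptions that loosely mirror the 15 ns / 95 ns / 315 ns round-trip figures quoted above. The idea it conveys is simply that a VM's memory demand is satisfied from the lowest-latency disaggregated memory pools first.

```python
# Hypothetical sketch (not the paper's algorithm): greedily fill a VM's
# memory demand from the lowest-latency disaggregated memory pools first.
# Latency tiers, names, and capacities are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class MemoryPool:
    pool_id: str
    rtt_ns: int     # assumed round-trip memory-transaction latency to the CPU blade
    free_gib: int   # remaining capacity in GiB


def allocate_memory(pools, demand_gib):
    """Satisfy `demand_gib` from the lowest-latency pools first.

    Returns a list of (pool_id, gib) slices, or None if capacity is short.
    """
    allocation = []
    remaining = demand_gib
    for pool in sorted(pools, key=lambda p: p.rtt_ns):
        if remaining == 0:
            break
        take = min(pool.free_gib, remaining)
        if take > 0:
            allocation.append((pool.pool_id, take))
            pool.free_gib -= take
            remaining -= take
    return allocation if remaining == 0 else None


if __name__ == "__main__":
    # Example tiers loosely mirroring the abstract's 15/95/315 ns figures.
    pools = [
        MemoryPool("same-tray", rtt_ns=15, free_gib=32),
        MemoryPool("same-rack", rtt_ns=95, free_gib=64),
        MemoryPool("cluster",   rtt_ns=315, free_gib=256),
    ]
    print(allocate_memory(pools, demand_gib=80))
    # -> [('same-tray', 32), ('same-rack', 48)]
```

In this toy setting, an 80 GiB request exhausts the same-tray pool before spilling into the rack-level pool, which is the intuition behind reporting the share of VMs that land in each latency tier.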
