Article

Lessons Learned from Optimizing the Sunway Storage System for Higher Application I/O Performance

Journal

JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY
Volume 35, Issue 1, Pages 47-60

Publisher

SCIENCE PRESS
DOI: 10.1007/s11390-020-9798-5

Keywords

high performance computing; I/O interference; parallel file system; performance optimization; resource misallocation

Funding

  1. National Key Research and Development Program of China [2016YFB1000504]
  2. Natural Science Foundation of China [61433008, 61373145, 61572280]
  3. China Postdoctoral Science Foundation [2018M630162]

It is hard for applications to fully utilize the peak bandwidth of the storage system in high-performance computers because of I/O interference, storage resource misallocation, and long, complex I/O paths. We performed several studies to bridge this gap in the Sunway storage system, which serves the supercomputer Sunway TaihuLight. To locate these issues and the connections between them, we developed an end-to-end performance monitoring and diagnosis tool to understand the I/O behaviors of applications and the system. With the help of the tool, we were able to find the root causes of these performance barriers at the I/O forwarding layer and the parallel file system (PFS) layer. An application-aware I/O forwarding allocation framework was used to address the I/O interference and resource misallocation at the I/O forwarding layer. A performance-aware data placement mechanism was proposed to mitigate the impact of I/O interference and performance variations of storage devices in the PFS. Together, these measures gave applications much better I/O performance. During the process, we also proposed a lightweight storage stack to shorten the I/O path of applications with -N I/O pattern. This paper summarizes these studies and presents the lessons learned from the process.
