Article

Comparing Data Staging Techniques for Large Scale Brain Images

Journal

IEEE Transactions on Emerging Topics in Computing
Volume 9, Issue 4, Pages 1697-1708

Publisher

IEEE (Institute of Electrical and Electronics Engineers)
DOI: 10.1109/TETC.2020.3028744

Keywords

Brain; Random access memory; Deep learning; File systems; Nonvolatile memory; Computer architecture; Bandwidth; Parallel I/O; storage hierarchy; large-scale deep learning; data staging; brain atlas

Funding

  1. European Union's Horizon 2020 research and innovation programme [720270, 785907]
  2. European Union [604102]

Abstract

Deep Learning methods are seen as a key opportunity for processing large-scale scientific datasets, but efficient processing requires hierarchical storage architectures that give faster access to frequently used data. Four staging techniques are evaluated for two Deep Learning use cases: DRAM staging and use-case-specific staging show the best performance, while a technique called split staging improves performance over the non-staged use cases and is comparable to the specialized solutions. Performance often depends more on the data layout and the transformations used than on the bandwidth of the storage layer.
The use of Deep Learning methods is identified as a key opportunity for enabling the processing of extreme-scale scientific datasets. Efficient processing of these datasets requires the ability to store petabytes of data as well as to access this data quickly. Hierarchical storage architectures are a promising technology for allowing faster access to frequently used data while providing high capacity. However, the efficient use of faster storage layers is hard, as they usually provide lower capacity and store data only temporarily. IO problems caused by random access patterns may not be solved by simply copying data to a faster layer. One way to overcome this bottleneck is staging: frequently used data is temporarily stored in a faster memory or storage layer, so that processes can access it faster. In this work, we evaluate four different staging techniques for two Deep Learning use cases working on large-scale brain images. These applications are very challenging for the underlying IO system, due to very high bandwidth requirements and random, fine-grained access patterns. We analyse and evaluate these methods on three different staging layers: local SSDs, the same local SSDs clustered into a parallel file system, and a dedicated storage server. We also evaluate the performance of staging data in DRAM. As expected, the best performance is reached with DRAM staging or with a use-case-specific staging technique. However, since these methods cannot always be used, we developed a technique called split staging, which is always applicable. With split staging, performance can be improved by up to a factor of four compared to the non-staged use cases on our test machines, and is comparable to that of the specialized solutions on two of the storage layers. Our results also show that performance often depends more on the data layout, and thus on the transformations used, than on the bandwidth of the storage layer.
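
To illustrate the general staging idea described above (a minimal sketch only, not the authors' split-staging implementation), the following Python fragment copies the files an epoch will read from a slow parallel file system to a faster node-local SSD before the random, fine-grained reads begin. The paths and the stage() helper are assumptions made purely for illustration.

    import os
    import shutil

    # Hypothetical locations: a slow parallel file system and a fast node-local SSD.
    SLOW_STORE = "/parallel_fs/brain_images"   # assumed path, for illustration only
    FAST_STAGE = "/local_ssd/stage"            # assumed path, for illustration only

    def stage(filenames):
        """Copy the files an epoch will read onto the faster layer and return
        their staged paths. This captures the basic staging idea: pay the
        sequential copy cost once, then serve random, fine-grained reads from
        the faster layer instead of the parallel file system."""
        os.makedirs(FAST_STAGE, exist_ok=True)
        staged = []
        for name in filenames:
            src = os.path.join(SLOW_STORE, name)
            dst = os.path.join(FAST_STAGE, name)
            if not os.path.exists(dst):        # reuse already-staged copies
                shutil.copy(src, dst)
            staged.append(dst)
        return staged

    # Usage sketch: stage the image slices needed for this epoch, then point
    # the data loader at the staged copies.
    # staged_paths = stage(["slice_0001.tif", "slice_0002.tif"])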

