Dynamic File-Access Characteristics of a Production Parallel Scientific Workload

David Kotz and Nils Nieuwejaar.
Appeared in Supercomputing '94, pages 640-649.

Abstract: Multiprocessors have permitted astounding increases in computational performance, but many cannot meet the intense I/O requirements of some scientific applications. An important component of any solution to this I/O bottleneck is a parallel file system that can provide high-bandwidth access to tremendous amounts of data in parallel to hundreds or thousands of processors.

Most successful systems are based on a solid understanding of the expected workload, but thus far there have been no comprehensive workload characterizations of multiprocessor file systems. This paper presents the results of a three-week tracing study in which all file-related activity on a massively parallel computer was recorded. Our instrumentation differs from previous efforts in that it collects information about every I/O request and about the mix of jobs running in a production environment. We also present the results of a trace-driven caching simulation and recommendations for designers of multiprocessor file systems.

Full paper (PostScript, 60635 bytes)
(Copyright (c) 1994 by IEEE)

Technical report version.

Another related technical report.

The CHARISMA project.
