Parallel File System Workload Characterization
Most parallel file systems (e.g., Intel's CFS, Thinking Machines' SFS) have
been designed around the assumption that scientific applications running on
parallel computers would exhibit behavior similar to that of scientific
applications running on uniprocessors and vector supercomputers.
The primary characteristics of file access in those environments are:
- Files are huge - hundreds of megabytes, gigabytes, or larger.
- Files are accessed in large pieces - hundreds of kilobytes or megabytes
at a time.
- Files are accessed sequentially. That is, every byte in the file is
accessed, in order, from beginning to end.
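The third characteristic can be stated precisely as a check over a list of requests. The following is a minimal sketch, not part of the tracing tools used in this study; it assumes each request is recorded as a hypothetical (offset, length) pair in the order it was issued, and that the file size is known.

```python
def is_fully_sequential(requests, file_size):
    """Return True if the requests cover the whole file, in order, with no gaps.

    `requests` is a list of (offset, length) pairs in issue order.
    This record layout is an assumption made for illustration.
    """
    expected = 0  # offset at which the next request must start
    for offset, length in requests:
        if length <= 0 or offset != expected:
            return False  # a gap, an overlap, or a backward seek
        expected = offset + length
    # Every byte must have been touched, so we must end at end-of-file.
    return expected == file_size
```

For example, two 100-byte requests at offsets 0 and 100 satisfy the check for a 200-byte file, while a request sequence that skips any region of the file does not.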
To test the validity of that assumption, we traced the workloads of two
different parallel file systems, on two different machines, at two different
sites, running primarily scientific applications. The tracing involved
recording every single access that was made to the parallel file system over
a period of weeks.
The two machines we traced were an Intel iPSC/860 at NASA Ames' Numerical
Aerodynamic Simulation facility and a Thinking Machines CM-5 at the National
Center for Supercomputing Applications. All parallel file access on the iPSC
was done through Intel's Concurrent File System. Parallel applications on the
CM-5 could use either the data-parallel CMF I/O library or the
control-parallel CMMD I/O library.
Our observations may be summarized as follows:
- Many parallel applications access files in small (64-256 bytes),
noncontiguous pieces.
- Within a single file, these pieces tend to be regularly sized and
spaced.
- Many parallel applications use many different files in a single run.
- There is a great deal of interprocessor sharing of files.
- There is very little interjob sharing of files.
We examined the millions of small, noncontiguous requests in greater
detail, and found that most of them appeared to be part of a regular,
higher-level pattern.
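One simple form such a regular pattern can take is a run of equally sized requests separated by a constant distance. The sketch below shows how that case might be recognized in one processor's request stream. It is an illustration under stated assumptions, not the analysis used in this study: the function name, the (offset, length) record layout, and the three-request minimum are all hypothetical choices.

```python
def find_stride(requests):
    """Return (size, stride) if the requests are equally sized and evenly
    spaced, otherwise None.

    `requests` is a list of (offset, length) pairs in issue order; this
    layout is assumed for illustration only.
    """
    if len(requests) < 3:
        return None  # too few requests to call the spacing "regular"
    sizes = {length for _, length in requests}
    if len(sizes) != 1:
        return None  # request sizes differ
    offsets = [offset for offset, _ in requests]
    gaps = {b - a for a, b in zip(offsets, offsets[1:])}
    if len(gaps) != 1:
        return None  # distance between consecutive requests varies
    return sizes.pop(), gaps.pop()
```

For instance, a processor that reads 64 bytes at offsets 0, 1024, and 2048 would be reported as having a request size of 64 and a stride of 1024. A stream whose sizes or spacing vary is reported as having no such pattern.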
The experiments, our observations, and our conclusions are examined in much
greater detail in the following papers:
Nils A. Nieuwejaar
nils@cs.dartmouth.edu
Last modified Friday, March 29, 1996