Galley, a parallel file system (1994-1998)

This project is no longer active; this page is no longer updated. Last updated Monday, September 2, 1996; reformatted May 2020.

Related projects: [Armada], [CHARISMA], [Parallel-I/O], [RAPID-Transit], [STARFISH]

Related keywords: [pario]


Summary

Large parallel computing systems, especially those used for scientific computation, consume and produce huge amounts of data. Providing both the semantics needed by parallel processes sharing a file and the throughput needed by applications working with terabytes of data requires a multiprocessor file system.

We designed and implemented Galley, a new parallel file system intended to meet the needs of parallel scientific applications. Galley demonstrated the power of a split-level interface: a low-level interface that allowed efficient data transfers (in particular, by letting the I/O nodes of a multiprocessor execute some of the file-system code), and a set of high-level interfaces, each specific to a programming language or application domain and thus convenient for the programmer.
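
A minimal sketch of that split-level idea follows. The identifiers and signatures are hypothetical stand-ins, not Galley's actual API, and the low-level call is emulated with POSIX pread so the sketch runs anywhere; in Galley the whole pattern would travel to the I/O nodes as a single request.

    /* Hypothetical sketch of the split-level interface; these names and
     * signatures are illustrative, not the real Galley API. */
    #include <stddef.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Low-level interface: one request describes a regular,
     * non-contiguous access pattern.  In Galley this would be a single
     * message to the I/O nodes; here its semantics are emulated with a
     * loop of POSIX preads. */
    static ssize_t gfs_read_strided(int fid, void *buf, off_t offset,
                                    size_t rec_size, off_t stride,
                                    int quantity)
    {
        char *p = buf;
        ssize_t total = 0;
        for (int i = 0; i < quantity; i++) {
            ssize_t n = pread(fid, p, rec_size, offset + i * stride);
            if (n < 0)
                return -1;
            total += n;
            p += rec_size;
        }
        return total;
    }

    /* High-level, application-domain interface layered on top: fetch
     * row r of a column-major nrows-by-ncols matrix of doubles with
     * one request instead of ncols separate seek+read pairs. */
    ssize_t read_matrix_row(int fid, double *row, int nrows, int ncols, int r)
    {
        return gfs_read_strided(fid, row,
                                (off_t)r * sizeof(double),     /* first record */
                                sizeof(double),                /* record size  */
                                (off_t)nrows * sizeof(double), /* stride       */
                                ncols);                        /* # of records */
    }

The point of the split is that the access pattern crosses the interface intact, so the file system, rather than the application, can decide how to schedule the underlying disk accesses.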

Galley was designed to run on networks of workstations (NOWs) or parallel supercomputers. It ran on networks of IBM RS/6000s (in the "FLEET" lab at Dartmouth) and on the IBM SP-2 parallel supercomputer (specifically, "Babbage" at NASA Ames Research Center).

Galley was implemented using a model of distinct Compute Processors (CPs) and I/O Processors (IOPs). That is, the nodes in the network or parallel machine are partitioned into two sets: one that runs users' applications and one that runs Galley's I/O servers. Even though most NOWs, and some parallel machines, had disks on each node, Galley used the distinct CP/IOP model because it led to more predictable performance and reduced the performance impact that one user's application could have on another's.

To use the Galley Parallel File System, a user's application must be linked with the Galley runtime library. The runtime library establishes connections with the system's I/O servers and handles all communication between the client's code and the servers. When an application calls a Galley routine, the runtime library converts the request into an internal format and passes it to the I/O servers via message passing. The runtime library then controls the flow of data from the IOP to the application's address space, or vice versa.
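
The flow just described might look roughly like the following client-side stub. This is a sketch only: the request layout and the gfs_send/gfs_recv transport functions are invented for illustration, standing in for Galley's internal format and the machine's message-passing layer.

    /* Hypothetical client-side stub path; layout and transport names
     * are stand-ins, not Galley's actual internals. */
    #include <stdint.h>
    #include <string.h>

    enum gfs_op { GFS_OP_READ = 1, GFS_OP_WRITE = 2 };

    /* Internal request record the runtime library builds from an
     * application call before shipping it to an I/O server. */
    struct gfs_request {
        uint32_t op;        /* GFS_OP_READ or GFS_OP_WRITE   */
        uint32_t fid;       /* file handle on the I/O server */
        uint64_t offset;    /* starting byte offset          */
        uint64_t length;    /* bytes to transfer             */
    };

    /* Transport supplied by the runtime library over the machine's
     * message-passing layer; declarations only in this sketch. */
    int gfs_send(int iop, const void *msg, size_t len);
    int gfs_recv(int iop, void *buf, size_t len);

    /* Client stub: pack the request, ship it to the IOP, then move the
     * returned data into the application's address space. */
    long gfs_read(int iop, uint32_t fid, uint64_t offset,
                  void *buf, uint64_t length)
    {
        struct gfs_request req;
        memset(&req, 0, sizeof req);
        req.op = GFS_OP_READ;
        req.fid = fid;
        req.offset = offset;
        req.length = length;

        if (gfs_send(iop, &req, sizeof req) != 0)
            return -1;
        if (gfs_recv(iop, buf, (size_t)length) != 0)  /* data flows back */
            return -1;
        return (long)length;
    }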

Galley made no pretense of being anything but an experimental file system, and it was missing many of the features that would be required in a 'production' file system. Most importantly, Galley did not provide any sort of security, nor any support for recovery from IOP crashes (e.g., a tool like fsck).

Source code

You may download the full source code for Galley. The code isn't as clean or as fully commented as I would have liked, but I don't have time to really do a good job on it.

I'm not planning to provide any support (unfortunately, this includes bugfixes), partly because I believe the non-competition clause of my new employment agreement will prohibit it, and partly because I don't expect to have time.

Finally, if you find this code useful, please let me know. Nils A. Nieuwejaar - September 2, 1996

People

Nils Nieuwejaar and David Kotz, with Matthew Carter, Sanjay Khanna, and Joel Thomas.

Funding and acknowledgements

Galley research was funded by the US National Science Foundation under award number CCR-9404919, and by NASA under agreement numbers NCC 2-849 and NAG 2-936.

The views and conclusions contained on this site and in its documents are those of the authors and should not be interpreted as necessarily representing the official position or policies, either expressed or implied, of the sponsor(s). Any mention of specific companies or products does not imply any endorsement by the authors or by the sponsor(s).


Papers (tagged 'galley')

[Also available in BibTeX]

Papers are listed in reverse-chronological order, with the abstract below each entry; follow a [Details] link for full information and a pdf.

1998:
Matthew P. Carter and David Kotz. An Implementation of the Vesta Parallel File System API on the Galley Parallel File System. Technical Report, April 1998. [Details]

To demonstrate the flexibility of the Galley parallel file system and to analyze the efficiency and flexibility of the Vesta parallel file system interface, we implemented Vesta’s application-programming interface on top of Galley. We implemented the Vesta interface using Galley’s file-access methods, whose design arose from extensive testing and characterization of the I/O requirements of scientific applications for high-performance multiprocessors. We used a parallel CPU, parallel I/O, out-of-core matrix-multiplication application to test the Vesta interface in both its ability to specify data access patterns and in its run-time efficiency. In spite of its powerful ability to specify the distribution of regular, non-overlapping data access patterns across disks, we found that the Vesta interface has some significant limitations. We discuss these limitations in detail in the paper, along with the performance results.

1997:
Nils Nieuwejaar and David Kotz. The Galley Parallel File System. Parallel Computing. June 1997. [Details]

Most current multiprocessor file systems are designed to use multiple disks in parallel, using the high aggregate bandwidth to meet the growing I/O requirements of parallel scientific applications. Many multiprocessor file systems provide applications with a conventional Unix-like interface, allowing the application to access multiple disks transparently. This interface conceals the parallelism within the file system, increasing the ease of programmability, but making it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. In addition to providing an insufficient interface, most current multiprocessor file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic scientific multiprocessor workloads. We discuss Galley’s file structure and application interface, as well as the performance advantages offered by that interface.

Sanjay Khanna and David Kotz. A Split-Phase Interface for Parallel File Systems. Technical Report, March 1997. [Details]

We describe the effects of a new user-level library for the Galley Parallel File System. This library allows some pre-existing sequential programs to make use of the Galley Parallel File System with minimal modification. It permits programs to efficiently use the parallel file system because the user-level library groups accesses together. We examine the performance of our library, and we show how code needs to be modified to use the library.
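
As a rough illustration of the split-phase style (the names below are hypothetical, not this library's actual interface): each operation is divided into an initiation half and a completion half, which lets the library group outstanding requests and lets the application overlap I/O with computation.

    /* Sketch of a split-phase read interface; illustrative names only. */
    #include <stddef.h>

    typedef int gfs_handle;                 /* one outstanding request  */

    gfs_handle gfs_read_begin(int fid, void *buf, size_t n, long offset);
    long       gfs_read_end(gfs_handle h);  /* block until data arrives */

    void compute_on(double *block);         /* stand-in application work */

    void overlap_example(int fid, double *a, double *b)
    {
        /* Issue both reads up front; the library may group them into a
         * single message per I/O server. */
        gfs_handle ha = gfs_read_begin(fid, a, 4096, 0);
        gfs_handle hb = gfs_read_begin(fid, b, 4096, 8192);

        gfs_read_end(ha);                   /* wait only when needed    */
        compute_on(a);                      /* overlaps the second read */
        gfs_read_end(hb);
        compute_on(b);
    }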

1996:
Nils Nieuwejaar and David Kotz. Low-level Interfaces for High-level Parallel I/O. Input/Output in Parallel and Distributed Computer Systems. 1996. [Details]

As the I/O needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. By tracing all the activity of a parallel file system in a production, scientific computing environment, we show that many applications exhibit highly regular, but non-consecutive I/O access patterns. Since the conventional interface does not provide an efficient method of describing these patterns, we present three extensions to the interface that support strided, nested-strided, and nested-batched I/O requests. We show how these extensions can be used to express common access patterns.
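
The nested-strided requests mentioned above generalize a simple stride to multiple levels. A minimal emulation of the idea (the descriptor layout and names are invented here, and the request is served by local preads rather than by I/O servers) might look like:

    /* Sketch of a nested-strided request; illustrative, not the
     * interface defined in the paper. */
    #include <stdint.h>
    #include <unistd.h>

    struct stride_level {
        int64_t stride;  /* bytes between pieces at this level */
        int32_t count;   /* number of pieces at this level     */
    };

    /* levels[0] is the outermost loop; the innermost level moves
     * records of rec_size bytes.  Example: every fourth double from
     * every other row of a row-major 100x100 matrix of doubles is
     *     struct stride_level lv[2] = { { 2*100*8, 50 }, { 4*8, 25 } };
     *     gfs_read_nested(fid, buf, 0, 8, lv, 2);
     * i.e., one request in place of 1250 small reads. */
    static long gfs_read_nested(int fid, char *buf, int64_t offset,
                                int32_t rec_size,
                                const struct stride_level *lv, int nlevels)
    {
        if (nlevels == 0)
            return pread(fid, buf, rec_size, offset);   /* one record */
        long total = 0;
        for (int i = 0; i < lv[0].count; i++) {
            long n = gfs_read_nested(fid, buf + total,
                                     offset + i * lv[0].stride,
                                     rec_size, lv + 1, nlevels - 1);
            if (n < 0)
                return -1;
            total += n;
        }
        return total;
    }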

Nils A. Nieuwejaar. Galley: A New Parallel File System for Parallel Applications. PhD thesis, November 1996. Available as Dartmouth Computer Science Technical Report PCS-TR96-300. [Details]

Most current multiprocessor file systems are designed to use multiple disks in parallel, using the high aggregate bandwidth to meet the growing I/O requirements of parallel scientific applications. Most multiprocessor file systems provide applications with a conventional Unix-like interface, allowing the application to access those multiple disks transparently. This interface conceals the parallelism within the file system, increasing the ease of programmability, but making it difficult or impossible for sophisticated application and library programmers to use knowledge about their I/O to exploit that parallelism. In addition to providing an insufficient interface, most current multiprocessor file systems are optimized for a different workload than they are being asked to support.

In this work we examine current multiprocessor file systems, as well as how those file systems are used by scientific applications. Contrary to the expectations of the designers of current parallel file systems, the workloads on those systems are dominated by requests to read and write small pieces of data. Furthermore, rather than being accessed sequentially and contiguously, as in uniprocessor and supercomputer workloads, files in multiprocessor file systems are accessed in regular, structured, but non-contiguous patterns.

Based on our observations of multiprocessor workloads, we have designed Galley, a new parallel file system that is intended to efficiently support realistic scientific multiprocessor workloads. In this work, we introduce Galley and discuss its design and implementation. We describe Galley’s new three-dimensional file structure and discuss how that structure can be used by parallel applications to achieve higher performance. We introduce several new data-access interfaces, which allow applications to explicitly describe the regular access patterns we found to be common in parallel file system workloads. We show how these new interfaces allow parallel applications to achieve tremendous increases in I/O performance. Finally, we discuss how Galley’s new file structure and data-access interfaces can be useful in practice.
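
As a rough sketch of that three-dimensional structure (the identifiers below are hypothetical stand-ins for the real interface): a Galley file is a set of subfiles, typically one per IOP, and each subfile holds one or more named forks, so an application that knows its data layout can address each disk's portion directly rather than seeing one flat byte stream.

    /* Hypothetical naming sketch of Galley's file/subfile/fork
     * structure; not the actual Galley calls. */

    /* Open the fork named fork_name within subfile 'subfile' of the
     * given file, returning a descriptor for strided reads/writes. */
    int gfs_open_fork(const char *file_name, int subfile,
                      const char *fork_name);

    /* e.g., each compute process opens the "data" fork of the subfile
     * on the IOP that holds its partition of a distributed array. */
    int open_my_partition(int my_cp, int num_iops)
    {
        return gfs_open_fork("/galley/matrix", my_cp % num_iops, "data");
    }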


David Kotz and Nils Nieuwejaar. Flexibility and Performance of Parallel File Systems. Proceedings of the International Conference of the Austrian Center for Parallel Computation (ACPC). September 1996. Invited paper. [Details]

As we gain experience with parallel file systems, it becomes increasingly clear that a single solution does not suit all applications. For example, it appears to be impossible to find a single appropriate interface, caching policy, file structure, or disk-management strategy. Furthermore, the proliferation of file-system interfaces and abstractions makes applications difficult to port.

We propose that the traditional functionality of parallel file systems be separated into two components: a fixed core that is standard on all platforms, encapsulating only primitive abstractions and interfaces, and a set of high-level libraries to provide a variety of abstractions and application-programmer interfaces (APIs).

We present our current and next-generation file systems as examples of this structure. Their features, such as a three-dimensional file structure, strided read and write interfaces, and I/O-node programs, are specifically designed with the flexibility and performance necessary to support a wide range of applications.


Joel T. Thomas. The Panda Array I/O Library on the Galley Parallel File System. Technical Report, June 1996. Available as Dartmouth Computer Science Technical Report PCS-TR96-288. [Details]

The Panda Array I/O library, created at the University of Illinois, Urbana-Champaign, was built especially to address the needs of high-performance scientific applications. I/O has been one of the most frustrating bottlenecks to high performance for quite some time, and the Panda project is an attempt to ameliorate this problem while still providing the user with a simple, high-level interface. The Galley File System, with its hierarchical structure of files and strided requests, is another attempt at addressing the performance problem. My project was to redesign the Panda Array library for use on the Galley file system. This project involved porting Panda's three main functions: a checkpoint function for writing a large array periodically for 'safekeeping,' a restart function that would allow a checkpointed file to be read back in, and finally a timestep function that would allow the user to write a group of large arrays several times in a sequence. Panda supports several different distributions in both the compute-node memories and I/O-node disks.

We have found that the Galley File System provides a good environment on which to build high-performance libraries, and that the mesh of Panda and Galley was a successful combination.


Nils Nieuwejaar and David Kotz. Performance of the Galley Parallel File System. Proceedings of the Workshop on Input/Output in Parallel and Distributed Systems (IOPADS). May 1996. [Details]

As the I/O needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. This interface conceals the parallelism within the file system, which increases the ease of programmability, but makes it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. Furthermore, most current parallel file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic parallel workloads. Initial experiments, reported in this paper, indicate that Galley is capable of providing high-performance I/O to applications that access data in patterns that have been observed to be common.

Nils Nieuwejaar and David Kotz. The Galley Parallel File System. Proceedings of the ACM International Conference on Supercomputing (ICS). May 1996. [Details]

As the I/O needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. This interface conceals the parallelism within the file system, which increases the ease of programmability, but makes it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. Furthermore, most current parallel file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic parallel workloads. We discuss Galley’s file structure and application interface, as well as an application that has been implemented using that interface.

Nils Nieuwejaar and David Kotz. The Galley Parallel File System. Technical Report, May 1996. [Details]

Most current multiprocessor file systems are designed to use multiple disks in parallel, using the high aggregate bandwidth to meet the growing I/O requirements of parallel scientific applications. Many multiprocessor file systems provide applications with a conventional Unix-like interface, allowing the application to access multiple disks transparently. This interface conceals the parallelism within the file system, increasing the ease of programmability, but making it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. In addition to providing an insufficient interface, most current multiprocessor file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic scientific multiprocessor workloads. We discuss Galley’s file structure and application interface, as well as the performance advantages offered by that interface.

David Kotz and Nils Nieuwejaar. Flexibility and Performance of Parallel File Systems. ACM Operating Systems Review. April 1996. [Details]

Many scientific applications for high-performance multiprocessors have tremendous I/O requirements. As a result, the I/O system is often the limiting factor of application performance. Several new parallel file systems have been developed in recent years, each promising better performance for some class of parallel applications. As we gain experience with parallel computing, and parallel file systems in particular, it becomes increasingly clear that a single solution does not suit all applications. For example, it appears to be impossible to find a single appropriate interface, caching policy, file structure, or disk management strategy. Furthermore, the proliferation of file-system interfaces and abstractions makes application portability a significant problem.

We propose that the traditional functionality of parallel file systems be separated into two components: a fixed core that is standard on all platforms, encapsulating only primitive abstractions and interfaces, and a set of high-level libraries to provide a variety of abstractions and application-programmer interfaces (APIs). We think of this approach as the “RISC” of parallel file-system design.

We present our current and next-generation file systems as examples of this structure. Their features, such as a three-dimensional file structure, strided read and write interfaces, and I/O-node programs, are specifically designed with the flexibility and performance necessary to support a wide range of applications.


1995:
Nils Nieuwejaar and David Kotz. Low-level Interfaces for High-level Parallel I/O. Proceedings of the IPPS Workshop on Input/Output in Parallel and Distributed Systems (IOPADS). April 1995. [Details]

As the I/O needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. By tracing all the activity of a parallel file system in a production, scientific computing environment, we show that many applications exhibit highly regular, but non-consecutive I/O access patterns. Since the conventional interface does not provide an efficient method of describing these patterns, we present three extensions to the interface that support strided, nested-strided, and nested-batched I/O requests. We show how these extensions can be used to express common access patterns.

Nils Nieuwejaar and David Kotz. Low-level Interfaces for High-level Parallel I/O. Technical Report, March 1995. Revised 4/18/95 and appeared in IOPADS workshop at IPPS’95. [Details]

As the I/O needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. By tracing all the activity of a parallel file system in a production, scientific computing environment, we show that many applications exhibit highly regular, but non-consecutive I/O access patterns. Since the conventional interface does not provide an efficient method of describing these patterns, we present three extensions to the interface that support strided, nested-strided, and nested-batched I/O requests. We show how these extensions can be used to express common access patterns.

1994:
Nils Nieuwejaar and David Kotz. A Multiprocessor Extension to the Conventional File System Interface. Technical Report, September 1994. [Details]

As the I/O needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. By tracing all the activity of a parallel file system in a production, scientific computing environment, we show that many applications exhibit highly regular, but non-consecutive I/O access patterns. Since the conventional interface does not provide an efficient method of describing these patterns, we present an extension which supports strided and nested-strided I/O requests.

