Galley, a parallel file system (1994-1998)

This project is no longer active; this page is no longer updated. Last updated Monday, September 2, 1996; reformatted May 2020.

Related projects: [Armada], [CHARISMA], [Parallel-I/O], [RAPID-Transit], [STARFISH]

Related keywords: [pario]


Summary

Large parallel computing systems, especially those used for scientific computation, consume and produce huge amounts of data. Providing both the semantics needed by parallel processes sharing a file and the throughput needed by applications working with terabytes of data requires a multiprocessor file system.

We designed and implemented Galley, a new parallel file system intended to meet the needs of parallel scientific applications. Galley demonstrated the power of a split-level interface: a low-level interface that allowed efficient data transfers (in particular, by letting the I/O nodes of a multiprocessor execute some of the file-system code), and a set of high-level interfaces, each specific to a programming language or application domain and thus convenient for the programmer.
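
A minimal sketch of that split-level idea follows. The identifiers and signatures are hypothetical stand-ins, not Galley's actual API, and the low-level call is emulated with POSIX pread so the sketch runs anywhere; in Galley the whole pattern would travel to the I/O nodes as a single request.

    /* Hypothetical sketch of the split-level interface; these names and
     * signatures are illustrative, not the real Galley API. */
    #include <stddef.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Low-level interface: one request describes a regular,
     * non-contiguous access pattern.  In Galley this would be a single
     * message to the I/O nodes; here its semantics are emulated with a
     * loop of POSIX preads. */
    static ssize_t gfs_read_strided(int fid, void *buf, off_t offset,
                                    size_t rec_size, off_t stride,
                                    int quantity)
    {
        char *p = buf;
        ssize_t total = 0;
        for (int i = 0; i < quantity; i++) {
            ssize_t n = pread(fid, p, rec_size, offset + i * stride);
            if (n < 0)
                return -1;
            total += n;
            p += rec_size;
        }
        return total;
    }

    /* High-level, application-domain interface layered on top: fetch
     * row r of a column-major nrows-by-ncols matrix of doubles with
     * one request instead of ncols separate seek+read pairs. */
    ssize_t read_matrix_row(int fid, double *row, int nrows, int ncols, int r)
    {
        return gfs_read_strided(fid, row,
                                (off_t)r * sizeof(double),     /* first record */
                                sizeof(double),                /* record size  */
                                (off_t)nrows * sizeof(double), /* stride       */
                                ncols);                        /* # of records */
    }

The point of the split is that the access pattern crosses the interface intact, so the file system, rather than the application, can decide how to schedule the underlying disk accesses.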

Galley was designed to run on networks of workstations (NOWs) or parallel supercomputers. It ran on networks of IBM RS/6000s (in the "FLEET" lab at Dartmouth) and on the IBM SP-2 parallel supercomputer (specifically, "Babbage" at NASA Ames Research Center).

Galley was implemented using a model of distinct Compute Processors (CPs) and I/O Processors (IOPs). That is, the nodes in the network or parallel machine are partitioned into two sets: one that runs users' applications and one that runs Galley's I/O servers. Even though most NOWs, and some parallel machines, had disks on each node, Galley used the distinct CP/IOP model because it led to more predictable performance and reduced the performance impact that one user's application could have on another's.

To use the Galley Parallel File System, a user's application must be linked with the Galley runtime library. The runtime library establishes connections with the system's I/O servers and handles all communication between the client's code and the servers. When an application calls a Galley routine, the runtime library converts the request into an internal format and passes it to the I/O servers via message passing. The runtime library then controls the flow of data from the IOP to the application's address space, or vice versa.
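
The flow just described might look roughly like the following client-side stub. This is a sketch only: the request layout and the gfs_send/gfs_recv transport functions are invented for illustration, standing in for Galley's internal format and the machine's message-passing layer.

    /* Hypothetical client-side stub path; layout and transport names
     * are stand-ins, not Galley's actual internals. */
    #include <stdint.h>
    #include <string.h>

    enum gfs_op { GFS_OP_READ = 1, GFS_OP_WRITE = 2 };

    /* Internal request record the runtime library builds from an
     * application call before shipping it to an I/O server. */
    struct gfs_request {
        uint32_t op;        /* GFS_OP_READ or GFS_OP_WRITE   */
        uint32_t fid;       /* file handle on the I/O server */
        uint64_t offset;    /* starting byte offset          */
        uint64_t length;    /* bytes to transfer             */
    };

    /* Transport supplied by the runtime library over the machine's
     * message-passing layer; declarations only in this sketch. */
    int gfs_send(int iop, const void *msg, size_t len);
    int gfs_recv(int iop, void *buf, size_t len);

    /* Client stub: pack the request, ship it to the IOP, then move the
     * returned data into the application's address space. */
    long gfs_read(int iop, uint32_t fid, uint64_t offset,
                  void *buf, uint64_t length)
    {
        struct gfs_request req;
        memset(&req, 0, sizeof req);
        req.op = GFS_OP_READ;
        req.fid = fid;
        req.offset = offset;
        req.length = length;

        if (gfs_send(iop, &req, sizeof req) != 0)
            return -1;
        if (gfs_recv(iop, buf, (size_t)length) != 0)  /* data flows back */
            return -1;
        return (long)length;
    }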

Galley made no pretense of being anything but an experimental file system, and it was missing many of the features that would be required in a 'production' file system. Most importantly, Galley did not provide any sort of security, nor any support for recovery from IOP crashes (e.g., a tool like fsck).

Source code

You may download the full source code for Galley. The code isn't as clean or as fully commented as I would have liked, but I don't have time to really do a good job on it.

I'm not planning to provide any support (unfortunately, this includes bugfixes), partly because I believe the non-competition clause of my new employment agreement will prohibit it, and partly because I don't expect to have time.

Finally, if you find this code useful, please let me know. Nils A. Nieuwejaar - September 2, 1996

People

Nils Nieuwejaar and David Kotz, with Matthew Carter, Sanjay Khanna, and Joel Thomas.

Funding and acknowledgements

Galley research was funded by the US National Science Foundation under award number CCR-9404919, and by NASA under agreement numbers NCC 2-849 and NAG 2-936.

The views and conclusions contained on this site and in its documents are those of the authors and should not be interpreted as necessarily representing the official position or policies, either expressed or implied, of the sponsor(s). Any mention of specific companies or products does not imply any endorsement by the authors or by the sponsor(s).


Papers (tagged 'galley')

[Also available in BibTeX]

Papers are listed in reverse-chronological order, with the abstract below each entry; follow a [Details] link for full information and a pdf.

1998:
Matthew P. Carter and David Kotz. An Implementation of the Vesta Parallel File System API on the Galley Parallel File System. Technical Report, April 1998. [Details]

To demonstrate the flexibility of the Galley parallel file system and to analyze the efficiency and flexibility of the Vesta parallel file system interface, we implemented Vesta’s application-programming interface on top of Galley. We implemented the Vesta interface using Galley’s file-access methods, whose design arose from extensive testing and characterization of the I/O requirements of scientific applications for high-performance multiprocessors. We used a parallel CPU, parallel I/O, out-of-core matrix-multiplication application to test the Vesta interface in both its ability to specify data access patterns and in its run-time efficiency. In spite of its powerful ability to specify the distribution of regular, non-overlapping data access patterns across disks, we found that the Vesta interface has some significant limitations. We discuss these limitations in detail in the paper, along with the performance results.

1997:
Nils Nieuwejaar and David Kotz. The Galley Parallel File System. Parallel Computing. June 1997. [Details]

Most current multiprocessor file systems are designed to use multiple disks in parallel, using the high aggregate bandwidth to meet the growing I/O requirements of parallel scientific applications. Many multiprocessor file systems provide applications with a conventional Unix-like interface, allowing the application to access multiple disks transparently. This interface conceals the parallelism within the file system, increasing the ease of programmability, but making it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. In addition to providing an insufficient interface, most current multiprocessor file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic scientific multiprocessor workloads. We discuss Galley’s file structure and application interface, as well as the performance advantages offered by that interface.

Sanjay Khanna and David Kotz. A Split-Phase Interface for Parallel File Systems. Technical Report, March 1997. [Details]

We describe the effects of a new user-level library for the Galley Parallel File System. This library allows some pre-existing sequential programs to make use of the Galley Parallel File System with minimal modification. It permits programs to efficiently use the parallel file system because the user-level library groups accesses together. We examine the performance of our library, and we show how code needs to be modified to use the library.
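
As a rough illustration of the split-phase style (the names below are hypothetical, not this library's actual interface): each operation is divided into an initiation half and a completion half, which lets the library group outstanding requests and lets the application overlap I/O with computation.

    /* Sketch of a split-phase read interface; illustrative names only. */
    #include <stddef.h>

    typedef int gfs_handle;                 /* one outstanding request  */

    gfs_handle gfs_read_begin(int fid, void *buf, size_t n, long offset);
    long       gfs_read_end(gfs_handle h);  /* block until data arrives */

    void compute_on(double *block);         /* stand-in application work */

    void overlap_example(int fid, double *a, double *b)
    {
        /* Issue both reads up front; the library may group them into a
         * single message per I/O server. */
        gfs_handle ha = gfs_read_begin(fid, a, 4096, 0);
        gfs_handle hb = gfs_read_begin(fid, b, 4096, 8192);

        gfs_read_end(ha);                   /* wait only when needed    */
        compute_on(a);                      /* overlaps the second read */
        gfs_read_end(hb);
        compute_on(b);
    }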

1996:
Nils Nieuwejaar and David Kotz. Low-level Interfaces for High-level Parallel I/O. Input/Output in Parallel and Distributed Computer Systems. 1996. [Details]

As the I/O needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. By tracing all the activity of a parallel file system in a production, scientific computing environment, we show that many applications exhibit highly regular, but non-consecutive I/O access patterns. Since the conventional interface does not provide an efficient method of describing these patterns, we present three extensions to the interface that support strided, nested-strided, and nested-batched I/O requests. We show how these extensions can be used to express common access patterns.
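
The nested-strided requests mentioned above generalize a simple stride to multiple levels. A minimal emulation of the idea (the descriptor layout and names are invented here, and the request is served by local preads rather than by I/O servers) might look like:

    /* Sketch of a nested-strided request; illustrative, not the
     * interface defined in the paper. */
    #include <stdint.h>
    #include <unistd.h>

    struct stride_level {
        int64_t stride;  /* bytes between pieces at this level */
        int32_t count;   /* number of pieces at this level     */
    };

    /* levels[0] is the outermost loop; the innermost level moves
     * records of rec_size bytes.  Example: every fourth double from
     * every other row of a row-major 100x100 matrix of doubles is
     *     struct stride_level lv[2] = { { 2*100*8, 50 }, { 4*8, 25 } };
     *     gfs_read_nested(fid, buf, 0, 8, lv, 2);
     * i.e., one request in place of 1250 small reads. */
    static long gfs_read_nested(int fid, char *buf, int64_t offset,
                                int32_t rec_size,
                                const struct stride_level *lv, int nlevels)
    {
        if (nlevels == 0)
            return pread(fid, buf, rec_size, offset);   /* one record */
        long total = 0;
        for (int i = 0; i < lv[0].count; i++) {
            long n = gfs_read_nested(fid, buf + total,
                                     offset + i * lv[0].stride,
                                     rec_size, lv + 1, nlevels - 1);
            if (n < 0)
                return -1;
            total += n;
        }
        return total;
    }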

Nils A. Nieuwejaar. Galley: A New Parallel File System for Parallel Applications. PhD thesis, November 1996. Available as Dartmouth Computer Science Technical Report PCS-TR96-300. [Details]

Most current multiprocessor file systems are designed to use multiple disks in parallel, using the high aggregate bandwidth to meet the growing I/O requirements of parallel scientific applications. Most multiprocessor file systems provide applications with a conventional Unix-like interface, allowing the application to access those multiple disks transparently. This interface conceals the parallelism within the file system, increasing the ease of programmability, but making it difficult or impossible for sophisticated application and library programmers to use knowledge about their I/O to exploit that parallelism. In addition to providing an insufficient interface, most current multiprocessor file systems are optimized for a different workload than they are being asked to support.

In this work we examine current multiprocessor file systems, as well as how those file systems are used by scientific applications. Contrary to the expectations of the designers of current parallel file systems, the workloads on those systems are dominated by requests to read and write small pieces of data. Furthermore, rather than being accessed sequentially and contiguously, as in uniprocessor and supercomputer workloads, files in multiprocessor file systems are accessed in regular, structured, but non-contiguous patterns.

Based on our observations of multiprocessor workloads, we have designed Galley, a new parallel file system that is intended to efficiently support realistic scientific multiprocessor workloads. In this work, we introduce Galley and discuss its design and implementation. We describe Galley’s new three-dimensional file structure and discuss how that structure can be used by parallel applications to achieve higher performance. We introduce several new data-access interfaces, which allow applications to explicitly describe the regular access patterns we found to be common in parallel file system workloads. We show how these new interfaces allow parallel applications to achieve tremendous increases in I/O performance. Finally, we discuss how Galley’s new file structure and data-access interfaces can be useful in practice.
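
As a rough sketch of that three-dimensional structure (the identifiers below are hypothetical stand-ins for the real interface): a Galley file is a set of subfiles, typically one per IOP, and each subfile holds one or more named forks, so an application that knows its data layout can address each disk's portion directly rather than seeing one flat byte stream.

    /* Hypothetical naming sketch of Galley's file/subfile/fork
     * structure; not the actual Galley calls. */

    /* Open the fork named fork_name within subfile 'subfile' of the
     * given file, returning a descriptor for strided reads/writes. */
    int gfs_open_fork(const char *file_name, int subfile,
                      const char *fork_name);

    /* e.g., each compute process opens the "data" fork of the subfile
     * on the IOP that holds its partition of a distributed array. */
    int open_my_partition(int my_cp, int num_iops)
    {
        return gfs_open_fork("/galley/matrix", my_cp % num_iops, "data");
    }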


David Kotz and Nils Nieuwejaar. Flexibility and Performance of Parallel File Systems. Proceedings of the International Conference of the Austrian Center for Parallel Computation (ACPC). September 1996. Invited paper. [Details]

As we gain experience with parallel file systems, it becomes increasingly clear that a single solution does not suit all applications. For example, it appears to be impossible to find a single appropriate interface, caching policy, file structure, or disk-management strategy. Furthermore, the proliferation of file-system interfaces and abstractions makes applications difficult to port.

We propose that the traditional functionality of parallel file systems be separated into two components: a fixed core that is standard on all platforms, encapsulating only primitive abstractions and interfaces, and a set of high-level libraries to provide a variety of abstractions and application-programmer interfaces (APIs).

We present our current and next-generation file systems as examples of this structure. Their features, such as a three-dimensional file structure, strided read and write interfaces, and I/O-node programs, are specifically designed with the flexibility and performance necessary to support a wide range of applications.


Joel T. Thomas. The Panda Array I/O Library on the Galley Parallel File System. Technical Report, June 1996. Available as Dartmouth Computer Science Technical Report PCS-TR96-288. [Details]

The Panda Array I/O library, created at the University of Illinois, Urbana-Champaign, was built especially to address the needs of high-performance scientific applications. I/O has been one of the most frustrating bottlenecks to high performance for quite some time, and the Panda project is an attempt to ameliorate this problem while still providing the user with a simple, high-level interface. The Galley File System, with its hierarchical structure of files and strided requests, is another attempt at addressing the performance problem. My project was to redesign the Panda Array library for use on the Galley file system. This project involved porting Panda's three main functions: a checkpoint function for writing a large array periodically for 'safekeeping,' a restart function that would allow a checkpointed file to be read back in, and finally a timestep function that would allow the user to write a group of large arrays several times in a sequence. Panda supports several different distributions in both the compute-node memories and I/O-node disks.

We have found that the Galley File System provides a good environment on which to build high-performance libraries, and that the mesh of Panda and Galley was a successful combination.


Nils Nieuwejaar and David Kotz. Performance of the Galley Parallel File System. Proceedings of the Workshop on Input/Output in Parallel and Distributed Systems (IOPADS). May 1996. [Details]

As the I/O needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. This interface conceals the parallelism within the file system, which increases the ease of programmability, but makes it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. Furthermore, most current parallel file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic parallel workloads. Initial experiments, reported in this paper, indicate that Galley is capable of providing high-performance I/O to applications that access data in patterns that have been observed to be common.

Nils Nieuwejaar and David Kotz. The Galley Parallel File System. Proceedings of the ACM International Conference on Supercomputing (ICS). May 1996. [Details]

As the I/O needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. This interface conceals the parallelism within the file system, which increases the ease of programmability, but makes it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. Furthermore, most current parallel file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic parallel workloads. We discuss Galley’s file structure and application interface, as well as an application that has been implemented using that interface.

Nils Nieuwejaar and David Kotz. The Galley Parallel File System. Technical Report, May 1996. [Details]

Most current multiprocessor file systems are designed to use multiple disks in parallel, using the high aggregate bandwidth to meet the growing I/O requirements of parallel scientific applications. Many multiprocessor file systems provide applications with a conventional Unix-like interface, allowing the application to access multiple disks transparently. This interface conceals the parallelism within the file system, increasing the ease of programmability, but making it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. In addition to providing an insufficient interface, most current multiprocessor file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic scientific multiprocessor workloads. We discuss Galley’s file structure and application interface, as well as the performance advantages offered by that interface.

David Kotz and Nils Nieuwejaar. Flexibility and Performance of Parallel File Systems. ACM Operating Systems Review. April 1996. [Details]

Many scientific applications for high-performance multiprocessors have tremendous I/O requirements. As a result, the I/O system is often the limiting factor of application performance. Several new parallel file systems have been developed in recent years, each promising better performance for some class of parallel applications. As we gain experience with parallel computing, and parallel file systems in particular, it becomes increasingly clear that a single solution does not suit all applications. For example, it appears to be impossible to find a single appropriate interface, caching policy, file structure, or disk management strategy. Furthermore, the proliferation of file-system interfaces and abstractions makes application portability a significant problem.

We propose that the traditional functionality of parallel file systems be separated into two components: a fixed core that is standard on all platforms, encapsulating only primitive abstractions and interfaces, and a set of high-level libraries to provide a variety of abstractions and application-programmer interfaces (APIs). We think of this approach as the “RISC” of parallel file-system design.

We present our current and next-generation file systems as examples of this structure. Their features, such as a three-dimensional file structure, strided read and write interfaces, and I/O-node programs, are specifically designed with the flexibility and performance necessary to support a wide range of applications.


1995:
Nils Nieuwejaar and David Kotz. Low-level Interfaces for High-level Parallel I/O. Proceedings of the IPPS Workshop on Input/Output in Parallel and Distributed Systems (IOPADS). April 1995. [Details]

As the I/O needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. By tracing all the activity of a parallel file system in a production, scientific computing environment, we show that many applications exhibit highly regular, but non-consecutive I/O access patterns. Since the conventional interface does not provide an efficient method of describing these patterns, we present three extensions to the interface that support strided, nested-strided, and nested-batched I/O requests. We show how these extensions can be used to express common access patterns.

Nils Nieuwejaar and David Kotz. Low-level Interfaces for High-level Parallel I/O. Technical Report, March 1995. Revised 4/18/95 and appeared in IOPADS workshop at IPPS’95. [Details]

As the I/O needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. By tracing all the activity of a parallel file system in a production, scientific computing environment, we show that many applications exhibit highly regular, but non-consecutive I/O access patterns. Since the conventional interface does not provide an efficient method of describing these patterns, we present three extensions to the interface that support strided, nested-strided, and nested-batched I/O requests. We show how these extensions can be used to express common access patterns.

1994:
Nils Nieuwejaar and David Kotz. A Multiprocessor Extension to the Conventional File System Interface. Technical Report, September 1994. [Details]

As the I/O needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. By tracing all the activity of a parallel file system in a production, scientific computing environment, we show that many applications exhibit highly regular, but non-consecutive I/O access patterns. Since the conventional interface does not provide an efficient method of describing these patterns, we present an extension which supports strided and nested-strided I/O requests.

