This project is no longer active; this page is no longer updated.
Related projects: [Armada], [CHARISMA], [Galley], [RAPID-Transit], [STARFISH]
Related keywords: [pario], [software], [survey]
Large parallel computing systems, especially those used for scientific computation, consume and produce huge amounts of data. Providing the necessary semantics for parallel processes accessing a file, and the necessary throughput for an application working with terabytes of data, requires a multiprocessor file system.
Most of our research is described under one of the related projects: Armada, CHARISMA, Galley, RAPID-Transit, and STARFISH.
We created an on-line bibliography of parallel I/O papers. This annotated bibliography helped many new researchers learn about and begin research in the area of parallel I/O.
We also created a detailed simulation model of the HP 97560 disk drive; see the [technical report] and the [software].
We created the 'Parallel I/O Archive', gathering material related to parallel I/O, including the above-mentioned bibliography of parallel I/O papers. That archive has not been updated in over a decade, but is reproduced here for reference.
The papers reported here were created as part of one or more other projects, each of which had its own source(s) of funding. For more information, see those project pages: Armada, CHARISMA, Galley, RAPID-Transit, and STARFISH.
The following people are co-authors of one or more of the papers listed below (see the related projects for others involved in my parallel-I/O research): Charles Catlett, Alok Choudhary, C. Connelly, Thomas H. Cormen, C. Ellis, M. Holliday, Ravi Jain, V. Khera, David Kotz, R. LaRowe, Ron Oldfield, S. Owen, Sriram Radhakrishnan, Daniel A. Reed, Marc Snir, Song Bac Toh, and Darren Erik Vengroff.
The papers here are those that did not clearly fit with one of the named projects listed above. This list is also available in BibTeX.
Papers are listed in reverse-chronological order.
Using zero(), it is possible to efficiently implement applications including a variety of databases and I/O-efficient computation systems on top of the Unix file system. zero() can also be used to implement an efficient file-system-based paging mechanism. In some I/O-efficient computations, the availability of zero() effectively doubles disk capacity by allowing blocks of temporary files to be reallocated to new files as they are read.
Experiments on a Linux ext2 file system augmented by zero() demonstrate that, where their functionality overlaps, zero() is more efficient than ftruncate(). Additional experiments reveal that, in exchange for added effective disk capacity, I/O-efficient code pays only a small performance penalty.
We have found that the Galley File System provides a good environment on which to build high-performance libraries, and that the mesh of Panda and Galley was a successful combination.