This project is no longer active; this page is no longer updated. Last updated Monday, September 2, 1996; reformatted May 2020.
Related projects: [Armada], [CHARISMA], [Parallel-I/O], [RAPID-Transit], [STARFISH]
Related keywords: [pario]
Large parallel computing systems, especially those used for scientific computation, consume and produce huge amounts of data. Providing the necessary semantics for parallel processes accessing a file, and the necessary throughput for an application working with terabytes of data, requires a multiprocessor file system.
We designed and implemented Galley, a new parallel file system intended to meet the needs of parallel scientific applications. Galley demonstrated the power of a split-level interface. At the bottom is a low-level interface that allowed efficient data transfers and, in particular, let the I/O nodes in a multiprocessor execute some of the file-system code. On top of it sits a set of high-level interfaces, each of which may be specific to a programming language or application domain and thus most convenient for the programmer.
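To illustrate the split-level idea, here is a minimal Python sketch under stated assumptions. The names `LowLevelFile` and `RecordFile` are hypothetical and are not Galley's actual API; the storage is an in-memory buffer, and the ability of I/O nodes to run file-system code is not modeled. The sketch only shows a domain-specific layer built entirely on a small byte-range interface.

```python
class LowLevelFile:
    """Hypothetical low-level interface: explicit byte-range transfers."""

    def __init__(self, size: int) -> None:
        # An in-memory buffer stands in for storage held by the I/O nodes.
        self._data = bytearray(size)

    def write(self, offset: int, payload: bytes) -> None:
        if offset < 0 or offset + len(payload) > len(self._data):
            raise ValueError("write falls outside the file")
        self._data[offset:offset + len(payload)] = payload

    def read(self, offset: int, nbytes: int) -> bytes:
        if offset < 0 or nbytes < 0 or offset + nbytes > len(self._data):
            raise ValueError("read falls outside the file")
        return bytes(self._data[offset:offset + nbytes])


class RecordFile:
    """Hypothetical high-level interface: fixed-size records for one domain."""

    def __init__(self, low: LowLevelFile, record_size: int) -> None:
        self._low = low
        self._record_size = record_size

    def put(self, index: int, record: bytes) -> None:
        if len(record) != self._record_size:
            raise ValueError("record has the wrong size")
        # The high-level call is translated into a low-level byte range.
        self._low.write(index * self._record_size, record)

    def get(self, index: int) -> bytes:
        return self._low.read(index * self._record_size, self._record_size)
```

A programmer working with records would use only `RecordFile`, while a library author targeting another language or domain could build a different layer on the same `LowLevelFile` calls.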
Galley was designed to run on networks of workstations (NOWs) or parallel supercomputers. It ran on networks of IBM RS/6000s (in the "FLEET" lab at Dartmouth) and on the IBM SP-2 parallel supercomputer (specifically, "Babbage" at NASA Ames Research Center).
Galley was implemented using a model of distinct Compute Processors and I/O Processors. That is, the nodes in the network or parallel machine are partitioned into two sets: one that runs users' applications and one that runs Galley's I/O servers. Even though most NOWs, and some parallel machines, had disks on each node, Galley used the distinct CP/IOP model since it led to more predictable performance and reduced the performance impact that one user's application may have on another.
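The partition into two disjoint node sets can be sketched as follows. This is an illustration only: the function `partition_nodes`, the node names, and the rule of taking the first nodes as IOPs are assumptions, not a description of how Galley actually assigned roles.

```python
from typing import List, Tuple


def partition_nodes(nodes: List[str], num_iops: int) -> Tuple[List[str], List[str]]:
    """Split nodes into disjoint Compute Processors (CPs) and I/O Processors (IOPs).

    Hypothetical policy: the first num_iops nodes become IOPs and the rest
    become CPs, so no node plays both roles.
    """
    if not 0 < num_iops < len(nodes):
        raise ValueError("need at least one IOP and at least one CP")
    iops = nodes[:num_iops]
    cps = nodes[num_iops:]
    return cps, iops
```

For example, `partition_nodes(["n0", "n1", "n2", "n3"], 1)` returns `["n1", "n2", "n3"]` as CPs and `["n0"]` as the single IOP. Because the two sets never overlap, one user's computation does not compete with I/O service on the same node.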
To use the Galley Parallel File System, a user's application must be linked with the Galley runtime library. The runtime library establishes connections with the system's I/O servers, and handles all communication between the client's code and the servers. When an application makes a call to a Galley routine, the runtime library converts the request into an internal format, and passes it on to the I/O servers via message passing. The runtime library then controls the flow of data from the IOP to the application's address space, or vice versa.
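The request path described above can be sketched roughly as follows. Everything here is an assumption made for illustration: the wire format `REQUEST_FORMAT`, the opcode `OP_READ`, the class `RuntimeLibrary`, and the use of Python queues and a thread in place of real message passing between nodes. It is not Galley's actual protocol or API.

```python
import queue
import struct
import threading

REQUEST_FORMAT = "!BQQ"  # hypothetical internal format: opcode, offset, length
OP_READ = 1              # hypothetical opcode


def iop_server(storage: bytes, requests: queue.Queue, replies: queue.Queue) -> None:
    """Stand-in for one I/O server: decode each request and reply with data."""
    while True:
        message = requests.get()
        if message is None:  # shutdown signal used only by this sketch
            break
        opcode, offset, length = struct.unpack(REQUEST_FORMAT, message)
        if opcode == OP_READ:
            replies.put(storage[offset:offset + length])


class RuntimeLibrary:
    """Stand-in for the client-side library linked into the application."""

    def __init__(self, requests: queue.Queue, replies: queue.Queue) -> None:
        self._requests = requests
        self._replies = replies

    def read(self, buffer: bytearray, offset: int, length: int) -> int:
        if length > len(buffer):
            raise ValueError("buffer is too small for the request")
        # Convert the call into the internal format and send it to the server.
        self._requests.put(struct.pack(REQUEST_FORMAT, OP_READ, offset, length))
        # Move the returned data into the application's buffer.
        data = self._replies.get()
        buffer[:len(data)] = data
        return len(data)


def demo() -> bytes:
    requests: queue.Queue = queue.Queue()
    replies: queue.Queue = queue.Queue()
    server = threading.Thread(
        target=iop_server,
        args=(b"galley parallel file system", requests, replies),
    )
    server.start()
    library = RuntimeLibrary(requests, replies)
    buffer = bytearray(8)
    count = library.read(buffer, offset=7, length=8)
    requests.put(None)
    server.join()
    return bytes(buffer[:count])
```

Calling `demo()` reads eight bytes starting at offset 7 from the server's storage into the caller's buffer. The application sees only the `read` call; encoding, transport, and copying into its address space stay inside the library.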
Galley made no pretense to be anything but an experimental file system, and was missing many of the features that would be required in a 'production' file system. Most importantly, Galley did not provide any sort of security or support for recovery from IOP crashes (e.g., fsck).
You may download the full source code for Galley. The code isn't as clean or as fully commented as I would have liked, but I don't have time to really do a good job on it.
I'm not planning to do any sort of support (unfortunately, this includes bugfixes), partly since I believe the non-competition clause of my new employment agreement will prohibit it, and partly because I don't expect to have time.
Finally, if you find this code useful please let me know.
Nils A. Nieuwejaar - September 2, 1996
Nils Nieuwejaar and David Kotz, with Matthew Carter, Sanjay Khanna, and Joel Thomas.
Galley research was funded by NSF under award number CCR-9404919 and by NASA under agreement numbers NCC 2-849 and NAG 2-936.
The views and conclusions contained on this site and in its documents are those of the authors and should not be interpreted as necessarily representing the official position or policies, either expressed or implied, of the sponsor(s). Any mention of specific companies or products does not imply any endorsement by the authors or by the sponsor(s).
[Also available in BibTeX]
Papers are listed in reverse-chronological order.