Dartmouth research on parallel I/O

This page dates from the early 1990s and is no longer updated. It is here for archival reference.

Many scientific applications that run on today's multiprocessors, such as weather forecasting and seismic analysis, are bottlenecked by their file-I/O needs. To alleviate this bottleneck, multiprocessors must include parallel access to numerous disks. Many questions remain about the appropriate architecture for connecting multiple disks and multiple processors for high throughput and reliability, but it is clear that a multiprocessor must be configured with many disks and service nodes to obtain high performance.

Even if the multiprocessor is configured with sufficient I/O hardware (and unfortunately many are not), software often fails to deliver the available bandwidth to the application. Several approaches could help. Applications could be rewritten to use algorithms that are designed to optimize I/O. Applications could use new interfaces that allow the application programmer to express parallel data transfers, or manipulate out-of-core data structures, without resorting to low-level Unix-like file-system calls. Applications and the workloads of current parallel file systems could be studied to understand application needs. Compilers could analyze and rearrange programs to optimize I/O. Run-time libraries could abstract and encapsulate complex data structures and I/O algorithms, presenting a cleaner abstraction to the user, perhaps one specific to a class of applications. Finally, the core file-system software could provide underlying storage abstractions, provide hooks for efficient implementation of run-time libraries, smooth data flow between processors and disks, manage buffer caches, and arbitrate the needs of concurrent applications. Much research is needed to achieve these goals.

Dartmouth research (outdated)

Dartmouth researchers were working on many of these aspects of parallel-I/O research (outdated list):
Application characterization
The CHARISMA project aims to characterize the workload of existing parallel file systems to gain a better understanding of application needs. This project, headed by David Kotz and Nils Nieuwejaar at Dartmouth, involves people from many institutions.
Algorithms
Tom Cormen, Leonard Wisniewski, Liddy Shriver, and Tom Sundquist have developed several algorithms that are asymptotically optimal in the number of parallel-I/O operations, for problems on out-of-core data sets, including sorting, permuting, FFT, and matrix multiplication.
Interfaces
David Kotz and Nils Nieuwejaar have proposed extensions to the traditional file-system interface that can be used to support common parallel-access needs of SPMD scientific applications. For data-parallel programmers, Tom Cormen has extended the C* language to support out-of-core parallel variables, forming a new language he calls ViC*.
Compilers
Tom Cormen and Alex Colvin are building the ViC* compiler, which automatically inserts I/O commands and calls to a library of optimal I/O algorithms into programs written to use out-of-core parallel variables.
Run-time libraries
Part of the ViC* compiler project is to build a run-time library that includes many of the optimal parallel-I/O algorithms, which can then be called by the programmer or by the compiler on behalf of the program. Tom Cormen is working on this with Melissa Hirschl, Anna Poplawski, and Brian Premore. Some other libraries are being developed for the Galley file system (below): one with a Vesta interface, one with a Panda interface by Joel Thomas, and SOLAR by Sivan Toledo.
Operating systems
David Kotz has investigated many file-system issues, including prefetching and caching, disk-directed I/O, interfaces, and the role of computational and I/O nodes (with Ting Cai). Nils Nieuwejaar is designing Galley, a new parallel file system based in the FLEET lab. Preston Crow is developing a file system for workstation clusters based on distributed shared memory.
Architecture
David Kotz has written a survey of parallel I/O architectures.

Our people (outdated)

Faculty
Tom Cormen
David Kotz
Tom Sundquist
Graduate students
Ting Cai
Alex Colvin
Preston Crow
Melissa Hirschl
Nils Nieuwejaar
Anna Poplawski
Brian Premore
Liddy Shriver
Len Wisniewski (now graduated and at Thinking Machines)
Undergraduate students
Matt Carter '98
Joel Thomas '96

Our funding (outdated)

We also gratefully acknowledge the assistance we received from NASA Ames Research Center, National Center for Supercomputing Applications, Sandia National Laboratories, and Thinking Machines Corporation.