Third Annual Workshop on I/O in Parallel and Distributed Systems

at IPPS in Santa Barbara, April 25, 1995

Sponsored by: IEEE Technical Committee on Parallel Processing.

In cooperation with: ACM SIGARCH.

The 1995 workshop was the third in the series; in its first two years it was entitled the "Workshop on I/O in Parallel Computer Systems". In keeping with the spirit of the change in title, the 1995 workshop encouraged broad participation from researchers involved in all aspects of solutions to the I/O bottleneck in parallel and distributed systems. In keeping with discussions held at the 1994 workshop, papers describing experimental and empirical investigations were particularly encouraged. The workshop was held on the first day of the IPPS '95 symposium.

Program co-chairs:

Program Committee:

Call for papers.

Papers:

A BibTeX file is available.

There is a book containing edited papers from IOPADS '94 and IOPADS '95, plus several survey/tutorial papers.

Peter Corbett, Dror Feitelson, Sam Fineberg, Yarsun Hsu, Bill Nitzberg, Jean-Pierre Prost, Marc Snir, Bernard Traversat, and Parkson Wong. Overview of the MPI-IO parallel I/O interface. Pages 1-15.
Abstract: Thanks to MPI, writing portable message-passing parallel programs is almost a reality. One of the remaining problems is file I/O. Although parallel file systems support similar interfaces, the lack of a standard makes developing a truly portable program impossible. It is not feasible to develop large scientific applications from scratch for each generation of parallel machine, and, in the scientific world, a program is not considered truly portable unless it not only compiles, but also runs efficiently. The MPI-IO interface is being proposed as an extension to the MPI standard to fill this need. MPI-IO supports: a high-level interface to describe the partitioning of file data among processes; a collective interface describing complete transfers of global data structures between process memories and files; asynchronous I/O operations, allowing computation to be overlapped with I/O; and optimization of physical file layout on storage devices (disks).

Comment: A more readable explanation of MPI-IO than the proposed-standard document corbett:mpi-io3. See also the slides presented at IOPADS.
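The abstract's "high-level interface to describe the partitioning of file data among processes" can be made concrete with a small sketch. This is not the MPI-IO API itself; it is a hypothetical illustration of the kind of mapping such a partitioning defines, using a block-cyclic distribution as an example (the function name and parameters are assumptions, not taken from the paper).

```python
def block_cyclic_blocks(rank, nprocs, nblocks):
    """File-block indices assigned to process `rank` when `nblocks`
    equal-sized file blocks are distributed block-cyclically over
    `nprocs` processes (block b goes to process b mod nprocs)."""
    return [b for b in range(nblocks) if b % nprocs == rank]

# With 8 file blocks distributed over 4 processes,
# process 1 is responsible for blocks 1 and 5:
print(block_cyclic_blocks(1, 4, 8))  # [1, 5]
```

In MPI-IO, a partitioning like this is declared once, so a collective read or write can move each process's blocks without the application computing offsets by hand.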

Sandra Johnson Baylor and C. Eric Wu. Parallel I/O workload characteristics using Vesta. Pages 16-29.
Comment: They characterize the I/O activity of four parallel applications: sort, matrix multiply, seismic migration, and a video server.

Edgar T. Kalns and Yarsun Hsu. Video on demand using the Vesta parallel file system. Pages 30-46.
Comment: They hook a video-display system to a compute node of an SP-1 running Vesta, and then use the Vesta file system to serve the video.

Nils Nieuwejaar and David Kotz. Low-level interfaces for high-level parallel I/O. Pages 47-62.
Abstract: As the I/O needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. By tracing all the activity of a parallel file system in a production, scientific computing environment, we show that many applications exhibit highly regular, but non-consecutive I/O access patterns. Since the conventional interface does not provide an efficient method of describing these patterns, we present three extensions to the interface that support strided, nested-strided, and nested-batched I/O requests. We show how these extensions can be used to express common access patterns.
Comment: Identical to revised TR95-253.
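The strided and nested-strided requests described in the abstract can be pictured as a rule for generating (offset, length) pairs from a compact description. The sketch below is an illustration under assumed names, not the paper's actual interface: a simple strided request is `count` records of `size` bytes separated by `stride` bytes, and a nested-strided request layers further (stride, count) levels on top of that pattern.

```python
def strided(offset, size, stride, count):
    """(offset, length) pairs of a simple strided request:
    `count` records of `size` bytes, `stride` bytes apart."""
    return [(offset + i * stride, size) for i in range(count)]

def nested_strided(offset, size, levels):
    """Expand a nested-strided request. `levels` is a list of
    (stride, count) pairs, innermost first; each level replicates
    the whole pattern built by the levels beneath it."""
    offsets = [offset]
    for stride, count in levels:
        offsets = [o + i * stride for i in range(count) for o in offsets]
    return [(o, size) for o in sorted(offsets)]

# 3 records of 4 bytes, 100 bytes apart:
print(strided(0, 4, 100, 3))  # [(0, 4), (100, 4), (200, 4)]

# 3 groups (1000 bytes apart) of 2 records (100 bytes apart):
print(nested_strided(0, 4, [(100, 2), (1000, 3)]))
```

The point of the paper's extensions is that one such compact request replaces the many small, non-consecutive Unix-style reads that the traced applications would otherwise issue.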

Gary Schloss and Michael Vernick. HCSA: a hybrid client-server architecture. Pages 63-77.
Comment: In the context of client-server database systems, they propose a compromise between shared-disk architectures, in which the disks are attached directly to the network and every machine acts as both client and server, and architectures in which the disks are attached to a single server. Their compromise attaches the disks to both the network and the server.

David Kotz and Ting Cai. Exploring the use of I/O nodes for computation in a MIMD multiprocessor. Pages 78-89.
Abstract: As parallel systems move into the production scientific-computing world, the emphasis will be on cost-effective solutions that provide high throughput for a mix of applications. Cost-effective solutions demand that a system make effective use of all of its resources. Many MIMD multiprocessors today, however, distinguish between "compute" and "I/O" nodes, the latter having attached disks and being dedicated to running the file-system server. This static division of responsibilities simplifies system management but does not necessarily lead to the best performance in workloads that need a different balance of computation and I/O. Of course, computational processes sharing a node with a file-system service may receive less CPU time, network bandwidth, and memory bandwidth than they would on a computation-only node. In this paper we begin to examine this issue experimentally. We found that high-performance I/O does not necessarily require substantial CPU time, leaving plenty of time for application computation. There were some complex file-system requests, however, which left little CPU time available to the application. (The impact on network and memory bandwidth still needs to be determined.) For applications (or users) that cannot tolerate an occasional interruption, we recommend that they continue to use only compute nodes. For tolerant applications needing more cycles than those provided by the compute nodes, we recommend that they take full advantage of both compute and I/O nodes for computation, and that operating systems should make this possible.

Steven A. Moyer and V. S. Sunderam. Scalable concurrency control for parallel file systems. Pages 90-106.
Abstract: Parallel file systems employ data declustering to increase I/O throughput. As a result, a single read or write operation can generate concurrent data accesses on multiple storage devices. Unless a concurrency control mechanism is employed, familiar file access semantics are likely to be violated. This paper details the transaction-based concurrency control mechanism implemented in the PIOUS parallel file system. Performance results are presented demonstrating that sequential consistency semantics can be provided without loss of system scalability.
Comment: Based on, or perhaps identical to, CSTR-950202.
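The abstract's observation that declustering turns a single read or write into concurrent accesses on multiple devices is easy to see with a small sketch. This is an assumed round-robin declustering scheme for illustration (the function and its parameters are not from the PIOUS paper): a contiguous byte range in the logical file is split into per-device pieces.

```python
def decluster(offset, length, ndevices, blocksize):
    """Split a contiguous file request into per-device pieces under
    round-robin declustering: logical block b lives on device
    b mod ndevices. Returns (device, device_offset, length) tuples."""
    pieces = []
    end = offset + length
    while offset < end:
        block = offset // blocksize
        within = offset % blocksize          # position inside this block
        n = min(blocksize - within, end - offset)
        device = block % ndevices
        local = (block // ndevices) * blocksize + within
        pieces.append((device, local, n))
        offset += n
    return pieces

# A single 10-byte write (blocksize 4, 2 devices) touches both devices:
print(decluster(0, 10, 2, 4))  # [(0, 0, 4), (1, 0, 4), (0, 4, 2)]
```

Because one logical operation fans out like this, two concurrent writers can each complete on some devices before the other, interleaving their pieces; that is the violation of familiar file semantics that PIOUS's transaction-based concurrency control prevents.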
Last modified: Wed Sep 4 12:26:44 1996
