The dataset represents the entire output of a particular plasma physics satellite over its 5-year mission. In the past, analysis of such data has tended to be done in small chunks of "interesting" data, read sequentially, because that was the only thing possible when the dataset resided on a roomful of 9-track tapes. We hope to do some synoptic sampling, which requires more random access, and the optical disk solution seems to be the cheapest way to make that possible, although something that could keep the entire set online at once would obviously be preferable. I know of jukebox systems that implement a virtual file system that can extend to terabytes and will transparently swap in media as needed, but they cost too many $$.
... the issue of the data file structure hasn't yet been decided. The data currently exists as raw 4 KB blocks on mag tape (the 9-tracks have been copied to Exabyte tapes at this point), each block representing a few seconds of satellite telemetry. How we store this on whatever medium we end up with is still to be determined. One option is to not use files at all but just copy to a raw disk partition, although I have no experience with this kind of operation.
-- anonymous 1993
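
For illustration only, here is a minimal Python sketch of what random access to fixed-size blocks could look like, whether the blocks sit in one large file or on a raw partition. It is not the poster's method. It assumes that "4kb" means 4096 bytes and that blocks are stored back to back with no headers; the names BLOCK_SIZE, read_block and sample_every are hypothetical.

```python
BLOCK_SIZE = 4096  # assumption: "4kb" in the post means 4096-byte blocks


def read_block(f, index, block_size=BLOCK_SIZE):
    """Return block number `index` (0-based) from a seekable binary stream."""
    # Blocks are assumed to be contiguous, so block i starts at i * block_size.
    f.seek(index * block_size)
    data = f.read(block_size)
    if len(data) != block_size:
        raise EOFError(f"block {index} is missing or truncated")
    return data


def sample_every(f, stride, total_blocks, block_size=BLOCK_SIZE):
    """Yield (index, block) for every `stride`-th block: a crude synoptic sample."""
    for index in range(0, total_blocks, stride):
        yield index, read_block(f, index, block_size)
```

With a stream opened as open(path, "rb"), read_block(f, 1000) would fetch the 1001st block without reading the ones before it. On Unix-like systems a raw partition's device node can generally be opened the same way, given suitable permissions, though device-specific alignment or buffering behavior is not covered here.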