diff -r pario/web/bibtex/barve:competitive2.bib pario/web/new/barve:competitive2.bib 12,39c12 < keyword = {disk prefetching, file caching, parallel I/O, pario-bib}, < abstract = {We provide a competitive analysis framework for online < prefetching and buffer management algorithms in parallel I/O systems, using a < read-once model of block references. This has widespread applicability to key < I/O-bound applications such as external merging and concurrent playback of < multiple video streams. Two realistic lookahead models, global lookahead and < local lookahead, are defined. Algorithms NOM and GREED based on these two < forms of lookahead are analyzed for shared buffer and distributed buffer < configurations, both of which occur frequently in existing systems. An < important aspect of our work is that we show how to implement both the models < of lookahead in practice using the simple techniques of forecasting and < flushing. \par Given a D-disk parallel I/O system and a globally shared I/O < buffer that can hold upto M disk blocks, we derive a lower bound of < $\Omega(\sqrt{D}$) on the competitive ratio of any deterministic online < prefetching algorithm with O(M) lookahead. NOM is shown to match the lower < bound using global M-block lookahead. In contrast, using only local lookahead < results in an $\Omega(D)$ competitive ratio. When the buffer is distributed < into D portions of M/D blocks each, the algorithm GREED based on local < lookahead is shown to be optimal, and NOM is within a constant factor of < optimal. Thus we provide a theoretical basis for the intuition that global < lookahead is more valuable for prefetching in the case of a shared buffer < configuration whereas it is enough to provide local lookahead in case of the < distributed configuration. Finally, we analyze the performance of these < algorithms for reference strings generated by a uniformly-random stochastic < process and we show that they achieve the minimal expected number of I/Os. < These results also give bounds on the worst-case expected performance of < algorithms which employ randomization in the data layout.}, < comment = {See barve:competitive.} --- > keyword = {disk prefetching, file caching, parallel I/O, pario-bib} diff -r pario/web/bibtex/cho:local.bib pario/web/new/cho:local.bib 29c29,39 < predicts the results within a 13\% margin of error.} --- > predicts the results within a 13\% margin of error.}, > comment = {They examine a system that supports nodes that are both compute > and I/O nodes. The assumption is that the application is writing data to a > new file, and does not care to which disks the data goes. They are trying to > decide which nodes should be used for I/O, given the distribution of data on > compute nodes and the distribution desired across disks. They use a Hungarian > algorithm to solve a weighted optimization problem on a bipartite graph > connecting I/O nodes to compute nodes, in an attempt to minimize the data > flow across the network. But there is no attempt to make a decision that > might be sensible for a future read operation that may want to read in a > different pattern.} diff -r pario/web/bibtex/cormen:fft3.bib pario/web/new/cormen:fft3.bib 13,30c13,14 < abstract = {This paper extends an earlier out-of-core Fast Fourier Transform < (FFT) method for a uniprocessor with the Parallel Disk Model (PDM) to use < multiple processors. Four out-of-core multiprocessor methods are examined. < Operationally, these methods differ in the size of "mini-butterfly" computed < in memory and how the data are organized on the disks and in the distributed < memory of the multiprocessor. The methods also perform differing amounts of < I/O and communication. Two of them have the remarkable property that even < though they are computing the FFT on a multiprocessor, all interprocessor < communication occurs outside the mini-butterfly computations; communication < that ordinarily occurs in a butterfly is folded into other data-movement < operations. An analysis program shows that the two methods that use no < butterfly communication usually use less communication overall than the other < methods. The analysis program is fast enough that it can be invoked at run < time to determine which of the four methods uses the least communication. One < set of performance results on a small workstation cluster indicates that the < methods without butterfly communication are approximately 9.5\% faster. < Moreover, they are much easier to implement.}, < comment = {See also cormen:fft and cormen:fft2.} --- > comment = {They find a way to move the interprocessor communication involved > in the out-of-core FFT into a single BMMC permutation between} diff -r pario/web/bibtex/cortes:cooperative.bib pario/web/new/cortes:cooperative.bib 12,22c12 < pario-bib}, < abstract = {In this paper, we examine some of the important problems observed < in the design of cooperative caches. Solutions to the coherence, < load-balancing and fault-tolerance problems are presented. These solutions < have been implemented as a part of PAFS, a parallel/distributed file system, < and its performance has been compared to the one achieved by xFS. Using the < comparison results, we have observed that the proposed ideas not only solve < the main problems of cooperative caches, but also increase the overall system < performance. Although the solutions presented in this paper were targeted to < a parallel machine, reasonable good results have also been obtained for < networks of workstations.} --- > pario-bib} diff -r pario/web/bibtex/foster:chemio.bib pario/web/new/foster:chemio.bib 7a8 > later = {nieplocha:chemio}, 11c12 < supports out-of-core arrays.} --- > supports out-of-core arrays. See also nieplocha:chemio.} diff -r pario/web/bibtex/foster:remote-io.bib pario/web/new/foster:remote-io.bib 32c32,41 < can improve turnaround time relative to staging.} --- > can improve turnaround time relative to staging.}, > comment = {They want to support users that have datasets at different > locations in the Internet, but need to access the data at supercomputer > parallel machines. Rather than staging data in and out, they want to provide > remote access. Issues: naming, dynamic loads, heterogeneity, security, > fault-tolerance. All traffic goes through a 'forwarder node' that funnels all > the traffic into the network. They use URLs for pathnames (e.g., > "x-rio://..."). They find that non-blocking ops are important, as is > collective I/O. They think that buffering will be important. Limited > experiments.} diff -r pario/web/bibtex/gibson:sdcr.bib pario/web/new/gibson:sdcr.bib 3,4c3,4 < title = {Report of the Working Group on Storage {I/O} Issues in Large-Scale < Computing}, --- > title = {Strategic directions in storage {I/O} issues in large-scale > computing}, diff -r pario/web/bibtex/madhyastha:classification.bib pario/web/new/madhyastha:classification.bib 23c23,29 < benchmarks and applications.} --- > benchmarks and applications.}, > comment = {The most interesting thing in this paper is the use of a Hidden > Markov Model to understand the access pattern of an application to a file. > After running the application on the file once, and simultaneously training > their HMM, they use the result to tune the system for the next execution > (cache size, cache partitioning, prefetching, Intel file mode, etc). They get > much better performance in future runs. See also her paper in SC97.} diff -r pario/web/bibtex/oldfield:seismic.bib pario/web/new/oldfield:seismic.bib 4,5c4,6 < journal = {International Journal of Supercomputer Applications}, < year = {1997}, --- > journal = {International Journal of Supercomputer Applications and High > Performance Computing}, > year = {1998}, diff -r pario/web/bibtex/prabhakar:browsing.bib pario/web/new/prabhakar:browsing.bib 31c31,33 < disks by placing image coefficients based upon browsing access patterns.} --- > disks by placing image coefficients based upon browsing access patterns.}, > comment = {They use simulation to study several different placement policies > for the thumbnail and varying-resolution versions of images on a disk array.}