Dartmouth College Computer Science Technical Report series 
CS home TR home TR search TR listserv 

By author:  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z  
By number:  2017, 2016, 2015, 2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986 
Abstract:
We give asymptotically equal lower and upper bounds for the number of parallel I/O operations required to perform bitmatrixmultiply/complement (BMMC) permutations on parallel disk systems. In a BMMC permutation on N records, where N is a power of 2, each (lg N)bit source address x maps to a corresponding (lg N)bit target address y by the matrix equation y = Ax XOR c, where matrix multiplication is performed over GF(2). The characteristic matrix A is (lg N) x (lg N) and nonsingular over GF(2). Under the VitterShriver paralleldisk model with N records, D disks, B records per block, and M records of memory, we show a universal lower bound of $\Omega \left( \frac{N}{BD} \left( 1 + \frac{\rank{\gamma}}{\lg (M/B)} \right) \right)$ parallel I/Os for performing a BMMC permutation, where gamma is the lower left (lg (N/B)) x (lg B) submatrix of the characteristic matrix. We adapt this lower bound to show that the algorithm for bitpermute/complement (BPC) permutations in Cormen93a is asymptotically optimal. We also present an algorithm that uses at most $\frac{2N}{BD} \left( 4 \ceil{\frac{\rank{\gamma}}{\lg (M/B)}} + 4 \right)$ parallel I/Os, which asymptotically matches the lower bound and improves upon the BMMC algorithm in Cormen93a. When rank (gamma) is low, this method is an improvement over the generalpermutation bound of $\Theta \left( \frac{N}{BD} \frac{\lg (N/B)}{\lg (M/B)} \right)$.
We introduce a new subclass of BMMC permutations, called memoryloaddispersal (MLD) permutations, which can be performed in one pass. This subclass, which is used in the BMMC algorithm, extends the catalog of onepass permutations appearing in Cormen93a.
Although many BMMC permutations of practical interest fall into subclasses that might be explicitly invoked within the source code, we show how to detect in at most $N/BD + \ceil{\frac{\lg (N/B) + 1}{D}}$ parallel I/Os whether a given vector of target addresses specifies a BMMC permutation. Thus, one can determine efficiently at run time whether a permutation to be performed is BMMC and then avoid the generalpermutation algorithm and save parallel I/Os by using our algorithm.
Bibliographic citation for this report: [plain text] [BIB] [BibTeX] [Refer]
Or copy and paste:
Thomas H. Cormen,
Thomas Sundquist, and
Leonard F. Wisniewski,
"Asymptotically Tight Bounds for Performing BMMC Permutations on Parallel Disk Systems."
Dartmouth Computer Science Technical Report PCSTR94223,
1994.
Notify me about new tech reports.
To receive paper copy of a report, by mail, send your address and the TR number to reports AT cs.dartmouth.edu
Copyright notice: The documents contained in this server are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a noncommercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.
Technical reports collection maintained by David Kotz.