BIB-VERSION:: CS-TR-v2.0 ID:: ncstrl.dartmouthcs//TR2003-445 ENTRY:: April 11, 2003 ORGANIZATION:: Dartmouth College, Computer Science TITLE:: Relaxing the Problem-Size Bound for Out-of-Core Columnsort TYPE:: Technical Report (paper) REVISION:: 1 AUTHOR:: Chaudhry, Geeta AUTHOR:: Hamon, Elizabeth A. AUTHOR:: Cormen, Thomas H. DATE:: April 2003 RETRIEVAL:: For a paper copy, email RETRIEVAL:: For a paper copy, write to Technical Report Librarian Department of Computer Science Dartmouth College 6211 Sudikoff Laboratory Hanover, NH 03755-3510 USA RETRIEVAL:: Compressed Postscript at http://www.cs.dartmouth.edu/reports/TR2003-445.ps.Z RETRIEVAL:: PDF at http://www.cs.dartmouth.edu/reports/TR2003-445.pdf ABSTRACT:: Previous implementations of out-of-core columnsort limit the problem size to $N \leq \sqrt{(M/P)^3 / 2}$, where $N$ is the number of records to sort, $P$ is the number of processors, and $M$ is the total number of records that the entire system can hold in its memory (so that $M/P$ is the number of records that a single processor can hold in its memory). We implemented two variations to out-of-core columnsort that relax this restriction. Subblock columnsort is based on an algorithmic modification of the underlying columnsort algorithm, and it improves the problem-size bound to $N \leq (M/P)^{5/3} / 4^{2/3}$ but at the cost of additional disk I/O\@. $M$-columnsort changes the notion of the column size in columnsort, improving the maximum problem size to $N \leq \sqrt{M^3 / 2}$ but at the cost of additional computation and communication. Experimental results on a Beowulf cluster show that both subblock columnsort and $M$-columnsort run well but that $M$-columnsort is faster. A further advantage of $M$-columnsort is that it handles a wider range of problem sizes than subblock columnsort. END:: ncstrl.dartmouthcs//TR2003-445