Dartmouth College Computer Science
Technical Report series

Tackling Latency Using FG
Priya Natarajan
Dartmouth TR2011-706


Applications that operate on datasets too big to fit in main memory, known in the literature as external-memory or out-of-core applications, store their data on one or more disks. Several of these applications make multiple passes over the data, where each pass reads data from disk, operates on it, and writes data back to disk. A disk-I/O operation takes orders of magnitude (roughly 100,000 times) longer than an in-memory operation; that is, disk I/O is a high-latency operation. Out-of-core algorithms often run on a distributed-memory cluster to take advantage of the cluster's computing power, memory, disk space, and bandwidth. By doing so, however, they introduce another high-latency operation: interprocessor communication. Efficient implementations of these algorithms access data in blocks to amortize the fixed cost of each transfer to or from the disk or the network, and they introduce asynchrony to overlap high-latency operations with computation.
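
The amortization argument can be made concrete with a simplified cost model. This is a standard back-of-the-envelope sketch, not a formula taken from the thesis: the symbols $\lambda$ (fixed latency per transfer), $\beta$ (sustained bandwidth in bytes per second), and $B$ (block size in bytes) are assumptions introduced here for illustration.

```latex
% Time to transfer one block of B bytes (assumed linear model):
T(B) = \lambda + \frac{B}{\beta}

% Cost per byte transferred:
\frac{T(B)}{B} = \frac{\lambda}{B} + \frac{1}{\beta}
```

Under this model the latency term $\lambda / B$ shrinks as the block size grows, so the per-byte cost approaches the bandwidth limit $1/\beta$. Larger blocks reduce the cost of latency but do not hide it, which is why the approach also needs asynchrony.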

FG, short for Asynchronous Buffered Computation Design and Engineering Framework Generator, is a programming framework that helps to mitigate latency in out-of-core programs that run on distributed-memory clusters. An FG program is composed of a pipeline of stages operating on buffers. FG runs the stages asynchronously so that stages performing high-latency operations can overlap their work with other stages. FG supplies the code to create a pipeline, synchronize the stages, and manage data buffers; the user provides a straightforward function, containing only synchronous calls, for each stage.
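
The pipeline structure described above can be illustrated with a minimal sketch. It is not FG's API, and FG itself is not written in Python; the names `run_pipeline`, `read_stage`, `sort_stage`, and `write_stage` are hypothetical, and the sleeps merely stand in for disk latency. What the sketch shows is the division of labor: the framework side creates one thread per stage and bounded queues between stages, while each stage is an ordinary synchronous function.

```python
import queue
import threading
import time

SENTINEL = object()  # unique end-of-stream marker passed down the pipeline


def run_pipeline(stages, buffers, depth=2):
    """Run each stage in its own thread; buffers flow through bounded queues.

    Hypothetical illustration of the idea, not FG's actual interface.
    """
    # One bounded queue feeds each stage; the final output queue is unbounded
    # so the last stage never blocks while the caller is still feeding input.
    queues = [queue.Queue(maxsize=depth) for _ in stages]
    queues.append(queue.Queue())

    def worker(stage, inbox, outbox):
        while True:
            buf = inbox.get()
            if buf is SENTINEL:
                outbox.put(SENTINEL)  # propagate shutdown to the next stage
                return
            outbox.put(stage(buf))  # the stage itself is plain synchronous code

    threads = [
        threading.Thread(target=worker, args=(stage, queues[i], queues[i + 1]))
        for i, stage in enumerate(stages)
    ]
    for t in threads:
        t.start()

    # Feed the pipeline; put() blocks when the first queue is full.
    for buf in buffers:
        queues[0].put(buf)
    queues[0].put(SENTINEL)

    for t in threads:
        t.join()

    # Drain the results in the order the buffers entered the pipeline.
    results = []
    while True:
        item = queues[-1].get()
        if item is SENTINEL:
            break
        results.append(item)
    return results


def read_stage(block_id):
    time.sleep(0.01)  # stand-in for a high-latency disk read
    return [(block_id * 7 + i * 3) % 10 for i in range(4)]


def sort_stage(block):
    return sorted(block)  # in-memory computation on the buffer


def write_stage(block):
    time.sleep(0.01)  # stand-in for a high-latency disk write
    return len(block)  # report how many items were "written"


if __name__ == "__main__":
    print(run_pipeline([read_stage, sort_stage, write_stage], range(8)))
```

Because each stage runs in its own thread, the read of one block can overlap the sort of the previous block and the write of the one before that. The bounded queues limit how many buffers are in flight at once, loosely mirroring a fixed pool of buffers.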

In this thesis, we use FG to tackle latency and exploit the available parallelism in out-of-core and distributed-memory programs. We show how FG helps us design out-of-core programs and think about parallel computing in general, using three examples: an out-of-core, distribution-based sorting program; an implementation of external-memory suffix arrays; and a scientific-computing application called the fast Gauss transform. FG's interaction with these real-world programs is symbiotic: FG enables efficient implementations of these programs, and the design of the first two programs pointed us toward further extensions for FG. Today's era of multicore machines compels us to harness every opportunity for parallelism available in a program, and so in the latter two applications, we combine FG's multithreading capabilities with the routines that OpenMP offers for in-core parallelism. In the fast Gauss transform application, this strategy yields a performance improvement of up to 20-fold over an alternative fast Gauss transform implementation. In addition, we draw on our experience designing programs in FG to offer suggestions for the next version of FG.

Note: Ph.D. dissertation. Advisor: Thomas H. Cormen.


Bibliographic citation for this report:
   Priya Natarajan, "Tackling Latency Using FG." Dartmouth Computer Science Technical Report TR2011-706, September 2011.

To receive paper copy of a report, by mail, send your address and the TR number to reports AT cs.dartmouth.edu

Copyright notice: The documents contained in this server are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Technical reports collection maintained by David Kotz.