Reducing Mass Degeneracy in SAR by MS by Stable Isotopic Labeling Dartmouth Technical Report TR2000-362 Chris Bailey-Kellogg John J. Kelley Clifford Stein Bruce Randall Donald Date: February 2000 URL (compressed postscript): (216KB) URL (PDF): (400KB) Abstract: Mass spectrometry (MS) promises to be an invaluable tool for functional genomics, by supporting low-cost, high-throughput experiments. However, large-scale MS faces the potential problem of mass degeneracy -- indistinguishable masses for multiple biopolymer fragments (e.g. from a limited proteolytic digest). This paper studies the tasks of planning and interpreting MS experiments that use selective isotopic labeling, thereby substantially reducing potential mass degeneracy. Our algorithms support an experimental-computational protocol called Structure-Activity Relation by Mass Spectrometry (SAR by MS), for elucidating the function of protein-DNA and protein-protein complexes. SAR by MS enzymatically cleaves a crosslinked complex and analyzes the resulting mass spectrum for mass peaks of hypothesized fragments. Depending on binding mode, some cleavage sites will be shielded; the absence of anticipated peaks implicates corresponding fragments as either part of the interaction region or inaccessible due to conformational change upon binding. Thus different mass spectra provide evidence for different structure-activity relations. We address combinatorial and algorithmic questions in the areas of data analysis (constraining binding mode based on mass signature) and experiment planning (determining an isotopic labeling strategy to reduce mass degeneracy and aid data analysis). We explore the computational complexity of these problems, obtaining upper and lower bounds. We report experimental results from implementations of our algorithms. Note: This report supercedes TR99-359. To appear in the 8th International Conference on Intelligent Systems for Molecular Biology, (August 20-23, 2000) La Jolla, CA (Accepted; in press). A Formal Semantics for SPKI Dartmouth Technical Report TR2000-363 Jon Howell David Kotz Date: March 2000 URL (compressed postscript): (376KB) URL (PDF): (468KB) Abstract: We extend the logic and semantics of authorization due to Abadi, Lampson, et al. to support restricted delegation. Our formal model provides a simple interpretation for the variety of constructs in the Simple Public Key Infrastructure (SPKI), and lends intuition about possible extensions. We discuss both extensions that our semantics supports and extensions that it cautions against. Note: This TR supercedes TR1999-361. This technical report is an extended version of a paper submitted to ESORICS 2000. For more information, see the project web page. Landmarks for absolute localization Dartmouth Technical Report TR2000-364 Jon Howell Keith Kotay Date: March 2000 URL (compressed postscript): (932KB) URL (PDF): (372KB) Abstract: For certain experiments in mobile robotics, it is convenient to eliminate positional estimation error in the interest of analyzing other parts of the experiment. We designed and implemented a simple, accurate scheme for encoding and recovering absolute position information. The encoding is a two-dimensional image printed on the plane of the floor, and the absolute position information is recovered using a downward-looking video camera mounted on a mobile robot. Note: This document describes work done in the Dartmouth Robotics Laboratory in April of 1997 and August of 1999. It was previously ``published'' as a web page, but we thought it would make sense to document it more permanently. Mobile Agents: Motivations and State-of-the-Art Systems Dartmouth Technical Report TR2000-365 Robert S. Gray David Kotz George Cybenko Daniela Rus Date: April 2000 URL (compressed postscript): (296KB) URL (PDF): (404KB) Abstract: A mobile agent is an executing program that can migrate, at times of its own choosing, from machine to machine in a heterogeneous network. On each machine, the agent interacts with stationary service agents and other resources to accomplish its task. In this chapter, we first make the case for mobile agents, discussing six strengths of mobile agents and the applications that benefit from these strengths. Although none of these strengths are unique to mobile agents, no competing technique shares all six. In other words, a mobile-agent system provides a single general framework in which a wide range of distributed applications can be implemented efficiently and easily. We then present a representative cross-section of current mobile-agent systems. Note: This technical report will appear as a chapter in Jeffrey M. Bradshaw, editor, Handbook of Agent Technology, AAAI/MIT Press, 2000. In Press. Performance Analysis of Mobile Agents for Filtering Data Streams on Wireless Networks Dartmouth Technical Report TR2000-366 David Kotz Guofei Jiang Robert S. Gray George Cybenko Ronald A. Peterson Date: May 2000 URL (compressed postscript): (392KB) URL (PDF): (380KB) Abstract: Wireless networks are an ideal environment for mobile agents, because their mobility allows them to move across an unreliable link to reside on a wired host, next to or closer to the resources they need to use. Furthermore, client-specific data transformations can be moved across the wireless link, and run on a wired gateway server, with the goal of reducing bandwidth demands. In this paper we examine the tradeoffs faced when deciding whether to use mobile agents to support a data-filtering application, in which numerous wireless clients filter information from a large data stream arriving across the wired network. We develop an analytical model and use parameters from our own experiments to explore the model's implications. Note: In August 2000 a revised version appeared in the International Workshop on Modeling and Simulation of Wireless and Mobile Systems (MSWiM 2000). In October 2000 a further revised version appeared as Dartmouth Technical Report TR2000-377, and was submitted to the journal Mobile Networks and Applications (ACM MONET). Approximation Algorithms for the Minimum Bends Traveling Salesman Problem Dartmouth Technical Report TR2000-367 Cliff Stein David P. Wagner Date: April 2000 URL (compressed postscript): (108KB) URL (PDF): (100KB) Abstract: The problem of traversing a set of points in the order that minimizes the total distance traveled (traveling salesman problem) is one of the most famous and well-studied problems in combinatorial optimization. It has many applications, and has been a testbed for many of the must useful ideas in algorithm design and analysis. The usual metric, minimizing the total distance traveled, is an important one, but many other metrics are of interest. In this paper, we introduce the metric of minimizing the number of turns in the tour, given that the input points are in the Euclidean plane. To our knowledge this metric has not been studied previously. It is motivated by applications in robotics and in the movement of other heavy machinery: for many such devices turning is an expensive operation. We give approximation algorithms for several variants of the traveling salesman problem for which the metric is to minimize the number of turns. We call this the minimum bends traveling salesman problem. For the case of an arbitrary set of $n$ points in the Euclidean plane, we give an O(lg z)-approximation algorithm, where z is the maximum number of collinear points. In the worst case z can be as big as n, but z will often be much smaller. For the case when the lines are restricted to being either horizontal or vertical, we give a 2-approximation algorithm. If we have the further restriction that no two points are allowed to have the same x- or y-coordinate, we give an algorithm that finds a tour which makes at most two turns more than the optimal tour. Thus we have an approximation algorithm with an additive, rather than a multiplicative error bound. Beyond the additive error bound, our algorithm for this problem introduces several interesting algorithmic techniques for decomposing sets of points in the Euclidean plane that we believe to be of independent interest. Note: Submitted to FOCS 2000. A Simulation of Auroral Absorption Dartmouth Technical Report TR2000-368 Eric Michael Greenberg Date: May 2000 URL (PDF): (3028KB) Abstract: HF radio transmissions propagate long distances by reflecting off the ionosphere. At high latitudes radio propagation is strongly affected by the northern lights (aurora borealis), which causes ionization at low altitudes and hence the absorption of radio waves. Models of this process are still in a primitive state. A simulation of radio wave propagation was created in order to test Foppiano and Bradley's empirical model of auroral absorption. The simulation attempts to predict the net absorption of signals at a receiver by simulating a large number of transmitters, even though the exact sources of the signals are unknown. Although the simulation takes into account auroral and nonauroral absorption as well as other sources of path loss, the analysis focuses on the nighttime aurora. An intelligent search algorithm is used in order to efficiently adjust the model to best fit the data. The output of the simulation is qualitatively and quantitatively compared to signal levels observed with HF radio receivers located in northern Canada. The analysis allows us to develop alternative models of auroral absorption which account for the level of geomagnetic activity, and these are compared to the standard Foppiano and Bradley model. Note: Advisor: Dan Rockmore Registration of Images with Dissimilar Contrast using a Hybrid Method Employing Correlation and Mutual Information Dartmouth Technical Report TR2000-369 Karolyn A. Abram Date: June 2000 URL (compressed postscript): (1736KB) URL (PDF): (1440KB) Abstract: The problem of fitting one image into another is commonly called "registration." Finding the best possible translation and rotation necessary to align two images is one approach to solving this problem. Registration is a crucial component of many remote sensing and medical image interpretation applications. Image alignment techniques aid in volumetric estimations of complicated structures and allow radiologists to accurately identify changes between sequential images. Radiologists require image alignment capabilities to correct for patient motion and/or content displacement between images. Numerous image registration techniques exist for correcting the alignment problems mentioned above. Unfortunately, most of these techniques, such as Correlation, fail to find a good alignment when dealing with images that differ in contrast. The Mutual Information method is able to align images independently of contrast, but it is computationally intensive. We explore a hybrid technique that utilizes both Correlation and Mutual Information. The Hybrid technique hopes to gain greater contrast independence than Correlation alone while achieving a lower running time than a pure Mutual Information technique. Note: Undergraduate Honors Thesis Advisor: Daniel N. Rockmore An Infrastructure for a Mobile-Agent System that Provides Personalized Services to Mobile Devices Dartmouth Technical Report TR2000-370 Debbie O. Chyi Date: May 2000 URL (compressed postscript): (888KB) URL (PDF): (372KB) Abstract: In this paper, we present the design of a mobile-agent system that provides a mobile user with a personalized information retrieval service and we describe the implementation of the infrastructure for such a system. This "Personal Agent System" gathers information from the Internet and uses context-aware mechanisms to manage the information according to a mobile user's needs and preferences. The user's schedule and location are the context indicators in this system. These indicators are critical in ensuring that users obtain only the information they want, receive information in a form that is most useful for viewing on their mobile device, and is notified of new information in a minimally intrusive manner. The system incorporates a rule-based learning system to enhance the personalization achieved by the system. Note: Undergraduate Honors Thesis. Advisor: David F. Kotz Personal Radio Dartmouth Technical Report TR2000-372 John C. Artz Date: June 2000 URL (compressed postscript): (2756KB) URL (PDF): (516KB) Abstract: With the development of new technologies that allow the broadcast of digital data over radio signals, there are many possibilities for improving upon the traditional radio station model for content delivery. The idea of Personal Radio is a system that tailors content to meet the needs of each individual. Using Global Positioning System (GPS) technology to play location specific content, the listening history to play content an appropriate number of times, and user feedback to learn personal preferences, the Personal Radio provides the listener with the content that is the most useful/interesting to them. This paper will examine the general design of such a system and present solutions developed in the implementation of several pieces of the design. Note: Undergraduate Honors Thesis. Advisors: David Kotz and Daniela Rus Depth from Flash Dartmouth Technical Report TR2000-373 David B. Martin Date: June 2000 URL (compressed postscript): (4936KB) URL (PDF): (3700KB) Abstract: Digital camera technology has recently seen substantial improvements in image quality while lower prices have made it affordable to the average consumer. Camera manufacturers, however, are not taking full advantage of this new medium for image capture. By filtering the already digitized image produced by these cameras through on-board image processing algorithms we can dramatically increase the power of digital cameras. For example, according to experts in the photographic industry, most people simply take bad pictures. Classic examples of this phenomenon are photographs taken indoors with a point-and-shoot style camera using its built-in flash. The subjects of these photographs often seem to have a spotlight on them, making them look bright and washed out while the rest of the photograph is dark and indistinct. This can primarily be accounted for by a well known property of point light sources: falloff in brightness is inversely proportional to the square of the distance between the light and the object being illuminated. A technique first introduced in the field of computer vision has been shown to successfully recover information about the distance between the light source and objects in the world. We propose using this technique, which is readily implementable in hardware, to correct for a variety of poorly illuminated digital images. Note: Undergraduate Honors Thesis. Advisor: Hany Farid An Economic CPU-Time Market for D'Agents Dartmouth Technical Report TR2000-375 Ezra E. K. Cooper Robert S. Gray Date: June 2000 URL (compressed postscript): (100KB) URL (PDF): (228KB) Abstract: A usable and efficient resource-management system has been created for use with D'Agents. The software dynamically negotiates a price rate for CPU time, using the competitive bids of mobile agents that offer currency in return for fast computation. The system allows mobile agents to plan their expenditures across many hosts while minimizing the time needed for their tasks. The ability to price CPU time opens the door for service owners to be compensated for the computation consumed by agents and provides an incentive for servers to allow anonymous agents. We discuss the theoretical background which makes a CPU market system possible and the performance of the D'Agents market system. Note: Undergraduate honors thesis. Advisor: Bob Gray. The complexity of planning with partially-observable Markov decision processes Dartmouth Technical Report TR2000-376 Martin Mundhenk Date: June 2000 URL (compressed postscript): (468KB) URL (PDF): (656KB) Abstract: This work surveys results on the complexity of planning under uncertainty. The planning model considered is the partially-observable Markov decision process. The general planning problems are, given such a process, (a) to calculate its performance under a given control policy, (b) to find an optimal or approximate optimal control policy, and (c) to decide whether a good policy exists. The complexity of this and related problems depend on a variety of factors, including the observability of the process state, the compactness of the process representation, the type of policy, or even the number of actions relative to the number of states. In most cases, the problem can be shown to be complete for some known complexity class. The skeleton of this survey are results from Littman, Goldsmith and Mundhenk (Journal of Artificial Intelligence Research 1998), Mundhenk (Mathematics of Operations Research 2000), Mundhenk, Goldsmith, Lusena and Allender (Journal of the ACM 2000), and Lusena, Goldsmith and Mundhenk (University of KY CS TR). But there are also some news. Performance Analysis of Mobile Agents for Filtering Data Streams on Wireless Networks Dartmouth Technical Report TR2000-377 David Kotz George Cybenko Robert S. Gray Guofei Jiang Ronald A. Peterson Martin O. Hofmann Daria A. Chacon Kenneth R. Whitebread James Hendler Date: October 2000 URL (compressed postscript): (168KB) URL (PDF): (200KB) Abstract: Wireless networks are an ideal environment for mobile agents, since their mobility allows them to move across an unreliable link to reside on a wired host, next to or closer to the resources that they need to use. Furthermore, client-specific data transformations can be moved across the wireless link and run on a wired gateway server, reducing bandwidth demands. In this paper we examine the tradeoffs faced when deciding whether to use mobile agents in a data-filtering application where numerous wireless clients filter information from a large data stream arriving across the wired network. We develop an analytical model and use parameters from filtering experiments conducted during a U.S. Navy Fleet Battle Experiment (FBE) to explore the model's implications. Note: Updated version of TR2000-366. To appear, after revisions, in the journal Mobile Networks and Applications (ACM MONET). Naming and sharing resources across administrative boundaries (Volume I) Dartmouth Technical Report TR2000-378 Jon Howell Date: May 2000 URL (compressed postscript): (928KB) URL (PDF): (1320KB) Abstract: I tackle the problem of naming and sharing resources across administrative boundaries. Conventional systems manifest the hierarchy of typical administrative structure in the structure of their own mechanism. While natural for communication that follows hierarchical patterns, such systems interfere with naming and sharing that cross administrative boundaries, and therefore cause headaches for both users and administrators. I propose to organize resource naming and security, not around administrative domains, but around the sharing patterns of users. The dissertation is organized into four main parts. First, I discuss the challenges and tradeoffs involved in naming resources and consider a variety of existing approaches to naming. Second, I consider the architectural requirements for user-centric sharing. I evaluate existing systems with respect to these requirements. Third, to support the sharing architecture, I develop a formal logic of sharing that captures the notion of restricted delegation. Restricted delegation ensures that users can use the same mechanisms to share resources consistently, regardless of the origin of the resource, or with whom the user wishes to share the resource next. A formal semantics gives unambiguous meaning to the logic. I apply the formalism to the Simple Public Key Infrastructure and discuss how the formalism either supports or discourages potential extensions to such a system. Finally, I use the formalism to drive a user-centric sharing implementation for distributed systems. I show how this implementation enables end-to-end authorization, a feature that makes heterogeneous distributed systems more secure and easier to audit. Conventionally, gateway services that bridge administrative domains, add abstraction, or translate protocols typically impede the flow of authorization information from client to server. In contrast, end-to-end authorization enables us to build gateway services that preserve authorization information, hence we reduce the size of the trusted computing base and enable more effective auditing. I demonstrate my implementation and show how it enables end-to-end authorization across various boundaries. I measure my implementation and argue that its performance tracks that of similar authorization mechanisms without end-to-end structure. I conclude that my user-centric philosophy of naming and sharing benefits both users and administrators. Note: This technical report represents Volume One of the dissertation. Volume One (187 pages) is the heart of the dissertation. Volume Two (237 pages) contains a software manual introduced with illustrated code fragments, plus plots of the raw data used for the experimental results presented in Volume One. That material is optional; I recommend that the interested reader begin with just Volume One. Volume II is available as TR2000-379. Please note that a list of errata is available as TR2000-380. Naming and sharing resources across administrative boundaries (Volume II) Dartmouth Technical Report TR2000-379 Jon Howell Date: May 2000 URL (compressed postscript): (492KB) URL (PDF): (1188KB) Abstract: I tackle the problem of naming and sharing resources across administrative boundaries. Conventional systems manifest the hierarchy of typical administrative structure in the structure of their own mechanism. While natural for communication that follows hierarchical patterns, such systems interfere with naming and sharing that cross administrative boundaries, and therefore cause headaches for both users and administrators. I propose to organize resource naming and security, not around administrative domains, but around the sharing patterns of users. The dissertation is organized into four main parts. First, I discuss the challenges and tradeoffs involved in naming resources and consider a variety of existing approaches to naming. Second, I consider the architectural requirements for user-centric sharing. I evaluate existing systems with respect to these requirements. Third, to support the sharing architecture, I develop a formal logic of sharing that captures the notion of restricted delegation. Restricted delegation ensures that users can use the same mechanisms to share resources consistently, regardless of the origin of the resource, or with whom the user wishes to share the resource next. A formal semantics gives unambiguous meaning to the logic. I apply the formalism to the Simple Public Key Infrastructure and discuss how the formalism either supports or discourages potential extensions to such a system. Finally, I use the formalism to drive a user-centric sharing implementation for distributed systems. I show how this implementation enables end-to-end authorization, a feature that makes heterogeneous distributed systems more secure and easier to audit. Conventionally, gateway services that bridge administrative domains, add abstraction, or translate protocols typically impede the flow of authorization information from client to server. In contrast, end-to-end authorization enables us to build gateway services that preserve authorization information, hence we reduce the size of the trusted computing base and enable more effective auditing. I demonstrate my implementation and show how it enables end-to-end authorization across various boundaries. I measure my implementation and argue that its performance tracks that of similar authorization mechanisms without end-to-end structure. I conclude that my user-centric philosophy of naming and sharing benefits both users and administrators. Note: This technical report represents Volume Two of the dissertation. Volume One (187 pages) is the heart of the dissertation. Volume Two (237 pages) contains a software manual introduced with illustrated code fragments, plus plots of the raw data used for the experimental results presented in Volume One. That material is optional; I recommend that the interested reader begin with just Volume One. Volume I is available as TR2000-378. Please note that a list of errata is available as TR2000-380. Naming and sharing resources across administrative boundaries (errata) Dartmouth Technical Report TR2000-380 Jon Howell Date: May 2000 URL (compressed postscript): (36KB) URL (PDF): (292KB) Abstract: I tackle the problem of naming and sharing resources across administrative boundaries. Conventional systems manifest the hierarchy of typical administrative structure in the structure of their own mechanism. While natural for communication that follows hierarchical patterns, such systems interfere with naming and sharing that cross administrative boundaries, and therefore cause headaches for both users and administrators. I propose to organize resource naming and security, not around administrative domains, but around the sharing patterns of users. The dissertation is organized into four main parts. First, I discuss the challenges and tradeoffs involved in naming resources and consider a variety of existing approaches to naming. Second, I consider the architectural requirements for user-centric sharing. I evaluate existing systems with respect to these requirements. Third, to support the sharing architecture, I develop a formal logic of sharing that captures the notion of restricted delegation. Restricted delegation ensures that users can use the same mechanisms to share resources consistently, regardless of the origin of the resource, or with whom the user wishes to share the resource next. A formal semantics gives unambiguous meaning to the logic. I apply the formalism to the Simple Public Key Infrastructure and discuss how the formalism either supports or discourages potential extensions to such a system. Finally, I use the formalism to drive a user-centric sharing implementation for distributed systems. I show how this implementation enables end-to-end authorization, a feature that makes heterogeneous distributed systems more secure and easier to audit. Conventionally, gateway services that bridge administrative domains, add abstraction, or translate protocols typically impede the flow of authorization information from client to server. In contrast, end-to-end authorization enables us to build gateway services that preserve authorization information, hence we reduce the size of the trusted computing base and enable more effective auditing. I demonstrate my implementation and show how it enables end-to-end authorization across various boundaries. I measure my implementation and argue that its performance tracks that of similar authorization mechanisms without end-to-end structure. I conclude that my user-centric philosophy of naming and sharing benefits both users and administrators. Note: This technical report contains errata for the dissertation. Volume One (187 pages) is the heart of the dissertation. Volume Two (237 pages) contains a software manual introduced with illustrated code fragments, plus plots of the raw data used for the experimental results presented in Volume One. That material is optional; I recommend that the interested reader begin with just Volume One. Volume I is available as TR2000-378. Volume II is available as TR2000-379. A Survey of Context-Aware Mobile Computing Research Dartmouth Technical Report TR2000-381 Guanling Chen David Kotz Date: November 2000 URL (compressed postscript): (884KB) URL (PDF): (132KB) Abstract: Context-aware computing is a mobile computing paradigm in which applications can discover and take advantage of contextual information (such as user location, time of day, nearby people and devices, and user activity). Since it was proposed about a decade ago, many researchers have studied this topic and built several context-aware applications to demonstrate the usefulness of this new technology. Context-aware applications (or the system infrastructure to support them), however, have never been widely available to everyday users. In this survey of research on context-aware systems and applications, we looked in depth at the types of context used and models of context information, at systems that support collecting and disseminating context, and at applications that adapt to the changing context. Through this survey, it is clear that context-aware research is an old but rich area for research. The difficulties and possible solutions we outline serve as guidance for researchers hoping to make context-aware computing a reality. Bayes Optimal Metasearch: A Probabilistic Model for Combining the Results of Multiple Retrieval Systems Dartmouth Technical Report TR2000-382 Javed A. Aslam Mark Montague Date: December 2000 URL (compressed postscript): (112KB) URL (PDF): (208KB) Abstract: We introduce a new, probabilistic model for combining the outputs of an arbitrary number of query retrieval systems. By gathering simple statistics on the average performance of a given set of query retrieval systems, we construct a Bayes optimal mechanism for combining the outputs of these systems. Our construction yields a metasearch strategy whose empirical performance nearly always exceeds the performance of any of the constituent systems. Our construction is also robust in the sense that if ``good'' and ``bad'' systems are combined, the performance of the composite is still on par with, or exceeds, that of the best constituent system. Finally, our model and theory provide theoretical and empirical avenues for the improvement of this metasearch strategy. Note: Preliminary version appeared in SIGIR 2000. Reconstructing Ancient Egyptian Tombs Dartmouth Technical Report TR2000-383 Hany Farid Date: December 2000 URL (compressed postscript): (3816KB) URL (PDF): (2808KB) Abstract: From the pyramids of Giza to the tombs of Thebes (modern Luxor), ancient Egypt's glorious history has produced remarkable architecture. Sadly, the nearly four million yearly tourists have taken a heavy toll on many of these ancient structures. Of particular concern are many of the tombs located opposite to Luxor on the western bank of the Nile. Digital reconstruction of these tombs has the potential to help document and preserve these important historical structures. Photographing and reconstruction of these tombs poses new and unique problems that this paper begins to address. Techniques for removing image distortions, recovering 3-D shape, and correcting for lighting imbalances are discussed. A complete reconstruction of the tomb of Sennedjem is shown. Ambiguity-Directed Sampling for Qualitative Analysis of Sparse Data from Spatially-Distributed Physical Systems Dartmouth Technical Report TR2001-384 Chris Bailey-Kellogg Naren Ramakrishnan Date: January 2001 URL (compressed postscript): (352KB) URL (PDF): (244KB) Abstract: A number of important scientific and engineering applications, such as fluid dynamics simulation and aircraft design, require analysis of spatially-distributed data from expensive experiments and complex simulations. In such data-scarce applications, it is advantageous to use models of given sparse data to identify promising regions for additional data collection. This paper presents a principled mechanism for applying domain-specific knowledge to design focused sampling strategies. In particular, our approach uses ambiguities identified in a multi-level qualitative analysis of sparse data to guide iterative data collection. Two case studies demonstrate that this approach leads to highly effective sampling decisions that are also explainable in terms of problem structures and domain knowledge. Lock-free Scheduling of Logical Processes in Parallel Simulation Dartmouth Technical Report TR2001-385 Xiaowen Liu David M. Nicol King Tan Date: January 2001 URL (compressed postscript): (140KB) URL (PDF): (268KB) Abstract: With fixed lookahead information in a simulation model, the overhead of asynchronous conservative parallel simulation lies in the mechanism used for propagating time updates in order for logical processes to safely advance their local simulation clocks. Studies have shown that a good scheduling algorithm should preferentially schedule processes containing events on the critical path. This paper introduces a lock-free algorithm for scheduling logical processes in conservative parallel discrete-event simulation on shared-memory multiprocessor machines. The algorithm uses fetch\&add operations that help avoid inefficiencies associated with using locks. The lock-free algorithm is robust. Experiments show that, compared with the scheduling algorithm using locks, the lock-free algorithm exhibits better performance when the number of logical processes assigned to each processor is small or when the workload becomes significant. In models with large number of logical processes, our algorithm shows only modest increase in execution time due to the overhead in the algorithm for extra bookkeeping. Note: A revision of this report appears on PADS 2001. Mobile-Agent versus Client/Server Performance: Scalability in an Information-Retrieval Task Dartmouth Technical Report TR2001-386 Robert S. Gray David Kotz Ronald A. Peterson Peter Gerken Martin Hofmann Daria Chacon Greg Hill Niranjan Suri Date: January 2001 URL (compressed postscript): (1076KB) URL (PDF): (1160KB) Abstract: Mobile agents are programs that can jump from host to host in the network, at times and to places of their own choosing. Many groups have developed mobile-agent software platforms, and several mobile-agent applications. Experiments show that mobile agents can, among other things, lead to faster applications, reduced bandwidth demands, or less dependence on a reliable network connection. There are few if any studies of the scalability of mobile-agent servers, particularly as the number of clients grows. We present some recent performance and scalability experiments that compare three mobile-agent platforms with each other and with a traditional client/server approach. The experiments show that mobile agents often outperform client/server solutions, but also demonstrate the deep interaction between environmental and application parameters. The three mobile-agent platforms have similar behavior but their absolute performance varies with underlying implementation choices. Note: Revised version appeared in Mobile Agents 2001. See here. A simple bound on the expected height of a randomly built binary search tree Dartmouth Technical Report TR2001-387 Javed A. Aslam Date: Sometime 2001 Note: Abstract and paper lost. Applying the Vector Radix Method to Multidimensional, Multiprocessor, Out-of-Core Fast Fourier Transforms Dartmouth Technical Report TR2001-388 Michael F. Ringenburg Date: March 2001 URL (compressed postscript): (268KB) Abstract: We describe an efficient algorithm for calculating Fast Fourier Transforms on matrices of arbitrarily high dimension using the vector-radix method when the problem size is out-of-core (i.e., when the size of the data set is larger than the total available memory of the system). The algorithm takes advantage of multiple processors when they are present, but it is also efficient on single-processor systems. Our work is an extension of work done by Lauren Baptist in [Bapt99], which applied the vector-radix method to 2-dimensional out-of-core matrices. To determine the effectiveness of the algorithm, we present empirical results as well as an analysis of the I/O, communication, and computational complexity. We perform the empirical tests on a DEC 2100 server and on a cluster of Pentium-based Linux workstations. We compare our results with the traditional dimensional method of calculating multidimensional FFTs, and show that as the number of dimensions increases, the vector-radix-based algorithm becomes increasingly effective relative to the dimensional method. In order to calculate the complexity of the algorithm, it was necessary to develop a method for analyzing the interprocessor communication costs of the BMMC data-permutation algorithm (presented in [CSW98]) used by our FFT algorithms. We present this analysis method and show how it was derived. Note: Masters Thesis. Advisor: Tom Cormen. Improving a Brokering System for Linking Distributed Simulations Dartmouth Technical Report TR2001-389 Thomas B. Stephens Date: June 2001 URL (compressed postscript): (112KB) URL (PDF): (244KB) Abstract: The Agent Based Environment for Linking Simulations (ABELS) is a software framework designed to provide disparate simulations with dynamically updated data sources. It allows simulations and other agents to join a "cloud" of interacting producers and consumers of data. Once they have joined the cloud, they can publish services to other members and use methods published by others. This paper presents the initial design of a set of matchmaking components for the ABELS framework. These components dictate how services describe their abilities and requirements to ABELS. Furthermore, they help ABELS successfully match data producing services to the requests of data consuming clients. We begin by describing a system for a data producing service to describe itself to the ABELS cloud, as well as a corresponding system for a data consumer to describe its needs. We then describe in detail the three components that make up the ABELS matchmaking system: the match ranker, which ranks a data producer's ability to fill the request of a data consumer; the thesaurus, which helps the match ranker recognize closely related terms; and the unit database, which allows participants in the ABELS system to translate between related data units. We also discuss how these basic components can be built upon and improved in future versions of the ABELS framework. Note: Senior Honors Thesis. Advisor: Linda F. Wilson. Mobile Voice Over IP (MVOIP): An Application-level Protocol Dartmouth Technical Report TR2001-390 G. Ayorkor Mills-Tettey Date: June 2001 URL (compressed postscript): (1548KB) URL (PDF): (432KB) Abstract: Current Voice over Internet Protocol (VOIP) protocols require participating hosts to have fixed IP addresses for the duration of a VOIP call. When using a wireless-enabled host, such as a tablet computer on an 802.11 wireless network, it is possible for a participant in a VOIP call to roam around the network, moving from one subnet to another and needing to change IP addresses. This address change creates the need for mobility support in VOIP applications. We present the design of Mobile Voice over IP (MVOIP), an application-level protocol that enables such mobility in a VOIP application based on the ITU H.323 protocol stack. An MVOIP application uses hints from the surrounding network to determine that it has switched subnets. It then initiates a hand-off procedure that comprises pausing its current calls, obtaining a valid IP address for the current subnet, and reconnecting to the remote party with whom it was in a call. Testing the system shows that on a Windows 2000 platform there is a perceivable delay in the hand-off process, most of which is spent in the Windows API for obtaining DHCP addresses. Despite this bottleneck, MVOIP works well on a wireless network. Note: Senior Honors Thesis. Advisor: David Kotz. A Directory Infrastructure to Support Mobile Services Dartmouth Technical Report TR2001-391 Ammar Khalid Date: June 2001 URL (compressed postscript): (1672KB) URL (PDF): (368KB) Abstract: Traditional Voice-over-IP applications such as Microsoft NetMeeting assume that the user is on a machine with a fixed IP address. If, however, the user connects to the Internet, via a wireless network, on a handheld device, his IP address frequently changes as he moves from one subnet to another. In such a situation, we need a service that can be queried for the most current IP address of a person whom we wish to contact. In this project, we design and implement such a directory service. The service authenticates all callers and callees, is robust against most host failure, and scales to several thousand registered users. Note: Senior Honors Thesis. Advisor: David Kotz. SmartReminder: A Case Study on Context-Sensitive Applications Dartmouth Technical Report TR2001-392 Arun Mathias Date: June 2001 URL (compressed postscript): (464KB) URL (PDF): (404KB) Abstract: Designing context-sensitive applications is challenging. We design and implement SmartReminder to explore designing context-sensitive applications and to demonstrate how the SOLAR system can be used in developing such applications. SmartReminder is an application that reminds the user based on contextual information. Current appointment-reminder applications remind the user about their appointments at an arbitrarily specified time. For instance, they might remind the user ten minutes before each appointment. SmartReminder, on the other hand, uses contextual information, like location, to better estimate the appropriate reminder time for each appointment. It reminds the user based on where they are, where they need to be, and how long it will take them to get there. This paper presents SmartReminder as an illustration of how context-sensitive applications can be designed using the SOLAR system for dissemination of contextual information. Note: Senior Honors Thesis. Advisor: David Kotz. Measuring early usage of Dartmouth's wireless network Dartmouth Technical Report TR2001-393 Pablo Stern Date: June 2001 URL (compressed postscript): (1416KB) URL (PDF): (336KB) Abstract: In Spring 2001, Dartmouth College installed a campus-wide 802.11b wireless network. To understand how that network is used, we examined the usage characteristics of the network over a five-week period. We monitored access points to determine user behavior, and user and network traffic characteristics. Because our study coincided with the deployment of the access points, our analysis captures the growth of a wireless network. The results of this study help understand the behavior of mobile users and provide a reference to network engineers wishing to deploy and expand similar wireless networks. Note: Senior Honors Thesis. Advisor: David Kotz. An Empirical Study of Training and Testing Error in Boosting Dartmouth Technical Report TR2001-394 David D. Latham Date: June 2001 URL (compressed postscript): (464KB) URL (PDF): (440KB) Abstract: Bounds have been proven for both training and testing error for the boosting algorithm AdaBoost, but in practice neither seem to produce a particularly tight bound. In this paper we share some observations of these bounds from empirical results, and then explore some properties of the algorithm with an eye towards finding an improved bound for the performance of AdaBoost. Based on our empirical evidence, the error of a hypothesis which labels examples probabilistically based upon the confidence of the vote of the weak hypotheses forms a tighter bound for the training error. Note: Senior Honors Thesis. Advisor: Jay Aslam. An Implementation of Object-Oriented Program Transformation for Thought-Guided Debugging Dartmouth Technical Report TR2001-395 Tiffany M. Wong Date: June 2001 URL (compressed postscript): (36KB) URL (PDF): (60KB) Abstract: This paper presents our design and implementation of program transformation for C++ that will be used in the context of a thought-guided debugging system. The program uses a lexical analyzer written in Flex and a grammar written in Bison that work in conjunction to scan the inputted C++ code for function definitions and class definitions. The code is then transformed to produce trace information for each defined function, while the original functionality of the code is left untouched. We also implement two additional data structures that are used for information storage during the course of the program. Note: Senior Honors Thesis. Advisor: Tom Cormen Implementing a Database Information System for an Electronic Baseball Scorecard Dartmouth Technical Report TR2001-396 Tiffany M. Wong Date: June 2001 URL (compressed postscript): (140KB) URL (PDF): (104KB) Abstract: We present our design and implementation of a database system of information storage and retrieval for an electronic baseball scorecard. The program uses the relational MySQL database to hold information and a Tcl API to handle interactions between the database and the user interface code. This paper discusses the inner workings of how information storage was broken down inside the database, how queries were internally constructed in accordance with the user's input, and how statistics for players and teams were calculated and returned to the user. Finally, we discuss some limitations attached to our current implementation of the program and propose improvements that can be made in future versions. Note: Senior Honors Thesis. Advisor: Tom Cormen Supporting Adaptive Ubiquitous Applications with the SOLAR System Dartmouth Technical Report TR2001-397 Guanling Chen David Kotz Date: May 2001 URL (compressed postscript): (212KB) URL (PDF): (240KB) Abstract: As we embed more computers into our daily environment, ubiquitous computing promises to make them less noticeable and help to prevent information overload. We see, however, few ubiquitous applications that are able to adapt to the dynamics of user, physical, and computational context. We believe that there are two challenges causing this lack of ubiquitous applications: there is no flexible and scalable way to support information collection and dissemination in a ubiquitous and mobile environment, and there is no general approach to building adaptive applications given heterogeneous contextual information. We propose a system infrastructure, Solar, to meet these challenges. Solar uses a subscription-based operator graph abstraction and allows dynamic composition of stackable operators to manage ubiquitous information sources. After developing a set of diverse adaptive applications, we expect to identify fundamental techniques for context-aware adaptation. Our expectation is that Solar's end-to-end support for information collection, dissemination, and utilization will make it easy to build adaptive applications for a ubiquitous mobile environment with many users and devices. A System for Audio Personalization with Applications on Wireless Devices Dartmouth Technical Report TR2001-398 David Marmaros Date: June 2001 URL (PDF): (764KB) Abstract: We present and analyze a system for dynamically tailoring discrete audio content for numerous users based on aggregate data and intuitive feedback mechanisms. The framework for this system utilizes a flexible client-server architecture to facilitate audio dissemination, with particular attention to distribution over wireless networks. We discuss the requirements and specifications of such a system. We further analyze the algorithms and protocols required for its operation. Finally, we outline and provide data from a demonstration of this application. Note: Senior Honors Thesis. Advisors: David Kotz and Daniela Rus. WebALPS Implementation and Performance Analysis: Using Trusted Co-servers to Enhance Privacy and Security of Web Interactions Dartmouth Technical Report TR2001-399 Shan Jiang Date: June 2001 URL (compressed postscript): (272KB) URL (PDF): (360KB) Abstract: The client-server model of the Web poses a fundamental trust issue: clients are forced to trust in secrecy and correctness of computation occurring at a remote server of unknown credibility. The current solution for this problem is to use a PKI (Public Key Infrastructure) system and SSL (Secure Sockets Layer) digital certificates to prove the claimed identity of a server and establish an authenticated, encrypted channel between the client and this server. However, this approach does not address the security risks posed by potential malicious server operators or any third parties who may penetrate the server sites. The WebALPS (Web Applications with Lots of Privacy and Security) approach is proposed to address these weaknesses by moving sensitive computations at server side into trusted co-servers running inside high-assurance secure coprocessors. In this report, we examine the foundations of the credibility of WebALPS co-servers. Then we will describe our work of designing and building a prototype WebALPS co-server, which is integrated into the widely-deployed, commercial-grade Apache server. We will also present the performance test results of our system which support the argument that WebALPS approach provides a systematic and practical way to address the remote trust issue. Note: Master's thesis. An Armored Data Vault Dartmouth Technical Report TR2001-400 Alex Iliev Date: May 2001 URL (PDF): (256KB) Abstract: We consider the problem of secure long-term archiving of network traffic, an instance of the problem of storing data securely. We approach the problem using secure hardware, which enables the enforcement of flexible access policy. The policy cannot be circumvented by anyone, even insiders, and so we are assured that access to the data is as originally intended. The policy can be expressed as any feasible computation, as it will be checked inside the secure hardware without possibility of interference. We discuss our design of a device to perform such network data archiving and have implemented a prototpe device. We discuss other possible application areas of the design. Note: Senior Honors Thesis. Advisor: Sean Smith. Outbound Authentication for Programmable Secure Coprocessors Dartmouth Technical Report TR2001-401 Sean W. Smith Date: March 2001 URL (compressed postscript): (84KB) URL (PDF): (120KB) Abstract: A programmable secure coprocessor platform can help solve many security problems in distributed computing. These solutions usually require that coprocessor applications be able to participate as full-fledged parties in distributed cryptographic protocols. Thus, to fully enable these solutions, a generic platform must not only provide programmability, maintenance, and configuration in the hostile field---it must also provide outbound authentication for the entities that result. A particular application on a particular untampered device must be able to prove who it is to a party on the other side of the Internet. To be effective, a secure outbound authentication service must closely mesh with the overall security architecture. Our initial architecture only sketched a rough design for this service, and did not complete it. This paper presents our research and development experience in refining and implementing this design, to provide PKI-based outbound authentication for the IBM 4758 Model 2 secure coprocessor platform. Optimizing the Dimensional Method for Performing Multidimensional, Multiprocessor, Out-of-Core FFTs Dartmouth Technical Report TR2001-402 Jeremy T. Fineman Date: June 2001 URL (compressed postscript): (240KB) URL (PDF): (296KB) Abstract: We present an improved version of the Dimensional Method for computing multidimensional Fast Fourier Transforms (FFTs) on a multiprocessor system when the data consist of too many records to fit into memory. Data are spread across parallel disks and processed in sections. We use the Parallel Disk Model for analysis. The simple Dimensional Method performs the 1-dimensional FFTs for each dimension in term. Between each dimension, an out-of-core permutation is used to rearrange the data to contiguous locations. The improved Dimensional Method processes multiple dimensions at a time. We show that determining an optimal sequence and groupings of dimensions is NP-complete. We then analyze the effects of two modifications to the Dimensional Method independently: processing multiple dimensions at one time, and processing single dimensions in a different order. Finally, we show a lower bound on the I/O complexity of the Dimensional Method and present an algorithm that is approximately asymptotically optimal. Note: Senior Honors Thesis. Advisor: Tom Cormen. EcomRISK.org : A site to classify and organize the risks of performing business on the Internet Dartmouth Technical Report TR2001-403 Aidan S. Marcuss Date: June 2001 URL (compressed postscript): (372KB) URL (PDF): (328KB) Abstract: As the use of the Internet and other computer networks to transact business grows, there is an ever increasing need for those taking part in those transactions to understand the risks of doing so. While there are many web sites that have created valuable databases of specific vulnerabilities for certain types of hardware and software, there is a lack of focus on attempting to analyze the interaction of businesses, their systems, computer networks, and their customers and the risks that are created by either intended or unattended interactions. EcomRISK.org is a web site that presents a clear taxonomy to classify these risks and provides other features to aid in the general discussion of e-commerce risk. The site, and the taxonomy at the center of it, creates a database of these incidents so they can be clearly searched. This paper discusses the creation of EcomRISK.org, from vision to birth. Note: Senior Honors Thesis. Advisor: Fillia Makedon. Efficient Compression of Generic Function Dispatch Tables Dartmouth Technical Report TR2001-404 Eric Kidd Date: June 2001 URL (compressed postscript): (200KB) URL (PDF): (228KB) Abstract: A generic function is similar to an overloaded operator, but provides a way to select an appropriate behavior at run-time instead of compile-time. Dujardin and colleagues have proposed an algorithm for building and compressing generic function dispatch tables. We present several modifications to their algorithm, including an improvement to Pseudo-Closest-Poles and two new algorithms for compressing pole tables. The two new compression algorithms are simple and fast, and one produces smaller output than the original. Note: Senior Honors Thesis. Advisor: Chris Hawblitzel. DaSSFNet: An Extension to DaSSF for High-Performance Network Modeling Dartmouth Technical Report TR2001-405 Mehmet Iyigun Date: June 2001 URL (compressed postscript): (1184KB) URL (PDF): (396KB) Abstract: Scalable Simulation Framework (SSF) is a discrete-event simulation framework providing a unified programming interface geared towards network simulation. Dartmouth SSF (DaSSF) is a C++ implementation of SSF, designed for simulating very large-scale multi-protocol communication networks. As of the latest release, DaSSF lacks many features present in SSF and this prevents it from achieving mainstream use. To alleviate this shortcoming we designed and implemented DaSSFNet which extends DaSSF to the levels of functionality found in SSF. In this paper, we show that DaSSFNet and SSFNet are identical in operation given the same input. We also show that DaSSFNet is about twice as fast and has one third the memory consumption of SSFNet when simulating identical networks. Therefore, we argue, that the DaSSF simulation package with DaSSFNet now offers a viable alternative to SSF in high-performance network simulation. Note: Senior Honors Thesis. Advisor: David M. Nicol Fastab: Solving the Pitch to Notation Problem Dartmouth Technical Report TR2001-406 Jeremy I. Robin Date: May 2001 URL (compressed postscript): (1216KB) URL (PDF): (180KB) Abstract: I have always been frustrated with the length of time necessary to notate a piece of music. Computers have simplified so many other aspects of our lives, it seems that they should be able to simplify this task as well. In fact, there are already two distinct ways that engineers have attempted to attack this problem. The first analyzes the waveform generated by microphone input and relies on Fourier Analysis and other similar methods. The other examines the analog signal generated by a electric guitar-like pickup placed beneath the strings. The method used by Fastab relies much less on the musical properties of an instrument. Instead, Fastab records where and when the fingers and pick contact the instrument using digital electronics and microprocessor technology. Fastab provides a solution to the pitch to notation problem which is cheaper and more accurate than any other device available today. Note: Senior Honors Thesis. Advisor: Scot Drysdale TCP/IP Implementation within the Dartmouth Scalable Simulation Framework Dartmouth Technical Report TR2001-407 Michael G. Khankin Date: June 2001 URL (compressed postscript): (1600KB) URL (PDF): (484KB) Abstract: This paper discusses TCP/IP networking, and in particular, the DaSSF implementation of TCP/IP. The paper reviews the protocols, outlines the implementation design, and demonstrates some tests. In addition, some performance and memory usage analysis is performed. We find DaSSF TCP/IP to be a viable option to the existing SSF. DaSSF TCP/IP is faster and uses less memory so we can simulate larger, more complex, models. Note: Senior Honors Thesis. Advisor: David M. Nicol Market-based Control of Mobile-agent Systems Dartmouth Technical Report TR2001-408 Jonathan L. Bredin Date: June 2001 URL (compressed postscript): (704KB) URL (PDF): (984KB) Abstract: Modern distributed systems scatter sensors, storage, and computation throughout the environment. Ideally these devices communicate and share resources, but there is seldom motivation for a device's owner to yield control to another user. We establish markets for computational resources to motivate principals to share resources with arbitrary users, to enforce priority in distributed systems, to provide flexible and rational limitations on the potential of an application, and to provide a lightweight structure to balance the workload over time and between devices. As proof of concept, we implement a structure software agents can use to discover and negotiate access to networked resources. The structure separates discovery, authentication, and consumption enforcement as separate orthogonal issues to give system designers flexibility. Mobile agents represent informational and computational flow. We develop mechanisms that distributively allocate computation among mobile agents in two settings. The first models a situation where users collectively own networked computing resources and require priority enforcement. We extend the allocation mechanism to allow resource reservation to mitigate utility volatility. The second, more general model relaxes the ownership assumption. We apply our computational market to an open setting where a principal's chief concern is revenue maximization. Our simulations compare the performance of market-based allocation policies to traditional policies and relate the cost of ownership and consumption separation. We observe that our markets effectively prioritize applications' performance, can operate under uncertainty and network delay, provide metrics to balance network load, and allow measurement of market-participation risk versus reservation-based computation. In addition to allocation problems, we investigate resource selection to optimize execution time. The problem is NP-complete if the costs and latencies are constant. Both metrics' dependence on the chosen set complicates matters. We study how a greedy approach, a novel heuristic, and a shortest-constrained-path strategy perform in mobile-agent applications. Market-based computational-resource allocation fertilizes applications where previously there was a dearth of motive for or means of cooperation. The rationale behind mobile-agent performance optimization is also useful for resource allocation in general distributed systems where an application has a sequence of dependent tasks or when data collection is expensive. Note: A Ph.D thesis from the Department of Computer Science, Dartmouth College. Web Spoofing 2001 Dartmouth Technical Report TR2001-409 Yougu Yuan Eileen Zishuang Ye Sean W. Smith Date: July 2001 URL (compressed postscript): (780KB) URL (PDF): (492KB) Abstract: The Web is currently the pre-eminent medium for electronic service delivery to remote users. As a consequence, authentication of servers is more important than ever. Even sophisticated users base their decision whether or not to trust a site on browser cues---such as location bar information, SSL icons, SSL warnings, certificate information, response time, etc. In their seminal work on web spoofing, Felten et al showed how a malicious server could forge some of these cues---but using approaches that are no longer reproducible. However, subsequent evolution of Web tools has not only patched security holes---it has also added new technology to make pages more interactive and vivid. In this paper, we explore the feasibility of web spoofing using this new technology---and we show how, in many cases, every one of the above cues can be forged. In particular, we show how a malicious server can forge all the SSL information a client sees---thus providing a cautionary tale about the security of one of the most common applications of PKI. We stress that these techniques have been implemented, and are available for public demonstration. Securing Web Servers against Insider Attack Dartmouth Technical Report TR2001-410 Shan Jiang Sean Smith Kazuhiro Minami Date: July 2001 URL (compressed postscript): (116KB) URL (PDF): (120KB) Abstract: Too often, ``security of Web transactions'' reduces to ``encryption of the channel''---and neglects to address what happens at the server on the other end. This oversight forces clients to trust the good intentions and competence of the server operator---but gives clients no basis for that trust. Furthermore, despite academic and industrial research in secure coprocessing, many in the computer science community still regard ``secure hardware'' as a synonym for ``cryptographic accelerator.' This oversight neglects the real potential of COTS secure coprocessing technology to establish trusted islands of computation in hostile environments---such as at web servers with risk of insider attack. In this paper, we apply secure coprocessing and cryptography to solve this real problem in Web technology. We present a vision: using secure coprocessors to establish trusted co-servers at Web servers and moving sensitive computations inside these co-servers. We present a prototype implementation of this vision that scales to realistic workloads. Finally, we validate this approach by building a simple E-voting application on top of our prototype. From our experience, we conclude that this approach provides a practical and effective way to enhance the security of Web servers against insider attack. Write Once, Move Anywhere: Toward Dynamic Interoperability of Mobile Agent Systems Dartmouth Technical Report TR2001-411 Arne Grimstrup Robert S. Gray David Kotz Thomas Cowin Greg Hill Niranjan Suri Daria Chacon Martin Hofmann Date: July 2001 URL (compressed postscript): (204KB) URL (PDF): (220KB) Abstract: Mobile agents are an increasingly popular paradigm, and in recent years there has been a proliferation of mobile-agent systems. These systems are, however, largely incompatible with each other. In particular, agents cannot migrate to a host that runs a different mobile-agent system. Prior approaches to interoperability have tried to force agents to use a common API, and so far none have succeeded. Our goal, summarized in the catch phrase ``Write Once, Move Anywhere,'' led to our efforts to develop mechanisms that support dynamic runtime interoperability of mobile-agent systems. This paper describes the Grid Mobile-Agent System, which allows agents to migrate to different mobile-agent systems. Note: Revised July 25, 2001. Detecting Steganographic Messages in Digital Images Dartmouth Technical Report TR2001-412 Hany Farid Date: September 2001 URL (compressed postscript): (592KB) URL (PDF): (608KB) Abstract: Techniques and applications for information hiding have become increasingly more sophisticated and widespread. With high-resolution digital images as carriers, detecting the presence of hidden messages has also become considerably more difficult. It is sometimes possible, nevertheless, to detect (but not necessarily decipher) the presence of embedded messages. The basic approach taken here works by finding predictable higher-order statistics of ``natural'' images within a multi-scale decomposition, and then showing that embedded messages alter these statistics. Differential Elastic Image Registration Dartmouth Technical Report TR2001-413 Senthil Periaswamy Hany Farid Date: September 2001 URL (compressed postscript): (2144KB) URL (PDF): (1452KB) Abstract: We have applied techniques from differential motion estimation to the problem of automatic elastic registration of medical images. This method models the mapping between images as a locally affine but globally smooth warp. The mapping also explicitly accounts for variations in image intensities. This approach is simple and highly effective across a broad range of medical images. We show the efficacy of this approach on several synthetic and clinical images. Decentralized Control for Coordinated flow of Multi-Agent Systems Dartmouth Technical Report TR2002-414 Valentino Crespi George Cybenko Massimo Santini Daniela Rus Date: January 2002 URL (compressed postscript): (92KB) URL (PDF): (192KB) Abstract: This paper describes a distributed algorithm for coordinating the flow of a mass of vehicles approaching a highway exit or a tollbooth. We provide the problem formulation, a general methodology for distributed control and an instantiation of this methodology to the coordinated flow problem. We analyze our algorithm and provide experimental data. Future Directions for Mobile-Agent Research Dartmouth Technical Report TR2002-415 David Kotz Robert S. Gray Daniela Rus Date: January 2002 URL (compressed postscript): (1872KB) URL (PDF): (208KB) Abstract: During a discussion in September 2000 the authors examined the future of research on mobile agents and mobile code. (A mobile agent is a running program that can move from host to host in network at times and to places of its own choosing.) In this paper we summarize and reflect on that discussion. It became clear that the field should shift its emphasis toward mobile code, in all its forms, rather than to continue its narrow focus on mobile agents. Furthermore, we encourage the development of modular components, so that application designers may take advantage of code mobility without needing to rewrite their application to fit in a monolithic mobile-agent system. There are many potential applications that may productively use mobile code, but there is no ``killer application'' for mobile agents. Finally, we note that although security is an important and challenging problem, there are many applications and environments with security requirements well within the capability of existing mobile-code and mobile-agent frameworks. Virtual Hierarchies - An Architecture for Building and Maintaining Efficient and Resilient Trust Chains. Dartmouth Technical Report TR2002-416 John C. Marchesini Sean W. Smith Date: February 2002 URL (compressed postscript): (388KB) URL (PDF): (100KB) Abstract: In Public Key Infrastructure (PKI), the simple, monopolistic CA model works fine until we consider the real world. Then, issues such as scalability and mutually suspicious organizations create the need for a multiplicity of CAs, which immediately introduces the problem of how to organize them to balance resilience to compromise against efficiency of path discovery. However, security has given us tools such as secure coprocessing, secret splitting, secret sharing, and threshold cryptography for securely carrying out computations among multiple trust domains; distributed computing has given us peer-to-peer networking, for creating self-organizing distributed systems. In this paper, we use these latter tools to address the former problem by overlaying a virtual hierarchy on a mesh architecture of peer CAs, and achieving both resilience and efficiency. Web Spoofing Revisited: SSL and Beyond Dartmouth Technical Report TR2002-417 Eileen Ye Yougu Yuan Sean Smith Date: February 2002 URL (compressed postscript): (1436KB) URL (PDF): (288KB) Abstract: Can users believe what their browsers tell them? Even sophisticated Web users decide whether or not to trust a server based on browser cues such as location bar information, SSL icons, SSL warnings, certificate information, and response time. In their seminal work on Web spoofing, Felten et al showed how, in 1996, a malicious server could forge some of these cues. However, this work used genuine SSL sessions, and Web technology has evolved much since 1996. The Web has since become the pre-eminent medium for electronic service delivery to remote users, and the security of many commerce, government, and academic network applications critically rests on the assumption that users can authenticate the servers with which they interact. This situation raises the question: is the browser-user communication model today secure enough to warrant this assumption? In this paper, we answer this question by systematically showing how a malicious server can forge every one of the above cues. Our work extends the prior results by examining contemporary browsers, and by forging all of the SSL information a client sees, including the very existence of an SSL session (thus providing a cautionary tale about the security of one of the most common applications of PKI). We have made these techniques available for public demonstration, because anything less than working code would not convincingly answer the question. We also discuss implications and potential countermeasures, both short-term and long-term. Note: This TR supercedes TR2001-409. Trusted Paths for Browsers: An Open-Source Solution to Web Spoofing Dartmouth Technical Report TR2002-418 Eileen Ye Sean Smith Date: February 2002 URL (compressed postscript): (308KB) URL (PDF): (84KB) Abstract: The security of the vast majority of ``secure'' Web services rests on SSL server PKI. However, this PKI doesn't work if the the adversary can trick the browser into appearing to tell the user the wrong thing about the certificates and cryptography. The seminal web spoofing work of Felten et al demonstrated the potential, in 1996, for malicious servers to impersonate honest servers. Our recent follow-up work explicitly shows how malicious servers can still do this---and can also forge the existence of an SSL session and the contents of the alleged server certificate. This paper reports the results of our work to systematically defend against Web spoofing, by creating a trusted path from the browser to the user. Starting with the Mozilla source, we have implemented techniques that protect a wide variety of browser-user communications, that require little participation by the user and minimal disruption of the displayed server content. We have prepared shell scripts that install these modifications on the Mozilla source, to enable others to replicate this work. In on-going work, we are cleaning up and fine-tuning our code. In future work, we hope to examine more deeply the role of user interfaces in enabling users to make effective trust judgments. FFTs for the 2-Sphere - Improvements and Variations Dartmouth Technical Report TR2002-419 Dennis M. Healy Daniel N. Rockmore Peter J. Kostelec Sean S. B. Moore Date: March 2002 URL (compressed postscript): (1340KB) URL (PDF): (1340KB) Abstract: Earlier work by Driscoll and Healy has produced an efficient algorithm for computing the Fourier transform of band-limited functions on the 2-sphere. In this paper we present a reformulation and variation of the original algorithm which results in a greatly improved inverse transform, and consequent improved convolution algorithm for such functions. All require at most $O(N\log^2 N)$ operations where $N$ is the number of sample points. We also address implementation considerations and give heuristics for allowing reliable and computationally efficient floating point implementations of slightly modified algorithms. These claims are supported by extensive numerical experiments from our implementation in C on DEC, HP, SGI and Linux Pentium platforms. These results indicate that variations of the algorithm are both reliable and efficient for a large range of useful problem sizes. Performance appears to be architecture-dependent. The paper concludes with a brief discussion of a few potential applications. Note: Preliminary versions of some of these results have appeared in the Dartmouth College Department of Computer Science Technical Report PCS-TR94-222 and ``An FFT for the 2-sphere and Applications'', Proc. of ICASSP-96, Volume 3, pp. 1323--1326. Context Aggregation and Dissemination in Ubiquitous Computing Systems Dartmouth Technical Report TR2002-420 Guanling Chen David Kotz Date: February 2002 URL (compressed postscript): (104KB) URL (PDF): (92KB) Abstract: Many ``ubiquitous computing'' applications need a constant flow of information about their environment to be able to adapt to their changing context. To support these ``context-aware'' applications we propose a graph-based abstraction for collecting, aggregating, and disseminating context information. The abstraction models context information as events, produced by sources and flowing through a directed acyclic graph of event-processing operators and delivered to subscribing applications. Applications describe their desired event stream as a tree of operators that aggregate low-level context information published by existing sources into the high-level context information needed by the application. The operator graph is thus the dynamic combination of all applications' subscription trees. In this paper, we motivate and describe our graph abstraction, and discuss a variety of critical design issues. We also sketch our Solar system, an implementation that represents one point in the design space for our graph abstraction. Note: To appear in WMCSA 2002. Solar: A pervasive-computing infrastructure for context-aware mobile applications Dartmouth Technical Report TR2002-421 Guanling Chen David Kotz Date: February 2002 URL (compressed postscript): (352KB) URL (PDF): (96KB) Abstract: Emerging pervasive computing technologies transform the way we live and work by embedding computation in our surrounding environment. To avoid increasing complexity, and allow the user to concentrate on her tasks, applications must automatically adapt to their changing \emph{context}, the physical and computational environment in which they run. To support these ``context-aware'' applications we propose a graph-based abstraction for collecting, aggregating, and disseminating context information. The abstraction models context information as \emph{events}, which are produced by \emph{sources}, flow through a directed acyclic graph of event-processing \emph{operators}, and are delivered to subscribing applications. Applications describe their desired event stream as a tree of operators that aggregate low-level context information published by existing sources into the high-level context information needed by the application. The \emph{operator graph\/} is thus the dynamic combination of all applications' subscription trees. In this paper, we motivate our graph abstraction by discussing several applications under development, sketch the architecture of our system (``Solar'') that implements our abstraction, report some early experimental results from the prototype, and outline issues for future research. Controlling access to pervasive information in the ``Solar'' system Dartmouth Technical Report TR2002-422 Kazuhiro Minami David Kotz Date: February 2002 URL (compressed postscript): (360KB) URL (PDF): (144KB) Abstract: Pervasive-computing infrastructures necessarily collect a lot of context information to disseminate to their context-aware applications. Due to the personal or proprietary nature of much of this context information, however, the infrastructure must limit access to context information to authorized persons. In this paper we propose a new access-control mechanism for event-based context-distribution infrastructures. The core of our approach is based on a conservative information-flow model of access control, but users may express discretionary relaxation of the resulting access-control list (ACL) by specifying relaxation functions. This combination of automatic ACL derivation and user-specified ACL relaxation allows access control to be determined and enforced in a decentralized, distributed system with no central administrator or central policy maker. It also allows users to express their personal balance between functionality and privacy. Finally, our infrastructure allows access-control policies to depend on context-sensitive roles, allowing great flexibility. We describe our approach in terms of a specific context-dissemination framework, the Solar system, although the same principles would apply to systems with similar properties. Characterizing Usage of a Campus-wide Wireless Network Dartmouth Technical Report TR2002-423 David Kotz Kobby Essien Date: March 2002 URL (compressed postscript): (236KB) URL (PDF): (200KB) Abstract: Wireless local-area networks (WLANs) are increasingly common, but little is known about how they are used. A clear understanding of usage patterns in real WLANs is critical information to those who develop, deploy, and manage WLAN technology, as well as those who develop systems and application software for wireless networks. This paper presents results from the largest and most comprehensive trace of network activity in a large, production wireless LAN. For eleven weeks we traced the activity of nearly two thousand users drawn from a general campus population, using a campus-wide network of 476 access points spread over 161 buildings. Our study expands on those done by Tang and Baker, with a significantly larger and broader population. We found that residential traffic dominated all other traffic, particularly in residences populated by newer students; students are increasingly choosing a wireless laptop as their primary computer. Although web protocols were the single largest component of traffic volume, network backup and file sharing contributed an unexpectedly large amount to the traffic. Although there was some roaming within a network session, we were surprised by the number of situations in which cards roamed excessively, unable to settle on one access point. Cross-subnet roams were an especial problem, because they broke IP connections, indicating the need for solutions that avoid or accommodate such roams. Note: A major revision of this paper appeared in Mobicom 2002. The Mobicom paper contained two erroneous figures, however; another report, TR2002-432, is the final corrected version. Metasearch: Data Fusion for Document Retrieval Dartmouth Technical Report TR2002-424 Mark H. Montague Date: May 2002 URL (compressed postscript): (508KB) URL (PDF): (860KB) Abstract: The metasearch problem is to optimally merge the ranked lists output by an arbitrary number of search systems into one ranked list. In this work: (1) We show that metasearch improves upon not just the raw performance of the input search engines, but also upon the consistency of the input search engines from query to query. (2) We experimentally prove that simply weighting input systems by their average performance can dramatically improve fusion results. (3) We show that score normalization is an important component of a metasearch engine, and that dependence upon statistical outliers appears to be the problem with the standard technique. (4) We propose a Bayesian model for metasearch that outperforms the best input system on average and has performance competetive with standard techniques. (5) We introduce the use of Social Choice Theory to the metasearch problem, modeling metasearch as a democratic election. We adapt a positional voting algorithm, the Borda Count, to create a metasearch algorithm, acheiving reasonable performance. (6) We propose a metasearch model adapted from a majoritarian voting procedure, the Condorcet algorithm. The resulting algorithm is the best performing algorithm in a number of situations. (7) We propose three upper bounds for the problem, each bounding a different class of algorithms. We present experimental results for each algorithm using two types of experiments on each of four data sets. Note: Ph.D dissertation. The Future of Cryptography Under Quantum Computers Dartmouth Technical Report TR2002-425 Marco A. Barreno Date: July 2002 URL (compressed postscript): (152KB) URL (PDF): (240KB) Abstract: Cryptography is an ancient art that has passed through many paradigms, from simple letter substitutions to polyalphabetic substitutions to rotor machines to digital encryption to public-key cryptosystems. With the possible advent of quantum computers and the strange behaviors they exhibit, a new paradigm shift in cryptography may be on the horizon. Quantum computers could hold the potential to render most modern encryption useless against a quantum-enabled adversary. The aim of this thesis is to characterize this convergence of cryptography and quantum computation. We provide definitions for cryptographic primitives that frame them in general terms with respect to complexity. We explore the various possible relationships between BQP, the primary quantum complexity class, and more familiar classes, and we analyze the possible implications for cryptography. Note: This paper was written as a senior honors thesis with advisor Sean W. Smith. Role Definition Language (RDL): A Language to Describe Context-Aware Roles Dartmouth Technical Report TR2002-426 Christopher P. Masone Date: May 2002 URL (compressed postscript): (116KB) URL (PDF): (88KB) Abstract: As wireless networks become more prevalent, a widening array of computational resources becomes available to the mobile user. Since not all users should have unrestricted access to these resources, a method of access control must be devised. In a context-aware environment, context information can be used to supplement more conventional password-based access control systems. We believe the best way to achieve this is through the use of Context-Aware Role-Based Access Control, a model in which permissions are assigned to entities called roles, each principal is a member of one or more roles, and a role's membership is determined using context information. We designed and implemented RDL (Role-Definition Language), a simple, expressive and somewhat extensible programming language to facilitate the description of roles in terms of context information. Note: Senior Honors Thesis. Advisor: David Kotz. Performance and Interoperability In Solar Dartmouth Technical Report TR2002-427 A. Abram White Date: June 2002 URL (compressed postscript): (988KB) URL (PDF): (356KB) Abstract: Ubiquitous computing promises to integrate computers into our physical environment, surrounding us with applications that are able to adapt to our dynamics. Solar is a software infrastructure designed to deliver contextual information to these applications. To serve the large number and wide variety of context-aware devices envisioned by ubiquitous computing, Solar must exhibit both high performance and the ability to interoperate with many computing platforms. We created a testing framework to measure the performance of distributed systems such as Solar, as well as a pluggable data-transfer mechanism to support the dissemination of information to heterogeneous applications. This paper explores the testing framework developed, analyzes its findings concerning the performance of the current Solar prototype, presents several optimizations to Solar and their effects, and finally discusses the design of the pluggable data-transfer mechanism. Note: Senior Honors Thesis. Advisor: David Kotz. See also TR2002-429. Information-theoretic Bounds on the Training and Testing Error of Boosting Dartmouth Technical Report TR2002-428 Sebastien M. Lahaie Date: May 2002 URL (compressed postscript): (116KB) URL (PDF): (252KB) Abstract: Boosting is a means of using weak learners as subroutines to produce a strong learner with markedly better accuracy. Recent results showing the connection between logistic regression and boosting provide the foundation for an information-theoretic analysis of boosting. We describe the analogy between boosting and gambling, which allows us to derive a new upper bound on training error. This upper bound explicitly describes the effect of noisy data on training error. We also use information-theoretic techniques to derive an alternative upper-bound on testing error which is independent of the size of the weak-learner space. XSLT and XQuery as Operator Languages Dartmouth Technical Report TR2002-429 A. Abram White Date: June 2002 URL (compressed postscript): (60KB) URL (PDF): (92KB) Abstract: Ubiquitous computing promises to integrate computers into our physical environment, surrounding us with applications that are able to adapt to our dynamics. Solar is a software infrastructure designed to deliver contextual information to these applications. Solar represents context data as events, and uses small programs called operators to filter, merge, aggregate, or transform event streams. This paper explores the possibility of using XSLT and XQuery to build language-neutral Solar operators. Note: See also TR2002-427. Building Trusted Paths for Web Browsers Dartmouth Technical Report TR2002-430 Eileen Zishuang Ye Date: May 2002 URL (compressed postscript): (404KB) URL (PDF): (416KB) Abstract: The communication between the Web browser and the human user is one component of the server-client channel. It is not the user but the browser that receives all server information and establishes the secure connection. The browser's user interface signals, such as SSL lock, https protocol header et al., indicate whether the browser-server communication at the current moment is secure. Those user interface signals indicating the security status of browser should be clearly and correctly understood by the user. A survey of modern Web browsers shows the information provided by current browsers is insufficient for users to make trust judgment. Our Web spoofing work further proved that the browser status information is not reliable either. We discuss the criteria for and how to build the trusted paths between a browser and a human user. We present an open source implementation of one of the designs--synchronized random dynamic (SRD) boundary, based on Modified Mozilla source code, together with its usability study results. Analysis of Protein Sequences Using Time Frequency and Kolmogorov-Smirnov Methods Dartmouth Technical Report TR2002-431 Kobby Essien Date: May 2002 URL (compressed postscript): (7216KB) URL (PDF): (8356KB) Abstract: The plethora of genomic data currently available has resulted in a search for new algorithms and analysis techniques to interpret genomic data. In this two-fold study we explore techniques for locating critical amino acid residues in protein sequences and for estimating the similarity between proteins. We demonstrate the use of the Short-Time Fourier Transform and the Continuous Wavelet Transform together with amino acid hydrophobicity in locating important amino acid domains in proteins and also show that the Kolmogorov-Smirnov statistic can be used as a metric of protein similarity. Note: Senior Honors Thesis. Advisor: Metin Akay.
Analysis of a Campus-wide Wireless Network Dartmouth Technical Report TR2002-432 David Kotz Kobby Essien Date: September 2002 URL (compressed postscript): (568KB) URL (PDF): (176KB) Abstract: Understanding usage patterns in wireless local-area networks (WLANs) is critical for those who develop, deploy, and manage WLAN technology, as well as those who develop systems and application software for wireless networks. This paper presents results from the largest and most comprehensive trace of network activity in a large, production wireless LAN. For eleven weeks we traced the activity of nearly two thousand users drawn from a general campus population, using a campus-wide network of 476 access points spread over 161 buildings. Our study expands on those done by Tang and Baker, with a significantly larger and broader population. We found that residential traffic dominated all other traffic, particularly in residences populated by newer students; students are increasingly choosing a wireless laptop as their primary computer. Although web protocols were the single largest component of traffic volume, network backup and file sharing contributed an unexpectedly large amount to the traffic. Although there was some roaming within a network session, we were surprised by the number of situations in which cards roamed excessively, unable to settle on one access point. Cross-subnet roams were an especial problem, because they broke IP connections, indicating the need for solutions that avoid or accommodate such roams. Note: This paper is a revision of the MOBICOM '02 paper by the same title. The only difference is the correction of Figures 27-28 and the associated text. This report supplants TR2002-423. Using the Emulab network testbed to evaluate the Armada I/O framework for computational grids Dartmouth Technical Report TR2002-433 Ron Oldfield David Kotz Date: September 2002 URL (compressed postscript): (160KB) URL (PDF): (92KB) Abstract: This short report describes our experiences using the Emulab network testbed at the University of Utah to test performance of the Armada framework for parallel I/O on computational grids. Probabilistic Disease Classification of Expression-Dependent Proteomic Data from Mass Spectrometry of Human Serum Dartmouth Technical Report TR2002-434 Ryan H. Lilien Hany Farid Bruce R. Donald Date: October 2002 URL (compressed postscript): (1180KB) URL (PDF): (4492KB) Abstract: We have developed an algorithm called Q5 for probabilistic classification of healthy vs. disease whole serum samples using mass spectrometry. The algorithm employs Principal Components Analysis (PCA) followed by Linear Discriminant Analysis (LDA) on whole spectrum Surface-Enhanced Laser Desorption/Ionization Time of Flight (SELDI-TOF) Mass Spectrometry (MS) data, and is demonstrated on four real datasets from complete, complex SELDI spectra of human blood serum. Q5 is a closed-form, exact solution to the problem of classification of complete mass spectra of a complex protein mixture. Q5 employs a novel probabilistic classification algorithm built upon a dimension-reduced linear discriminant analysis. Our solution is computationally efficient; it is non-iterative and computes the optimal linear discriminant using closed-form equations. The optimal discriminant is computed and verified for datasets of complete, complex SELDI spectra of human blood serum. Replicate experiments of different training/testing splits of each dataset are employed to verify robustness of the algorithm. The probabilistic classification method achieves excellent performance. We achieve sensitivity, specificity, and positive predictive values above 97% on three ovarian cancer datasets and one prostate cancer dataset. The Q5 method outperforms previous full-spectrum complex sample spectral classification techniques, and can provide clues as to the molecular identities of differentially-expressed proteins and peptides. Note: To appear in Journal of Computational Biology (2003). Distributed Algorithms for Guiding Navigation across a Sensor Network Dartmouth Technical Report TR2002-435 Qun Li Michael De Rosa Daniela Rus Date: October 2002 URL (compressed postscript): (3504KB) URL (PDF): (380KB) Abstract: We develop distributed algorithms for self-reconfiguring sensor networks that respond to directing a target through a region. The sensor network models the danger levels sensed across its area and has the ability to adapt to changes. It represents the dangerous areas as obstacles. A protocol that combines the artificial potential field of the sensors with the goal location for the moving object guides the object incrementally across the network to the goal, while maintaining the safest distance to the danger areas. We report on hardware experiments using a physical sensor network consisting of Mote sensors. Heterogeneous Self-Reconfiguring Robotics Dartmouth Technical Report TR2002-436 Robert C. Fitch Date: November 2002 URL (compressed postscript): (9540KB) URL (PDF): (976KB) Abstract: Self-reconfiguring robots are modular systems that can change shape, or "reconfigure," to match structure to task. They comprise many small, discrete, often identical modules that connect together and that are minimally actuated. Global shape transformation is achieved by composing local motions. Systems with a single module type, known as "homogeneous" systems, gain fault tolerance, robustness and low production cost from module interchangeability. However, we are interested in "heterogeneous" systems, which include multiple types of modules such as those with sensors, batteries or wheels. We believe that heterogeneous systems offer the same benefits as homogeneous systems with the added ability to match not only structure to task, but also capability to task. Although significant results have been achieved in understanding homogeneous systems, research in heterogeneous systems is challenging as key algorithmic issues remain unexplored. We propose in this thesis to investigate questions in four main areas: 1) how to classify heterogeneous systems, 2) how to develop efficient heterogeneous reconfiguration algorithms with desired characteristics, 3) how to characterize the complexity of key algorithmic problems, and 4) how to apply these heterogeneous algorithms to perform useful new tasks in simulation and in the physical world. Our goal is to develop an algorithmic basis for heterogeneous systems. This has theoretical significance in that it addresses a major open problem in the field, and practical significance in providing self-reconfiguring robots with increased capabilities. Proofs of Soundness and Strong Normalization for Linear Memory Types Dartmouth Technical Report TR2002-437 Heng Huang Chris Hawblitzel Date: November 2002 URL (compressed postscript): (280KB) URL (PDF): (276KB) Abstract: Efficient low-level systems need more control over memory than safe high-level languages usually provide. As a result, run-time systems are typically written in unsafe languages such as C. This report describes an abstract machine designed to give type-safe code more control over memory. It includes complete definitions and proofs. Exact formulae for the Lovasz Theta Function of sparse Circulant Graphs Dartmouth Technical Report TR2002-438 Valentino Crespi Date: November 2002 URL (compressed postscript): (160KB) URL (PDF): (296KB) Abstract: The Lovasz theta function has attracted a lot of attention for its connection with diverse issues, such as communicating without errors and computing large cliques in graphs. Indeed this function enjoys the remarkable property of being computable in polynomial time, despite being sandwitched between clique and chromatic number, two well known hard to compute quantities. In this paper I provide a closed formula for the Lovasz function of a specific class of sparse circulant graphs thus generalizing Lovasz results on cycle graphs (circulant graphs of degree 2). 3D-Structural Homology Detection via Unassigned Residual Dipolar Couplings Dartmouth Technical Report TR2003-439 Chris J. Langmead Bruce R. Donald Date: January 2003 URL (PDF): (284KB) Abstract: Recognition of a protein's fold provides valuable information about its function. While many sequence-based homology prediction methods exist, an important challenge remains: two highly dissimilar sequences can have similar folds --- how can we detect this rapidly, in the context of structural genomics? High-throughput NMR experiments, coupled with novel algorithms for data analysis, can address this challenge. We report an automated procedure for detecting 3D-structural homologies from sparse, unassigned protein NMR data. Our method identifies the 3D-structural models in a protein structural database whose geometries best fit the unassigned experimental NMR data. It does not use sequence information and is thus not limited by sequence homology. The method can also be used to confirm or refute structural predictions made by other techniques such as protein threading or sequence homology. The algorithm runs in O(pnk3) time, where p is the number of proteins in the database, n is the number of residues in the target protein, and k is the resolution of a rotation search. The method requires only uniform 15N-labelling of the protein and processes unassigned 1H-15N residual dipolar couplings, which can be acquired in a couple of hours. Our experiments on NMR data from 5 different proteins demonstrate that the method identifies closely related protein folds, despite low-sequence homology between the target protein and the computed model. Note: A revised and expanded version of this TR will appear as a refereed paper at the IEEE Computer Society Bioinformatics Conference , Stanford, California (August, 2003), Efficient Security for BGP Route Announcements Dartmouth Technical Report TR2003-440 David M. Nicol Sean W. Smith Meiyuan Zhao Date: May 2003 URL (compressed postscript): (76KB) URL (PDF): (136KB) Abstract: The Border Gateway Protocol (BGP) determines how Internet traffic is routed throughout the entire world; malicious behavior by one or more BGP speakers could create serious security issues. Since the protocol depends on a speaker honestly reporting path information sent by previous speakers and involves a large number of independent speakers, the Secure BGP (S-BGP) approach uses public-key cryptography to ensure that a malicious speaker cannot fabricate this information. However, such public-key cryptography is expensive: S-BGP requires a digital signature operation on each announcement sent to each peer, and a linear (in the length of the path) number of verifications on each receipt. We use simulation of a 110 AS system derived from the Internet to evaluate the impact that the processing costs of cryptography have on BGP convergence time. We find that under heavy load the convergence time using ordinary S-BGP is nearly twice as large as under BGP. We examine the impact of highly aggressive caching and pre-computation optimizations for S-BGP, and find that convergence time is much closer to BGP. However, these optimizations may be unrealistic, and are certainly expensive of memory. We consequently use the structure of BGP processing to design optimizations that reduce cryptographic overhead by amortizing the cost of private-key signatures over many messages. We call this method Signature-Amortization (S-A). We find that S-A provides as good or better convergence times as the highly optimized S-BGP, but without the cost and complications of caching and pre-computation. It is possible therefore to minimize the impact route validation has on convergence, by being careful with signatures, rather than consumptive of memory. Note: Revision 2 released May 9, 2003. Original revision 1, of February 2003, is available in pdf or ps.Z. Flexible and Scalable Public Key Security for SSH Dartmouth Technical Report TR2003-441 Yasir Ali S. W. Smith Date: February 2003 URL (PDF): (616KB) Abstract: A standard tool for secure remote access, the SSH protocol uses public-key cryptography to establish an encrypted and integrity-protected channel with a remote server. However, widely-deployed implementations of the protocol are vulnerable to man-in-the-middle attacks, where an adversary substitutes her public key for the server's. This danger particularly threatens a traveling user Bob borrowing a client machine. Imposing a traditional X.509 PKI on all SSH servers and clients is neither flexible nor scalable nor (in the foreseeable future) practical. Requiring extensive work or an SSL server at Bob's site is also not practical for many users. This paper presents our experiences designing and implementing an alternative scheme that solves the public-key security problem in SSH without requiring such an a priori universal trust structure or extensive sysadmin work--although it does require a modified SSH client. (The code is available for public download.) Note: A revised version, published as a conference paper, as follows:
Y. Ali, S.W. Smith.
"Flexible and Scalable Public Key Security for SSH."
Public Key Infrastructure: EuroPKI 2004, pp. 43-56,
LNCS 3093, June 2004. Springer-Verlag. DOI 10.1007/b98201.
http://dx.doi.org/10.1007/b98201
http://www.springerlink.com/content/h8jng6kc1rf3j97g/ Privacy-enhanced credential services Dartmouth Technical Report TR2003-442 Alex Iliev Sean Smith Date: February 2003 URL (compressed postscript): (140KB) URL (PDF): (296KB) Abstract: The use of credential directories in PKI and authorization systems such as Shibboleth introduces a new privacy risk: an insider at the directory can learn much about otherwise protected interactions by observing who makes queries, and what they ask for. Recent advances in Practical Private Information Retrieval provide promising countermeasures. In this paper, we extend this technology to solve this new privacy problem, and present a design and preliminary prototype for a LDAP-based credential service that can prevent even an insider from learning anything more than the fact a query was made. Our preliminary performance analysis suggests that the complete prototype may be sufficiently robust for academic enterprise settings. Note: Submitted to the 2nd Annual PKI Research Workshop. Keyjacking: Risks of the Current Client-side Infrastructure Dartmouth Technical Report TR2003-443 John C. Marchesini Sean W. Smith Meiyuan Zhao Date: February 2003 URL (compressed postscript): (64KB) URL (PDF): (108KB) Abstract: In theory, PKI can provide a flexible and strong way to authenticate users in distributed information systems. In practice, much is being invested in realizing this vision via client-side SSL and browser-based keystores. Exploring this vision, we demonstrate that browsers will use personal certificates to authenticate requests that the person neither knew of nor approved (and which password-based systems would have defeated), and we demonstrate the easy permeability of these keystores (including new attacks on medium and high-security IE/XP keys). We suggest some countermeasures, but also suggest that a fundamental rethinking of the trust, usage, and storage model might result in a more effective PKI. Stupid Columnsort Tricks Dartmouth Technical Report TR2003-444 Geeta Chaudhry Thomas H. Cormen Date: April 2003 URL (compressed postscript): (136KB) URL (PDF): (196KB) Abstract: Leighton's columnsort algorithm sorts on an $r \times s$ mesh, subject to the restrictions that $s$ is a divisor of~$r$ and that $r \geq 2s^2$ (so that the mesh is tall and thin). We show how to mitigate both of these restrictions. One result is that the requirement that $s$ is a divisor of~$r$ is unnecessary; columnsort sorts correctly whether or not $s$ divides~$r$. We present two algorithms that, as long as $s$ is a perfect square, relax the restriction that $r \geq 2s^2$; both reduce the exponent of~$s$ to~$3/2$. One algorithm requires $r \geq 4s^{3/2}$ if $s$ divides~$r$ and $r \geq 6s^{3/2}$ if $s$ does not divide~$r$. The other algorithm requires $r \geq 4^{3/2}$, and it requires $s$ to be a divisor of~$r$. Both algorithms have applications in increasing the maximum problem size in out-of-core sorting programs. Relaxing the Problem-Size Bound for Out-of-Core Columnsort Dartmouth Technical Report TR2003-445 Geeta Chaudhry Elizabeth A. Hamon Thomas H. Cormen Date: April 2003 URL (compressed postscript): (92KB) URL (PDF): (128KB) Abstract: Previous implementations of out-of-core columnsort limit the problem size to $N \leq \sqrt{(M/P)^3 / 2}$, where $N$ is the number of records to sort, $P$ is the number of processors, and $M$ is the total number of records that the entire system can hold in its memory (so that $M/P$ is the number of records that a single processor can hold in its memory). We implemented two variations to out-of-core columnsort that relax this restriction. Subblock columnsort is based on an algorithmic modification of the underlying columnsort algorithm, and it improves the problem-size bound to $N \leq (M/P)^{5/3} / 4^{2/3}$ but at the cost of additional disk I/O\@. $M$-columnsort changes the notion of the column size in columnsort, improving the maximum problem size to $N \leq \sqrt{M^3 / 2}$ but at the cost of additional computation and communication. Experimental results on a Beowulf cluster show that both subblock columnsort and $M$-columnsort run well but that $M$-columnsort is faster. A further advantage of $M$-columnsort is that it handles a wider range of problem sizes than subblock columnsort. Efficient and Practical Constructions of LL/SC Variables Dartmouth Technical Report TR2003-446 Prasad Jayanti Srdjan Petrovic Date: June 2003 URL (PDF): (316KB) Abstract: Over the past decade, LL/SC have emerged as the most suitable synchronization instructions for the design of lock-free algorithms. However, current architectures do not support these instructions; instead, they support either CAS or RLL/RSC (e.g. POWER4, MIPS, SPARC, IA-64). To bridge this gap, this paper presents two efficient wait-free algorithms for implementing 64-bit LL/SC objects from 64-bit CAS or RLL/RSC objects. Our first algorithm is practical: it has a small, constant time complexity (of 4 for LL and 5 for SC) and a space overhead of only 4 words per process. This algorithm uses unbounded sequence numbers. For theoretical interest, we also present a more complex bounded algorithm that still guarantees constant time complexity and O(1) space overhead per process. The LL/SC primitive is free of the well-known ABA problem that afflicts CAS. By efficiently implementing LL/SC words from CAS words, this work presents an efficient general solution to the ABA problem. Billiards Adviser as a Search in a Continuous Domain with Significant Uncertainty Dartmouth Technical Report TR2003-448 Thomas Mueller Date: May 2003 URL (PDF): (512KB) Abstract: Typical search algorithms are limited to problems in which there is a certain number of moves for any given state, and the effect of each move is well known. In order to overcome this limitation, we consider the problem of determining the optimal shot given the positions of balls on a billiards table. Our solution includes the image recognition necessary to determine each ball's position, the calculation of the optimal shot, and the presentation of that shot to the player. The focus of the paper is on the second part - determining the angle and force with which the player should attempt to hit the cue ball for each shot in order to sink all of the other balls with the fewest shots. The solution to this problem is unique from other game search algorithms in that it must take into account the infinite number of possible shots given any configuration of balls as well as the fact that the player is not likely to hit the ball exactly how he attempts to do so. We compare the performance of our algorithm with one that ignores the latter fact to show that our modifications do in fact improve performance for a search in a continuous domain with significant uncertainty. Note: Advisor: Zack Butler An Active Learning Approach to Efficiently Ranking Retrieval Engines Dartmouth Technical Report TR2003-449 Lisa A. Torrey Date: May 2003 URL (compressed postscript): (476KB) URL (PDF): (800KB) Abstract: Evaluating retrieval systems, such as those submitted to the annual TREC competition, usually requires a large number of documents to be read and judged for relevance to query topics. Test collections are far too big to be exhaustively judged, so only a subset of documents is selected to form the judgment ``pool.'' The selection method that TREC uses produces pools that are still quite large. Research has indicated that it is possible to rank the retrieval systems correctly using substantially smaller pools. This paper introduces an active learning algorithm whose goal is to reach the correct rankings using the smallest possible number of relevance judgments. It adds one document to the pool at a time, always trying to select the document with the highest information gain. Several variants of this algorithm are described, each with improvements on the one before. Results from experiments are included for comparison with the traditional TREC pooling method. The best version of the algorithm reliably outperforms the traditional method, although its degree of improvement varies. Note: Senior Honors Thesis. Advisor: Jay Aslam. An Evaluation of the Impact of Models for Radio Propagation on the Simulation of 802.11b Wireless Networks Dartmouth Technical Report TR2003-450 Evan W. Richardson Date: June 2003 URL (compressed postscript): (4644KB) URL (PDF): (2944KB) Abstract: Working with an existing wireless network simulator, we describe the addition of both a method for modeling arbitrary terrain, and for calculating signal attenuation with the Irregular Terrain Model (ITM). We also investigate ITM's effects on upper protocol layer in comparison to the Two-Ray Ground Reflection model. Upon examination, it was found that aside from the terrain between the transmitter and receiver, ITM's various parameters are of little significance in the computed signal attenuation. Further, examination of the behavior of the upper protocol layers revealed that at high traffic levels, choice of propagation model can have significant effects on the results of the simulation. Note: Senior Honors Thesis. Advisor: Felipe Perrone. 802.11b Wireless Network Visualization and Radiowave Propagation Modeling Dartmouth Technical Report TR2003-451 Chris Lentz Date: June 2003 URL (PDF): (2252KB) Abstract: This paper outlines the methods of creating detailed coverage maps of 802.11b networks, with an emphasis on minimizing the expenses and time involved. The goal of this work is to develop and present a streamlined, reproducible approach to wireless visualization as well as techniques for predicting coverage area before conducting network installations. After evaluating these coverage maps, a repeated series of field measurements will be checked against interpolated values in order to improve techniques for extrapolation of data for unsampled regions. If successful, these extrapolation techniques will provide additional guidelines for, and assist modeling of, new wireless network installations. However, this paper demonstrates that due to the microcellular structure of indoor/outdoor 802.11b networks, accurate interpolation and propagation prediction techniques do not exist independent of highly specific location models. In lieu of the creation of extensive simulation environments, best practice guidelines for municipal wireless network planning and deployment are presented. Note: Senior thesis. Advisor: David Kotz. An Analysis of Convergence Properties of the Border Gateway Protocol Using Discrete Event Simulation Dartmouth Technical Report TR2003-452 Brian J. Premore Date: May 2003 URL (compressed postscript): (844KB) URL (PDF): (2388KB) Abstract: The Internet is an enormous internetwork formed by connecting tens of thousands of independently managed computer networks. Though the Internet has no central authority and is highly heterogeneous, a universally adopted addressing scheme---defined by the Internet Protocol (IP)---makes interaction between the individual networks possible. Complementing IP is the Border Gateway Protocol (BGP), which facilitates communication between parts of the internetwork by determining paths by which data can get from one network to any other. Just as IP is used ubiquitously as an addressing scheme, BGP is used ubiquitously for the purpose of network-to-network routing. Because BGP is universal, its well-being is the concern of everyone. In other words, when BGP suffers, everyone suffers. Even when just one instance of BGP on one router is ill-behaved, it can have global effects. Unfortunately, as the Internet has grown, the amount of stress put on BGP has increased. For a long time, the behavior of inter-domain routing was studied minimally and was assumed to be working just fine. Research eventually showed, however, that routing was not actually functioning so smoothly, and the highly dynamic nature of the Internet was taking its toll on the routing infrastructure. This discovery prompted a closer look at the behavior of BGP. Though its underlying premise is a simple distributed shortest-path algorithm, the dynamic nature of the Internet, combined with some additional constraints in the protocol, has made analytical approaches to studying the protocol infeasible. Measurement-based approaches have been taken, but they are difficult to implement and have minimal leeway for allowing exploration of the protocol's behavior under different conditions. For these reasons we have taken the approach of simulation in order to begin to understand some of the complex ways in which BGP behaves. Simulation allows one to explore the protocol more fully, testing it under various conditions and modifying the protocol itself to explore the consequences of its fundamental design. We have studied BGP behavior with respect to several parameters, some external (network characteristics) and some internal (protocol characteristics). We show that there is room for improvement in the protocol, in particular with respect to convergence following changes in availability of an address in the network. The rate-limiting mechanism of the protocol is a particular parameter of concern. Although it was previously thought to help improve convergence, we found that in some cases it can have drastic degrading effects. As a result of our work, we suggest ways in which BGP could be modified in practice to reduce the instability of the protocol. Note: This is a Ph.D. thesis. It differs from the version of the thesis which appears in the Dartmouth College library in a couple of significant ways. First, it is single-spaced. Figures have moved around in order to accommodate this change. Second, it includes some corrections. The primary correction is that some bibliography entries were reordered in order to properly alphabetize them. This has the side effect that the numbered citations throughout the document are different in this version than in the original version. SPADE: SPKI/SDSI for Attribute Release Policies in a Distributed Environment Dartmouth Technical Report TR2003-453 Sidharth P. Nazareth Date: May 2003 URL (compressed postscript): (776KB) URL (PDF): (676KB) Abstract: Shibboleth is a federated administrated system that supports inter-institutional authentication and authorization for sharing of resources. SPKI/SDSI is a public key infrastructure whose creation was motivated by the perception that X.509 is too complex and flawed. This thesis addresses the problem of how users that are part of a Public Key Infrastructure in a distributed computing system can effectively specify, create, and disseminate their Attribute Release Policies for Shibboleth using SPKI/SDSI. This thesis explores existing privacy mechanims, as well as distributed trust management and policy based systems. My work describes the prototype for a Trust Management Framework called SPADE (SPKI/SDSI for Attribute Release Policies in a Distributed Environment) that I have designed, developed and implemented. The principal result of this research has been the demonstration that SPKI/SDSI is a viable approach for trust management and privacy policy specification, especially for minimalistic policies in a distributed environment. Note: M.S Thesis. Advisor: Sean Smith Discrete-Event Fluid Modeling of Background TCP Traffic Dartmouth Technical Report TR2003-454 David M. Nicol Guanhua Yan Date: June 2003 URL (compressed postscript): (2396KB) URL (PDF): (852KB) Abstract: TCP is the most widely used transport layer protocol used in the internet today. A TCP session adapts the demands it places on the network to observations of bandwidth availability on the network. Because TCP is adaptive, any model of its behavior that aspires to be accurate must be influenced by other network traffic. This point is especially important in the context of using simulation to evaluate some new network algorithm of interest (e.g. reliable multi-cast) in an environment where the background traffic affects---and is affected by---its behavior. We need to generate background traffic efficiently in a way that captures the salient features of TCP, while the reference and background traffic representations interact with each other. This paper describes a fluid model of TCP and a switching model that has flows represented by fluids interacting with packet-oriented flows. We describe conditions under which a fluid model produces exactly the same behavior as a packet-oriented model, and we quantify the performance advantages of the approach both analytically and empirically. We observe that very significant speedups may be attained while keeping high accuracy. Persistence and Prevalence in the Mobility of Dartmouth Wireless Network Users Dartmouth Technical Report TR2003-455 Clara E. Lee Date: May 200