%T Reducing Mass Degeneracy in SAR by MS by Stable Isotopic Labeling
%A Chris Bailey-Kellogg
%A John J. Kelley
%A Clifford Stein
%A Bruce Randall Donald
%R Technical Report TR2000-362
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-362.ps.Z
%X
Mass spectrometry (MS) promises to be an invaluable tool for
functional genomics, by supporting low-cost, high-throughput
experiments. However, large-scale MS faces the potential problem of
mass degeneracy -- indistinguishable masses for multiple
biopolymer fragments (e.g. from a limited proteolytic digest). This
paper studies the tasks of planning and interpreting MS experiments
that use selective isotopic labeling, thereby substantially reducing
potential mass degeneracy. Our algorithms support an
experimental-computational protocol called Structure-Activity
Relation by Mass Spectrometry (SAR by MS), for elucidating the
function of protein-DNA and protein-protein complexes. SAR by MS
enzymatically cleaves a crosslinked complex and analyzes the resulting
mass spectrum for mass peaks of hypothesized fragments. Depending on
binding mode, some cleavage sites will be shielded; the absence of
anticipated peaks implicates corresponding fragments as either part of
the interaction region or inaccessible due to conformational change
upon binding. Thus different mass spectra provide evidence for
different structure-activity relations. We address combinatorial and
algorithmic questions in the areas of data analysis
(constraining binding mode based on mass signature) and
experiment planning (determining an isotopic labeling strategy to
reduce mass degeneracy and aid data analysis). We explore the
computational complexity of these problems, obtaining upper and lower
bounds. We report experimental results from implementations of our
algorithms.
%Z
This report supersedes TR99-359.
To appear in the 8th
International Conference on Intelligent Systems for Molecular Biology
(August 20-23, 2000), La Jolla, CA (accepted; in press).
%T A Formal Semantics for SPKI
%A Jon Howell
%A David Kotz
%R Technical Report TR2000-363
%I Dartmouth College, Computer Science
%C Hanover, NH
%D March 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-363.ps.Z
%X
We extend the logic and semantics of authorization due to
Abadi, Lampson, et al. to support restricted delegation.
Our formal model provides a simple interpretation for the variety of
constructs in the Simple Public Key Infrastructure (SPKI), and
lends intuition about possible extensions. We discuss both extensions
that our semantics supports and extensions that it cautions against.
%Z
This TR supersedes TR1999-361.
This technical report is an extended version of a paper submitted to
ESORICS 2000.
For more information, see the project web page.
%T Landmarks for absolute localization
%A Jon Howell
%A Keith Kotay
%R Technical Report TR2000-364
%I Dartmouth College, Computer Science
%C Hanover, NH
%D March 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-364.ps.Z
%X
For certain experiments in mobile robotics, it is convenient to eliminate
positional estimation error in the interest of analyzing other parts of
the experiment. We designed and implemented a simple, accurate scheme
for encoding and recovering absolute position information. The encoding
is a two-dimensional image printed on the plane of the floor, and the
absolute position information is recovered using a downward-looking
video camera mounted on a mobile robot.
%Z
This document describes work done in the Dartmouth Robotics Laboratory
in April of 1997 and August of 1999. It was previously ``published'' as
a web page, but we thought it would make sense to document it more
permanently.
%T Mobile Agents: Motivations and State-of-the-Art Systems
%A Robert S. Gray
%A David Kotz
%A George Cybenko
%A Daniela Rus
%R Technical Report TR2000-365
%I Dartmouth College, Computer Science
%C Hanover, NH
%D April 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-365.ps.Z
%X
A mobile agent is an executing program that can migrate, at times of its
own choosing, from machine to machine in a heterogeneous network. On
each machine, the agent interacts with stationary service agents and
other resources to accomplish its task. In this chapter, we first make
the case for mobile agents, discussing six strengths of mobile agents
and the applications that benefit from these strengths. Although none of
these strengths are unique to mobile agents, no
competing technique shares all six. In other words, a mobile-agent
system provides a single general framework in which a wide range of
distributed applications can be implemented efficiently and easily.
We then present a representative cross-section of current mobile-agent
systems.
%Z
This technical report will appear as a chapter in Jeffrey M. Bradshaw,
editor, Handbook of Agent Technology, AAAI/MIT Press, 2000. In Press.
%T Performance Analysis of Mobile Agents for Filtering Data Streams on Wireless Networks
%A David Kotz
%A Guofei Jiang
%A Robert S. Gray
%A George Cybenko
%A Ronald A. Peterson
%R Technical Report TR2000-366
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-366.ps.Z
%X
Wireless networks are an ideal environment for mobile agents, because
their mobility allows them to move across an unreliable link to reside
on a wired host, next to or closer to the resources they need to use.
Furthermore, client-specific data transformations can be moved across
the wireless link, and run on a wired gateway server, with the goal of
reducing bandwidth demands. In this paper we examine the tradeoffs
faced when deciding whether to use mobile agents to support a
data-filtering application, in which numerous wireless clients filter
information from a large data stream arriving across the wired
network. We develop an analytical model and use parameters from our
own experiments to explore the model's implications.
%Z
In August 2000 a revised version appeared in the International
Workshop on Modeling and Simulation of Wireless and Mobile Systems
(MSWiM 2000). In October 2000 a further revised version appeared as
Dartmouth Technical Report TR2000-377, and was submitted to the
journal Mobile Networks and Applications (ACM MONET).
%T Approximation Algorithms for the Minimum Bends Traveling Salesman Problem
%A Cliff Stein
%A David P. Wagner
%R Technical Report TR2000-367
%I Dartmouth College, Computer Science
%C Hanover, NH
%D April 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-367.ps.Z
%X
The problem of traversing a set of points in the order that minimizes
the total distance traveled (traveling salesman problem) is one of
the most famous and well-studied problems in combinatorial
optimization. It has many applications, and has been a testbed for
many of the most useful ideas in algorithm design and analysis. The
usual metric, minimizing the total distance traveled, is an important
one, but many other metrics are of interest.
In this paper, we introduce the metric of minimizing the number of
turns in the tour, given that the input points are in the Euclidean
plane. To our knowledge this metric has not been studied previously.
It is motivated by applications in robotics and in the movement of
other heavy machinery: for many such devices turning is an expensive
operation. We give approximation algorithms for several variants
of the traveling salesman problem for which the metric is to minimize
the number of turns. We call this the minimum bends traveling salesman
problem.
For the case of an arbitrary set of n points in the
Euclidean plane, we give an O(lg z)-approximation algorithm, where
z is the maximum number of collinear points. In the worst case z
can be as big as n, but z will often be much smaller. For the case
when the lines are restricted to being either horizontal or vertical,
we give a 2-approximation algorithm. If we have the further
restriction that no two points are allowed to have the same x- or
y-coordinate, we give an algorithm that finds a tour which makes at
most two turns more than the optimal tour. Thus we have an
approximation algorithm with an additive, rather than a multiplicative
error bound. Beyond the additive error bound, our algorithm for this
problem introduces several interesting algorithmic techniques for
decomposing sets of points in the Euclidean plane that we believe
to be of independent interest.
%Z
Submitted to FOCS 2000.
%T A Simulation of Auroral Absorption
%A Eric Michael Greenberg
%R Technical Report TR2000-368
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-368.pdf
%X
HF radio transmissions propagate long distances by reflecting off the ionosphere. At high latitudes radio propagation is strongly affected by the northern lights (aurora borealis), which cause ionization at low altitudes and hence the absorption of radio waves. Models of this process are still in a primitive state. A simulation of radio wave propagation was created in order to test Foppiano and Bradley's empirical model of auroral absorption. The simulation attempts to predict the net absorption of signals at a receiver by simulating a large number of transmitters, even though the exact sources of the signals are unknown. Although the simulation takes into account auroral and nonauroral absorption as well as other sources of path loss, the analysis focuses on the nighttime aurora. An intelligent search algorithm is used in order to efficiently adjust the model to best fit the data. The output of the simulation is qualitatively and quantitatively compared to signal levels observed with HF radio receivers located in northern Canada. The analysis allows us to develop alternative models of auroral absorption which account for the level of geomagnetic activity, and these are compared to the standard Foppiano and Bradley model.
%Z
Advisor: Dan Rockmore
%T Registration of Images with Dissimilar Contrast using a Hybrid Method Employing Correlation and Mutual Information
%A Karolyn A. Abram
%R Technical Report TR2000-369
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-369.ps.Z
%X
The problem of fitting one image into another is commonly
called "registration." Finding the best possible translation and rotation
necessary to align two images is one approach to solving this problem.
Registration is a crucial component of many remote sensing and medical
image interpretation applications. Image alignment techniques aid in
volumetric estimations of complicated structures and allow
radiologists to accurately identify changes between sequential images.
Radiologists require image alignment capabilities to correct for
patient motion and/or content displacement between images.
Numerous image registration techniques exist for correcting the
alignment problems mentioned above. Unfortunately, most of these
techniques, such as Correlation, fail to find a good alignment when
dealing with images that differ in contrast. The Mutual Information
method is able to align images independently of contrast, but it is
computationally intensive. We explore a hybrid technique that
utilizes both Correlation and Mutual Information. The hybrid
technique aims to gain greater contrast independence than Correlation
alone while achieving a lower running time than a pure Mutual
Information technique.
%Z
Undergraduate Honors Thesis
Advisor: Daniel N. Rockmore
%T An Infrastructure for a Mobile-Agent System that Provides Personalized Services to Mobile Devices
%A Debbie O. Chyi
%R Technical Report TR2000-370
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-370.ps.Z
%X
In this paper, we present the design of a mobile-agent system
that provides a mobile user with a personalized information retrieval
service and we describe the implementation of the infrastructure for
such a system. This "Personal Agent System" gathers information from
the Internet and uses context-aware mechanisms to manage the
information according to a mobile user's needs and preferences. The
user's schedule and location are the context indicators in this
system. These indicators are critical in ensuring that users obtain
only the information they want, receive information in a form that is
most useful for viewing on their mobile device, and are notified of
new information in a minimally intrusive manner. The system
incorporates a rule-based learning system to enhance the
personalization achieved by the system.
%Z
Undergraduate Honors Thesis.
Advisor: David F. Kotz
%T Personal Radio
%A John C. Artz
%R Technical Report TR2000-372
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-372.ps.Z
%X
With the development of new technologies that allow the broadcast of
digital data over radio signals, there are many possibilities for
improving upon the traditional radio station model for content
delivery. The idea of Personal Radio is a system that tailors content
to meet the needs of each individual. Using Global Positioning System
(GPS) technology to play location-specific content, the listening
history to play content an appropriate number of times, and user
feedback to learn personal preferences, the Personal Radio provides
the listener with the content that is the most useful/interesting to
them. This paper will examine the general design of such a system and
present solutions developed in the implementation of several pieces
of the design.
%Z
Undergraduate Honors Thesis.
Advisors: David Kotz and Daniela Rus
%T Depth from Flash
%A David B. Martin
%R Technical Report TR2000-373
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-373.ps.Z
%X
Digital camera technology has recently seen substantial
improvements in image quality while lower prices have made
it affordable to the average consumer. Camera manufacturers,
however, are not taking full advantage of this new medium
for image capture. By filtering the already digitized image
produced by these cameras through on-board image processing
algorithms we can dramatically increase the power of digital
cameras. For example, according to experts in the
photographic industry, most people simply take bad
pictures. Classic examples of this phenomenon are
photographs taken indoors with a point-and-shoot style
camera using its built-in flash. The subjects of these
photographs often seem to have a spotlight on them, making
them look bright and washed out while the rest of the
photograph is dark and indistinct. This can primarily be
accounted for by a well known property of point light
sources: falloff in brightness is inversely proportional to
the square of the distance between the light and the object
being illuminated. A technique first introduced in the field
of computer vision has been shown to successfully recover
information about the distance between the light source and
objects in the world. We propose using this technique, which
is readily implementable in hardware, to correct for a
variety of poorly illuminated digital images.
%Z
Undergraduate Honors Thesis.
Advisor: Hany Farid
%T An Economic CPU-Time Market for D'Agents
%A Ezra E. K. Cooper
%A Robert S. Gray
%R Technical Report TR2000-375
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-375.ps.Z
%X
A usable and efficient resource-management system has been created for
use with D'Agents. The software dynamically negotiates a price rate for
CPU time, using the competitive bids of mobile agents that offer
currency in return for fast computation. The system allows mobile agents
to plan their expenditures across many hosts while minimizing the time
needed for their tasks. The ability to price CPU time opens the door for
service owners to be compensated for the computation consumed by agents and
provides an incentive for servers to allow anonymous agents. We
discuss the theoretical background which makes a CPU market system
possible and the performance of the D'Agents market system.
%Z
Undergraduate honors thesis.
Advisor: Bob Gray.
%T The complexity of planning with partially-observable Markov decision processes
%A Martin Mundhenk
%R Technical Report TR2000-376
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-376.ps.Z
%X
This work surveys results on the complexity of planning under
uncertainty. The planning model considered is the partially-observable
Markov decision process. The general planning problems are,
given such a process,
(a) to calculate its performance under a given control policy,
(b) to find an optimal or approximately optimal control policy, and
(c) to decide whether a good policy exists.
The complexity of these and related problems
depends on a variety of factors,
including the observability of the process state,
the compactness of the process representation,
the type of policy, or
even the number of actions relative to the number of states.
In most cases, the problem can be shown to be
complete for some known complexity class.
The skeleton of this survey consists of results from
Littman, Goldsmith and Mundhenk (Journal of Artificial Intelligence
Research 1998), Mundhenk (Mathematics of Operations Research 2000),
Mundhenk, Goldsmith, Lusena and Allender (Journal of the ACM 2000),
and Lusena, Goldsmith and Mundhenk (University of KY CS TR).
There are also some new results.
%T Performance Analysis of Mobile Agents for Filtering Data Streams on Wireless Networks
%A David Kotz
%A George Cybenko
%A Robert S. Gray
%A Guofei Jiang
%A Ronald A. Peterson
%A Martin O. Hofmann
%A Daria A. Chacon
%A Kenneth R. Whitebread
%A James Hendler
%R Technical Report TR2000-377
%I Dartmouth College, Computer Science
%C Hanover, NH
%D October 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-377.ps.Z
%X
Wireless networks are an ideal environment for mobile agents, since their
mobility allows them to move across an unreliable link to reside on a wired
host, next to or closer to the resources that they need to use. Furthermore,
client-specific data transformations can be moved across the wireless link and
run on a wired gateway server, reducing bandwidth demands. In
this paper we examine the tradeoffs faced when deciding whether to use mobile
agents in a data-filtering application where numerous wireless clients filter
information from a large data stream arriving across the wired network. We
develop an analytical model and use parameters from filtering experiments
conducted during a U.S. Navy Fleet Battle Experiment (FBE) to explore the
model's implications.
%Z
Updated version of TR2000-366. To appear, after revisions, in the journal
Mobile Networks and Applications (ACM MONET).
%T Naming and sharing resources across administrative boundaries (Volume I)
%A Jon Howell
%R Technical Report TR2000-378
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-378.ps.Z
%X
I tackle the problem of naming and sharing resources across
administrative boundaries. Conventional systems manifest the hierarchy
of typical administrative structure in the structure of their own
mechanism. While natural for communication that follows hierarchical
patterns, such systems interfere with naming and sharing that cross
administrative boundaries, and therefore cause headaches for both users
and administrators. I propose to organize resource naming and
security, not around administrative domains, but around the sharing
patterns of users.
The dissertation is organized into four main parts. First, I discuss
the challenges and tradeoffs involved in naming resources and consider
a variety of existing approaches to naming.
Second, I consider the architectural requirements for user-centric
sharing. I evaluate existing systems with respect to these
requirements.
Third, to support the sharing architecture, I develop a formal logic of
sharing that captures the notion of restricted delegation. Restricted
delegation ensures that users can use the same mechanisms to share
resources consistently, regardless of the origin of the resource, or
with whom the user wishes to share the resource next. A formal
semantics gives unambiguous meaning to the logic. I apply the
formalism to the Simple Public Key Infrastructure and discuss how the
formalism either supports or discourages potential extensions to such a
system.
Finally, I use the formalism to drive a user-centric sharing
implementation for distributed systems. I show how this implementation
enables end-to-end authorization, a feature that makes heterogeneous
distributed systems more secure and easier to audit. Conventional
gateway services that bridge administrative domains, add abstraction,
or translate protocols typically impede the flow of authorization
information from client to server. In contrast, end-to-end
authorization enables us to build gateway services that preserve
authorization information, hence we reduce the size of the trusted
computing base and enable more effective auditing. I demonstrate my
implementation and show how it enables end-to-end authorization across
various boundaries. I measure my implementation and argue that its
performance tracks that of similar authorization mechanisms without
end-to-end structure.
I conclude that my user-centric philosophy of naming and sharing
benefits both users and administrators.
%Z
This technical report represents Volume One of the dissertation. Volume
One (187 pages) is the heart of the dissertation. Volume Two (237
pages) contains a software manual introduced with illustrated code
fragments, plus plots of the raw data used for the experimental
results presented in Volume One. That material is optional; I
recommend that the interested reader begin with just Volume One.
Volume II is available as TR2000-379.
Please note that a list of errata is available as TR2000-380.
%T Naming and sharing resources across administrative boundaries (Volume II)
%A Jon Howell
%R Technical Report TR2000-379
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-379.ps.Z
%X
I tackle the problem of naming and sharing resources across
administrative boundaries. Conventional systems manifest the hierarchy
of typical administrative structure in the structure of their own
mechanism. While natural for communication that follows hierarchical
patterns, such systems interfere with naming and sharing that cross
administrative boundaries, and therefore cause headaches for both users
and administrators. I propose to organize resource naming and
security, not around administrative domains, but around the sharing
patterns of users.
The dissertation is organized into four main parts. First, I discuss
the challenges and tradeoffs involved in naming resources and consider
a variety of existing approaches to naming.
Second, I consider the architectural requirements for user-centric
sharing. I evaluate existing systems with respect to these
requirements.
Third, to support the sharing architecture, I develop a formal logic of
sharing that captures the notion of restricted delegation. Restricted
delegation ensures that users can use the same mechanisms to share
resources consistently, regardless of the origin of the resource, or
with whom the user wishes to share the resource next. A formal
semantics gives unambiguous meaning to the logic. I apply the
formalism to the Simple Public Key Infrastructure and discuss how the
formalism either supports or discourages potential extensions to such a
system.
Finally, I use the formalism to drive a user-centric sharing
implementation for distributed systems. I show how this implementation
enables end-to-end authorization, a feature that makes heterogeneous
distributed systems more secure and easier to audit. Conventional
gateway services that bridge administrative domains, add abstraction,
or translate protocols typically impede the flow of authorization
information from client to server. In contrast, end-to-end
authorization enables us to build gateway services that preserve
authorization information, hence we reduce the size of the trusted
computing base and enable more effective auditing. I demonstrate my
implementation and show how it enables end-to-end authorization across
various boundaries. I measure my implementation and argue that its
performance tracks that of similar authorization mechanisms without
end-to-end structure.
I conclude that my user-centric philosophy of naming and sharing
benefits both users and administrators.
%Z
This technical report represents Volume Two of the dissertation. Volume
One (187 pages) is the heart of the dissertation. Volume Two (237
pages) contains a software manual introduced with illustrated code
fragments, plus plots of the raw data used for the experimental
results presented in Volume One. That material is optional; I
recommend that the interested reader begin with just Volume One.
Volume I is available as TR2000-378.
Please note that a list of errata is available as TR2000-380.
%T Naming and sharing resources across administrative boundaries (errata)
%A Jon Howell
%R Technical Report TR2000-380
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-380.ps.Z
%X
I tackle the problem of naming and sharing resources across
administrative boundaries. Conventional systems manifest the hierarchy
of typical administrative structure in the structure of their own
mechanism. While natural for communication that follows hierarchical
patterns, such systems interfere with naming and sharing that cross
administrative boundaries, and therefore cause headaches for both users
and administrators. I propose to organize resource naming and
security, not around administrative domains, but around the sharing
patterns of users.
The dissertation is organized into four main parts. First, I discuss
the challenges and tradeoffs involved in naming resources and consider
a variety of existing approaches to naming.
Second, I consider the architectural requirements for user-centric
sharing. I evaluate existing systems with respect to these
requirements.
Third, to support the sharing architecture, I develop a formal logic of
sharing that captures the notion of restricted delegation. Restricted
delegation ensures that users can use the same mechanisms to share
resources consistently, regardless of the origin of the resource, or
with whom the user wishes to share the resource next. A formal
semantics gives unambiguous meaning to the logic. I apply the
formalism to the Simple Public Key Infrastructure and discuss how the
formalism either supports or discourages potential extensions to such a
system.
Finally, I use the formalism to drive a user-centric sharing
implementation for distributed systems. I show how this implementation
enables end-to-end authorization, a feature that makes heterogeneous
distributed systems more secure and easier to audit. Conventional
gateway services that bridge administrative domains, add abstraction,
or translate protocols typically impede the flow of authorization
information from client to server. In contrast, end-to-end
authorization enables us to build gateway services that preserve
authorization information, hence we reduce the size of the trusted
computing base and enable more effective auditing. I demonstrate my
implementation and show how it enables end-to-end authorization across
various boundaries. I measure my implementation and argue that its
performance tracks that of similar authorization mechanisms without
end-to-end structure.
I conclude that my user-centric philosophy of naming and sharing
benefits both users and administrators.
%Z
This technical report contains errata for the dissertation. Volume
One (187 pages) is the heart of the dissertation. Volume Two (237
pages) contains a software manual introduced with illustrated code
fragments, plus plots of the raw data used for the experimental
results presented in Volume One. That material is optional; I
recommend that the interested reader begin with just Volume One.
Volume I is available as TR2000-378.
Volume II is available as TR2000-379.
%T A Survey of Context-Aware Mobile Computing Research
%A Guanling Chen
%A David Kotz
%R Technical Report TR2000-381
%I Dartmouth College, Computer Science
%C Hanover, NH
%D November 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-381.ps.Z
%X
Context-aware computing is a mobile computing paradigm in which
applications can discover and take advantage of contextual information
(such as user location, time of day, nearby people and devices, and
user activity). Since it was proposed about a decade ago, many
researchers have studied this topic and built several context-aware
applications to demonstrate the usefulness of this new
technology. Context-aware applications (or the system infrastructure
to support them), however, have never been widely available to
everyday users. In this survey of research on context-aware systems
and applications, we looked in depth at the types of context used and
models of context information, at systems that support collecting and
disseminating context, and at applications that adapt to the changing
context. This survey makes clear that context-aware computing
is an old but rich area for research. The difficulties and possible
solutions we outline serve as guidance for researchers hoping to make
context-aware computing a reality.
%T Bayes Optimal Metasearch: A Probabilistic Model for Combining the Results of Multiple Retrieval Systems
%A Javed A. Aslam
%A Mark Montague
%R Technical Report TR2000-382
%I Dartmouth College, Computer Science
%C Hanover, NH
%D December 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-382.ps.Z
%X
We introduce a new, probabilistic model for combining the outputs of
an arbitrary number of query retrieval systems. By gathering simple
statistics on the average performance of a given set of query
retrieval systems, we construct a Bayes optimal mechanism for
combining the outputs of these systems. Our construction yields a
metasearch strategy whose empirical performance nearly always
exceeds the performance of any of the constituent systems. Our
construction is also robust in the sense that if ``good'' and
``bad'' systems are combined, the performance of the composite is
still on par with, or exceeds, that of the best constituent system.
Finally, our model and theory provide theoretical and empirical
avenues for the improvement of this metasearch strategy.
%Z
Preliminary version appeared in SIGIR 2000.
%T Reconstructing Ancient Egyptian Tombs
%A Hany Farid
%R Technical Report TR2000-383
%I Dartmouth College, Computer Science
%C Hanover, NH
%D December 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-383.ps.Z
%X
From the pyramids of Giza to the tombs of Thebes (modern Luxor),
ancient Egypt's glorious history has produced remarkable architecture.
Sadly, the nearly four million yearly tourists have taken a heavy toll
on many of these ancient structures. Of particular concern are many
of the tombs located opposite Luxor on the western bank of the
Nile. Digital reconstruction of these tombs has the potential to help
document and preserve these important historical structures.
Photographing and reconstructing these tombs poses new and unique
problems that this paper begins to address. Techniques for removing
image distortions, recovering 3-D shape, and correcting for lighting
imbalances are discussed. A complete reconstruction of the tomb of
Sennedjem is shown.
%T Ambiguity-Directed Sampling for Qualitative Analysis of Sparse Data from Spatially-Distributed Physical Systems
%A Chris Bailey-Kellogg
%A Naren Ramakrishnan
%R Technical Report TR2001-384
%I Dartmouth College, Computer Science
%C Hanover, NH
%D January 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-384.ps.Z
%X
A number of important scientific and engineering applications, such as
fluid dynamics simulation and aircraft design, require analysis of
spatially-distributed data from expensive experiments and complex
simulations. In such data-scarce applications, it is advantageous to
use models of given sparse data to identify promising regions for
additional data collection. This paper presents a principled
mechanism for applying domain-specific knowledge to design focused
sampling strategies. In particular, our approach uses ambiguities
identified in a multi-level qualitative analysis of sparse data to
guide iterative data collection. Two case studies demonstrate that
this approach leads to highly effective sampling decisions that are
also explainable in terms of problem structures and domain knowledge.
%T Lock-free Scheduling of Logical Processes in Parallel Simulation
%A Xiaowen Liu
%A David M. Nicol
%A King Tan
%R Technical Report TR2001-385
%I Dartmouth College, Computer Science
%C Hanover, NH
%D January 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-385.ps.Z
%X
With fixed lookahead information in a simulation model, the overhead
of asynchronous conservative parallel simulation lies in the mechanism
used for propagating time updates in order for logical processes to
safely advance their local simulation clocks. Studies have shown that
a good scheduling algorithm should preferentially schedule processes
containing events on the critical path. This paper introduces a
lock-free algorithm for scheduling logical processes in conservative
parallel discrete-event simulation on shared-memory multiprocessor
machines. The algorithm uses fetch\&add operations that help avoid
inefficiencies associated with using locks. The lock-free algorithm is
robust. Experiments show that, compared with the scheduling algorithm
using locks, the lock-free algorithm exhibits better performance when
the number of logical processes assigned to each processor is small or
when the workload becomes significant. In models with a large number of
logical processes, our algorithm shows only a modest increase in
execution time due to the overhead in the algorithm for extra
bookkeeping.
%Z
A revision of this report appears in PADS 2001.
%T Mobile-Agent versus Client/Server Performance: Scalability in an Information-Retrieval Task
%A Robert S. Gray
%A David Kotz
%A Ronald A. Peterson
%A Peter Gerken
%A Martin Hofmann
%A Daria Chacon
%A Greg Hill
%A Niranjan Suri
%R Technical Report TR2001-386
%I Dartmouth College, Computer Science
%C Hanover, NH
%D January 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-386.ps.Z
%X
Mobile agents are programs that can jump from host
to host in the network, at times and to places of
their own choosing. Many groups have developed
mobile-agent software platforms, and several
mobile-agent applications. Experiments show that
mobile agents can, among other things, lead to
faster applications, reduced bandwidth demands, or
less dependence on a reliable network
connection. There are few if any studies of the
scalability of mobile-agent servers, particularly as
the number of clients grows. We present some recent
performance and scalability experiments that compare
three mobile-agent platforms with each other and
with a traditional client/server approach. The
experiments show that mobile agents often outperform
client/server solutions, but also demonstrate the
deep interaction between environmental and
application parameters. The three mobile-agent
platforms have similar behavior but their absolute
performance varies with underlying implementation
choices.
%Z
Revised version appeared in Mobile Agents 2001.
%T A simple bound on the expected height of a randomly built binary search tree
%A Javed A. Aslam
%R Technical Report TR2001-387
%I Dartmouth College, Computer Science
%C Hanover, NH
%D Sometime 2001
%Z
Abstract and paper lost.
%T Applying the Vector Radix Method to Multidimensional, Multiprocessor, Out-of-Core Fast Fourier Transforms
%A Michael F. Ringenburg
%R Technical Report TR2001-388
%I Dartmouth College, Computer Science
%C Hanover, NH
%D March 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-388.ps.Z
%X
We describe an efficient algorithm for calculating Fast Fourier
Transforms on matrices of arbitrarily high dimension using the
vector-radix method when the problem size is out-of-core (i.e.,
when the size of the data set is larger than the total available
memory of the system). The algorithm takes advantage of multiple
processors when they are present, but it is also efficient on single-processor
systems. Our work is an extension of work done by Lauren Baptist in
[Bapt99], which applied the vector-radix method to 2-dimensional
out-of-core matrices.
To determine the effectiveness of the algorithm, we present
empirical results as well as an analysis of the I/O, communication,
and computational complexity. We perform the empirical tests on
a DEC 2100 server and on a cluster of Pentium-based Linux
workstations. We compare our results with the traditional
dimensional method of calculating multidimensional FFTs, and show
that as the number of dimensions increases, the vector-radix-based
algorithm becomes increasingly effective relative to the dimensional
method.
In order to calculate the complexity of the algorithm, it was
necessary to develop a method for analyzing the interprocessor
communication costs of the BMMC data-permutation algorithm
(presented in [CSW98]) used by our FFT algorithms. We present
this analysis method and show how it was derived.
%Z
Master's Thesis. Advisor: Tom Cormen.
%T Improving a Brokering System for Linking Distributed Simulations
%A Thomas B. Stephens
%R Technical Report TR2001-389
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-389.ps.Z
%X
The Agent Based Environment for Linking Simulations (ABELS)
is a software framework designed to provide disparate simulations
with dynamically updated data sources. It allows simulations and
other agents to join a "cloud" of interacting producers and consumers
of data. Once they have joined the cloud, they can publish services
to other members and use methods published by others. This paper
presents the initial design of a set of matchmaking components for
the ABELS framework. These components dictate how services describe
their abilities and requirements to ABELS. Furthermore, they help
ABELS successfully match data producing services to the requests of
data consuming clients.
We begin by describing a system for a data producing service
to describe itself to the ABELS cloud, as well as a corresponding
system for a data consumer to describe its needs. We then describe
in detail the three components that make up the ABELS matchmaking
system: the match ranker, which ranks a data producer's ability to
fill the request of a data consumer; the thesaurus, which helps the
match ranker recognize closely related terms; and the unit database,
which allows participants in the ABELS system to translate between
related data units. We also discuss how these basic components can
be built upon and improved in future versions of the ABELS framework.
%Z
Senior Honors Thesis. Advisor: Linda F. Wilson.
%T Mobile Voice Over IP (MVOIP): An Application-level Protocol
%A G. Ayorkor Mills-Tettey
%R Technical Report TR2001-390
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-390.ps.Z
%X
Current Voice over Internet Protocol (VOIP) protocols require
participating hosts to have fixed IP addresses for the duration of a
VOIP call. When using a wireless-enabled host, such as a tablet
computer on an 802.11 wireless network, it is possible for a
participant in a VOIP call to roam around the network, moving from
one subnet to another and needing to change IP addresses. This
address change creates the need for mobility support in VOIP
applications.
We present the design of Mobile Voice over IP (MVOIP), an
application-level protocol that enables such mobility in a VOIP
application based on the ITU H.323 protocol stack. An MVOIP
application uses hints from the surrounding network to determine that
it has switched subnets. It then initiates a hand-off procedure that
comprises pausing its current calls, obtaining a valid IP address for
the current subnet, and reconnecting to the remote party with whom it
was in a call. Testing the system shows that on a Windows 2000
platform there is a perceivable delay in the hand-off process, most
of which is spent in the Windows API for obtaining DHCP addresses.
Despite this bottleneck, MVOIP works well on a wireless network.
%Z
Senior Honors Thesis. Advisor: David Kotz.
%T A Directory Infrastructure to Support Mobile Services
%A Ammar Khalid
%R Technical Report TR2001-391
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-391.ps.Z
%X
Traditional Voice-over-IP applications such as Microsoft NetMeeting
assume that the user is on a machine with a fixed IP address. If,
however, the user connects to the Internet, via a wireless network,
on a handheld device, his IP address frequently changes as he moves
from one subnet to another. In such a situation, we need a service
that can be queried for the most current IP address of a person whom
we wish to contact. In this project, we design and implement such a
directory service. The service authenticates all callers and callees,
is robust against most host failures, and scales to several thousand
registered users.
%Z
Senior Honors Thesis. Advisor: David Kotz.
%T SmartReminder: A Case Study on Context-Sensitive Applications
%A Arun Mathias
%R Technical Report TR2001-392
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-392.ps.Z
%X
Designing context-sensitive applications is challenging. We design
and implement SmartReminder to explore designing context-sensitive
applications and to demonstrate how the SOLAR system can be used
in developing such applications. SmartReminder is an application
that reminds the user based on contextual information. Current
appointment-reminder applications remind the user about their
appointments at an arbitrarily specified time. For instance, they
might remind the user ten minutes before each appointment.
SmartReminder, on the other hand, uses contextual information,
like location, to better estimate the appropriate reminder time
for each appointment. It reminds the user based on where they are,
where they need to be, and how long it will take them to get
there. This paper presents SmartReminder as an illustration of how
context-sensitive applications can be designed using the SOLAR
system for dissemination of contextual information.
%Z
Senior Honors Thesis. Advisor: David Kotz.
%T Measuring early usage of Dartmouth's wireless network
%A Pablo Stern
%R Technical Report TR2001-393
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-393.ps.Z
%X
In Spring 2001, Dartmouth College installed a campus-wide 802.11b
wireless network. To understand how that network is used, we examined
the usage characteristics of the network over a five-week period. We
monitored access points to determine user behavior, and user and
network traffic characteristics. Because our study coincided with the
deployment of the access points, our analysis captures the growth of
a wireless network. The results of this study help us understand the
behavior of mobile users and provide a reference to network engineers
wishing to deploy and expand similar wireless networks.
%Z
Senior Honors Thesis. Advisor: David Kotz.
%T An Empirical Study of Training and Testing Error in Boosting
%A David D. Latham
%R Technical Report TR2001-394
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-394.ps.Z
%X
Bounds have been proven for both training and testing error for the boosting
algorithm AdaBoost, but in practice neither seems to produce a particularly
tight bound. In this paper we share some observations of these bounds from
empirical results, and then explore some properties of the algorithm with an
eye towards finding an improved bound for the performance of AdaBoost.
Based on our empirical evidence, the error of a hypothesis which labels
examples probabilistically based upon the confidence of the vote of the weak
hypotheses forms a tighter bound for the training error.
%Z
Senior Honors Thesis. Advisor: Jay Aslam.
%T An Implementation of Object-Oriented Program Transformation for Thought-Guided Debugging
%A Tiffany M. Wong
%R Technical Report TR2001-395
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-395.ps.Z
%X
This paper presents our design and implementation of program transformation for
C++ that will be used in the context of a thought-guided debugging system. The
program uses a lexical analyzer written in Flex and a grammar written in Bison
that work in conjunction to scan the input C++ code for function definitions
and class definitions. The code is then transformed to produce trace
information for each defined function, while the original functionality of the
code is left untouched. We also implement two additional data structures that
are used for information storage during the course of the program.
%Z
Senior Honors Thesis. Advisor: Tom Cormen.
%T Implementing a Database Information System for an Electronic Baseball Scorecard
%A Tiffany M. Wong
%R Technical Report TR2001-396
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-396.ps.Z
%X
We present our design and implementation of a database system of information
storage and retrieval for an electronic baseball scorecard. The program uses
the relational MySQL database to hold information and a Tcl API to handle
interactions between the database and the user interface code. This paper
discusses the inner workings of how information storage was broken down inside
the database, how queries were internally constructed in accordance with the
user's input, and how statistics for players and teams were calculated and
returned to the user. Finally, we discuss some limitations attached to our
current implementation of the program and propose improvements that can be made
in future versions.
%Z
Senior Honors Thesis. Advisor: Tom Cormen.
%T Supporting Adaptive Ubiquitous Applications with the SOLAR System
%A Guanling Chen
%A David Kotz
%R Technical Report TR2001-397
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-397.ps.Z
%X
As we embed more computers into our daily environment, ubiquitous
computing promises to make them less noticeable and help to prevent
information overload. We see, however, few ubiquitous applications
that are able to adapt to the dynamics of user, physical, and
computational context. We believe that there are two challenges
causing this lack of ubiquitous applications: there is no flexible and
scalable way to support information collection and dissemination in a
ubiquitous and mobile environment, and there is no general approach to
building adaptive applications given heterogeneous contextual
information. We propose a system infrastructure, Solar, to meet these
challenges. Solar uses a subscription-based operator graph abstraction
and allows dynamic composition of stackable operators to manage
ubiquitous information sources. After developing a set of diverse
adaptive applications, we expect to identify fundamental techniques
for context-aware adaptation. Our expectation is that Solar's
end-to-end support for information collection, dissemination, and
utilization will make it easy to build adaptive applications for a
ubiquitous mobile environment with many users and devices.
%T A System for Audio Personalization with Applications on Wireless Devices
%A David Marmaros
%R Technical Report TR2001-398
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-398.pdf
%X
We present and analyze a system for dynamically tailoring discrete
audio content for numerous users based on aggregate data and intuitive
feedback mechanisms. The framework for this system utilizes a flexible
client-server architecture to facilitate audio dissemination, with
particular attention to distribution over wireless networks. We
discuss the requirements and specifications of such a system. We
further analyze the algorithms and protocols required for its
operation. Finally, we outline and provide data from a demonstration
of this application.
%Z
Senior Honors Thesis. Advisors: David Kotz and Daniela Rus.
%T WebALPS Implementation and Performance Analysis: Using Trusted Co-servers to Enhance Privacy and Security of Web Interactions
%A Shan Jiang
%R Technical Report TR2001-399
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-399.ps.Z
%X
The client-server model of the Web poses a fundamental trust issue:
clients are forced to trust in the secrecy and correctness of computation
occurring at a remote server of unknown credibility. The current solution
for this problem is to use a PKI (Public Key Infrastructure) system and
SSL (Secure Sockets Layer) digital certificates to prove the claimed
identity of a server and establish an authenticated, encrypted channel
between the client and this server. However, this approach does not
address the security risks posed by potentially malicious server
operators or any third parties who may penetrate the server sites.
The WebALPS (Web Applications with Lots of Privacy and Security)
approach is proposed to address these weaknesses by moving sensitive
computations at the server side into trusted co-servers running inside
high-assurance secure coprocessors.
In this report, we examine the foundations of the credibility of
WebALPS co-servers. Then we will describe our work of designing and
building a prototype WebALPS co-server, which is integrated into the
widely-deployed, commercial-grade Apache server. We will also present
the performance test results of our system, which support the argument
that the WebALPS approach provides a systematic and practical way to address the
remote trust issue.
%Z
Master's thesis.
%T An Armored Data Vault
%A Alex Iliev
%R Technical Report TR2001-400
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-400.pdf
%X
We consider the problem of secure long-term archiving of
network traffic, an instance of the problem of storing data
securely. We approach the problem using secure hardware, which enables
the enforcement of flexible access policy. The policy cannot be
circumvented by anyone, even insiders, and so we are assured that
access to the data is as originally intended. The policy can be
expressed as any feasible computation, as it will be checked inside
the secure hardware without possibility of interference. We discuss
our design of a device to perform such network data archiving and have
implemented a prototype device. We discuss other possible application
areas of the design.
%Z
Senior Honors Thesis.
Advisor: Sean Smith.
%T Outbound Authentication for Programmable Secure Coprocessors
%A Sean W. Smith
%R Technical Report TR2001-401
%I Dartmouth College, Computer Science
%C Hanover, NH
%D March 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-401.ps.Z
%X
A programmable secure coprocessor platform can help solve many
security problems in distributed computing. These solutions usually
require that coprocessor applications be able to participate as
full-fledged parties in distributed cryptographic protocols. Thus, to
fully enable these solutions, a generic platform must not only provide
programmability, maintenance, and configuration in the hostile
field---it must also provide outbound authentication for the
entities that result. A particular application on a particular
untampered device must be able to prove who it is to a party on the
other side of the Internet.
To be effective, a secure outbound authentication service must closely
mesh with the overall security architecture. Our initial architecture
only sketched a rough design for this service, and did not complete
it. This paper presents our research and development experience in
refining and implementing this design, to provide PKI-based outbound
authentication for the IBM 4758 Model 2 secure coprocessor platform.
%T Optimizing the Dimensional Method for Performing Multidimensional, Multiprocessor, Out-of-Core FFTs
%A Jeremy T. Fineman
%R Technical Report TR2001-402
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-402.ps.Z
%X
We present an improved version of the Dimensional Method for computing
multidimensional Fast Fourier Transforms (FFTs) on a multiprocessor
system when the data consist of too many records to fit into
memory. Data are spread across parallel disks and processed in
sections. We use the Parallel Disk Model for analysis.
The simple Dimensional Method performs the 1-dimensional FFTs for
each dimension in turn. Between dimensions, an out-of-core
permutation is used to rearrange the data to contiguous locations.
The improved Dimensional Method processes multiple dimensions at a
time.
We show that determining an optimal sequence and groupings of
dimensions is NP-complete. We then analyze the effects of two
modifications to the Dimensional Method independently: processing
multiple dimensions at one time, and processing single dimensions in a
different order.
Finally, we show a lower bound on the I/O complexity of the
Dimensional Method and present an algorithm that is approximately
asymptotically optimal.
%Z
Senior Honors Thesis. Advisor: Tom Cormen.
%T EcomRISK.org : A site to classify and organize the risks of performing business on the Internet
%A Aidan S. Marcuss
%R Technical Report TR2001-403
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-403.ps.Z
%X
As the use of the Internet and other computer networks to transact
business grows, there is an ever-increasing need for those taking
part in those transactions to understand the risks of doing so. While
there are many web sites that have created valuable databases of
specific vulnerabilities for certain types of hardware and software,
there is a lack of focus on attempting to analyze the interaction of
businesses, their systems, computer networks, and their customers and
the risks that are created by either intended or unintended
interactions. EcomRISK.org is a web site that presents a clear
taxonomy to classify these risks and provides other features to aid
in the general discussion of e-commerce risk. The site, and the
taxonomy at the center of it, creates a database of these incidents
so they can be clearly searched. This paper discusses the creation of
EcomRISK.org, from vision to birth.
%Z
Senior Honors Thesis. Advisor: Fillia Makedon.
%T Efficient Compression of Generic Function Dispatch Tables
%A Eric Kidd
%R Technical Report TR2001-404
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-404.ps.Z
%X
A generic function is similar to an overloaded operator, but provides a way
to select an appropriate behavior at run-time instead of compile-time.
Dujardin and colleagues have proposed an algorithm for building and
compressing generic function dispatch tables.
We present several modifications to their algorithm, including an
improvement to Pseudo-Closest-Poles and two new algorithms for compressing
pole tables. The two new compression algorithms are simple and fast, and
one produces smaller output than the original.
%Z
Senior Honors Thesis. Advisor: Chris Hawblitzel.
%T DaSSFNet: An Extension to DaSSF for High-Performance Network Modeling
%A Mehmet Iyigun
%R Technical Report TR2001-405
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-405.ps.Z
%X
Scalable Simulation Framework (SSF) is a discrete-event simulation
framework providing a unified programming interface geared towards
network simulation. Dartmouth SSF (DaSSF) is a C++ implementation of
SSF, designed for simulating very large-scale multi-protocol
communication networks. As of the latest release, DaSSF lacks many
features present in SSF and this prevents it from achieving
mainstream use. To alleviate this shortcoming we designed and
implemented DaSSFNet which extends DaSSF to the levels of
functionality found in SSF. In this paper, we show that DaSSFNet and
SSFNet are identical in operation given the same input. We also show
that DaSSFNet is about twice as fast and has one third the memory
consumption of SSFNet when simulating identical networks. Therefore,
we argue that the DaSSF simulation package with DaSSFNet now offers
a viable alternative to SSF in high-performance network simulation.
%Z
Senior Honors Thesis. Advisor: David M. Nicol.
%T Fastab: Solving the Pitch to Notation Problem
%A Jeremy I. Robin
%R Technical Report TR2001-406
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-406.ps.Z
%X
I have always been frustrated with the length of time necessary to
notate a piece of music. Computers have simplified so many other
aspects of our lives that it seems they should be able to simplify
this task as well. In fact, there are already two distinct ways that
engineers have attempted to attack this problem. The first analyzes
the waveform generated by microphone input and relies on Fourier
Analysis and other similar methods. The other examines the analog
signal generated by an electric guitar-like pickup placed beneath the
strings. The method used by Fastab relies much less on the musical
properties of an instrument. Instead, Fastab records where and when
the fingers and pick contact the instrument using digital electronics
and microprocessor technology. Fastab provides a solution to the
pitch to notation problem which is cheaper and more accurate than any
other device available today.
%Z
Senior Honors Thesis. Advisor: Scot Drysdale.
%T TCP/IP Implementation within the Dartmouth Scalable Simulation Framework
%A Michael G. Khankin
%R Technical Report TR2001-407
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-407.ps.Z
%X
This paper discusses TCP/IP networking, and in particular, the DaSSF
implementation of TCP/IP. The paper reviews the protocols, outlines the
implementation design, and demonstrates some tests. In addition, some
performance and memory usage analysis is performed.
We find DaSSF TCP/IP to be a viable alternative to the existing SSF. DaSSF
TCP/IP is faster and uses less memory, so we can simulate larger, more
complex models.
%Z
Senior Honors Thesis. Advisor: David M. Nicol.
%T Market-based Control of Mobile-agent Systems
%A Jonathan L. Bredin
%R Technical Report TR2001-408
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-408.ps.Z
%X
Modern distributed systems scatter sensors, storage, and computation
throughout the environment.
Ideally these devices communicate and share resources, but there
is seldom motivation for a device's owner to yield control to another user.
We establish markets for computational resources
to motivate principals to share resources with arbitrary users, to enforce
priority in distributed systems, to provide flexible and rational limitations
on the potential of an application, and to provide a lightweight structure
to balance the workload over time and between devices.
As proof of concept, we implement a structure that software agents
can use to discover and negotiate access to networked resources.
The structure treats discovery, authentication, and consumption
enforcement as orthogonal issues to give system
designers flexibility.
Mobile agents represent informational and computational flow.
We develop mechanisms that distributively allocate computation among
mobile agents in two settings.
The first models a situation where users collectively own networked
computing resources and require priority enforcement.
We extend the allocation mechanism to allow
resource reservation to mitigate utility volatility.
The second, more general model relaxes the ownership assumption.
We apply our computational market to an open setting where a principal's
chief concern is revenue maximization.
Our simulations compare the performance of market-based allocation
policies to traditional policies and relate the cost of ownership and
consumption separation.
We observe that our markets effectively prioritize applications'
performance, can operate under uncertainty and network delay, provide
metrics to balance network load, and allow measurement of
market-participation risk versus reservation-based computation.
In addition to allocation problems, we investigate resource selection
to optimize execution time.
The problem is NP-complete if the costs and latencies are constant.
Both metrics' dependence on the chosen set complicates matters.
We study how a greedy approach, a novel heuristic, and a
shortest-constrained-path strategy perform in mobile-agent applications.
Market-based computational-resource allocation fertilizes
applications where previously there was a dearth of motive for or means
of cooperation.
The rationale behind mobile-agent performance optimization is also useful
for resource allocation in general distributed systems where an application
has a sequence of dependent tasks or when data collection is expensive.
%Z
A Ph.D. thesis from the Department of Computer Science, Dartmouth College.
%T Web Spoofing 2001
%A Yougu Yuan
%A Eileen Zishuang Ye
%A Sean W. Smith
%R Technical Report TR2001-409
%I Dartmouth College, Computer Science
%C Hanover, NH
%D July 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-409.ps.Z
%X
The Web is currently the pre-eminent medium for electronic
service delivery to remote users. As a consequence,
authentication of servers is more important than ever. Even
sophisticated users base their decision whether or not to trust a
site on browser cues---such as location bar information, SSL
icons, SSL warnings, certificate information, response time, etc.
In their seminal work on web spoofing, Felten et al. showed how a
malicious server could forge some of these cues---but using
approaches that are no longer reproducible. However, subsequent
evolution of Web tools has not only patched security holes---it
has also added new technology to make pages more interactive and
vivid. In this paper, we explore the feasibility of web spoofing
using this new technology---and we show how, in many cases, every
one of the above cues can be forged.
In particular, we show how a malicious server can forge all the
SSL information a client sees---thus providing a cautionary tale
about the security of one of the most common applications of PKI.
We stress that these techniques have been implemented, and are
available for public demonstration.
%T Securing Web Servers against Insider Attack
%A Shan Jiang
%A Sean Smith
%A Kazuhiro Minami
%R Technical Report TR2001-410
%I Dartmouth College, Computer Science
%C Hanover, NH
%D July 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-410.ps.Z
%X
Too often, ``security of Web transactions'' reduces to ``encryption of
the channel''---and neglects to address what happens at the server on
the other end. This oversight forces clients to trust the good
intentions and competence of the server operator---but gives clients
no basis for that trust. Furthermore, despite academic and industrial
research in secure coprocessing, many in the computer science
community still regard ``secure hardware'' as a synonym for
``cryptographic accelerator.'' This oversight neglects the real
potential of COTS secure coprocessing technology to establish trusted
islands of computation in hostile environments---such as at web
servers with risk of insider attack.
In this paper, we apply secure coprocessing and cryptography to solve
this real problem in Web technology. We present a vision: using
secure coprocessors to establish trusted co-servers at Web servers and
moving sensitive computations inside these co-servers. We present a
prototype implementation of this vision that scales to realistic
workloads. Finally, we validate this approach by building a simple
E-voting application on top of our prototype.
From our experience, we conclude that this approach provides a
practical and effective way to enhance the security of Web servers
against insider attack.
%T Write Once, Move Anywhere: Toward Dynamic Interoperability of Mobile Agent Systems
%A Arne Grimstrup
%A Robert S. Gray
%A David Kotz
%A Thomas Cowin
%A Greg Hill
%A Niranjan Suri
%A Daria Chacon
%A Martin Hofmann
%R Technical Report TR2001-411
%I Dartmouth College, Computer Science
%C Hanover, NH
%D July 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-411.ps.Z
%X
Mobile agents are an increasingly popular paradigm,
and in recent years there has been a proliferation
of mobile-agent systems. These systems are, however,
largely incompatible with each other. In particular,
agents cannot migrate to a host that runs a
different mobile-agent system. Prior approaches to
interoperability have tried to force agents to use a
common API, and so far none have succeeded. Our
goal, summarized in the catch phrase ``Write
Once, Move Anywhere,'' led to our efforts to develop
mechanisms that support dynamic runtime
interoperability of mobile-agent systems. This paper
describes the Grid Mobile-Agent System,
which allows agents to migrate to different
mobile-agent systems.
%Z
Revised July 25, 2001.
%T Detecting Steganographic Messages in Digital Images
%A Hany Farid
%R Technical Report TR2001-412
%I Dartmouth College, Computer Science
%C Hanover, NH
%D September 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-412.ps.Z
%X
Techniques and applications for information hiding have
become increasingly sophisticated and widespread. With
high-resolution digital images as carriers, detecting the presence of
hidden messages has also become considerably more difficult. It is
sometimes possible, nevertheless, to detect (but not necessarily
decipher) the presence of embedded messages. The basic approach taken
here works by finding predictable higher-order statistics of
``natural'' images within a multi-scale decomposition, and then
showing that embedded messages alter these statistics.
%T Differential Elastic Image Registration
%A Senthil Periaswamy
%A Hany Farid
%R Technical Report TR2001-413
%I Dartmouth College, Computer Science
%C Hanover, NH
%D September 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-413.ps.Z
%X
We have applied techniques from differential motion
estimation to the problem of automatic elastic registration of medical
images. This method models the mapping between images as a locally
affine but globally smooth warp. The mapping also explicitly accounts
for variations in image intensities. This approach is simple and
highly effective across a broad range of medical images. We show the
efficacy of this approach on several synthetic and clinical images.
%T Decentralized Control for Coordinated Flow of Multi-Agent Systems
%A Valentino Crespi
%A George Cybenko
%A Massimo Santini
%A Daniela Rus
%R Technical Report TR2002-414
%I Dartmouth College, Computer Science
%C Hanover, NH
%D January 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-414.ps.Z
%X
This paper describes a distributed algorithm for coordinating the flow
of a mass of vehicles approaching a highway exit or a tollbooth. We
provide the problem formulation, a general methodology for distributed
control and an instantiation of this methodology to the coordinated
flow problem. We analyze our algorithm and provide experimental data.
%T Future Directions for Mobile-Agent Research
%A David Kotz
%A Robert S. Gray
%A Daniela Rus
%R Technical Report TR2002-415
%I Dartmouth College, Computer Science
%C Hanover, NH
%D January 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-415.ps.Z
%X
During a discussion in September 2000 the authors
examined the future of research on mobile agents and
mobile code. (A mobile agent is a running program
that can move from host to host in a network at times
and to places of its own choosing.) In this paper we
summarize and reflect on that discussion. It became
clear that the field should shift its emphasis
toward mobile code, in all its forms, rather than to
continue its narrow focus on mobile
agents. Furthermore, we encourage the development of
modular components, so that application designers
may take advantage of code mobility without needing
to rewrite their application to fit in a monolithic
mobile-agent system. There are many potential
applications that may productively use mobile code,
but there is no ``killer application'' for mobile
agents. Finally, we note that although security is
an important and challenging problem, there are many
applications and environments with security
requirements well within the capability of existing
mobile-code and mobile-agent frameworks.
%T Virtual Hierarchies - An Architecture for Building and Maintaining Efficient and Resilient Trust Chains.
%A John C. Marchesini
%A Sean W. Smith
%R Technical Report TR2002-416
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-416.ps.Z
%X
In Public Key Infrastructure (PKI), the simple, monopolistic CA model
works fine until we consider the real world. Then, issues such as
scalability and mutually suspicious organizations create the need for
a multiplicity of CAs, which immediately introduces the problem of how
to organize them to balance resilience to compromise against
efficiency of path discovery.
However, security has given us tools such as secure coprocessing,
secret splitting, secret sharing, and threshold cryptography for
securely carrying out computations among multiple trust domains;
distributed computing has given us peer-to-peer networking, for
creating self-organizing distributed systems.
In this paper, we use these latter tools to address the former
problem by overlaying a virtual hierarchy on a mesh architecture of
peer CAs, and achieving both resilience and efficiency.
%T Web Spoofing Revisited: SSL and Beyond
%A Eileen Ye
%A Yougu Yuan
%A Sean Smith
%R Technical Report TR2002-417
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-417.ps.Z
%X
Can users believe what their browsers tell them? Even sophisticated
Web users decide whether or not to trust a server based on browser
cues such as location bar information, SSL icons, SSL warnings,
certificate information, and response time. In their seminal work on
Web spoofing, Felten et al. showed how, in 1996, a
malicious server could forge some of these cues. However, this work
used genuine SSL sessions, and Web technology has evolved much since
1996.
The Web has since become the pre-eminent medium for electronic service
delivery to remote users, and the security of many commerce,
government, and academic network applications critically rests on the
assumption that users can authenticate the servers with which they
interact. This situation raises the question: is the browser-user
communication model today secure enough to warrant this assumption?
In this paper, we answer this question by systematically showing how a
malicious server can forge every one of the above cues. Our work
extends the prior results by examining contemporary browsers, and by
forging all of the SSL information a client sees, including the very
existence of an SSL session (thus providing a cautionary tale about
the security of one of the most common applications of PKI). We have
made these techniques available for public demonstration, because
anything less than working code would not convincingly answer the
question. We also discuss implications and potential countermeasures,
both short-term and long-term.
%Z
This TR supersedes TR2001-409.
%T Trusted Paths for Browsers: An Open-Source Solution to Web Spoofing
%A Eileen Ye
%A Sean Smith
%R Technical Report TR2002-418
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-418.ps.Z
%X
The security of the vast majority of ``secure'' Web services rests on
SSL server PKI. However, this PKI doesn't work if the adversary
can trick the browser into appearing to tell the user the wrong thing
about the certificates and cryptography. The seminal web spoofing
work of Felten et al. demonstrated the potential, in 1996, for
malicious servers to impersonate honest servers. Our recent follow-up
work explicitly shows how malicious servers can still do this---and
can also forge the existence of an SSL session and the contents of the
alleged server certificate.
This paper reports the results of our work to systematically defend
against Web spoofing, by creating a trusted path from the browser to
the user. Starting with the Mozilla source, we have implemented
techniques that protect a wide variety of browser-user communications,
require little participation by the user, and minimally disrupt
the displayed server content. We have prepared shell scripts that
install these modifications on the Mozilla source, to enable others to
replicate this work.
In on-going work, we are cleaning up and fine-tuning our code. In
future work, we hope to examine more deeply the role of user
interfaces in enabling users to make effective trust judgments.
%T FFTs for the 2-Sphere - Improvements and Variations
%A Dennis M. Healy
%A Daniel N. Rockmore
%A Peter J. Kostelec
%A Sean S. B. Moore
%R Technical Report TR2002-419
%I Dartmouth College, Computer Science
%C Hanover, NH
%D March 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-419.ps.Z
%X
Earlier work by Driscoll and Healy has produced an efficient
algorithm for computing the Fourier transform of band-limited
functions on the 2-sphere. In this paper we present a reformulation
and variation of the original algorithm which results in a greatly
improved inverse transform, and consequent improved convolution
algorithm for such functions. All require at most $O(N\log^2 N)$
operations where $N$ is the number of sample points. We also address
implementation considerations and give heuristics for allowing
reliable and computationally efficient floating point implementations
of slightly modified algorithms.
These claims are supported by extensive numerical experiments from
our implementation in C on DEC, HP, SGI and Linux Pentium
platforms. These results indicate that variations of the algorithm
are both reliable and efficient for a large range of useful problem
sizes. Performance appears to be architecture-dependent. The paper
concludes with a brief discussion of a few potential applications.
%Z
Preliminary versions of some of these results have appeared in the
Dartmouth College Department of Computer Science Technical Report
PCS-TR94-222 and ``An FFT for the 2-sphere and Applications'',
Proc. of ICASSP-96, Volume 3, pp. 1323--1326.
%T Context Aggregation and Dissemination in Ubiquitous Computing Systems
%A Guanling Chen
%A David Kotz
%R Technical Report TR2002-420
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-420.ps.Z
%X
Many ``ubiquitous computing'' applications need a constant flow of
information about their environment to be able to adapt to their
changing context. To support these ``context-aware'' applications we
propose a graph-based abstraction for collecting, aggregating, and
disseminating context information. The abstraction models context
information as events, produced by sources and flowing through a
directed acyclic graph of event-processing operators and delivered to
subscribing applications. Applications describe their desired event
stream as a tree of operators that aggregate low-level context
information published by existing sources into the high-level context
information needed by the application. The operator graph is thus the
dynamic combination of all applications' subscription trees.
In this paper, we motivate and describe our graph abstraction, and
discuss a variety of critical design issues. We also sketch our Solar
system, an implementation that represents one point in the design
space for our graph abstraction.
%Z
To appear in WMCSA 2002.
%T Solar: A pervasive-computing infrastructure for context-aware mobile applications
%A Guanling Chen
%A David Kotz
%R Technical Report TR2002-421
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-421.ps.Z
%X
Emerging pervasive computing technologies transform the way we live
and work by embedding computation in our surrounding environment. To
avoid increasing complexity, and allow the user to concentrate on her
tasks, applications must automatically adapt to their changing
\emph{context}, the physical and computational environment in which
they run. To support these ``context-aware'' applications we propose
a graph-based abstraction for collecting, aggregating, and
disseminating context information. The abstraction models context
information as \emph{events}, which are produced by \emph{sources},
flow through a directed acyclic graph of event-processing
\emph{operators}, and are delivered to subscribing applications.
Applications describe their desired event stream as a tree of
operators that aggregate low-level context information published by
existing sources into the high-level context information needed by the
application. The \emph{operator graph\/} is thus the dynamic
combination of all applications' subscription trees. In this paper,
we motivate our graph abstraction by discussing several applications
under development, sketch the architecture of our system (``Solar'')
that implements our abstraction, report some early experimental
results from the prototype, and outline issues for future research.
%T Controlling access to pervasive information in the ``Solar'' system
%A Kazuhiro Minami
%A David Kotz
%R Technical Report TR2002-422
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-422.ps.Z
%X
Pervasive-computing infrastructures necessarily collect a lot of
context information to disseminate to their context-aware
applications. Due to the personal or proprietary nature of much of
this context information, however, the infrastructure must limit
access to context information to authorized persons. In this paper we
propose a new access-control mechanism for event-based
context-distribution infrastructures. The core of our approach is
based on a conservative information-flow model of access control, but
users may express discretionary relaxation of the resulting
access-control list (ACL) by specifying relaxation functions.
This combination of automatic ACL derivation and user-specified ACL
relaxation allows access control to be determined and enforced in a
decentralized, distributed system with no central administrator or
central policy maker. It also allows users to express their personal
balance between functionality and privacy. Finally, our
infrastructure allows access-control policies to depend on
context-sensitive roles, allowing great flexibility.
We describe our approach in terms of a specific context-dissemination
framework, the Solar system, although the same principles would apply
to systems with similar properties.
%T Characterizing Usage of a Campus-wide Wireless Network
%A David Kotz
%A Kobby Essien
%R Technical Report TR2002-423
%I Dartmouth College, Computer Science
%C Hanover, NH
%D March 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-423.ps.Z
%X
Wireless local-area networks (WLANs) are increasingly common, but
little is known about how they are used. A clear understanding of
usage patterns in real WLANs is critical information to those who
develop, deploy, and manage WLAN technology, as well as those who
develop systems and application software for wireless networks. This
paper presents results from the largest and most comprehensive trace
of network activity in a large, production wireless LAN. For eleven
weeks we traced the activity of nearly two thousand users drawn from a
general campus population, using a campus-wide network of 476 access
points spread over 161 buildings. Our study expands on those done by
Tang and Baker, with a significantly larger and broader population.
We found that residential traffic dominated all other traffic,
particularly in residences populated by newer students; students are
increasingly choosing a wireless laptop as their primary computer.
Although web protocols were the single largest component of traffic
volume, network backup and file sharing contributed an unexpectedly
large amount to the traffic. Although there was some roaming within a
network session, we were surprised by the number of situations in
which cards roamed excessively, unable to settle on one access point.
Cross-subnet roams were an especial problem, because they broke IP
connections, indicating the need for solutions that avoid or
accommodate such roams.
%Z
A major revision of this paper appeared in Mobicom 2002.
The Mobicom paper contained two erroneous figures, however;
another report, TR2002-432, is the final corrected version.
%T Metasearch: Data Fusion for Document Retrieval
%A Mark H. Montague
%R Technical Report TR2002-424
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-424.ps.Z
%X
The metasearch problem is to optimally merge the ranked lists
output by an arbitrary number of search systems into one ranked list.
In this work:
(1) We show that metasearch improves upon not just the raw performance
of the input search engines, but also upon the consistency of the
input search engines from query to query.
(2) We experimentally prove that simply weighting input systems by
their average performance can dramatically improve fusion results.
(3) We show that score normalization is an important component of a
metasearch engine, and that dependence upon statistical outliers
appears to be the problem with the standard technique.
(4) We propose a Bayesian model for metasearch that outperforms the
best input system on average and has performance competitive with
standard techniques.
(5) We introduce the use of Social Choice Theory to the metasearch
problem, modeling metasearch as a democratic election. We adapt a
positional voting algorithm, the Borda Count, to create a metasearch
algorithm, achieving reasonable performance.
(6) We propose a metasearch model adapted from a majoritarian voting
procedure, the Condorcet algorithm. The resulting algorithm is the
best performing algorithm in a number of situations.
(7) We propose three upper bounds for the problem, each bounding a
different class of algorithms.
We present experimental results for each algorithm using two types of
experiments on each of four data sets.
%Z
Ph.D. dissertation.
%T The Future of Cryptography Under Quantum Computers
%A Marco A. Barreno
%R Technical Report TR2002-425
%I Dartmouth College, Computer Science
%C Hanover, NH
%D July 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-425.ps.Z
%X
Cryptography is an ancient art that has passed through many
paradigms, from simple letter substitutions to polyalphabetic
substitutions to rotor machines to digital encryption to public-key
cryptosystems. With the possible advent of quantum computers and the
strange behaviors they exhibit, a new paradigm shift in cryptography
may be on the horizon. Quantum computers could hold the potential to
render most modern encryption useless against a quantum-enabled
adversary. The aim of this thesis is to characterize this convergence
of cryptography and quantum computation.
We provide definitions for cryptographic primitives that frame them
in general terms with respect to complexity. We explore the various
possible relationships between BQP, the primary quantum complexity
class, and more familiar classes, and we analyze the possible
implications for cryptography.
%Z
This paper was written as a senior honors thesis with advisor Sean
W. Smith.
%T Role Definition Language (RDL): A Language to Describe Context-Aware Roles
%A Christopher P. Masone
%R Technical Report TR2002-426
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-426.ps.Z
%X
As wireless networks become more prevalent, a widening array
of computational resources becomes available to the mobile user.
Since not all users should have unrestricted access to these
resources, a method of access control must be devised. In a
context-aware environment, context information can be used to
supplement more conventional password-based access control systems.
We believe the best way to achieve this is through the use of
Context-Aware Role-Based Access Control, a model in which permissions
are assigned to entities called roles, each principal is a member of
one or more roles, and a role's membership is determined using
context information. We designed and implemented RDL
(Role-Definition Language), a simple, expressive and somewhat
extensible programming language to facilitate the description of
roles in terms of context information.
%Z
Senior Honors Thesis. Advisor: David Kotz.
%T Performance and Interoperability In Solar
%A A. Abram White
%R Technical Report TR2002-427
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-427.ps.Z
%X
Ubiquitous computing promises to integrate computers into our physical
environment, surrounding us with applications that are able to adapt
to our dynamics. Solar is a software infrastructure designed to
deliver contextual information to these applications. To serve the
large number and wide variety of context-aware devices envisioned by
ubiquitous computing, Solar must exhibit both high performance and the
ability to interoperate with many computing platforms. We created a
testing framework to measure the performance of distributed systems
such as Solar, as well as a pluggable data-transfer mechanism to
support the dissemination of information to heterogeneous
applications. This paper explores the testing framework developed,
analyzes its findings concerning the performance of the current Solar
prototype, presents several optimizations to Solar and their effects,
and finally discusses the design of the pluggable data-transfer
mechanism.
%Z
Senior Honors Thesis. Advisor: David Kotz.
See also TR2002-429.
%T Information-theoretic Bounds on the Training and Testing Error of Boosting
%A Sebastien M. Lahaie
%R Technical Report TR2002-428
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-428.ps.Z
%X
Boosting is a means of using weak learners as subroutines to produce a
strong learner with markedly better accuracy. Recent results showing
the connection between logistic regression and boosting provide the
foundation for an information-theoretic analysis of boosting. We
describe the analogy between boosting and gambling, which allows us to
derive a new upper bound on training error. This upper bound
explicitly describes the effect of noisy data on training error. We
also use information-theoretic techniques to derive an alternative upper bound
on testing error which is independent of the size of the weak-learner
space.
%T XSLT and XQuery as Operator Languages
%A A. Abram White
%R Technical Report TR2002-429
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-429.ps.Z
%X
Ubiquitous computing promises to integrate computers into our physical
environment, surrounding us with applications that are able to adapt
to our dynamics. Solar is a software infrastructure designed to
deliver contextual information to these applications. Solar
represents context data as events, and uses small programs called
operators to filter, merge, aggregate, or transform event streams.
This paper explores the possibility of using XSLT and XQuery to build
language-neutral Solar operators.
%Z
See also TR2002-427.
%T Building Trusted Paths for Web Browsers
%A Eileen Zishuang Ye
%R Technical Report TR2002-430
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-430.ps.Z
%X
The communication between the Web browser and the human
user is one component of the server-client channel. It is not
the user but the browser that receives all server
information and establishes the secure connection. The
browser's user interface signals, such as the SSL lock and the
https protocol header, indicate whether the browser-server
communication at the current moment is secure. Those
user interface signals indicating the security status of
the browser should be clearly and correctly understood by the
user.
A survey of modern Web browsers shows the information
provided by current browsers is insufficient for users to
make trust judgments. Our Web spoofing work further proved
that the browser status information is not reliable either.
We discuss the criteria for trusted paths between a browser
and a human user, and how to build them. We present
an open source implementation of one of the
designs, the synchronized random dynamic (SRD) boundary, based
on modified Mozilla source code, together with its usability
study results.
%T Analysis of Protein Sequences Using Time Frequency and Kolmogorov-Smirnov Methods
%A Kobby Essien
%R Technical Report TR2002-431
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-431.ps.Z
%X
The plethora of genomic data currently available has resulted in a
search for new algorithms and analysis techniques to interpret
genomic data. In this two-fold study we explore techniques for
locating critical amino acid residues in protein sequences and for
estimating the similarity between proteins. We demonstrate the use
of the Short-Time Fourier Transform and the Continuous Wavelet
Transform together with amino acid hydrophobicity in locating
important amino acid domains in proteins and also show that the
Kolmogorov-Smirnov statistic can be used as a metric of protein
similarity.
%Z
Senior Honors Thesis. Advisor: Metin Akay.
%T Analysis of a Campus-wide Wireless Network
%A David Kotz
%A Kobby Essien
%R Technical Report TR2002-432
%I Dartmouth College, Computer Science
%C Hanover, NH
%D September 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-432.ps.Z
%X
Understanding usage patterns in wireless local-area networks (WLANs)
is critical for those who develop, deploy, and manage WLAN technology,
as well as those who develop systems and application software for
wireless networks. This paper presents results from the largest and
most comprehensive trace of network activity in a large, production
wireless LAN. For eleven weeks we traced the activity of nearly two
thousand users drawn from a general campus population, using a
campus-wide network of 476 access points spread over 161 buildings.
Our study expands on those done by Tang and Baker, with a
significantly larger and broader population.
We found that residential traffic dominated all other traffic,
particularly in residences populated by newer students; students are
increasingly choosing a wireless laptop as their primary computer.
Although web protocols were the single largest component of traffic
volume, network backup and file sharing contributed an unexpectedly
large amount to the traffic. Although there was some roaming within a
network session, we were surprised by the number of situations in
which cards roamed excessively, unable to settle on one access point.
Cross-subnet roams were an especial problem, because they broke IP
connections, indicating the need for solutions that avoid or
accommodate such roams.
%Z
This paper is a revision of the MOBICOM '02 paper by the same
title. The only difference is the correction of Figures 27-28
and the associated text. This report supplants TR2002-423.
%T Using the Emulab network testbed to evaluate the Armada I/O framework for computational grids
%A Ron Oldfield
%A David Kotz
%R Technical Report TR2002-433
%I Dartmouth College, Computer Science
%C Hanover, NH
%D September 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-433.ps.Z
%X
This short report describes our experiences using the Emulab network testbed
at the University of Utah to test performance of the Armada framework for
parallel I/O on computational grids.
%T Probabilistic Disease Classification of Expression-Dependent Proteomic Data from Mass Spectrometry of Human Serum
%A Ryan H. Lilien
%A Hany Farid
%A Bruce R. Donald
%R Technical Report TR2002-434
%I Dartmouth College, Computer Science
%C Hanover, NH
%D October 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-434.ps.Z
%X
We have developed an algorithm called Q5 for probabilistic
classification of healthy vs. disease whole serum samples using mass
spectrometry. The algorithm employs Principal Components Analysis (PCA)
followed by Linear Discriminant Analysis (LDA) on whole spectrum
Surface-Enhanced Laser Desorption/Ionization Time of Flight
(SELDI-TOF) Mass Spectrometry (MS) data, and is demonstrated on four
real datasets from complete, complex SELDI spectra of human blood
serum.
Q5 is a closed-form, exact solution to the problem of classification
of complete mass spectra of a complex protein mixture. Q5 employs a
novel probabilistic classification algorithm built upon a
dimension-reduced linear discriminant analysis. Our solution is
computationally efficient; it is non-iterative and computes the
optimal linear discriminant using closed-form equations. The optimal
discriminant is computed and verified for datasets of complete,
complex SELDI spectra of human blood serum. Replicate experiments of
different training/testing splits of each dataset are employed to
verify robustness of the algorithm. The probabilistic classification
method achieves excellent performance. We achieve sensitivity,
specificity, and positive predictive values above 97% on three
ovarian cancer datasets and one prostate cancer dataset. The Q5 method
outperforms previous full-spectrum complex sample spectral
classification techniques, and can provide clues as to the molecular
identities of differentially-expressed proteins and peptides.
%Z
To appear in Journal of Computational Biology (2003).
%T Distributed Algorithms for Guiding Navigation across a Sensor Network
%A Qun Li
%A Michael De Rosa
%A Daniela Rus
%R Technical Report TR2002-435
%I Dartmouth College, Computer Science
%C Hanover, NH
%D October 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-435.ps.Z
%X
We develop distributed algorithms for self-reconfiguring sensor
networks that respond to directing a target through a region. The
sensor network models the danger levels sensed across its area and has
the ability to adapt to changes. It represents the dangerous areas as
obstacles. A protocol that combines the artificial potential field of
the sensors with the goal location for the moving object guides the
object incrementally across the network to the goal, while maintaining
the safest distance to the danger areas. We report on hardware
experiments using a physical sensor network consisting of Mote
sensors.
%T Heterogeneous Self-Reconfiguring Robotics
%A Robert C. Fitch
%R Technical Report TR2002-436
%I Dartmouth College, Computer Science
%C Hanover, NH
%D November 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-436.ps.Z
%X
Self-reconfiguring robots are modular systems that can change shape,
or "reconfigure," to match structure to task. They comprise many
small, discrete, often identical modules that connect together and
that are minimally actuated. Global shape transformation is achieved
by composing local motions. Systems with a single module type, known
as "homogeneous" systems, gain fault tolerance, robustness and low
production cost from module interchangeability. However, we are
interested in "heterogeneous" systems, which include multiple types of
modules such as those with sensors, batteries or wheels. We believe
that heterogeneous systems offer the same benefits as homogeneous
systems with the added ability to match not only structure to task,
but also capability to task.
Although significant results have been achieved in understanding
homogeneous systems, research in heterogeneous systems is challenging
as key algorithmic issues remain unexplored. We propose in this thesis
to investigate questions in four main areas: 1) how to classify
heterogeneous systems, 2) how to develop efficient heterogeneous
reconfiguration algorithms with desired characteristics, 3) how to
characterize the complexity of key algorithmic problems, and 4) how to
apply these heterogeneous algorithms to perform useful new tasks in
simulation and in the physical world. Our goal is to develop an
algorithmic basis for heterogeneous systems. This has theoretical
significance in that it addresses a major open problem in the field,
and practical significance in providing self-reconfiguring robots with
increased capabilities.
%T Proofs of Soundness and Strong Normalization for Linear Memory Types
%A Heng Huang
%A Chris Hawblitzel
%R Technical Report TR2002-437
%I Dartmouth College, Computer Science
%C Hanover, NH
%D November 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-437.ps.Z
%X
Efficient low-level systems need more control over memory than safe high-level languages usually provide. As a result, run-time systems are typically written in unsafe languages such as C. This report describes an abstract machine designed to give type-safe code more control over memory. It includes complete definitions and proofs.
%T Exact formulae for the Lovasz Theta Function of sparse Circulant Graphs
%A Valentino Crespi
%R Technical Report TR2002-438
%I Dartmouth College, Computer Science
%C Hanover, NH
%D November 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-438.ps.Z
%X
The Lovasz theta function has attracted a lot of attention for its
connection with diverse issues, such as communicating without errors
and computing large cliques in graphs. Indeed this function enjoys the
remarkable property of being computable in polynomial time, despite
being sandwiched between clique and chromatic number, two well-known
hard-to-compute quantities.
In this paper I provide a closed formula for the Lovasz function of a
specific class of sparse circulant graphs, thus generalizing Lovasz's
results on cycle graphs (circulant graphs of degree 2).
%T 3D-Structural Homology Detection via Unassigned Residual Dipolar Couplings
%A Chris J. Langmead
%A Bruce R. Donald
%R Technical Report TR2003-439
%I Dartmouth College, Computer Science
%C Hanover, NH
%D January 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-439.pdf
%X
Recognition of a protein's fold provides valuable
information about its function. While many sequence-based
homology prediction methods exist, an important challenge remains:
two highly dissimilar sequences can have similar folds --- how can
we detect this rapidly, in the context of structural genomics?
High-throughput NMR experiments, coupled with novel algorithms for
data analysis, can address this challenge. We report an automated
procedure for detecting 3D-structural homologies from sparse,
unassigned protein NMR data.
Our method identifies the 3D-structural models in a protein
structural database whose geometries best fit the unassigned
experimental NMR data. It does not use sequence information and is
thus not limited by sequence homology. The method can also be
used to confirm or refute structural predictions made by other
techniques such as protein threading or sequence homology. The
algorithm runs in O(pnk^3) time, where p is the number of
proteins in the database, n is the number of residues in the
target protein, and k is the resolution of a rotation search.
The method requires only uniform 15N-labelling of the protein
and processes unassigned 1H-15N residual dipolar couplings,
which can be acquired in a couple of hours. Our experiments on NMR
data from 5 different proteins demonstrate that the method
identifies closely related protein folds, despite low sequence
homology between the target protein and the computed
model.
%Z
A revised and expanded version of this TR will appear as a refereed
paper at the IEEE Computer Society Bioinformatics Conference,
Stanford, California (August 2003).
%T Efficient Security for BGP Route Announcements
%A David M. Nicol
%A Sean W. Smith
%A Meiyuan Zhao
%R Technical Report TR2003-440
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-440.R2.ps.Z
%X
The Border Gateway Protocol (BGP) determines how Internet traffic is
routed throughout the entire world; malicious behavior by one or more
BGP speakers could create serious security issues. Since the protocol
depends on a speaker honestly reporting path information sent by
previous speakers and involves a large number of independent speakers,
the Secure BGP (S-BGP) approach uses public-key cryptography to ensure
that a malicious speaker cannot fabricate this information. However,
such public-key cryptography is expensive: S-BGP requires a digital
signature operation on each announcement sent to each peer, and a
linear (in the length of the path) number of verifications on each
receipt. We use simulation of a 110 AS system derived from the
Internet to evaluate the impact that the processing costs of
cryptography have on BGP convergence time. We find that under heavy
load the convergence time using ordinary S-BGP is nearly twice as
large as under BGP. We examine the impact of highly aggressive caching
and pre-computation optimizations for S-BGP, and find that convergence
time is much closer to BGP. However, these optimizations may be
unrealistic, and are certainly expensive in memory. We consequently
use the structure of BGP processing to design optimizations that
reduce cryptographic overhead by amortizing the cost of private-key
signatures over many messages. We call this method
Signature-Amortization (S-A). We find that S-A provides convergence
times as good as or better than those of the highly optimized S-BGP, but
without the cost and complications of caching and pre-computation. It is
possible therefore to minimize the impact route validation has on
convergence, by being careful with signatures, rather than consumptive
of memory.
%Z
Revision 2 released May 9, 2003.
Original revision 1, of February 2003, is available in pdf or ps.Z.
%T Flexible and Scalable Public Key Security for SSH
%A Yasir Ali
%A S. W. Smith
%R Technical Report TR2003-441
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-441.pdf
%X
A standard tool for secure remote access, the SSH protocol uses public-key cryptography to establish an encrypted and integrity-protected channel with a remote server. However, widely-deployed implementations of the protocol are vulnerable to man-in-the-middle attacks, where an adversary substitutes her public key for the server's. This danger particularly threatens a traveling user Bob borrowing a client machine.
Imposing a traditional X.509 PKI on all SSH servers and clients is neither flexible nor scalable nor (in the foreseeable future) practical. Requiring extensive work or an SSL server at Bob's site is also not practical for many users.
This paper presents our experiences designing and implementing an alternative scheme that solves the public-key security problem in SSH without requiring such an a priori universal trust structure or extensive sysadmin work--although it does require a modified SSH client. (The code is available for public download.)
%Z
A revised version was published as a conference paper, as follows:
Y. Ali, S.W. Smith.
"Flexible and Scalable Public Key Security for SSH."
Public Key Infrastructure: EuroPKI 2004, pp. 43-56,
LNCS 3093, June 2004. Springer-Verlag. DOI 10.1007/b98201.
http://dx.doi.org/10.1007/b98201
http://www.springerlink.com/content/h8jng6kc1rf3j97g/
%T Privacy-enhanced credential services
%A Alex Iliev
%A Sean Smith
%R Technical Report TR2003-442
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-442.ps.Z
%X
The use of credential directories in PKI and authorization systems such as
Shibboleth introduces a new privacy risk: an insider at the directory can learn
much about otherwise protected interactions by observing who makes queries, and
what they ask for. Recent advances in Practical Private Information Retrieval
provide promising countermeasures. In this paper, we extend this technology to
solve this new privacy problem, and present a design and preliminary prototype
for an LDAP-based credential service that can prevent even an insider from
learning anything more than the fact that a query was made. Our preliminary
performance analysis suggests that the complete prototype may be sufficiently
robust for academic enterprise settings.
%Z
Submitted to the 2nd Annual PKI Research Workshop.
%T Keyjacking: Risks of the Current Client-side Infrastructure
%A John C. Marchesini
%A Sean W. Smith
%A Meiyuan Zhao
%R Technical Report TR2003-443
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-443.ps.Z
%X
In theory, PKI can provide a flexible and strong way to
authenticate users in distributed information systems. In practice,
much is being invested in realizing this vision via client-side SSL
and browser-based keystores. Exploring this vision, we demonstrate
that browsers will use personal certificates to authenticate requests
that the person neither knew of nor approved (and which password-based
systems would have defeated), and we demonstrate the easy permeability
of these keystores (including new attacks on medium- and high-security
IE/XP keys). We suggest some countermeasures, but also suggest that a
fundamental rethinking of the trust, usage, and storage model might
result in a more effective PKI.
%T Stupid Columnsort Tricks
%A Geeta Chaudhry
%A Thomas H. Cormen
%R Technical Report TR2003-444
%I Dartmouth College, Computer Science
%C Hanover, NH
%D April 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-444.ps.Z
%X
Leighton's columnsort algorithm sorts on an $r \times s$ mesh, subject
to the restrictions that $s$ is a divisor of~$r$ and that $r \geq
2s^2$ (so that the mesh is tall and thin). We show how to mitigate
both of these restrictions. One result is that the requirement that
$s$ is a divisor of~$r$ is unnecessary; columnsort sorts correctly
whether or not $s$ divides~$r$. We present two algorithms that, as
long as $s$ is a perfect square, relax the restriction that $r \geq
2s^2$; both reduce the exponent of~$s$ to~$3/2$. One algorithm
requires $r \geq 4s^{3/2}$ if $s$ divides~$r$ and $r \geq 6s^{3/2}$ if
$s$ does not divide~$r$. The other algorithm requires $r \geq
4s^{3/2}$, and it requires $s$ to be a divisor of~$r$. Both algorithms
have applications in increasing the maximum problem size in
out-of-core sorting programs.
%T Relaxing the Problem-Size Bound for Out-of-Core Columnsort
%A Geeta Chaudhry
%A Elizabeth A. Hamon
%A Thomas H. Cormen
%R Technical Report TR2003-445
%I Dartmouth College, Computer Science
%C Hanover, NH
%D April 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-445.ps.Z
%X
Previous implementations of out-of-core columnsort limit the problem
size to $N \leq \sqrt{(M/P)^3 / 2}$, where $N$ is the number of
records to sort, $P$ is the number of processors, and $M$ is the total
number of records that the entire system can hold in its memory (so
that $M/P$ is the number of records that a single processor can hold
in its memory). We implemented two variations to out-of-core
columnsort that relax this restriction. Subblock columnsort is based
on an algorithmic modification of the underlying columnsort algorithm,
and it improves the problem-size bound to $N \leq (M/P)^{5/3} /
4^{2/3}$ but at the cost of additional disk I/O\@. $M$-columnsort
changes the notion of the column size in columnsort, improving the
maximum problem size to $N \leq \sqrt{M^3 / 2}$ but at the cost of
additional computation and communication. Experimental results on a
Beowulf cluster show that both subblock columnsort and $M$-columnsort
run well but that $M$-columnsort is faster. A further advantage of
$M$-columnsort is that it handles a wider range of problem sizes than
subblock columnsort.
%T Efficient and Practical Constructions of LL/SC Variables
%A Prasad Jayanti
%A Srdjan Petrovic
%R Technical Report TR2003-446
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-446.pdf
%X
Over the past decade, LL/SC have emerged as the most suitable
synchronization instructions for the design of lock-free algorithms.
However, current architectures do not support these instructions;
instead, they support either CAS or RLL/RSC (e.g. POWER4, MIPS, SPARC, IA-64). To bridge this gap, this paper presents two efficient wait-free
algorithms for implementing 64-bit LL/SC objects from 64-bit CAS or
RLL/RSC objects.
Our first algorithm is practical: it has a small, constant time complexity (of 4 for LL and 5 for SC) and a space overhead of only 4 words per process. This algorithm uses unbounded sequence numbers.
For theoretical interest, we also present a more complex bounded algorithm
that still guarantees constant time complexity and O(1) space overhead per process.
The LL/SC primitive is free of the well-known ABA problem that afflicts CAS. By efficiently implementing LL/SC words from CAS words,
this work presents an efficient general solution to the ABA problem.
%T Billiards Adviser as a Search in a Continuous Domain with Significant Uncertainty
%A Thomas Mueller
%R Technical Report TR2003-448
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-448.pdf
%X
Typical search algorithms are limited to problems in which there is a certain number of moves for any given state, and the effect of each move is well known. In order to overcome this limitation, we consider the problem of determining the optimal shot given the positions of balls on a billiards table. Our solution includes the image recognition necessary to determine each ball's position, the calculation of the optimal shot, and the presentation of that shot to the player. The focus of the paper is on the second part: determining the angle and force with which the player should attempt to hit the cue ball for each shot in order to sink all of the other balls with the fewest shots. The solution to this problem differs from other game search algorithms in that it must take into account the infinite number of possible shots given any configuration of balls, as well as the fact that the player is not likely to hit the ball exactly as intended. We compare the performance of our algorithm with one that ignores the latter fact to show that our modifications do in fact improve performance for a search in a continuous domain with significant uncertainty.
%Z
Advisor: Zack Butler
%T An Active Learning Approach to Efficiently Ranking Retrieval Engines
%A Lisa A. Torrey
%R Technical Report TR2003-449
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-449.ps.Z
%X
Evaluating retrieval systems, such as those submitted to the
annual TREC competition, usually requires a large number of
documents to be read and judged for relevance to query
topics. Test collections are far too big to be exhaustively
judged, so only a subset of documents is selected to form
the judgment ``pool.'' The selection method that TREC uses
produces pools that are still quite large. Research has
indicated that it is possible to rank the retrieval systems
correctly using substantially smaller pools.
This paper introduces an active learning algorithm whose
goal is to reach the correct rankings using the smallest
possible number of relevance judgments. It adds one document
to the pool at a time, always trying to select the document
with the highest information gain. Several variants of this
algorithm are described, each with improvements on the one
before. Results from experiments are included for comparison
with the traditional TREC pooling method. The best version
of the algorithm reliably outperforms the traditional
method, although its degree of improvement varies.
%Z
Senior Honors Thesis. Advisor: Jay Aslam.
%T An Evaluation of the Impact of Models for Radio Propagation on the Simulation of 802.11b Wireless Networks
%A Evan W. Richardson
%R Technical Report TR2003-450
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-450.ps.Z
%X
Working with an existing wireless network simulator, we describe the addition
of methods both for modeling arbitrary terrain and for calculating signal
attenuation with the Irregular Terrain Model (ITM). We also investigate ITM's effects on the upper protocol layers in comparison to the Two-Ray Ground Reflection
model. Upon examination, it was found that aside from the terrain between the
transmitter and receiver, ITM's various parameters are of little significance
in the computed signal attenuation. Further, examination of the behavior of the
upper protocol layers revealed that at high traffic levels, the choice of propagation
model can have significant effects on the results of the simulation.
%Z
Senior Honors Thesis. Advisor: Felipe Perrone.
%T 802.11b Wireless Network Visualization and Radiowave Propagation Modeling
%A Chris Lentz
%R Technical Report TR2003-451
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-451.pdf
%X
This paper outlines the methods of creating detailed coverage maps
of 802.11b networks, with an emphasis on minimizing the expenses and
time involved. The goal of this work is to develop and present a
streamlined, reproducible approach to wireless visualization as well
as techniques for predicting coverage area before conducting network
installations.
After evaluating these coverage maps, a repeated series of field
measurements will be checked against interpolated values in order to
improve techniques for extrapolation of data for unsampled regions. If
successful, these extrapolation techniques will provide additional
guidelines for, and assist modeling of, new wireless network
installations. However, this paper demonstrates that due to the
microcellular structure of indoor/outdoor 802.11b networks, accurate
interpolation and propagation prediction techniques do not exist
independent of highly specific location models. In lieu of the
creation of extensive simulation environments, best practice
guidelines for municipal wireless network planning and deployment are
presented.
%Z
Senior thesis. Advisor: David Kotz.
%T An Analysis of Convergence Properties of the Border Gateway Protocol Using Discrete Event Simulation
%A Brian J. Premore
%R Technical Report TR2003-452
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-452.ps.Z
%X
The Internet is an enormous internetwork formed by connecting tens of
thousands of independently managed computer networks. Though the
Internet has no central authority and is highly heterogeneous, a
universally adopted addressing scheme---defined by the Internet Protocol
(IP)---makes interaction between the individual networks possible.
Complementing IP is the Border Gateway Protocol (BGP), which facilitates
communication between parts of the internetwork by determining paths by
which data can get from one network to any other. Just as IP is used
ubiquitously as an addressing scheme, BGP is used ubiquitously for the
purpose of network-to-network routing.
Because BGP is universal, its well-being is the concern of everyone.
In other words, when BGP suffers, everyone suffers. Even when just one
instance of BGP on one router is ill-behaved, it can have global
effects. Unfortunately, as the Internet has grown, the amount of stress
put on BGP has increased. For a long time, the behavior of inter-domain
routing was studied minimally and was assumed to be working just fine.
Research eventually showed, however, that routing was not actually
functioning so smoothly, and the highly dynamic nature of the Internet
was taking its toll on the routing infrastructure. This discovery
prompted a closer look at the behavior of BGP.
Though its underlying premise is a simple distributed shortest-path
algorithm, the dynamic nature of the Internet, combined with some
additional constraints in the protocol, has made analytical approaches
to studying the protocol infeasible. Measurement-based approaches have
been taken, but they are difficult to implement and have minimal leeway
for allowing exploration of the protocol's behavior under different
conditions. For these reasons we have taken the approach of simulation
in order to begin to understand some of the complex ways in which BGP
behaves. Simulation allows one to explore the protocol more fully,
testing it under various conditions and modifying the protocol itself to
explore the consequences of its fundamental design.
We have studied BGP behavior with respect to several parameters, some
external (network characteristics) and some internal (protocol
characteristics). We show that there is room for improvement in the
protocol, in particular with respect to convergence following changes in
availability of an address in the network. The rate-limiting mechanism
of the protocol is a particular parameter of concern. Although it was
previously thought to help improve convergence, we found that in some
cases it can have drastic degrading effects. As a result of our work,
we suggest ways in which BGP could be modified in practice to reduce the
instability of the protocol.
%Z
This is a Ph.D. thesis. It differs from the version of the thesis which
appears in the Dartmouth College library in a couple of significant
ways. First, it is single-spaced. Figures have moved around in order
to accommodate this change. Second, it includes some corrections. The
primary correction is that some bibliography entries were reordered in
order to properly alphabetize them. This has the side effect that the
numbered citations throughout the document are different in this version
than in the original version.
%T SPADE: SPKI/SDSI for Attribute Release Policies in a Distributed Environment
%A Sidharth P. Nazareth
%R Technical Report TR2003-453
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-453.ps.Z
%X
Shibboleth is a federated administration system that supports
inter-institutional authentication and authorization for sharing of
resources. SPKI/SDSI is a public key infrastructure whose creation
was motivated by the perception that X.509 is too complex and
flawed. This thesis addresses the problem of how users that are part
of a Public Key Infrastructure in a distributed computing system can
effectively specify, create, and disseminate their Attribute Release
Policies for Shibboleth using SPKI/SDSI. This thesis explores
existing privacy mechanisms, as well as distributed trust management
and policy based systems. My work describes the prototype for a Trust
Management Framework called SPADE (SPKI/SDSI for Attribute Release
Policies in a Distributed Environment) that I have designed,
developed and implemented. The principal result of this research has
been the demonstration that SPKI/SDSI is a viable approach for trust
management and privacy policy specification, especially for
minimalistic policies in a distributed environment.
%Z
M.S. Thesis.
Advisor: Sean Smith
%T Discrete-Event Fluid Modeling of Background TCP Traffic
%A David M. Nicol
%A Guanhua Yan
%R Technical Report TR2003-454
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-454.ps.Z
%X
TCP is the most widely
used transport layer protocol in the Internet today.
A TCP session adapts the demands it places on the network
to observations of bandwidth availability on the network.
Because TCP is adaptive, any model of its behavior that aspires to be
accurate must be influenced by other network traffic.
This point is especially important in the context of
using simulation to evaluate some new network algorithm of interest
(e.g. reliable multi-cast) in an environment where the background
traffic affects---and is affected by---its behavior.
We need to generate background traffic efficiently in a way
that captures the salient features of TCP, while
the reference and background traffic representations
interact with each other.
This paper describes a fluid model of TCP and a switching
model that has flows represented by fluids interacting with
packet-oriented flows. We describe
conditions under which a fluid model
produces exactly the same behavior
as a packet-oriented model, and we
quantify the performance advantages of the approach
both analytically and empirically. We observe that very significant
speedups may be attained while keeping high accuracy.
%T Persistence and Prevalence in the Mobility of Dartmouth Wireless Network Users
%A Clara E. Lee
%R Technical Report TR2003-455
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-455.ps.Z
%X
Wireless local-area networks (WLANs) are increasing in popularity. As
more people use WLANs it is important to understand how these users
behave. We analyzed data collected over three months of 2002 to
measure the persistence and prevalence of users of the Dartmouth
wireless network.
We found that most of the users of Dartmouth's network have
short association times and a high rate of mobility. This
observation fits with the predominantly student population of
Dartmouth College, because students do not have a fixed
workplace and are moving to and from classes all day.
%Z
The data in this paper is highly suspect; see TR2003-480.
%T Discovery, Visualization and Analysis of Gene Regulatory Sequence Elements in Genomes
%A Daniel F. Simola
%R Technical Report TR2003-456
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-456.pdf
%X
The advent of rapid DNA sequencing has produced an explosion in the amount of available sequence information, permitting us to ask many new questions about DNA. There is a pressing need to design algorithms that can provide answers to questions related to the control of gene expression, and thus to the structure, function, and behavior of organisms. Such algorithms must filter through massive amounts of informational noise to identify meaningful conserved regulatory DNA sequence elements.
We are approaching these questions with the notion that visualization is a key to exploring data relationships. Understanding the exact nature of these relationships can be very difficult by simply interpreting raw data. The ability to look at data in a graphical form allows us to apply our innate capacity to think visually to discern the subtle relationships that might not be recognizable otherwise.
This thesis provides computational tools to visually identify and analyze candidate motifs in the DNA of a species. This includes a parsing utility to store genomic data and an application to search for and visually identify motifs. Using these tools, novel and previously compiled gene sets were identified using the genome of the plant species Arabidopsis thaliana.
%Z
Senior Honors Thesis. Advisor: Jay Aslam.
%T Electronic Documents and Digital Signatures
%A Kunal Kain
%R Technical Report TR2003-457
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-457.pdf
%X
Often, the main motivation for using PKI in business environments is to streamline workflow, by enabling humans to digitally sign electronic documents, instead of manually signing paper ones. However, this application fails if adversaries can construct electronic documents whose viewed contents can change in useful ways, without invalidating the digital signature. In this paper, we examine the space of such attacks, and describe how many popular electronic document formats and PKI packages permit them.
%Z
A revised version was published as follows:
K. Kain, S.W. Smith, R. Asokan.
"Digital Signatures and Electronic Documents: A Cautionary Tale."
Advanced Communications and Multimedia Security,
pp. 293-307, September 2002. Kluwer Academic Publishers.
http://portal.acm.org/citation.cfm?id=647802.737169
http://www.cs.dartmouth.edu/~sws/pubs/ksa02.pdf
%T Power Conservation in the Network Stack of Wireless Sensors
%A Michael De Rosa
%R Technical Report TR2003-458
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-458.ps.Z
%X
Wireless sensor networks have recently become an incredibly active research area in the networking community. Much attention has been given to the construction of power-conserving protocols and techniques, as battery life is the one factor that prevents successful wide-scale deployment of such networks. These techniques concentrate on the optimization of network behavior, as the wireless transmission of data is the most expensive operation performed by a sensor node. Very little work has been published on the integration of such techniques, and their suitability to various application domains. This paper presents an exhaustive power consumption analysis of network stacks constructed with common algorithms, to determine the interactions between such algorithms and the suitability of the resulting network stack for various applications.
%Z
Senior Honors Thesis. Advisor: Bob Gray.
%T Efficient I/O for Computational Grid Applications
%A Ron A. Oldfield
%R Technical Report TR2003-459
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-459.ps.Z
%X
High-performance computing increasingly occurs on "computational grids"
composed of heterogeneous and geographically distributed systems of
computers, networks, and storage devices that collectively act as a
single "virtual" computer. A key challenge in this environment is to
provide efficient access to data distributed across remote data
servers. This dissertation explores some of the issues associated
with I/O for wide-area distributed computing and describes an I/O
system, called Armada, with the following features: a framework to
allow application and dataset providers to flexibly compose graphs
of processing modules that describe the distribution, application
interfaces, and processing required of the dataset before or after
computation; an algorithm to restructure application graphs to
increase parallelism and to improve network performance in a wide-area
network; and a hierarchical graph-partitioning scheme that deploys
components of the application graph in a way that is both beneficial
to the application and sensitive to the administrative policies of
the different administrative domains. Experiments show that
applications using Armada perform well in both low- and high-bandwidth
environments, and that our approach does an exceptional job of
hiding the network latency inherent in grid computing.
%Z
This is a reformatted version of Ron Oldfield's Ph.D. dissertation.
Unlike the dissertation submitted to Dartmouth College, this version
is single-spaced, uses 11pt fonts, and is formatted specifically for
double-sided printing.
%T Enhancing Asynchronous Parallel Computing
%A Elizabeth A. Hamon
%R Technical Report TR2003-460
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-460.ps.Z
%X
In applications using large amounts of data, hiding the latency
inherent in accessing data far from the processor is often necessary
in order to achieve high performance. Several researchers have
observed that one way to address the challenge of latency is by using
a common structure: in a series of passes, the program reads in the
data, performs various operations on it, and writes out the data.
Passes often consist of a pipeline structure composed of different
stages. In order to achieve high performance, the stages are
frequently overlapped, for example, by using asynchronous threads.
Out-of-core parallel programs provide one such example of this
pattern. The development and debugging time resulting from
coordinating overlapping stages, however, can be substantial.
Moreover, modifying the structure of the overlap in an attempt to
achieve higher performance can require significant additional time on
the part of the programmer. This thesis presents FG, a Framework
Generator designed to coordinate the stages of a pipeline and allow
the programmer to easily experiment with the pipeline's structure,
thus significantly reducing time to solution. We also discuss
preliminary results of using FG in an out-of-core sorting program.
%T Trusted S/MIME Gateways
%A Mindy J. Pereira
%R Technical Report TR2003-461
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-461.ps.Z
%X
The utility of Web-based email clients is clear: a user is able to
access their email account from any computer anywhere at any
time. However, this option is unavailable to users whose security
depends on their key pair being stored either on their local computer
or in their browser. Our implementation seeks to solve two problems
with secure email services. The first is that of mobility: users must
have access to their key pairs in order to perform the necessary
cryptographic operations. The second is one of transition: initially,
users would not want to give up their regular email clients. Keeping
these two restrictions in mind, we decided on the implementation of a
secure gateway system that works in conjunction with an existing mail
server and client. Our result is PKIGate, an S/MIME gateway that uses
the DigitalNet (formerly Getronics) S/MIME Freeware Library and IBM's
4758 secure coprocessor. This thesis presents motivations for the
project, a comparison with similar existing products, software and
hardware selection, the design, use case scenarios, a discussion of
implementation issues, and suggestions for future work.
%Z
Senior Honors Thesis. Advisor: Sean Smith.
%T Distributed planning and control for modular robots with unit-compressible modules
%A Zack Butler
%A Daniela Rus
%R Technical Report TR2003-462
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-462.ps.Z
%X
Self-reconfigurable robots are versatile systems consisting of large
numbers of independent modules. Effective use of these systems
requires parallel actuation and planning, both for efficiency and
independence from a central controller. This paper presents the
PacMan algorithm, a technique for distributed actuation and planning
for systems with two- or three-dimensional unit-compressible modules.
We give two versions of the algorithm along with correctness analysis.
We also analyze the parallel actuation capability of the algorithm,
showing that it will not deadlock and will avoid disconnecting the
robot. We have implemented PacMan on the Crystal robot, a hardware
system developed in our lab, and we present experiments and discuss
the feasibility of large-scale implementation.
%T Investigation of Third Party Rights Service and Shibboleth Modification to Introduce the Service
%A Sanket Agrawal
%R Technical Report TR2003-463
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-463.ps.Z
%X
Shibboleth is an architecture to support inter-institutional sharing of
electronic resources that are subject to access control. Codifying
copyright in Shibboleth authorization policies is difficult because of the
copyright exceptions, which can be highly subjective. Third Party Rights
Service is a high-level concept that has been suggested as a solution to
approximate the exceptions of copyright law. In this thesis, I investigate
the components of the Third Party Rights Service. I design and analyze a
modified Shibboleth architecture based on these components. The resulting
architecture allows for the phased addition of the resources to make use
of the Third Party Rights Service, while keeping the existing resources in
Shibboleth.
%T A Surface-based Approach for Classification of 3D Neuroanatomic Structures
%A Li Shen
%A James Ford
%A Fillia Makedon
%A Andrew Saykin
%R Technical Report TR2003-464
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-464.ps.Z
%X
We present a new framework for 3D surface object classification that combines a powerful shape description method with suitable pattern classification techniques. Spherical harmonic parameterization and normalization techniques are used to describe a surface shape and derive a dual high dimensional landmark representation. A point distribution model is applied to reduce the dimensionality. Fisher's linear discriminants and support vector machines are used for classification. Several feature selection schemes are proposed for learning better classifiers. After showing the effectiveness of this framework using simulated shape data, we apply it to real hippocampal data in schizophrenia and perform extensive experimental studies by examining different combinations of techniques. We achieve best leave-one-out cross-validation accuracies of 93% (whole set, N=56) and 90% (right-handed males, N=39), respectively, which are competitive with the best results in previous studies using different techniques on similar types of data. Furthermore, to help medical diagnosis in practice, we employ a threshold-free receiver operating characteristic (ROC) approach as an alternative evaluation of classification results as well as propose a new method for visualizing discriminative patterns.
%T Using Low Level Linear Memory Management for Type-Preserving Mark-Sweep Garbage Collector
%A Edward Wei
%R Technical Report TR2003-465
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-465.ps.Z
%X
Efficient low-level systems such as garbage collectors need more
control over memory than safe high-level languages usually provide.
Due to this constraint, garbage collectors are typically written in
unsafe languages such as C. A collector of this form usually resides
as a trusted primitive runtime service outside the model of the
programming language. The type safety of these languages depends on
the assumption that the garbage collector will not violate any typing
invariants. However, no realistic systems provide proof of this
assumption.
A garbage collector written in a strongly typed language can
guarantee not only the safety of the garbage collector and the program
being garbage collected (mutator), but also the interaction between
the collector and the mutator. Removing the garbage collector from
the trusted computing base has many additional benefits: Untrusted
code could be given more control over memory management without
sacrificing security. Low-level code such as device drivers could
interface in a safe way with a garbage collector. These benefits,
together with the growing prevalence of garbage collectors in the
typical programming system, necessitate a safe solution.
Previous research by Wang et al. introduced a safe copying
collector based on regions, where the live graph structure of the heap
is copied from an old region to a newer region. This paper seeks to
improve the efficiency of type-preserving garbage collection with the
introduction of a type-preserving mark and sweep garbage collector.
%Z
Undergraduate Honors Thesis. Advisor: Chris Hawblitzel.
Associated software release at http://www.cs.dartmouth.edu/~hawblitz/
%T Digital Art Forensics
%A Siwei Lyu
%A Daniel Rockmore
%A Hany Farid
%R Technical Report TR2003-466
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-466.pdf
%X
We describe a computational technique for digitally authenticating
works of art. This approach builds statistical models of an artist
from a set of authenticated works. Additional works are then
authenticated against this model. The statistical model consists of
first- and higher-order wavelet statistics. We show preliminary
results from our analysis of thirteen drawings by Pieter Bruegel the
Elder. We also present preliminary results showing how these
techniques may be applicable to determining how many hands contributed
to a single painting.
%T The mistaken axioms of wireless-network research
%A David Kotz
%A Calvin Newport
%A Chip Elliott
%R Technical Report TR2003-467
%I Dartmouth College, Computer Science
%C Hanover, NH
%D July 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-467.ps.Z
%X
Most research on ad-hoc wireless networks makes simplifying
assumptions about radio propagation. The ``Flat Earth'' model of the
world is surprisingly popular: all radios have circular range, have
perfect coverage in that range, and travel on a two-dimensional plane.
CMU's ns-2 radio models are better but still fail to represent
many aspects of realistic radio networks, including hills, obstacles,
link asymmetries, and unpredictable fading. We briefly argue that key
``axioms'' of these types of propagation models lead to simulation
results that do not adequately reflect real behavior of ad-hoc
networks, and hence to network protocols that may not work well (or at
all) in reality. We then present a set of 802.11 measurements that
clearly demonstrate that these ``axioms'' are contrary to fact. The
broad chasm between simulation and reality calls into question many of
the results from prior papers, and we summarize with a series of
recommendations for researchers considering analytic or simulation
models of wireless networks.
%Z
For more information related to this project see
http://www.cs.dartmouth.edu/~campus/.
%T Formal Properties of Linear Memory Types
%A Heng Huang
%A Lea Wittie
%A Chris Hawblitzel
%R Technical Report TR2003-468
%I Dartmouth College, Computer Science
%C Hanover, NH
%D August 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-468.ps.Z
%X
Efficient low-level systems need more control over memory than safe high-level languages usually provide. As a result, run-time systems are typically written in unsafe languages such as C. This report describes an abstract machine designed to give type-safe code more control over memory. It includes complete definitions and proofs of preservation, progress, strong normalization, erasure, and translation correctness.
%T MEMS for Infosecurity
%A Bruce R. Donald
%A Craig McGray
%A Daniela Rus
%R Technical Report TR2003-469
%I Dartmouth College, Computer Science
%C Hanover, NH
%D August 2003
%Z
Abstract and paper lost.
%T Using caching for browsing anonymity
%A Anna M. Shubina
%A Sean W. Smith
%R Technical Report TR2003-470
%I Dartmouth College, Computer Science
%C Hanover, NH
%D July 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-470.ps.Z
%X
Privacy-providing tools, including tools that provide anonymity, are
gaining popularity in the modern world. Among the goals of their users is
avoiding tracking and profiling. While some businesses are unhappy with the
growth of privacy-enhancing technologies, others can use lack of information
about their users to avoid unnecessary liability and even possible harassment
by parties with contrary business interests, and to gain a competitive
market edge.
Currently, users interested in anonymous browsing have the choice only
between single-hop proxies and the few more complex systems that are available.
These still leave the user vulnerable to long-term intersection attacks.
In this paper, we propose a caching proxy system for allowing users to
retrieve data from the World-Wide Web in a way that would provide recipient
unobservability by a third party and sender unobservability by the recipient
and thus dispose of intersection attacks, and report on the prototype we
built using Google.
%T Bear: An Open-Source Virtual Secure Coprocessor based on TCPA
%A Rich MacDonald
%A Sean W. Smith
%A John Marchesini
%A Omen Wild
%R Technical Report TR2003-471
%I Dartmouth College, Computer Science
%C Hanover, NH
%D August 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-471.ps.Z
%X
This paper reports on our ongoing project to use TCPA to transform a
desktop Linux machine into a virtual secure coprocessor: more powerful
but less secure than higher-end devices. We use TCPA hardware and
modified boot loaders to protect fairly static components, such as a
trusted kernel; we use an enforcer module---configured as a Linux
Security Module---to protect more dynamic system components; we use
an encrypted loopback filesystem to protect highly dynamic components.
All our code is open source and available under GPL from
http://enforcer.sourceforge.net/
%Z
Superseded by TR2003-476.
%T Evaluating Location Predictors with Extensive Wi-Fi Mobility Data
%A Libo Song
%A David Kotz
%A Ravi Jain
%A Xiaoning He
%R Technical Report TR2003-472
%I Dartmouth College, Computer Science
%C Hanover, NH
%D September 2003
%Z
This report with this number was not released until 2004,
so it was given a new number and appeared as TR2004-491.
A revised version was published as a short paper:
Libo Song, David Kotz, Ravi Jain, and Xiaoning He.
Mobicom poster: Evaluating location predictors with extensive Wi-Fi
mobility data.
ACM SIGMOBILE Mobile Computing and Communication Review,
7(4):64-65, October 2003.
%T A Probability-Based Similarity Measure for Saupe Alignment Tensors with Applications to Residual Dipolar Couplings in NMR Structural Biology
%A Anthony K. Yan
%A Christopher J. Langmead
%A Bruce Randall Donald
%R Technical Report TR2003-474
%I Dartmouth College, Computer Science
%C Hanover, NH
%D October 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-474.pdf
%X
High-throughput NMR structural biology and NMR structural genomics
pose a fascinating set of geometric challenges. A key bottleneck in
NMR structural biology is the resonance assignment problem. We seek to
accelerate protein NMR resonance assignment and structure
determination by exploiting a priori structural information. In
particular, Nuclear Vector Replacement (NVR) has
been proposed as a method for solving the assignment problem given a
priori structural information [24,25]. Among several
different kinds of input data, NVR uses a particular type of NMR data
known as residual dipolar couplings (RDCs). The basic physics
of residual dipolar couplings tells us that the data should be
explainable by a structural model and set of parameters contained
within the Saupe alignment tensor.
In the NVR algorithm, one estimates the Saupe alignment tensors and
then proceeds to refine those estimates. We would like to quantify
the accuracy of such estimates, where we compare the estimated Saupe
matrix to the correct Saupe matrix. In this work, we propose a way to
quantify this comparison. Given a correct Saupe matrix and an
estimated Saupe matrix, we compute an upper bound on the probability
that a randomly rotated Saupe tensor would have an error smaller than
the estimated Saupe matrix. This has the advantage of being a
quantified upper bound which also has a clear interpretation in terms
of geometry and probability. While the specific application of our
rotation probability results is given to NVR, our novel methods can be
used for any RDC-based algorithm to bound the accuracy of the
estimated alignment tensors. Furthermore, they could also be used in
X-ray crystallography or molecular docking to quantitate the accuracy
of calculated rotations of proteins, protein domains, nucleic acids,
or small molecules.
%Z
A revised and expanded version of this paper has been accepted at a
journal and will appear as: "A Probability-Based Similarity Measure
for Saupe Alignment Tensors with Applications to Residual Dipolar
Couplings in NMR Structural Biology", in The International Journal
of Robotics Research Special Issue on Robotics Techniques
Applied to Computational Biology, 2004.
%T On the Complexity of Implementing Certain Classes of Shared Objects
%A King Y. Tan
%R Technical Report TR2003-475
%I Dartmouth College, Computer Science
%C Hanover, NH
%D November 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-475.ps.Z
%X
We consider shared memory systems in which asynchronous processes
cooperate with each other by communicating via shared data objects,
such as counters, queues, stacks, and priority queues. The common
approach to implementing such shared objects is based on locking: To
perform an operation on a shared object, a process obtains a lock,
accesses the object, and then releases the lock. Locking, however,
has several drawbacks, including convoying, priority inversion, and
deadlocks. Furthermore, lock-based implementations are not
fault-tolerant: if a process crashes while holding a lock, other
processes can end up waiting forever for the lock.
Wait-free linearizable implementations were conceived to overcome
most of the above drawbacks of locking. A wait-free implementation
guarantees that if a process repeatedly takes steps, then its
operation on the implemented data object will eventually complete,
regardless of whether other processes are slow, or fast, or have
crashed.
In this thesis, we first present an efficient wait-free linearizable
implementation of a class of object types, called closed and closable
types, and then prove time and space lower bounds on wait-free
linearizable implementations of another class of object types, called
perturbable types.
(1) We present a wait-free linearizable implementation of n-process
closed and closable types (such as swap, fetch&add, fetch&multiply,
and fetch&L, where L is any of the boolean operations and, or, or
complement) using registers that support load-link (LL) and
store-conditional (SC) as base objects.
The time complexity of the implementation grows linearly with
contention, but is never more than O(log^2 n). We believe that this
is the first implementation of a class of types (as opposed to a
specific type) to achieve a sub-linear time complexity.
(2) We prove linear time and space lower bounds on the wait-free
linearizable implementations of n-process perturbable types (such as
increment, fetch&add, modulo k counter, LL/SC bit, k-valued
compare&swap (for any k >= n), single-writer snapshot) that use
resettable consensus and historyless objects (such as registers that
support read and write) as base objects.
This improves on some previously known Omega(sqrt{n}) space
complexity lower bounds. It also shows the near space optimality of
some known wait-free linearizable implementations.
%T Experimenting with TCPA/TCG Hardware, Or: How I Learned to Stop Worrying and Love The Bear
%A John Marchesini
%A Sean W. Smith
%A Omen Wild
%A Rich MacDonald
%R Technical Report TR2003-476
%I Dartmouth College, Computer Science
%C Hanover, NH
%D December 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-476.pdf
%X
Over the last few years, our group has been working on applications of
secure coprocessors---but has been frustrated by the limited
computational environment and high expense of such devices. Over the
same period, the TCPA (now TCG) has produced a specification for a
trusted platform module (TPM)---a small hardware addition
intended to improve the overall security of a larger machine (and tied
up with a still-murky vision of Windows-based trusted computing).
Some commodity desktops now come with these TPMs.
Consequently, we began an experiment to see if (in the absence of a
Non-Disclosure Agreement) we could use this hardware to transform a
desktop Linux machine into a virtual secure coprocessor: more powerful
but less secure than higher-end devices. This experiment has several
purposes: to provide a new platform for secure coprocessor
applications, to see how well the TCPA/TCG approach works, and (by
working in open source) to provide a platform for the broader
community to experiment with alternative architectures in the
contentious area of trusted computing.
This paper reports what we have learned so far: the approach is
feasible, but effective deployment requires a more thorough look at OS
security.
%Z
This report, TR2003-476, supersedes TR2003-471 of August 2003.
Furthermore, the December 15, 2003 version of TR2003-476 fixes
typos found in the December 4, 2003 version.
%T Problems with the Dartmouth wireless SNMP data collection
%A Tristan Henderson
%A David Kotz
%R Technical Report TR2003-480
%I Dartmouth College, Computer Science
%C Hanover, NH
%D December 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-480-rev3.pdf
%X
The original Dartmouth wireless network study used SNMP to query the
college's Cisco 802.11b access points. The perl scripts that
performed the SNMP queries suffered from some problems, in that they
queried inappropriate SNMP values, or misunderstood the meaning of
other values. This data was also used in a subsequent analysis. The
same scripts were used to collect data for a subsequent study of
another wireless network. This document outlines these problems and
indicates which of the data collected by the original scripts may be
invalid.
%Z
See http://www.cs.dartmouth.edu/~campus/ for more information.
Revision 2 of 12/31/03 reflects updated information about
the Balazinska paper.
Revision 3 of 10/29/04 reflects a difference between the preliminary
and final version of the Kotz/Essien journal paper.
%T A Subgroup Algorithm to Identify Cross-Rotation Peaks Consistent with Non-Crystallographic Symmetry
%A Ryan H. Lilien
%A Chris Bailey-Kellogg
%A Amy A. Anderson
%A Bruce R. Donald
%R Technical Report TR2003-481
%I Dartmouth College, Computer Science
%C Hanover, NH
%D December 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-481.pdf
%X
Molecular replacement (MR) often plays a prominent role in determining
initial phase angles for structure determination by X-ray
crystallography. In this paper, an efficient quaternion-based
algorithm is presented for analyzing peaks from a cross-rotation
function to identify model orientations consistent with
non-crystallographic symmetry (NCS), and to generate NCS-consistent
orientations missing from the list of cross-rotation peaks. Our
algorithm, CRANS, analyzes the rotation differences between each pair
of cross-rotation peaks to identify finite subgroups of NCS. Sets of
rotation differences satisfying the subgroup axioms correspond to
orientations compatible with the correct NCS. The CRANS algorithm was
first tested using cross-rotation peaks computed from structure factor
data for three test systems, and then used to assist in the de novo
structure determination of dihydrofolate reductase-thymidylate
synthase (DHFR-TS) from Cryptosporidium hominis. In every case, the
CRANS algorithm runs in seconds to identify orientations consistent
with the observed NCS and to generate missing orientations not present
in the cross-rotation peak list. The CRANS algorithm has application
in every molecular replacement phasing effort with NCS.
%Z
A revised version of this paper has been accepted for publication
and will appear in the journal Acta Crystallographica D:
Biological Crystallography. In press (2004).
%T Greenpass: Flexible and Scalable Authorization for Wireless Networks
%A Sean Smith
%A Nicholas C. Goffee
%A Sung Hoon Kim
%A Punch Taylor
%A Meiyuan Zhao
%A John Marchesini
%R Technical Report TR2004-484
%I Dartmouth College, Computer Science
%C Hanover, NH
%D January 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-484.pdf
%X
Wireless networks break the implicit assumptions that supported
authorization in wired networks (that is: if one could connect, then
one must be authorized). However, ensuring that only authorized users
can access a campus-wide wireless network creates many challenges: we
must permit authorized guests to access the same network resources
that internal users do; we must accommodate the de-centralized way
that authority flows in real universities; we also must work within
standards, and accommodate the laptops and systems that users already
have, without requiring additional software or plug-ins.
This paper describes our ongoing project to address this problem,
using SPKI/SDSI delegation on top of X.509 keypairs within EAP-TLS.
Within the ``living laboratory'' of Dartmouth's wireless network, this
project lets us solve real problems with wireless networking, while
also experimenting with trust flows and testing the limits of current
tools.
%T Using SPKI/SDSI for Distributed Maintenance of Attribute Release Policies in Shibboleth
%A Sidharth Nazareth
%A Sean Smith
%R Technical Report TR2004-485
%I Dartmouth College, Computer Science
%C Hanover, NH
%D January 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-485.pdf
%X
The Shibboleth middleware from Internet2 provides a way for
users at higher-education institutions to access remote electronic
content in compliance with the inter-institutional license agreements
that govern such access. To protect end-user privacy, Shibboleth
permits users to construct attribute release policies that control
what user credentials a given content provider can obtain. However,
Shibboleth leaves unspecified how to construct these policies.
To be effective, a solution needs to accommodate the typical nature of
a university: a set of decentralized fiefdoms. This need argues for a
public-key infrastructure (PKI) approach---since public-key
cryptography does not require parties to agree on a secret beforehand,
and parties distributed throughout the institution are unlikely to
agree on anything. However, this need also argues against the strict
hierarchical structure of traditional PKI---policy in different
fiefdoms will be decided differently, and originate within the
fiefdom, rather than from an overall root.
This paper presents our design and prototype of a system that uses the
decentralized public-key framework of SPKI/SDSI to solve this problem.
%T A meeting detector and its applications
%A Jue Wang
%A Guanling Chen
%A David Kotz
%R Technical Report TR2004-486
%I Dartmouth College, Computer Science
%C Hanover, NH
%D March 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-486.pdf
%X
In this paper we present a context-sensing component that recognizes
meetings in a typical office environment. Our prototype detects the
meeting start and end by combining outputs from pressure and motion
sensors installed on the chairs. We developed a telephone controller
application that transfers incoming calls to voice-mail when the user
is in a meeting. Our experiments show that it is feasible to detect
high-level context changes with ``good enough'' accuracy, using
low-cost, off-the-shelf hardware, and simple algorithms without
complex training. We also note the need for better metrics to measure
context detection performance, other than just accuracy. We propose
several metrics appropriate for our application in this paper. It may
be useful, however, for the community to define a set of general
metrics as a basis to compare different approaches to context
detection.
%T High-Throughput 3D Homology Detection via NMR Resonance Assignment
%A Christopher J. Langmead
%A Bruce R. Donald
%R Technical Report TR2004-487
%I Dartmouth College, Computer Science
%C Hanover, NH
%D September 2003
%U http://www.cs.dartmouth.edu/reports/TR2004-487.pdf
%X
One goal of the structural genomics initiative is the identification of new
protein folds. Sequence-based structural homology prediction
methods are an important means for prioritizing unknown proteins
for structure determination. However, an important challenge
remains: two highly dissimilar sequences can have similar folds
--- how can we detect this rapidly, in the context of structural
genomics? High-throughput NMR experiments, coupled with novel
algorithms for data analysis, can address this challenge. We
report an automated procedure, called HD, for detecting 3D
structural homologies from sparse, unassigned protein NMR
data. Our method identifies 3D models in a protein structural
database whose geometries best fit the unassigned experimental NMR
data. HD does not use, and is thus not limited by, sequence
homology. The method can also be used to confirm or refute
structural predictions made by other techniques such as protein
threading or homology modelling. The algorithm runs in $O(pn^{5/2}
\log {(cn)} + p \log p)$ time, where $p$ is the number of proteins
in the database, $n$ is the number of residues in the target
protein and $c$ is the maximum edge weight in an integer-weighted
bipartite graph. Our experiments on real NMR data from 3
different proteins against a database of 4,500 representative
folds demonstrate that the method identifies closely related
protein folds, including sub-domains of larger proteins, with as
little as 10-30\% sequence homology between the target protein (or
sub-domain) and the computed model. In particular, we report no
false-negatives or false-positives despite significant percentages
of missing experimental data.
%Z
A revised version of this paper will appear in the IEEE
Computational Systems Bioinformatics Conference (CSB),
Stanford CA. (August, 2004).
%T Application-controlled loss-tolerant data dissemination
%A Guanling Chen
%A David Kotz
%R Technical Report TR2004-488
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-488.pdf
%X
Reactive or proactive mobile applications require continuous
monitoring of their physical and computational environment to make
appropriate decisions in time. These applications need to monitor data
streams produced by sensors and react to changes. When mobile sensors
and applications are connected by low-bandwidth wireless networks,
sensor data rates may overwhelm the capacity of network links or of
the applications. In traditional networks and distributed systems,
flow-control and congestion-control policies either drop data or force
the sender to pause. When the data sender is sensing the physical
environment, however, a pause is equivalent to dropping
data. Arbitrary data drops are not necessarily acceptable to the
reactive mobile applications receiving sensor data. Data distribution
systems must support application-specific policies that selectively
drop data objects when network or application buffers overflow.
In this paper we present a data-dissemination service, PACK, which
allows applications to specify customized data-reduction
policies. These policies define how to discard or summarize data flows
wherever buffers overflow on the dissemination path, notably at the
mobile hosts where applications often reside. The PACK service
provides an overlay infrastructure to support mobile data sources and
sinks, using application-specific data-reduction policies where
necessary along the data path. We uniformly apply the data-stream
``packing'' abstraction to buffer overflow caused by network
congestion, slow receivers, and the temporary disconnections caused by
end-host mobility. We demonstrate the effectiveness of our approach
with an application example and experimental measurements.
%T Keyjacking: The Surprising Insecurity of Client-side SSL
%A John Marchesini
%A Sean W. Smith
%A Meiyuan Zhao
%R Technical Report TR2004-489
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 13,
%U http://www.cs.dartmouth.edu/reports/TR2004-489.pdf
%X
In theory, PKI can provide a flexible and strong way to authenticate
users in distributed information systems. In practice, much is being
invested in realizing this vision via client-side SSL and various
client keystores. However, whether this works depends on whether what
the machines do with the private keys matches what the humans think
they do: whether a server operator can conclude from an SSL request
authenticated with a user's private key that the user was aware of and
approved that request. Exploring this vision, we demonstrate via a
series of experiments that this assumption does not hold with standard
desktop tools, even if the browser user does all the right things. A
fundamental rethinking of the trust, usage, and storage model might
result in more effective tools for achieving the PKI vision.
%Z
This TR supersedes TR2003-443. A preliminary version appeared in the
proceedings of the 2nd Annual PKI Research Workshop in April of 2003.
%T A Case Study of Four Location Traces
%A Guanling Chen
%A David Kotz
%R Technical Report TR2004-490
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-490.pdf
%X
Location is one of the most important types of context information that a
ubiquitous-computing application may leverage. Thus understanding the
location systems and how location-aware applications interact with
them is critical for design and deployment of both the location
systems and location-aware applications. In this paper, we analyze a
set of traces collected from two small-scale one-building location
systems and two large-scale campus-wide location systems. Our goal is
to study characteristics of these location systems and how these
factors should be taken into account by a potentially large number of
location-aware applications with different needs. We make empirical
measurements of several important metrics and compare the results
across these location systems. We discuss the implications of these
results on location-aware applications and their supporting software
infrastructure, and how location systems could be improved to better
serve applications' needs. Where possible, we use
location-aware applications discussed in the existing literature as
illustrative examples.
%T Evaluating next-cell predictors with extensive Wi-Fi mobility data
%A Libo Song
%A David Kotz
%A Ravi Jain
%A Xiaoning He
%R Technical Report TR2004-491
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2004
%X
Location is an important feature for many applications, and wireless networks can better serve their clients by anticipating client mobility. As a result, many location predictors have been proposed in the literature, though few have been evaluated with empirical evidence. This paper reports on the results of the first extensive empirical evaluation of location predictors, using a two-year trace of the mobility patterns of over 6,000 users on Dartmouth's campus-wide Wi-Fi wireless network. We implemented and compared the prediction accuracy of several location predictors drawn from four major families of domain-independent predictors, namely Markov-based, compression-based, PPM, and SPM predictors. We found that low-order Markov predictors performed as well as or better than the more complex and more space-consuming compression-based predictors. Predictors of both families fail to make a prediction when the recent context has not been previously seen. To overcome this drawback, we added a simple fallback feature to each predictor and found that it significantly enhanced its accuracy in exchange for modest effort. Thus the Order-2 Markov predictor with fallback was the best predictor we studied, obtaining a median accuracy of about 72\% for users with long trace lengths. We also investigated a simplification of the Markov predictors, where the prediction is based not on the most frequently seen context in the past, but the most recent, resulting in significant space and computational savings. We found that Markov predictors with this recency semantics can rival the accuracy of standard Markov predictors in some cases. Finally, we considered several seemingly obvious enhancements, such as smarter tie-breaking and aging of context information, and discovered that they had little effect on accuracy. The paper ends with a discussion and suggestions for further work.
%Z
PDF removed because this version was superseded by the later journal
version, and per IEEE rules we can only post the final version. A
revised version of this paper appeared in IEEE TMC 5(12),
pp. 1633-1649, December 2006. An earlier (and shorter) version of
this paper appeared at IEEE Infocom, March 7-11, 2004.
%T Identification of Novel Small Molecule Inhibitors of Core-Binding Factor Dimerization by Computational Screening against NMR Molecular Ensembles
%A Ryan H. Lilien
%A Mohini Sridharan
%A Bruce R. Donald
%R Technical Report TR2004-492
%I Dartmouth College, Computer Science
%C Hanover, NH
%D March 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-492.pdf
%X
The long development process of novel pharmaceutical compounds begins
with the identification of a lead inhibitor compound. Computational
screening to identify those ligands, or small molecules, most likely
to inhibit a target protein may benefit the pharmaceutical development
process by reducing the time required to identify a lead
compound. Typically, computational ligand screening utilizes
high-resolution structural models of both the protein and ligand to
fit or `dock' each member of a ligand database into the binding site
of the protein. Ligands are then ranked by the number and quality of
interactions formed in the predicted protein-ligand complex. It is
currently believed that proteins in solution do not assume a single
rigid conformation but instead tend to move through a small region of
conformation space. Therefore, docking ligands against a static
snapshot of protein structure has predictive limitations because it
ignores the inherent flexibility of the protein. A challenge,
therefore, has been the development of docking algorithms capable of
modeling protein flexibility while balancing computational
feasibility. In this paper, we present our initial development and
work on a molecular ensemble-based algorithm to model protein
flexibility for protein-ligand binding prediction. First, a molecular
ensemble is generated from molecular structures satisfying
experimentally-measured NMR constraints. Second, traditional
protein-ligand docking is performed on each member of the protein's
molecular ensemble. This step generates lists of ligands predicted to
bind to each individual member of the ensemble. Finally, lists of top
predicted binders are consolidated to identify those ligands predicted
to bind multiple members of the protein's molecular ensemble. We
applied our algorithm to identify inhibitors of Core Binding Factor
(CBF) among a subset of approximately 70,000 ligands of the Available
Chemicals Directory. Our 26 top-predicted binding ligands are
currently being tested experimentally in the wetlab by both
NMR-binding experiments (15N-edited Heteronuclear Single-Quantum
Coherence (HSQC)) and Electrophoretic Gel Mobility Shift Assays
(EMSA). Preliminary results indicate that of approximately 26 ligands
tested, three induce perturbations in the protein's NMR chemical
shifts indicative of ligand binding and one ligand
(2-amino-5-cyano-4-tertbutyl thiazole) causes a band pattern in the
EMSA indicating the disruption of CBF dimerization.
%T The Kerf toolkit for intrusion analysis
%A Javed Aslam
%A Sergey Bratus
%A David Kotz
%A Ron Peterson
%A Daniela Rus
%A Brett Tofel
%R Technical Report TR2004-493
%I Dartmouth College, Computer Science
%C Hanover, NH
%D March 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-493.pdf
%X
We consider the problem of intrusion analysis and present the
Kerf Toolkit, whose purpose is to provide an efficient and
flexible infrastructure for the analysis of attacks. The Kerf Toolkit
includes a mechanism for securely recording host and network logging
information for a network of workstations, a domain-specific language
for querying this stored data, and an interface for viewing the
results of such a query, providing feedback on these results, and
generating new queries in an iterative fashion. We describe the
architecture of Kerf, present examples to demonstrate the
power of our query language, and discuss the performance of our
implementation of this system.
%T An Improved Nuclear Vector Replacement Algorithm for Nuclear Magnetic Resonance Assignment
%A Christopher J. Langmead
%A Bruce R. Donald
%R Technical Report TR2004-494
%I Dartmouth College, Computer Science
%C Hanover, NH
%D September 2003
%U http://www.cs.dartmouth.edu/reports/TR2004-494.pdf
%X
We report an improvement to the Nuclear Vector Replacement (NVR) algorithm for high-throughput Nuclear Magnetic Resonance (NMR) resonance assignment. The new algorithm improves upon our earlier result in terms of accuracy and computational complexity. In particular, the new NVR algorithm assigns backbone resonances without error (100% accuracy) on the
same test suite examined in [Langmead and Donald, J. Biomol. NMR 2004], and runs in $O(n^{5/2} \log {(cn)})$ time where $n$ is the number of amino acids in the primary sequence of the protein, and $c$ is the maximum edge weight in an integer-weighted bipartite graph.
%T Dependency management in distributed settings
%A Guanling Chen
%A David Kotz
%R Technical Report TR2004-495
%I Dartmouth College, Computer Science
%C Hanover, NH
%D March 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-495.pdf
%X
Ubiquitous-computing environments are heterogeneous and volatile in
nature. Systems that support ubicomp applications must be
self-managed, to reduce human intervention. In this paper, we present
a general service that helps distributed software components to manage
their dependencies. Our service proactively monitors the liveness of
components and recovers them according to supplied policies. Our
service also tracks the state of components, on behalf of their
dependents, and may automatically select components for the dependent
to use based on evaluations of customized functions. We believe that
our approach is flexible and abstracts away many of the complexities
encountered in ubicomp environments. In particular, we show how we
applied the service to manage dependencies of context-fusion operators
and present some experimental results.
%Z
A short 2-page poster abstract of this work appears in the
International Conference on Autonomic Computing (ICAC-04),
May 2004.
%T The Changing Usage of a Mature Campus-wide Wireless Network
%A Tristan Henderson
%A David Kotz
%A Ilya Abyzov
%R Technical Report TR2004-496
%I Dartmouth College, Computer Science
%C Hanover, NH
%D March 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-496.pdf
%X
Wireless Local Area Networks (WLANs) are now common
on academic and corporate campuses. As ``Wi-Fi''
technology becomes ubiquitous, it is increasingly important
to understand trends in the usage of these networks.
This paper analyzes an extensive network trace from a
mature 802.11 WLAN, including more than 550 access
points and 7000 users over seventeen weeks. We employ
several measurement techniques, including syslogs, telephone
records, SNMP polling and tcpdump packet sniffing.
This is the largest WLAN study to date, and the first
to look at a large, mature WLAN and consider geographic
mobility. We compare this trace to a trace taken after the
network's initial deployment two years ago.
We found that the applications used on the WLAN
changed dramatically. Initial WLAN usage was dominated
by Web traffic; our new trace shows significant increases
in peer-to-peer, streaming multimedia, and voice
over IP (VoIP) traffic. On-campus traffic now exceeds off-campus
traffic, a reversal of the situation at the WLAN's
initial deployment. Our study indicates that VoIP has been
used little on the wireless network thus far, and most VoIP
calls are made on the wired network. Most calls last less
than a minute.
We saw more heterogeneity in the types of clients used,
with more embedded wireless devices such as PDAs and
mobile VoIP clients. We define a new metric for mobility,
the ``session diameter.'' We use this metric to show that
embedded devices have different mobility characteristics
than laptops, and travel further and roam to more access
points. Overall, users were surprisingly non-mobile, with
half remaining close to home about 98% of the time.
%Z
A revised version will appear in Mobicom 2004.
See also http://www.cs.dartmouth.edu/~campus/.
%T A Holesome File System
%A Darren Erik Vengroff
%A David Kotz
%R Technical Report TR2004-497
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-497.pdf
%X
We present a novel approach to fully dynamic management of physical
disk blocks in Unix file systems. By adding a single system call,
zero, to an existing file system, we permit applications to create
holes, that is, regions of files to which no physical disk
blocks are allocated, far more flexibly than previously possible.
zero can create holes in the middle of existing files.
Using zero, it is possible to efficiently implement applications
including a variety of databases and I/O-efficient computation
systems on top of the Unix file system. zero can also be used to
implement an efficient file-system-based paging mechanism. In some
I/O-efficient computations, the availability of zero effectively
doubles disk capacity by allowing blocks of temporary files to be
reallocated to new files as they are read.
Experiments on a Linux ext2 file system augmented by zero
demonstrate that where their functionality overlaps, zero is more
efficient than ftruncate(). Additional experiments reveal that
in exchange for added effective disk capacity, I/O-efficient code pays
only a small performance penalty.
%Z
This paper was written in July 1995, but sat unpublished until May
2004 when it was released as this technical report, unchanged.
%T Efficient Wait-Free Implementation of Atomic Multi-Word Buffer
%A Rachel B. Ringel
%R Technical Report TR2004-498
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-498.pdf
%X
This thesis proposes algorithms for implementing an atomic multi-word buffer,
which can be accessed concurrently by multiple readers and a single writer,
from the hardware-supported shared memory. The algorithms are required to be
wait-free: each process reads or writes the multi-word buffer in a bounded
number of its own steps, regardless of whether other processes are fast, slow
or have crashed.
Our first algorithm is built from multi-writer, multi-reader variables
whereas the second algorithm is built from single-writer, multi-reader
variables. For either algorithm, the worst-case running time of a read or a
write operation on the m-word buffer is O(m). The space complexity of the
algorithms is O(mn). Neither algorithm requires hardware support for any
special synchronization instructions; the ability to read or write into any
machine word is sufficient. The algorithms significantly improve on Peterson's
algorithm, which has O(mn) time complexity for the write operation on the
buffer.
%Z
Senior Honors Thesis. Advisor: Prasad Jayanti.
%T Mobile Agents Simulation with DaSSF
%A Nikita E Dubrovsky
%R Technical Report TR2004-499
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-499.pdf
%X
Mobile agents are programs that can migrate from machine to machine in a network of computers and have complete control over their movement. Since the performance space of mobile agents has not been characterized fully, assessing the effectiveness of using mobile agents over a traditional client/server approach currently requires implementing an agent system and running time-consuming experiments.
This report presents a simple mobile-agent simulation that can provide quick information on the performance and scalability of a generic information retrieval (IR) mobile-agent system under different network configurations. The simulation is built using the DaSSF and DaSSFNet frameworks, resulting in high performance and great configuration flexibility. This report also implements a real D'Agents mobile-agent IR system, measuring the performance of the system. A comparison of these real-world performance results and those given by the simulation suggests that the simulation has good accuracy in predicting the scalability of a mobile-agent system. Thus this report argues that simulation provides a good way to quickly assess the performance and scalability of an IR mobile-agent system under different network configurations.
%Z
Senior Honors Thesis. Advisor: Bob Gray.
%T Scheduling Pipelined, Multi-Threaded Programs in Linux
%A Brunn W. Roysden
%R Technical Report TR2004-500
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-500.pdf
%X
A process causes latency when it performs I/O or communication.
Pipelined processes mitigate latency by concurrently executing
multiple threads (sequences of operations) and overlapping
computation, communication, and I/O. When more than one thread is
ready to run, the scheduler determines which thread in fact runs.
This paper presents techniques for scheduling pipelines, with the
following three findings.
First, using Linux kernel version 2.6 and the NPTL threads package, we
observe a 3-6% performance improvement over kernel version 2.4 and the
LinuxThreads package.
Second, we test techniques that both take advantage of prior knowledge
about whether a program is I/O-bound or compute-bound and raise and
lower priorities before the pipeline begins working. These
techniques, referred to as fixed scheduling, further improve
performance by 5% in the case of the compute-bound columnsort
algorithm. In the I/O-bound algorithm, fixed scheduling failed to
yield better performance than the default scheduling.
Third, we test simple, adjusting methods that do not take advantage of
prior knowledge about whether a program is compute-bound or I/O-bound
but rather adjust scheduling as the pipeline progresses. These
techniques, called adjusting scheduling, fail to yield better
performance than the default scheduling in any of our test cases.
In addition, we suggest new scheduling calls and other operating-system improvements.
%Z
Senior honors thesis; advisor: Tom Cormen.
%T Synchronizing Keyframe Facial Animation to Multiple Text-to-Speech Engines and Natural Voice with Fast Response Time
%A William Pechter
%R Technical Report TR2004-501
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-501.pdf
%X
This thesis aims to create an automated lip-synchronization system for real-time applications. Specifically, the system is required to be fast, consist of a limited number of keyframes with small memory requirements, and create fluid and believable animations that synchronize with text-to-speech engines as well as raw voice data.
The algorithms utilize traditional keyframe animation and a novel method of keyframe selection. Additionally, phoneme-to-keyframe mapping, synchronization, and simple blending rules are employed. The algorithms provide blending between keyframe images, borrow information from neighboring phonemes, accentuate phonemes b, p and m, differentiate between keyframes for phonemes with allophonic variations, and provide prosodic variation by including emotion while speaking. The lip-sync animation synchronizes with multiple synthesized voices and human speech. A fast and versatile online real-time Java chat interface is created to exhibit vivid facial animation.
Results show that the animation algorithms are fast and show accurate lip-synchronization. Additionally, surveys showed that the animations are visually pleasing and improve speech understandability 96% of the time. Applications for this project include internet chat capabilities, interactive teaching of foreign languages, animated news broadcasting, enhanced game technology, and cell phone messaging.
%Z
Senior Honors Thesis. Advisors: Lorie Loeb, Hany Farid, and Stephen Linder
%T A Secure Network Node Approach to the Policy Decision Point in Distributed Access Control
%A Geoffrey H. Stowe
%R Technical Report TR2004-502
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-502.pdf
%X
To date, the vast majority of access control research and development
has been on gathering, managing, and exchanging information about
users. But an equally important component which has yet to be fully
developed is the Policy Decision Point (PDP), the system that decides
whether an access request should be granted given certain attributes
of the requestor. This paper describes the research and implementation
of a new PDP system for an undergraduate honors project. This PDP
system employs three unique features which differentiate it from
existing technology: collaboration capabilities, trusted management,
and interoperability with other access control systems. Security
considerations and future research areas are also discussed.
%Z
Senior Honors Thesis. Advisor: Ed Feustel
%T Enhancing Expressiveness of Speech through Animated Avatars for Instant Messaging and Mobile Phones
%A Joseph E. Pechter
%R Technical Report TR2004-503
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-503.pdf
%X
This thesis aims to create a chat program that allows users to communicate via an animated avatar that provides believable lip-synchronization and expressive emotion. Currently many avatars do not attempt to do lip-synchronization. Those that do are not well synchronized and have little or no emotional expression. Most avatars with lip-synchronization use realistic-looking 3D models or stylized rendering of complex models. This work utilizes images rendered in a cartoon style and lip-synchronization rules based on traditional animation. The cartoon style, as opposed to a more realistic look, makes the mouth motion more believable and the characters more appealing. The cartoon look and image-based animation (as opposed to a graphic model animated through manipulation of a skeleton or wireframe) also allow for fewer key frames, resulting in faster speed with more room for expressiveness.
When text is entered into the program, the Festival Text-to-Speech engine creates a speech file and extracts phoneme and phoneme duration data. Believable and fluid lip-synchronization is then achieved by means of a number of phoneme-to-image rules. Alternatively, phoneme and phoneme duration data can be obtained for speech dictated into a microphone using Microsoft SAPI and the CSLU Toolkit.
Once lip synchronization has been completed, rules for non-verbal animation are added. Emotions are appended to the animation of speech in two ways: automatically, by recognition of key words and punctuation, or deliberately, by user-defined tags. Additionally, rules are defined for idle-time animation.
Preliminary results indicate that the animated avatar program offers an improvement over currently available software. It aids in the understandability of speech, combines easily recognizable and expressive emotions with speech, and successfully enhances overall enjoyment of the chat experience. Applications for the program include use in cell phones for the deaf or hearing impaired, instant messaging, video conferencing, instructional software, and speech and animation synthesis.
%Z
Senior Honors Thesis. Advisors: Lorie Loeb, Hany Farid, Stephen Linder
%T Simulating mobile ad hoc networks: a quantitative evaluation of common MANET simulation models
%A Calvin Newport
%R Technical Report TR2004-504
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-504.pdf
%X
Because it is difficult and costly to conduct real-world mobile ad hoc network experiments, researchers commonly rely on computer simulation to evaluate their routing protocols. However, simulation is far from perfect. A growing number of studies indicate that simulated results can be dramatically affected by several sensitive simulation parameters. It is also commonly noted that most simulation models make simplifying assumptions about radio behavior. This situation casts doubt on the reliability and applicability of many ad hoc network simulation results.
In this study, we begin with a large outdoor routing experiment testing the performance of four popular ad hoc algorithms (AODV, APRL, ODMRP, and STARA). We present a detailed comparative analysis of these four implementations. Then, using the outdoor results as a baseline of reality, we disprove a set of common assumptions used in simulation design, and quantify the impact of these assumptions on simulated results. We also more specifically validate a group of popular radio models with our real-world data, and explore the sensitivity of various simulation parameters in predicting accurate results. We close with a series of specific recommendations for simulation and ad hoc routing protocol designers.
%Z
Senior Honors Thesis. Advisor: David Kotz.
%T Technological Implications for Privacy
%A David Kotz
%R Technical Report TR2004-505
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-505.pdf
%X
The World-Wide Web is increasingly used for commerce and access to
personal information stored in databases. Although the Web is ``just
another medium'' for information exchange, the fact that all the
information is stored in computers, and all of the activity happens in
computers and computer networks, makes it easier (cheaper) than ever
to track users' activities. By recording and analyzing users'
activities on the Web, activities that may seem to be quite private to
many users, it is more likely than ever before that a person's privacy
may be threatened. In this paper I examine some of the technology in
the Web, and how it affects the privacy of Web users. I also briefly
summarize some of the efforts to regulate privacy on the Internet.
%Z
Originally written in 1998; this revision was dated January 3, 1999.
The work was unofficially released, then made a Technical Report
in June 2004.
%T Access Control in a Distributed Decentralized Network: An XML Approach to Network Security using XACML and SAML
%A Paul J. Mazzuca
%R Technical Report TR2004-506
%I Dartmouth College, Computer Science
%C Hanover, NH
%D Spring 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-506.pdf
%X
The development of eXtensible Distributed Access Control (XDAC) systems is influenced by the transference of data access and storage from the local computer to the network. In this distributed system, access control is determined by independent components which transmit requests and decisions over a network, utilizing XML signing capabilities found in the Security Assertion Markup Language (SAML). All resources in the XDAC system are protected by the first component, a Policy Enforcement Point (PEP), which acts as the main divider between the requesting entity and the requested resource. The PEP grants access to a resource only if the second component, a Policy Decision Point (PDP), returns a permit response after consulting a set of applicable policies based on the requester's attributes, the resource, the action that the requester desires to apply to that resource, and optionally the environment. With Sun's eXtensible Access Control Markup Language (XACML), the XML-encoded policies can be combined among multiple nodes across a network using XACML rules and algorithms to formulate a single decision based on an XACML request. In this thesis project, I build a secure and efficient XDAC system based on XACML, implement an extension to the SAML Assertion design by including XACML Attributes and Results, describe in detail the many features that an XDAC system should embody, and show how an XDAC system would be effectively used in modern-day computing.
%Z
Senior Honors Thesis. Advisor: Ed Feustel.
%T Experimental evaluation of wireless simulation assumptions
%A David Kotz
%A Calvin Newport
%A Robert S. Gray
%A Jason Liu
%A Yougu Yuan
%A Chip Elliott
%R Technical Report TR2004-507
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-507.pdf
%X
All analytical and simulation research on ad~hoc
wireless networks must necessarily model radio
propagation using simplifying assumptions. Although it
is tempting to assume that all radios have circular
range, have perfect coverage in that range, and
travel on a two-dimensional plane, most researchers
are increasingly aware of the need to represent more
realistic features, including hills, obstacles, link
asymmetries, and unpredictable fading. Although many
have noted the complexity of real radio propagation,
and some have quantified the effect of overly simple
assumptions on the simulation of ad~hoc network
protocols, we provide a comprehensive review of six
assumptions that are still part of many ad~hoc
network simulation studies. In particular, we use an
extensive set of measurements from a large outdoor
routing experiment to demonstrate the weakness of
these assumptions, and show how these assumptions
cause simulation results to differ significantly
from experimental results. We close with a series of
recommendations for researchers, whether they
develop protocols, analytic models, or simulators
for ad~hoc wireless networks.
%Z
This paper is significantly revised and extended from TR2003-467,
"The mistaken axioms of wireless-network research."
%T PPL: a Packet Processing Language
%A Eric G. Krupski
%R Technical Report TR2004-508
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-508.pdf
%X
Any computing device or system that uses the internet needs to analyze and identify the contents of network packets. Code that does this is often written in C, but reading, identifying, and manipulating network packets in C requires writing tricky and tedious code. Previous work has offered specification languages for describing the format of network packets, which would allow packet type identification without the hassles of doing this task in C. For example, McCann and Chandra's Packet Types [3] system allows the programmer to define arbitrary packet types and generates C functions which match given data against a specified packet type. This paper will present a packet processing language named PPL, which extends McCann and Chandra's Packet Types to allow the programmer to not only describe arbitrary packet types, but also to control when and how a matching is attempted, with ML-style pattern matching. PPL is intended for multiple applications, such as intrusion detection systems, quick prototypes of new protocols, and IP de-multiplexing code.
%Z
Senior Honors Thesis. Advisor: Chris Hawblitzel.
Code and other materials available at
http://www.cs.dartmouth.edu/reports/TR2004-508.tgz
(440KB).
%T Greenpass Client Tools for Delegated Authorization in Wireless Networks
%A Nicholas C. Goffee
%R Technical Report TR2004-509
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-509.pdf
%X
Dartmouth's Greenpass project seeks to provide strong access control
to a wireless network while simultaneously providing flexible guest
access; to do so, it augments the Wi-Fi Alliance's existing WPA
standard, which offers sufficiently strong user authentication and
access control, with authorization based on SPKI certificates. SPKI
allows certain local users to delegate network access to guests by
issuing certificates that state, in essence, "he should get access
because I said it's okay." The Greenpass RADIUS server described in
Kim's thesis [55] performs an authorization check based on such
statements so that guests can obtain network access without requiring
a busy network administrator to set up new accounts in a centralized
database. To our knowledge, Greenpass is the first working
delegation-based solution to Wi-Fi access control.
My thesis describes the Greenpass client tools, which allow a guest
to introduce himself to a delegator and allow the delegator to issue a
new SPKI certificate to the guest. The guest does not need custom
client software to introduce himself or to connect to the Wi-Fi
network. The guest and delegator communicate using a set of Web
applications. The guest obtains a temporary key pair and X.509
certificate if needed, then sends his public key value to a Web server
we provide. The delegator looks up her guest's public key and runs a
Java applet that lets her verify her guest's identity using visual
hashing and issue a new SPKI certificate to him. The guest's new
certificate chain is stored as an HTTP cookie to enable him to "push"
it to an authorization server at a later time. I also describe how
Greenpass can be extended to control access to a virtual private
network (VPN) and suggest several interesting future research and
development directions that could build on this work.
%Z
Master's thesis. Advisor: Sean Smith
%T Greenpass RADIUS Tools for Delegated Authorization in Wireless Networks
%A Sung Hoon Kim
%R Technical Report TR2004-510
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-510.pdf
%X
Dartmouth's Greenpass project extends how public key cryptography can be used
to secure the wireless LAN with a RADIUS (Remote Authentication Dial In User
Service) server that is responsible for handling authentication requests from
clients (called supplicants in the 802.1x authentication model). This thesis
describes the design and implementation of the authentication process of
Greenpass, specifically what decisions are made in determining who is granted
access and how a small modification of already existing protocols can be used
to provide guest access in a way that better reflects how delegation of
authority works in the real world.
Greenpass takes advantage of the existing PKI to authenticate local
Dartmouth users via X.509 identity certificates using EAP-TLS. We use the
flexibility of SPKI/SDSI (Simple Public Key Infrastructure/Simple Distributed
Security Infrastructure) authorization certificates to distribute the
responsibility of delegating access to guests to certain authorized delegators,
avoiding some of the necessary steps and paperwork associated with having a
large centralized entity responsible for the entire institution. This thesis
also discusses how our solution can be adapted to support different methods of
guest delegation and investigates the possibility of eliminating the
cumbersome central entity and administrative overhead traditionally
associated with public key cryptography.
%Z
Master's thesis. Advisor: Sean Smith
%T Outdoor Experimental Comparison of Four Ad Hoc Routing Algorithms
%A Robert S. Gray
%A David Kotz
%A Calvin Newport
%A Nikita Dubrovsky
%A Aaron Fiske
%A Jason Liu
%A Christopher Masone
%A Susan McGrath
%A Yougu Yuan
%R Technical Report TR2004-511
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-511.pdf
%X
Most comparisons of wireless ad hoc routing
algorithms involve simulated or indoor trial runs,
or outdoor runs with only a small number of nodes,
potentially leading to an incorrect picture of
algorithm performance. In this paper, we report on
the results of an outdoor trial run of four
different routing algorithms, APRL, AODV, GPSR, and
STARA, running on top of thirty-three 802.11-enabled
laptops moving randomly through an athletic
field. The laptops generated random traffic
according to the traffic patterns observed in a
prototype application, and ran each routing
algorithm for a fifteen-minute period over the
course of the hour-long trial run. The 33-laptop
experiment represents one of the largest outdoor
tests of wireless routing algorithms, and three of
the algorithms each come from a different
algorithmic class, providing insight into the
behavior of ad hoc routing algorithms at larger
real-world scales than have been considered so
far. In addition, we compare the outdoor results
with both indoor ("tabletop") and simulation results
for the same algorithms, examining the differences
between the indoor results and the outdoor
reality. The paper also describes the software
infrastructure that allowed us to implement the ad
hoc routing algorithms in a comparable way, and use
the same codebase for indoor, outdoor, and simulated
trial runs.
%Z
See also TR2004-507 and TR2004-504.
%T Testing the Greenpass Wireless Security System
%A Kimberly S. Powell
%R Technical Report TR2004-512
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-512.pdf
%X
Greenpass, developed by Nick Goffee, Sung Hoon Kim, Meiyuan Zhao and John Marchesini under the supervision of Sean Smith and Punch Taylor, is a wireless security solution that implements SPKI/SDSI delegation on top of X.509 keypairs within the EAP-TLS authentication protocol. This system aims to model the decentralized way that authorization flows in real-world enterprise settings and provide a seamless solution that allows for easy access to all resources in the network by both registered users and authorized guests. These goals are achieved through the deployment of a delegation tool, which allows an active entity associated with the organization's network to grant authorization to another entity previously unauthorized to use the network.
This paper describes the testing process of the first prototype for this system. It examines trust and usability issues of the Greenpass Wireless Security System and determines the accuracy of the system's implementation in relation to its objectives. It then addresses the planning and execution of a small-scale demo for this prototype based on the examined issues and makes projections for further tests on a larger scale.
%Z
Senior Honors Thesis. Advisors: Sean Smith and Denise Anthony
%T Performance Evaluation of a Resource Discovery Service
%A Jue Wang
%R Technical Report TR2004-513
%I Dartmouth College, Computer Science
%C Hanover, NH
%D October 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-513.pdf
%X
In a pervasive computing environment, the number
and variety of resources (services, devices, and contextual
information resources) make it necessary for applications to
accurately discover the best ones quickly. Thus a resource-discovery
service, which locates specific resources and establishes network
connections as better resources become available, is necessary for
those applications. The performance of the resource-discovery service
is important when the applications are in a dynamic and mobile
environment. In this thesis, however, we focus not on the
resource-discovery technology itself, but on the evaluation of the scalability and
mobility of the resource discovery module in Solar, a context fusion
middleware. Solar has a naming service that provides resource
discovery, since the resource names encode static and dynamic
attributes. The results of our experiments show that Solar's
resource discovery generally performed well in a typical dynamic environment,
although Solar does not scale as well as it should. We identify the
implementation issues related to that problem. We
also discuss experience, insights, and lessons learned from our
quantitative analysis of the experiment results.
%T Solar: Building A Context Fusion Network for Pervasive Computing
%A Guanling Chen
%R Technical Report TR2004-514
%I Dartmouth College, Computer Science
%C Hanover, NH
%D August 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-514.pdf
%X
The complexity of developing context-aware pervasive-computing
applications calls for distributed software infrastructures that
assist applications to collect, aggregate, and disseminate contextual
data. In this dissertation, we present a Context Fusion Network
(CFN), called Solar, which is built with a scalable and self-organized
service overlay. Solar is flexible and allows applications to select
distributed data sources and compose them with customized data-fusion
operators into a directed acyclic information flow graph. Such a
graph represents how an application computes high-level understandings
of its execution context from low-level sensory data. To manage
application-specified operators on a set of overlay nodes called
Planets, Solar provides several unique services such as
application-level multicast with policy-driven data reduction to
handle buffer overflow, context-sensitive resource discovery to handle
environment dynamics, and proactive monitoring and recovery to handle
common failures. Experimental results show that these services
perform well on a typical DHT-based peer-to-peer routing substrate.
In this dissertation, we also discuss experience, insights, and
lessons learned from our quantitative analysis of the input sensors, a
detailed case study of a Solar application, and development of other
applications in different domains.
%Z
This is a reformatted version of Guanling Chen's Ph.D. dissertation. Unlike
the dissertation submitted to Dartmouth College, this version is
single-spaced, uses 11pt fonts, and is formatted specifically for
double-sided printing.
%T Exposing Digital Forgeries by Detecting Duplicated Image Regions
%A Alin C. Popescu
%A Hany Farid
%R Technical Report TR2004-515
%I Dartmouth College, Computer Science
%C Hanover, NH
%D August 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-515.pdf
%X
We describe an efficient technique that automatically detects
duplicated regions in a digital image. This technique works by first
applying a principal component analysis to small fixed-size image
blocks to yield a reduced dimension representation. This
representation is robust to minor variations in the image due to
additive noise or lossy compression. Duplicated regions are then
detected by lexicographically sorting all of the image blocks. We show
the efficacy of this technique on credible forgeries, and quantify its
robustness and sensitivity to additive noise and lossy JPEG
compression.
%T Parallel Out-of-Core Sorting: The Third Way
%A Geeta Chaudhry
%R Technical Report TR2004-517
%I Dartmouth College, Computer Science
%C Hanover, NH
%D September 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-517.pdf
%X
Sorting very large datasets is a key subroutine in almost any application that
is built on top of a large database. Two ways to sort out-of-core data
dominate the literature: merging-based algorithms and partitioning-based
algorithms. Within these two paradigms, all the programs that sort
out-of-core data on a cluster rely on assumptions about the input distribution.
We propose a third way of out-of-core sorting: oblivious algorithms. In all,
we have developed six programs that sort out-of-core data on a cluster. The
first three programs, based completely on Leighton's columnsort algorithm,
have a restriction on the maximum problem size that they can sort.
The other three programs relax this restriction; two are based on our
original algorithmic extensions to columnsort. We present experimental
results to show that our algorithms perform well. To the best of our knowledge,
the programs presented in this thesis are the first to sort out-of-core
data on a cluster without making any simplifying assumptions about the
distribution of the data to be sorted.
%Z
Doctoral dissertation. Advisor: Thomas H. Cormen
%T Creating and Detecting Doctored and Virtual Images: Implications to The Child Pornography Prevention Act
%A Hany Farid
%R Technical Report TR2004-518
%I Dartmouth College, Computer Science
%C Hanover, NH
%D September 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-518.pdf
%X
The 1996 Child Pornography Prevention Act (CPPA) extended
the existing federal criminal laws against child pornography to
include certain types of "virtual porn". In 2002, the United States
Supreme Court found that portions of the CPPA, being overly broad and
restrictive, violated First Amendment rights. The Court ruled that
images containing an actual minor or portions of a minor are not
protected, while computer generated images depicting a fictitious
"computer generated" minor are constitutionally protected. In this
report I outline various forms of digital tampering, placing them in
the context of this recent ruling. I also review computational
techniques for detecting doctored and virtual (computer generated)
images.
%T Heterogeneous Self-Reconfiguring Robotics
%A Robert C. Fitch
%R Technical Report TR2004-519
%I Dartmouth College, Computer Science
%C Hanover, NH
%D September 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-519.pdf
%X
Self-reconfiguring (SR) robots are modular systems that can autonomously change shape, or reconfigure, for increased versatility and adaptability in unknown environments. In this thesis, we investigate planning and control for systems of non-identical modules, known as heterogeneous SR robots. Although previous approaches rely on module homogeneity as a critical property, we show that the planning complexity of fundamental algorithmic problems in the heterogeneous case is equivalent to that of systems with identical modules. Primarily, we study the problem of how to plan shape changes while considering the placement of specific modules within the structure. We characterize this key challenge in terms of the amount of free space available to the robot and develop a series of decentralized reconfiguration planning algorithms that assume progressively more severe free space constraints and support reconfiguration among obstacles. In addition, we compose our basic planning techniques in different ways to address problems in the related task domains of positioning modules according to function, locomotion among obstacles, self-repair, and recognizing the achievement of distributed goal-states. We also describe the design of a novel simulation environment, implementation results using this simulator, and experimental results in hardware using a planar SR system called the Crystal Robot. These results encourage development of heterogeneous systems. Our algorithms enhance the versatility and adaptability of SR robots by enabling them to use functionally specialized components to match capability, in addition to shape, to the task at hand.
%Z
Doctoral dissertation. Advisor: Daniela Rus
%T Mercer Kernels for Object Recognition with Local Features
%A Siwei Lyu
%R Technical Report TR2004-520
%I Dartmouth College, Computer Science
%C Hanover, NH
%D October 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-520.pdf
%X
In this paper, we propose a new class of kernels for object
recognition based on local image feature representations. Formal
proofs are given to show that these kernels satisfy the Mercer
condition and reflect similarities between sets of local features. In
addition, multiple types of local features and semilocal constraints
are incorporated to reduce mismatches between local features, thus
further improving the classification performance. Experimental results
of SVM classifiers coupled with the proposed kernels are reported on
recognition tasks with the standard COIL-100 database and compared
with existing methods. The proposed kernels achieved satisfactory
performance and were robust to changes in object configurations and
image degradations.
%Z
Submitted to CVPR 2005.
%T Composing a Well-Typed Region
%A Chris Hawblitzel
%A Heng Huang
%A Lea Wittie
%R Technical Report TR2004-521
%I Dartmouth College, Computer Science
%C Hanover, NH
%D October 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-521.pdf
%X
Efficient low-level systems need more control over memory than
safe high-level languages usually provide. In particular, safe languages
usually prohibit explicit deallocation, in order to prevent dangling pointers.
Regions provide one safe deallocation mechanism; indeed, many region calculi
have appeared recently, each with its own set of operations and often complex
rules. This paper encodes regions from lower-level typed primitives (linear
memory, coercions, and delayed types), so that programmers can design their own
region operations and rules.
%T Efficient Wait-Free Implementation of Multiword LL/SC Variables
%A Prasad Jayanti
%A Srdjan Petrovic
%R Technical Report TR2004-523
%I Dartmouth College, Computer Science
%C Hanover, NH
%D October 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-523.pdf
%X
Since the design of lock-free data structures often poses a formidable intellectual challenge, researchers are constantly in search of abstractions and primitives that simplify this design.
The multiword LL/SC object is such a primitive: many existing algorithms are based on this primitive, including the nonblocking and wait-free universal
constructions of Anderson and Moir (1995), the closed objects construction of Chandra et al. (1998), and the snapshot algorithms of Jayanti (2002, 2004).
In this paper, we consider the problem of implementing a W-word LL/SC object shared by N processes. The previous best algorithm, due to Anderson and Moir (1995), is time optimal (LL and SC operations run in O(W) time), but has a space complexity of O(N^2W).
We present an algorithm that uses novel buffer management ideas to cut down the space complexity by a factor of N to O(NW), while still being time optimal.
%T A Survey of WPA and 802.11i RSN Authentication Protocols
%A Kwang-Hyun Baek
%A Sean W. Smith
%A David Kotz
%R Technical Report TR2004-524
%I Dartmouth College, Computer Science
%C Hanover, NH
%D November 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-524.pdf
%X
In the new standards for WLAN security, many choices exist for the authentication process. In this paper, we list eight desired properties of WLAN authentication protocols, survey eight recent authentication protocols, and analyze the protocols according to the desired properties.
%T Secure Hardware Enhanced MyProxy
%A John Marchesini
%A Sean W. Smith
%R Technical Report TR2004-525
%I Dartmouth College, Computer Science
%C Hanover, NH
%D November 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-525.pdf
%X
In 1976, Whitfield Diffie and Martin Hellman demonstrated how "New
Directions In Cryptography" could enable secure information exchange
between parties that do not share secrets. In order for public key
cryptography to work in modern distributed environments, we need an
infrastructure for finding and trusting other parties' public keys
(i.e., a PKI). A number of useful applications become possible with
PKI. While the applications differ in how they use keys (e.g., S/MIME
uses the key for message encryption and signing, while client-side SSL
uses the key for authentication), all applications share one
assumption: users have keypairs.
In previous work, we examined the security aspects of some of the
standard keystores and their interaction with the OS. We
concluded that desktops are not safe places to store private keys, and
we demonstrated the permeability of keystores such as the default
Microsoft keystore and the Mozilla keystore. In addition to being
unsafe, these desktop keystores have the added disadvantage of being
immobile.
In other previous work, we examined trusted computing. In industry, a
new trusted computing initiative has emerged: the Trusted Computing
Platform Alliance (TCPA) (now renamed the Trusted Computing Group
(TCG)). The goal of the TCG design is lower-assurance security that
protects an entire desktop platform and is cheap enough to be
commercially feasible. Last year, we built a trusted computing
platform based on the TCG specifications and hardware.
The picture painted by these previous projects suggests that common
desktops are not secure enough for use as PKI clients, and trusted
computing can improve the security of client machines. The question
that I propose to investigate is: "Can I build a system which applies
trusted computing hardware in a reasonable manner in order to make
desktops usable for PKI?" My design begins with the Grid community's
"MyProxy" credential repository, and enhances it to take advantage of
secure hardware on the clients, at the repository, and in the policy
framework. The result is called "Secure Hardware Enhanced MyProxy".
%T Type-Safe Operating System Abstractions
%A Lea Wittie
%R Technical Report TR2004-526
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-526.pdf
%X
Operating systems and low-level applications are usually written in
languages like C and assembly, which provide access to low-level
abstractions. These languages have unsafe type systems that allow many
bugs to slip by programmers. For example, in 1988, the Internet Worm
exploited several insecure points in Unix including the finger
command. A call to finger with an unexpected argument caused a buffer
overflow, leading to the shutdown of most Internet traffic. A finger
application written in a type-safe language would have prevented its
exploit and limited the points the Internet Worm could attack. Such
vulnerabilities are unacceptable in security-critical applications
such as the secure coprocessors of the Marianas network, secStore key
storage from Plan 9, and self-securing storage.
This research focuses on safe language techniques for
building OS components that cannot cause memory or IO errors. For
example, an Ethernet device driver communicates with its device
through IO operations. The device depends on FIFO queues to send and
receive packets. A mistake in an IO operation can overflow or
underflow the FIFO queues, cause memory errors, or cause configuration
inconsistencies on the device. Data structures such as FIFO queues can
be written safely in safe languages such as Java and ML but these
languages do not allow access to the low-level resources that an
OS programmer needs. Therefore, safe OS components require a language
that combines the safety of Java with the low-level control of C.
My research formalizes the concurrency, locks, and system state
needed by the safety-critical areas of a device driver. These formal
concepts are built on top of an abstract syntax and rules that
guarantee basic memory safety using linear and singleton types to
implement safe memory load and store operations. I proved that the
improved abstract machine retains the property of soundness, which
means that all well-typed programs will be able to execute until they
reach an approved end-state. Together, the concurrency, locks, and
state provide safety for IO operations and data structures.
Using the OSKit from the University of Utah as a starting point, I
developed a small operating system. I ported the 3c509 Ethernet device
driver from C to Clay, a C-like type-safe language that uses a type
system powerful enough to enforce invariants about low-level devices
and data structures. The resulting driver works safely in a
multi-threaded environment. It is guaranteed to obtain locks before
using shared data. It cannot cause a FIFO queue to overflow or
underflow and it will only call IO operations when invariants are
satisfied.
This type-safe driver demonstrates an actual working application of
the theoretical components of my research. The abstract machine is powerful
enough to encode a given OS specification and enforce a provably matching
implementation. These results lead towards fundamentally secure computing
environments.
%Z
Doctoral dissertation. Advisor: Chris Hawblitzel
%T Discrete-Time Fractional Differentiation from Integer Derivatives
%A Hany Farid
%R Technical Report TR2004-528
%I Dartmouth College, Computer Science
%C Hanover, NH
%D December 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-528.pdf
%X
Discrete-time fractional derivative filters (1-D and 2-D) are shown to
be well approximated from a small set of integer derivatives. A
fractional derivative of arbitrary order (and, in 2-D, of arbitrary
orientation) can therefore be efficiently computed from a linear
combination of integer derivatives of the underlying signal or image.
%T Secure Context-sensitive Authorization
%A Kazuhiro Minami
%A David Kotz
%R Technical Report TR2004-529
%I Dartmouth College, Computer Science
%C Hanover, NH
%D December 2004
%U http://www.cs.dartmouth.edu/reports/TR2004-529.pdf
%X
There is a recent trend toward rule-based authorization systems to achieve
flexible security policies. Also, new sensing technologies in pervasive
computing make it possible to define context-sensitive rules, such as "allow
database access only to staff who are currently located in the main office."
However, these rules, or the facts that are needed to verify authority, often
involve sensitive context information. This paper presents a secure
context-sensitive authorization system that protects confidential information in
facts or rules. Furthermore, our system allows multiple hosts in a distributed
environment to perform the evaluation of an authorization query in a
collaborative way; we do not need a universally trusted central host that
maintains all the context information. The core of our approach is to decompose
a proof for making an authorization decision into a set of sub-proofs produced
on multiple different hosts, while preserving the integrity and confidentiality
policies of the mutually untrusted principals operating these hosts. We prove
the correctness of our algorithm.
%T High-Throughput Inference of Protein-Protein Interaction Sites from Unassigned NMR Data by Analyzing Arrangements Induced By Quadratic Forms on 3-Manifolds
%A Ramgopal R. Mettu
%A Ryan H. Lilien
%A Bruce R. Donald
%R Technical Report TR2005-530
%I Dartmouth College, Computer Science
%C Hanover, NH
%D January 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-530.pdf
%X
We cast the problem of identifying protein-protein interfaces, using only unassigned NMR spectra, into a geometric clustering problem. Identifying protein-protein interfaces is critical to understanding inter- and intra-cellular communication, and NMR allows the study of protein interaction in solution. However, NMR studies of a protein complex are often very time-consuming, mainly due to the bottleneck in assigning the chemical shifts, even if the apo structures of the constituent proteins are known. We study whether it is possible, in a high-throughput manner, to identify the interface region of a protein complex using only unassigned chemical shift and residual dipolar coupling (RDC) data.
We introduce a geometric optimization problem where we must cluster the cells in an arrangement on the boundary of a 3-manifold. The arrangement is induced by a spherical quadratic form, which in turn is parameterized by SO(3)xR^2. We show that this formalism derives directly from the physics of RDCs. We present an optimal algorithm for this problem that runs in O(n^3 log n) time for an n-residue protein. We then use this clustering algorithm as a subroutine in a practical algorithm for identifying the interface region of a protein complex from unassigned NMR data. We present the results of our algorithm on NMR data for 7 proteins from 5 protein complexes and show that our approach is useful for high-throughput applications in which we seek to rapidly identify the interface region of a protein complex.
%Z
A revised version of this paper has been accepted for publication and will appear at ISMB (2005) and Bioinformatics (2005).
%T Statistical Tools for Digital Image Forensics
%A Alin C. Popescu
%R Technical Report TR2005-531
%I Dartmouth College, Computer Science
%C Hanover, NH
%D December 2004
%U http://www.cs.dartmouth.edu/reports/TR2005-531.pdf
%X
A digitally altered image, often leaving no visual clues of
having been tampered with, can be indistinguishable from an authentic
image. The tampering, however, may disturb some underlying statistical
properties of the image. Under this assumption, we propose five
techniques that quantify and detect statistical perturbations found in
different forms of tampered images: (1) re-sampled images (e.g.,
scaled or rotated); (2) manipulated color filter array interpolated
images; (3) double JPEG compressed images; (4) images with duplicated
regions; and (5) images with inconsistent noise patterns. These
techniques work in the absence of any embedded watermarks or
signatures. For each technique we develop the theoretical foundation,
show its effectiveness on credible forgeries, and analyze its
sensitivity and robustness to simple counter-attacks.
%Z
Ph.D. thesis, advisor: Hany Farid.
%T SHEMP: Secure Hardware Enhanced MyProxy
%A John Marchesini
%A Sean Smith
%R Technical Report TR2005-532
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-532.pdf
%X
While PKI applications differ in how they use keys, all applications
share one assumption: users have keypairs. In previous work, we
established that desktop keystores are not safe places to store
private keys, because the TCB is too large. These keystores are also
immobile, difficult to use, and make it impossible for relying parties
to make reasonable trust judgments. Since we would like to use
desktops as PKI clients and cannot realistically expect to redesign
the entire desktop, this paper presents a system that works within the
confines of modern desktops to shrink the TCB needed for PKI
applications. Our system (called Secure Hardware Enhanced MyProxy
(SHEMP)) shrinks the TCB in space and allows the TCB's size to vary
over time and over various application sensitivity levels, thus making
desktops usable for PKI.
%T Spatial Multipath Location Aided Routing
%A Soumendra Nanda
%R Technical Report TR2005-533
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-533.pdf
%X
Mobile ad-hoc networks (MANETs) are infrastructure-free networks of mobile nodes
that communicate with each other wirelessly. Several routing schemes
have been proposed, and several of these have already been extensively
simulated or implemented. The primary applications of such networks have
been in disaster relief operations, military use, conferencing and environment
sensing. There are several ad hoc routing algorithms at present that utilize
position information (usually in two dimensional terms) to make routing
decisions at each node. Our goal is to utilize three-dimensional (3D) position
information to provide more reliable as well as efficient routing for certain
applications. We thus describe extensions to various location aware routing
algorithms to work in 3D. We propose a new hierarchical, zone-based 3D routing
algorithm, based on GRID by Liao, Tseng and Sheu. Our new algorithm called
"Hyper-GRID" is a hybrid algorithm that uses multipath routing (alternate path
caching) in 3D. We propose replacing LAR with Multipath LAR (MLAR) in GRID. We
have implemented MLAR and are validating MLAR through simulation using ns-2 and
studying its efficiency, scalability and other properties. We use a random
waypoint mobility model and compare our MLAR approach versus LAR, AODV and AOMDV
in both 2D and 3D for a range of traffic and mobility scenarios. Our simulation
results demonstrate the performance benefits of MLAR over LAR and AODV in most
mobility situations. AOMDV delivers more packets than MLAR consistently, but
does so at the cost of more frequent flooding of control packets and thus higher
bandwidth usage than MLAR.
%Z
MS Thesis. Originally submitted 6th June 2004. Advised by Robert S. Gray.
%T Department of Computer Science Activity 1998-2004
%A David Kotz
%R Technical Report TR2005-534
%I Dartmouth College, Computer Science
%C Hanover, NH
%D March 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-534.pdf
%X
This report summarizes much of the research and teaching activity of the Department of Computer Science at Dartmouth College between late 1998 and late 2004. The material for this report was collected as part of the final report for NSF Institutional Infrastructure award EIA-9802068, which funded equipment and technical staff during that six-year period. This equipment and staff supported essentially all of the department's research activity during that period.
%T Graphical Models of Residue Coupling in Protein Families
%A John Thomas
%A Naren Ramakrishnan
%A Chris Bailey-Kellogg
%R Technical Report TR2005-535
%I Dartmouth College, Computer Science
%C Hanover, NH
%D March 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-535.pdf
%X
Identifying residue coupling relationships within a protein family can
provide important insights into intrinsic molecular processes, and has
significant applications in modeling structure and dynamics,
understanding function, and designing new or modified proteins. We
present the first algorithm to infer an undirected graphical model
representing residue coupling in protein families. Such a model
serves as a compact description of the joint amino acid distribution,
and can be used for predictive (will this newly designed protein be
folded and functional?), diagnostic (why is this protein not stable or
functional?), and abductive reasoning (what if I attempt to graft
features of one protein family onto another?). Unlike current
correlated mutation algorithms that are focused on assessing
dependence, which can conflate direct and indirect relationships, our
algorithm focuses on assessing independence, which modularizes
variation and thus enables efficient reasoning of the types described
above. Further, our algorithm can readily incorporate, as priors,
hypotheses regarding possible underlying mechanistic/energetic
explanations for coupling. The resulting approach constitutes a
powerful and discriminatory mechanism to identify residue coupling
from protein sequences and structures. Analysis results on the
G-protein coupled receptor (GPCR) and PDZ domain families demonstrate
the ability of our approach to effectively uncover and exploit models
of residue coupling.
%Z
To appear in BIOKDD05.
%T Detection of Covert Channel Encoding in Network Packet Delays
%A Vincent Berk
%A Annarita Giani
%A George Cybenko
%R Technical Report TR2005-536
%I Dartmouth College, Computer Science
%C Hanover, NH
%D August 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-536-rev1.pdf
%X
Covert channels are mechanisms for communicating information in
ways that are difficult to detect. Data exfiltration can be an
indication that a computer has been compromised by an attacker
even when other intrusion detection schemes have failed to detect
a successful attack. Covert timing channels use packet
inter-arrival times, not header or payload embedded information,
to encode covert messages. This paper investigates the channel
capacity of Internet-based timing channels and proposes a
methodology for detecting covert timing channels based on how
close a source comes to achieving that channel capacity. A
statistical approach is then used for the special case of binary
codes.
%Z
This revision differs from the original only in the correction of one
reference.
%T An O(n^{5/2} log n) Algorithm for the Rectilinear Minimum Link-Distance Problem in Three Dimensions (Extended Abstract)
%A Robert Scot Drysdale
%A Clifford Stein
%A David P. Wagner
%R Technical Report TR2005-538
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-538.pdf
%X
In this paper we consider the Rectilinear Minimum Link-Distance Problem in Three Dimensions. The problem is well studied in two dimensions, but is relatively unexplored in higher dimensions.
We solve the problem in O(B n log n) time, where n is the number
of corners among all obstacles, and B is the size of a BSP decomposition of the space containing the obstacles. It has been shown that
in the worst case B = Theta(n^{3/2}), giving us an overall worst case time of O(n^{5/2} log n).
Previously known algorithms have had worst-case running times of Omega(n^3).
%Z
Submitted to CCCG 2005
%T Preventing Theft of Quality of Service on Open Platforms
%A Kwang-Hyun Baek
%A Sean W. Smith
%R Technical Report TR2005-539
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-539.pdf
%X
As multiple types of traffic converge onto one network (frequently wireless), enterprises face a tradeoff between effectiveness and security. Some types of traffic, such as voice-over-IP (VoIP), require certain quality of service (QoS) guarantees to be effective. The end client platform is in the best position to know which packets deserve this special handling. In many environments (such as universities), end users relish having control over their own machines.
However, if end users administer their own machines, nothing stops dishonest ones from marking undeserving traffic for high QoS. How can an enterprise ensure that only appropriate traffic receives high QoS, while also allowing end users to retain control over their own machines?
In this paper, we present the design and prototype of a solution, using SELinux, TCPA/TCG hardware, Diffserv, 802.1x, and EAP-TLS.
%Z
REVISION 1 of May 5, 2005.
%T Classifying the Mobility of Users and the Popularity of Access Points
%A Minkyong Kim
%A David Kotz
%R Technical Report TR2005-540
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-540.pdf
%X
There is increasing interest in location-aware systems and applications. It is
important for any designer of such systems and applications to understand the
nature of user and device mobility. Furthermore, an understanding of the effect
of user mobility on access points (APs) is also important for designing,
deploying, and managing wireless networks. Although various studies of wireless
networks have provided insights into different network environments and user
groups, it is often hard to apply these findings to other situations, or to
derive useful abstract models.
In this paper, we present a general methodology for extracting mobility
information from wireless network traces, and for classifying mobile users and
APs. We used the Fourier transform to convert time-dependent location
information to the frequency domain, then chose the two strongest periods and
used them as parameters to a classification system based on Bayesian theory. To
classify mobile users, we computed diameter (the maximum distance
between any two APs visited by a user during a fixed time period) and observed
how this quantity changes or repeats over time. We found that user mobility had
a strong period of one day, but there was also a large group of users that had
either a much smaller or much bigger primary period. Both primary and secondary
periods had important roles in determining classes of mobile users. Users with
one day as their primary period and a smaller secondary period were most
prevalent; we expect that they were mostly students taking regular classes. To
classify APs, we counted the number of users who visited each AP. The primary
period did not play a critical role because it was equal to one day for most of
the APs; the secondary period was the determining parameter. APs with one day
as their primary period and one week as their secondary period were most
prevalent. By plotting the classes of APs on our campus map, we discovered that
this periodic behavior of APs seemed to be independent of their geographical
locations, but may depend on the relative locations of nearby APs. Ultimately,
we hope that our study can help the design of location-aware services by
providing a base for user mobility models that reflect the movements of real
users.
%Z
Revision of a paper that appeared at LoCA 2005.
%T Aggregated Path Authentication for Efficient BGP Security
%A Meiyuan Zhao
%A Sean W. Smith
%A David M. Nicol
%R Technical Report TR2005-541
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-541.pdf
%X
The border gateway protocol (BGP) controls inter-domain routing in the
Internet. BGP is vulnerable to many attacks, since routers rely on
hearsay information from neighbors. Secure BGP (S-BGP) uses DSA to
provide route authentication and mitigate many of these
risks. However, many performance and deployment issues prevent S-BGP's
real-world deployment. Previous work has explored improving S-BGP
processing latencies, but space problems, such as increased message
size and memory cost, remain the major obstacles. In this paper, we
combine two efficient cryptographic techniques---signature
amortization and aggregate signatures---to design new aggregated path
authentication schemes. We propose six constructions for aggregated
path authentication that substantially improve efficiency of S-BGP's
path authentication on both speed and space criteria. Our performance
evaluation shows that the new schemes achieve such an efficiency that
they may overcome the space obstacles and provide a real-world
practical solution for BGP security.
%T A toy rock climbing robot
%A Matthew P. Bell
%R Technical Report TR2005-542
%I Dartmouth College, Computer Science
%C Hanover, NH
%D August 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-542.pdf
%X
The goal of this thesis was to build a simple toy rock climbing robot, and to explore problems related to grasping, path planning, and robot control. The robot is capable of climbing a wall of pegs either under manual control through a host system and an infrared interface, or on the basis of a set of pre-recorded keyframes. In addition, the robot can climb certain peg configurations using a cyclic gait. The robot climbs in an open-loop mode without sensor feedback. All communications are sent through the IR connection, and the tether to the robot consists only of two power wires.
%Z
Senior Honors Thesis. Advisor: Devin Balkcom.
%T On-line Metasearch, Pooling, and System Evaluation
%A Robert A. Savell
%R Technical Report TR2005-543
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-543.pdf
%X
This thesis presents a unified method for simultaneous solution of three problems in Information Retrieval: metasearch (the fusion of ranked lists returned by retrieval systems to elicit improved performance), efficient system evaluation (the accurate evaluation of retrieval systems with small numbers of relevance judgements), and pooling or ``active sample selection'' (the selection of documents for manual judgement in order to develop sample pools of high precision or pools suitable for assessing system quality). The thesis establishes a unified theoretical framework for addressing these three problems and naturally generalizes their solution to the on-line context by incorporating feedback in the form of relevance judgements.
The algorithm, Rankhedge for on-line retrieval, metasearch and system evaluation, is the first to address these three problems simultaneously and also to generalize their solution to the on-line context. Optimality of the Rankhedge algorithm is developed via Bayesian and maximum entropy interpretations. Results of the algorithm prove to be significantly superior to previous methods when tested over a range of TREC (Text REtrieval Conference) data. In the absence of feedback, the technique equals or exceeds the performance of benchmark metasearch algorithms such as CombMNZ and Condorcet. The technique then dramatically improves on this performance during the on-line metasearch process. In addition, the technique generates pools of documents which include more relevant documents and produce more accurate system evaluations than previous techniques. The thesis includes an information-theoretic examination of the original Hedge algorithm as well as its adaptation to the context of ranked lists. The work also addresses the concept of information-theoretic similarity within the Rankhedge context and presents a method for decorrelating the predictor set to improve worst case performance. Finally, an information-theoretically optimal method for probabilistic ``active sampling'' is presented with possible application to a broad range of practical and theoretical contexts.
%Z
Doctoral dissertation. Advisors: Jay Aslam (NEU), Dan Rockmore, Chris Bailey-Kellogg, and John Lafferty (CMU).
%T Managing Access Control in Virtual Private Networks
%A Twum Djin
%R Technical Report TR2005-544
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-544.pdf
%X
Virtual Private Network technology allows remote network users to benefit from
resources on a private network as if their host machines actually resided on the network.
However, each resource on a network may also have its own access control policies,
which may be completely unrelated to network access. Thus users' access to a network
(even by VPN technology) does not guarantee their access to the sought resources. With
the introduction of more complicated access privileges, such as delegated access, it is
conceivable for a scenario to arise where a user can access a network remotely (because
of direct permissions from the network administrator or by delegated permission) but
cannot access any resources on the network. There is, therefore, a need for a network
access control mechanism that understands the privileges of each remote network user on
one hand, and the access control policies of various network resources on the other hand,
and so can aid a remote user in accessing these resources based on the user's privileges.
This research presents a software solution in the form of a centralized access
control framework called an Access Control Service (ACS), that can grant remote users
network presence and simultaneously aid them in accessing various network resources
with varying access control policies. At the same time, the ACS provides a centralized
framework for administrators to manage access to their resources. The ACS achieves
these objectives using VPN technology, network address translation and by proxying
various authentication protocols on behalf of remote users.
%Z
Senior honors thesis.
%T Automatic Image Orientation Determination with Natural Image Statistics
%A Siwei Lyu
%R Technical Report TR2005-545
%I Dartmouth College, Computer Science
%C Hanover, NH
%D October 2004
%U http://www.cs.dartmouth.edu/reports/TR2005-545.pdf
%X
In this paper, we propose a new method for automatically determining
image orientations. This method is based on a set of natural image
statistics collected from a multi-scale multi-orientation image
decomposition (e.g., wavelets). From these statistics, a two-stage
hierarchical classification with multiple binary SVM classifiers is
employed to determine image orientation. The proposed method is
evaluated and compared to existing methods with experiments performed
on 18040 natural images, where it showed promising performance.
%T Boolean operations with two dimensional segment trees
%A Robert Scot Drysdale
%A David P. Wagner
%R Technical Report TR2005-546
%I Dartmouth College, Computer Science
%C Hanover, NH
%D Sometime 2005
%Z
Abstract and paper lost.
%T Towards Tiny Trusted Third Parties
%A Alexander Iliev
%A Sean Smith
%R Technical Report TR2005-547
%I Dartmouth College, Computer Science
%C Hanover, NH
%D July 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-547.pdf
%X
Many security protocols hypothesize the existence of a {\em trusted
third party (TTP)} to ease handling of computation and data too
sensitive for the other parties involved. Subsequent discussion
usually dismisses these protocols as hypothetical or impractical,
under the assumption that trusted third parties cannot exist. However,
the last decade has seen the emergence of hardware-based devices that,
to high assurance, can carry out computation unmolested; emerging
research promises more. In theory, such devices can perform the role
of a trusted third party in real-world problems.
In practice, we have found problems. The devices aspire to be
general-purpose processors but are too small to accommodate real-world
problem sizes. The small size forces programmers to hand-tune each
algorithm anew, if possible, to fit inside the small space without
losing security. This tuning heavily uses operations that
general-purpose processors do not perform well. Furthermore, perhaps
by trying to incorporate too much functionality, current devices are
also too expensive to deploy widely.
Our current research attempts to overcome these barriers, by
focusing on the effective use of {\em tiny} TTPs ({\em T3Ps}). To
eliminate the programming obstacle, we used our experience building
hardware TTP apps to design and prototype an efficient way to execute
{\em arbitrary} programs on T3Ps while preserving the critical trust
properties. To eliminate the performance and cost obstacles, we are
currently examining the potential hardware design for a T3P optimized
for these operations.
In previous papers, we reported our work on the programming
obstacle. In this paper, we examine the potential hardware designs. We
estimate that such a T3P could outperform existing devices by several
orders of magnitude, while also having a gate-count of only 30K-60K,
one to three orders of magnitude smaller than existing devices.
%T Lower Bounds on the Communication Complexity of Shifting
%A Marco D. Adelfio
%R Technical Report TR2005-548
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-548.pdf
%X
We study the communication complexity of the SHIFT (equivalently,
SUM-INDEX) function in a 3-party simultaneous message model. Alice and
Bob share an n-bit string x and Alice holds an index i and Bob an index
j. They must send messages to a referee who knows only n, i and j,
enabling him to determine x[(i+j) mod n]. Surprisingly, it is possible
to achieve nontrivial savings even with such a strong restriction: Bob
can now make do with only ceil(n/2) bits. Here we show that this bound
is completely tight, for all n. This is an exact lower bound, with no
asymptotics involved.
%Z
Senior Honors Thesis, 2005. Awarded High Honors.
%T Structural Analysis of Social Networks with Wireless Users
%A Guanling Chen
%A David Kotz
%R Technical Report TR2005-549
%I Dartmouth College, Computer Science
%C Hanover, NH
%D July 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-549.pdf
%X
Online interactions between computer users form Internet-based social
networks. In this paper we present a structural analysis of two such
networks with wireless users. In one network the wireless users
participate in a global file-sharing system, and in the other they
interact with each other through a local music-streaming application.
%T Mining Frequent and Periodic Association Patterns
%A Guanling Chen
%A Heng Huang
%A Minkyong Kim
%R Technical Report TR2005-550
%I Dartmouth College, Computer Science
%C Hanover, NH
%D July 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-550.pdf
%X
Profiling the clients' movement behaviors is useful for mobility
modeling, anomaly detection, and location prediction. In this paper,
we study clients' frequent and periodic movement patterns in a campus
wireless network. We use offline data-mining algorithms to discover
patterns from clients' association history, and analyze the reported
patterns using statistical methods. Many of our results reflect the
common characteristics of a typical academic campus, though we also
observed some unusual association patterns. There are two challenges:
one is to remove noise from data for efficient pattern discovery, and
the other is to interpret discovered patterns. We address the first
challenge using a heuristic-based approach applying domain knowledge.
The second issue is harder to address because we do not have
knowledge of people's activities; nonetheless, we could make
reasonable interpretations of the common patterns.
%T More Efficient Secure Function Evaluation Using Tiny Trusted Third Parties
%A Alexander Iliev
%A Sean Smith
%R Technical Report TR2005-551
%I Dartmouth College, Computer Science
%C Hanover, NH
%D July 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-551.pdf
%X
We investigate the use of trustworthy devices, which function as
trusted third parties (TTPs), to solve general two-party Secure
Function Evaluation (SFE) problems. We assume that a really
trustworthy TTP device will have very limited protected memory and
computation environment---a \emph{tiny TTP}. This precludes trivial
solutions like "just run the function in the TTP".
Traditional scrambled circuit evaluation approaches to SFE have a very
high overhead in using indirectly-addressed arrays---every array
access's cost is linear in the array size. The main gain in our
approach is that array access can be provided with much smaller
overhead---$O(\sqrt{N}\log N)$. This expands the horizon of problems
which can be efficiently solved using SFE. Additionally, our technique
provides a simple way to deploy arbitrary programs on tiny TTPs.
In our prototype, we use a larger (and expensive) device, the IBM 4758
secure coprocessor, but we also speculate on the design of future tiny
devices that could greatly improve the current prototype's efficiency
by being optimized for the operations prevalent in our algorithms.
We have prototyped a compiler for the secure function definition
language (SFDL) developed in the Fairplay project. Our compiler
produces an arithmetic circuit, augmented with \emph{array access
gates} which provide more efficient secure access to arrays. We then
have a circuit interpreter in the 4758 to evaluate such a circuit on
given inputs. It does this gate by gate, requiring very little
protected space. We report on the performance of this prototype, which
confirms our approach's strength in handling indirectly-addressed
arrays.
%T On the Design of an Immersive Environment for Security-Related Studies
%A Yougu Yuan
%R Technical Report TR2005-552
%I Dartmouth College, Computer Science
%C Hanover, NH
%D August 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-552.pdf
%X
The Internet has become an essential part of normal operations of both
public and private sectors. Many security issues are not addressed in
the original Internet design, and security now has become a large
concern for networking research and study. There is an imperative need
to have a simulation environment that can be used to help study
security-related research problems. In the thesis we present our
effort to build such an environment: Real-time Immersive Network
Simulation Environment (RINSE). RINSE features flexible configuration
of models using various networking protocols and real-time user
interaction. We also present the Estimate Next Infection (ENI) model
we developed for Internet scanning worms using RINSE, and the effort
of combining multiple resolutions in worm modeling.
%Z
Master's thesis. Advisor: David Nicol.
%T An Untethered, Electrostatic, Globally Controllable MEMS Micro-Robot: Supplementary videos
%A Bruce R. Donald
%A Christopher G. Levey
%A Craig D. McGray
%A Igor Paprotny
%A Daniela Rus
%R Technical Report TR2005-553
%I Dartmouth College, Computer Science
%C Hanover, NH
%D August 2005
%X
We present a steerable, electrostatic, untethered, MEMS micro-robot,
with dimensions of 60 µm by 250 µm by
10 µm. This micro-robot is 1 to 2 orders of magnitude
smaller in size than previous micro-robotic systems. The device
consists of a curved, cantilevered steering arm, mounted on an
untethered scratch drive actuator. These two components are fabricated
monolithically from the same sheet of conductive polysilicon, and
receive a common power and control signal through a capacitive
coupling with an underlying electrical grid. All locations on the grid
receive the same power and control signal, so that the devices can be
operated without knowledge of their position on the substrate and
without constraining rails or tethers. Control and power delivery
waveforms are broadcast to the device through the capacitive power
coupling, and are decoded by the electromechanical response of the
device body. Individual control of the component actuators provides
two distinct motion gaits (forward motion and turning), which together
allow full coverage of a planar workspace (the robot is globally
controllable). These MEMS micro-robots demonstrate turning error of
less than 3.7 °/mm during forward motion, turn with radii as
small as 176 µm, and achieve speeds of over
200 µm/sec, with an average step size of 12 nm. They
have been shown to operate open-loop for distances exceeding
35 cm without failure, and can be controlled through
teleoperation to navigate complex paths. This document contains movies
showing the actuation of the micro-robots during open-loop actuation
and teleoperation experiments. The videos have been sped up for ease
of viewing. On each video, the time-scale is noted in the lower-right
corner of the screen.
%Z
This technical report is a multimedia web page, available at
http://www.cs.dartmouth.edu/reports/TR2005-553.CD/index.html
%T Efficiently Implementing a Large Number of LL/SC Objects
%A Prasad Jayanti
%A Srdjan Petrovic
%R Technical Report TR2005-554
%I Dartmouth College, Computer Science
%C Hanover, NH
%D August 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-554.pdf
%X
Over the past decade, a pair of instructions called load-linked (LL)
and store-conditional (SC) have emerged as the most suitable
synchronization instructions for the design of lock-free algorithms. However, current architectures do not support these instructions;
instead, they support either CAS (e.g., UltraSPARC, Itanium)
or restricted versions of LL/SC (e.g., POWER4, MIPS, Alpha).
Thus, there is a gap between what algorithm designers want (namely, LL/SC) and what multiprocessors actually support (namely, CAS or restricted LL/SC, i.e., RLL/RSC).
To bridge this gap, a flurry of algorithms that implement LL/SC from CAS have appeared in the literature.
The two most recent algorithms are due to Doherty, Herlihy, Luchangco, and Moir (2004)
and Michael (2004).
To implement M LL/SC objects shared by N processes, Doherty et al.'s algorithm uses
only O(N + M) space, but is only non-blocking and not wait-free.
Michael's algorithm, on the other hand, is wait-free, but uses O(N^2 + M) space.
The main drawback of his algorithm is the time complexity of the SC operation:
although the expected amortized running time of SC is only O(1),
the worst-case running time of SC is O(N^2).
The algorithm in this paper overcomes this drawback.
Specifically, we design a wait-free algorithm that achieves a space complexity of O(N^2 + M),
while still maintaining the O(1) worst-case running time for LL and SC operations.
%T The Theory of Trackability with Applications to Sensor Networks
%A Valentino Crespi
%A George V. Cybenko
%A Guofei Jiang
%R Technical Report TR2005-555
%I Dartmouth College, Computer Science
%C Hanover, NH
%D August 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-555.pdf
%X
In this paper, we formalize the concept of tracking in a sensor
network and develop a rigorous theory of {\em trackability} that
investigates the rate of growth of the number of consistent tracks
given a sequence of observations made by the sensor network. The
phenomenon being tracked is modelled by a nondeterministic finite
automaton and the sensor network is modelled by an observer capable of
detecting events related, typically ambiguously, to the states of the
underlying automaton.
More formally, an input string, $Z^t$, of $t+1$ symbols (the
sensor network observations) that is presented to a nondeterministic
finite automaton, $M$, (the model) determines a set, ${\cal H}_M(Z^t)$,
of state sequences (the tracks or hypotheses) that are capable of
generating the input string $Z^t$. We study the growth of the
size of this set, $|{\cal H}_M(Z^t)|$, as a function of the length of
the input string, $t+1$. Our main result is that for a given
automaton and sensor coverage, the worst-case rate of growth is either
polynomial or exponential in $t$, indicating a kind of phase
transition in tracking accuracy.
The techniques we use include the Joint Spectral Radius,
$\rho(\Sigma)$, of a finite set, $\Sigma$, of $(0,1)$-matrices derived
from $M$. Specifically, we construct a set of matrices, $\Sigma$,
corresponding to $M$ with the property that $\rho(\Sigma) \leq 1$ if
and only if $|{\cal H}_M(Z^t)|$ grows polynomially in $t$. We also
prove that for $(0,1)$-matrices, the decision problem
$\rho(\Sigma)\leq 1$ is Turing decidable and, therefore, so is the
problem of deciding whether worst case state sequence growth for a
given automaton is polynomial or exponential. These results have
applications in sensor networks, computer network security and
autonomic computing as well as various tracking problems of recent
interest involving detecting phenomena using noisy observations of
hidden states.
%T Efficient Wait-Free Algorithms for Implementing LL/SC Objects
%A Srdjan Petrovic
%R Technical Report TR2005-556
%I Dartmouth College, Computer Science
%C Hanover, NH
%D August 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-556.pdf
%X
Over the past decade, a pair of instructions called load-linked (LL)
and store-conditional (SC) have emerged as the most suitable
synchronization instructions for the design of lock-free algorithms.
However, current architectures do not support these instructions;
instead, they support either CAS (e.g., UltraSPARC, Itanium, Pentium)
or restricted versions of LL/SC (e.g., POWER4, MIPS, Alpha).
Thus, there is a gap between what algorithm designers want (namely, LL/SC)
and what multiprocessors actually support (namely, CAS or restricted LL/SC).
To bridge this gap, this thesis presents a series of efficient, wait-free
algorithms that implement LL/SC from CAS or restricted LL/SC.
%T Natural Image Statistics for Digital Image Forensics
%A Siwei Lyu
%R Technical Report TR2005-557
%I Dartmouth College, Computer Science
%C Hanover, NH
%D August 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-557.pdf
%X
We describe a set of natural image statistics that are built upon two
multi-scale image decompositions, the quadrature mirror filter pyramid
decomposition and the local angular harmonic decomposition. These
image statistics consist of first- and higher-order statistics that
capture certain statistical regularities of natural images. We propose
to apply these image statistics, together with classification
techniques, to three problems in digital image forensics: (1)
differentiating photographic images from computer-generated
photorealistic images, (2) generic steganalysis, and (3) rebroadcast image
detection. We also apply these image statistics to traditional art
authentication, for forgery detection and identification of artists
in an artwork. For each application we show the effectiveness of these
image statistics and analyze their sensitivity and robustness.
%Z
Ph.D. dissertation. Advisor: Hany Farid.
%T Improving Large-Scale Network Traffic Simulation with Multi-Resolution Models
%A Guanhua Yan
%R Technical Report TR2005-558
%I Dartmouth College, Computer Science
%C Hanover, NH
%D September 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-558.pdf
%X
Simulating a large-scale network like the Internet is a challenging
undertaking because of the sheer volume of its
traffic. Packet-oriented representation provides high-fidelity details
but is computationally expensive; fluid-oriented representation offers
high simulation efficiency at the price of losing packet-level
details. Multi-resolution modeling techniques exploit the advantages
of both representations by integrating them in the same simulation
framework. This dissertation presents solutions to the problems
regarding the efficiency, accuracy, and scalability of the traffic
simulation models in this framework. The ``ripple effect'' is a
well-known problem inherent in event-driven fluid-oriented traffic
simulation, causing explosion of fluid rate changes. Integrating
multi-resolution traffic representations requires estimating arrival
rates of packet-oriented traffic, calculating the queueing delay upon
a packet arrival, and computing packet loss rate under buffer
overflow. Real-time simulation of a large or ultra-large network
demands efficient background traffic simulation. The dissertation
includes a rate smoothing technique that provably mitigates the
``ripple effect'', an accurate and efficient approach that integrates
traffic models at multiple abstraction levels, a sequential algorithm
that achieves real-time simulation of the coarse-grained traffic in a
network with 3 tier-1 ISP (Internet Service Provider) backbones using
an ordinary PC, and a highly scalable parallel algorithm that
simulates network traffic at coarse time scales.
%Z
Ph.D. dissertation. 183 pages.
Advisor: David Nicol.
%T Performance Evaluation of Distributed Security Protocols Using Discrete Event Simulation
%A Meiyuan Zhao
%R Technical Report TR2005-559
%I Dartmouth College, Computer Science
%C Hanover, NH
%D October 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-559.pdf
%X
The Border Gateway Protocol (BGP) that manages inter-domain routing on
the Internet lacks security. Protective measures using public key
cryptography introduce complexities and costs.
To support authentication and other security functionality in large
networks, we need public key infrastructures (PKIs). Protocols that
distribute and validate certificates introduce additional complexities
and costs. The certification path building algorithm that helps users
establish trust in certificates in the distributed network environment
is particularly complicated.
Neither routing security nor PKI comes for free. Prior to this work,
research on the performance of these large-scale
distributed security systems was minimal. In this thesis, we evaluate
the performance of BGP security protocols and PKI systems. We address
the questions of how performance affects protocol behavior and
how we can improve the efficiency of these distributed protocols to
bring them one step closer to reality.
The complexity of the Internet makes an analytical approach difficult,
and the scale of the Internet makes empirical approaches
unworkable as well. Consequently, we take the approach of simulation. We have
built simulation frameworks to model a number of BGP security
protocols and the PKI system. We have identified performance problems
of Secure BGP (S-BGP), a primary BGP security protocol, and proposed
and evaluated Signature Amortization (S-A) and Aggregated Path
Authentication (APA) schemes that significantly improve efficiency of
S-BGP without compromising security. We have also built a simulation
framework for general PKI systems and evaluated certification path
building algorithms, a critical part of establishing trust in
Internet-scale PKI, and used this framework to improve algorithm
performance.
%Z
Ph.D. Thesis
%T Combinatorial Theorems about Embedding Trees on the Real Line
%A Amit Chakrabarti
%A Subhash Khot
%R Technical Report TR2005-560
%I Dartmouth College, Computer Science
%C Hanover, NH
%D October 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-560.pdf
%X
We consider the combinatorial problem of embedding a tree metric into
the real line with low distortion. For two special families of trees ---
the family of complete binary trees and the family of subdivided stars
--- we provide embeddings whose distortion is provably optimal, up to a
constant factor. We also prove that the optimal distortion of a linear
embedding of a tree can be arbitrarily low or high even when it has
bounded degree.
%T A Quasi-PTAS for Unsplittable Flow on Line Graphs
%A Nikhil Bansal
%A Amit Chakrabarti
%A Amir Epstein
%A Baruch Schieber
%R Technical Report TR2005-561
%I Dartmouth College, Computer Science
%C Hanover, NH
%D October 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-561.pdf
%X
We study the Unsplittable Flow Problem (UFP) on a line graph, focusing
on the long-standing open question of whether the problem is APX-hard.
We describe a deterministic quasi-polynomial time approximation scheme
for UFP on line graphs, thereby ruling out an APX-hardness result,
unless NP is contained in DTIME(2^polylog(n)). Our result
requires a quasi-polynomial bound on all edge capacities and demands in
the input instance.
Earlier results on this problem included a polynomial time
(2+epsilon)-approximation under the assumption that no demand exceeds any
edge capacity (the "no-bottleneck assumption") and a super-constant
integrality gap if this assumption did not hold. Unlike most earlier
work on UFP, our results do not require a no-bottleneck assumption.
%T Computation reuse in stacking and unstacking
%A Anne Loomis
%R Technical Report TR2005-563
%I Dartmouth College, Computer Science
%C Hanover, NH
%D November 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-563.pdf
%X
Algorithms for dynamic simulation and control are fundamental to many
applications, including computer games and movies, medical simulation,
and mechanical design. I propose to explore efficient algorithms for
finding a stable unstacking sequence -- an order in which we can
remove every object from a structure without causing the structure to
collapse under gravity at any step.
We begin with a basic unstacking sequence algorithm: consider the
set of all objects in a structure. Collect all possible subsets into a
disassembly graph. Search the graph, testing the stability of each
node as it is visited. Any path of stable nodes from start to goal is
a stable unstacking sequence.
I propose to show how we can improve the performance of individual
stability tests for three-dimensional structures with Coulomb
friction, and give effective methods for searching the disassembly
graph. I will also analyze the computational complexity of stable
unstacking problems, and explore a classification of structures based
on characteristics of their stable unstacking sequences.
In preliminary work, I have shown that we can reuse computation from
one stability test of a planar subassembly to the next. The
implementation, which solves the system dynamics as a linear
complementarity problem (LCP), outperforms an implementation that
solves the system statics as a linear program (LP). This is
surprising because LCPs are more complex than LPs, and dynamics
equations are more complex than statics equations.
%Z
Master's thesis proposal.
%T A Steerable, Untethered, 250x60 micron MEMS Mobile Micro-Robot
%A Bruce R. Donald
%A Christopher G. Levey
%A Craig D. McGray
%A Igor Paprotny
%A Daniela Rus
%R Technical Report TR2005-564
%I Dartmouth College, Computer Science
%C Hanover, NH
%D December 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-564.pdf
%X
We present a steerable, electrostatic, untethered, MEMS micro-robot,
with dimensions of 60 µm by 250 µm by 10 µm. This
micro-robot is 1 to 2 orders of magnitude smaller in size
than previous micro-robotic systems. The device consists of a
curved, cantilevered steering arm, mounted on an untethered scratch
drive actuator. These two components are fabricated monolithically
from the same sheet of conductive polysilicon, and receive a common
power and control signal through a capacitive coupling with an
underlying electrical grid. All locations on the grid receive the
same power and control signal, so that the devices can be operated
without knowledge of their position on the substrate and without
constraining rails or tethers. Control and power delivery waveforms
are broadcast to the device through the capacitive power coupling,
and are decoded by the electromechanical response of the device
body. Individual control of the component actuators provides two
distinct motion gaits (forward motion and turning), which together
allow full coverage of a planar workspace (the robot is globally
controllable). These MEMS micro-robots demonstrate turning error of
less than 3.7 °/mm during forward motion, turn with
radii as small as 176 µm, and achieve speeds of over 200
µm/sec, with an average step size of 12 nm. They have been shown
to operate open-loop for distances exceeding 35 cm without failure,
and can be controlled through teleoperation to navigate complex
paths.
%Z
A revised version of this paper has appeared in the
proceedings of the 12th International Symposium of Robotics Research
(ISRR 2005), San Francisco, CA, October 12th-15th, 2005.
%T How hard is it to cheat in the Gale-Shapley Stable Matching Algorithm
%A Chien-Chung Huang
%R Technical Report TR2005-565
%I Dartmouth College, Computer Science
%C Hanover, NH
%D December 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-565.pdf
%X
We study strategy issues surrounding the stable marriage
problem. Under the Gale-Shapley algorithm (with men proposing),
a classical theorem says that it is impossible for every liar
to get a better partner. We try to challenge this theorem.
First, observing a loophole in the statement of the theorem,
we devise a coalition strategy in which a non-empty subset of the
liars gets a better partner and no man is worse off than before.
This strategy is restricted in that not everyone has the incentive
to cheat. We attack the classical theorem further
by means of randomization. However, this theorem shows
surprising robustness: it is impossible that every liar has the
chance to improve while no one gets hurt. Hence, this impossibility
result indicates that it is always hard to induce some people to falsify their
lists. Finally, to overcome this lack of motivation, we exhibit
another randomized lying strategy in which every liar can expect to
get a better partner, though with a chance of getting a worse one.
%T A Combined Routing Method for Ad hoc Wireless Networks
%A Zhenhui Jiang
%R Technical Report TR2005-566
%I Dartmouth College, Computer Science
%C Hanover, NH
%D December 2005
%U http://www.cs.dartmouth.edu/reports/TR2005-566.pdf
%X
To make ad hoc wireless networks adaptive to different mobility and
traffic patterns, we studied in this thesis an approach to swap from
one protocol to another protocol dynamically, while routing
continues. By inserting a new layer, we were able to make the
nodes in the ad hoc wireless network notify each other about the
protocol swap. To ensure that routing works efficiently after the
protocol swap, we initialized the destination routing protocol's data
structures and reused the previous routing information to build the
new routing table. We also tested our approach under different network
topologies and traffic patterns in static networks to learn whether
the swap is fast and whether the swap incurs too much overhead. We
found that the swap latency is related to the destination protocol and
the topology of the network. We also found that the control packet
ratio after swap is close to the protocol running without swap, which
means our method does not incur too many control packets for swap.
%Z
Master's thesis. Advisor: David Kotz.
A shortened and updated version of this work appears as
TR2007-588.
%T A Novel Minimized Dead-End Elimination Criterion and Its Application to Protein Redesign in a Hybrid Scoring and Search Algorithm for Computing Partition Functions over Molecular Ensembles
%A Ivelin Georgiev
%A Ryan H. Lilien
%A Bruce R. Donald
%R Technical Report TR2006-570
%I Dartmouth College, Computer Science
%C Hanover, NH
%D January 2006
%U http://www.cs.dartmouth.edu/reports/TR2006-570.pdf
%X
Novel molecular function can be achieved by redesigning an enzyme's active site so that it will perform its chemical reaction on a novel substrate. One of the main challenges for protein redesign is the efficient evaluation of a combinatorial number of candidate structures. The modeling of protein flexibility, typically by using a rotamer library of commonly-observed low-energy side-chain conformations, further increases the complexity of the redesign problem. A dominant algorithm for protein redesign is Dead-End Elimination (DEE), which prunes the majority of candidate conformations by eliminating rigid rotamers that provably are not part of the Global Minimum Energy Conformation (GMEC). The identified GMEC consists of rigid rotamers (i.e., rotamers that have not been energy-minimized) and is thus referred to as the rigid-GMEC. As a post-processing step, the conformations that survive DEE may be energy-minimized. When energy minimization is performed after pruning with DEE, the combined protein design process becomes heuristic, and is no longer provably accurate: a conformation that is pruned using rigid-rotamer energies may subsequently minimize to a lower energy than the rigid-GMEC. That is, the rigid-GMEC and the conformation with the lowest energy among all energy-minimized conformations (the minimized-GMEC) are likely to be different. While the traditional DEE algorithm succeeds in not pruning rotamers that are part of the rigid-GMEC, it makes no guarantees regarding the identification of the minimized-GMEC. In this paper we derive a novel, provable, and efficient DEE-like algorithm, called minimized-DEE (MinDEE), that guarantees that rotamers belonging to the minimized-GMEC will not be pruned, while still pruning a combinatorial number of conformations. 
We show that MinDEE is useful not only in identifying the minimized-GMEC, but also as a filter in an ensemble-based scoring and search algorithm for protein redesign that exploits energy-minimized conformations. We compare our results both to our previous computational predictions of protein designs and to biological activity assays of predicted protein mutants. Our provable and efficient minimized-DEE algorithm is applicable in protein redesign, protein-ligand binding prediction, and computer-aided drug design.
%Z
A revised version of this paper will appear in the Proceedings of the Tenth Annual International Conference on Research in Computational Molecular Biology (RECOMB), Venice Lido, Italy, April 2006.
%T Secure Context-sensitive Authorization
%A Kazuhiro Minami
%R Technical Report TR2006-571
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2006
%U http://www.cs.dartmouth.edu/reports/TR2006-571.pdf
%X
Pervasive computing leads to an increased integration between the real world
and the computational world, and many applications in pervasive computing adapt to
the user's context, such as the location of the user and relevant devices, the
presence of other people, light or sound conditions, or available network
bandwidth, to meet a user's continuously changing requirements without taking
explicit input from the users.
We consider a class of applications that wish to consider a user's context
when deciding whether to authorize a user's access to important physical or
information resources. Such a context-sensitive authorization scheme is
necessary when a mobile user moves across multiple administrative domains
where they are not registered in advance. Also, users interacting with their
environment need a non-intrusive way to access resources, and clues about
their context may be useful input into authorization policies for these
resources. Existing systems for context-sensitive authorization take a
logic-based approach, because a logical language makes it possible to define a
context model where a contextual fact is expressed with a boolean predicate
and to derive higher-level context information and authorization decisions
from contextual facts.
However, those existing context-sensitive authorization systems have a central
server that collects context information, and evaluates policies to make
authorization decisions on behalf of a resource owner. A centralized solution
assumes that all resource owners trust the server to make correct decisions,
and all users trust the server not to disclose private context information.
In many realistic applications of pervasive computing, however, the resources,
users, and sources of context information are inherently distributed among
many organizations that do not necessarily trust each other. Resource owners
may not trust the integrity of context information produced by another domain,
and context sensors may not trust others with the confidentiality of data they
provide about users.
In this thesis, we present a secure distributed proof system for
context-sensitive authorization. Our system enables multiple hosts to evaluate
an authorization query in a peer-to-peer way, while preserving the
confidentiality and integrity policies of mutually untrusted principals
running those hosts. We also develop a novel caching and revocation mechanism
to support context-sensitive policies that refer to information in dozens of
different administrative domains. Contributions of this thesis include the
definition of fine-grained security policies that specify trust relations
among principals in terms of information confidentiality and integrity, the
design and implementation of a secure distributed proof system, a proof for
the correctness of our algorithm, and a performance evaluation showing
that the amortized performance of our system scales to dozens of servers in
different domains.
%Z
Doctoral dissertation. Advisor: David Kotz.
%T A simple computational method for the identification of disease-associated loci in complex, incomplete pedigrees
%A Gregory Leibon
%A Dan Rockmore
%A Martin R. Pollak
%R Technical Report TR2006-573
%I Dartmouth College, Computer Science
%C Hanover, NH
%D March 2006
%U http://www.cs.dartmouth.edu/reports/TR2006-573.pdf
%X
We present an approach, called the Shadow Method, for the identification of disease loci from dense genetic marker maps in complex, potentially incomplete pedigrees. Shadow is a simple method based on an analysis of the patterns of obligate meiotic recombination events in genotypic data. This method can be applied to any high-density marker map and was specifically designed to exploit the fact that extremely dense marker maps are becoming more readily available. We also describe how to interpret the results and associate meaningful P-values with them. Shadow has significant advantages over traditional parametric linkage analysis methods in that it can be readily applied even in cases in which the topology of a pedigree or pedigrees can only be partially determined. In addition, Shadow is robust to variability in a range of parameters and in particular does not require prior knowledge of mode of inheritance, penetrance, or clinical misdiagnosis rate. Shadow can be used for any SNP data, but is especially effective when applied to dense samplings. Our primary example uses data from Affymetrix 100k SNPChip samples; we illustrate our approach by analyzing simulated data as well as genome-wide SNP data from two pedigrees with inherited forms of kidney failure, one of which is compared with a typical LOD score analysis.
%T Limited Delegation (Without Sharing Secrets) in Web Applications
%A Nicholas J. Santos
%R Technical Report TR2006-574
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2006
%U http://www.cs.dartmouth.edu/reports/TR2006-574.pdf
%X
Delegation is the process wherein an entity Alice designates
an entity Bob to speak on her behalf. In password-based security
systems, delegation is easy: Alice gives Bob her password.
This is a useful feature, and is used often in the real world.
But it's also problematic. When Alice shares her password,
she must delegate all her permissions, but she may wish to
delegate a limited set. Also, as we move towards PKI-based
systems, secret-sharing becomes impractical. This thesis explores
one solution to these problems. We use proxy certificates in a
non-standard way so that user Alice can delegate a subset of her
privileges to user Bob in a secure, decentralized way for web
applications.
We identify how delegation changes the semantics of access
control, then build a system to demonstrate these possibilities
in action. An extension on top of Mozilla's Firefox web browser
allows a user to create and use proxy certificates for delegation,
and a module on top of the Apache web server accepts multiple
chains of these certificates. This is done in a modified SSL
session that should not break current SSL implementations.
%Z
Senior Honors Thesis. Advisor: Sean W. Smith.
%T Computation Reuse in Statics and Dynamics Problems for Assemblies of Rigid Bodies
%A Anne Loomis
%R Technical Report TR2006-576
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2006
%U http://www.cs.dartmouth.edu/reports/TR2006-576.pdf
%X
The problem of determining the forces among contacting rigid bodies is
fundamental to many areas of robotics, including manipulation
planning, control, and dynamic simulation. For example, consider the
question of how to unstack an assembly, or how to find stable regions
of a rubble pile. In considering problems of this type over discrete
or continuous time, we often encounter a sequence of problems with
similar substructure. The primary contribution of our work is the
observation that in many cases, common physical structure can be
exploited to solve a sequence of related problems more efficiently
than if each problem were considered in isolation.
We examine three general problems concerning rigid-body assemblies:
dynamic simulation, assembly planning, and assembly stability given
limited knowledge of the structure's geometry.
To approach the dynamic simulation and assembly planning applications,
we have optimized a known method for solving the system dynamics. The
accelerations of and forces among contacting rigid bodies may be
computed by formulating the dynamics equations and contact constraints
as a complementarity problem. Dantzig's algorithm, when applicable,
takes n or fewer major cycles to find a solution to the linear
complementarity problem corresponding to an assembly with n contacts.
We show that Dantzig's algorithm will find a solution in n - k or
fewer major cycles if the algorithm is initialized with a solution to
the dynamics problem for a subassembly with k internal contacts.
Finally, we show that if we have limited knowledge of a structure's
geometry, we can still learn about stable regions of its surface by
physically pressing on it. We present an approach for finding stable
regions of planar assemblies: sample presses on the surface to
identify a stable cone in wrench space, partition the space of
applicable wrenches into stable and unstable regions, and map these
back to the surface of the structure.
%Z
Master's thesis.
%T SAMPLED: Shared Anonymous Music PLayback using wirelEss Devices
%A Constantinos C. Neophytou
%R Technical Report TR2006-577
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2006
%U http://www.cs.dartmouth.edu/reports/TR2006-577.pdf
%X
Recent advances in mobile computing enable many new applications, yet at the
same time create privacy implications caused by the increasing amount of data
that becomes available. This thesis will explore the possibilities of
wireless-enabled portable devices and their attending privacy implications.
We will describe how such a device containing personal information about the
musical preferences of its user can help improve the user's experience in a
social setting where music is played for all, and at the same time preserve
each user's privacy.
%Z
Undergraduate project; advisor Tristan Henderson.
%T Wait-Free and Obstruction-Free Snapshot
%A Khanh Do Ba
%R Technical Report TR2006-578
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2006
%U http://www.cs.dartmouth.edu/reports/TR2006-578.pdf
%X
The snapshot problem was first proposed over a decade ago and has
since been well-studied in the distributed algorithms community. The
challenge is to design a data structure consisting of $m$ components,
shared by up to $n$ concurrent processes, that supports two
operations. The first, $Update(i,v)$, atomically writes $v$ to the
$i$th component. The second, $Scan()$, returns an atomic snapshot of
all $m$ components. We consider two termination properties:
wait-freedom, which requires a process to always terminate in a
bounded number of its own steps, and the weaker obstruction-freedom,
which requires such termination only for processes that eventually
execute uninterrupted.
First, we present a simple, time and space optimal, obstruction-free
solution to the single-writer, multi-scanner version of the snapshot
problem (wherein concurrent Updates never occur on the same
component). Second, we assume hardware support for compare&swap (CAS)
to give a time-optimal, wait-free solution to the multi-writer,
single-scanner snapshot problem (wherein concurrent Scans never
occur). This algorithm uses only $O(mn)$ space and has optimal CAS,
write and remote-reference complexities. Additionally, it can be
augmented to implement a general snapshot object with the same time
and space bounds, thus improving on the $O(mn^2)$ space complexity of
the only previously known time-optimal solution.
%Z
Senior Honors Thesis. Advisor: Prasad Jayanti.
%T Metric Measurements on a Plane from a Single Image
%A Micah K. Johnson
%A Hany Farid
%R Technical Report TR2006-579
%I Dartmouth College, Computer Science
%C Hanover, NH
%D August 2006
%U http://www.cs.dartmouth.edu/reports/TR2006-579.pdf
%X
The past decade has seen considerable advances in the application of
principles from projective geometry to problems in image analysis and
computer vision. In this paper, we review a subset of this work, and
leverage these results for the purpose of forensic analysis.
Specifically, we review three techniques for making metric
measurements on planar surfaces from a single image. The resulting
techniques should prove useful in forensic settings where real-world
measurements are required.
%T Visualizing Paths in Context
%A Fabio Pellacini
%A Lori Lorigo
%A Geri Gay
%R Technical Report TR2006-580
%I Dartmouth College, Computer Science
%C Hanover, NH
%D April 2006
%U http://www.cs.dartmouth.edu/reports/TR2006-580.pdf
%X
Data about movement through a space is increasingly becoming available for capture and analysis. In many applications, this data is captured or modeled as transitions between a small number of areas of interest, or a finite set of states, and these transitions constitute paths in the space. Similarities and differences between paths are of great importance to such analyses, but can be difficult to assess. In this work we present a visualization approach for representing paths in context, where individual paths can be compared to other paths or to a group of paths. Our approach summarizes path behavior using a simple circular layout, including information about state and transition likelihood using Markov random models, together with information about specific path and state behavior. The layout avoids line crossovers entirely, making it easy to observe patterns while reducing visual clutter. In our tool, paths can either be compared in their natural sequence or by aligning multiple paths using Multiple Sequence Alignment, which can better highlight path similarities. We applied our technique to eye tracking data and cell phone tower data used to capture human movement.
%T Cheating to Get Better Roommates in a Random Stable Matching
%A Chien-Chung Huang
%R Technical Report TR2006-582
%I Dartmouth College, Computer Science
%C Hanover, NH
%D December 2006
%U http://www.cs.dartmouth.edu/reports/TR2006-582.pdf
%X
This paper addresses strategies for the stable
roommates problem, assuming that a stable matching is chosen at
random. We investigate how a cheating man should permute his
preference list so that he has a higher-ranking roommate
probabilistically.
In the first part of the paper, we identify a necessary condition
for creating a new stable roommate for the cheating man. This
condition precludes any possibility of his getting a new
roommate ranking higher than all his stable roommates when everyone
is truthful. Generalizing
to the case that multiple men collude, we derive another
impossibility result: given any stable matching in which
a subset of men get their best possible roommates, they
cannot cheat to create a new stable matching in which they all get strictly
better roommates than in the given matching.
Our impossibility result, considered in the context of the stable marriage
problem, easily re-establishes the celebrated Dubins-Freedman Theorem.
The more general Demange-Gale-Sotomayor Theorem states that
a coalition of men and women cannot cheat to create a stable
matching in which every one of them gets a strictly better partner
than in the Gale-Shapley algorithm (with men proposing).
We give a sharper result: a coalition of men and women
cannot cheat together so that, in a newly-created stable matching,
every man in the coalition gets a strictly better partner than
in the Gale-Shapley algorithm while none of the women
in the coalition is worse off.
In the second part of the paper, we present two cheating strategies
that guarantee that the cheating man's new probability
distribution over stable roommates majorizes the original one. These
two strategies do not require the knowledge of
the probability distribution of the
cheating man. This is important because the problem of counting stable
matchings is \#P-complete. Our strategies only require knowing the
set of stable roommates that the cheating man has and
can be formulated in polynomial time. Our second cheating strategy
has an interesting corollary in the context of
stable marriage with the Gale-Shapley algorithm.
Any woman-optimal strategy will ensure that every woman,
cheating or otherwise, ends up with a partner at least as good as when everyone is truthful.
%T Digital Image Ballistics from JPEG Quantization
%A Hany Farid
%R Technical Report TR2006-583
%I Dartmouth College, Computer Science
%C Hanover, NH
%D September 2006
%U http://www.cs.dartmouth.edu/reports/TR2006-583.pdf
%X
Most digital cameras export images in the JPEG file format. This
lossy compression scheme employs a quantization table that controls
the amount of compression achieved. Different cameras typically
employ different tables. A comparison of an image's quantization
scheme to a database of known cameras affords a simple technique for
confirming or denying an image's source. Similarly, comparison to a
database of photo-editing software can be used in a forensic setting
to determine if an image was edited after its original recording.
%T Tools and algorithms to advance interactive intrusion analysis via Machine Learning and Information Retrieval
%A Javed Aslam
%A Sergey Bratus
%A Virgil Pavlu
%R Technical Report TR2006-584
%I Dartmouth College, Computer Science
%C Hanover, NH
%D September 2006
%U http://www.cs.dartmouth.edu/reports/TR2006-584.pdf
%X
We consider typical tasks that arise in the intrusion analysis of log data from the perspectives of Machine Learning and Information Retrieval, and we study a number of data organization and interactive learning techniques to improve the analyst's efficiency. In doing so, we attempt to translate intrusion analysis problems into the language of the above-mentioned disciplines and to offer metrics to evaluate the effect of proposed techniques. The Kerf toolkit contains prototype implementations of these techniques, as well as data transformation tools that help bridge the gap between the real-world log data formats and the ML and IR data models.
We also describe the log representation approach that Kerf prototype tools are based on. In particular, we describe the connection between decision trees, automatic classification algorithms and log analysis techniques implemented in Kerf.
%T Path Planning Algorithms under the Link-Distance Metric
%A David P. Wagner
%R Technical Report TR2006-585
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2006
%U http://www.cs.dartmouth.edu/reports/TR2006-585.pdf
%X
The Traveling Salesman Problem and the Shortest Path Problem
are famous problems in computer science which have been well
studied when the objective is measured using the Euclidean distance.
Here we examine these geometric problems under a different set of
optimization criteria. Rather than considering the total distance
traversed by a path, this thesis looks at reducing the number of
times a turn is made along that path, or equivalently, at reducing
the number of straight lines in the path.
Minimizing this objective value, known as the link-distance, is useful
in situations where continuing in a given direction is cheap, while turning
is a relatively expensive operation. Applications exist in VLSI, robotics,
wireless communications, space travel, and other fields where it is
desirable to reduce the number of turns.
This thesis examines rectilinear and non-rectilinear variants of the
Traveling Salesman Problem under this metric. The objective of these
problems is to find a path visiting a set of points which has the smallest
number of bends. A 2-approximation algorithm is given for the rectilinear
problem, while for the non-rectilinear problem, an O(log n)-approximation
algorithm is given. The latter problem is also shown to be NP-Complete.
Next, the Rectilinear Minimum Link-Distance Problem, also known as
the Minimum Bends Path Problem, is considered. Here the objective
is to find a rectilinear path between two points among rectilinear obstacles
which has the minimum number of bends, while avoiding passing through
any of the obstacles. The problem has been well studied in two dimensions,
but is relatively unexplored in higher dimensions. A main result of this
thesis is an O(n^{5/2} log n) time algorithm solving this problem in three
dimensions. Previously known algorithms have had worst-case running
times of Omega(n^3).
This algorithm requires a data structure that supports efficient operations
on pointsets within rectangular regions of the Euclidean plane. We
design a new data structure, which is a variation on the segment tree,
in order to support these operations.
Finally, an implementation of the data structure and of the algorithm
solving the Minimum Link-Distance Problem demonstrates their
experimental running times and ease of implementation.
%Z
Doctoral dissertation.
Co-Advisors: Robert Scot Drysdale, Clifford Stein;
Thesis Committee: Amit Chakrabarti, Joseph S. B. Mitchell
%T People-Centric Urban Sensing: Security Challenges for the New Paradigm
%A Peter Johnson
%A Apu Kapadia
%A David Kotz
%A Nikos Triandopoulos
%R Technical Report TR2007-586
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2007
%U http://www.cs.dartmouth.edu/reports/TR2007-586.pdf
%X
We study the security challenges that arise in \emph{people-centric
urban sensing}, a new sensor-networking paradigm that leverages
humans as part of the sensing infrastructure. Most prior work on
sensor networks has focused on collecting and processing ephemeral
data about the environment using a static topology and an
application-aware infrastructure. People-centric urban sensing,
however, involves collecting, storing, processing and fusing large
volumes of data related to every-day human activities. Sensing is
performed in a highly dynamic and mobile environment, and supports
(among other things) pervasive computing applications that are
focused on enhancing the user's experience. In such a setting, where
humans are the central focus, there are new challenges for
information security; not only because of the complex and dynamic
communication patterns, but also because the data originates from
sensors that are carried by a person---not a tiny sensor thrown in
the forest or mounted on the neck of an animal. In
this paper we aim to instigate discussion about this critical
issue---because people-centric sensing will never succeed without
adequate provisions for security and privacy. To that end, we
outline several important challenges and suggest general solutions
that hold promise in this new paradigm of sensor networks.
%T Protein Design by Mining and Sampling an Undirected Graphical Model of Evolutionary Constraints
%A John Thomas
%A Naren Ramakrishnan
%A Chris Bailey-Kellogg
%R Technical Report TR2007-587
%I Dartmouth College, Computer Science
%C Hanover, NH
%D March 2007
%U http://www.cs.dartmouth.edu/reports/TR2007-587.pdf
%X
Evolutionary pressures on proteins to maintain structure and function have constrained their sequences over time and across species. The sequence record thus contains valuable information regarding the acceptable variation and covariation of amino acids in members of a protein family. When designing new members of a protein family, with an eye toward modified or improved stability or functionality, it is incumbent upon a protein engineer to uncover such constraints and design conforming sequences. This paper develops such an approach for protein design: we first mine an undirected probabilistic graphical model of a given protein family, and then use the model generatively to sample new sequences. While sampling from an undirected model is difficult in general, we present two complementary algorithms that effectively sample the sequence space constrained by our protein family model. One algorithm focuses on the high-likelihood regions of the space. Sequences are generated by sampling the cliques in a graphical model according to their likelihood while maintaining neighborhood consistency. The other algorithm designs a fixed number of high-likelihood sequences that are reflective of the amino acid composition of the given family. A set of shuffled sequences is iteratively improved so as to increase their mean likelihood under the model. Tests for two important protein families, WW domains and PDZ domains, show that both sampling methods converge quickly and generate diverse high-quality sets of sequences for further biological study.
%Z
Submitted to KDD 2007.
%T A Combined Routing Method for Wireless Ad Hoc Networks
%A Soumendra Nanda
%A Zhenhui Jiang
%A David Kotz
%R Technical Report TR2007-588
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2007
%U http://www.cs.dartmouth.edu/reports/TR2007-588.pdf
%X
To make ad hoc wireless networks adaptive to different mobility and traffic patterns, this paper proposes an approach to swap from one protocol to another protocol dynamically, while routing continues. By the insertion of a thin new layer, we were able to make the nodes in the ad hoc wireless network notify one another about the protocol swap. To ensure that routing works efficiently after the protocol swap, we initialized the destination routing protocol's data structures and reused the previous routing information to build the new routing table. We also tested our approach under different network topologies and traffic patterns in static networks to learn whether the swap was fast and whether the swap incurred too much overhead. We found that the swap latency was related to the nature of the destination protocol and the topology of the network. We also found that the control packet ratio after swap was close to that of the protocol running without swap, which indicates that our method does not incur too much overhead for the swap.
%Z
This report is superseded by TR2009-641; please see that report instead.
This report is a shortened and updated version of the M.S. thesis that appeared as
TR2005-566.
%T Lighting with Sketches
%A Alexander Wakefield Steinberg
%R Technical Report TR2007-589
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2007
%U http://www.cs.dartmouth.edu/reports/TR2007-589.pdf
%X
Lighting design is a fundamental aspect of computer cinematography, where it is used to support storytelling by affecting the mood, style, and believability of a scene. Traditionally, lighting has required the tedious adjustment of large sets of parameters that describe complex lighting setups, including light positions, colors, shapes, etc. This work presents an interactive user interface that facilitates lighting workflow by using a sketching paradigm for light creation. Lights are specified by a series of strokes that define various properties of illumination such as the shape of the light and the position of illuminated and shadowed areas. The system will then perform a nonlinear optimization over all the light parameters to find a match to the controlling sketches. To demonstrate our prototype system, we lit a simple scene fully with our application, showing that sketching paradigms are promising for facilitating the lighting workflow.
%Z
Advisor: Fabio Pellacini
%T Secure Cryptographic Precomputation with Insecure Memory
%A Patrick P. Tsang
%A Sean W. Smith
%R Technical Report TR2007-590
%I Dartmouth College, Computer Science
%C Hanover, NH
%D July 2007
%U http://www.cs.dartmouth.edu/reports/TR2007-590.pdf
%X
Precomputation dramatically reduces the execution latency of many
cryptographic algorithms. To sustain the reduced latency over time
during which these algorithms are routinely invoked, however, a pool
of precomputation results must be stored and be readily available.
While precomputation is an old and well-known technique, how to
securely and yet efficiently store these precomputation results has
largely been ignored. For instance, requiring tamper-proof memory
would be too expensive, if not unrealistic, for precomputation to be
cost-effective.
In this paper, we propose an architecture that provides secure
storage for cryptographic precomputation using only insecure memory,
which may be eavesdropped on or even tampered with. Specifically, we
design a small tamper-resistant hardware module that we call the
{\em Queue Security Proxy (QSP)}, which sits on the data-path
between the processor and the insecure memory. Our analysis shows
that our design is secure, efficient, flexible and yet inexpensive.
In particular, our design's timing overhead and hardware cost are
independent of the storage size.
We also discuss in this paper several interesting extensions to our
proposed architecture. We plan to prototype our design assuming the
scenario of precomputing DSA signatures, effectively building a
cost-effective low-latency DSA signing secure coprocessor.
%T Dumbots: Unexpected Botnets through Networked Embedded Devices
%A Kwang-Hyun Baek
%A Sergey Bratus
%A Sara Sinclair
%A Sean W. Smith
%R Technical Report TR2007-591
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2007
%U http://www.cs.dartmouth.edu/reports/TR2007-591.pdf
%X
Currently, work on botnets focuses primarily on PCs. However, as lightweight computing devices with
embedded operating systems become more ubiquitous, they present a new and very disturbing target
for botnet developers. In this paper, we present both an empirical demonstration on a widely deployed
multimedia box and an evaluation of the deeper potential of these dumbots.
%T SCML: A Structural Representation for Chinese Characters
%A Daniel G. Peebles
%R Technical Report TR2007-592
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2007
%U http://www.cs.dartmouth.edu/reports/TR2007-592.pdf
%X
Chinese characters are used daily by well over a billion people. They
constitute the main writing system of China and Taiwan, form a major
part of written Japanese, and are also used in South Korea. Anything
more than a cursory glance at these characters will reveal a high
degree of structure to them, but computing systems do not currently
have a means to operate on this structure. Existing character
databases and dictionaries treat them as numerical code points, and
associate with them additional `hand-computed' data, such as stroke
count, stroke order, and other information to aid in specific
searches. Searching by a character's `shape' is effectively impossible
in these systems.
I propose a new approach to representing these characters, through an
XML-based language called SCML. This language, by encoding an abstract
form of a character, allows the direct retrieval of important
information such as stroke count and stroke order, and permits useful
but previously impossible automated analysis of characters. In
addition, the system allows the design of a view that takes abstract
SCML representations as character models and outputs glyphs based on
an aesthetic, facilitating the creation of `meta-fonts' for Chinese
characters. Finally, through the creation of a specialized database,
SCML allows for efficient structural character queries to be performed
against the body of inserted characters, thus allowing people to
search by the most obvious of a character's characteristics: its
shape.
%Z
Senior Honors Thesis. Advisor: Devin Balkcom.
%T Closest and Farthest-Line Voronoi Diagrams in the Plane
%A Mark C. Henle
%R Technical Report TR2007-593
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2007
%U http://www.cs.dartmouth.edu/reports/TR2007-593.pdf
%X
Voronoi diagrams are a geometric structure containing proximity
information useful in efficiently answering a number of common
geometric problems associated with a set of points in the plane. They
have applications in fields ranging from crystallography to biology.
Diagrams of sites other than points and with different distance
metrics have been studied. This paper examines the Voronoi diagram of
a set of lines, which has escaped study in the computational geometry
literature.
The combinatorial and topological properties of the
closest and farthest Voronoi diagrams are analyzed and O(n^2) and
O(n log n) algorithms are presented for their computation
respectively.
%Z
Senior Honors Thesis. Advisor: Robert L. Scot Drysdale
%T Exploring the Integration of Memory Management and Trusted Computing
%A Nihal A. D'Cunha
%R Technical Report TR2007-594
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2007
%U http://www.cs.dartmouth.edu/reports/TR2007-594.pdf
%X
This thesis addresses vulnerabilities in current Trusted Computing
architecture by exploring a design for a better Trusted Platform
Module (TPM); one that integrates more closely with the CPU's Memory
Management Unit (MMU). We establish that software-based attacks on
trusted memory can be carried out undetectably by an adversary on
current TCG/TPM implementations. We demonstrate that an attacker with
sufficient privileges can compromise the integrity of a TPM-protected
system by modifying critical loaded code and static data after
measurement has taken place. More specifically, these attacks
illustrate the Time Of Check vs. Time of Use (TOCTOU) class of
attacks.
We propose to enhance the MMU, enabling it to detect when memory
containing trusted code or data is being maliciously modified at
run-time. On detection, it should be able to notify the TPM of these
modifications. We seek to use the concepts of selective memory
immutability as a security tool to harden the MMU, which will result
in a more robust TCG/TPM implementation. To substantiate our ideas for
this proposed hardware feature, we designed and implemented a software
prototype system, which employs the monitoring capabilities of the Xen
virtual machine monitor.
We performed a security evaluation of our prototype and validated
that it can detect all our software-based TOCTOU attacks. We applied
our prototype to verify the integrity of data associated with an
application, as well as suggested and implemented ways to prevent
unauthorized use of data by associating it with its owner process. Our
performance evaluation reveals minimal overhead.
%Z
M.S. Thesis. Advisor: Sean Smith.
%T When One Pipeline Is Not Enough
%A Thomas H. Cormen
%A Priya Natarajan
%A Elena Riccio Davidson
%R Technical Report TR2007-596
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2007
%U http://www.cs.dartmouth.edu/reports/TR2007-596.pdf
%X
Pipelines that operate on buffers often work well to mitigate the high
latency inherent in interprocessor communication and in accessing data
on disk. Running a single pipeline on each node works well when each
pipeline stage consumes and produces data at the same rate. If a
stage might consume data faster or slower than it produces data, a
single pipeline becomes unwieldy.
We describe how we have extended the FG programming environment to
support multiple pipelines in two forms. When a node might send and
receive data at different rates during interprocessor communication,
we use disjoint pipelines that send and receive on each node. When a
node consumes and produces data from different streams on the node, we
use multiple pipelines that intersect at a particular stage.
Experimental results for two out-of-core sorting algorithms---one
based on columnsort and the other a distribution-based
sort---demonstrate the value of multiple pipelines.
%T A Security Assessment of Trusted Platform Modules
%A Evan R. Sparks
%R Technical Report TR2007-597
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2007
%U http://www.cs.dartmouth.edu/reports/TR2007-597.ps.Z
%X
Trusted Platform Modules (TPMs) are becoming ubiquitous devices included
in newly released personal computers. Broadly speaking, the aim
of this technology is to provide a facility for authenticating the platform on
which they are running: they are able to measure and attest to the authenticity
of a hardware and software configuration. Designed to be cheap, commodity
devices which motherboard and processor vendors can include in
their products with minimal marginal cost, these devices have a good theoretical
design. Unfortunately, there exist several practical constraints on
the effectiveness of TPMs and the architectures which employ them which
leave them open to attack. We demonstrate some hardware and software attacks
against these devices and architectures. These attacks include Time of
Check/Time of Use attacks on the Integrity Measurement Architecture, and a
bus attack against the Low Pin Count bus. Further, we explore the possibility
of side-channel attacks against TPMs.
%Z
Senior Honors Thesis.
Advisor: Sean Smith.
%T Two's Company, Three's a Crowd: Stable Family and Threesome Roommates Problems
%A Chien-Chung Huang
%R Technical Report TR2007-598
%I Dartmouth College, Computer Science
%C Hanover, NH
%D December 2007
%U http://www.cs.dartmouth.edu/reports/TR2007-598.pdf
%X
We investigate Knuth's eleventh open question on
stable matchings. In the stable family problem, sets of
women, men, and dogs are given, all of whom state their preferences among
the other two groups. The goal is to organize them into family units,
so that no three of them have the incentive to desert
their assigned family members to form a new family.
A similar problem,
called the threesome roommates problem, assumes that a group
of persons, each with their preferences among the combinations of
two others, are to be partitioned into triples. Similarly,
the goal is to make sure that no three persons want to break up
with their assigned roommates.
Ng and Hirschberg were the first to investigate these two problems.
In their formulation, each participant provides a
strictly-ordered list of all combinations. They proved that under
this scheme, both problems are NP-complete. Their paper reviewers
pointed out that their reduction exploits \emph{inconsistent}
preference lists and wondered whether these two problems
remain NP-complete if preferences are required to be consistent.
We answer in the affirmative.
In order to give these two problems a broader outlook, we also
consider the possibility that participants can express indifference,
on the condition that the preference consistency has to be maintained.
As an example, we propose a scheme in which all participants
submit two (or just one in the roommates case) lists ranking
the other two groups separately. The order of the combinations
is decided by the sum of their ordinal numbers. Combinations
are tied when the sums are equal. By introducing indifference,
a hierarchy of stabilities can be defined. We prove that all
stability definitions lead to NP-completeness for existence
of a stable matching.
%T Light-Based Sample Reduction Methods for Interactive Relighting of Scenes with Minute Geometric Scale
%A William B. Kerr
%A Fabio Pellacini
%R Technical Report TR2007-600
%I Dartmouth College, Computer Science
%C Hanover, NH
%D July 2007
%U http://www.cs.dartmouth.edu/reports/TR2007-600.pdf
%X
Rendering production-quality cinematic scenes requires high computational and temporal costs. From an artist's perspective, one must wait for several hours for feedback on even minute changes of light positions and parameters. Previous work approximates scenes so that adjustments on lights may be carried out with interactive feedback, so long as geometry and materials remain constant. We build on these methods by proposing means by which objects with high geometric complexity at the subpixel level, such as hair and foliage, can be approximated for real-time cinematic relighting. Our methods make no assumptions about the geometry or shaders in a scene, and as such are fully generalized. We show that clustering techniques can greatly reduce multisampling, while still maintaining image fidelity at an error significantly lower than sparsely sampling without clustering, provided that no shadows are computed. Scenes that produce noise-like shadow patterns when sparse shadow samples are taken suffer from additional error introduced by those shadows. We present a viable solution to scalable scene approximation for lower sampling resolutions, provided a robust solution to shadow approximation for sub-pixel geometry can be provided in the future.
%T Blacklistable Anonymous Credentials: Blocking Misbehaving Users without TTPs (Extended Version)
%A Patrick P. Tsang
%A Man Ho Au
%A Apu Kapadia
%A Sean W. Smith
%R Technical Report TR2007-601
%I Dartmouth College, Computer Science
%C Hanover, NH
%D September 2007
%U http://www.cs.dartmouth.edu/reports/TR2007-601.pdf
%X
Several credential systems have been proposed in which users can
authenticate to services anonymously. Since anonymity can give users
the license to misbehave, some variants allow the selective
deanonymization (or linking) of misbehaving users upon a complaint
to a trusted third party (TTP). The ability of the TTP to revoke a
user's privacy at any time, however, is too strong a punishment for
misbehavior. To limit the scope of deanonymization, systems such as
``e-cash'' have been proposed in which users are deanonymized under
only certain types of well-defined misbehavior such as ``double
spending.'' While useful in some applications, it is not possible
to generalize such techniques to more subjective definitions of
misbehavior.
We present the first anonymous credential system in which services
can ``blacklist'' misbehaving users without contacting a TTP. Since
blacklisted users remain anonymous, misbehaviors can be judged
subjectively without users fearing arbitrary deanonymization by a
TTP.
%Z
This technical report is the extended version of the paper to appear in CCS '07 under the same title.
%T Fast-Converging Tatonnement Algorithms for the Market Problem
%A Richard Cole
%A Lisa Fleischer
%R Technical Report TR2007-602
%I Dartmouth College, Computer Science
%C Hanover, NH
%D August 2007
%U http://www.cs.dartmouth.edu/reports/TR2007-602.pdf
%X
Why might markets tend toward and remain near equilibrium
prices? In an effort to shed light on this question from an
algorithmic perspective, this paper defines and analyzes
two simple tatonnement algorithms that differ from
previous algorithms that have been subject to asymptotic analysis
in three significant respects: the price update for a good depends
only on the price, demand, and supply for that good, and on no
other information; the price update for each good occurs
distributively and asynchronously; the algorithms work (and the
analyses hold) from an arbitrary starting point.
Our algorithm introduces a new and natural update rule.
We show that this update rule leads to fast convergence toward
equilibrium prices in a broad class of markets that satisfy the weak
gross substitutes property. These are the first analyses for
computationally and informationally distributed algorithms
that demonstrate polynomial convergence.
Our analysis identifies three parameters characterizing the markets,
which govern the rate of convergence of our protocols. These parameters
are, broadly speaking:
1. A bound on the fractional rate of change of demand for each
good with respect to fractional changes in its price.
2. A bound on the fractional rate of change of demand for each
good with respect to fractional changes in wealth.
3. The relative demand for money at equilibrium prices.
We give two protocols. The first assumes global knowledge of only
the first parameter. For this protocol, we also provide a matching lower
bound in terms of these parameters. Our second protocol assumes no global
knowledge whatsoever.
%T YASIR: A Low-Latency, High-Integrity Security Retrofit for Legacy SCADA Systems
%A Patrick P. Tsang
%A Sean W. Smith
%R Technical Report TR2007-603
%I Dartmouth College, Computer Science
%C Hanover, NH
%D September 2007
%U http://www.cs.dartmouth.edu/reports/TR2007-603.pdf
%X
We construct a bump-in-the-wire (BITW) solution that retrofits
security into time-critical communications over bandwidth-limited
serial links between devices in Supervisory Control And Data
Acquisition (SCADA) systems. Previous BITW solutions fail to provide
the necessary security within timing constraints; the previous
solution that does provide the necessary security is not BITW. At a
comparable hardware cost, our BITW solution provides sufficient
security, and yet incurs minimal end-to-end communication latency. A
microcontroller prototype of our solution is under development.
%Z
Updated as TR2008-617.
%T Video Stabilization and Enhancement
%A Hany Farid
%A Jeffrey B. Woodward
%R Technical Report TR2007-605
%I Dartmouth College, Computer Science
%C Hanover, NH
%D September 2007
%U http://www.cs.dartmouth.edu/reports/TR2007-605.pdf
%X
We describe a simple and computationally efficient approach for video stabilization and
enhancement. By combining multiple low-quality video frames, it is possible to extract
a high-quality still image. This technique is particularly helpful in identifying people,
license plates, etc. from low-quality video surveillance cameras.
%T The Quality of Open Source Production: Zealots and Good Samaritans in the Case of Wikipedia
%A Denise Anthony
%A Sean W. Smith
%A Tim Williamson
%R Technical Report TR2007-606
%I Dartmouth College, Computer Science
%C Hanover, NH
%D September 2007
%U http://www.cs.dartmouth.edu/reports/TR2007-606.pdf
%X
New forms of production based in electronic technology, such as
open-source and open-content production, convert private commodities
(typically software) into essentially public goods. A number of
studies find that, as with other collective goods, incentives for
reputation and group identity motivate contributions to open source
goods, thereby overcoming the social dilemma inherent in producing
such goods. In this paper we examine how contributor motivations
affect the quality of contributions to the open-content online
encyclopedia Wikipedia. We find that quality is associated with
contributor motivations, but in a surprisingly inconsistent
way. Registered users' quality increases with more contributions,
consistent with the idea of participants motivated by reputation and
commitment to the Wikipedia community. Surprisingly, however, we find
the highest quality from the vast numbers of anonymous "Good
Samaritans" who contribute only once. Our findings that Good
Samaritans as well as committed "zealots" contribute high quality
content to Wikipedia suggest that it is the quantity as well as the
quality of contributors that positively affects the quality of open
source production.
%Z
A preliminary version of this paper was published online in November 2005.
%T Exclusion and Object Tracking in a Network of Processes
%A Yih-Kuen Tsay
%A Chien-Chung Huang
%R Technical Report TR2007-608
%I Dartmouth College, Computer Science
%C Hanover, NH
%D December 2007
%U http://www.cs.dartmouth.edu/reports/TR2007-608.pdf
%X
This paper concerns two fundamental problems in distributed
computing---mutual exclusion and mobile object tracking.
For a variant of the mutual exclusion problem where the network topology
is taken into account, all existing distributed solutions make use
of tokens.
It turns out that these token-based solutions for
mutual exclusion can also be adapted for object tracking, as the
token behaves very much like a mobile object.
To handle objects with replication, we go further to consider the
more general $k$-exclusion problem which has not been as well studied in
a network setting.
A strong fairness property for $k$-exclusion requires
that a process trying to enter the critical section will
eventually succeed even if \emph{up to} $k-1$ processes stay in
the critical section indefinitely.
We present a comparative survey of existing token-based mutual
exclusion algorithms, which have provided much inspiration for later
$k$-exclusion algorithms. We then propose two solutions
to the $k$-exclusion problem, the second of which meets the strong
fairness requirement. Fault-tolerance issues are also discussed along with the suggestion of
a third algorithm that is also strongly fair. The performance of the three algorithms is compared by simulation.
Finally, we show how the various exclusion algorithms
can be adapted for tracking mobile objects.
%T Settling for limited privacy: how much does it help?
%A Anna M. Shubina
%R Technical Report TR2008-609
%I Dartmouth College, Computer Science
%C Hanover, NH
%D December 2007
%U http://www.cs.dartmouth.edu/reports/TR2008-609.pdf
%X
This thesis explores practical and theoretical aspects of several
privacy-providing technologies, including tools for anonymous web-browsing,
verifiable electronic voting schemes, and private information retrieval from databases. State-of-the-art privacy-providing schemes are frequently impractical for implementational reasons or for sheer information-theoretical reasons due to the amount of information that needs to be transmitted. We have been researching the question of whether relaxing the requirements on such schemes, in particular settling for privacy that is imperfect but sufficient in real-world situations, as opposed to perfect privacy, may be helpful in producing more practical or more efficient schemes.
This thesis presents three results. The first result is the introduction of caching as a technique for providing anonymous web-browsing at the cost of sacrificing some functionality provided by anonymizing systems that do not use caching. The second result is a coercion-resistant electronic voting scheme with nearly perfect privacy and nearly perfect voter verifiability. The third result consists of some lower bounds and some simple upper bounds on the amount of communication in nearly private information retrieval schemes; our work is the first in-depth exploration of private information schemes with imperfect privacy.
%Z
Ph.D. dissertation. Advisor: Amit Chakrabarti.
%T Active Behavioral Fingerprinting of Wireless Devices
%A Sergey Bratus
%A Cory Cornelius
%A Daniel Peebles
%A David Kotz
%R Technical Report TR2008-610
%I Dartmouth College, Computer Science
%C Hanover, NH
%D March 2008
%U http://www.cs.dartmouth.edu/reports/TR2008-610.pdf
%X
We propose a simple active method for discovering facts about the
chipset, the firmware or the driver of an 802.11 wireless device by
observing its responses (or lack thereof) to a series of crafted
non-standard or malformed 802.11 frames. We demonstrate that such
responses can differ significantly enough to distinguish between a
number of popular chipsets and drivers. We expect to significantly
expand the number of recognized device types through community
contributions of signature data for the proposed open fingerprinting
framework. Our method complements known fingerprinting approaches, and
can be used to interrogate and spot devices that may be spoofing their
MAC addresses in order to conceal their true architecture from other
stations, such as a fake AP seeking to engage clients in complex
protocol frame exchange (e.g., in order to exploit a driver
vulnerability). In particular, it can be used to distinguish rogue
APs from legitimate APs before association.
%Z
Short version presented at WiSec 2008, Alexandria, VA
%T Evaluating Mobility Predictors in Wireless Networks for Improving Handoff and Opportunistic Routing
%A Libo Song
%R Technical Report TR2008-611
%I Dartmouth College, Computer Science
%C Hanover, NH
%D January 2008
%U http://www.cs.dartmouth.edu/reports/TR2008-611.pdf
%X
We evaluate mobility predictors in wireless networks. Handoff
prediction in wireless networks has long been considered as a mechanism
to improve the quality of service provided to mobile wireless users.
Most prior studies, however, were based on theoretical analysis,
simulation with synthetic mobility models, or small wireless network
traces. We study the effect of mobility prediction for a large
realistic wireless situation.
We tackle the problem by using traces collected from a large
production wireless network to evaluate several major families of
handoff-location prediction techniques, a set of handoff-time
predictors, and a predictor that jointly predicts handoff location and
time. We also propose a fallback mechanism, which uses a lower-order
predictor whenever a higher-order predictor fails to predict.
We found that low-order Markov predictors, with our proposed fallback
mechanisms, performed as well as or better than the more complex and more
space-consuming compression-based handoff-location predictors. Although
our handoff-time predictor had modest prediction accuracy, in the
context of mobile voice applications we found that bandwidth reservation
strategies can benefit from the combined location and time handoff
predictor, significantly reducing the call-drop rate without
significantly increasing the call-block rate.
We also developed a prediction-based routing protocol for mobile
opportunistic networks. We evaluated and compared our protocol's
performance to five existing routing protocols, using simulations driven
by real mobility traces. We found that the basic routing protocols are
not practical for large-scale opportunistic networks. Prediction-based
routing protocols trade off the message delivery ratio against resource
usage, and they performed well and comparably
to each other.
%Z
Ph.D dissertation. Advisor: David Kotz
%T Localized Bridging Centrality for Distributed Network Analysis
%A Soumendra Nanda
%A David Kotz
%R Technical Report TR2008-612
%I Dartmouth College, Computer Science
%C Hanover, NH
%D January 2008
%U http://www.cs.dartmouth.edu/reports/TR2008-612.pdf
%X
Centrality is a concept often used in social network analysis to study different properties of networks that are modeled as graphs. We present a new centrality metric called Localized Bridging Centrality (LBC). LBC is based on the Bridging Centrality (BC) metric that Hwang et al. recently introduced. Bridging nodes are nodes that are located in between highly connected regions. LBC is capable of identifying bridging nodes with an accuracy comparable to that of the BC metric for most networks. As the name suggests, we use only local information from surrounding nodes to compute the LBC metric, while global knowledge is required to calculate the BC metric. The main difference between LBC and BC is that LBC uses the egocentric definition of betweenness centrality to identify bridging nodes, while BC uses the sociocentric definition of betweenness centrality. Thus, our LBC metric is suitable for distributed computation and has the benefit of being an order of magnitude lower in computational complexity. We compare the results produced by BC and LBC in three examples. We applied our LBC metric to the analysis of a real wireless mesh network. Our results indicate that the LBC metric is as powerful as the BC metric at identifying bridging nodes that have a higher flow of information through them (assuming a uniform distribution of network flows) and are important for the robustness of the network.
%T Complete Configuration Space Analysis for Structure Determination of Symmetric Homo-oligomers by NMR
%A Shobha Potluri
%R Technical Report TR2008-613
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2008
%U http://www.cs.dartmouth.edu/reports/TR2008-613.pdf
%X
Symmetric homo-oligomers (protein complexes with similar subunits arranged symmetrically) play pivotal roles in complex biological processes such as ion transport and cellular regulation. Structure determination of these complexes is necessary in order to gain valuable insights into their mechanisms. Nuclear Magnetic Resonance (NMR) spectroscopy is an experimental technique used for structural studies of such complexes. The data available for structure determination of symmetric homo-oligomers by NMR is often sparse and ambiguous in nature, raising concerns about existing heuristic approaches for structure determination. We have developed an approach that is complete, in that it identifies all consistent conformations; data-driven, in that it separately evaluates the consistency of structures with data and with biophysical constraints; and efficient, in that it avoids explicit consideration of each of the possible structures separately. By being complete, we ensure that native conformations are not missed. By being data-driven, we are able to separately quantify the information content in the data alone versus data and biophysical modeling. We take a configuration space (degree-of-freedom) approach that provides a compact representation of the conformation space and enables us to efficiently explore the space of possible conformations. This thesis demonstrates that the configuration space-based method is robust to sparsity and ambiguity in the data and enables complete, data-driven and efficient structure determination of symmetric homo-oligomers.
%Z
Ph.D thesis, completed in March 2007.
Filed as a tech report February 2008.
Advisor: Chris Bailey-Kellogg.
%T Experiment Planning for Protein Structure Elucidation and Site-Directed Protein Recombination
%A Xiaoduan Ye
%R Technical Report TR2008-614
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2007
%U http://www.cs.dartmouth.edu/reports/TR2008-614.pdf
%X
In order to most effectively investigate protein structure and improve protein function, it is
necessary to carefully plan appropriate experiments. The combinatorial number of possible
experiment plans demands effective criteria and efficient algorithms to choose the one that
is in some sense optimal. This thesis addresses experiment planning challenges in two
significant applications. The first part of this thesis develops an integrated computational-experimental
approach for rapid discrimination of predicted protein structure models by
quantifying their consistency with relatively cheap and easy experiments (cross-linking
and site-directed mutagenesis followed by stability measurement). In order to obtain the
most information from noisy and sparse experimental data, rigorous Bayesian frameworks
have been developed to analyze the information content. Efficient algorithms have been
developed to choose the most informative, least expensive, and most robust experiments.
The effectiveness of this approach has been demonstrated using existing experimental data
as well as simulations, and it has been applied to discriminate predicted structure models
of the pTfa chaperone protein from bacteriophage lambda.
The second part of this thesis seeks to choose optimal breakpoint locations for protein
engineering by site-directed recombination. In order to increase the possibility of obtaining
folded and functional hybrids in protein recombination, it is necessary to retain the evolutionary
relationships among amino acids that determine protein stability and functionality.
A probabilistic hypergraph model has been developed to model these relationships, with
edge weights representing their statistical significance derived from a database and a protein
family. The effectiveness of this model has been validated by showing its ability to
distinguish functional hybrids from non-functional ones in existing experimental data. It
has been proved to be NP-hard in general to choose the optimal breakpoint locations for
recombination that minimize the total perturbation to these relationships, but exact and
approximate algorithms have been developed for a number of important cases.
%Z
Ph.D thesis; Advisor Chris Bailey-Kellogg.
%T PPAA: Peer-to-Peer Anonymous Authentication (Extended Version)
%A Patrick P. Tsang
%A Sean W. Smith
%R Technical Report TR2008-615
%I Dartmouth College, Computer Science
%C Hanover, NH
%D April 2008
%U http://www.cs.dartmouth.edu/reports/TR2008-615.pdf
%X
In the pursuit of authentication schemes that balance user privacy
and accountability, numerous anonymous credential systems have been
constructed. However, existing systems assume a client-server
architecture in which only the clients, but not the servers, care
about their privacy. In peer-to-peer (P2P) systems where both
clients and servers are peer users with privacy concerns, no
existing system correctly strikes that balance between privacy and
accountability.
In this paper, we provide this missing piece: a credential system in
which peers are {\em pseudonymous} to one another (that is, two who
interact more than once can recognize each other via pseudonyms) but
are otherwise anonymous and unlinkable across different peers. Such
a credential system finds applications in, e.g., Vehicular Ad-hoc
Networks (VANets) and P2P networks.
We formalize the security requirements of our proposed credential
system, provide a construction for it, and prove the security of our
construction. Our solution is efficient: its complexities are
independent of the number of users in the system.
%Z
This technical report is the extended version
of the paper to appear in ACNS '08 under the same title.
%T Bounded Unpopularity Matchings
%A Chien-Chung Huang
%A Kavitha Telikepalli
%A Dimitrios Michail
%A Meghana Nasre
%R Technical Report TR2008-616
%I Dartmouth College, Computer Science
%C Hanover, NH
%D April 2008
%U http://www.cs.dartmouth.edu/reports/TR2008-616.pdf
%X
We investigate the following problem: given a set of jobs and a set of people with preferences over the jobs, what is the optimal way of matching people to jobs? Here we consider the notion of \emph{popularity}. A matching $M$ is popular if there is no matching $M'$ such that more people prefer $M'$ to $M$ than the other way around. Determining whether a given instance admits a popular matching and, if so, finding one, was studied in \cite{AIKM05}. If there is no popular matching, a reasonable substitute is a matching whose {\em unpopularity} is bounded. We consider two measures of unpopularity - {\em unpopularity factor} denoted by $u(M)$ and {\em unpopularity margin} denoted by $g(M)$. McCutchen recently showed that computing a matching $M$
with the minimum value of $u(M)$ or $g(M)$ is NP-hard, and that if $G$ does not admit a popular matching, then we have $u(M) \ge 2$ for all matchings $M$ in $G$.
Here we show that a matching $M$ that achieves $u(M) = 2$ can be computed in $O(m\sqrt{n})$ time (where $m$ is the number of edges in $G$ and $n$ is the number of nodes) provided a certain graph $H$ admits a matching that matches all people. We also describe a sequence of graphs: $H = H_2, H_3,\ldots,H_k$ such that if $H_k$ admits a matching that matches all people, then we can compute in $O(km\sqrt{n})$ time a matching $M$ such that $u(M) \le k-1$ and $g(M) \le n(1-\frac{2}{k})$. Simulation results suggest that our algorithm finds a matching with low unpopularity.
%T YASIR: A Low-Latency, High-Integrity Security Retrofit for Legacy SCADA Systems (Extended Version)
%A Patrick P. Tsang
%A Sean W. Smith
%R Technical Report TR2008-617
%I Dartmouth College, Computer Science
%C Hanover, NH
%D April 2008
%U http://www.cs.dartmouth.edu/reports/TR2008-617.pdf
%X
We construct a bump-in-the-wire (BITW) solution that retrofits
security into time-critical communications over bandwidth-limited
serial links between devices in legacy Supervisory Control And Data
Acquisition (SCADA) systems, on which the proper operations of
critical infrastructures such as the electric power grid
rely. Previous BITW solutions do not provide the necessary security
within timing constraints; the previous solution that does is not
BITW.
At a hardware cost comparable to existing solutions, our BITW solution
provides sufficient security, and yet incurs minimal end-to-end
communication latency.
%Z
This technical report is the extended version of the paper to appear in
IFIP SEC 2008 under the same title. An earlier version of the paper appeared
as Dartmouth Computer Science Technical Report TR2007-603.
%T The Weakest Failure Detector to Solve Mutual Exclusion
%A Vibhor Bhatt
%A Nicholas Christman
%A Prasad Jayanti
%R Technical Report TR2008-618
%I Dartmouth College, Computer Science
%C Hanover, NH
%D April 2008
%U http://www.cs.dartmouth.edu/reports/TR2008-618.pdf
%X
Mutual exclusion is not solvable in an asynchronous
message-passing system where processes are subject to crash
failures. Delporte-Gallet et al. determined the weakest
failure detector to solve this problem when a majority of
processes are correct. Here we identify the weakest failure
detector to solve mutual exclusion in any environment, i.e.,
regardless of the number of faulty processes.
We also show a relation between mutual exclusion and
consensus, arguably the two most fundamental problems in
distributed computing. Specifically, we show that a failure
detector that solves mutual exclusion is sufficient to solve
non-uniform consensus but not necessarily uniform consensus.
%T Mesh-Mon: a Monitoring and Management System for Wireless Mesh Networks
%A Soumendra Nanda
%R Technical Report TR2008-619
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2008
%U http://www.cs.dartmouth.edu/reports/TR2008-619.pdf
%X
A mesh network is a network of wireless routers that employ multi-hop routing
and can be used to provide network access for mobile clients. Mobile mesh
networks can be deployed rapidly to provide an alternate communication
infrastructure for emergency response operations in areas with limited or
damaged infrastructure.
In this dissertation, we present Dart-Mesh: a Linux-based layer-3 dual-radio
two-tiered mesh network that provides complete 802.11b coverage in the Sudikoff
Lab for Computer Science at Dartmouth College. We faced several challenges in
building, testing, monitoring and managing this network. These challenges
motivated us to design and implement Mesh-Mon, a network monitoring system to
aid system administrators in the management of a mobile mesh network. Mesh-Mon
is a scalable, distributed and decentralized management system in which mesh
nodes cooperate in a proactive manner to help detect, diagnose and resolve
network problems automatically. Mesh-Mon is independent of the routing protocol
used by the mesh routing layer and can function even if the routing protocol
fails. We demonstrate this feature by running Mesh-Mon on two versions of
Dart-Mesh, one running on AODV (a reactive mesh routing protocol) and the second
running on OLSR (a proactive mesh routing protocol) in separate experiments.
Mobility can cause links to break, leading to disconnected partitions. We
identify critical nodes in the network, whose failure may cause a partition. We
introduce two new metrics based on social-network analysis: the Localized
Bridging Centrality (LBC) metric and the Localized Load-aware Bridging
Centrality (LLBC) metric, that can identify critical nodes efficiently and in a
fully distributed manner.
We run a monitoring component on client nodes, called Mesh-Mon-Ami, which also
assists Mesh-Mon nodes in the dissemination of management information between
physically disconnected partitions, by acting as carriers for management data.
We conclude, from our experimental evaluation on our 16-node Dart-Mesh testbed,
that our system solves several management challenges in a scalable manner, and
is a useful and effective tool for monitoring and managing real-world mesh
networks.
%Z
PhD Thesis. Advisor: David Kotz.
Examining Committee: David Kotz, Andrew Campbell, Robert L. Drysdale, Robert S. Gray.
%T A Dynamically Refocusable Sampling Infrastructure for 802.11 Networks
%A Udayan Deshpande
%R Technical Report TR2008-620
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2008
%U http://www.cs.dartmouth.edu/reports/TR2008-620.pdf
%X
The edge of the Internet is increasingly wireless. Enterprises large and small, homeowners,
and even whole cities have deployed Wi-Fi networks for their users,
and many users never need to, or never bother to, use the wired
network. With the advent of high-throughput wireless networks (such as
802.11n) some new construction, even of large enterprise buildings,
may no longer be wired for Ethernet. To understand Internet traffic,
then, we need to understand the wireless edge.
Measuring Wi-Fi traffic, however, is challenging. It is insufficient
to capture traffic in the access points, or upstream of the access
points, because the activity of neighboring networks, ad hoc networks,
and physical interference cannot be seen at that level. To truly
understand the MAC-layer behavior, we need to capture frames from the
air using Air Monitors (AMs) placed in the vicinity of the
network. Such a capture is always a sample of the network activity,
since it is physically impossible to capture a full trace: all frames
from all channels at all times in all places.
We have built a monitoring infrastructure that captures frames from the 802.11 network.
This infrastructure includes several "channel sampling" strategies
that will capture representative traffic from the network. Further,
the monitoring infrastructure needs to modify its behavior according
to feedback received from the downstream consumers of the captured
traffic in case the analysis needs traffic of a certain type. We call
this technique "refocusing". The "coordinated sampling" technique
improves the efficiency of the monitoring by utilizing the AMs
intelligently.
Finally, we deployed this measurement infrastructure within our Computer Science
building to study the performance of the system with real network
traffic.
%Z
Ph.D Dissertation. Advisor: David Kotz.
Thesis Committee: David Kotz, Andrew Campbell, Chris Bailey-Kellogg, Brian Noble.
%T Group-Aware Stream Filtering
%A Ming Li
%R Technical Report TR2008-621
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2008
%U http://www.cs.dartmouth.edu/reports/TR2008-621.pdf
%X
Recent years have witnessed a new class of monitoring applications
that need to continuously collect information from remote data
sources. Those data sources, such as web click-streams, stock quotes,
and sensor data, are often characterized as fast-rate high-volume
``streams''. Distributed stream-processing systems are thus designed
to efficiently use system resources to serve the data-acquisition
needs of the applications. Most of the state-of-the-art
stream-processing systems assume an Ethernet-based network whose
bandwidth is abundant, and focus on mechanisms to save computational
power and memory. For applications involving wireless networks,
particularly multi-hop mesh networks, we recognize that the most
limiting factor in efficiently processing streams lies in the
network's highly constrained bandwidth. Hence, this dissertation
proposes a group-aware stream filtering approach that saves bandwidth
at the cost of increased CPU time, for low-bandwidth data-streaming
systems.
This approach, used together with multicasting, exploits two
overlooked properties of monitoring applications: 1) many of them can
tolerate some degree of ``slack'' in their data quality requirements,
and 2) there may exist multiple subsets of the source data satisfying
the quality needs of an application. We can thus choose the ``best
alternative'' subset for each application to maximize the data overlap
within the group to best benefit from multicasting. After proving the
problem NP-hard, we introduce a suite of heuristics-based algorithms
that ensure data quality, specifically data granularity and
timeliness, in addition to preserving network bandwidth.
Our framework for group-aware stream filtering is extensible and
supports a diverse range of filtering needs of monitoring
applications. We evaluate this approach with a prototype system based
on real-world data sets. The results show that quality-managed
group-aware filtering is effective in trading CPU time for bandwidth
savings, compared with self-interested stream filtering. We also
evaluate the effect of each algorithm on temporal freshness of the
data. Finally, we discuss other application realms that might benefit
from group-aware stream filtering.
%Z
Ph.D Dissertation. Advisor: David Kotz
Thesis committee: David Kotz, Andrew Campbell, Paul Thompson, Apratim
Purakayastha
%T Linkability in Activity Inference Data Sets
%A Jeffrey Fielding
%R Technical Report TR2008-623
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2008
%U http://www.cs.dartmouth.edu/reports/TR2008-623.pdf
%X
Activity inference is an active area of ubiquitous computing research. By training machine learning algorithms
on data from sensors worn by volunteers, researchers hope to develop software that can interact more naturally with
the user by inferring what the user is doing. In this thesis, we use the same sensor data to infer which volunteer is
carrying the sensors. Such inference could be useful -- for example, a mobile device might infer who is carrying it
and adapt to that user's preferences. It also raises some privacy concerns, since an attacker could learn more about a
user by linking together several sensor traces from the same user. We develop a model to differentiate users based on
their sensor data, and examine its accuracy as well as the potential benefits and pitfalls.
%Z
Senior Honors Thesis. Advisors: Tanzeem Choudhury and David Kotz.
%T Making RBAC Work in Dynamic, Fast-Changing Corporate Environments
%A Ruslan Y. Dimov
%R Technical Report TR2008-624
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2008
%U http://www.cs.dartmouth.edu/reports/TR2008-624.pdf
%X
In large organizations with tens of thousands of employees, managing individual people's permissions is tedious and error prone, and thus a possible source of security risks. Role-Based Access Control addresses this problem by grouping users into roles, which reflect job functions in the corporation. Permissions are assigned to roles instead of directly to users, which means that all users assigned to a role have the same set of permissions with respect to that role. However, adoption of RBAC in organizations such as investment banks is hindered by two main factors: first, it is costly and time-consuming to define roles. Second, there are certain job functions (such as consultant) that cannot be expressed as RBAC roles, because their users need to have different permission sets.
This thesis investigates whether roles can be applied to domains that exhibit the peculiarities of the investment bank example. We introduce a new framework for roles that allows us to separately represent what the role means as a job function, and what permissions its individual users have. That way we maintain the key property of RBAC, that the number of roles is small, while allowing for variations among users. We have also investigated machine learning approaches in order to figure out whether roles are concepts that can be learned or approximated by a function. We present our findings that certain learning schemes, such as Probably Approximately Correct (PAC) learning and instance-based learning, are not applicable to roles, while others, such as decision-tree learning, might be useful.
%Z
Senior Honors Thesis. Advisor: Sean Smith.
%T Anchor-Free Localization in Mixed Wireless Sensor Network Systems
%A Yurong Xu
%R Technical Report TR2008-626
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2008
%U http://www.cs.dartmouth.edu/reports/TR2008-626.pdf
%X
Recent technological advances have fostered the emergence of Wireless
Sensor Networks (WSNs), which consist of tiny, wireless, battery-powered nodes
that are expected to revolutionize the ways in which we understand and construct
complex physical systems. A fundamental property needed to use and maintain these
WSNs is ``localization'', which allows the establishment of spatial relationships
among nodes over time.
This dissertation presents a series of Geographic Distributed Localization
(GDL) algorithms for mixed WSNs, in which both static and mobile nodes can
coexist. The GDL algorithms provide a series of useful methods for localization in
mixed WSNs. First, GDL provides an approximation called ``hop-coordinates'', which
improves the accuracy of both hop-counting and connectivity-based measurement
techniques. Second, GDL utilizes a distributed algorithm to compute the locations
of all nodes in static networks with the help of the hop-coordinates
approximation. Third, GDL integrates a sensor component into this localization
paradigm for possible mobility and as a result allows for a more complex
deployment of WSNs as well as lower costs. In addition, the development of GDL
incorporated the possibility of manipulated communications, such as wormhole
attacks. Simulations show that such a localization system can provide fundamental
support for security by detecting and localizing wormhole attacks.
Although several localization techniques have been proposed in the past
few years, none currently satisfies our requirements to provide an accurate,
efficient and reliable localization for mixed WSNs. The contributions of this
dissertation are: (1) our measurement technique achieves better accuracy both in
measurement and localization than other methods; (2) our method significantly
improves the efficiency of localization in updating location in mixed WSNs by
incorporating sensors into the method; (3) our method can detect and locate the
communication that has been manipulated by a wormhole in a network without relying
on a central server.
%Z
Ph.D Dissertation. Advisor: Fillia Makedon.
%T Detecting kernel rootkits
%A Ashwin Ramaswamy
%R Technical Report TR2008-627
%I Dartmouth College, Computer Science
%C Hanover, NH
%D September 2008
%U http://www.cs.dartmouth.edu/reports/TR2008-627.pdf
%X
Kernel rootkits are a special category of malware that are deployed directly in the
kernel and hence have unmitigated reign over the functionalities of the kernel itself.
We seek to detect such rootkits that are deployed in the real world by first observing
how the majority of kernel rootkits operate. To this end, comparable to how rootkits
function in the real world, we write our own kernel rootkit that manipulates the network
driver, thus giving us control over all packets sent into the network.
We then implement a mechanism to thwart the attacks of such rootkits by noticing
that a large number of the rootkits deployed today rely heavily on the redirection of
function pointers within the kernel. By overwriting a function pointer so that it points to
its own function, a rootkit can perform a proverbial man-in-the-middle attack.
Our goal is not just the detection of kernel rootkits, but also to levy as little an
impact on system performance as possible. Hence our technique is to leverage existing
kernel functionalities (in the case of Linux) such as kprobes to identify potential attack
scenarios from within the system rather than from outside it (such as a VMM). We hope
to introduce real-world security in devices where performance and resource constraints
are as important as security considerations.
%Z
M.S. Thesis Proposal. Advisor: Sean W. Smith
%T Key Management for Secure Power SCADA
%A Manya K. Sleeper
%R Technical Report TR2008-628
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2008
%U http://www.cs.dartmouth.edu/reports/TR2008-628.pdf
%X
This thesis proposes a key management protocol for secure
power SCADA systems that seeks to take advantage of the full security
capacity of a given network by allowing devices to use public key
cryptography for key management if they are capable of doing so and
reverting to symmetric key cryptography only when such use is
necessitated by the weakness of a given device. Allowing devices to
obtain different levels of security permits SCADA networks to maximize
their security in the decades before such networks are capable of
implementing fully public key-based key management protocols. Such a
system is obtained through the use of a protocol based on a modified
version of SSL using X.509 certificates containing encrypted symmetric
keys that allow master devices the option of using the symmetric keys for
encrypting the shared secret used to create keying material, instead of
using a slave device's public key. This thesis presents the protocol and
uses proof-of-concept code to carry out a performance evaluation of the
key management scheme.
%Z
Senior Honors Thesis. Advisor: Sean Smith.
%T Lighting and Optical Tools for Image Forensics
%A Micah K. Johnson
%R Technical Report TR2008-629
%I Dartmouth College, Computer Science
%C Hanover, NH
%D July 2008
%U http://www.cs.dartmouth.edu/reports/TR2008-629.pdf
%X
We present new forensic tools that are capable of detecting traces of tampering in digital images without the use of watermarks or specialized hardware. These tools operate under the assumption that images contain natural properties from a variety of sources, including the world, the lens, and the sensor. These properties may be disturbed by digital tampering and by measuring them we can expose the forgery.
In this context, we present the following forensic tools: (1) illuminant direction, (2) specularity, (3) lighting environment, and (4) chromatic aberration. The common theme of these tools is that they exploit lighting or optical properties of images. Although each tool is not applicable to every image, they add to a growing set of image forensic tools that together will complicate the process of making a convincing forgery.
%Z
Ph.D Dissertation. Advisor: Hany Farid.
%T Planar Microassembly by Parallel Actuation of MEMS Microrobots (Microassembly Video)
%A Bruce R. Donald
%A Christopher G. Levey
%A Igor Paprotny
%R Technical Report TR2008-630
%I Dartmouth College, Computer Science
%C Hanover, NH
%D August 2008
%X
Movie of a representative microassembly experiment using devices from species 1, 3, 4 and 5, recorded through an optical microscope. The robots are initially arranged along the corners of a rectangle with sides 1 by 0.9 mm. The assembly experiment is divided into three stages. During stage 1, devices 4 and 5 dock together to form the initial stable shape. In stage 2, device 3 docks with the initial stable shape, while during stage 3, device 1 docks with the stable shape, forming the final assembly.
%Z
This technical report is a multimedia web page, available at
http://www.cs.dartmouth.edu/reports/TR2008-630/
%T Pas de Deux avec les Microrobots (Video)
%A Bruce R. Donald
%A Christopher G. Levey
%A Igor Paprotny
%R Technical Report TR2008-631
%I Dartmouth College, Computer Science
%C Hanover, NH
%D August 2008
%X
Video captured through an optical microscope, showing simultaneous control and operation of two stress-engineered microrobots. The dimensions of our microrobots are 260 x 60 x 10 micrometers; each robot consists of an untethered scratch-drive actuator that provides forward motion, and a steering-arm actuator that controls whether the robot moves in a straight line or turns.
Our stress-engineered microrobots are electrostatically powered via a global control signal transmitted to all the robots regardless of their position and orientation within their operating environment. Hence, a single control and power-delivery signal must be used to simultaneously control all robots within the same operating environment, resulting in a highly underactuated system. Despite this high level of underactuation, we are able to achieve independent control of the individual microrobots by designing their steering-arms to respond to different voltage levels of the supplied control signal.
This example uses nested hysteresis gaps. A hysteresis gap is the difference between the snap-down and release voltages for a steering-arm actuator. Nested hysteresis gaps allow us to set the states of the steering-arms (up or down) to any configuration. As shown in this video, all four states of the two microrobot steering-arms are used to choreograph their motion.
A disadvantage of nested hysteresis gaps is that they are control-voltage bandwidth intensive, limiting the number of simultaneously-controllable devices. An alternative multi-microrobot control scheme that minimizes control-bandwidth is described in [1].
%Z
[1] B. R. Donald, C. G. Levey and I. Paprotny. "Planar Microassembly by Parallel Actuation of MEMS Microrobots." Journal of Microelectromechanical Systems, 2008 (Epub before print). Digital Object Identifier: 10.1109/JMEMS.2008.924251.
This technical report is a multimedia web page, available at
http://www.cs.dartmouth.edu/reports/TR2008-631/
%T TwoKind Authentication: Protecting Private Information in Untrustworthy Environments (Extended Version)
%A Katelin Bailey
%A Apu Kapadia
%A Linden Vongsathorn
%A Sean W. Smith
%R Technical Report TR2008-632
%I Dartmouth College, Computer Science
%C Hanover, NH
%D August 2008
%U http://www.cs.dartmouth.edu/reports/TR2008-632.pdf
%X
We propose and evaluate TwoKind Authentication, a simple and effective technique that allows users to limit access to their private information in untrustworthy environments. Users often log in to Internet sites from insecure computers, and more recently have started divulging their email passwords to social-networking sites, thereby putting their private communications at risk. To mitigate this problem, we explore the use of multiple authenticators for the same account that are associated with specific sets of privileges. In its simplest form, TwoKind features two modes of authentication, a low and a high authenticator. By using a low authenticator, users can signal to the server they are in an untrusted environment, following which the server restricts the user's actions, including access to private data. In this paper, we seek to evaluate the effectiveness of multiple authenticators in promoting safer behavior in users. We demonstrate the effectiveness of this approach through a user experiment --- we find that users make a distinction between the two authenticators and generally behave in a security-conscientious way, protecting their high authenticator a majority of the time. Our study suggests that TwoKind will be beneficial to several Internet applications, particularly if the privileges can be customized to a user's security preferences.
%Z
Expanded version of the WPES 2008 paper.
%T Attribute-Based, Usefully Secure Email
%A Christopher P. Masone
%R Technical Report TR2008-633
%I Dartmouth College, Computer Science
%C Hanover, NH
%D August 2008
%U http://www.cs.dartmouth.edu/reports/TR2008-633.pdf
%X
A secure system that cannot be used by real users to secure real-world processes is not really secure at all. While many believe that usability and security are diametrically opposed, a growing body of research from the field of Human-Computer Interaction and Security (HCISEC) refutes this assumption. All researchers in this field agree that focusing on aligning usability and security goals can enable the design of systems that will be more secure under actual usage. We bring to bear tools from the social sciences (economics, sociology, psychology, etc.) not only to help us better understand why deployed systems fail, but also to enable us to accurately characterize the problems that we must solve in order to build systems that will be secure in the real world. Trust, a critically important facet of any socio-technical secure system, is ripe for analysis using the tools provided for us by the social sciences.
There are a variety of scopes in which issues of trust in secure systems can be studied. We have chosen to focus on how humans decide to trust new correspondents. Current secure email systems, such as S/MIME and PGP/MIME, are not expressive enough to capture the real ways that trust flows in these sorts of scenarios. To solve this problem, we begin by applying concepts from social science research to a variety of such cases from interesting application domains; primarily, crisis management in the North American power grid. We have examined transcripts of telephone calls made between grid management personnel during the August 2003 North American blackout and extracted several different classes of trust flows from these real-world scenarios. Combining this knowledge with some design patterns from HCISEC, we develop criteria for a system that will enable humans to apply these same methods of trust-building in the digital world. We then present Attribute-Based, Usefully Secure Email (ABUSE) and not only show that it meets our criteria, but also provide empirical evidence that real users are helped by the system.
%Z
Ph.D Dissertation. Advisor: Sean W. Smith.
%T LZfuzz: a fast compression-based fuzzer for poorly documented protocols
%A Sergey Bratus
%A Axel Hansen
%A Anna Shubina
%R Technical Report TR2008-634
%I Dartmouth College, Computer Science
%C Hanover, NH
%D September 2008
%U http://www.cs.dartmouth.edu/reports/TR2008-634.pdf
%X
Real-world infrastructure offers many scenarios where protocols (and other details) are not released due to being considered too sensitive or for other reasons. This situation makes it hard to apply fuzzing techniques to test their security and reliability, since their full documentation is only available to their developers, and domain developer expertise does not necessarily intersect with fuzz-testing expertise (nor deployment responsibility). State-of-the-art fuzzing techniques, however, work best when protocol specifications are available. Still, operators whose networks include equipment communicating via proprietary protocols should be able to reap the benefits of fuzz-testing them. In particular, administrators should be able to test proprietary protocols in the absence of end-to-end application-level encryption to understand whether they can withstand injection of bad traffic, and thus be able to plan adequate network protection measures. Such protocols can be observed in action prior to fuzzing, and packet captures can be used to learn enough about the structure of the protocol to make fuzzing more efficient.
Various machine learning approaches, e.g. bioinformatics methods, have been proposed for learning models of the targeted protocols. The problem with most of these approaches to date is that, although sometimes quite successful, they are very computationally heavy and thus are hardly practical for application by network administrators and equipment owners who cannot easily dedicate a compute cluster to such tasks. We propose a simple method that, despite its roughness, allowed us to learn facts useful for fuzzing from protocol traces at much smaller CPU and time costs. Our fuzzing approach proved itself empirically in testing actual proprietary SCADA protocols in an isolated control network test environment, and was also successful in triggering flaws in implementations of several popular commodity Internet protocols. Our fuzzer, LZfuzz (pronounced ``lazy-fuzz''), relies on a variant of the Lempel--Ziv compression algorithm to guess boundaries between the structural units of the protocol, and builds on the well-known free software GPF fuzzer.
%T BLAC: Revoking Repeatedly Misbehaving Anonymous Users Without Relying on TTPs
%A Patrick P. Tsang
%A Man Ho Au
%A Apu Kapadia
%A Sean W. Smith
%R Technical Report TR2008-635
%I Dartmouth College, Computer Science
%C Hanover, NH
%D October 2008
%U http://www.cs.dartmouth.edu/reports/TR2008-635.pdf
%X
Several credential systems have been proposed in which users can authenticate to service providers anonymously. Since anonymity can give users the license to misbehave, some variants allow the selective deanonymization (or linking) of misbehaving users upon a complaint to a trusted third party (TTP). The ability of the TTP to revoke a user's privacy at any time, however, is too strong a punishment for misbehavior. To limit the scope of deanonymization, systems have been proposed in which users are deanonymized if they authenticate ``too many times,'' such as ``double spending'' with electronic cash. While useful in some applications, it is not possible to generalize such techniques to more subjective definitions of misbehavior, e.g., it is not possible to block users who ``deface too many webpages'' on a website. We present BLAC, the first anonymous credential system in which service providers can revoke the credentials of repeatedly misbehaving users without relying on a TTP. Since revoked users remain anonymous, misbehaviors can be judged subjectively without users fearing arbitrary deanonymization by a TTP. Finally, our construction supports a $d$-strikes-out revocation policy, whereby users who have been subjectively judged to have repeatedly misbehaved at least $d$ times are revoked from the system.
%Z
The preliminary part of this report is based on our work originally published in the proceedings of the 14th ACM Conference on Computer and Communication Security (CCS '07), October 2007, under the title of "Blacklistable Anonymous Credentials: Blocking Misbehaving Users without TTPs". At the same time, we published an extended version of the CCS '07 paper as Dartmouth Computer Science technical report TR2007-601, which contained further details that did not fit into the conference paper. In this report, we make a significant additional contribution by extending our original construction of BLAC to provide more flexible revocation --- SPs can specify a d-strikes-out revocation policy, so that users can authenticate anonymously only if they have not misbehaved d or more times. Such a policy forgives a few (i.e., up to d-1) misbehaviors, but then blocks users who misbehave repeatedly. Following authentication, users remain anonymous, and SPs learn only whether a user has crossed the threshold of d misbehaviors. The original construction of BLAC is a special case with d=1.
%T Toward Evaluating Lighting Design Interface Paradigms for Novice Users
%A William Brandon Kerr
%A Fabio Pellacini
%R Technical Report TR2008-636
%I Dartmouth College, Computer Science
%C Hanover, NH
%D November 2008
%U http://www.cs.dartmouth.edu/reports/TR2008-636.pdf
%X
Lighting design is a complex and fundamental task in computer cinematography, involving adjustment of light parameters to define final scene appearance. Many lighting interfaces have been proposed to improve lighting design workflow. These interfaces fall into three paradigm categories: direct light parameter manipulation, indirect light feature manipulation (e.g., shadow dragging), and goal-based optimization of light through painting. To date, no formal evaluation of the relative effectiveness of these methods has been performed. In this paper, we present a first step toward evaluating the three paradigms in the form of a user study with novice users. We focus our evaluation on simple tasks that directly affect lighting features, such as highlights, shadows and intensity gradients, in scenes with up to 2 point lights and 5 objects under direct illumination. We perform quantitative experiments to measure relative efficiency between interfaces together with qualitative input to explore the intuitiveness of the paradigms. Our results indicate that paint-based goal specification is more cumbersome than either direct or indirect manipulation. Furthermore, our investigation suggests improvements to not only the implementation of the paradigms, but also overall paradigm structure for further exploration.
%Z
Revised version to appear in SIGGRAPH 2009.
%T Nymble: Blocking Misbehaving Users in Anonymizing Networks
%A Patrick P. Tsang
%A Apu Kapadia
%A Cory Cornelius
%A Sean W. Smith
%R Technical Report TR2008-637
%I Dartmouth College, Computer Science
%C Hanover, NH
%D December 2008
%U http://www.cs.dartmouth.edu/reports/TR2008-637.pdf
%X
Anonymizing networks such as Tor allow users to access Internet services privately by using a series of routers to hide the client's IP address from the server. The success of such networks, however, has been limited by users employing this anonymity for abusive purposes such as defacing popular websites. Website administrators routinely rely on IP-address blocking for disabling access to misbehaving users, but blocking IP addresses is not practical if the abuser routes through an anonymizing network. As a result, administrators block \emph{all} known exit nodes of anonymizing networks, denying anonymous access to misbehaving and behaving users alike. To address this problem, we present Nymble, a system in which servers can ``blacklist'' misbehaving users, thereby \emph{blocking users without compromising their anonymity}. Our system is thus agnostic to different servers' definitions of misbehavior --- servers can blacklist users for whatever reason, and the privacy of blacklisted users is maintained.
%Z
Nymble first appeared in a PET '07 paper. This paper presents a significantly improved construction and a complete rewrite and evaluation of our (open-source) implementation.
%T Digital Image Ballistics from JPEG Quantization: A Followup Study
%A Hany Farid
%R Technical Report TR2008-638
%I Dartmouth College, Computer Science
%C Hanover, NH
%D December 2008
%U http://www.cs.dartmouth.edu/reports/TR2008-638.pdf
%X
The lossy JPEG compression scheme employs a quantization table that controls the amount of compression achieved. Because different cameras typically employ different tables, a comparison of an image's quantization scheme to a database of known cameras affords a simple technique for confirming or denying an image's source. This report describes the analysis of quantization tables extracted from 1,000,000 images downloaded from Flickr.com.
%T Functional Monitoring Without Monotonicity
%A Chrisil Arackaparambil
%A Joshua Brody
%A Amit Chakrabarti
%R Technical Report TR2008-639
%I Dartmouth College, Computer Science
%C Hanover, NH
%D December 2008
%U http://www.cs.dartmouth.edu/reports/TR2008-639.pdf
%X
The notion of distributed functional monitoring was recently introduced by Cormode, Muthukrishnan and Yi to initiate a formal study of the communication cost of certain fundamental problems arising in distributed systems, especially sensor networks. In this model, each of k sites reads a stream of tokens and is in communication with a central coordinator, who wishes to continuously monitor some function f of \sigma, the union of the k streams. The goal is to minimize the number of bits communicated by a protocol that correctly monitors f(\sigma), to within some small error. As in previous work, we focus on a threshold version of the problem, where the coordinator's task is simply to maintain a single output bit, which is 0 whenever f(\sigma) \leq \tau(1 - \epsilon) and 1 whenever f(\sigma) \geq \tau. Following Cormode et al., we term this the (k, f, \tau, \epsilon) functional monitoring problem. In previous work, some upper and lower bounds were obtained for this problem, with f being a frequency moment function, e.g., F_0, F_1, F_2. Importantly, these functions are monotone. Here, we further advance the study of such problems, proving three new classes of results. First, we prove new lower bounds on this problem when f = F_p, for several values of p. Second, we study the effect of non-monotonicity of f on our ability to give nontrivial monitoring protocols, by considering f = F_p with deletions allowed, as well as f = H, the empirical Shannon entropy of a stream. Third, we provide nontrivial monitoring protocols when f is either H, or any of a related class of entropy functions (Tsallis entropies). These are the first nontrivial algorithms for distributed monitoring of non-monotone functions.
%T Authenticated Streamwise On-line Encryption
%A Patrick P. Tsang
%A Rouslan V. Solomakhin
%A Sean W. Smith
%R Technical Report TR2009-640
%I Dartmouth College, Computer Science
%C Hanover, NH
%D March 2009
%U http://www.cs.dartmouth.edu/reports/TR2009-640.pdf
%X
In Blockwise On-line Encryption, encryption and decryption return an output block as soon as the next input block is received. In this paper, we introduce Authenticated Streamwise On-line Encryption (ASOE), which operates on plaintexts and ciphertexts as streams of arbitrary length (as opposed to fixed-sized blocks), and thus significantly reduces message expansion and end-to-end latency. Also, ASOE provides data authenticity as an option. ASOE can therefore be used to efficiently secure resource-constrained communications with real-time requirements such as those in the electric power grid and wireless sensor networks. We investigate and formalize ASOE's strongest achievable notion of security, and present a construction that is secure under that notion. An instantiation of our construction incurs zero end-to-end latency due to buffering and only 48 bytes of message expansion, regardless of the plaintext size.
%T A Combined Routing Method for Ad Hoc Wireless Networks
%A Soumendra Nanda
%A Zhenhui Jiang
%A David Kotz
%R Technical Report TR2009-641
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2009
%U http://www.cs.dartmouth.edu/reports/TR2009-641.pdf
%X
Several simulation and real world studies show that certain ad hoc routing protocols perform better than others under specific mobility and traffic patterns. In order to exploit this phenomenon, we propose a novel approach to adapt a network to changing conditions; we introduce "a combined routing method" that allows the network to seamlessly swap from one routing protocol to another protocol dynamically, while routing continues uninterrupted. By creating a thin new virtual layer, we enable the nodes in the ad hoc wireless network to notify each other about the protocol swap, and we do not make any changes to existing routing protocols. To ensure that routing works efficiently after the protocol swap, we reuse information from the previous protocol's routing table while initializing the data structures for the new routing protocol. We study the feasibility of our technique and the overheads incurred while swapping between AODV, ODMRP and APRL under different network topologies and traffic patterns through detailed simulations. Our results show that the swap latency is related to the nature of the destination protocol and the topology of the network. We also find that the control packet ratio of a routing protocol during and after a swap is close to that of the protocol running before a swap, thus indicating that our approach does not add excessive overhead.
%Z
This report is an updated version of TR2007-588.
%T Approximability of the Unsplittable Flow Problem on Trees
%A Chrisil Arackaparambil
%A Amit Chakrabarti
%A Chien-Chung Huang
%R Technical Report TR2009-642
%I Dartmouth College, Computer Science
%C Hanover, NH
%D March 2009
%U http://www.cs.dartmouth.edu/reports/TR2009-642.pdf
%X
We consider the approximability of the Unsplittable Flow Problem (UFP) on tree graphs, and give a deterministic quasi-polynomial time approximation scheme for the problem when the number of leaves in the tree graph is at most poly-logarithmic in $n$ (the number of demands), and when all edge capacities and resource requirements are suitably bounded. Our algorithm generalizes a recent technique that obtained the first such approximation scheme for line graphs. Our results show that the problem is not APX-hard for such graphs unless NP \subseteq DTIME(2^{polylog(n)}). Further, a reduction from the Demand Matching Problem shows that UFP is APX-hard when the number of leaves is Omega(n^\epsilon) for any constant \epsilon > 0. Together, the two results give a nearly tight characterization of the approximability of the problem on tree graphs in terms of the number of leaves, and show the structure of the graph that results in hardness of approximation.
%T Dynamic Universal Accumulators for DDH Groups and Their Application to Attribute-Based Anonymous Credential Systems
%A Man Ho Au
%A Patrick P. Tsang
%A Willy Susilo
%A Yi Mu
%R Technical Report TR2009-643
%I Dartmouth College, Computer Science
%C Hanover, NH
%D April 2009
%U http://www.cs.dartmouth.edu/reports/TR2009-643.pdf
%X
We present the first dynamic universal accumulator that allows (1) the accumulation of elements in a DDH-hard group G and (2) one who knows x such that y=g^x has --- or has not --- been accumulated, where g generates G, to efficiently prove her knowledge of such x in zero knowledge, and hence without revealing, e.g., x or y. We introduce the Attribute-Based Anonymous Credential System (ABACS), which allows the verifier to authenticate anonymous users according to any access control policy expressible as a formula of possibly negated boolean user attributes. We construct the system from our accumulator.
%Z
This report is the extended version of the paper to appear in CT-RSA '09 under the same title.
%T Autoscopy: Detecting Pattern-Searching Rootkits via Control Flow Tracing %A Ashwin Ramaswamy %R Technical Report TR2009-644 %I Dartmouth College, Computer Science %C Hanover, NH %D May 2009 %U http://www.cs.dartmouth.edu/reports/TR2009-644.pdf %X Traditional approaches to rootkit detection assume the execution of code at a privilege level below that of the operating system kernel, with the use of virtual machine technologies to enable the detection system itself to be immune from the virus or rootkit code. In this thesis, we approach the problem of rootkit detection from the standpoint of tracing and instrumentation techniques, which work from within the kernel and also modify the kernel's run-time state to detect aberrant control flows. We wish to investigate the role of emerging tracing frameworks (Kprobes, DTrace etc.) in enforcing operating system security without the reliance on a full-blown virtual machine just for the purposes of such policing. We first build a novel rootkit prototype that uses pattern-searching techniques to hijack hooks embedded in dynamically allocated memory, which we present as a showcase of emerging attack techniques. We then build an intrusion detection system-- autoscopy, atop kprobes, that detects anomalous control flow patterns typically exhibited by rootkits within a running kernel. Furthermore, to validate our approach, we show that we were able to successfully detect 15 existing Linux rootkits. We also conduct performance analyses, which show the overhead of our system to range from 2% to 5% on a wide range of standard benchmarks. Thus by leveraging tracing frameworks within operating systems, we show that it is possible to introduce real-world security in devices where performance and resource constraints are tantamount to security considerations. %Z M.S. Thesis. Advisor: Sean W. Smith. %T Automated Tracking of Dividing Nuclei in Microscopy Videos of Living Cells %A Evan L. 
Tice %R Technical Report TR2009-645 %I Dartmouth College, Computer Science %C Hanover, NH %D June 2009 %U http://www.cs.dartmouth.edu/reports/TR2009-645.pdf %X Many cell biologists perform analysis of multinucleated cell data in order to better under- stand the mechanisms that regulate cell division. Sbalzarini, et al., have developed methods for automatically tracking nuclei in cell data in order to aid in this time-consuming analysis. In this paper, we present an implementation of the Sbalzarini tracking algorithm, introduce a new algorithm we developed which is able to identify mitosis events, and present other software tools we have developed to aid in the automated detection of nucleus data. %Z Senior Honors Thesis. Advisors: Alex Barnett, Amy Gladfelter, Hany Farid %T An Information Complexity Approach to the Inner Product Problem %A Amit Chakrabarti %A William B. Henderson-Frost %A Ranganath Kondapally %R Technical Report TR2009-646 %I Dartmouth College, Computer Science %C Hanover, NH %D June 2009 %U http://www.cs.dartmouth.edu/reports/TR2009-646.pdf %X We prove a lower bound of the randomized communication complexity of the inner product function on the uniform distribution. %Z Senior Honors Thesis. Advisor: Amit Chakrabarti. %T Hawk: 3D Gestured-Based Interactive Bird Flight Simulation %A Thomas Y. Eastman %R Technical Report TR2009-647 %I Dartmouth College, Computer Science %C Hanover, NH %D June 2009 %U http://www.cs.dartmouth.edu/reports/TR2009-647.pdf %X Control interfaces provide the most tangible connection between human users and computer software. This link is especially important in interactive real-time applications, like games and simulations, because users desire efficient controls that allow them to maximize their interactivity and immersion with the software. Traditionally, interfaces have been largely limited to keyboards and mice. 
Recently, however, technological advances have made motion-sensitive devices not only available to mainstream consumers but have also lifted restrictions limiting devices to two-dimensional motion. This work presents a 3-dimensional motion-sensitive interface alongside a natural application. Players can control a soaring red-tailed hawk and perform various intuitive flight maneuvers using two Nintendo Wii Remotes (Wiimotes). %Z Senior Honors Thesis. Advisor: Fabio Pellacini. %T Surface Reconstruction through Time %A LeeAnn T. Brash %R Technical Report TR2009-648 %I Dartmouth College, Computer Science %C Hanover, NH %D June 2009 %U http://www.cs.dartmouth.edu/reports/TR2009-648.pdf %X Surface reconstruction is an area of computational geometry that has been progressing rapidly over the last decade. Current algorithms and their implementations can reconstruct surfaces from a variety of input and the accuracy and precision improve with each new development. These all make use of various heuristics to achieve a reconstruction. Much of this work consists of reconstructing a still object from point samples taken from the object's surface. We examine reconstructing an n-dimensional object and its motion by treating time as an (n + 1)st axis. Our input consists of (n-1)-dimensional scans taken over time and at di?erent positions on the original object. This input is mapped into (n + 1) dimensions where the (n + 1)st dimension is a scaled time axis and then fed into an existing surface reconstruction algorithm. A cross section of the reconstructed surface perpendicular to the time axis yields an approximation to the shape of the n-dimensional surface at the corresponding point in time. The intended application for this work is the reconstruction of medical images from scanning technology such as MRI or CT into moving 3d surfaces. We investigate reconstructing 2d moving surfaces through time as a preliminary step towards the moving 3d problem. 
We spend most of our efforts in this thesis on the problem of computing a scaling factor for mapping time into the (n + 1)st axis to minimize the number of scans needed to meet the sampling requirements for an existing surface reconstruction algorithm. We give three bounds, based on features of the 2d moving object, that are necessary to accomplish this. %Z M.S. Thesis. Advisor: Scot Drysdale. %T A Computational Framework for Certificate Policy Operations %A Gabriel A. Weaver %A Scott Rea %A Sean W. Smith %R Technical Report TR2009-650 %I Dartmouth College, Computer Science %C Hanover, NH %D June 2009 %U http://www.cs.dartmouth.edu/reports/TR2009-650.pdf %X The trustworthiness of any Public Key Infrastructure (PKI) rests upon the expectations for trust, and the degree to which those ex- pectations are met. Policies, whether implicit as in PGP and SDSI/SPKI or explicitly required as in X.509, document expectations for trust in a PKI. The widespread use of X.509 in the context of global e-Science infrastructures, financial institutions, and the U.S. Federal government demands efficient, transparent, and reproducible policy decisions. Since current manual processes fall short of these goals, we designed, built, and tested computational tools to process the citation schemes of X.509 certificate policies defined in RFC 2527 and RFC 3647. Our PKI Policy Repository, PolicyBuilder, and PolicyReporter improve the consistency of certificate policy operations as actually practiced in compliance au- dits, grid accreditation, and policy mapping for bridging PKIs. Anecdotal and experimental evaluation of our tools on real-world tasks establishes their actual utility and suggests how machine-actionable policy might empower individuals to make informed trust decisions in the future. %T Developing an Improved, Web-Based Classroom Response System with Web Services %A Oleg B. 
Seletsky %R Technical Report TR2009-651 %I Dartmouth College, Computer Science %C Hanover, NH %D June 2009 %U http://www.cs.dartmouth.edu/reports/TR2009-651.pdf %X Classroom Response Systems (CRS) are an in-class technology used to poll students and instantly display an aggregate representation of their responses. CRS have been around since the 1970s and have become increasingly more popular in higher education lecture halls. Even though technology, specifically computers and communications, has improved significantly since the 1970s, CRS have remained surprisingly unchanged. The purpose of this project was to develop an innovative web-based CRS using web services. The web-based aspect utilizes Dartmouth's wireless campus while the web services back-end makes the product more extensible. Lastly, we added a set of out-of-class learning tools for students as well as an in-class tool called the Confusion Meter to enhance student-to-instructor communication. With these features, our goal was to create a free, open-source system that enhances the teaching and learning experience and remains extensible and developer-friendly, unlike any commercial CRS currently available. %Z Senior Honors Thesis. Advisor: Sean W. Smith %T The Effects of Introspection on Computer Security Policies %A Stephanie A. Trudeau %R Technical Report TR2009-652 %I Dartmouth College, Computer Science %C Hanover, NH %D June 2009 %U http://www.cs.dartmouth.edu/reports/TR2009-652.pdf %X What does it mean to be an expert? And what makes an expert more capable than a non-expert when it comes to evaluating and articulating their impressions about something as commonly practiced as food tasting? How do we explain those behaviors that humans perform very well, but don't quite know why? Studies have shown that there exists a class of activities that we as humans execute well intuitively, but that we perform much worse upon introspection. 
Evidence supports the claim that the act of introspection actually causes us to do more poorly at these tasks. My goal is to apply this idea to computer security. At present, designs for most security policy interfaces leave much to be desired. This lack of usability leaves these systems in need of improvement, possibly causing users to become more vulnerable than they otherwise would have been. My research includes a user study on the privacy policies of the interface for a social networking website similar to Facebook. Evidence from the study supports the claim that the act of introspecting upon one's personal security policy actually makes one worse at making policy decisions. %Z Senior Honors Thesis. Advisor: Sean Smith, Scout Sinclair. %T Maze Game for the Macintosh %A Mark Sherman %R Technical Report DCS-TR86-101 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %X Maze is a multiperson game using a broadcast protocol on the Macintosh. This disk contains the sources (in TML Pascal) for a Maze game. %Z You may download this disk as a Zip archive. %T Instructions for Using Logic %A John W. Scott %R Technical Report PCS-TR86-102 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %U http://www.cs.dartmouth.edu/reports/TR86-102.pdf %X This document provides instructions on the use of "Logic", a digital logic simulator for the Macintosh written for Dartmouth students in the Machine Architecture and Digital Electronics courses. %T Digital Logic Simulator %A John W. Scott %R Technical Report DCS-TR86-103 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %X Logic is a Macintosh program that simulates discrete, digital logic circuits. The two disks contain an executable version of the program, MacApp sources and brief documentation (which is also available separately in hardcopy). %Z You may download this disk as a Zip archive. 
%T Uses of Generics in Ada %A Mark Sherman %R Technical Report PCS-TR86-104 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %U http://www.cs.dartmouth.edu/reports/TR86-104.pdf %X This paper discusses how Ada generic procedures and packages can be used in novel ways to provide general initialization of records, expressions using discriminants, record field hiding and explicit control of parameter binding. %T Parallel Accessible Memory %A Shinji Nakamura %R Technical Report PCS-TR86-105 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %U http://www.cs.dartmouth.edu/reports/TR86-105.pdf %X A new design of a memory device which allows simultaneous access to more than one location is proposed. The unique feature of this multiple accessibility of the memory is realized by applying a binomial concentrator, a type of sparse crossbar interconnection network, to content-addressable memory. The organization of the memory system and the concentration network structure as well as the network characteristics are described along with a distributive control algorithm. Applications of the memory system to parallel processing environments are also included. %T Algorithms for Iterative Array Multiplication %A Shinji Nakamura %R Technical Report PCS-TR86-106 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %U http://www.cs.dartmouth.edu/reports/TR86-106.pdf %X Algorithms for the parallel multiplication of two n bit binary numbers by an iterative array of logic cells are discussed. The regular interconnection structures of the multiplier array cell elements, which are ideal for VLSI implementation, are described. The speed and hardware complexity of two new iterative array algorithms, both of which require n cell delays for one n by n bit multiplication, are compared with a straight-forward iterative array algorithm having a 2n cell delay and its higher radix version having an n cell delay. %T What Should a Discrete Mathematics Course Be? 
%A Kenneth P. Bogart %R Technical Report PCS-TR86-107 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %X This paper presents the results of a survey of mathematics and computer science departments on their needs and intentions for survey courses in discrete mathematics. The paper then presents an idealized syllabus for such a course, recognizing that many topics in the syllabus should be regarded as optional. The paper concludes with a brief summary of the experience at Dartmouth in teaching such a course. %Z This TR appears to have been lost. (April 2008) %T Dartmouth-Smalltalk: An Exercise in Implementation %A Joon Sup Lee %R Technical Report PCS-TR86-108 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %U http://www.cs.dartmouth.edu/reports/TR86-108.pdf %X This report discusses a preliminary implementation of the Smalltalk-80 virtual machine for Vax/Unix. %T MacIP Sources and Applications %A Mark Sherman %A Tim Maroney %R Technical Report DCS-TR86-109 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %X The MacIP set consists of three disks, all single-sided, MFS formatted. Two disks contain the Lisa Pascal/Lisa Assembler sources for the network libraries and application programs; the third disk contains the runnable applications along with an AppleTalk configured system. The current version by Tim Maroney fixes all reported bugs mentioned in the MacIP Technical Report and includes the TCP library and Telnet program. %Z You may download this disk as a Zip archive. %T Stack Package Implementing Tasks and Coroutines for TML Pascal on the Macintosh %A Mark Sherman %R Technical Report DCS-TR86-110 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %X This disk contains the necessary TML Pascal and MDS Assembler files for implementing tasking and coroutines in TML Pascal Programs. Two example programs and brief documentation are provided. MDS format disk. %Z You may download this disk as a Zip archive. 
%T QDT (Quickdraw Terminal) Disks %A Mark Sherman %R Technical Report DCS-TR86-111 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %X This disk set consists of one disk containing a running version of QDT over a serial line, one disk containing a running version of QDT over Apple Talk (using KSP), and one disk containing the Lisa Pascal sources for creating QDT (either version). %Z You may download this disk as a Zip archive. %T QDT (Quickdraw Terminal) Libraries %A Mark Sherman %A Ann Marks %R Technical Report TCS-TR86-112 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %X This Unix tar-tape contains the C libraries used on Unix and the PL/1 Libraries used on DCTS for communicating with QDT. (No longer available.) %T A Collection of MDS Example Programs %A Joe Bergin %A Robert L. Scot Drysdale %A Jerry Keough %A Larry Gallagher %A Vivian Sewelson %A Mark Sherman %R Technical Report DCS-TR86-113 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %X This MFS formatted disk contains a collection of example programs written for teaching machine architecture concepts using the Macintosh Development System. Sample programs illustrate how instructions can be treated as data, computer arithmetic, looping features, interrupt handling, coroutines, recursive subprograms and stack frames, traps and direct manipulation of the screen. In addition, a supervisor program is provided that implements basic, line-oriented input and output facilities for the 68000 assembly language programs on the Macintosh. %Z You may download this disk as a Zip archive. %T Music Editor %A Ed Grosz %R Technical Report DCS-TR86-114 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %X This disk contains a simple music editor that creates resources that can be used directly by the Macintosh Sound Driver. The MFS formatted disk contains both the runnable program and the Lisa Object Pascal/MacApp Sources. %Z You may download this disk as a Zip archive. 
%T KD Tree Simulator %A Ed Grosz %R Technical Report DCS-TR86-115 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %X This MFS formatted disk contains an animation program for illustrating KD-Trees (D-dimensional search trees) (actually, only 2 dimensions are shown). Both the runnable application and the Lisa Object Pascal/MacApp sources are included. %Z This diskette appears to have been lost (April 2008). %T A Practical, Distributed Environment for Macintosh Software Development %A Mark Sherman %A Ann Marks %A Rob Collins %A Heather Anderson %A Jerry Godes %A Denis Devlin %A Leonid Spector %A Vivian Sewelson %R Technical Report PCS-TR86-116 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %U http://www.cs.dartmouth.edu/reports/TR86-116.pdf %X We describe a development environment we created for prototyping software for the Macintosh. The programs are developed and executed on a large time-shared computer but can use the full facilities of the Macintosh. By using this system, we combine the advantages of the large system, such as large amounts of disk storage and automatic file backups, with the advantages of the Macintosh, such as advanced graphics, mouse control and sound synthesis. We also describe several projects that used the distributed development system. We conclude with a description of our future plans for this environment. %T View-3 and Ada: Tools for Building Systems with Many Tasks %A Ann Kratzer %A Mark Sherman %R Technical Report PCS-TR86-117 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %U http://www.cs.dartmouth.edu/reports/TR86-117.pdf %X This paper discusses some useful features for tools that are intended to be used for developing systems with multiple tasks. We include a description of one tool that has been built, View-3. We also describe some problems that might be encountered when trying to fit this kind of tool into an APSE system. 
%T A Distributed Strategy for Resource Allocation %A Ann Kratzer %R Technical Report PCS-TR86-118 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %U http://www.cs.dartmouth.edu/reports/TR86-118.pdf %X In this paper, we present a decentralized algorithm for determining how resources should be allocated to sites. This algorithm is general in the sense that it can be used for allocating different kinds of resources. This algorithm can be applied either statically, used on a quiescent system, or used dynamically as the system runs. Throughout, we consider only systems in which a resource has a fixed cost associated with it regardless of where it may be located (i.e. the system is homogeneous). %T Task Queues: A General Model for the Implementation of Communications Protocols %A Ann Kratzer %R Technical Report PCS-TR86-119 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %U http://www.cs.dartmouth.edu/reports/TR86-119.pdf %X When any computer communications network is built, its communications protocol must always be implemented. The protocol is implemented on the switching nodes of the network. The node software must respond in real time to events generated external to the switching node. Thus, the software running on a switching node constitutes a concurrent program; this complicates the design, implementation and testing of the switching node software. The task queue model presented in this paper defines a structure for this software that facilitates the design, implementation and testing of communications protocols. %T View 3: A Programming Environment for Distributed Programming %A Ann Kratzer %R Technical Report PCS-TR86-120 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %U http://www.cs.dartmouth.edu/reports/TR86-120.pdf %X View 3 is an experimental programming environment to support the development and use of distributed programs. It builds upon three major concepts. 
First, distributed programs and distributed processes are basic objects. Second, the port mechanism allows a process to exchange information with another process, a file or an I/O device without concern for the type of the object on the other end of the port. Third, test and use of distributed programs are facilitated by the user interface program screen format that allows the user to control both the format and contents of the physical terminal. %T Creating Havoc: Havoc Development Program %A David Cohn %A Stephen Madancy %R Technical Report PCS-TR86-121 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %U http://www.cs.dartmouth.edu/reports/TR86-121.pdf %X One area where use of the computer is essential is in the modern scientific laboratory. High speed computation, data storage and data analysis enable scientists to perform experiments that would otherwise be impractical. A problem inherent to the effective use of special purpose laboratory computers, however, is the fact that this equipment has generally been developed for highly specific uses, and has either tried to cope with existing high-level languages or has abandoned the attempt and required the user to program in a low-level assembly or machine language. Our idea was to design, develop and implement a programming language that is suited to the needs of a laboratory scientist. Our results have led us to believe that the best way to achieve our goals was using an interpretive/compiled programming environment (similar in spirit to FORTH) in which large programs could be built in small, coherent pieces, that could easily be tested on as high or low a level as the programmer desired. Our language, Havoc, adheres to these principles while providing many of the more widespread and useful language features not found in FORTH. Besides giving it motivation, this preliminary report describes the current design and implementation status of the HAVOC system. 
The current version of the HAVOC system is available for the Macintosh. %T Havoc V.85 Software %A David Cohn %A Stephen Madancy %R Technical Report DCS-TR86-122 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %X This disk contains alpha release 85 of the HAVOC system, with examples. It is written in C and 68000 assembler. %Z You may download this disk as a Zip archive. %T BRUCE: A Graphics System with Hidden Line and Hidden Surface Algorithms %A Keith Vetter %A Christopher Roche %R Technical Report PCS-TR86-123 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %U http://www.cs.dartmouth.edu/reports/TR86-123.pdf %X Accurately representing the physical world by computer is a topic which has direct benefits to fields like chemistry and architecture, and is a source of much research in computer science. This paper examines the steps necessary to develop and implement a graphical system that will allow for the modeling of physical world objects. In particular, this is a description of BRUCE: a graphical system that will describe a world of three dimensional polyhedra, implementing algorithms for hidden line and hidden surface removal. This paper also deals with the problems incurred along the way and suggestions for further improvement of BRUCE. %T A Network Package for the Macintosh Using the DoD Internet Protocols %A Mark Sherman %R Technical Report PCS-TR86-124 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %U http://www.cs.dartmouth.edu/reports/TR86-124.pdf %X This memo describes the design, implementation and use of the MacIP libraries and programs for running DoD Internet Protocols on the Macintosh over Apple Talk. It is preliminary documentation and out of date, but it is all that exists. 
%T Functions Returning Values of Dynamic Size %A Mark Sherman %A Andy Hisgen %A Jonathan Rosenberg %A David Alex Lamb %R Technical Report PCS-TR86-125 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %U http://www.cs.dartmouth.edu/reports/TR86-125.pdf %X Modern programming languages, such as Ada (Ichbiah 80), permit the definition of functions that return values whose size can not be determined until the function returns. This paper discusses five implementation techniques that can be used to implement this capability. Comparisons of the techniques are provided and guidelines for selecting a particular technique for a compiler are given. %T Using Low-Cost Workstations to Investigate Computer Networks and Distributed Systems %A Mark Sherman %A Ann Marks %R Technical Report PCS-TR86-126 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %U http://www.cs.dartmouth.edu/reports/TR86-126.pdf %X A quality education in contemporary computer science requires that students gain experience with realistic systems. Many efforts at bringing empirical computer science to undergraduates focus on rather old technologies, for example, building a compiler or simulating a disk scheduler. Although efforts are being made to use some newer technologies, the efforts are concentrating on teaching traditional material in a new medium. However, the medium itself -- networked workstations in a server environment -- is worthy of exploration by undergraduate students in a laboratory setting. At Dartmouth, we developed a Computer Network Laboratory to let students experiment with computer networks, protocols and distributed systems. Through this article, we wish to share our experiences in the design of the laboratory and give an example of how the laboratory was used in a computer network course. %T Binary Trees (v. 
2.1 -- September 25, 1985) %A John Glenn %R Technical Report DCS-TR86-127 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %X Binary Tree is a program that uses animation to illustrate insertion, deletion and searching of nodes in a binary search tree. Pre-order, in-order, and post-order tree walks are supported. Binary Tree can also perform a recursive descent parse of arithmetic expressions and show the resulting abstract syntax tree. Lisa Pascal source and application available on a single-sided, MFS format microdisk. %Z You may download this disk as a Zip archive. %T An Image Processing Software Package for the Laser Scanning Phase Modulation Microscope %A William J. Murray %R Technical Report PCS-TR86-128 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %U http://www.cs.dartmouth.edu/reports/TR86-128.pdf %X This thesis documents the most recent effort to develop a user-friendly image processing software package for the Laser Scanning Phase Modulation Microscope (LSPMM). The LSPMM is composed of three integrated subsystems, the Laser Scanning (LS) system, the Phase Modulation (PM) system, and Digital Image Acquisition (DIA) system. Under the control of the image processing software, the DIA system can receive and store the digital image data, display the image on a monochrome monitor, and process the image to provide the microscopist with quantitative information regarding the image. The implementation of this image processing software package required the specification of a four level software hierarchy to serve as an organizational framework, with the highest level interacting with the LSPM microscopist, and the lowest level performing hardware control. This framework should prove useful for the development and implementation of additional software in the future. The programs that were developed accept command line arguments; however, most will interactively query the user if the command line arguments are not known. 
This software provides the microscopist with the capability to scan, save, and display a 512 by 512 pixel image. The image may be scanned to, saved from, or displayed in either of the two DeAnza image display memory planes. Considerable effort has been made to incorporate all of the devices useful for image processing into a single operating system kernel. This alleviates the problem of taking down one operating system and bringing up another version in order to dump image files on magnetic tape. %T An Algorithm for Resource Allocation Requiring Low Overhead Communication %A Ann Marks %R Technical Report PCS-TR86-129 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %U http://www.cs.dartmouth.edu/reports/TR86-129.pdf %X A heuristic algorithm for allocating resource units to sites in a distributed system is presented. Starting with a given allocation of sites, the algorithm performs a series of optimizations involving pairs of sites in an attempt to improve the worst pair-wise imbalance present in the system; termination occurs when no further improvement is possible. After outlining the general form of the algorithm, which effectively defines an entire family of algorithms, we present theoretical results that speak to the performance of the algorithm as measured in the number of optimizations that can be done, the amount of control communication required and the worst case imbalance of the resulting allocation. Subsequently, two particular algorithms in the family are given and the results of a simulation study of their performance are presented. %T Finding Largest Empty Circles with Location Constraints %A L. Paul Chew %A Robert L. Scot Drysdale %R Technical Report PCS-TR86-130 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %U http://www.cs.dartmouth.edu/reports/TR86-130.pdf %X Let S be a set of n points in the plane and let CH(S) represent the convex hull of S. 
The Largest Empty Circle (LEC) problem is the problem of finding the largest circle centered within CH(S) such that no point of S lies within the circle. Shamos and Hoey [SH75] outlined an algorithm for solving this problem in time O(n log n) by first computing the Voronoi diagram, V(S), in time O(n log n), then using V(S) and CH(S) to compute the largest empty circle in time O(n). In a recent paper [Tou83], Toussaint pointed out some problems with the algorithm as outlined by Shamos and presented an algorithm which, given V(S) and CH(S), solves the LEC problem in time O(n log n). In this note we show that Shamos' original claim was correct: given V(S) and CH(S), the LEC problem can be solved in time O(n). More generally, given V(S) and a convex k-gon P, the LEC centered within P can be found in time O(k+n). We also improve on an algorithm given by Toussaint for computing the LEC when the center is constrained to lie within an arbitrary simple polygon. Given a set S of n points and an arbitrary simple k-gon P, the largest empty circle centered within P can be found in time O(kn + n log n). This becomes O(kn) if the Voronoi diagram of S is already given. %T The Pairwise Intersection Problem for Monotone Polygons %A David B. Levine %R Technical Report PCS-TR86-131 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %U http://www.cs.dartmouth.edu/reports/TR86-131.pdf %X Geometric intersection problems arise in a number of areas of computer science including graphics and VLSI design rule checking. Previous work has concentrated on solving the pairwise intersection problem for line segments and iso-oriented rectangles. This thesis extends that work by presenting efficient algorithms to solve the pairwise intersection problem for monotone polygons. For general segments, the problem has been solved in O((N+I)*logN) time using a sweeping line technique, where N is the number of segments and I is the number of intersections reported. 
We combine this technique with approaches taken to solve the iso-oriented rectangle problem to yield an algorithm which solves the pairwise intersection problem for monotone polygons in the same asymptotic time. In addition, there are certain classes of line segments for which the pairwise intersection problem may be solved in O(N*logN + I) time, the best possible. We generalize each such class of line segments to a class of polygons and present algorithms to solve the associated polygon problem. Finally, we discuss the impacts which possible improvements to the line segment problem would have on our results. %T Voronoi Diagrams Based on Convex Distance Functions %A L. Paul Chew %A Robert L. Scot Drysdale %R Technical Report PCS-TR86-132 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %X We present an "expanding waves" view of Voronoi diagrams that allows such diagrams to be defined for very general metrics and for distance measures that do not qualify as metrics. If a pebble is dropped into a still pond, circular waves move out from the point of impact. If n pebbles are dropped simultaneously, the places where wave fronts meet define the Voronoi diagram on the n points of impact. The Voronoi diagram for any normed metric, including the Lp metrics, can be obtained by changing the shape of the wave front from a circle to the shape of the "circle" in that metric. (For example, the "circle" in the L1 metric is diamond shaped.) For any convex wave shape there is a corresponding convex distance function. Even if the shape is not symmetric about its center (a triangle, for example), although the resulting distance function is not a metric, it can still be used to define a Voronoi diagram. Like Voronoi diagrams based on the Euclidean metric, the Voronoi diagrams based on other normed metrics can be used to solve various closest-point problems (all-nearest-neighbors, minimum spanning trees, etc.). 
Some of these problems also make sense for convex distance functions which are not metrics. In particular, the "largest empty circle" problem becomes the "largest empty convex shape" problem, and "motion planning for a disc" becomes "motion planning for a convex shape". These problems can both be solved quickly given the Voronoi diagram. We present an asymptotically optimal algorithm for computing Voronoi diagrams based on convex distance functions. %Z This TR appears to have been lost. (April 2008) %T Producing Software Using Tools in a Workstation Environment %A Mark Sherman %A Robert L. Scot Drysdale %R Technical Report PCS-TR86-134 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %U http://www.cs.dartmouth.edu/reports/TR86-134.pdf %X We discuss how we taught students to build and use translation, interpretive, editing and monitoring tools in an undergraduate software engineering course. Students used the tools on low-cost workstations (Macintoshes) to build large, group projects. The students' projects used all available features of workstation environments, including graphics, windows, fonts, mice, networks, and sound generators. We found that 1) the use of tools increased student productivity, 2) a shift in data structure and algorithm topics is needed to cover material relevant for workstation environments, 3) new topics in system design are required for a workstation environment, 4) traditional material can be easily illustrated with a workstation environment and 5) students enjoyed being able to manipulate the advanced features of workstations in their work, which in turn increased their motivation for and concentration on the course material. %T Making Mail Friendlier: Adding Macintosh features and multimedia documents in UNIX mail %A John R. 
Meier %R Technical Report PCS-TR86-135 %I Dartmouth College, Computer Science %C Hanover, NH %D 1986 %U http://www.cs.dartmouth.edu/reports/TR86-135.pdf %X This paper describes a Macintosh application which acts as a front-end to Unix mail. Features of the Macintosh interface such as icons, menus, and windows replace the command-driven interface. Complicated editing commands are replaced with mouse selection and cut, copy, and paste. Messages can be composed of text, pictures, and any Macintosh file, because they are encoded into plain text, sent through the mail system, and then decoded by the receiving end. The designs of the mail server and communications interface are such that mail servers and communications other than Unix mail and a serial line may be easily implemented. %T Learning Object-Centered Representations %A Peter A. Sandon %R Technical Report PCS-TR87-139 %I Dartmouth College, Computer Science %C Hanover, NH %D 1987 %Z This TR number is incorrect. Please see TR88-139. %T The Sharing of Rights and Information in a Capability-Based Protection System %A Matt Bishop %R Technical Report PCS-TR88-136 %I Dartmouth College, Computer Science %C Hanover, NH %D 1988 %U http://www.cs.dartmouth.edu/reports/TR88-136.pdf %X The paper examines the question of sharing of rights and information in the Take-Grant Protection Model by concentrating on the similarities between the two; in order to do this, we state and prove new theorems for each that specifically show the similarities. The proof for one of the original theorems is also provided. These statements of necessary and sufficient conditions are contrasted to illustrate the proposition that transferring rights and transferring information are fundamentally the same, as one would expect in a capability-based system. We then discuss directions for future research in light of these results. 
%T Theft of Information in the Take-Grant Protection Model %A Matt Bishop %R Technical Report PCS-TR88-137 %I Dartmouth College, Computer Science %C Hanover, NH %D 1988 %U http://www.cs.dartmouth.edu/reports/TR88-137.pdf %X (Revised 5/90). Questions of information flow are in many ways more important than questions of access control, because the goal of many security policies is to thwart the unauthorized release of information, not merely the illicit obtaining of access rights to that information. The Take-Grant Protection Model is an excellent theoretical tool for examining such issues because conditions necessary and sufficient for information to flow between two objects, and for rights to objects to be obtained or stolen, are known. In this paper we extend these results by examining the question of information flow from an object the owner of which is unwilling to release that information. Necessary and sufficient conditions for such "theft of information" to occur are derived, and bounds on the number of subjects that must take action for the theft to occur are presented. To emphasize the usefulness of these results, the security policies of complete isolation, transfer of rights with the cooperation of an owner, and transfer of information (but not rights) with the cooperation of the owner are presented; the last is used to model a simple reference monitor guarding a resource. %T An Application of a Fast Data Encryption Standard Implementation %A Matt Bishop %R Technical Report PCS-TR88-138 %I Dartmouth College, Computer Science %C Hanover, NH %D 1988 %U http://www.cs.dartmouth.edu/reports/TR88-138.pdf %X The Data Encryption Standard is used as the basis for the UNIX password encryption scheme. Some of the security of that scheme depends on the speed of the implementation. 
This paper presents a mathematical formulation of a fast implementation of the DES in software, discusses how the mathematics can be translated into code, and then analyzes the UNIX password scheme to show how these results can be used to implement it. Experimental results are provided for several computers to show that the given method speeds up the computation of a password by roughly 20 times (depending on the specific computer). %T Learning Object-Centered Representations %A Peter A. Sandon %R Technical Report PCS-TR88-139 %I Dartmouth College, Computer Science %C Hanover, NH %D 1988 %U http://www.cs.dartmouth.edu/reports/TR88-139.pdf %X When we look at a familiar object from a novel viewpoint, we are usually able to recognize it. In this thesis, we address the problem of learning to recognize objects under transformations associated with viewpoint. Our vision model combines a hierarchical representation of shape features with an explicit representation of the transformation. Shape features are represented in a layered pyramid-shaped subnetwork, while the transformation is explicitly represented in an auxiliary subnetwork. The two connectionist networks are conjunctively combined to allow object-centered shape features to be computed in the upper layers of the network. A simulation of a 2-D translation subnetwork demonstrates the ability to learn to recognize shapes in different locations in an image, such that those same shapes can be recognized in novel locations. Two new learning methods are presented, which provide improved behavior over previous backpropagation methods. Both methods involve competitive interactions among clusters of nodes. The new learning methods demonstrate improved learning over the generalized delta rule when applied to a number of network tasks. In the first method, called error modification, competition is based on the error signals computed from the gradient of the output error. 
The result of this competition is a set of modified error signals representing a contrast-enhanced version of the original errors. The error modification method reduces the occurrence of network configurations that correspond to local error minima. In the second method, called error augmentation, competition is based on the activations of the nodes in the cluster. Network changes resulting from this competition augment those specified by the error gradient computation. This competition is implemented by the trace comparison rule, a new self-organizing mechanism that is effective in developing highly discriminating features within the cluster. The error augmentation method improves learning in the lower network layers when the backpropagated error is weak. %T On the Worst Case of Three Algorithms for Computing the Jacobi Symbol %A Jeffrey Shallit %R Technical Report PCS-TR89-140 %I Dartmouth College, Computer Science %C Hanover, NH %D 1989 %U http://www.cs.dartmouth.edu/reports/TR89-140.pdf %X We study the worst-case behavior of three iterative algorithms - Eisenstein's algorithm, Lebesgue's algorithm, and the "ordinary" Jacobi symbol algorithm - for computing the Jacobi symbol. Each algorithm is similar in format to the Euclidean algorithm for computing gcd(u,v). %T Asymptotically Fast Algorithms for Spherical and Related Transforms %A James R. Driscoll %A Dennis M. Healy %R Technical Report PCS-TR89-141 %I Dartmouth College, Computer Science %C Hanover, NH %D 1989 %U http://www.cs.dartmouth.edu/reports/TR89-141.pdf %X This paper considers the problem of computing the harmonic expansion of functions defined on the sphere. We begin by proving convolution theorems that relate the convolution of two functions on the sphere to a "multiplication" in the spectral domain, as well as the multiplication of two functions on the sphere to a "convolution" in the spectral domain. These convolution theorems are then used to develop a sampling theorem on the sphere.
%T A Comparison of Consistency Control Protocols %A Michael Goldweber %A Donald B. Johnson %A Larry Raab %R Technical Report PCS-TR89-142 %I Dartmouth College, Computer Science %C Hanover, NH %D 1989 %U http://www.cs.dartmouth.edu/reports/TR89-142.pdf %X In this paper we analyze three protocols for maintaining the mutual consistency of replicated objects in a distributed computing environment and compare their performance with that of an oracle protocol whose performance is optimal. We examine these protocols, two dynamic protocols and the majority consensus protocol, via simulations using two measures of availability. The analysis shows that the dynamic protocols, under realistic assumptions, do not perform significantly better than the static voting scheme. Finally we demonstrate that none of these approaches perform as well as our oracle protocol which is shown to be an upper bound on availability. %T Matching Multiple Patterns From Right to Left %A Samuel W. Bent %R Technical Report PCS-TR90-143 %I Dartmouth College, Computer Science %C Hanover, NH %D 1990 %U http://www.cs.dartmouth.edu/reports/TR90-143.pdf %X We address the problem of matching multiple pattern strings against a text string. Just as the Aho-Corasick algorithm generalizes the Knuth-Morris-Pratt single-pattern algorithm to handle multiple patterns, we exhibit two generalizations of the Boyer-Moore algorithm to handle multiple patterns. In order to obtain worst-case time bounds better than quadratic, our algorithms remember some of the previous history of the matching. %T A Bound of Data Availability when Networks Partition %A Michael Goldweber %A Donald B. 
Johnson %R Technical Report PCS-TR90-145 %I Dartmouth College, Computer Science %C Hanover, NH %D March 1990 %U http://www.cs.dartmouth.edu/reports/TR90-145.pdf %X Many consistency or replication control schemes that increase data availability in distributed systems exist, and the search for improvements continues, though there has been no good nontrivial upper bound demonstrating how much improvement is possible. We present a new upper bound for data availability under replication for general networks. In addition we also describe a new technique that yields near optimal levels of data availability with respect to this bound. %T Planar Graphs and Sparse Graphs from Efficient Motion Planning in the Plane %A L. Paul Chew %R Technical Report PCS-TR90-146 %I Dartmouth College, Computer Science %C Hanover, NH %D 1990 %U http://www.cs.dartmouth.edu/reports/TR90-146.pdf %X Given a source, a destination, and a number of obstacles in the plane, the Motion Planning Problem is to determine the best path to move an object (a robot) from the source to the destination without colliding with any of the obstacles. For us, motion is restricted to the plane, the robot is represented by a point, and the obstacles are represented by a set of polygons with a total of n vertices among all the polygonal obstacles. %T Building Voronoi Diagrams for Convex Polygons in Linear Expected Time %A L. Paul Chew %R Technical Report PCS-TR90-147 %I Dartmouth College, Computer Science %C Hanover, NH %D 1990 %U http://www.cs.dartmouth.edu/reports/TR90-147.pdf %X Let P be a list of points in the plane such that the points of P taken in order form the vertices of a convex polygon. We introduce a simple, linear expected-time algorithm for finding the Voronoi diagram of the points in P. Unlike previous results on expected-time algorithms for Voronoi diagrams, this method does not require any assumptions about the distribution of points.
With minor modifications, this method can be used to design fast algorithms for certain problems involving unrestricted sets of points. For example, fast expected-time algorithms can be designed to delete a point from a Voronoi diagram, to build an order k Voronoi diagram for an arbitrary set of points, and to determine the smallest enclosing circle for points at the vertices of a convex hull. %T There is a Planar Graph Almost as Good as the Complete Graph %A L. Paul Chew %R Technical Report PCS-TR90-148 %I Dartmouth College, Computer Science %C Hanover, NH %D 1990 %U http://www.cs.dartmouth.edu/reports/TR90-148.pdf %X Given a set S of points in the plane, there is a triangulation of S such that a path found within this triangulation has length bounded by a constant times the straight-line distance between the endpoints of the path. Specifically, for any two points a and b of S there is a path along edges of the triangulation with length less than sqrt(10) times [ab], where [ab] is the straight-line Euclidean distance between a and b. The triangulation that has this property is the L1 metric Delaunay triangulation for the set S. This result can be applied to motion planning in the plane. Given a source, a destination, and a set of polygonal obstacles of size n, an O(n) size data structure can be used to find a reasonable approximation to the shortest path between the source and the destination in O(n log n) time. %T Term Reduction Using Directed Congruence Closure %A L. Paul Chew %R Technical Report PCS-TR90-149 %I Dartmouth College, Computer Science %C Hanover, NH %D 1990 %U http://www.cs.dartmouth.edu/reports/TR90-149.pdf %X Many problems in computer science can be described in terms of reduction rules that tell how to transform terms. Problems that can be handled in this way include interpreting programs, implementing abstract data types, and proving certain kinds of theorems.
A term is said to have a normal form if it can be transformed, using the reduction rules, into a term to which no further reduction rules apply. In this paper, we extend the Congruence Closure Algorithm, an algorithm for finding the consequences of a finite set of equations, to develop Directed Congruence Closure, a technique for finding the normal form of a term provided the reduction rules satisfy the conditions for a regular term rewriting system. This technique is particularly efficient because it inherits, from the Congruence Closure Algorithm, the ability to remember all objects that have already been proved equivalent. %T Applying the Take-Grant Protection Model %A Matt Bishop %R Technical Report PCS-TR90-151 %I Dartmouth College, Computer Science %C Hanover, NH %D 1990 %U http://www.cs.dartmouth.edu/reports/TR90-151.pdf %X The Take-Grant Protection Model has in the past been used to model multilevel security hierarchies and simple protection systems. The models are extended to include theft of rights and sharing of information, and additional security policies are examined. The analysis suggests that in some cases the basic rules of the Take-Grant Protection Model should be augmented to represent the policy properly; when appropriate, such modifications are made and their effects with respect to the policy and its Take-Grant representations are discussed. %T A Proactive Password Checker %A Matt Bishop %R Technical Report PCS-TR90-152 %I Dartmouth College, Computer Science %C Hanover, NH %D 1990 %U http://www.cs.dartmouth.edu/reports/TR90-152.pdf %X Password selection has long been a difficult issue; traditionally, passwords are either assigned by the computer or chosen by the user. When the computer does the assignments, the passwords are often hard to remember; when the user makes the selection, the passwords are often easy to guess.
This paper describes a technique, and a mechanism, to allow users to select passwords which to them are easy to remember but to others would be very difficult to guess. The technique is site, user, and group configurable, and allows rapid changing of constraints imposed upon the passwords. Although experience with this technique has been limited, it appears to have much promise. %T Administrator's Guide to the Digital Signature Facility "Rover" %A Matt Bishop %R Technical Report PCS-TR90-153 %I Dartmouth College, Computer Science %C Hanover, NH %D 1990 %U http://www.cs.dartmouth.edu/reports/TR90-153.pdf %X This document describes the installation and maintenance of the rover utility, which provides a digital signature capability for internet messages. %T Effects of Replication on Data Availability %A Donald B. Johnson %A Larry Raab %R Technical Report PCS-TR90-155 %I Dartmouth College, Computer Science %C Hanover, NH %D 1990 %U http://www.cs.dartmouth.edu/reports/TR90-155.pdf %X In this paper we examine the effects of replication on the availability of data in a large network. This analysis differs from previous analyses in that it compares the performance of a dynamic consistency control protocol not only to that of other consistency control protocols, but also to the performance of non-replication and to an upper bound on data availability. This analysis also differs in that we gather extensive simulations on large networks subject to partitions at realistically high component reliabilities. We examine the dynamic consistency protocol presented by Jajodia and Mutchler [9, 12] and by Long and Paris [18] along with two proposed enhancements to this protocol [10,11]. We study networks of 101 sites and up to 5050 links (fully-connected) in which all components, although highly reliable, are subject to failure.
We demonstrate the importance in this realistic environment of an oft-neglected parameter of the system model, the ratio of transaction submissions to component failures. We also show the impact of the number of copies on both the protocol performance and the potential of replication as measured by the upper bound. Our simulations show that the majority of current protocol performs optimally for topologies that yield availabilities of at least 65%. On the other hand, the availability provided by non-replication is inferior to that of the majority of current protocol by at most 5.9 percentage points for these same topologies. At this point of maximum difference, the primary copy protocol yields availability 59.1% and the majority of current protocol yields availability 65.0%. We discuss the characteristics of the model limiting the performance of replication. %T A Tight Upper Bound on the Benefits of Replication and Consistency Control Protocols %A Donald B. Johnson %A Larry Raab %R Technical Report PCS-TR90-157 %I Dartmouth College, Computer Science %C Hanover, NH %D 1990 %U http://www.cs.dartmouth.edu/reports/TR90-157.pdf %X We present an upper bound on the performance provided by a protocol guaranteeing mutually exclusive access to a replicated resource in a network subject to component failure and subsequent partitioning. The bound is presented in terms of the performance of a single resource in the same network. The bound is tight and is the first such bound known to us. Since mutual exclusion is one of the requirements for maintaining the consistency of a database object, this bound provides an upper limit on the availability provided by any database consistency control protocol, including those employing dynamic data relocation and replication. We show that if a single copy provides availability A for 0 <= A <= 1, then no scheme can achieve availability greater than sqrt(A) in the same network.
We show this bound to be the best possible for any network with availability greater than .25. Although, as we proved, the problem of calculating A is #P-complete, we describe a method for approximating the optimal location for a single copy which adjusts dynamically to current network characteristics. This bound is most useful for high availabilities, which tend to be obtainable with modern networks and their constituent components. %T Finding Optimal Quorum Assignments for Distributed Databases %A Donald B. Johnson %A Larry Raab %R Technical Report PCS-TR90-158 %I Dartmouth College, Computer Science %C Hanover, NH %D 1990 %U http://www.cs.dartmouth.edu/reports/TR90-158.pdf %X Replication has been studied as a method of increasing the availability of a data item in a distributed database subject to component failures and consequent partitioning. The potential for partitioning requires that a protocol be employed which guarantees that any access to a data item is aware of the most recent update to that data item. By minimizing the number of access requests denied due to this constraint, we maximize availability. In the event that all access requests are reads, placing one copy of the data item at each site clearly leads to maximum availability. The other extreme, all access requests are write requests or are treated as such, has been studied extensively in the literature. In this paper we investigate the performance of systems with both read and write requests. We describe a distributed on-line algorithm for determining the optimal parameters, or optimal quorum assignments, for a commonly studied protocol, the quorum consensus protocol [9]. We also show how to incorporate these optimization techniques into a dynamic quorum reassignment protocol. In addition, we demonstrate via simulation both the value of this algorithm and the effect of various read-write ratios on availability.
This simulation, on 101 sites and up to 5050 links (fully-connected), demonstrates that the techniques described here can greatly increase data availability, and that the best quorum assignments are frequently realized at the extreme values of the quorum parameters. %T Privacy-Enhanced Electronic Mail %A Matt Bishop %R Technical Report PCS-TR91-150 %I Dartmouth College, Computer Science %C Hanover, NH %D 1991 %U http://www.cs.dartmouth.edu/reports/TR91-150.pdf %X (Revision 3). The security of electronic mail sent through the Internet may be described in exactly three words: there is none. The Privacy and Security Research Group has recommended implementing mechanisms designed to provide security enhancements. The first set of mechanisms provides a protocol to provide privacy, integrity, and authentication for electronic mail; the second provides a certificate-based key management infrastructure to support key distribution throughout the internet, to support the first set of mechanisms. This paper describes these mechanisms, as well as the reasons behind their selection and how these mechanisms can be used to provide some measure of security in the exchange of electronic mail. %T A Security Analysis of Version 2 of the Network Time Protocol NTP: A Report to the Privacy and Security Research Group %A Matt Bishop %R Technical Report PCS-TR91-154 %I Dartmouth College, Computer Science %C Hanover, NH %D 1991 %U http://www.cs.dartmouth.edu/reports/TR91-154.pdf %X The Network Time Protocol is being used throughout the Internet to provide an accurate time service. This paper examines the security requirements of such a service, analyzes version 2 of the NTP protocol to determine how well it meets these requirements, and suggests improvements where appropriate.
%T An Overview of Computer Viruses in a Research Environment %A Matt Bishop %R Technical Report PCS-TR91-156 %I Dartmouth College, Computer Science %C Hanover, NH %D 1991 %U http://www.cs.dartmouth.edu/reports/TR91-156.pdf %X The threat of attack by computer viruses is in reality a very small part of a much more general threat, specifically attacks aimed at subverting computer security. This paper examines computer viruses as malicious logic in a research and development environment, relates them to various models of security and integrity, and examines current research techniques aimed at controlling the threats viruses in particular, and malicious logic in general, pose to computer systems. Finally, a brief examination of the vulnerabilities of research and development systems that malicious logic and computer viruses may exploit is undertaken. %T Implementation Notes on bdes(1) %A Matt Bishop %R Technical Report PCS-TR91-158 %I Dartmouth College, Computer Science %C Hanover, NH %D 1991 %U http://www.cs.dartmouth.edu/reports/TR91-158.pdf %X This note describes the implementation of bdes, the file encryption program being distributed in the 4.4 release of the Berkeley Software Distribution. It implements all modes of the Data Encryption Standard program. %T Optimal Parallel and Sequential Algorithms for the Vertex Updating Problem of a Minimum Spanning Tree %A Donald B. Johnson %A Panagiotis Metaxas %R Technical Report PCS-TR91-159 %I Dartmouth College, Computer Science %C Hanover, NH %D 1991 %U http://www.cs.dartmouth.edu/reports/TR91-159.pdf %X We present a set of rules that can be used to give optimal solutions to the vertex updating problem for a minimum spanning tree: Update a given MST when a new vertex z is introduced, along with weighted edges that connect z with the vertices of the graph. These rules lead to simple parallel algorithms that run in O(lg n) parallel time using n/lg n EREW PRAMs.
They can also be used to derive simple linear-time sequential algorithms for the same problem. Furthermore, we show how our solution can be used to solve the multiple vertex updating problem. %T Connected Components in O(lg^{3/2} |V|) Parallel Time for the CREW PRAM %A Donald B. Johnson %A Panagiotis Metaxas %R Technical Report PCS-TR91-160 %I Dartmouth College, Computer Science %C Hanover, NH %D 1991 %U http://www.cs.dartmouth.edu/reports/TR91-160.pdf %X Computing the connected components of an undirected graph G = (V,E) on |V| = n vertices and |E| = m edges is a fundamental computational problem. The best known parallel algorithm for the CREW PRAM model runs in O(lg^2 n) time using n^2/lg^2 n processors [CLC82,HCS79]. For the CRCW PRAM model in which concurrent writing is permitted, the best known algorithm runs in O(lg n) time using almost (n+m)/lg n processors [SV82,CV86,AS87]. Unfortunately, simulating this algorithm on the weaker CREW model increases its running time to O(lg^2 n) [CDR86,KR90,Vis83]. We present here an efficient and simple algorithm that runs in O(lg^{3/2} n) time using n+m CREW processors. %T On Minimizing Hardware Overhead for Exhaustive Circuit Testability %A Dimitrios Kagaris %A Fillia Makedon %R Technical Report PCS-TR91-161 %I Dartmouth College, Computer Science %C Hanover, NH %D 1991 %U http://www.cs.dartmouth.edu/reports/TR91-161.pdf %X Exhaustive built-in self testing is given much attention as a viable technique in the context of VLSI technology. In this paper, we present a heuristic in order to make exhaustive testing of combinational circuits practical. The goal is to place a small number of register cells on the nets of the input circuit so that the input dependency of combinational elements in the circuit is less than a small given integer k.
Our heuristic guarantees that each output can be individually tested with 2^k test patterns and can be used as a subroutine to generate efficient test patterns to test all the outputs of the circuit simultaneously. For example, we can connect the register cells in a Linear Feedback Shift Register (LFSR). Minimizing the number of the inserted register cells reduces the hardware overhead as well as the upper bound on the number of test patterns generated. A heuristic approach has been proposed only for the case when an element in the circuit schematic denotes a boolean gate. An element may, however, also be used to represent a combinatorial circuit model. Our heuristic applies to this case as well. Extensive experimentation indicates that the proposed technique is very efficient. %T A Metric Towards Efficient Exhaustive Test Pattern Generation %A Dimitrios Kagaris %A Fillia Makedon %R Technical Report PCS-TR91-162 %I Dartmouth College, Computer Science %C Hanover, NH %D 1991 %U http://www.cs.dartmouth.edu/reports/TR91-162.pdf %X A viable technique [7] in built-in self-test (BIST) [2] is to generate test patterns pseudo-exhaustively by using linear feedback shift registers (LFSR's). The goal is to find an appropriate primitive polynomial of degree d that will generate 2^d test patterns in order to exercise all circuit outputs simultaneously. In an attempt to reduce the degree d of the polynomial the following strategy was proposed in [6,5]. In the first phase, partition the circuit into segments by inserting a small number of register cells, so that the input dependency of any circuit element in the segments is no more than d. Then, obtain an appropriate primitive polynomial of degree d by inserting additional register cells. In [12] we have proposed a heuristic for phase one that does not necessarily partition the circuit. Extensive experimentation has shown that this results in a considerably smaller cell overhead.
In this paper we extend our heuristic in [12], so that the minimization of the number of register cells is done in conjunction with a quantity that naturally reflects the difficulty of deriving an appropriate primitive polynomial of degree d. Experimentation shows that the proposed heuristic results again in an overall smaller number of register cells than a partition-based approach and in an efficient framework for test pattern generation. %T Multipacket Routing on Rings %A Fillia Makedon %A Antonios Symvonis %R Technical Report PCS-TR91-163 %I Dartmouth College, Computer Science %C Hanover, NH %D 1991 %U http://www.cs.dartmouth.edu/reports/TR91-163.pdf %X We study multipacket routing problems. We divide the multipacket routing problem into two classes, namely, distance limited and bisection limited routing problems. Then, we concentrate on rings of processors. Having a full understanding of the multipacket routing problem on rings is essential before trying to attack the problem for the more general case of r-dimensional meshes and tori. We prove a new lower bound of 2n/3 routing steps for the case of distance limited routing problems. We also give an algorithm that tightens this lower bound. For bisection limited problems, we present an algorithm that completes the routing in near optimal time. %T Ilona: An advanced CAI Tutorial System for the Fundamentals of Logic %A Otto Mayer %A Graham E. Oberem %A Fillia Makedon %R Technical Report PCS-TR91-164 %I Dartmouth College, Computer Science %C Hanover, NH %D 1991 %U http://www.cs.dartmouth.edu/reports/TR91-164.pdf %X An advanced tutorial system for teaching the fundamentals of logic has been developed to run on UNIX work stations and commonly available micro-computers. An important part of this tutorial is the intelligent problem solving environment which allows students to practise writing logical sentences in mathematical notation.
A natural language system for intelligent logic narrative analysis (ILONA) allows students to type in their own logical sentences in plain English and then have the computer check their working when they write these in mathematical form. ILONA is an intelligent tutoring system which allows students a great deal of initiative in problem solving and provides a degree of flexibility in answer evaluation not found in traditional CAI systems. The concepts and structures used in the development of ILONA are easily transferable to other domains. %T An Object-Oriented Learning/Design Support Environment %A Jill P. David %A Julie C. Jumes %A Fillia Makedon %R Technical Report PCS-TR91-165 %I Dartmouth College, Computer Science %C Hanover, NH %D 1991 %U http://www.cs.dartmouth.edu/reports/TR91-165.pdf %X We present an object-oriented experimental learning and design support environment, called AVT, for an Algorithm Visualization Tool, implemented in Digitalk's Smalltalk/V on a Macintosh II. AVT provides a domain-independent visualization tool, an exploratory learning environment, and an experimental heuristic design environment. Algorithm visualization is the exploration of ways to visualize intuitively the computational behavior of an algorithm using multiple views, some of which are visual in the graphical sense [2,4]. AVT employs other views (combining text and graphics) to explain the problem, the strategy, the heuristics, and the reasoning process behind the solutions. User interaction in AVT includes not only passive viewing of the animated algorithmic process but also active participation in the design of the steps of the algorithm. Object-Oriented Programming (OOP) offers an attractive paradigm for rapidly implementing heuristics as well as more coherent and understandable code [1,12]. Inheritance properties of OOP languages capture natural mechanisms such as specialization, abstraction, and evolution allowing us to model our environment in a more natural manner [11].
%T A Parallel Algorithm for the Minimum Spanning Tree %A Donald B. Johnson %A Panagiotis Metaxas %R Technical Report PCS-TR91-166 %I Dartmouth College, Computer Science %C Hanover, NH %D 1991 %U http://www.cs.dartmouth.edu/reports/TR91-166.pdf %T Complexity of Network Reliability and Optimal Database Placement Problems %A Donald B. Johnson %A Larry Raab %R Technical Report PCS-TR91-167 %I Dartmouth College, Computer Science %C Hanover, NH %D 1991 %U http://www.cs.dartmouth.edu/reports/TR91-167.pdf %X A fundamental problem of distributed database design in an existing network where components can fail is finding an optimal location at which to place the database in a centralized system or copies of each data item in a decentralized or replicated system. In this paper it is proved for the first time exactly how hard this placement problem is under the measure of data availability. Specifically, we show that the optimal placement problem for availability is #P-complete, a measure of intractability at least as severe as NP-completeness. Given the anticipated computational difficulty of finding an exact solution, we go on to describe an effective, practical method for approximating the optimal copy placement. To obtain these results, we model the environment in which a distributed database operates by a probabilistic graph, which is a set of fully-reliable vertices representing sites, and a set of edges representing communication links, each operational with a rational probability. We prove that finding the optimal copy placement in a probabilistic graph is #P-complete by giving a sequence of reductions from #Satisfiability. We generalize this result to networks in which each site and each link has an independent, rational operational probability and to networks in which all the sites or all the links have a fixed, uniform operational probability. %T Availability Issues in Data Replication in Distributed Database %A Donald B.
Johnson %A Larry Raab %R Technical Report PCS-TR91-168 %I Dartmouth College, Computer Science %C Hanover, NH %D 1991 %U http://www.cs.dartmouth.edu/reports/TR91-168.pdf %X Replication of data at more than one site in a distributed database has been reported to increase the availability of data in systems where sites and links are subject to failure. We have shown in results summarized in this paper that in many interesting cases the advantage is slight. A well-placed single copy is available to transactions almost as much of the time as is correct replicated data no matter how ingeniously it is managed. We explain these findings in terms of the behavior of the partitions that form in networks where components fail. We also show that known and rather simple protocols for the maintenance of multiple copies are essentially best possible by comparing them against an unrealizable "protocol" that knows the future. We complete our study of these questions by reporting that while computing the availability of data is #P-complete, nonetheless there is a tight analytical bound on the amount replication can improve over a well-located single copy. We close with some observations regarding system design motivated by this work. %T Effects of Replication on the Duration of Failure in Distributed Databases %A Donald B. Johnson %A Larry Raab %R Technical Report PCS-TR91-169 %I Dartmouth College, Computer Science %C Hanover, NH %D 1991 %U http://www.cs.dartmouth.edu/reports/TR91-169.pdf %X Replicating data objects has been suggested as a means of increasing the performance of a distributed database system in a network subject to link and site failures. Since a network may partition as a consequence of such failures, a data object may become unavailable from a given site for some period of time. In this paper we study duration of failure, which we define as the length of time, once the object becomes unavailable from a particular site, that the object remains unavailable.
We show that, for networks composed of highly-reliable components, replication does not substantially reduce the duration of failure. We model a network as a collection of sites and links, each failing and recovering independently according to a Poisson process. Using this model, we demonstrate via simulation that the duration of failure incurred using a non-replicated data object is nearly as short as that incurred using a replicated object and a replication control protocol, including an unrealizable protocol which is optimal with respect to availability. We then examine analytically a simplified system in which the sites but not the links are subject to failure. We prove that if each site operates with probability p, then the optimal replication protocol, Available Copies [5,26], reduces the duration of failure by at most a factor of (1-p)/(1+p). Lastly, we present bounds for general systems, those in which both the sites and the communications between the sites may fail. We prove, for example, that if sites are 95% reliable and a communications failure is sufficiently short (either infallible or satisfying a function specified in the paper) then replication can improve the duration of failure by at most 2.7% of that experienced using a single copy. These results show that replication has only a small effect on the duration of failure in present-day partitionable networks comprised of realistically reliable components. %T Optimal Algorithms for Multipacket Routing Problems on Rings %A Fillia Makedon %A Antonios Symvonis %R Technical Report PCS-TR91-174 %I Dartmouth College, Computer Science %C Hanover, NH %D 1991 %U http://www.cs.dartmouth.edu/reports/TR91-174.pdf %X We study multipacket routing problems. We divide the multipacket routing problem into two classes, namely, distance limited and bisection limited routing problems. Then, we concentrate on rings of processors.
We prove a new lower bound of 2n/3 routing steps for the case of distance limited routing problems. We also give an algorithm that tightens this lower bound. For bisection limited problems the lower bound is kn/4, k > 2, where k is the number of packets per processor. The trivial algorithm needs in the worst case k |n/2| steps to terminate. An algorithm that completes the routing in kn/4 + 2.5n routing steps is given. We define the class of pure routing algorithms and we demonstrate that new lower bounds hold if the routing is to be done by an algorithm in this class. %T Parallel Computer Needs at Dartmouth College %A David Kotz %A Fillia Makedon %A Matt Bishop %A Robert L. Scot Drysdale %A Donald B. Johnson %A Panagiotis Metaxas %R Technical Report PCS-TR92-176 %I Dartmouth College, Computer Science %C Hanover, NH %D 1992 %U http://www.cs.dartmouth.edu/reports/TR92-176.ps.Z %X To determine the need for a parallel computer on campus, a committee of the Graduate Program in Computer Science surveyed selected Dartmouth College faculty and students in December, 1991, and January, 1992. We hope that the information in this report can be used by many groups on campus, including the Computer Science graduate program and DAGS summer institute, Kiewit's NH Supercomputer Initiative, and by numerous researchers hoping to collaborate with people in other disciplines. We found significant interest in parallel supercomputing on campus. An on-campus parallel supercomputing facility would not only support numerous courses and research projects, but would provide a locus for intellectual activity in parallel computing, encouraging interdisciplinary collaboration. We believe that this report is a first step in that direction. %T Multiplicatively Weighted Crystal Growth Voronoi Diagrams (Thesis) %A Barry F.
Schaudt %R Technical Report PCS-TR92-177 %I Dartmouth College, Computer Science %C Hanover, NH %D 1992 %U http://www.cs.dartmouth.edu/reports/TR92-177.pdf %X Voronoi diagrams and variants of Voronoi diagrams have been used for many years to model crystal growth. If the boundaries of the growing crystals are circular and all the crystals start at the same time and have the same constant growth rate, then the Voronoi diagram is used to model the growth. If the crystals start at different times, the additively weighted Voronoi diagram is used to model the crystal growth. In this thesis, I propose a new type of Voronoi diagram called the multiplicatively weighted crystal growth Voronoi diagram, that can be used to model crystal growth when the crystals have different constant growth rates. In this new model, the distance from a site to a point in its region is measured along a shortest path lying entirely within the region. In the multiplicatively weighted crystal growth Voronoi diagram, a growing crystal (or region) may "wrap around" another site's region. When a region wraps around, distances from the site are in part measured along the boundary of the two regions, treating one of the regions as an obstacle, rather than along a straight line that passes through the region. The worst case size of the multiplicatively weighted crystal growth Voronoi diagram is O(n^2). To construct the diagram, techniques from numerical analysis are used to approximate and to intersect curves described by a system of first order differential equations. Numerical methods to approximate a curve construct a polygonal approximation of the curve. One step of the numerical methods constructs an edge of the polygonal approximation. In the new Voronoi diagram, a step may require O(n) constant time operations. Let S be the number of steps required by the numerical method used just to draw the diagram.
In the worst case, the algorithm presented in this thesis requires O(n^3) intersection calculations plus O(nS lg S) time using O(n^3 + S) space. A variant of this algorithm requires O(n^3) intersection calculations plus O(nS^2 + n^2 S) time using O(n^2) space. Also presented are some variants of the new Voronoi diagram. One of these variants uses a convex polygon distance function. The multiplicatively weighted crystal growth Voronoi diagram using a convex polygon distance function does not require numerical methods to construct. %T Parallel Algorithms For Graph Problems (Thesis) %A Panagiotis Metaxas %R Technical Report PCS-TR92-178 %I Dartmouth College, Computer Science %C Hanover, NH %D 1992 %U http://www.cs.dartmouth.edu/reports/TR92-178.pdf %X In this thesis we examine three problems in graph theory and propose efficient parallel algorithms for solving them. We also introduce a number of parallel algorithmic techniques. %T Multiprocessor File System Interfaces %A David Kotz %R Technical Report PCS-TR92-179 %I Dartmouth College, Computer Science %C Hanover, NH %D 1992 %U http://www.cs.dartmouth.edu/reports/TR92-179.ps.Z %X Increasingly, file systems for multiprocessors are designed with parallel access to multiple disks, to keep I/O from becoming a serious bottleneck for parallel applications. Although file system software can transparently provide high-performance access to parallel disks, a new file system interface is needed to facilitate parallel access to a file from a parallel application. We describe the difficulties faced when using the conventional (Unix-like) interface in parallel applications, and then outline ways to extend the conventional interface to provide convenient access to the file for parallel programs, while retaining the traditional interface for programs that have no need for explicitly parallel file access.
Our interface includes a single naming scheme, a multiopen operation, local and global file pointers, mapped file pointers, logical records, multifiles, and logical coercion for backward compatibility. %T A Visualization System for Correctness Proofs of Graph Algorithms %A Peter A. Gloor %A Donald B. Johnson %A Fillia Makedon %A Panagiotis Metaxas %R Technical Report PCS-TR92-180 %I Dartmouth College, Computer Science %C Hanover, NH %D 1992 %U http://www.cs.dartmouth.edu/reports/TR92-180.pdf %X In this paper we describe a system for visualizing correctness proofs of graph algorithms. The system has been demonstrated for a greedy algorithm, Prim's algorithm for finding a minimum spanning tree of an undirected, weighted graph. We believe that our system is particularly appropriate for greedy algorithms, though much of what we discuss can guide visualization of proofs in other contexts. While an example is not a proof, our system provides concrete examples to illustrate the operation of the algorithm. These examples can be referred to by the user interactively and alternatively with the visualization of the proof where the general case is portrayed abstractly. %T On The De Bruijn Torus Problem %A Glenn Hurlbert %A Garth Isaak %R Technical Report PCS-TR92-181 %I Dartmouth College, Computer Science %C Hanover, NH %D 1992 %U http://www.cs.dartmouth.edu/reports/TR92-181.pdf %X A (k^n; n)_k-de Bruijn Cycle is a cyclic k-ary sequence with the property that every k-ary n-tuple appears exactly once contiguously on the cycle. A (k^r, k^s; m, n)_k-de Bruijn Torus is a k-ary k^r x k^s toroidal array with the property that every k-ary m x n matrix appears exactly once contiguously on the torus. As is the case with de Bruijn cycles, the 2-dimensional version has many interesting applications, from coding and communications to pseudo-random arrays, spectral imaging, and robot self-location. J.C.
Cock proved the existence of such tori for all m, n, and k, and Chung, Diaconis, and Graham asked if it were possible that r = s and m = n for n even. Fan, Fan, Ma and Siu showed this was possible for k = 2. Combining new techniques with old, we prove the result for k > 2 and show that actually much more is possible. The cases in 3 or more dimensions remain. %T How to Encrypt /usr/dict/words in About a Second %A Peter Su %A Matt Bishop %R Technical Report PCS-TR92-182 %I Dartmouth College, Computer Science %C Hanover, NH %D 1992 %U http://www.cs.dartmouth.edu/reports/TR92-182.pdf %X We present an implementation of the Data Encryption Standard on the Connection Machine architecture. The DES encryption algorithm is ideally suited to the Connection Machine because it consists of bit serial operations, and thousands of encryptions can be done in parallel, independently of one another. Thus, our code encrypts passwords about ten times faster than the fastest competition that we know about. In addition, the nature of the Connection Machine's architecture is such that some of the optimizations that make DES run much faster on conventional architectures have no effect on the performance of the Connection Machine. Our comparison of a simple implementation along with one that uses many optimizations illustrates this fact. %T Concurrent Local Search for Fast Proximity Algorithms on Parallel and Vector Architectures %A Peter Su %R Technical Report PCS-TR92-183 %I Dartmouth College, Computer Science %C Hanover, NH %D 1992 %U http://www.cs.dartmouth.edu/reports/TR92-183.pdf %X This paper presents a fast algorithm for solving the all-nearest-neighbors problem. The algorithm uses a data parallel style of programming which can be efficiently utilized on a variety of parallel and vector architectures [4,21,26]. I have implemented the algorithm in C on one such architecture, the Cray Y-MP.
On one Cray CPU, the implementation is about 19 times faster than a fast sequential algorithm running on a Sparc workstation. The main idea in the algorithm is to divide the plane up into a fixed grid of cells, or buckets. When the points are well distributed, the algorithm processes each query point, q, by searching a small number of cells close to q. Bentley, Weide and Yao first presented this idea for conventional architectures [3], but the technique works equally well on parallel and vector machines, leading to a simple, efficient algorithm. We can also use the cell technique to solve a wide variety of basic computational problems such as finding closest pairs, sorting and constructing Voronoi diagrams and Delaunay triangulations. %T Building Segment Trees in Parallel %A Peter Su %A Robert L. Scot Drysdale %R Technical Report PCS-TR92-184 %I Dartmouth College, Computer Science %C Hanover, NH %D 1992 %U http://www.cs.dartmouth.edu/reports/TR92-184.pdf %X The segment tree is a simple and important data structure in computational geometry [7,11]. We present an experimental study of parallel algorithms for building segment trees. We analyze the algorithms in the context of both the PRAM (Parallel Random Access Machine) and hypercube architectures. In addition, we present performance data for implementations developed on the Connection Machine. We compare two different parallel algorithms, and we also compare our parallel algorithms to a good sequential algorithm for doing the same job. In this way, we evaluate the overall efficiency of our parallel methods. Our performance results illustrate the problems involved in using popular machine models (PRAM) and analysis techniques (asymptotic efficiency) to predict the performance of parallel algorithms on real machines. We present two different analyses of our algorithms and show that neither is effective in predicting the actual performance numbers that we obtained.
%T Algorithms for Closest Point Problems: Practice and Theory %A Peter Su %R Technical Report PCS-TR92-185 %I Dartmouth College, Computer Science %C Hanover, NH %D 1992 %U http://www.cs.dartmouth.edu/reports/TR92-185.pdf %X This paper describes and evaluates known sequential algorithms for constructing planar Voronoi diagrams and Delaunay triangulations. In addition, it describes a new incremental algorithm which is simple to understand and implement, but whose performance is competitive with all known methods. The experiments in this paper are more than just simple benchmarks; they evaluate the expected performance of the algorithms in a precise and machine independent fashion. Thus, the paper also illustrates how to use experimental tools to both understand the behaviour of different algorithms and to guide the algorithm design process. %T Formal Implementation of High-Level Languages for Data-Parallel Programming %A Deb Banerjee %R Technical Report PCS-TR92-186 %I Dartmouth College, Computer Science %C Hanover, NH %D 1992 %U http://www.cs.dartmouth.edu/reports/TR92-186.pdf %X The success of parallel architectures has been limited by the lack of high-level parallel programming languages and useful programming models. The data-parallel model of programming has been demonstrated to be useful and natural on a wide variety of parallel architectures. This dissertation presents a set of formal techniques for compiling high-level languages based on data-parallelism. %T VideoScheme: A Programmable Video Editing System for Automation and Media Recognition %A James Matthews %A Peter A. Gloor %A Fillia Makedon %R Technical Report PCS-TR93-187 %I Dartmouth College, Computer Science %C Hanover, NH %D 1993 %U http://www.cs.dartmouth.edu/reports/TR93-187.pdf %X The recent development of powerful, inexpensive hardware and software support has made digital video editing possible on personal computers and workstations.
To date the video editing application category has been dominated by visual, easy-to-use, direct manipulation interfaces. These systems bring high-bandwidth human-computer interaction to a task formerly characterized by slow, inflexible, indirectly-operated machines. However, the direct manipulation computer interfaces are limited by their manual nature, and cannot easily accommodate algorithmically-defined operations. This paper proposes a melding of the common direct manipulation interfaces with a programming language which we have enhanced to manipulate digital audio and video. The result is a system which can automate routine tasks as well as perform tasks based on sophisticated media recognition algorithms. %T Integrating Theory and Practice in Parallel File Systems %A Thomas H. Cormen %A David Kotz %R Technical Report PCS-TR93-188 %I Dartmouth College, Computer Science %C Hanover, NH %D 1993 %U http://www.cs.dartmouth.edu/reports/TR93-188-update.ps.Z %X Several algorithms for parallel disk systems have appeared in the literature recently, and they are asymptotically optimal in terms of the number of disk accesses. Scalable systems with parallel disks must be able to run these algorithms. We present a list of capabilities that must be provided by the system to support these optimal algorithms: control over declustering, querying about the configuration, independent I/O, turning off file caching and prefetching, and bypassing parity. We summarize recent theoretical and empirical work that justifies the need for these capabilities. %Z The original 1993 tech report was superseded by the conference version. The paper has since been revised and re-issued on 9/20/94 as an updated technical report. %T Efficient Parallel Algorithms for some Tree Layout Problems %A J. Diaz %A A. Gibbons %A Grammati E. Pantziou %A M. Serna %A Paul G. Spirakis %A J.
Toran %R Technical Report PCS-TR93-189 %I Dartmouth College, Computer Science %C Hanover, NH %D 1993 %U http://www.cs.dartmouth.edu/reports/TR93-189.pdf %X The minimum cut and minimum length linear arrangement problems usually occur in solving wiring problems and have a lot in common with job sequencing questions. Both problems are NP-complete for general graphs and in P for trees. We present here two algorithms in NC. The first solves the minimum length linear arrangement problem for unrooted trees in $O(\log^2 n)$ time and $O(n^2 3^{\log n})$ CREW PRAM processors. The second algorithm solves the minimum cut arrangement for unrooted trees of maximum degree $d$ in $O(d \log^2 n)$ time and $O(n^2 /\log n)$ CREW PRAM processors. %T Throughput of Existing Multiprocessor File Systems (An Informal Study) %A David Kotz %R Technical Report PCS-TR93-190 %I Dartmouth College, Computer Science %C Hanover, NH %D 1993 %U http://www.cs.dartmouth.edu/reports/TR93-190.ps.Z %X Fast file systems are critical for high-performance scientific computing, since many scientific applications have tremendous I/O requirements. Many parallel supercomputers have only recently obtained fully parallel I/O architectures and file systems, which are necessary for scalable I/O performance. Scalability aside, I show here that many systems lack sufficient absolute performance. I do this by surveying the performance reported in the literature, summarized in an informal table. %T Accurate Verification of Five-Axis Numerically Controlled Machining %A Jerome L. Quinn %R Technical Report PCS-TR93-191 %I Dartmouth College, Computer Science %C Hanover, NH %D 1993 %U http://www.cs.dartmouth.edu/reports/TR93-191.pdf %T Off-line Cursive Handwriting Recognition Using Style Parameters %A Berrin A. Yanikoglu %A Peter A. Sandon %R Technical Report PCS-TR93-192 %I Dartmouth College, Computer Science %C Hanover, NH %D 1993 %U http://www.cs.dartmouth.edu/reports/TR93-192.ps.Z %X (No abstract available). 
%T Asymptotically Tight Bounds for Performing BMMC Permutations on Parallel Disk Systems %A Thomas H. Cormen %A Leonard F. Wisniewski %R Technical Report PCS-TR93-193 %I Dartmouth College, Computer Science %C Hanover, NH %D 1993 %U http://www.cs.dartmouth.edu/reports/TR93-193.ps.Z %Z This TR is superseded by TR94-223. %T Vector Layout in Virtual-Memory Systems for Data-Parallel Computing %A Thomas H. Cormen %R Technical Report PCS-TR93-194 %I Dartmouth College, Computer Science %C Hanover, NH %D 1993 %U http://www.cs.dartmouth.edu/reports/TR93-194.pdf %T Wavelet Localization of the Radon Transform %A Tim Olson %A Joe Destefano %R Technical Report PCS-TR93-196 %I Dartmouth College, Computer Science %C Hanover, NH %D 1993 %U http://www.cs.dartmouth.edu/reports/TR93-196.pdf %T The Expected Lifetime of "Single-Address-Space" Operating Systems %A David Kotz %A Preston Crow %R Technical Report PCS-TR93-198 %I Dartmouth College, Computer Science %C Hanover, NH %D 1993 %U http://www.cs.dartmouth.edu/reports/TR93-198.ps.Z %X Trends toward shared-memory programming paradigms, large (64-bit) address spaces, and memory-mapped files have led some to propose the use of a single virtual-address space, shared by all processes and processors. Typical proposals require the single address space to contain all process-private data, shared data, and stored files. To simplify management of an address space where stale pointers make it difficult to re-use addresses, some have claimed that a 64-bit address space is sufficiently large that there is no need to ever re-use addresses. Unfortunately, there have been no data to either support or refute these claims, or to aid in the design of appropriate address-space management policies. In this paper, we present the results of extensive kernel-level tracing of the workstations in our department, and discuss the implications for single-address-space operating systems.
We found that single-address-space systems will not outgrow the available address space, but only if reasonable space-allocation policies are used, and only if the system can adapt as larger address spaces become available. %Z The on-line version is a revision of March 15, 1996. An earlier revised version appeared in SIGMETRICS '94. The original technical report is not available on-line. %T On-Line and Dynamic Shortest Paths Through Graph Decompositions %A Hristo N. Djidjev %A Grammati E. Pantziou %A Christos D. Zaroliagis %R Technical Report PCS-TR93-200 %I Dartmouth College, Computer Science %C Hanover, NH %D 1993 %U http://www.cs.dartmouth.edu/reports/TR93-200.pdf %X We describe algorithms for finding shortest paths and distances in a planar digraph which exploit the particular topology of the input graph. An important feature of our algorithms is that they can work in a dynamic environment, where the cost of any edge can be changed or the edge can be deleted. For outerplanar digraphs, for instance, the data structures can be updated after any such change in only $O(\log n)$ time, where $n$ is the number of vertices of the digraph. We also describe the first parallel algorithms for solving the dynamic version of the shortest path problem. Our results can be extended to hold for digraphs of genus $o(n)$. %T Parallel Max Cut Approximations %A Grammati E. Pantziou %A Paul G. Spirakis %A Christos D. Zaroliagis %R Technical Report PCS-TR93-201 %I Dartmouth College, Computer Science %C Hanover, NH %D 1993 %U http://www.cs.dartmouth.edu/reports/TR93-201.pdf %X Given a graph with positive integer edge weights one may ask whether there exists an edge cut whose weight is bigger than a given number. This problem is NP-complete. We present here an approximation algorithm in NC which provides tight upper bounds to the proportion of edge cuts whose size is bigger than a given number. 
Our technique is based on the methods to convert randomized parallel algorithms into deterministic ones introduced by Karp and Wigderson. The basic idea of those methods is to replace an exponentially large sample space by one of polynomial size. In this work, we prove the interesting result that the statistical distance of random variables of the small sample space is bigger than the statistical distance of corresponding variables of the exponentially large space, which is the space of all edge cuts taken equiprobably. %T Parallel h-v Drawings of Binary Trees %A Panagiotis Metaxas %A Grammati E. Pantziou %A Antonios Symvonis %R Technical Report PCS-TR93-202 %I Dartmouth College, Computer Science %C Hanover, NH %D 1993 %U http://www.cs.dartmouth.edu/reports/TR93-202.pdf %X In this paper we present a method to obtain optimal h-v and inclusion drawings in parallel. Based on parallel tree contraction, our method computes optimal (with respect to a class of cost functions of the enclosing rectangle) drawings in $O(\log^2 n)$ parallel time by using a polynomial number of EREW processors. The number of processors reduces substantially when we study minimum area drawings. The method can be extended to compute optimal inclusion layouts in the case where each leaf $l$ of the tree is represented by rectangle $l_x \times l_y$ (the dimensions of which are part of the input). For polynomial area layouts, our work places the problem of obtaining optimal size h-v or inclusion drawings in NC, presenting the first algorithm with polylogarithmic time complexity. Our method also yields an NC algorithm for the slicing floorplanning problem. Whether this problem was in NC was an open question~\cite{CT90}. %T Quickest Paths: Faster Algorithms and Dynamization %A Dimitrios Kagaris %A Grammati E. Pantziou %A Spyros Tragoudas %A Christos D.
Zaroliagis %R Technical Report PCS-TR94-204 %I Dartmouth College, Computer Science %C Hanover, NH %D 1994 %U http://www.cs.dartmouth.edu/reports/TR94-204.pdf %X Given a network $N=(V,E,{c},{l})$, where $G=(V,E)$, $|V|=n$ and $|E|=m$, is a directed graph, ${c}(e) > 0$ is the capacity and ${l}(e) \ge 0$ is the lead time (or delay) for each edge $e\in E$, the quickest path problem is to find a path for a given source--destination pair such that the total lead time plus the inverse of the minimum edge capacity of the path is minimal. The problem has applications to fast data transmissions in communication networks. The best previous algorithm for the single--pair quickest path problem runs in time $O(r m+r n \log n)$, where $r$ is the number of distinct capacities of $N$ \cite{ROS}. In this paper, we present algorithms for general, sparse and planar networks that have significantly lower running times. For general networks, we show that the time complexity can be reduced to $O(r^{\ast} m+r^{\ast} n \log n)$, where $r^{\ast}$ is at most the number of capacities greater than the capacity of the shortest (with respect to lead time) path in $N$. For sparse networks, we present an algorithm with time complexity $O(n \log n + r^{\ast} n + r^{\ast} \tilde{\gamma} \log \tilde{\gamma})$, where $\tilde{\gamma}$ is a topological measure of $N$. Since for sparse networks $\tilde{\gamma}$ ranges from $1$ up to $\Theta(n)$, this constitutes an improvement over the previously known bound of $O(r n \log n)$ in all cases that $\tilde{\gamma}=o(n)$. For planar networks, the complexity becomes $O(n \log n + n\log^3 \tilde{\gamma}+ r^{\ast} \tilde{\gamma})$. Similar improvements are obtained for the all--pairs quickest path problem. We also give the first algorithm for solving the dynamic quickest path problem. %T Efficient Sequential and Parallel Algorithms for the Negative Cycle Problem %A Dimitrios Kavvadias %A Grammati E. Pantziou %A Paul G. Spirakis %A Christos D. 
Zaroliagis %R Technical Report PCS-TR94-206 %I Dartmouth College, Computer Science %C Hanover, NH %D 1994 %U http://www.cs.dartmouth.edu/reports/TR94-206.pdf %X We present here an algorithm for detecting (and outputting, if one exists) a negative cycle in an $n$-vertex planar digraph $G$ with real edge weights. Its running time ranges from $O(n)$ up to $O(n^{1.5}\log n)$ as a certain topological measure of $G$ varies from $1$ up to $\Theta(n)$. Moreover, an efficient CREW PRAM implementation is given. Our algorithm applies also to digraphs whose genus $\gamma$ is $o(n)$. %T Issues and Obstacles with Multimedia Authoring %A Fillia Makedon %A Samuel A. Rebelsky %A Matthew Cheyney %A Charles B. Owen %A Peter A. Gloor %R Technical Report PCS-TR94-207 %I Dartmouth College, Computer Science %C Hanover, NH %D May 1995 %U http://www.cs.dartmouth.edu/reports/TR94-207.ps.Z %X Unlike traditional authoring, multimedia authoring involves making hard choices, forecasting technological evolution and adapting to software and hardware technology changes. It is, perhaps, an unstable field of endeavor for an academic to be in. Yet, it is important that academics are, in fact, part of this process. This paper discusses some of the common threads shared by three dissimilar cases of multimedia authoring which we have experimented with: multimedia conference proceedings, multimedia courseware development and multimedia information kiosks. We consider these applications from an academic point of view and review the benefits and pitfalls of academic development while sharing points of hard-learned wisdom. We draw on experiences from some of the projects run at the Dartmouth Experimental Visualization Laboratory (DEVlab), where we have been developing different types of multimedia applications. %Z Invited Presentation, EdMedia '94 Vancouver, Canada. Also published in the EdMedia'94 conference proceedings. This Technical Report has occasionally been listed (accidentally) as TR95-256.
TR94-207 is the correct number. %T Conference on a Disk: A Successful Experiment in Hypermedia Publishing (Extended Abstract) %A M. Cheyney %A P. Gloor %A D. B. Johnson %A F. Makedon %A J. Matthews %A P. Metaxas %R Technical Report PCS-TR94-208 %I Dartmouth College, Computer Science %C Hanover, NH %D March 1994 %U http://www.cs.dartmouth.edu/reports/TR94-208.pdf %X Academic conferences are a long-standing and effective form of multimedia communication. Conference participants can transmit and receive information through sight, speech, gesture, text, and touch. This same-time, same-place communication is sufficiently valuable to justify large investments in time and travel funds. Printed conference proceedings are attempts to recapture the value of a live conference, but they are limited by a fragmented and inefficient approach to the problem. We addressed this problem in the multimedia proceedings of the DAGS'92 conference. The recently published CD-ROM delivers text, graphic, audio, and video information as an integrated whole, with extensive provisions for random access and hypermedia linking. We believe that this project provides a model for future conference publications and highlights some of the research issues that must be resolved before similar publications can be quickly and inexpensively produced. %T Videoscheme: A Research, Authoring, and Teaching Tool for Multimedia %A J. Matthews %A F. Makedon %A P. Gloor %R Technical Report PCS-TR94-209 %I Dartmouth College, Computer Science %C Hanover, NH %D March 1994 %U http://www.cs.dartmouth.edu/reports/TR94-209.pdf %X The availability of digital multimedia technology poses new challenges to researchers, authors, and educators, even as it creates new opportunities for rich communication. This paper suggests interactive computer programming as a fruitful approach to these challenges. VideoScheme, a prototype video programming environment, is described along with promising applications.
%T Dynamic File-Access Characteristics of a Production Parallel Scientific Workload %A David Kotz %A Nils Nieuwejaar %R Technical Report PCS-TR94-211 %I Dartmouth College, Computer Science %C Hanover, NH %D 1994 %U http://www.cs.dartmouth.edu/reports/TR94-211.ps.Z %X Multiprocessors have permitted astounding increases in computational performance, but many cannot meet the intense I/O requirements of some scientific applications. An important component of any solution to this I/O bottleneck is a parallel file system that can provide high-bandwidth access to tremendous amounts of data in parallel to hundreds or thousands of processors. Most successful systems are based on a solid understanding of the characteristics of the expected workload, but until now there have been no comprehensive workload characterizations of multiprocessor file systems. We began the CHARISMA project in an attempt to fill that gap. We instrumented the common node library on the iPSC/860 at NASA Ames to record all file-related activity over a two-week period. Our instrumentation is different from previous efforts in that it collects information about every read and write request and about the mix of jobs running in the machine (rather than from selected applications). The trace analysis in this paper leads to many recommendations for designers of multiprocessor file systems. First, the file system should support simultaneous access to many different files by many jobs. Second, it should expect to see many small requests, predominantly sequential and regular access patterns (although of a different form than in uniprocessors), little or no concurrent file-sharing between jobs, significant byte- and block-sharing between processes within jobs, and strong interprocess locality. Third, our trace-driven simulations showed that these characteristics led to great success in caching, both at the compute nodes and at the I/O nodes. 
Finally, we recommend supporting strided I/O requests in the file-system interface, to reduce overhead and allow more performance optimization by the file system. %Z TR94-230 is related. %T Job Scheduling in Rings %A Perry Fizzano %A Clifford Stein %A David R. Karger %A Joel Wein %R Technical Report PCS-TR94-213 %I Dartmouth College, Computer Science %C Hanover, NH %D 1994 %U http://www.cs.dartmouth.edu/reports/TR94-213.ps.Z %X We give distributed approximation algorithms for job scheduling in a ring architecture. In contrast to almost all other parallel scheduling models, the model we consider captures the influence of the underlying communications network by specifying that task migration from one processor to another takes time proportional to the distance between those two processors in the network. As a result, our algorithms must balance both computational load and communication time. The algorithms are simple, require no global control, and work in a variety of settings. All come with small constant-factor approximation guarantees; the basic algorithm yields schedules of length at most 4.22 times optimal. We also give a lower bound on the performance of any distributed algorithm, some results for a simple capacitated case, and the results of simulation experiments, which give better results than our worst-case analysis. %T A 2-3/4-Approximation Algorithm for the Shortest Superstring Problem %A Chris Armen %A Clifford Stein %R Technical Report PCS-TR94-214 %I Dartmouth College, Computer Science %C Hanover, NH %D 1994 %U http://www.cs.dartmouth.edu/reports/TR94-214.pdf %X Given a collection of strings S={s_1,...,s_n} over an alphabet Sigma, a superstring alpha of S is a string containing each s_i as a substring, that is, for each i, 1<=i<=n, alpha contains a block of |s_i| consecutive characters that match s_i exactly. The shortest superstring problem is the problem of finding a superstring alpha of minimum length.
The shortest superstring problem has applications in both computational biology and data compression. The problem is NP-hard [GallantMS80]; in fact, it was recently shown to be MAX SNP-hard [BlumJLTY91]. Given the importance of the applications, several heuristics and approximation algorithms have been proposed. Constant factor approximation algorithms have been given in [BlumJLTY91] (factor of 3), [TengY93] (factor of 2-8/9), [CzumajGPR94] (factor of 2-5/6) and [KosarajuPS94] (factor of 2-50/63). Informally, the key to any algorithm for the shortest superstring problem is to identify sets of strings with large amounts of similarity, or overlap. While the previous algorithms and their analyses have grown increasingly sophisticated, they reveal remarkably little about the structure of strings with large amounts of overlap. In this sense, they are solving a more general problem than the one at hand. In this paper, we study the structure of strings with large amounts of overlap and use our understanding to give an algorithm that finds a superstring whose length is no more than 2-3/4 times that of the optimal superstring. We prove several interesting properties about short periodic strings, allowing us to answer questions of the following form: given a string with some periodic structure, characterize all the possible periodic strings that can have a large amount of overlap with the first string.

%T Fast Greedy Triangulation Algorithms
%A Matthew T. Dickerson
%A Robert L. Scot Drysdale
%A Scott A. McElfresh
%A Emo Welzl
%R Technical Report PCS-TR94-215
%I Dartmouth College, Computer Science
%C Hanover, NH
%D 1994
%U http://www.cs.dartmouth.edu/reports/TR94-215.ps.Z
%X
The greedy triangulation of a set $S$ of $n$ points in the plane is the triangulation obtained by starting with the empty set and at each step adding the shortest compatible edge between two of the points, where a compatible edge is defined to be an edge that crosses none of the previously added edges.
In this paper we present a simple, practical algorithm that computes the greedy triangulation in expected time $O(n \log n)$ and space $O(n)$ for points uniformly distributed over any convex shape. A variant of this algorithm should be fast for some other distributions. As part of this algorithm we give an edge compatibility test that requires $O(n)$ time for both tests and updates to the underlying data structure. We also prove properties about the expected lengths of edges in greedy and Delaunay triangulations of uniformly distributed points.

%T Scheduling in a Ring with Unit Capacity Links
%A Perry Fizzano
%A Clifford Stein
%R Technical Report PCS-TR94-216
%I Dartmouth College, Computer Science
%C Hanover, NH
%D 1994
%U http://www.cs.dartmouth.edu/reports/TR94-216.ps.Z
%X
We consider the problem of scheduling unit-sized jobs on a ring of processors with the objective of minimizing the completion time of the last job. Unlike much previous work we place restrictions on the capacity of the network links connecting processors. We give a polynomial time centralized algorithm that produces optimal length schedules. We also give a simple distributed 2-approximation algorithm.

%T SPEDE: A Simple Programming Environment for Distributed Execution (Users' Manual)
%A James Gochee
%R Technical Report PCS-TR94-217
%I Dartmouth College, Computer Science
%C Hanover, NH
%D 1994
%U http://www.cs.dartmouth.edu/reports/TR94-217.ps.Z
%X
Traditional single processor computers are quickly reaching their full computational potential. The quest for faster and faster chips has brought technology to the point where the laws of physics are hampering future gains. Significant gains in speed must therefore come from using multiple processors instead of a single processor. This technology usually represents itself in the form of a parallel computer, such as the Connection Machine Model 5.
Recently, however, much interest has been focused on software that organizes single processor computers to behave like a parallel computer. This is desirable for sites which have large installations of workstations, since the cost of new parallel systems is prohibitive. SPEDE, a Simple Programming Environment for Distributed Execution, was designed for this purpose. It allows UNIX based machines of varying hardware types to be organized and utilized by a programmer of parallel applications. SPEDE is a user level system in that it requires no special privileges to run. Every user keeps a separate copy of the system so that security issues are covered by the normal UNIX operating environment. SPEDE is characterized as a large grained distributed environment. This means that applications which have a large processing to I/O ratio will be much more effective than those with a small ratio. SPEDE allows users to coordinate the use of many computers through a straightforward interface. Machines are organized by classes, which are terms that can be used to label and group them into more manageable units. For example, users might want to create a class based on the byte ordering of machines, or by their location. Users can then specify more completely which machines they want to use for a particular session. Sessions are essentially the interaction between objects in the SPEDE environment. A user creates an object to perform a certain task, such as constructing part of a fractal image. Objects can send and receive messages from other objects using a simple interface provided with SPEDE. Objects are machine independent, which means that the same object can be run simultaneously on different platforms. This is achieved by translating all messages into standard network byte ordering. However, if user data is being passed between objects, it is the user's responsibility to make sure byte ordering is correct. The SPEDE system involves several major components.
These components help control and manage object interactions. Figure 1 shows a session running on three machines (each surrounded by an oval rectangle). There are also three objects running, two named MandComp and one named Mand. Each object is on a different machine, although it is possible to have multiple objects on a single machine. In the figure, the lines connecting the various entities represent socket connections. UNIX sockets are the transport mechanism used in SPEDE, although one could implement a lower level protocol for more efficient communication. Sockets can also be a problem because some machines have strict limits on the number of connections a user can have open at any given time.
%Z
A Senior Thesis in the Department of Math and Computer Science, Dartmouth College, 1992.
See also Technical Report PCS-TR94-218 for a technical description of SPEDE.

%T SPEDE: Simple Programming Environment for Distributed Execution
%A James Gochee
%R Technical Report PCS-TR94-218
%I Dartmouth College, Computer Science
%C Hanover, NH
%D 1994
%U http://www.cs.dartmouth.edu/reports/TR94-218.ps.Z
%X
One of the main goals for people who use computer systems, particularly computational scientists, is speed. In the quest for ways to make applications run faster, engineers have developed parallel computers, which use more than one CPU to solve a task. However, many institutions already possess significant computational power in networks of workstations. Through software, it is possible to glue together clusters of machines to simulate a parallel environment. SPEDE is one such system, designed to place the potential of local machines at the fingertips of the programmer. Through a simple interface, users design computational objects that can be linked and run in parallel. The goal of the project is to have a small portable environment that allows various types of computer systems to interact. SPEDE requires no altering of the kernel and does not require system privileges to use. Using SPEDE, programmers can get significant speedup for computationally intensive problems. As an example, a Mandelbrot image generator was implemented that attained a five-fold speedup with eight processors.
%Z
A Senior Thesis in the Department of Math and Computer Science, Dartmouth College, 1992.
See also Technical Report PCS-TR94-217 for a user's manual for SPEDE.

%T Fast Spherical Transforms on Distance Transitive Graphs
%A James R. Driscoll
%A Dennis M. Healy
%A Daniel N. Rockmore
%R Technical Report PCS-TR94-219
%I Dartmouth College, Computer Science
%C Hanover, NH
%D 1994
%U http://www.cs.dartmouth.edu/reports/TR94-219.pdf

%T A Detailed Simulation Model of the HP 97560 Disk Drive
%A David Kotz
%A Song Bac Toh
%A Sriram Radhakrishnan
%R Technical Report PCS-TR94-220
%I Dartmouth College, Computer Science
%C Hanover, NH
%D 1994
%U http://www.cs.dartmouth.edu/reports/TR94-220.ps.Z
%X
We implemented a detailed model of the HP 97560 disk drive, to replicate a model devised by Ruemmler and Wilkes (both of Hewlett-Packard, HP). Our model simulates one or more disk drives attached to one or more SCSI buses. The design is broken into three components: a test driver, the disk model itself, and the discrete-event simulation support. Thus, the disk model can be easily extracted and used in other simulation environments. We validated our model using traces obtained from HP, using the same "demerit" measure as Ruemmler and Wilkes. We obtained a demerit percentage of 3.9%, indicating that our model was extremely accurate. This paper describes our implementation, and is meant for those wishing to use our model, see our validation, or understand our code.
%Z
The software is available.

%T Efficiency and Stability Issues in the Numerical Computation of Fourier Transforms and Convolutions on the 2-Sphere
%A Dennis M. Healy
%A Sean S. B. Moore
%A Daniel N. Rockmore
%R Technical Report PCS-TR94-222
%I Dartmouth College, Computer Science
%C Hanover, NH
%D 1994
%U http://www.cs.dartmouth.edu/reports/TR94-222.pdf

%T Asymptotically Tight Bounds for Performing BMMC Permutations on Parallel Disk Systems
%A Thomas H. Cormen
%A Thomas Sundquist
%A Leonard F. Wisniewski
%R Technical Report PCS-TR94-223
%I Dartmouth College, Computer Science
%C Hanover, NH
%D 1994
%U http://www.cs.dartmouth.edu/reports/TR94-223.ps.Z
%X
We give asymptotically equal lower and upper bounds for the number of parallel I/O operations required to perform bit-matrix-multiply/complement (BMMC) permutations on parallel disk systems. In a BMMC permutation on N records, where N is a power of 2, each (lg N)-bit source address x maps to a corresponding (lg N)-bit target address y by the matrix equation y = Ax XOR c, where matrix multiplication is performed over GF(2). The characteristic matrix A is (lg N) x (lg N) and nonsingular over GF(2). Under the Vitter-Shriver parallel-disk model with N records, D disks, B records per block, and M records of memory, we show a universal lower bound of $\Omega \left( \frac{N}{BD} \left( 1 + \frac{\rank{\gamma}}{\lg (M/B)} \right) \right)$ parallel I/Os for performing a BMMC permutation, where gamma is the lower left (lg (N/B)) x (lg B) submatrix of the characteristic matrix. We adapt this lower bound to show that the algorithm for bit-permute/complement (BPC) permutations in Cormen93a is asymptotically optimal. We also present an algorithm that uses at most $\frac{2N}{BD} \left( 4 \ceil{\frac{\rank{\gamma}}{\lg (M/B)}} + 4 \right)$ parallel I/Os, which asymptotically matches the lower bound and improves upon the BMMC algorithm in Cormen93a. When rank (gamma) is low, this method is an improvement over the general-permutation bound of $\Theta \left( \frac{N}{BD} \frac{\lg (N/B)}{\lg (M/B)} \right)$. We introduce a new subclass of BMMC permutations, called memory-load-dispersal (MLD) permutations, which can be performed in one pass. This subclass, which is used in the BMMC algorithm, extends the catalog of one-pass permutations appearing in Cormen93a.
Although many BMMC permutations of practical interest fall into subclasses that might be explicitly invoked within the source code, we show how to detect in at most $N/BD + \ceil{\frac{\lg (N/B) + 1}{D}}$ parallel I/Os whether a given vector of target addresses specifies a BMMC permutation. Thus, one can determine efficiently at run time whether a permutation to be performed is BMMC and then avoid the general-permutation algorithm and save parallel I/Os by using our algorithm.

%T BMMC Permutations on a DECmpp 12000/sx 2000
%A Kristin Bruhl
%R Technical Report PCS-TR94-224
%I Dartmouth College, Computer Science
%C Hanover, NH
%D 1994
%U http://www.cs.dartmouth.edu/reports/TR94-224.ps.Z
%X
Increasingly, modern computing problems, including many scientific and business applications, require huge amounts of data to be examined, modified, and stored. Parallel computers can be used to decrease the time needed to operate on such large data sets, by allowing computations to be performed on many pieces of data at once. For example, on the DECmpp machine used in our research, there are 2048 processors in the parallel processor array. The DECmpp can read data into each of these processors, perform a computation in parallel on all of it, and write the data out again, theoretically decreasing the execution time by a factor of 2048 over the time required by one of its processors. Often, the computations that occur after the data is in the processors involve rearranging, or permuting, the data within the array of parallel processors. Information moves between processors by means of a network connecting them. Communication through the network can be very expensive, especially if there are many collisions--simultaneous contentions for the same network resource--between items of data moving from one processor to another.
When a program performs hundreds or even thousands of these permutations during its execution, a bottleneck can occur, impeding the overall performance of the program. Effective algorithms that decrease the time required to permute the data within a parallel computer can yield a significant speed increase in running programs with large data sets. Cormen has designed algorithms to improve performance when the data movement is defined by certain classes of permutations. This thesis will examine the performance of one of these classes, the bit-matrix-multiply/complement (BMMC) permutation, when implemented on the DECmpp. Although Cormen's algorithm was designed for parallel disk systems, this thesis adapts it to permutations of data residing in the memory of the parallel processors. The DECmpp network follows the model of an Extended Delta Network (EDN). One characteristic of an EDN is that it has a set of input and output ports to the network, each of which can carry only one item of data at a time. If more than one item needs to travel over a given port, a collision occurs. The data must access the port serially, which slows down the entire operation. Cormen's algorithm reduces these collisions by computing a schedule for sending the data over the network. For small data sets, it is not worthwhile to perform the extra operations to generate such a schedule, because the overhead associated with computing the schedule outweighs the time gained by preventing collisions at the network ports. As the size of the data set increases, eliminating collisions becomes more and more valuable. On the DECmpp, when the data permutation involves more than 128 elements per processor, our algorithm beats the more naive and obvious method for permuting in the parallel processor array.
%Z
A Senior Honors Thesis in Computer Science. Advisor: Thomas Cormen.

%T Human Creativity Through Computer Gaming
%A Christine McGavran
%R Technical Report PCS-TR94-225
%I Dartmouth College, Computer Science
%C Hanover, NH
%D 1994
%U http://www.cs.dartmouth.edu/reports/TR94-225.pdf
%Z
A Senior Honors Thesis in Computer Science.

%T Disk-directed I/O for MIMD Multiprocessors
%A David Kotz
%R Technical Report PCS-TR94-226
%I Dartmouth College, Computer Science
%C Hanover, NH
%D 1994
%U http://www.cs.dartmouth.edu/reports/TR94-226.ps.Z
%X
Many scientific applications that run on today's multiprocessors are bottlenecked by their file I/O needs. Even if the multiprocessor is configured with sufficient I/O hardware, the file-system software often fails to provide the available bandwidth to the application. Although libraries and improved file-system interfaces can make a significant improvement, we believe that fundamental changes are needed in the file-server software. We propose a new technique, disk-directed I/O, that flips the usual relationship between server and client to allow the disks (actually, disk servers) to determine the flow of data for maximum performance. Our simulations show that tremendous performance gains are possible. Indeed, disk-directed I/O provided consistent high performance that was largely independent of data distribution, and close to the maximum disk bandwidth.
%Z
A shorter version of this paper appeared in OSDI, although this TR has more recent numbers. The OSDI version is available here. A more complete journal version is here. The simulator software is also available.

%T Deciding Finiteness for Matrix Groups Over Function Fields
%A Robert Beals
%A Daniel N. Rockmore
%A Ki-Seng Tan
%R Technical Report PCS-TR94-227
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 1995
%U http://www.cs.dartmouth.edu/reports/TR94-227.pdf
%X
Let S be any finite subset of GLn(F(t)) where F is a field. In this paper we give algorithms to decide if the group generated by S is finite.
In the case of characteristic zero, slight modifications of earlier work of Babai, Beals and Rockmore [1] give polynomial time deterministic algorithms to solve this problem. The case of positive characteristic turns out to be more subtle and our algorithms depend on a structure theorem proved here, generalizing a theorem of Weil. We also present a fairly detailed analysis of the size of finite subgroups in this case and give bounds which depend upon the number of generators. To this end we also introduce the notion of the diameter of a finitely generated algebra and derive some upper bounds related to this quantity. In positive characteristic the deterministic algorithms we present are exponential. A randomized algorithm based on ideas of the Meat-Axe is also given. While not provably efficient, the success of the Meat-Axe suggests the randomized algorithm will be useful.

%T A New Approach to the Minimum Cut Problem
%A David R. Karger
%A Clifford Stein
%R Technical Report PCS-TR94-229
%I Dartmouth College, Computer Science
%C Hanover, NH
%D 1994
%U http://www.cs.dartmouth.edu/reports/TR94-229.pdf

%T A Multiprocessor Extension to the Conventional File System Interface
%A Nils Nieuwejaar
%A David Kotz
%R Technical Report PCS-TR94-230
%I Dartmouth College, Computer Science
%C Hanover, NH
%D 1994
%U http://www.cs.dartmouth.edu/reports/TR94-230.ps.Z
%X
As the I/O needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. By tracing all the activity of a parallel file system in a production, scientific computing environment, we show that many applications exhibit highly regular, but non-consecutive I/O access patterns.
Since the conventional interface does not provide an efficient method of describing these patterns, we present an extension which supports strided and nested-strided I/O requests.
%Z
A result of the CHARISMA project. This report has been superseded by report PCS-TR95-253.

%T Multimedia authoring, development environments, and digital video editing.
%A Fillia Makedon
%A James Matthews
%A Charles B. Owen
%A Samuel A. Rebelsky
%R Technical Report PCS-TR94-231
%I Dartmouth College, Computer Science
%C Hanover, NH
%D November 1,
%U http://www.cs.dartmouth.edu/reports/TR94-231.ps.Z
%X
Multimedia systems integrate text, audio, video, graphics, and other media and allow them to be utilized in a combined and interactive manner. Using this exciting and rapidly developing technology, multimedia applications can provide extensive benefits in a variety of arenas, including research, education, medicine, and commerce. While there are many commercial multimedia development packages, the easy and fast creation of a useful, full-featured multimedia document is not yet a straightforward task. This paper addresses issues in the development of multimedia documents, ranging from user-interface tools that manipulate multimedia documents to multimedia communication technologies such as compression, digital video editing and information retrieval. It outlines the basic steps in the multimedia authoring process and some of the requirements that need to be met by multimedia development environments. It also presents the role of video, an essential component of multimedia systems and the role of programming in digital video editing. A model is described for remote access of distributed video. The paper concludes with a discussion of future research directions and new uses of multimedia documents.
%Z
Appeared in: Proc. of SPIE Defining the Global Information Infrastructure: Systems and Services (Critical Review), Boston, MA, 11/2/94.

%T Exploring the Use of I/O Nodes for Computation in a MIMD Multiprocessor
%A David Kotz
%A Ting Cai
%R Technical Report PCS-TR94-232
%I Dartmouth College, Computer Science
%C Hanover, NH
%D 1994
%U http://www.cs.dartmouth.edu/reports/TR94-232.ps.Z
%X
As parallel systems move into the production scientific computing world, the emphasis will be on cost-effective solutions that provide high throughput for a mix of applications. Cost-effective solutions demand that a system make effective use of all of its resources. Many MIMD multiprocessors today, however, distinguish between ``compute'' and ``I/O'' nodes, the latter having attached disks and being dedicated to running the file-system server. This static division of responsibilities simplifies system management but does not necessarily lead to the best performance in workloads that need a different balance of computation and I/O. Of course, computational processes sharing a node with a file-system service may receive less CPU time, network bandwidth, and memory bandwidth than they would on a computation-only node. In this paper we examine this issue experimentally. We found that high-performance I/O does not necessarily require substantial CPU time, leaving plenty of time for application computation. There were some complex file-system requests, however, which left little CPU time available to the application. (The impact on network and memory bandwidth still needs to be determined.) For applications (or users) that cannot tolerate an occasional interruption, we recommend that they continue to use only compute nodes. For tolerant applications needing more cycles than those provided by the compute nodes, we recommend that they take full advantage of both compute and I/O nodes for computation, and that operating systems should make this possible.
%Z
Revised version appeared in IOPADS '95 at IPPS '95, pages 78-89.

%T Hypergraph Partitioning Algorithms
%A Tom Leighton
%A Fillia Makedon
%A Spyros Tragoudas
%R Technical Report PCS-TR94-233
%I Dartmouth College, Computer Science
%C Hanover, NH
%D October, 1994
%U http://www.cs.dartmouth.edu/reports/TR94-233.pdf
%X
We present the first polynomial time approximation algorithms for the balanced hypergraph partitioning problem. The approximations are within polylogarithmic factors of the optimal solutions. The choice of algorithm involves a time complexity/approximation bound tradeoff. We employ a two step methodology. First we approximate the flux of the input hypergraph. This involves an approximate solution to a concurrent flow problem on the hypergraph. In the second step we use the approximate flux to obtain approximations for the balanced bipartitioning problem. Our results extend the approximation algorithms by Leighton-Rao on graphs to hypergraphs. We also give the first polylogarithmic times optimal approximation algorithms for multiway (graph and hypergraph) partitioning problems into bounded size sets. A better approximation algorithm for the latter problem is finally presented for the special case of bounded sets of size at most O(log n) on planar graphs and hypergraphs, where n is the number of nodes of the input instance.

%T A DAta-Parallel Programming Library for Education (DAPPLE)
%A David Kotz
%R Technical Report PCS-TR94-235
%I Dartmouth College, Computer Science
%C Hanover, NH
%D 1994
%U http://www.cs.dartmouth.edu/reports/TR94-235.ps.Z
%X
In the context of our overall goal to bring the concepts of parallel computing into the undergraduate curriculum, we set out to find a parallel-programming language for student use. To make it accessible to students at all levels, and to be independent of any particular hardware platform, we chose to design our own language, based on a data-parallel model and on C++.
The result, DAPPLE, is a C++ class library designed to provide the illusion of a data-parallel programming language on conventional hardware and with conventional compilers. DAPPLE defines Vectors and Matrices as basic classes, with all the usual C++ operators overloaded to provide elementwise arithmetic. In addition, DAPPLE provides typical data-parallel operations like scans, permutations, and reductions. Finally, DAPPLE provides a parallel if-then-else statement to restrict the scope of the above operations to partial vectors or matrices.
%Z
A revision of this report appeared in SIGCSE '95.

%T Distributed Scheduling in Finite Capacity Networks
%A Perry Fizzano
%A Clifford Stein
%R Technical Report PCS-TR94-236
%I Dartmouth College, Computer Science
%C Hanover, NH
%D November, 1994
%U http://www.cs.dartmouth.edu/reports/TR94-236.ps.Z
%X
We consider the problem of scheduling unit-sized jobs in a distributed network of processors. Each processor only knows the number of jobs it and its neighbors have. We give an analysis of an intuitive algorithm and prove that the algorithm produces schedules that are within a logarithmic factor of the length of the optimal schedule given that the optimal schedule is sufficiently long.

%T Efficient Parallel Algorithms for Closest Point Problems
%A Peter Su
%R Technical Report PCS-TR94-238
%I Dartmouth College, Computer Science
%C Hanover, NH
%D November 1994
%U http://www.cs.dartmouth.edu/reports/TR94-238.pdf
%X
This dissertation develops and studies fast algorithms for solving closest point problems. Algorithms for such problems have applications in many areas including statistical classification, crystallography, data compression, and finite element analysis. In addition to a comprehensive empirical study of known sequential methods, I introduce new parallel algorithms for these problems that are both efficient and practical. I present a simple and flexible programming model for designing and analyzing parallel algorithms.
Also, I describe fast parallel algorithms for nearest-neighbor searching and constructing Voronoi diagrams. Finally, I demonstrate that my algorithms actually obtain good performance on a wide variety of machine architectures. The key algorithmic ideas that I examine are exploiting spatial locality and random sampling. Spatial decomposition allows many concurrent threads to work independently of one another in local areas of a shared data structure. Random sampling provides a simple way to adaptively decompose irregular problems, and to balance workload among many threads. Used together, these techniques result in effective algorithms for a wide range of geometric problems. The key experimental ideas used in my thesis are simulation and animation. I use algorithm animation to validate algorithms and gain intuition about their behavior. I model the expected performance of algorithms using simulation experiments, and some knowledge as to how much critical primitive operations will cost on a given machine. In addition, I do this without the burden of esoteric computational models that attempt to cover every possible variable in the design of a computer system. An iterative process of design, validation, and simulation delays the actual implementation until as many details as possible are accounted for. Then, further experiments are used to tune the implementations for better performance.
%Z
Advisor: Scot Drysdale

%T The Design and Development of Interactive Multimedia Conference Proceedings
%A Samuel A. Rebelsky
%A James Ford
%A Kenneth Harker
%A Fillia Makedon
%A Panagiotis Metaxas
%A Charles B. Owen
%R Technical Report PCS-TR94-239
%I Dartmouth College, Computer Science
%C Hanover, NH
%D 1994
%U http://www.cs.dartmouth.edu/reports/TR94-239.pdf

%T Incremental Equational Programming
%A Samuel A. Rebelsky
%R Technical Report PCS-TR94-240
%I Dartmouth College, Computer Science
%C Hanover, NH
%D 1994
%U http://www.cs.dartmouth.edu/reports/TR94-240.pdf

%T Building Multimedia Proceedings: The Roles of Video in Interactive Electronic Conference Proceedings
%A Samuel A. Rebelsky
%A Fillia Makedon
%A James Matthews
%A Charles B. Owen
%A Laura Bright
%A Kenneth Harker
%A Nancy Toth
%A Panagiotis Metaxas
%R Technical Report PCS-TR94-241
%I Dartmouth College, Computer Science
%C Hanover, NH
%D 1994
%U http://www.cs.dartmouth.edu/reports/TR94-241.pdf

%T ViC*: A Preprocessor for Virtual-Memory C*
%A Thomas H. Cormen
%A Alex Colvin
%R Technical Report PCS-TR94-243
%I Dartmouth College, Computer Science
%C Hanover, NH
%D 1994
%U http://www.cs.dartmouth.edu/reports/TR94-243.ps.Z
%X
This paper describes the functionality of ViC*, a compiler-like preprocessor for out-of-core C*. The input to ViC* is a C* program but with certain shapes declared \verb`outofcore`, which means that all parallel variables of these shapes reside on disk. The output is a standard C* program with the appropriate I/O and library calls added for efficient access to out-of-core parallel variables.

%T DartCVL: The Dartmouth C Vector Library
%A Thomas H. Cormen
%A Sumit Chawla
%A Preston Crow
%A Melissa Hirschl
%A Roberto Hoyle
%A Keith D. Kotay
%A Rolf H. Nelson
%A Nils Nieuwejaar
%A Scott M. Silver
%A Michael B. Taylor
%A Rajiv Wickremesinghe
%R Technical Report PCS-TR95-250
%I Dartmouth College, Computer Science
%C Hanover, NH
%D 1995
%U http://www.cs.dartmouth.edu/reports/TR95-250.ps.Z
%X
As a class project, we implemented