%T Reducing Mass Degeneracy in SAR by MS by Stable Isotopic Labeling
%A Chris Bailey-Kellogg
%A John J. Kelley
%A Clifford Stein
%A Bruce Randall Donald
%R Technical Report TR2000-362
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-362.ps.Z
%X
Mass spectrometry (MS) promises to be an invaluable tool for
functional genomics, by supporting low-cost, high-throughput
experiments. However, large-scale MS faces the potential problem of
mass degeneracy -- indistinguishable masses for multiple
biopolymer fragments (e.g. from a limited proteolytic digest). This
paper studies the tasks of planning and interpreting MS experiments
that use selective isotopic labeling, thereby substantially reducing
potential mass degeneracy. Our algorithms support an
experimental-computational protocol called Structure-Activity
Relation by Mass Spectrometry (SAR by MS), for elucidating the
function of protein-DNA and protein-protein complexes. SAR by MS
enzymatically cleaves a crosslinked complex and analyzes the resulting
mass spectrum for mass peaks of hypothesized fragments. Depending on
binding mode, some cleavage sites will be shielded; the absence of
anticipated peaks implicates corresponding fragments as either part of
the interaction region or inaccessible due to conformational change
upon binding. Thus different mass spectra provide evidence for
different structure-activity relations. We address combinatorial and
algorithmic questions in the areas of data analysis
(constraining binding mode based on mass signature) and
experiment planning (determining an isotopic labeling strategy to
reduce mass degeneracy and aid data analysis). We explore the
computational complexity of these problems, obtaining upper and lower
bounds. We report experimental results from implementations of our
algorithms.
%Z
This report supercedes TR99-359.
To appear in the 8th
International Conference on Intelligent Systems for Molecular Biology,
(August 20-23, 2000) La Jolla, CA (Accepted; in press).
%T A Formal Semantics for SPKI
%A Jon Howell
%A David Kotz
%R Technical Report TR2000-363
%I Dartmouth College, Computer Science
%C Hanover, NH
%D March 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-363.ps.Z
%X
We extend the logic and semantics of authorization due to
Abadi, Lampson, et al. to support restricted delegation.
Our formal model provides a simple interpretation for the variety of
constructs in the Simple Public Key Infrastructure (SPKI), and
lends intuition about possible extensions. We discuss both extensions
that our semantics supports and extensions that it cautions against.
%Z
This TR supercedes TR1999-361.
This technical report is an extended version of a paper submitted to
ESORICS 2000.
For more information, see the project web page.
%T Landmarks for absolute localization
%A Jon Howell
%A Keith Kotay
%R Technical Report TR2000-364
%I Dartmouth College, Computer Science
%C Hanover, NH
%D March 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-364.ps.Z
%X
For certain experiments in mobile robotics, it is convenient to eliminate
positional estimation error in the interest of analyzing other parts of
the experiment. We designed and implemented a simple, accurate scheme
for encoding and recovering absolute position information. The encoding
is a two-dimensional image printed on the plane of the floor, and the
absolute position information is recovered using a downward-looking
video camera mounted on a mobile robot.
%Z
This document describes work done in the Dartmouth Robotics Laboratory
in April of 1997 and August of 1999. It was previously ``published'' as
a web page, but we thought it would make sense to document it more
permanently.
%T Mobile Agents: Motivations and State-of-the-Art Systems
%A Robert S. Gray
%A David Kotz
%A George Cybenko
%A Daniela Rus
%R Technical Report TR2000-365
%I Dartmouth College, Computer Science
%C Hanover, NH
%D April 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-365.ps.Z
%X
A mobile agent is an executing program that can migrate, at times of its
own choosing, from machine to machine in a heterogeneous network. On
each machine, the agent interacts with stationary service agents and
other resources to accomplish its task. In this chapter, we first make
the case for mobile agents, discussing six strengths of mobile agents
and the applications that benefit from these strengths. Although none of
these strengths are unique to mobile agents, no
competing technique shares all six. In other words, a mobile-agent
system provides a single general framework in which a wide range of
distributed applications can be implemented efficiently and easily.
We then present a representative cross-section of current mobile-agent
systems.
%Z
This technical report will appear as a chapter in Jeffrey M. Bradshaw,
editor, Handbook of Agent Technology, AAAI/MIT Press, 2000. In Press.
%T Performance Analysis of Mobile Agents for Filtering Data Streams on Wireless Networks
%A David Kotz
%A Guofei Jiang
%A Robert S. Gray
%A George Cybenko
%A Ronald A. Peterson
%R Technical Report TR2000-366
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-366.ps.Z
%X
Wireless networks are an ideal environment for mobile agents, because
their mobility allows them to move across an unreliable link to reside
on a wired host, next to or closer to the resources they need to use.
Furthermore, client-specific data transformations can be moved across
the wireless link, and run on a wired gateway server, with the goal of
reducing bandwidth demands. In this paper we examine the tradeoffs
faced when deciding whether to use mobile agents to support a
data-filtering application, in which numerous wireless clients filter
information from a large data stream arriving across the wired
network. We develop an analytical model and use parameters from our
own experiments to explore the model's implications.
%Z
In August 2000 a revised version appeared in the International
Workshop on Modeling and Simulation of Wireless and Mobile Systems
(MSWiM 2000). In October 2000 a further revised version appeared as
Dartmouth Technical Report TR2000-377, and was submitted to the
journal Mobile Networks and Applications (ACM MONET).
%T Approximation Algorithms for the Minimum Bends Traveling Salesman Problem
%A Cliff Stein
%A David P. Wagner
%R Technical Report TR2000-367
%I Dartmouth College, Computer Science
%C Hanover, NH
%D April 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-367.ps.Z
%X
The problem of traversing a set of points in the order that minimizes
the total distance traveled (traveling salesman problem) is one of
the most famous and well-studied problems in combinatorial
optimization. It has many applications, and has been a testbed for
many of the must useful ideas in algorithm design and analysis. The
usual metric, minimizing the total distance traveled, is an important
one, but many other metrics are of interest.
In this paper, we introduce the metric of minimizing the number of
turns in the tour, given that the input points are in the Euclidean
plane. To our knowledge this metric has not been studied previously.
It is motivated by applications in robotics and in the movement of
other heavy machinery: for many such devices turning is an expensive
operation. We give approximation algorithms for several variants
of the traveling salesman problem for which the metric is to minimize
the number of turns. We call this the minimum bends traveling salesman
problem.
For the case of an arbitrary set of $n$ points in the
Euclidean plane, we give an O(lg z)-approximation algorithm, where
z is the maximum number of collinear points. In the worst case z
can be as big as n, but z will often be much smaller. For the case
when the lines are restricted to being either horizontal or vertical,
we give a 2-approximation algorithm. If we have the further
restriction that no two points are allowed to have the same x- or
y-coordinate, we give an algorithm that finds a tour which makes at
most two turns more than the optimal tour. Thus we have an
approximation algorithm with an additive, rather than a multiplicative
error bound. Beyond the additive error bound, our algorithm for this
problem introduces several interesting algorithmic techniques for
decomposing sets of points in the Euclidean plane that we believe
to be of independent interest.
%Z
Submitted to FOCS 2000.
%T A Simulation of Auroral Absorption
%A Eric Michael Greenberg
%R Technical Report TR2000-368
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-368.pdf
%X
HF radio transmissions propagate long distances by reflecting off the ionosphere. At high latitudes radio propagation is strongly affected by the northern lights (aurora borealis), which causes ionization at low altitudes and hence the absorption of radio waves. Models of this process are still in a primitive state. A simulation of radio wave propagation was created in order to test Foppiano and Bradley's empirical model of auroral absorption. The simulation attempts to predict the net absorption of signals at a receiver by simulating a large number of transmitters, even though the exact sources of the signals are unknown. Although the simulation takes into account auroral and nonauroral absorption as well as other sources of path loss, the analysis focuses on the nighttime aurora. An intelligent search algorithm is used in order to efficiently adjust the model to best fit the data. The output of the simulation is qualitatively and quantitatively compared to signal levels observed with HF radio receivers located in northern Canada. The analysis allows us to develop alternative models of auroral absorption which account for the level of geomagnetic activity, and these are compared to the standard Foppiano and Bradley model.
%Z
Advisor: Dan Rockmore
%T Registration of Images with Dissimilar Contrast using a Hybrid Method Employing Correlation and Mutual Information
%A Karolyn A. Abram
%R Technical Report TR2000-369
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-369.ps.Z
%X
The problem of fitting one image into another is commonly
called "registration." Finding the best possible translation and rotation
necessary to align two images is one approach to solving this problem.
Registration is a crucial component of many remote sensing and medical
image interpretation applications. Image alignment techniques aid in
volumetric estimations of complicated structures and allow
radiologists to accurately identify changes between sequential images.
Radiologists require image alignment capabilities to correct for
patient motion and/or content displacement between images.
Numerous image registration techniques exist for correcting the
alignment problems mentioned above. Unfortunately, most of these
techniques, such as Correlation, fail to find a good alignment when
dealing with images that differ in contrast. The Mutual Information
method is able to align images independently of contrast, but it is
computationally intensive. We explore a hybrid technique that
utilizes both Correlation and Mutual Information. The Hybrid
technique hopes to gain greater contrast independence than Correlation
alone while achieving a lower running time than a pure Mutual
Information technique.
%Z
Undergraduate Honors Thesis
Advisor: Daniel N. Rockmore
%T An Infrastructure for a Mobile-Agent System that Provides Personalized Services to Mobile Devices
%A Debbie O. Chyi
%R Technical Report TR2000-370
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-370.ps.Z
%X
In this paper, we present the design of a mobile-agent system
that provides a mobile user with a personalized information retrieval
service and we describe the implementation of the infrastructure for
such a system. This "Personal Agent System" gathers information from
the Internet and uses context-aware mechanisms to manage the
information according to a mobile user's needs and preferences. The
user's schedule and location are the context indicators in this
system. These indicators are critical in ensuring that users obtain
only the information they want, receive information in a form that is
most useful for viewing on their mobile device, and is notified of
new information in a minimally intrusive manner. The system
incorporates a rule-based learning system to enhance the
personalization achieved by the system.
%Z
Undergraduate Honors Thesis.
Advisor: David F. Kotz
%T Personal Radio
%A John C. Artz
%R Technical Report TR2000-372
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-372.ps.Z
%X
With the development of new technologies that allow the broadcast of
digital data over radio signals, there are many possibilities for
improving upon the traditional radio station model for content
delivery. The idea of Personal Radio is a system that tailors content
to meet the needs of each individual. Using Global Positioning System
(GPS) technology to play location specific content, the listening
history to play content an appropriate number of times, and user
feedback to learn personal preferences, the Personal Radio provides
the listener with the content that is the most useful/interesting to
them. This paper will examine the general design of such a system and
present solutions developed in the implementation of several pieces
of the design.
%Z
Undergraduate Honors Thesis.
Advisors: David Kotz and Daniela Rus
%T Depth from Flash
%A David B. Martin
%R Technical Report TR2000-373
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-373.ps.Z
%X
Digital camera technology has recently seen substantial
improvements in image quality while lower prices have made
it affordable to the average consumer. Camera manufacturers,
however, are not taking full advantage of this new medium
for image capture. By filtering the already digitized image
produced by these cameras through on-board image processing
algorithms we can dramatically increase the power of digital
cameras. For example, according to experts in the
photographic industry, most people simply take bad
pictures. Classic examples of this phenomenon are
photographs taken indoors with a point-and-shoot style
camera using its built-in flash. The subjects of these
photographs often seem to have a spotlight on them, making
them look bright and washed out while the rest of the
photograph is dark and indistinct. This can primarily be
accounted for by a well known property of point light
sources: falloff in brightness is inversely proportional to
the square of the distance between the light and the object
being illuminated. A technique first introduced in the field
of computer vision has been shown to successfully recover
information about the distance between the light source and
objects in the world. We propose using this technique, which
is readily implementable in hardware, to correct for a
variety of poorly illuminated digital images.
%Z
Undergraduate Honors Thesis.
Advisor: Hany Farid
%T An Economic CPU-Time Market for D'Agents
%A Ezra E. K. Cooper
%A Robert S. Gray
%R Technical Report TR2000-375
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-375.ps.Z
%X
A usable and efficient resource-management system has been created for
use with D'Agents. The software dynamically negotiates a price rate for
CPU time, using the competitive bids of mobile agents that offer
currency in return for fast computation. The system allows mobile agents
to plan their expenditures across many hosts while minimizing the time
needed for their tasks. The ability to price CPU time opens the door for
service owners to be compensated for the computation consumed by agents and
provides an incentive for servers to allow anonymous agents. We
discuss the theoretical background which makes a CPU market system
possible and the performance of the D'Agents market system.
%Z
Undergraduate honors thesis.
Advisor: Bob Gray.
%T The complexity of planning with partially-observable Markov decision processes
%A Martin Mundhenk
%R Technical Report TR2000-376
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-376.ps.Z
%X
This work surveys results on the complexity of planning under
uncertainty. The planning model considered is the partially-observable
Markov decision process. The general planning problems are,
given such a process,
(a) to calculate its performance under a given control policy,
(b) to find an optimal or approximate optimal control policy, and
(c) to decide whether a good policy exists.
The complexity of this and related problems
depend on a variety of factors,
including the observability of the process state,
the compactness of the process representation,
the type of policy, or
even the number of actions relative to the number of states.
In most cases, the problem can be shown to be
complete for some known complexity class.
The skeleton of this survey are results from
Littman, Goldsmith and Mundhenk (Journal of Artificial Intelligence
Research 1998), Mundhenk (Mathematics of Operations Research 2000),
Mundhenk, Goldsmith, Lusena and Allender (Journal of the ACM 2000),
and Lusena, Goldsmith and Mundhenk (University of KY CS TR).
But there are also some news.
%T Performance Analysis of Mobile Agents for Filtering Data Streams on Wireless Networks
%A David Kotz
%A George Cybenko
%A Robert S. Gray
%A Guofei Jiang
%A Ronald A. Peterson
%A Martin O. Hofmann
%A Daria A. Chacon
%A Kenneth R. Whitebread
%A James Hendler
%R Technical Report TR2000-377
%I Dartmouth College, Computer Science
%C Hanover, NH
%D October 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-377.ps.Z
%X
Wireless networks are an ideal environment for mobile agents, since their
mobility allows them to move across an unreliable link to reside on a wired
host, next to or closer to the resources that they need to use. Furthermore,
client-specific data transformations can be moved across the wireless link and
run on a wired gateway server, reducing bandwidth demands. In
this paper we examine the tradeoffs faced when deciding whether to use mobile
agents in a data-filtering application where numerous wireless clients filter
information from a large data stream arriving across the wired network. We
develop an analytical model and use parameters from filtering experiments
conducted during a U.S. Navy Fleet Battle Experiment (FBE) to explore the
model's implications.
%Z
Updated version of TR2000-366. To appear, after revisions, in the journal
Mobile Networks and Applications (ACM MONET).
%T Naming and sharing resources across administrative boundaries (Volume I)
%A Jon Howell
%R Technical Report TR2000-378
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-378.ps.Z
%X
I tackle the problem of naming and sharing resources across
administrative boundaries. Conventional systems manifest the hierarchy
of typical administrative structure in the structure of their own
mechanism. While natural for communication that follows hierarchical
patterns, such systems interfere with naming and sharing that cross
administrative boundaries, and therefore cause headaches for both users
and administrators. I propose to organize resource naming and
security, not around administrative domains, but around the sharing
patterns of users.
The dissertation is organized into four main parts. First, I discuss
the challenges and tradeoffs involved in naming resources and consider
a variety of existing approaches to naming.
Second, I consider the architectural requirements for user-centric
sharing. I evaluate existing systems with respect to these
requirements.
Third, to support the sharing architecture, I develop a formal logic of
sharing that captures the notion of restricted delegation. Restricted
delegation ensures that users can use the same mechanisms to share
resources consistently, regardless of the origin of the resource, or
with whom the user wishes to share the resource next. A formal
semantics gives unambiguous meaning to the logic. I apply the
formalism to the Simple Public Key Infrastructure and discuss how the
formalism either supports or discourages potential extensions to such a
system.
Finally, I use the formalism to drive a user-centric sharing
implementation for distributed systems. I show how this implementation
enables end-to-end authorization, a feature that makes heterogeneous
distributed systems more secure and easier to audit. Conventionally,
gateway services that bridge administrative domains, add abstraction,
or translate protocols typically impede the flow of authorization
information from client to server. In contrast, end-to-end
authorization enables us to build gateway services that preserve
authorization information, hence we reduce the size of the trusted
computing base and enable more effective auditing. I demonstrate my
implementation and show how it enables end-to-end authorization across
various boundaries. I measure my implementation and argue that its
performance tracks that of similar authorization mechanisms without
end-to-end structure.
I conclude that my user-centric philosophy of naming and sharing
benefits both users and administrators.
%Z
This technical report represents Volume One of the dissertation. Volume
One (187 pages) is the heart of the dissertation. Volume Two (237
pages) contains a software manual introduced with illustrated code
fragments, plus plots of the raw data used for the experimental
results presented in Volume One. That material is optional; I
recommend that the interested reader begin with just Volume One.
Volume II is available as TR2000-379.
Please note that a list of errata is available as TR2000-380.
%T Naming and sharing resources across administrative boundaries (Volume II)
%A Jon Howell
%R Technical Report TR2000-379
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-379.ps.Z
%X
I tackle the problem of naming and sharing resources across
administrative boundaries. Conventional systems manifest the hierarchy
of typical administrative structure in the structure of their own
mechanism. While natural for communication that follows hierarchical
patterns, such systems interfere with naming and sharing that cross
administrative boundaries, and therefore cause headaches for both users
and administrators. I propose to organize resource naming and
security, not around administrative domains, but around the sharing
patterns of users.
The dissertation is organized into four main parts. First, I discuss
the challenges and tradeoffs involved in naming resources and consider
a variety of existing approaches to naming.
Second, I consider the architectural requirements for user-centric
sharing. I evaluate existing systems with respect to these
requirements.
Third, to support the sharing architecture, I develop a formal logic of
sharing that captures the notion of restricted delegation. Restricted
delegation ensures that users can use the same mechanisms to share
resources consistently, regardless of the origin of the resource, or
with whom the user wishes to share the resource next. A formal
semantics gives unambiguous meaning to the logic. I apply the
formalism to the Simple Public Key Infrastructure and discuss how the
formalism either supports or discourages potential extensions to such a
system.
Finally, I use the formalism to drive a user-centric sharing
implementation for distributed systems. I show how this implementation
enables end-to-end authorization, a feature that makes heterogeneous
distributed systems more secure and easier to audit. Conventionally,
gateway services that bridge administrative domains, add abstraction,
or translate protocols typically impede the flow of authorization
information from client to server. In contrast, end-to-end
authorization enables us to build gateway services that preserve
authorization information, hence we reduce the size of the trusted
computing base and enable more effective auditing. I demonstrate my
implementation and show how it enables end-to-end authorization across
various boundaries. I measure my implementation and argue that its
performance tracks that of similar authorization mechanisms without
end-to-end structure.
I conclude that my user-centric philosophy of naming and sharing
benefits both users and administrators.
%Z
This technical report represents Volume Two of the dissertation. Volume
One (187 pages) is the heart of the dissertation. Volume Two (237
pages) contains a software manual introduced with illustrated code
fragments, plus plots of the raw data used for the experimental
results presented in Volume One. That material is optional; I
recommend that the interested reader begin with just Volume One.
Volume I is available as TR2000-378.
Please note that a list of errata is available as TR2000-380.
%T Naming and sharing resources across administrative boundaries (errata)
%A Jon Howell
%R Technical Report TR2000-380
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-380.ps.Z
%X
I tackle the problem of naming and sharing resources across
administrative boundaries. Conventional systems manifest the hierarchy
of typical administrative structure in the structure of their own
mechanism. While natural for communication that follows hierarchical
patterns, such systems interfere with naming and sharing that cross
administrative boundaries, and therefore cause headaches for both users
and administrators. I propose to organize resource naming and
security, not around administrative domains, but around the sharing
patterns of users.
The dissertation is organized into four main parts. First, I discuss
the challenges and tradeoffs involved in naming resources and consider
a variety of existing approaches to naming.
Second, I consider the architectural requirements for user-centric
sharing. I evaluate existing systems with respect to these
requirements.
Third, to support the sharing architecture, I develop a formal logic of
sharing that captures the notion of restricted delegation. Restricted
delegation ensures that users can use the same mechanisms to share
resources consistently, regardless of the origin of the resource, or
with whom the user wishes to share the resource next. A formal
semantics gives unambiguous meaning to the logic. I apply the
formalism to the Simple Public Key Infrastructure and discuss how the
formalism either supports or discourages potential extensions to such a
system.
Finally, I use the formalism to drive a user-centric sharing
implementation for distributed systems. I show how this implementation
enables end-to-end authorization, a feature that makes heterogeneous
distributed systems more secure and easier to audit. Conventionally,
gateway services that bridge administrative domains, add abstraction,
or translate protocols typically impede the flow of authorization
information from client to server. In contrast, end-to-end
authorization enables us to build gateway services that preserve
authorization information, hence we reduce the size of the trusted
computing base and enable more effective auditing. I demonstrate my
implementation and show how it enables end-to-end authorization across
various boundaries. I measure my implementation and argue that its
performance tracks that of similar authorization mechanisms without
end-to-end structure.
I conclude that my user-centric philosophy of naming and sharing
benefits both users and administrators.
%Z
This technical report contains errata for the dissertation. Volume
One (187 pages) is the heart of the dissertation. Volume Two (237
pages) contains a software manual introduced with illustrated code
fragments, plus plots of the raw data used for the experimental
results presented in Volume One. That material is optional; I
recommend that the interested reader begin with just Volume One.
Volume I is available as TR2000-378.
Volume II is available as TR2000-379.
%T A Survey of Context-Aware Mobile Computing Research
%A Guanling Chen
%A David Kotz
%R Technical Report TR2000-381
%I Dartmouth College, Computer Science
%C Hanover, NH
%D November 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-381.ps.Z
%X
Context-aware computing is a mobile computing paradigm in which
applications can discover and take advantage of contextual information
(such as user location, time of day, nearby people and devices, and
user activity). Since it was proposed about a decade ago, many
researchers have studied this topic and built several context-aware
applications to demonstrate the usefulness of this new
technology. Context-aware applications (or the system infrastructure
to support them), however, have never been widely available to
everyday users. In this survey of research on context-aware systems
and applications, we looked in depth at the types of context used and
models of context information, at systems that support collecting and
disseminating context, and at applications that adapt to the changing
context. Through this survey, it is clear that context-aware research
is an old but rich area for research. The difficulties and possible
solutions we outline serve as guidance for researchers hoping to make
context-aware computing a reality.
%T Bayes Optimal Metasearch: A Probabilistic Model for Combining the Results of Multiple Retrieval Systems
%A Javed A. Aslam
%A Mark Montague
%R Technical Report TR2000-382
%I Dartmouth College, Computer Science
%C Hanover, NH
%D December 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-382.ps.Z
%X
We introduce a new, probabilistic model for combining the outputs of
an arbitrary number of query retrieval systems. By gathering simple
statistics on the average performance of a given set of query
retrieval systems, we construct a Bayes optimal mechanism for
combining the outputs of these systems. Our construction yields a
metasearch strategy whose empirical performance nearly always
exceeds the performance of any of the constituent systems. Our
construction is also robust in the sense that if ``good'' and
``bad'' systems are combined, the performance of the composite is
still on par with, or exceeds, that of the best constituent system.
Finally, our model and theory provide theoretical and empirical
avenues for the improvement of this metasearch strategy.
%Z
Preliminary version appeared in SIGIR 2000.
%T Reconstructing Ancient Egyptian Tombs
%A Hany Farid
%R Technical Report TR2000-383
%I Dartmouth College, Computer Science
%C Hanover, NH
%D December 2000
%U http://www.cs.dartmouth.edu/reports/TR2000-383.ps.Z
%X
From the pyramids of Giza to the tombs of Thebes (modern Luxor),
ancient Egypt's glorious history has produced remarkable architecture.
Sadly, the nearly four million yearly tourists have taken a heavy toll
on many of these ancient structures. Of particular concern are many
of the tombs located opposite to Luxor on the western bank of the
Nile. Digital reconstruction of these tombs has the potential to help
document and preserve these important historical structures.
Photographing and reconstruction of these tombs poses new and unique
problems that this paper begins to address. Techniques for removing
image distortions, recovering 3-D shape, and correcting for lighting
imbalances are discussed. A complete reconstruction of the tomb of
Sennedjem is shown.
%T Ambiguity-Directed Sampling for Qualitative Analysis of Sparse Data from Spatially-Distributed Physical Systems
%A Chris Bailey-Kellogg
%A Naren Ramakrishnan
%R Technical Report TR2001-384
%I Dartmouth College, Computer Science
%C Hanover, NH
%D January 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-384.ps.Z
%X
A number of important scientific and engineering applications, such as
fluid dynamics simulation and aircraft design, require analysis of
spatially-distributed data from expensive experiments and complex
simulations. In such data-scarce applications, it is advantageous to
use models of given sparse data to identify promising regions for
additional data collection. This paper presents a principled
mechanism for applying domain-specific knowledge to design focused
sampling strategies. In particular, our approach uses ambiguities
identified in a multi-level qualitative analysis of sparse data to
guide iterative data collection. Two case studies demonstrate that
this approach leads to highly effective sampling decisions that are
also explainable in terms of problem structures and domain knowledge.
%T Lock-free Scheduling of Logical Processes in Parallel Simulation
%A Xiaowen Liu
%A David M. Nicol
%A King Tan
%R Technical Report TR2001-385
%I Dartmouth College, Computer Science
%C Hanover, NH
%D January 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-385.ps.Z
%X
With fixed lookahead information in a simulation model, the overhead
of asynchronous conservative parallel simulation lies in the mechanism
used for propagating time updates in order for logical processes to
safely advance their local simulation clocks. Studies have shown that
a good scheduling algorithm should preferentially schedule processes
containing events on the critical path. This paper introduces a
lock-free algorithm for scheduling logical processes in conservative
parallel discrete-event simulation on shared-memory multiprocessor
machines. The algorithm uses fetch\&add operations that help avoid
inefficiencies associated with using locks. The lock-free algorithm is
robust. Experiments show that, compared with the scheduling algorithm
using locks, the lock-free algorithm exhibits better performance when
the number of logical processes assigned to each processor is small or
when the workload becomes significant. In models with large number of
logical processes, our algorithm shows only modest increase in
execution time due to the overhead in the algorithm for extra
bookkeeping.
%Z
A revision of this report appears on PADS 2001.
%T Mobile-Agent versus Client/Server Performance: Scalability in an Information-Retrieval Task
%A Robert S. Gray
%A David Kotz
%A Ronald A. Peterson
%A Peter Gerken
%A Martin Hofmann
%A Daria Chacon
%A Greg Hill
%A Niranjan Suri
%R Technical Report TR2001-386
%I Dartmouth College, Computer Science
%C Hanover, NH
%D January 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-386.ps.Z
%X
Mobile agents are programs that can jump from host
to host in the network, at times and to places of
their own choosing. Many groups have developed
mobile-agent software platforms, and several
mobile-agent applications. Experiments show that
mobile agents can, among other things, lead to
faster applications, reduced bandwidth demands, or
less dependence on a reliable network
connection. There are few if any studies of the
scalability of mobile-agent servers, particularly as
the number of clients grows. We present some recent
performance and scalability experiments that compare
three mobile-agent platforms with each other and
with a traditional client/server approach. The
experiments show that mobile agents often outperform
client/server solutions, but also demonstrate the
deep interaction between environmental and
application parameters. The three mobile-agent
platforms have similar behavior but their absolute
performance varies with underlying implementation
choices.
%Z
Revised version appeared in Mobile Agents 2001.
See here.
%T A simple bound on the expected height of a randomly built binary search tree
%A Javed A. Aslam
%R Technical Report TR2001-387
%I Dartmouth College, Computer Science
%C Hanover, NH
%D Sometime 2001
%Z
Abstract and paper lost.
%T Applying the Vector Radix Method to Multidimensional, Multiprocessor, Out-of-Core Fast Fourier Transforms
%A Michael F. Ringenburg
%R Technical Report TR2001-388
%I Dartmouth College, Computer Science
%C Hanover, NH
%D March 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-388.ps.Z
%X
We describe an efficient algorithm for calculating Fast Fourier
Transforms on matrices of arbitrarily high dimension using the
vector-radix method when the problem size is out-of-core (i.e.,
when the size of the data set is larger than the total available
memory of the system). The algorithm takes advantage of multiple
processors when they are present, but it is also efficient on single-processor
systems. Our work is an extension of work done by Lauren Baptist in
[Bapt99], which applied the vector-radix method to 2-dimensional
out-of-core matrices.
To determine the effectiveness of the algorithm, we present
empirical results as well as an analysis of the I/O, communication,
and computational complexity. We perform the empirical tests on
a DEC 2100 server and on a cluster of Pentium-based Linux
workstations. We compare our results with the traditional
dimensional method of calculating multidimensional FFTs, and show
that as the number of dimensions increases, the vector-radix-based
algorithm becomes increasingly effective relative to the dimensional
method.
In order to calculate the complexity of the algorithm, it was
necessary to develop a method for analyzing the interprocessor
communication costs of the BMMC data-permutation algorithm
(presented in [CSW98]) used by our FFT algorithms. We present
this analysis method and show how it was derived.
%Z
Masters Thesis. Advisor: Tom Cormen.
%T Improving a Brokering System for Linking Distributed Simulations
%A Thomas B. Stephens
%R Technical Report TR2001-389
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-389.ps.Z
%X
The Agent Based Environment for Linking Simulations (ABELS)
is a software framework designed to provide disparate simulations
with dynamically updated data sources. It allows simulations and
other agents to join a "cloud" of interacting producers and consumers
of data. Once they have joined the cloud, they can publish services
to other members and use methods published by others. This paper
presents the initial design of a set of matchmaking components for
the ABELS framework. These components dictate how services describe
their abilities and requirements to ABELS. Furthermore, they help
ABELS successfully match data producing services to the requests of
data consuming clients.
We begin by describing a system for a data producing service
to describe itself to the ABELS cloud, as well as a corresponding
system for a data consumer to describe its needs. We then describe
in detail the three components that make up the ABELS matchmaking
system: the match ranker, which ranks a data producer's ability to
fill the request of a data consumer; the thesaurus, which helps the
match ranker recognize closely related terms; and the unit database,
which allows participants in the ABELS system to translate between
related data units. We also discuss how these basic components can
be built upon and improved in future versions of the ABELS framework.
%Z
Senior Honors Thesis. Advisor: Linda F. Wilson.
%T Mobile Voice Over IP (MVOIP): An Application-level Protocol
%A G. Ayorkor Mills-Tettey
%R Technical Report TR2001-390
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-390.ps.Z
%X
Current Voice over Internet Protocol (VOIP) protocols require
participating hosts to have fixed IP addresses for the duration of a
VOIP call. When using a wireless-enabled host, such as a tablet
computer on an 802.11 wireless network, it is possible for a
participant in a VOIP call to roam around the network, moving from
one subnet to another and needing to change IP addresses. This
address change creates the need for mobility support in VOIP
applications.
We present the design of Mobile Voice over IP (MVOIP), an
application-level protocol that enables such mobility in a VOIP
application based on the ITU H.323 protocol stack. An MVOIP
application uses hints from the surrounding network to determine that
it has switched subnets. It then initiates a hand-off procedure that
comprises pausing its current calls, obtaining a valid IP address for
the current subnet, and reconnecting to the remote party with whom it
was in a call. Testing the system shows that on a Windows 2000
platform there is a perceivable delay in the hand-off process, most
of which is spent in the Windows API for obtaining DHCP addresses.
Despite this bottleneck, MVOIP works well on a wireless network.
%Z
Senior Honors Thesis. Advisor: David Kotz.
%T A Directory Infrastructure to Support Mobile Services
%A Ammar Khalid
%R Technical Report TR2001-391
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-391.ps.Z
%X
Traditional Voice-over-IP applications such as Microsoft NetMeeting
assume that the user is on a machine with a fixed IP address. If,
however, the user connects to the Internet, via a wireless network,
on a handheld device, his IP address frequently changes as he moves
from one subnet to another. In such a situation, we need a service
that can be queried for the most current IP address of a person whom
we wish to contact. In this project, we design and implement such a
directory service. The service authenticates all callers and callees,
is robust against most host failure, and scales to several thousand
registered users.
%Z
Senior Honors Thesis. Advisor: David Kotz.
%T SmartReminder: A Case Study on Context-Sensitive Applications
%A Arun Mathias
%R Technical Report TR2001-392
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-392.ps.Z
%X
Designing context-sensitive applications is challenging. We design
and implement SmartReminder to explore designing context-sensitive
applications and to demonstrate how the SOLAR system can be used
in developing such applications. SmartReminder is an application
that reminds the user based on contextual information. Current
appointment-reminder applications remind the user about their
appointments at an arbitrarily specified time. For instance, they
might remind the user ten minutes before each appointment.
SmartReminder, on the other hand, uses contextual information,
like location, to better estimate the appropriate reminder time
for each appointment. It reminds the user based on where they are,
where they need to be, and how long it will take them to get
there. This paper presents SmartReminder as an illustration of how
context-sensitive applications can be designed using the SOLAR
system for dissemination of contextual information.
%Z
Senior Honors Thesis. Advisor: David Kotz.
%T Measuring early usage of Dartmouth's wireless network
%A Pablo Stern
%R Technical Report TR2001-393
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-393.ps.Z
%X
In Spring 2001, Dartmouth College installed a campus-wide 802.11b
wireless network. To understand how that network is used, we examined
the usage characteristics of the network over a five-week period. We
monitored access points to determine user behavior, and user and
network traffic characteristics. Because our study coincided with the
deployment of the access points, our analysis captures the growth of
a wireless network. The results of this study help understand the
behavior of mobile users and provide a reference to network engineers
wishing to deploy and expand similar wireless networks.
%Z
Senior Honors Thesis. Advisor: David Kotz.
%T An Empirical Study of Training and Testing Error in Boosting
%A David D. Latham
%R Technical Report TR2001-394
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-394.ps.Z
%X
Bounds have been proven for both training and testing error for the boosting
algorithm AdaBoost, but in practice neither seem to produce a particularly
tight bound. In this paper we share some observations of these bounds from
empirical results, and then explore some properties of the algorithm with an
eye towards finding an improved bound for the performance of AdaBoost.
Based on our empirical evidence, the error of a hypothesis which labels
examples probabilistically based upon the confidence of the vote of the weak
hypotheses forms a tighter bound for the training error.
%Z
Senior Honors Thesis. Advisor: Jay Aslam.
%T An Implementation of Object-Oriented Program Transformation for Thought-Guided Debugging
%A Tiffany M. Wong
%R Technical Report TR2001-395
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-395.ps.Z
%X
This paper presents our design and implementation of program transformation for
C++ that will be used in the context of a thought-guided debugging system. The
program uses a lexical analyzer written in Flex and a grammar written in Bison
that work in conjunction to scan the inputted C++ code for function definitions
and class definitions. The code is then transformed to produce trace
information for each defined function, while the original functionality of the
code is left untouched. We also implement two additional data structures that
are used for information storage during the course of the program.
%Z
Senior Honors Thesis. Advisor: Tom Cormen
%T Implementing a Database Information System for an Electronic Baseball Scorecard
%A Tiffany M. Wong
%R Technical Report TR2001-396
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-396.ps.Z
%X
We present our design and implementation of a database system of information
storage and retrieval for an electronic baseball scorecard. The program uses
the relational MySQL database to hold information and a Tcl API to handle
interactions between the database and the user interface code. This paper
discusses the inner workings of how information storage was broken down inside
the database, how queries were internally constructed in accordance with the
user's input, and how statistics for players and teams were calculated and
returned to the user. Finally, we discuss some limitations attached to our
current implementation of the program and propose improvements that can be made
in future versions.
%Z
Senior Honors Thesis. Advisor: Tom Cormen
%T Supporting Adaptive Ubiquitous Applications with the SOLAR System
%A Guanling Chen
%A David Kotz
%R Technical Report TR2001-397
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-397.ps.Z
%X
As we embed more computers into our daily environment, ubiquitous
computing promises to make them less noticeable and help to prevent
information overload. We see, however, few ubiquitous applications
that are able to adapt to the dynamics of user, physical, and
computational context. We believe that there are two challenges
causing this lack of ubiquitous applications: there is no flexible and
scalable way to support information collection and dissemination in a
ubiquitous and mobile environment, and there is no general approach to
building adaptive applications given heterogeneous contextual
information. We propose a system infrastructure, Solar, to meet these
challenges. Solar uses a subscription-based operator graph abstraction
and allows dynamic composition of stackable operators to manage
ubiquitous information sources. After developing a set of diverse
adaptive applications, we expect to identify fundamental techniques
for context-aware adaptation. Our expectation is that Solar's
end-to-end support for information collection, dissemination, and
utilization will make it easy to build adaptive applications for a
ubiquitous mobile environment with many users and devices.
%T A System for Audio Personalization with Applications on Wireless Devices
%A David Marmaros
%R Technical Report TR2001-398
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-398.pdf
%X
We present and analyze a system for dynamically tailoring discrete audio content for numerous users based on aggregate data and intuitive feedback mechanisms. The framework for this system utilizes a flexible client-server architecture to facilitate audio dissemination, with particular attention to distribution over wireless networks. We discuss the requirements and specifications of such a system. We further analyze the algorithms and protocols required for its operation. Finally, we outline and provide data from a demonstration of this application.
%Z
Senior Honors Thesis. Advisors: David Kotz and Daniela Rus.
%T WebALPS Implementation and Performance Analysis: Using Trusted Co-servers to Enhance Privacy and Security of Web Interactions
%A Shan Jiang
%R Technical Report TR2001-399
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-399.ps.Z
%X
The client-server model of the Web poses a fundamental trust issue:
clients are forced to trust in secrecy and correctness of computation
occurring at a remote server of unknown credibility. The current solution
for this problem is to use a PKI (Public Key Infrastructure) system and
SSL (Secure Sockets Layer) digital certificates to prove the claimed
identity of a server and establish an authenticated, encrypted channel
between the client and this server. However, this approach does not
address the security risks posed by potential malicious server
operators or any third parties who may penetrate the server sites.
The WebALPS (Web Applications with Lots of Privacy and Security)
approach is proposed to address these weaknesses by moving sensitive
computations at server side into trusted co-servers running inside
high-assurance secure coprocessors.
In this report, we examine the foundations of the credibility of
WebALPS co-servers. Then we will describe our work of designing and
building a prototype WebALPS co-server, which is integrated into the
widely-deployed, commercial-grade Apache server. We will also present
the performance test results of our system which support the argument
that WebALPS approach provides a systematic and practical way to address the
remote trust issue.
%Z
Master's thesis.
%T An Armored Data Vault
%A Alex Iliev
%R Technical Report TR2001-400
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-400.pdf
%X
We consider the problem of secure long-term archiving of
network traffic, an instance of the problem of storing data
securely. We approach the problem using secure hardware, which enables
the enforcement of flexible access policy. The policy cannot be
circumvented by anyone, even insiders, and so we are assured that
access to the data is as originally intended. The policy can be
expressed as any feasible computation, as it will be checked inside
the secure hardware without possibility of interference. We discuss
our design of a device to perform such network data archiving and have
implemented a prototpe device. We discuss other possible application
areas of the design.
%Z
Senior Honors Thesis.
Advisor: Sean Smith.
%T Outbound Authentication for Programmable Secure Coprocessors
%A Sean W. Smith
%R Technical Report TR2001-401
%I Dartmouth College, Computer Science
%C Hanover, NH
%D March 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-401.ps.Z
%X
A programmable secure coprocessor platform can help solve many
security problems in distributed computing. These solutions usually
require that coprocessor applications be able to participate as
full-fledged parties in distributed cryptographic protocols. Thus, to
fully enable these solutions, a generic platform must not only provide
programmability, maintenance, and configuration in the hostile
field---it must also provide outbound authentication for the
entities that result. A particular application on a particular
untampered device must be able to prove who it is to a party on the
other side of the Internet.
To be effective, a secure outbound authentication service must closely
mesh with the overall security architecture. Our initial architecture
only sketched a rough design for this service, and did not complete
it. This paper presents our research and development experience in
refining and implementing this design, to provide PKI-based outbound
authentication for the IBM 4758 Model 2 secure coprocessor platform.
%T Optimizing the Dimensional Method for Performing Multidimensional, Multiprocessor, Out-of-Core FFTs
%A Jeremy T. Fineman
%R Technical Report TR2001-402
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-402.ps.Z
%X
We present an improved version of the Dimensional Method for computing
multidimensional Fast Fourier Transforms (FFTs) on a multiprocessor
system when the data consist of too many records to fit into
memory. Data are spread across parallel disks and processed in
sections. We use the Parallel Disk Model for analysis.
The simple Dimensional Method performs the 1-dimensional FFTs for
each dimension in term. Between each dimension, an out-of-core
permutation is used to rearrange the data to contiguous locations.
The improved Dimensional Method processes multiple dimensions at a
time.
We show that determining an optimal sequence and groupings of
dimensions is NP-complete. We then analyze the effects of two
modifications to the Dimensional Method independently: processing
multiple dimensions at one time, and processing single dimensions in a
different order.
Finally, we show a lower bound on the I/O complexity of the
Dimensional Method and present an algorithm that is approximately
asymptotically optimal.
%Z
Senior Honors Thesis. Advisor: Tom Cormen.
%T EcomRISK.org : A site to classify and organize the risks of performing business on the Internet
%A Aidan S. Marcuss
%R Technical Report TR2001-403
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-403.ps.Z
%X
As the use of the Internet and other computer networks to transact
business grows, there is an ever increasing need for those taking
part in those transactions to understand the risks of doing so. While
there are many web sites that have created valuable databases of
specific vulnerabilities for certain types of hardware and software,
there is a lack of focus on attempting to analyze the interaction of
businesses, their systems, computer networks, and their customers and
the risks that are created by either intended or unattended
interactions. EcomRISK.org is a web site that presents a clear
taxonomy to classify these risks and provides other features to aid
in the general discussion of e-commerce risk. The site, and the
taxonomy at the center of it, creates a database of these incidents
so they can be clearly searched. This paper discusses the creation of
EcomRISK.org, from vision to birth.
%Z
Senior Honors Thesis. Advisor: Fillia Makedon.
%T Efficient Compression of Generic Function Dispatch Tables
%A Eric Kidd
%R Technical Report TR2001-404
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-404.ps.Z
%X
A generic function is similar to an overloaded operator, but provides a way
to select an appropriate behavior at run-time instead of compile-time.
Dujardin and colleagues have proposed an algorithm for building and
compressing generic function dispatch tables.
We present several modifications to their algorithm, including an
improvement to Pseudo-Closest-Poles and two new algorithms for compressing
pole tables. The two new compression algorithms are simple and fast, and
one produces smaller output than the original.
%Z
Senior Honors Thesis. Advisor: Chris Hawblitzel.
%T DaSSFNet: An Extension to DaSSF for High-Performance Network Modeling
%A Mehmet Iyigun
%R Technical Report TR2001-405
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-405.ps.Z
%X
Scalable Simulation Framework (SSF) is a discrete-event simulation
framework providing a unified programming interface geared towards
network simulation. Dartmouth SSF (DaSSF) is a C++ implementation of
SSF, designed for simulating very large-scale multi-protocol
communication networks. As of the latest release, DaSSF lacks many
features present in SSF and this prevents it from achieving
mainstream use. To alleviate this shortcoming we designed and
implemented DaSSFNet which extends DaSSF to the levels of
functionality found in SSF. In this paper, we show that DaSSFNet and
SSFNet are identical in operation given the same input. We also show
that DaSSFNet is about twice as fast and has one third the memory
consumption of SSFNet when simulating identical networks. Therefore,
we argue, that the DaSSF simulation package with DaSSFNet now offers
a viable alternative to SSF in high-performance network simulation.
%Z
Senior Honors Thesis. Advisor: David M. Nicol
%T Fastab: Solving the Pitch to Notation Problem
%A Jeremy I. Robin
%R Technical Report TR2001-406
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-406.ps.Z
%X
I have always been frustrated with the length of time necessary to
notate a piece of music. Computers have simplified so many other
aspects of our lives, it seems that they should be able to simplify
this task as well. In fact, there are already two distinct ways that
engineers have attempted to attack this problem. The first analyzes
the waveform generated by microphone input and relies on Fourier
Analysis and other similar methods. The other examines the analog
signal generated by a electric guitar-like pickup placed beneath the
strings. The method used by Fastab relies much less on the musical
properties of an instrument. Instead, Fastab records where and when
the fingers and pick contact the instrument using digital electronics
and microprocessor technology. Fastab provides a solution to the
pitch to notation problem which is cheaper and more accurate than any
other device available today.
%Z
Senior Honors Thesis. Advisor: Scot Drysdale
%T TCP/IP Implementation within the Dartmouth Scalable Simulation Framework
%A Michael G. Khankin
%R Technical Report TR2001-407
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-407.ps.Z
%X
This paper discusses TCP/IP networking, and in particular, the DaSSF
implementation of TCP/IP. The paper reviews the protocols, outlines the
implementation design, and demonstrates some tests. In addition, some
performance and memory usage analysis is performed.
We find DaSSF TCP/IP to be a viable option to the existing SSF. DaSSF
TCP/IP is faster and uses less memory so we can simulate larger, more
complex, models.
%Z
Senior Honors Thesis. Advisor: David M. Nicol
%T Market-based Control of Mobile-agent Systems
%A Jonathan L. Bredin
%R Technical Report TR2001-408
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-408.ps.Z
%X
Modern distributed systems scatter sensors, storage, and computation
throughout the environment.
Ideally these devices communicate and share resources, but there
is seldom motivation for a device's owner to yield control to another user.
We establish markets for computational resources
to motivate principals to share resources with arbitrary users, to enforce
priority in distributed systems, to provide flexible and rational limitations
on the potential of an application, and to provide a lightweight structure
to balance the workload over time and between devices.
As proof of concept, we implement a structure software agents
can use to discover and negotiate access to networked resources.
The structure separates discovery, authentication, and consumption
enforcement as separate orthogonal issues to give system
designers flexibility.
Mobile agents represent informational and computational flow.
We develop mechanisms that distributively allocate computation among
mobile agents in two settings.
The first models a situation where users collectively own networked
computing resources and require priority enforcement.
We extend the allocation mechanism to allow
resource reservation to mitigate utility volatility.
The second, more general model relaxes the ownership assumption.
We apply our computational market to an open setting where a principal's
chief concern is revenue maximization.
Our simulations compare the performance of market-based allocation
policies to traditional policies and relate the cost of ownership and
consumption separation.
We observe that our markets effectively prioritize applications'
performance, can operate under uncertainty and network delay, provide
metrics to balance network load, and allow measurement of
market-participation risk versus reservation-based computation.
In addition to allocation problems, we investigate resource selection
to optimize execution time.
The problem is NP-complete if the costs and latencies are constant.
Both metrics' dependence on the chosen set complicates matters.
We study how a greedy approach, a novel heuristic, and a
shortest-constrained-path strategy perform in mobile-agent applications.
Market-based computational-resource allocation fertilizes
applications where previously there was a dearth of motive for or means
of cooperation.
The rationale behind mobile-agent performance optimization is also useful
for resource allocation in general distributed systems where an application
has a sequence of dependent tasks or when data collection is expensive.
%Z
A Ph.D thesis from the Department of Computer Science, Dartmouth College.
%T Web Spoofing 2001
%A Yougu Yuan
%A Eileen Zishuang Ye
%A Sean W. Smith
%R Technical Report TR2001-409
%I Dartmouth College, Computer Science
%C Hanover, NH
%D July 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-409.ps.Z
%X
The Web is currently the pre-eminent medium for electronic
service delivery to remote users. As a consequence,
authentication of servers is more important than ever. Even
sophisticated users base their decision whether or not to trust a
site on browser cues---such as location bar information, SSL
icons, SSL warnings, certificate information, response time, etc.
In their seminal work on web spoofing, Felten et al showed how a
malicious server could forge some of these cues---but using
approaches that are no longer reproducible. However, subsequent
evolution of Web tools has not only patched security holes---it
has also added new technology to make pages more interactive and
vivid. In this paper, we explore the feasibility of web spoofing
using this new technology---and we show how, in many cases, every
one of the above cues can be forged.
In particular, we show how a malicious server can forge all the
SSL information a client sees---thus providing a cautionary tale
about the security of one of the most common applications of PKI.
We stress that these techniques have been implemented, and are
available for public demonstration.
%T Securing Web Servers against Insider Attack
%A Shan Jiang
%A Sean Smith
%A Kazuhiro Minami
%R Technical Report TR2001-410
%I Dartmouth College, Computer Science
%C Hanover, NH
%D July 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-410.ps.Z
%X
Too often, ``security of Web transactions'' reduces to ``encryption of
the channel''---and neglects to address what happens at the server on
the other end. This oversight forces clients to trust the good
intentions and competence of the server operator---but gives clients
no basis for that trust. Furthermore, despite academic and industrial
research in secure coprocessing, many in the computer science
community still regard ``secure hardware'' as a synonym for
``cryptographic accelerator.' This oversight neglects the real
potential of COTS secure coprocessing technology to establish trusted
islands of computation in hostile environments---such as at web
servers with risk of insider attack.
In this paper, we apply secure coprocessing and cryptography to solve
this real problem in Web technology. We present a vision: using
secure coprocessors to establish trusted co-servers at Web servers and
moving sensitive computations inside these co-servers. We present a
prototype implementation of this vision that scales to realistic
workloads. Finally, we validate this approach by building a simple
E-voting application on top of our prototype.
From our experience, we conclude that this approach provides a
practical and effective way to enhance the security of Web servers
against insider attack.
%T Write Once, Move Anywhere: Toward Dynamic Interoperability of Mobile Agent Systems
%A Arne Grimstrup
%A Robert S. Gray
%A David Kotz
%A Thomas Cowin
%A Greg Hill
%A Niranjan Suri
%A Daria Chacon
%A Martin Hofmann
%R Technical Report TR2001-411
%I Dartmouth College, Computer Science
%C Hanover, NH
%D July 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-411.ps.Z
%X
Mobile agents are an increasingly popular paradigm,
and in recent years there has been a proliferation
of mobile-agent systems. These systems are, however,
largely incompatible with each other. In particular,
agents cannot migrate to a host that runs a
different mobile-agent system. Prior approaches to
interoperability have tried to force agents to use a
common API, and so far none have succeeded. Our
goal, summarized in the catch phrase ``Write
Once, Move Anywhere,'' led to our efforts to develop
mechanisms that support dynamic runtime
interoperability of mobile-agent systems. This paper
describes the Grid Mobile-Agent System,
which allows agents to migrate to different
mobile-agent systems.
%Z
Revised July 25, 2001.
%T Detecting Steganographic Messages in Digital Images
%A Hany Farid
%R Technical Report TR2001-412
%I Dartmouth College, Computer Science
%C Hanover, NH
%D September 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-412.ps.Z
%X
Techniques and applications for information hiding have
become increasingly more sophisticated and widespread. With
high-resolution digital images as carriers, detecting the presence of
hidden messages has also become considerably more difficult. It is
sometimes possible, nevertheless, to detect (but not necessarily
decipher) the presence of embedded messages. The basic approach taken
here works by finding predictable higher-order statistics of
``natural'' images within a multi-scale decomposition, and then
showing that embedded messages alter these statistics.
%T Differential Elastic Image Registration
%A Senthil Periaswamy
%A Hany Farid
%R Technical Report TR2001-413
%I Dartmouth College, Computer Science
%C Hanover, NH
%D September 2001
%U http://www.cs.dartmouth.edu/reports/TR2001-413.ps.Z
%X
We have applied techniques from differential motion
estimation to the problem of automatic elastic registration of medical
images. This method models the mapping between images as a locally
affine but globally smooth warp. The mapping also explicitly accounts
for variations in image intensities. This approach is simple and
highly effective across a broad range of medical images. We show the
efficacy of this approach on several synthetic and clinical images.
%T Decentralized Control for Coordinated flow of Multi-Agent Systems
%A Valentino Crespi
%A George Cybenko
%A Massimo Santini
%A Daniela Rus
%R Technical Report TR2002-414
%I Dartmouth College, Computer Science
%C Hanover, NH
%D January 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-414.ps.Z
%X
This paper describes a distributed algorithm for coordinating the flow
of a mass of vehicles approaching a highway exit or a tollbooth. We
provide the problem formulation, a general methodology for distributed
control and an instantiation of this methodology to the coordinated
flow problem. We analyze our algorithm and provide experimental data.
%T Future Directions for Mobile-Agent Research
%A David Kotz
%A Robert S. Gray
%A Daniela Rus
%R Technical Report TR2002-415
%I Dartmouth College, Computer Science
%C Hanover, NH
%D January 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-415.ps.Z
%X
During a discussion in September 2000 the authors
examined the future of research on mobile agents and
mobile code. (A mobile agent is a running program
that can move from host to host in network at times
and to places of its own choosing.) In this paper we
summarize and reflect on that discussion. It became
clear that the field should shift its emphasis
toward mobile code, in all its forms, rather than to
continue its narrow focus on mobile
agents. Furthermore, we encourage the development of
modular components, so that application designers
may take advantage of code mobility without needing
to rewrite their application to fit in a monolithic
mobile-agent system. There are many potential
applications that may productively use mobile code,
but there is no ``killer application'' for mobile
agents. Finally, we note that although security is
an important and challenging problem, there are many
applications and environments with security
requirements well within the capability of existing
mobile-code and mobile-agent frameworks.
%T Virtual Hierarchies - An Architecture for Building and Maintaining Efficient and Resilient Trust Chains.
%A John C. Marchesini
%A Sean W. Smith
%R Technical Report TR2002-416
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-416.ps.Z
%X
In Public Key Infrastructure (PKI), the simple, monopolistic CA model
works fine until we consider the real world. Then, issues such as
scalability and mutually suspicious organizations create the need for
a multiplicity of CAs, which immediately introduces the problem of how
to organize them to balance resilience to compromise against
efficiency of path discovery.
However, security has given us tools such as secure coprocessing,
secret splitting, secret sharing, and threshold cryptography for
securely carrying out computations among multiple trust domains;
distributed computing has given us peer-to-peer networking, for
creating self-organizing distributed systems.
In this paper, we use these latter tools to address the former
problem by overlaying a virtual hierarchy on a mesh architecture of
peer CAs, and achieving both resilience and efficiency.
%T Web Spoofing Revisited: SSL and Beyond
%A Eileen Ye
%A Yougu Yuan
%A Sean Smith
%R Technical Report TR2002-417
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-417.ps.Z
%X
Can users believe what their browsers tell them? Even sophisticated
Web users decide whether or not to trust a server based on browser
cues such as location bar information, SSL icons, SSL warnings,
certificate information, and response time. In their seminal work on
Web spoofing, Felten et al showed how, in 1996, a
malicious server could forge some of these cues. However, this work
used genuine SSL sessions, and Web technology has evolved much since
1996.
The Web has since become the pre-eminent medium for electronic service
delivery to remote users, and the security of many commerce,
government, and academic network applications critically rests on the
assumption that users can authenticate the servers with which they
interact. This situation raises the question: is the browser-user
communication model today secure enough to warrant this assumption?
In this paper, we answer this question by systematically showing how a
malicious server can forge every one of the above cues. Our work
extends the prior results by examining contemporary browsers, and by
forging all of the SSL information a client sees, including the very
existence of an SSL session (thus providing a cautionary tale about
the security of one of the most common applications of PKI). We have
made these techniques available for public demonstration, because
anything less than working code would not convincingly answer the
question. We also discuss implications and potential countermeasures,
both short-term and long-term.
%Z
This TR supercedes TR2001-409.
%T Trusted Paths for Browsers: An Open-Source Solution to Web Spoofing
%A Eileen Ye
%A Sean Smith
%R Technical Report TR2002-418
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-418.ps.Z
%X
The security of the vast majority of ``secure'' Web services rests on
SSL server PKI. However, this PKI doesn't work if the the adversary
can trick the browser into appearing to tell the user the wrong thing
about the certificates and cryptography. The seminal web spoofing
work of Felten et al demonstrated the potential, in 1996, for
malicious servers to impersonate honest servers. Our recent follow-up
work explicitly shows how malicious servers can still do this---and
can also forge the existence of an SSL session and the contents of the
alleged server certificate.
This paper reports the results of our work to systematically defend
against Web spoofing, by creating a trusted path from the browser to
the user. Starting with the Mozilla source, we have implemented
techniques that protect a wide variety of browser-user communications,
that require little participation by the user and minimal disruption
of the displayed server content. We have prepared shell scripts that
install these modifications on the Mozilla source, to enable others to
replicate this work.
In on-going work, we are cleaning up and fine-tuning our code. In
future work, we hope to examine more deeply the role of user
interfaces in enabling users to make effective trust judgments.
%T FFTs for the 2-Sphere - Improvements and Variations
%A Dennis M. Healy
%A Daniel N. Rockmore
%A Peter J. Kostelec
%A Sean S. B. Moore
%R Technical Report TR2002-419
%I Dartmouth College, Computer Science
%C Hanover, NH
%D March 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-419.ps.Z
%X
Earlier work by Driscoll and Healy has produced an efficient
algorithm for computing the Fourier transform of band-limited
functions on the 2-sphere. In this paper we present a reformulation
and variation of the original algorithm which results in a greatly
improved inverse transform, and consequent improved convolution
algorithm for such functions. All require at most $O(N\log^2 N)$
operations where $N$ is the number of sample points. We also address
implementation considerations and give heuristics for allowing
reliable and computationally efficient floating point implementations
of slightly modified algorithms.
These claims are supported by extensive numerical experiments from
our implementation in C on DEC, HP, SGI and Linux Pentium
platforms. These results indicate that variations of the algorithm
are both reliable and efficient for a large range of useful problem
sizes. Performance appears to be architecture-dependent. The paper
concludes with a brief discussion of a few potential applications.
%Z
Preliminary versions of some of these results have appeared in the
Dartmouth College Department of Computer Science Technical Report
PCS-TR94-222 and ``An FFT for the 2-sphere and Applications'',
Proc. of ICASSP-96, Volume 3, pp. 1323--1326.
%T Context Aggregation and Dissemination in Ubiquitous Computing Systems
%A Guanling Chen
%A David Kotz
%R Technical Report TR2002-420
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-420.ps.Z
%X
Many ``ubiquitous computing'' applications need a constant flow of
information about their environment to be able to adapt to their
changing context. To support these ``context-aware'' applications we
propose a graph-based abstraction for collecting, aggregating, and
disseminating context information. The abstraction models context
information as events, produced by sources and flowing through a
directed acyclic graph of event-processing operators and delivered to
subscribing applications. Applications describe their desired event
stream as a tree of operators that aggregate low-level context
information published by existing sources into the high-level context
information needed by the application. The operator graph is thus the
dynamic combination of all applications' subscription trees.
In this paper, we motivate and describe our graph abstraction, and
discuss a variety of critical design issues. We also sketch our Solar
system, an implementation that represents one point in the design
space for our graph abstraction.
%Z
To appear in WMCSA 2002.
%T Solar: A pervasive-computing infrastructure for context-aware mobile applications
%A Guanling Chen
%A David Kotz
%R Technical Report TR2002-421
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-421.ps.Z
%X
Emerging pervasive computing technologies transform the way we live
and work by embedding computation in our surrounding environment. To
avoid increasing complexity, and allow the user to concentrate on her
tasks, applications must automatically adapt to their changing
\emph{context}, the physical and computational environment in which
they run. To support these ``context-aware'' applications we propose
a graph-based abstraction for collecting, aggregating, and
disseminating context information. The abstraction models context
information as \emph{events}, which are produced by \emph{sources},
flow through a directed acyclic graph of event-processing
\emph{operators}, and are delivered to subscribing applications.
Applications describe their desired event stream as a tree of
operators that aggregate low-level context information published by
existing sources into the high-level context information needed by the
application. The \emph{operator graph\/} is thus the dynamic
combination of all applications' subscription trees. In this paper,
we motivate our graph abstraction by discussing several applications
under development, sketch the architecture of our system (``Solar'')
that implements our abstraction, report some early experimental
results from the prototype, and outline issues for future research.
%T Controlling access to pervasive information in the ``Solar'' system
%A Kazuhiro Minami
%A David Kotz
%R Technical Report TR2002-422
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-422.ps.Z
%X
Pervasive-computing infrastructures necessarily collect a lot of
context information to disseminate to their context-aware
applications. Due to the personal or proprietary nature of much of
this context information, however, the infrastructure must limit
access to context information to authorized persons. In this paper we
propose a new access-control mechanism for event-based
context-distribution infrastructures. The core of our approach is
based on a conservative information-flow model of access control, but
users may express discretionary relaxation of the resulting
access-control list (ACL) by specifying relaxation functions.
This combination of automatic ACL derivation and user-specified ACL
relaxation allows access control to be determined and enforced in a
decentralized, distributed system with no central administrator or
central policy maker. It also allows users to express their personal
balance between functionality and privacy. Finally, our
infrastructure allows access-control policies to depend on
context-sensitive roles, allowing great flexibility.
We describe our approach in terms of a specific context-dissemination
framework, the Solar system, although the same principles would apply
to systems with similar properties.
%T Characterizing Usage of a Campus-wide Wireless Network
%A David Kotz
%A Kobby Essien
%R Technical Report TR2002-423
%I Dartmouth College, Computer Science
%C Hanover, NH
%D March 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-423.ps.Z
%X
Wireless local-area networks (WLANs) are increasingly common, but
little is known about how they are used. A clear understanding of
usage patterns in real WLANs is critical information to those who
develop, deploy, and manage WLAN technology, as well as those who
develop systems and application software for wireless networks. This
paper presents results from the largest and most comprehensive trace
of network activity in a large, production wireless LAN. For eleven
weeks we traced the activity of nearly two thousand users drawn from a
general campus population, using a campus-wide network of 476 access
points spread over 161 buildings. Our study expands on those done by
Tang and Baker, with a significantly larger and broader population.
We found that residential traffic dominated all other traffic,
particularly in residences populated by newer students; students are
increasingly choosing a wireless laptop as their primary computer.
Although web protocols were the single largest component of traffic
volume, network backup and file sharing contributed an unexpectedly
large amount to the traffic. Although there was some roaming within a
network session, we were surprised by the number of situations in
which cards roamed excessively, unable to settle on one access point.
Cross-subnet roams were an especial problem, because they broke IP
connections, indicating the need for solutions that avoid or
accommodate such roams.
%Z
A major revision of this paper appeared in Mobicom 2002.
The Mobicom paper contained two erroneous figures, however;
another report, TR2002-432, is the final corrected version.
%T Metasearch: Data Fusion for Document Retrieval
%A Mark H. Montague
%R Technical Report TR2002-424
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-424.ps.Z
%X
The metasearch problem is to optimally merge the ranked lists
output by an arbitrary number of search systems into one ranked list.
In this work:
(1) We show that metasearch improves upon not just the raw performance
of the input search engines, but also upon the consistency of the
input search engines from query to query.
(2) We experimentally prove that simply weighting input systems by
their average performance can dramatically improve fusion results.
(3) We show that score normalization is an important component of a
metasearch engine, and that dependence upon statistical outliers
appears to be the problem with the standard technique.
(4) We propose a Bayesian model for metasearch that outperforms the
best input system on average and has performance competetive with
standard techniques.
(5) We introduce the use of Social Choice Theory to the metasearch
problem, modeling metasearch as a democratic election. We adapt a
positional voting algorithm, the Borda Count, to create a metasearch
algorithm, acheiving reasonable performance.
(6) We propose a metasearch model adapted from a majoritarian voting
procedure, the Condorcet algorithm. The resulting algorithm is the
best performing algorithm in a number of situations.
(7) We propose three upper bounds for the problem, each bounding a
different class of algorithms.
We present experimental results for each algorithm using two types of
experiments on each of four data sets.
%Z
Ph.D dissertation.
%T The Future of Cryptography Under Quantum Computers
%A Marco A. Barreno
%R Technical Report TR2002-425
%I Dartmouth College, Computer Science
%C Hanover, NH
%D July 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-425.ps.Z
%X
Cryptography is an ancient art that has passed through many
paradigms, from simple letter substitutions to polyalphabetic
substitutions to rotor machines to digital encryption to public-key
cryptosystems. With the possible advent of quantum computers and the
strange behaviors they exhibit, a new paradigm shift in cryptography
may be on the horizon. Quantum computers could hold the potential to
render most modern encryption useless against a quantum-enabled
adversary. The aim of this thesis is to characterize this convergence
of cryptography and quantum computation.
We provide definitions for cryptographic primitives that frame them
in general terms with respect to complexity. We explore the various
possible relationships between BQP, the primary quantum complexity
class, and more familiar classes, and we analyze the possible
implications for cryptography.
%Z
This paper was written as a senior honors thesis with advisor Sean
W. Smith.
%T Role Definition Language (RDL): A Language to Describe Context-Aware Roles
%A Christopher P. Masone
%R Technical Report TR2002-426
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-426.ps.Z
%X
As wireless networks become more prevalent, a widening array
of computational resources becomes available to the mobile user.
Since not all users should have unrestricted access to these
resources, a method of access control must be devised. In a
context-aware environment, context information can be used to
supplement more conventional password-based access control systems.
We believe the best way to achieve this is through the use of
Context-Aware Role-Based Access Control, a model in which permissions
are assigned to entities called roles, each principal is a member of
one or more roles, and a role's membership is determined using
context information. We designed and implemented RDL
(Role-Definition Language), a simple, expressive and somewhat
extensible programming language to facilitate the description of
roles in terms of context information.
%Z
Senior Honors Thesis. Advisor: David Kotz.
%T Performance and Interoperability In Solar
%A A. Abram White
%R Technical Report TR2002-427
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-427.ps.Z
%X
Ubiquitous computing promises to integrate computers into our physical
environment, surrounding us with applications that are able to adapt
to our dynamics. Solar is a software infrastructure designed to
deliver contextual information to these applications. To serve the
large number and wide variety of context-aware devices envisioned by
ubiquitous computing, Solar must exhibit both high performance and the
ability to interoperate with many computing platforms. We created a
testing framework to measure the performance of distributed systems
such as Solar, as well as a pluggable data-transfer mechanism to
support the dissemination of information to heterogeneous
applications. This paper explores the testing framework developed,
analyzes its findings concerning the performance of the current Solar
prototype, presents several optimizations to Solar and their effects,
and finally discusses the design of the pluggable data-transfer
mechanism.
%Z
Senior Honors Thesis. Advisor: David Kotz.
See also TR2002-429.
%T Information-theoretic Bounds on the Training and Testing Error of Boosting
%A Sebastien M. Lahaie
%R Technical Report TR2002-428
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-428.ps.Z
%X
Boosting is a means of using weak learners as subroutines to produce a
strong learner with markedly better accuracy. Recent results showing
the connection between logistic regression and boosting provide the
foundation for an information-theoretic analysis of boosting. We
describe the analogy between boosting and gambling, which allows us to
derive a new upper bound on training error. This upper bound
explicitly describes the effect of noisy data on training error. We
also use information-theoretic techniques to derive an alternative upper-bound
on testing error which is independent of the size of the weak-learner
space.
%T XSLT and XQuery as Operator Languages
%A A. Abram White
%R Technical Report TR2002-429
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-429.ps.Z
%X
Ubiquitous computing promises to integrate computers into our physical
environment, surrounding us with applications that are able to adapt
to our dynamics. Solar is a software infrastructure designed to
deliver contextual information to these applications. Solar
represents context data as events, and uses small programs called
operators to filter, merge, aggregate, or transform event streams.
This paper explores the possibility of using XSLT and XQuery to build
language-neutral Solar operators.
%Z
See also TR2002-427.
%T Building Trusted Paths for Web Browsers
%A Eileen Zishuang Ye
%R Technical Report TR2002-430
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-430.ps.Z
%X
The communication between the Web browser and the human
user is one component of the server-client channel. It is not
the user but the browser that receives all server
information and establishes the secure connection. The
browser's user interface signals, such as SSL lock, https
protocol header et al., indicate whether the browser-server
communication at the current moment is secure. Those
user interface signals indicating the security status of
browser should be clearly and correctly understood by the
user.
A survey of modern Web browsers shows the information
provided by current browsers is insufficient for users to
make trust judgment. Our Web spoofing work further proved
that the browser status information is not reliable either.
We discuss the criteria for and how to build the
trusted paths between a browser and a human user. We present
an open source implementation of one of the
designs--synchronized random dynamic (SRD) boundary, based
on Modified Mozilla source code, together with its usability
study results.
%T Analysis of Protein Sequences Using Time Frequency and Kolmogorov-Smirnov Methods
%A Kobby Essien
%R Technical Report TR2002-431
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-431.ps.Z
%X
The plethora of genomic data currently available has resulted in a
search for new algorithms and analysis techniques to interpret
genomic data. In this two-fold study we explore techniques for
locating critical amino acid residues in protein sequences and for
estimating the similarity between proteins. We demonstrate the use
of the Short-Time Fourier Transform and the Continuous Wavelet
Transform together with amino acid hydrophobicity in locating
important amino acid domains in proteins and also show that the
Kolmogorov-Smirnov statistic can be used as a metric of protein
similarity.
%Z
Senior Honors Thesis. Advisor: Metin Akay.
%T Analysis of a Campus-wide Wireless Network
%A David Kotz
%A Kobby Essien
%R Technical Report TR2002-432
%I Dartmouth College, Computer Science
%C Hanover, NH
%D September 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-432.ps.Z
%X
Understanding usage patterns in wireless local-area networks (WLANs)
is critical for those who develop, deploy, and manage WLAN technology,
as well as those who develop systems and application software for
wireless networks. This paper presents results from the largest and
most comprehensive trace of network activity in a large, production
wireless LAN. For eleven weeks we traced the activity of nearly two
thousand users drawn from a general campus population, using a
campus-wide network of 476 access points spread over 161 buildings.
Our study expands on those done by Tang and Baker, with a
significantly larger and broader population.
We found that residential traffic dominated all other traffic,
particularly in residences populated by newer students; students are
increasingly choosing a wireless laptop as their primary computer.
Although web protocols were the single largest component of traffic
volume, network backup and file sharing contributed an unexpectedly
large amount to the traffic. Although there was some roaming within a
network session, we were surprised by the number of situations in
which cards roamed excessively, unable to settle on one access point.
Cross-subnet roams were an especial problem, because they broke IP
connections, indicating the need for solutions that avoid or
accommodate such roams.
%Z
This paper is a revision of the MOBICOM '02 paper by the same
title. The only difference is the correction of Figures 27-28
and the associated text. This report supplants TR2002-423.
%T Using the Emulab network testbed to evaluate the Armada I/O framework for computational grids
%A Ron Oldfield
%A David Kotz
%R Technical Report TR2002-433
%I Dartmouth College, Computer Science
%C Hanover, NH
%D September 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-433.ps.Z
%X
This short report describes our experiences using the Emulab network testbed
at the University of Utah to test performance of the Armada framework for
parallel I/O on computational grids.
%T Probabilistic Disease Classification of Expression-Dependent Proteomic Data from Mass Spectrometry of Human Serum
%A Ryan H. Lilien
%A Hany Farid
%A Bruce R. Donald
%R Technical Report TR2002-434
%I Dartmouth College, Computer Science
%C Hanover, NH
%D October 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-434.ps.Z
%X
We have developed an algorithm called Q5 for probabilistic
classification of healthy vs. disease whole serum samples using mass
spectrometry. The algorithm employs Principal Components Analysis (PCA)
followed by Linear Discriminant Analysis (LDA) on whole spectrum
Surface-Enhanced Laser Desorption/Ionization Time of Flight
(SELDI-TOF) Mass Spectrometry (MS) data, and is demonstrated on four
real datasets from complete, complex SELDI spectra of human blood
serum.
Q5 is a closed-form, exact solution to the problem of classification
of complete mass spectra of a complex protein mixture. Q5 employs a
novel probabilistic classification algorithm built upon a
dimension-reduced linear discriminant analysis. Our solution is
computationally efficient; it is non-iterative and computes the
optimal linear discriminant using closed-form equations. The optimal
discriminant is computed and verified for datasets of complete,
complex SELDI spectra of human blood serum. Replicate experiments of
different training/testing splits of each dataset are employed to
verify robustness of the algorithm. The probabilistic classification
method achieves excellent performance. We achieve sensitivity,
specificity, and positive predictive values above 97% on three
ovarian cancer datasets and one prostate cancer dataset. The Q5 method
outperforms previous full-spectrum complex sample spectral
classification techniques, and can provide clues as to the molecular
identities of differentially-expressed proteins and peptides.
%Z
To appear in Journal of Computational Biology (2003).
%T Distributed Algorithms for Guiding Navigation across a Sensor Network
%A Qun Li
%A Michael De Rosa
%A Daniela Rus
%R Technical Report TR2002-435
%I Dartmouth College, Computer Science
%C Hanover, NH
%D October 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-435.ps.Z
%X
We develop distributed algorithms for self-reconfiguring sensor
networks that respond to directing a target through a region. The
sensor network models the danger levels sensed across its area and has
the ability to adapt to changes. It represents the dangerous areas as
obstacles. A protocol that combines the artificial potential field of
the sensors with the goal location for the moving object guides the
object incrementally across the network to the goal, while maintaining
the safest distance to the danger areas. We report on hardware
experiments using a physical sensor network consisting of Mote
sensors.
%T Heterogeneous Self-Reconfiguring Robotics
%A Robert C. Fitch
%R Technical Report TR2002-436
%I Dartmouth College, Computer Science
%C Hanover, NH
%D November 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-436.ps.Z
%X
Self-reconfiguring robots are modular systems that can change shape,
or "reconfigure," to match structure to task. They comprise many
small, discrete, often identical modules that connect together and
that are minimally actuated. Global shape transformation is achieved
by composing local motions. Systems with a single module type, known
as "homogeneous" systems, gain fault tolerance, robustness and low
production cost from module interchangeability. However, we are
interested in "heterogeneous" systems, which include multiple types of
modules such as those with sensors, batteries or wheels. We believe
that heterogeneous systems offer the same benefits as homogeneous
systems with the added ability to match not only structure to task,
but also capability to task.
Although significant results have been achieved in understanding
homogeneous systems, research in heterogeneous systems is challenging
as key algorithmic issues remain unexplored. We propose in this thesis
to investigate questions in four main areas: 1) how to classify
heterogeneous systems, 2) how to develop efficient heterogeneous
reconfiguration algorithms with desired characteristics, 3) how to
characterize the complexity of key algorithmic problems, and 4) how to
apply these heterogeneous algorithms to perform useful new tasks in
simulation and in the physical world. Our goal is to develop an
algorithmic basis for heterogeneous systems. This has theoretical
significance in that it addresses a major open problem in the field,
and practical significance in providing self-reconfiguring robots with
increased capabilities.
%T Proofs of Soundness and Strong Normalization for Linear Memory Types
%A Heng Huang
%A Chris Hawblitzel
%R Technical Report TR2002-437
%I Dartmouth College, Computer Science
%C Hanover, NH
%D November 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-437.ps.Z
%X
Efficient low-level systems need more control over memory than safe high-level languages usually provide. As a result, run-time systems are typically written in unsafe languages such as C. This report describes an abstract machine designed to give type-safe code more control over memory. It includes complete definitions and proofs.
%T Exact formulae for the Lovasz Theta Function of sparse Circulant Graphs
%A Valentino Crespi
%R Technical Report TR2002-438
%I Dartmouth College, Computer Science
%C Hanover, NH
%D November 2002
%U http://www.cs.dartmouth.edu/reports/TR2002-438.ps.Z
%X
The Lovasz theta function has attracted a lot of attention for its
connection with diverse issues, such as communicating without errors
and computing large cliques in graphs. Indeed this function enjoys the
remarkable property of being computable in polynomial time, despite
being sandwitched between clique and chromatic number, two well known
hard to compute quantities.
In this paper I provide a closed formula for the Lovasz function of a
specific class of sparse circulant graphs thus generalizing Lovasz
results on cycle graphs (circulant graphs of degree 2).
%T 3D-Structural Homology Detection via Unassigned Residual Dipolar Couplings
%A Chris J. Langmead
%A Bruce R. Donald
%R Technical Report TR2003-439
%I Dartmouth College, Computer Science
%C Hanover, NH
%D January 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-439.pdf
%X
Recognition of a protein's fold provides valuable
information about its function. While many sequence-based
homology prediction methods exist, an important challenge remains:
two highly dissimilar sequences can have similar folds --- how can
we detect this rapidly, in the context of structural genomics?
High-throughput NMR experiments, coupled with novel algorithms for
data analysis, can address this challenge. We report an automated
procedure for detecting 3D-structural homologies from sparse,
unassigned protein NMR data.
Our method identifies the 3D-structural models in a protein
structural database whose geometries best fit the unassigned
experimental NMR data. It does not use sequence information and is
thus not limited by sequence homology. The method can also be
used to confirm or refute structural predictions made by other
techniques such as protein threading or sequence homology. The
algorithm runs in O(pnk3) time, where p is the number of
proteins in the database, n is the number of residues in the
target protein, and k is the resolution of a rotation search.
The method requires only uniform 15N-labelling of the protein
and processes unassigned 1H-15N residual dipolar couplings,
which can be acquired in a couple of hours. Our experiments on NMR
data from 5 different proteins demonstrate that the method
identifies closely related protein folds, despite low-sequence
homology between the target protein and the computed
model.
%Z
A revised and expanded version of this TR will appear as a refereed
paper at the
IEEE Computer Society Bioinformatics Conference
,
Stanford, California (August, 2003),
%T Efficient Security for BGP Route Announcements
%A David M. Nicol
%A Sean W. Smith
%A Meiyuan Zhao
%R Technical Report TR2003-440
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-440.R2.ps.Z
%X
The Border Gateway Protocol (BGP) determines how Internet traffic is
routed throughout the entire world; malicious behavior by one or more
BGP speakers could create serious security issues. Since the protocol
depends on a speaker honestly reporting path information sent by
previous speakers and involves a large number of independent speakers,
the Secure BGP (S-BGP) approach uses public-key cryptography to ensure
that a malicious speaker cannot fabricate this information. However,
such public-key cryptography is expensive: S-BGP requires a digital
signature operation on each announcement sent to each peer, and a
linear (in the length of the path) number of verifications on each
receipt. We use simulation of a 110 AS system derived from the
Internet to evaluate the impact that the processing costs of
cryptography have on BGP convergence time. We find that under heavy
load the convergence time using ordinary S-BGP is nearly twice as
large as under BGP. We examine the impact of highly aggressive caching
and pre-computation optimizations for S-BGP, and find that convergence
time is much closer to BGP. However, these optimizations may be
unrealistic, and are certainly expensive of memory. We consequently
use the structure of BGP processing to design optimizations that
reduce cryptographic overhead by amortizing the cost of private-key
signatures over many messages. We call this method
Signature-Amortization (S-A). We find that S-A provides as good or
better convergence times as the highly optimized S-BGP, but without
the cost and complications of caching and pre-computation. It is
possible therefore to minimize the impact route validation has on
convergence, by being careful with signatures, rather than consumptive
of memory.
%Z
Revision 2 released May 9, 2003.
Original revision 1, of February 2003, is available in
pdf or
ps.Z.
%T Flexible and Scalable Public Key Security for SSH
%A Yasir Ali
%A S. W. Smith
%R Technical Report TR2003-441
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-441.pdf
%X
A standard tool for secure remote access, the SSH protocol uses public-key cryptography to establish an encrypted and integrity-protected channel with a remote server. However, widely-deployed implementations of the protocol are vulnerable to man-in-the-middle attacks, where an adversary substitutes her public key for the server's. This danger particularly threatens a traveling user Bob borrowing a client machine.
Imposing a traditional X.509 PKI on all SSH servers and clients is neither flexible nor scalable nor (in the foreseeable future) practical. Requiring extensive work or an SSL server at Bob's site is also not practical for many users.
This paper presents our experiences designing and implementing an alternative scheme that solves the public-key security problem in SSH without requiring such an a priori universal trust structure or extensive sysadmin work--although it does require a modified SSH client. (The code is available for public download.)
%Z
A revised version, published as a conference paper, as follows:
Y. Ali, S.W. Smith.
"Flexible and Scalable Public Key Security for SSH."
Public Key Infrastructure: EuroPKI 2004, pp. 43-56,
LNCS 3093, June 2004. Springer-Verlag. DOI 10.1007/b98201.
http://dx.doi.org/10.1007/b98201
http://www.springerlink.com/content/h8jng6kc1rf3j97g/
%T Privacy-enhanced credential services
%A Alex Iliev
%A Sean Smith
%R Technical Report TR2003-442
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-442.ps.Z
%X
The use of credential directories in PKI and authorization systems such as
Shibboleth introduces a new privacy risk: an insider at the directory can learn
much about otherwise protected interactions by observing who makes queries, and
what they ask for. Recent advances in Practical Private Information Retrieval
provide promising countermeasures. In this paper, we extend this technology to
solve this new privacy problem, and present a design and preliminary prototype
for a LDAP-based credential service that can prevent even an insider from
learning anything more than the fact a query was made. Our preliminary
performance analysis suggests that the complete prototype may be sufficiently
robust for academic enterprise settings.
%Z
Submitted to the 2nd Annual PKI Research Workshop.
%T Keyjacking: Risks of the Current Client-side Infrastructure
%A John C. Marchesini
%A Sean W. Smith
%A Meiyuan Zhao
%R Technical Report TR2003-443
%I Dartmouth College, Computer Science
%C Hanover, NH
%D February 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-443.ps.Z
%X
In theory, PKI can provide a flexible and strong way to
authenticate users in distributed information systems. In practice,
much is being invested in realizing this vision via client-side SSL
and browser-based keystores. Exploring this vision, we demonstrate
that browsers will use personal certificates to authenticate requests
that the person neither knew of nor approved (and which password-based
systems would have defeated), and we demonstrate the easy permeability
of these keystores (including new attacks on medium and high-security
IE/XP keys). We suggest some countermeasures, but also suggest that a
fundamental rethinking of the trust, usage, and storage model might
result in a more effective PKI.
%T Stupid Columnsort Tricks
%A Geeta Chaudhry
%A Thomas H. Cormen
%R Technical Report TR2003-444
%I Dartmouth College, Computer Science
%C Hanover, NH
%D April 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-444.ps.Z
%X
Leighton's columnsort algorithm sorts on an $r \times s$ mesh, subject
to the restrictions that $s$ is a divisor of~$r$ and that $r \geq
2s^2$ (so that the mesh is tall and thin). We show how to mitigate
both of these restrictions. One result is that the requirement that
$s$ is a divisor of~$r$ is unnecessary; columnsort sorts correctly
whether or not $s$ divides~$r$. We present two algorithms that, as
long as $s$ is a perfect square, relax the restriction that $r \geq
2s^2$; both reduce the exponent of~$s$ to~$3/2$. One algorithm
requires $r \geq 4s^{3/2}$ if $s$ divides~$r$ and $r \geq 6s^{3/2}$ if
$s$ does not divide~$r$. The other algorithm requires $r \geq
4^{3/2}$, and it requires $s$ to be a divisor of~$r$. Both algorithms
have applications in increasing the maximum problem size in
out-of-core sorting programs.
%T Relaxing the Problem-Size Bound for Out-of-Core Columnsort
%A Geeta Chaudhry
%A Elizabeth A. Hamon
%A Thomas H. Cormen
%R Technical Report TR2003-445
%I Dartmouth College, Computer Science
%C Hanover, NH
%D April 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-445.ps.Z
%X
Previous implementations of out-of-core columnsort limit the problem
size to $N \leq \sqrt{(M/P)^3 / 2}$, where $N$ is the number of
records to sort, $P$ is the number of processors, and $M$ is the total
number of records that the entire system can hold in its memory (so
that $M/P$ is the number of records that a single processor can hold
in its memory). We implemented two variations to out-of-core
columnsort that relax this restriction. Subblock columnsort is based
on an algorithmic modification of the underlying columnsort algorithm,
and it improves the problem-size bound to $N \leq (M/P)^{5/3} /
4^{2/3}$ but at the cost of additional disk I/O\@. $M$-columnsort
changes the notion of the column size in columnsort, improving the
maximum problem size to $N \leq \sqrt{M^3 / 2}$ but at the cost of
additional computation and communication. Experimental results on a
Beowulf cluster show that both subblock columnsort and $M$-columnsort
run well but that $M$-columnsort is faster. A further advantage of
$M$-columnsort is that it handles a wider range of problem sizes than
subblock columnsort.
%T Efficient and Practical Constructions of LL/SC Variables
%A Prasad Jayanti
%A Srdjan Petrovic
%R Technical Report TR2003-446
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-446.pdf
%X
Over the past decade, LL/SC have emerged as the most suitable
synchronization instructions for the design of lock-free algorithms.
However, current architectures do not support these instructions;
instead, they support either CAS or RLL/RSC (e.g. POWER4, MIPS, SPARC, IA-64). To bridge this gap, this paper presents two efficient wait-free
algorithms for implementing 64-bit LL/SC objects from 64-bit CAS or
RLL/RSC objects.
Our first algorithm is practical: it has a small, constant time complexity (of 4 for LL and 5 for SC) and a space overhead of only 4 words per process. This algorithm uses unbounded sequence numbers.
For theoretical interest, we also present a more complex bounded algorithm
that still guarantees constant time complexity and O(1) space overhead per process.
The LL/SC primitive is free of the well-known ABA problem that afflicts CAS. By efficiently implementing LL/SC words from CAS words,
this work presents an efficient general solution to the ABA problem.
%T Billiards Adviser as a Search in a Continuous Domain with Significant Uncertainty
%A Thomas Mueller
%R Technical Report TR2003-448
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-448.pdf
%X
Typical search algorithms are limited to problems in which there is a certain number of moves for any given state, and the effect of each move is well known. In order to overcome this limitation, we consider the problem of determining the optimal shot given the positions of balls on a billiards table. Our solution includes the image recognition necessary to determine each ball's position, the calculation of the optimal shot, and the presentation of that shot to the player. The focus of the paper is on the second part - determining the angle and force with which the player should attempt to hit the cue ball for each shot in order to sink all of the other balls with the fewest shots. The solution to this problem is unique from other game search algorithms in that it must take into account the infinite number of possible shots given any configuration of balls as well as the fact that the player is not likely to hit the ball exactly how he attempts to do so. We compare the performance of our algorithm with one that ignores the latter fact to show that our modifications do in fact improve performance for a search in a continuous domain with significant uncertainty.
%Z
Advisor: Zack Butler
%T An Active Learning Approach to Efficiently Ranking Retrieval Engines
%A Lisa A. Torrey
%R Technical Report TR2003-449
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-449.ps.Z
%X
Evaluating retrieval systems, such as those submitted to the
annual TREC competition, usually requires a large number of
documents to be read and judged for relevance to query
topics. Test collections are far too big to be exhaustively
judged, so only a subset of documents is selected to form
the judgment ``pool.'' The selection method that TREC uses
produces pools that are still quite large. Research has
indicated that it is possible to rank the retrieval systems
correctly using substantially smaller pools.
This paper introduces an active learning algorithm whose
goal is to reach the correct rankings using the smallest
possible number of relevance judgments. It adds one document
to the pool at a time, always trying to select the document
with the highest information gain. Several variants of this
algorithm are described, each with improvements on the one
before. Results from experiments are included for comparison
with the traditional TREC pooling method. The best version
of the algorithm reliably outperforms the traditional
method, although its degree of improvement varies.
%Z
Senior Honors Thesis. Advisor: Jay Aslam.
%T An Evaluation of the Impact of Models for Radio Propagation on the Simulation of 802.11b Wireless Networks
%A Evan W. Richardson
%R Technical Report TR2003-450
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-450.ps.Z
%X
Working with an existing wireless network simulator, we describe the addition
of both a method for modeling arbitrary terrain, and for calculating signal
attenuation with the Irregular Terrain Model (ITM). We also investigate ITM's effects on upper protocol layer in comparison to the Two-Ray Ground Reflection
model. Upon examination, it was found that aside from the terrain between the
transmitter and receiver, ITM's various parameters are of little significance
in the computed signal attenuation. Further, examination of the behavior of the
upper protocol layers revealed that at high traffic levels, choice of propagation
model can have significant effects on the results of the simulation.
%Z
Senior Honors Thesis. Advisor: Felipe Perrone.
%T 802.11b Wireless Network Visualization and Radiowave Propagation Modeling
%A Chris Lentz
%R Technical Report TR2003-451
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-451.pdf
%X
This paper outlines the methods of creating detailed coverage maps
of 802.11b networks, with an emphasis on minimizing the expenses and
time involved. The goal of this work is to develop and present a
streamlined, reproducible approach to wireless visualization as well
as techniques for predicting coverage area before conducting network
installations.
After evaluating these coverage maps, a repeated series of field
measurements will be checked against interpolated values in order to
improve techniques for extrapolation of data for unsampled regions. If
successful, these extrapolation techniques will provide additional
guidelines for, and assist modeling of, new wireless network
installations. However, this paper demonstrates that due to the
microcellular structure of indoor/outdoor 802.11b networks, accurate
interpolation and propagation prediction techniques do not exist
independent of highly specific location models. In lieu of the
creation of extensive simulation environments, best practice
guidelines for municipal wireless network planning and deployment are
presented.
%Z
Senior thesis. Advisor: David Kotz.
%T An Analysis of Convergence Properties of the Border Gateway Protocol Using Discrete Event Simulation
%A Brian J. Premore
%R Technical Report TR2003-452
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-452.ps.Z
%X
The Internet is an enormous internetwork formed by connecting tens of
thousands of independently managed computer networks. Though the
Internet has no central authority and is highly heterogeneous, a
universally adopted addressing scheme---defined by the Internet Protocol
(IP)---makes interaction between the individual networks possible.
Complementing IP is the Border Gateway Protocol (BGP), which facilitates
communication between parts of the internetwork by determining paths by
which data can get from one network to any other. Just as IP is used
ubiquitously as an addressing scheme, BGP is used ubiquitously for the
purpose of network-to-network routing.
Because BGP is universal, its well-being is the concern of everyone.
In other words, when BGP suffers, everyone suffers. Even when just one
instance of BGP on one router is ill-behaved, it can have global
effects. Unfortunately, as the Internet has grown, the amount of stress
put on BGP has increased. For a long time, the behavior of inter-domain
routing was studied minimally and was assumed to be working just fine.
Research eventually showed, however, that routing was not actually
functioning so smoothly, and the highly dynamic nature of the Internet
was taking its toll on the routing infrastructure. This discovery
prompted a closer look at the behavior of BGP.
Though its underlying premise is a simple distributed shortest-path
algorithm, the dynamic nature of the Internet, combined with some
additional constraints in the protocol, has made analytical approaches
to studying the protocol infeasible. Measurement-based approaches have
been taken, but they are difficult to implement and have minimal leeway
for allowing exploration of the protocol's behavior under different
conditions. For these reasons we have taken the approach of simulation
in order to begin to understand some of the complex ways in which BGP
behaves. Simulation allows one to explore the protocol more fully,
testing it under various conditions and modifying the protocol itself to
explore the consequences of its fundamental design.
We have studied BGP behavior with respect to several parameters, some
external (network characteristics) and some internal (protocol
characteristics). We show that there is room for improvement in the
protocol, in particular with respect to convergence following changes in
availability of an address in the network. The rate-limiting mechanism
of the protocol is a particular parameter of concern. Although it was
previously thought to help improve convergence, we found that in some
cases it can have drastic degrading effects. As a result of our work,
we suggest ways in which BGP could be modified in practice to reduce the
instability of the protocol.
%Z
This is a Ph.D. thesis. It differs from the version of the thesis which
appears in the Dartmouth College library in a couple of significant
ways. First, it is single-spaced. Figures have moved around in order
to accommodate this change. Second, it includes some corrections. The
primary correction is that some bibliography entries were reordered in
order to properly alphabetize them. This has the side effect that the
numbered citations throughout the document are different in this version
than in the original version.
%T SPADE: SPKI/SDSI for Attribute Release Policies in a Distributed Environment
%A Sidharth P. Nazareth
%R Technical Report TR2003-453
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-453.ps.Z
%X
Shibboleth is a federated administrated system that supports
inter-institutional authentication and authorization for sharing of
resources. SPKI/SDSI is a public key infrastructure whose creation
was motivated by the perception that X.509 is too complex and
flawed. This thesis addresses the problem of how users that are part
of a Public Key Infrastructure in a distributed computing system can
effectively specify, create, and disseminate their Attribute Release
Policies for Shibboleth using SPKI/SDSI. This thesis explores
existing privacy mechanims, as well as distributed trust management
and policy based systems. My work describes the prototype for a Trust
Management Framework called SPADE (SPKI/SDSI for Attribute Release
Policies in a Distributed Environment) that I have designed,
developed and implemented. The principal result of this research has
been the demonstration that SPKI/SDSI is a viable approach for trust
management and privacy policy specification, especially for
minimalistic policies in a distributed environment.
%Z
M.S Thesis.
Advisor: Sean Smith
%T Discrete-Event Fluid Modeling of Background TCP Traffic
%A David M. Nicol
%A Guanhua Yan
%R Technical Report TR2003-454
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-454.ps.Z
%X
TCP is the most widely
used transport layer protocol used in the internet today.
A TCP session adapts the demands it places on the network
to observations of bandwidth availability on the network.
Because TCP is adaptive, any model of its behavior that aspires to be
accurate must be influenced by other network traffic.
This point is especially important in the context of
using simulation to evaluate some new network algorithm of interest
(e.g. reliable multi-cast) in an environment where the background
traffic affects---and is affected by---its behavior.
We need to generate background traffic efficiently in a way
that captures the salient features of TCP, while
the reference and background traffic representations
interact with each other.
This paper describes a fluid model of TCP and a switching
model that has flows represented by fluids interacting with
packet-oriented flows. We describe
conditions under which a fluid model
produces exactly the same behavior
as a packet-oriented model, and we
quantify the performance advantages of the approach
both analytically and empirically. We observe that very significant
speedups may be attained while keeping high accuracy.
%T Persistence and Prevalence in the Mobility of Dartmouth Wireless Network Users
%A Clara E. Lee
%R Technical Report TR2003-455
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-455.ps.Z
%X
Wireless local-area networks (WLANs) are increasing in popularity. As
more people use WLANs it is important to understand how these users
behave. We analyzed data collected over three months of 2002 to
measure the persistence and prevalence of users of the Dartmouth
wireless network.
We found that most of the users of Dartmouth's network have
short association times and a high rate of mobility. This
observation fits with the predominantly student population of
Dartmouth College, because students do not have a fixed
workplace and are moving to and from classes all day.
%Z
The data in this paper is highly suspect; see TR2003-480.
%T Discovery, Visualization and Analysis of Gene Regulatory Sequence Elements in Genomes
%A Daniel F. Simola
%R Technical Report TR2003-456
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-456.pdf
%X
The advent of rapid DNA sequencing has produced an explosion in the amount of available sequence information, permitting us to ask many new questions about DNA. There is a pressing need to design algorithms that can provide answers to questions related to the control of gene expression, and thus to the structure, function, and behavior of organisms. Such algorithms must filter through massive amounts of informational noise to identify meaningful conserved regulatory DNA sequence elements.
We are approaching these questions with the notion that visualization is a key to exploring data relationships. Understanding the exact nature of these relationships can be very difficult by simply interpreting raw data. The ability to look at data in a graphical form allows us to apply our innate capacity to think visually to discern the subtle relationships that might not be recognizable otherwise.
This thesis provides computational tools to visually identify and analyze candidate motifs in the DNA of a species. This includes a parsing utility to store genomic data and an application to search for and visually identify motifs. Using these tools, novel and previously compiled gene sets were identified using the genome of the plant species Arabidopsis thaliana.
%Z
Senior Honors Thesis. Advisor: Jay Aslam.
%T Electronic Documents and Digital Signatures
%A Kunal Kain
%R Technical Report TR2003-457
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-457.pdf
%X
Often, the main motivation for using PKI in business environments is to streamline workflow, by enabling humans to digitally sign electronic documents, instead of manually signing paper ones. However, this application fails if adversaries can construct electronic documents whose viewed contents can change in useful ways, without invalidating the digital signature. In this paper, we examine the space of such attacks, and describe how many popular electronic document formats and PKI packages permit them.
%Z
A revised version was published as follows:
K. Kain, S.W. Smith, R. Asokan.
"Digital Signatures and Electronic Documents: A Cautionary Tale."
Advanced Communications and Multimedia Security,
pp. 293-307, September 2002. Kluwer Academic Publishers.
http://portal.acm.org/citation.cfm?id=647802.737169
http://www.cs.dartmouth.edu/~sws/pubs/ksa02.pdf
%T Power Conservation in the Network Stack of Wireless Sensors
%A Michael De Rosa
%R Technical Report TR2003-458
%I Dartmouth College, Computer Science
%C Hanover, NH
%D June 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-458.ps.Z
%X
Wireless sensor networks have recently become an incredibly active research area in the networking community. Much attention has been given to the construction of power-conserving protocols and techniques, as battery life is the one factor that prevents successful wide-scale deployment of such networks. These techniques concentrate on the optimization of network behavior, as the wireless transmission of data is the most expensive operation performed by a sensor node. Very little work has been published on the integration of such techniques, and their suitability to various application domains. This paper presents an exhaustive power consumption analysis of network stacks constructed with common algorithms, to determine the interactions between such algorithms and the suitability of the resulting network stack for various applications.
%Z
Senior Honors Thesis. Advisor: Bob Gray.
%T Efficient I/O for Computational Grid Applications
%A Ron A. Oldfield
%R Technical Report TR2003-459
%I Dartmouth College, Computer Science
%C Hanover, NH
%D May 2003
%U http://www.cs.dartmouth.edu/reports/TR2003-459.ps.Z
%X
High-performance computing increasingly occurs on "computational grids"
composed of heterogeneous and geographically distributed systems of
computers, networks, and storage devices that collectively act as a
single "virtual" computer. A key challenge in this environment is to
provide efficient access to data distributed across remote data
servers. This dissertation explores some of the issues associated
with I/O for wide-area distributed computing and describes an I/O
system, called Armada, with the following features: a framework to
allow application and dataset providers to flexibly compose graphs
of processing modules that describe the distribution, application
interfaces, and processing required of the dataset before or after
computation; an algorithm to restructure application graphs to
increase parallelism a