Access Control Realities As Observed in a Clinical Medical Setting
Dartmouth Technical Report TR2012-714
Sara Sinclair
Sean W. Smith
Date: January 2012
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/574/TR2012-714.pdf (146KB)
Abstract:
Effective computer security requires looking not just at technology, but also at
how it meshes with users in the real-world enterprises depending on it. As part
of a longer-term series of projects, we have been looking at these issues, particularly
access control, in a variety of real-world enterprises. In previous work, we
looked at companies in the finance and software industries; this paper reports on
a study of a hospital's access control systems. Both studies employ ethnographic
methods to elicit observations on the failures of current access control technologies
in large, dynamic organizations; participants in the corporate study were largely
drawn from IT staff members, whereas this clinical study involved a larger number
of end users.
Access Control Hygiene and the Empathy Gap in Medical IT
Dartmouth Technical Report TR2012-713
Yifei Wang
Sean W. Smith
Andrew Gettinger
Date: January 2012
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/573/TR2012-713.pdf (406KB)
Abstract:
In theory, access control is a solved problem. In practice, large real-world enterprises
still report trouble: de facto policy becomes unmanageable; users circumvent controls.
These issues can be particularly critical in medical IT, such as emerging EMR and
EHR, where access control errors can have serious repercussions. In this paper,
we investigate how real-world EMR users think about access control when they are
making policy decisions in the abstract---and when they are actually using the system
in treatment scenarios. Mismatches suggest places (“empathy gaps”) where new
policy tools may be needed.
Wallpaper Maps
Dartmouth Technical Report TR2012-712
M. Douglas McIlroy
Date: January 2012
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/572/TR2012-712.pdf (5091KB)
Abstract:
A wallpaper map is a conformal projection of a spherical earth onto regular polygons
with which the plane can be tiled continuously. A complete set of distinct wallpaper
maps that satisfy certain natural symmetry conditions is derived and illustrated.
Though all of the projections have been published before, the family had not been
characterized as a whole. Some wallpaper maps generalize to one-parameter subfamilies
in which the sphere is pre-transformed by a conformal automorphism.
EXPOSING PRIVACY CONCERNS IN MHEALTH DATA SHARING
Dartmouth Technical Report TR2012-711
Aarathi Prasad
Date: January 2012
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/571/TR2012-711.pdf (4828KB)
Abstract:
Mobile health (mHealth) has become important in the field of healthcare information
technology, as patients begin to use mobile devices to record their daily activities
and vital signs. These devices can record personal health information even outside
the hospital setting, while the patients are at home or at their workplace. However,
the devices might record sensitive information that might not be relevant for medical
purposes and in some cases may be misused. Patients need expressive privacy controls
so that they can trade potential health benefits of the technology with the privacy
risks. To provide such privacy controls, it is important to understand what patients
feel are the benefits and risks associated with the technology and what controls
they want over the information.
We conducted focus groups to understand the privacy concerns that patients have
when they use mHealth devices. We conducted a user study to understand how willing
patients are to share their personal health information that was collected using
an mHealth device. To the best of our knowledge, ours is the first study that explores
users' privacy concerns by giving them the opportunity to actually share the information
collected about them using mHealth devices. We found that patients tend to share
more information with third parties than the public and prefer to keep certain information
from their family and friends. Finally, based on these discoveries, we propose some
guidelines for developing defaults for sharing settings in mHealth systems.
Notes: M.S. Thesis. Advisor: David Kotz.
The Good, the Bad, and the Actively Verified
Dartmouth Technical Report TR2011-710
John F Williamson
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/570/TR2011-710.pdf (187KB)
Abstract:
We believe that we can use active probing for compromise recovery. Our intent is
to exploit the differences in behavior between compromised and uncompromised systems
and use that information to identify those which are not behaving as expected. Those
differences may indicate a deviation in either configuration or implementation
from what we expect on the network, either of which suggests that the misbehaving
entity might not be trustworthy. In this work, we propose and build a case for a
method for using altered behavior directly resulting from or introduced as a side-effect
of the compromise of a network service to detect the presence of such a compromise.
We use several case studies to illustrate our technique, and demonstrate its feasibility
with a software tool developed using our method.
Anomaly Detection in Network Streams Through a Distributional Lens
Dartmouth Technical Report TR2011-707
Chrisil Arackaparambil
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/569/TR2011-707.pdf (982KB)
Abstract:
Anomaly detection in computer networks yields valuable information on events relating
to the components of a network, their states, the users in a network and their activities.
This thesis provides a unified distribution-based methodology for online detection
of anomalies in network traffic streams. The methodology is distribution-based in
that it regards the traffic stream as a time series of distributions (histograms),
and monitors metrics of distributions in the time series. The effectiveness of the
methodology is demonstrated in three application scenarios. First, in 802.11 wireless
traffic, we show the ability to detect certain classes of attacks using the methodology.
Second, in information network update streams (specifically in Wikipedia) we show
the ability to detect the activity of bots, flash events, and outages, as they occur.
Third, in Voice over IP traffic streams, we show the ability to detect covert channels
that exfiltrate confidential information out of the network. Our experiments show
the high detection rate of the methodology when compared to other existing methods,
while maintaining a low rate of false positives. Furthermore, we provide algorithmic
results that enable efficient and scalable implementation of the above methodology,
to accommodate the massive data rates observed in modern information streams on the
Internet. Through these applications, we present an extensive study of several aspects
of the methodology. We analyze the behavior of metrics we consider, providing justification
of our choice of those metrics, and how they can be used to diagnose anomalies.
We provide insight into the choice of parameters, like window length and threshold,
used in anomaly detection.
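The distribution-based idea in the abstract above can be sketched as follows. This is a minimal illustration, not the thesis's method: it assumes Shannon entropy as the monitored metric, non-overlapping windows, and a simple comparison against the previous window. The names `entropy`, `detect_anomalies`, `window_len`, and `threshold` are hypothetical.

```python
import math
from collections import Counter


def entropy(window):
    """Shannon entropy (in bits) of the empirical distribution of items in a window."""
    counts = Counter(window)
    total = len(window)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def detect_anomalies(stream, window_len, threshold):
    """Return indices of windows whose entropy differs from the previous
    window's entropy by more than `threshold` (an assumed decision rule)."""
    flagged = []
    prev = None
    # Treat the stream as a time series of histograms, one per window.
    for start in range(0, len(stream) - window_len + 1, window_len):
        h = entropy(stream[start:start + window_len])
        if prev is not None and abs(h - prev) > threshold:
            flagged.append(start // window_len)
        prev = h
    return flagged
```

With this rule, a window dominated by a single value (for example, one source address flooding the stream) is flagged, and so is the first normal window that follows it, because both transitions change the entropy sharply.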
Beyond SELinux: the Case for Behavior-Based Policy and Trust Languages
Dartmouth Technical Report TR2011-701
Sergey Bratus
Michael E. Locasto
Boris Otto
Rebecca Shapiro
Sean W. Smith
Gabriel Weaver
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/568/TR2011-701.pdf (174KB)
Abstract:
Despite the availability of powerful mechanisms for security policy and access control,
real-world information security practitioners---both developers and security officers---still
find themselves in need of something more.
We believe that this is the case because available policy languages do not provide
clear and intelligible ways to allow developers to communicate their knowledge and
expectations of trustworthy behaviors and actual application requirements to IT
administrators. We work to address this policy engineering gap by shifting the focus
of policy language design to this communication via behavior-based policies and
their motivating scenarios.
Scalable Object-Class Search via Sparse Retrieval Models and Approximate Ranking
Dartmouth Technical Report TR2011-700
Mohammad Rastegari
Chen Fang
Lorenzo Torresani
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/567/TR2011-700.pdf (206KB)
Abstract:
In this paper we address the problem of object-class retrieval in large image data
sets: given a small set of training examples defining a visual category, the objective
is to efficiently retrieve images of the same class from a large database. We propose
two contrasting retrieval schemes achieving good accuracy and high efficiency. The
first exploits sparse classification models expressed as linear combinations of
a small number of features. These sparse models can be efficiently evaluated using
inverted file indexing. Furthermore, we introduce a novel ranking procedure that
provides a significant speedup over inverted file indexing when the goal is restricted
to finding the top-k (i.e., the k highest ranked) images in the data set. We contrast
these sparse retrieval models with a second scheme based on approximate ranking
using vector quantization. Experimental results show that our algorithms for object-class
retrieval can search a database of 10 million images in just a couple of seconds and produce
categorization accuracy comparable to the best known class-recognition systems.
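As a rough illustration of why sparse linear models pair well with inverted file indexing, the sketch below scores only the images that contain a feature with a nonzero model weight. It is an assumption-laden toy, not the paper's implementation: the dictionary-based data layout and the names `build_inverted_index` and `top_k` are hypothetical, and the final ranking simply scores every candidate rather than using the paper's faster top-k procedure.

```python
import heapq
from collections import defaultdict


def build_inverted_index(images):
    """Map each feature id to a list of (image_id, value) postings.

    `images` is assumed to be {image_id: {feature_id: value}} with sparse features.
    """
    index = defaultdict(list)
    for image_id, features in images.items():
        for feature, value in features.items():
            if value != 0:
                index[feature].append((image_id, value))
    return index


def top_k(index, weights, k):
    """Score images against a sparse linear model and return the k best.

    Only postings of features with a nonzero weight are visited, so the cost
    depends on the model's sparsity rather than on the full feature dimension.
    """
    scores = defaultdict(float)
    for feature, w in weights.items():
        for image_id, value in index.get(feature, []):
            scores[image_id] += w * value
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])
```

Images that share no feature with the model never appear among the candidates, which is where the savings over a dense scan come from.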
Some Communication Complexity Results and their Applications
Dartmouth Technical Report TR2011-699
Joshua E. Brody
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/566/TR2011-699.pdf (590KB)
Abstract:
Communication Complexity represents one of the premier techniques for proving lower
bounds in theoretical computer science. Lower bounds on communication problems can
be leveraged to prove lower bounds in several different areas.
In this work, we study three different communication complexity problems. The lower
bounds for these problems have applications in circuit complexity, wireless sensor
networks, and streaming algorithms.
First, we study the multiparty pointer jumping problem. We present the first nontrivial
upper bound for this problem. We also provide a suite of strong lower bounds under
several restricted classes of protocols.
Next, we initiate the study of several non-monotone functions in the distributed
functional monitoring setting and provide several lower bounds. In particular, we
give a generic adversarial technique and show that when deletions are allowed, no
nontrivial protocol is possible.
Finally, we study the Gap-Hamming-Distance problem and give tight lower bounds for
protocols that use a constant number of messages. As a result, we take a well-known
lower bound for one-pass streaming algorithms for a host of problems and extend
it so it applies to streaming algorithms that use a constant number of passes.
Effects of network trace sampling methods on privacy and utility metrics
Dartmouth Technical Report TR2011-697
Phillip A. Fazio
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/565/TR2011-697.pdf (1018KB)
Abstract:
Researchers studying computer networks rely on the availability of traffic trace
data collected from live production networks. Those choosing to share trace data
with colleagues must first remove or otherwise anonymize sensitive information.
This process, called sanitization, represents a tradeoff between the removal of
information in the interest of identity protection and the preservation of data
within the trace that is most relevant to researchers. While several metrics exist
to quantify this privacy-utility tradeoff, they are often computationally expensive.
Computing these metrics using a sample of the trace, rather than the entire input
trace, could potentially save precious time and space resources, provided the accuracy
of these values does not suffer.
In this paper, we examine several simple sampling methods to discover their effects
on measurement of the privacy-utility tradeoff when anonymizing network traces prior
to their sharing or publication. After sanitizing a small sample trace collected
from the Dartmouth College wireless network, we tested the relative accuracy of
a variety of previously implemented packet and flow-sampling methods on a few existing
privacy and utility metrics. This analysis led us to conclude that, for our test
trace, no single sampling method we examined allowed us to accurately measure the
trade-off, and that some sampling methods can produce grossly inaccurate estimates
of those values. We were unable to draw conclusions on the use of packet versus
flow sampling in these instances.
Appearance-design interfaces and tools for computer cinematography: Evaluation and application
Dartmouth Technical Report TR2011-696
William B. Kerr
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/564/TR2011-696.pdf (30645KB)
Abstract:
We define appearance design as the creation and editing of scene content such as
lighting and surface materials in computer graphics. The appearance design process
takes a significant amount of time relative to other production tasks and poses
difficult artistic challenges. Many user interfaces have been proposed to make appearance
design faster, easier, and more expressive, but no formal validation of these interfaces
had been published prior to our body of work. With a focus on novice users, we present
a series of investigations into the strengths and weaknesses of various appearance
design user interfaces. In particular, we develop an experimental methodology for
the evaluation of representative user interface paradigms in the areas of lighting
and material design. We conduct three user studies having subjects perform design
tasks under controlled conditions. In these studies, we discover new insight into
the effectiveness of each paradigm for novices measured by objective performance
as well as subjective feedback. We also offer observations on common workflow and
capabilities of novice users in these domains. We use the results of our lighting
study to develop a new representation for artistic control of lighting, where light
travels along nonlinear paths.
Assisting Human Motion-Tasks with Minimal, Real-time Feedback
Dartmouth Technical Report TR2011-695
Paritosh A. Kavathekar
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/563/TR2011-695.pdf (2199KB)
Abstract:
Teaching physical motions such as riding, exercising, swimming, etc. to human beings
is hard. Coaches face difficulties in communicating their feedback verbally and
cannot correct the student mid-action; teaching videos are two dimensional and suffer
from perspective distortion. Systems that track a user and provide real-time
feedback have many potential applications: as an aid to the visually challenged,
improving rehabilitation, improving exercise routines such as weight training or
yoga, teaching new motion tasks, synchronizing motions of multiple actors, etc.
It is not easy to deliver real-time feedback in a way that is easy to interpret,
yet unobtrusive enough to not distract the user from the motion task. I have developed
motion feedback systems that provide real-time feedback to achieve or improve human
motion tasks. These systems track the user's actions with simple sensors, and use
tiny vibration motors as feedback devices. Vibration motors provide feedback that
is both intuitive and minimally intrusive. My systems' designs are simple, flexible,
and extensible to large-scale, full-body motion tasks. The systems that I developed
as part of this thesis address two classes of motion tasks: configuration tasks
and trajectory tasks. Configuration tasks guide the user to a target configuration.
My systems for configuration tasks use a motion-capture system to track the user.
Configuration-task systems restrict the user's motions to a set of motion primitives,
and guide the user to the target configuration by executing a sequence of motion-primitives.
Trajectory tasks assume that the user understands the motion task. The systems for
trajectory tasks provide corrective feedback that assists the user in improving
their performance. This thesis presents the design, implementation, and results
of user experiments with the prototype systems I have developed.
Minimum time kinematic trajectories for self-propelled rigid bodies in the unobstructed plane
Dartmouth Technical Report TR2011-694
Andrei A. Furtuna
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/562/TR2011-694.pdf (2733KB)
Abstract:
The problem of moving rigid bodies efficiently is of particular interest in robotics
because the simplest model of a mobile robot or of a manipulated object is often
a rigid body. Path planning, controller design and robot design may all benefit
from precise knowledge of optimal trajectories for a set of permitted controls.
In this work, we present a general solution to the problem of finding minimum time
trajectories for an arbitrary self-propelled, velocity-bounded rigid body in the
obstacle-free plane. Such minimum-time trajectories depend on the vehicle's capabilities
and on the start and goal configurations. For example, the fastest way to move
a car sideways might be to execute a parallel-parking motion. The fastest long-distance
trajectories for a wheelchair-like vehicle might be of a turn-drive-turn variety.
Our analysis reveals a wide variety of types of optimal trajectories. We determine
an exhaustive taxonomy of optimal trajectory types, presented as a branching tree.
For each of the necessary leaf nodes, we develop a specific algorithm to find the
fastest trajec
A Multilevel, Posture-based Model for Motor Control in Simulation and Robotic Applications
Dartmouth Technical Report TR2011-691
Divya A. Gunasekaran
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/561/TR2011-691.pdf (52703KB)
Abstract:
This paper presents a multilevel, posture-based motor control model intended to plan
collision-free movements in a 3D environment while maintaining computational efficiency
and accurately imitating human and primate motor function. Our model is a comprehensive
approach that addresses the storage and lookup of postures and movements, path planning
and the generation of new movements, and learning with experience. We demonstrate
the functionality and computational advantages of the model through preliminary
testing on a humanoid robot.
802.15.4/ZigBee Analysis and Security: tools for practical exploration of the attack surface
Dartmouth Technical Report TR2011-689
Ricky A. Melgares
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/560/TR2011-689.pdf (20517KB)
Abstract:
This thesis explores methods and techniques for surveying 802.15.4 and ZigBee wireless
networks. The tools developed will aid in reconnaissance attacks against target
networks; information gathered during this process will be used to profile a target
network and its devices, as well as to pinpoint the geolocation of devices for executing
physical attacks against the onboard hardware. Attacks against the PHY and MAC layers
of the 802.15.4 standard will be explored as well.
Exploiting the Hard-Working DWARF: Trojan and Exploit Techniques Without Native Executable Code
Dartmouth Technical Report TR2011-688
James M.H. Oakley
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/559/TR2011-688.pdf (334KB)
Abstract:
The study of vulnerabilities and exploitation is one of finding mechanisms affecting
the flow of computation and of finding new means to perform unexpected computation.
In this paper we show the extent to which exception handling mechanisms as implemented
and used by gcc can be used to control program execution. We show that the data
structures used to store exception handling information on UNIX-like systems actually
contain Turing-complete bytecode, which is executed by a virtual machine during
the course of exception unwinding and handling. We discuss how a malicious attacker
could gain control over these structures and how such an attacker could utilize
them once control has been achieved.
IEEE 802.15.4 Wireless Security: Self-Assessment Frameworks
Dartmouth Technical Report TR2011-687
Ryan M. Speers
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/558/TR2011-687.pdf (3016KB)
Abstract:
This thesis analyzes the security of networks built upon the IEEE 802.15.4 standard,
specifically in regard to the ability of an attacker to manipulate such networks
under real-world conditions. The author presents a set of tools, both hardware and
software, that advance the state-of-the-art in reconnaissance and site surveying,
intelligent packet generation, and launching of attacks. Specifically, tools provide
increased hardware support for the KillerBee toolkit, a Scapy layer for forming
802.15.4 packets, reflexive jamming of packets, and other research enablers. This
work aims to advance the ability of security auditors to understand the threats
to IEEE 802.15.4 networks by providing auditors usable and low-cost tools to carry
out vulnerability assessments.
Static Analysis for Ruby in the Presence of Gradual Typing
Dartmouth Technical Report TR2011-686
Michael Edgar
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/557/TR2011-686.pdf (832KB)
Abstract:
Dynamic languages provide new challenges to traditional static analysis techniques,
leaving most errors to be detected at runtime and making many properties of code
difficult to infer. Ruby code usually takes advantage of both dynamic typing and
metaprogramming to produce elegant yet difficult-to-analyze programs. Function eval()
and its variants, which usually foil static analysis, are used frequently as a primitive
runtime macro system. The goal of this thesis is to answer the question:
What useful information about real-world Ruby programs can be determined statically
with a high degree of accuracy?
Two observations lead to a number of statically-discoverable errors and properties
in parseable Ruby programs. The first is that many interesting properties of a program
can be discovered through traditional static analysis techniques despite the presence
of dynamic typing. The second is that most metaprogramming occurs when the program
files are loaded and not during the execution of the "main program."
Traditional techniques, such as flow analysis and Static Single Assignment transformations,
aid extraction of program invariants, including both explicitly programmed constants
and those implicitly defined by Ruby's semantics. A meaningful, well-defined distinction
between load time and run time in Ruby is developed and addresses the second observation.
This distinction allows us to statically discern properties of a Ruby program despite
many idioms that require dynamic evaluation of code. Lastly, gradual typing through
optional annotations improves the quality of error discovery and other statically-inferred
properties.
Constant-RMR Abortable Reader-Priority Reader-Writer Algorithm
Dartmouth Technical Report TR2011-685
Nan Zheng
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/556/TR2011-685.pdf (655KB)
Abstract:
The concurrent reader-writer problem (Courtois, 1971) involves two classes of processes:
readers and writers, both of which wish to access a shared resource. Many readers
can access the shared resource at the same time. However, if a writer is accessing
the resource, no readers or other writers can access the resource at the same time.
In the reader-priority version of the problem, readers are prioritized over writers
when processes from both classes are trying to access the shared resource. Previous
research (Bhatt, 2010) showed a reader-priority constant-RMR multi-reader, multi-writer
algorithm for Cache-Coherent (CC) systems. However, this algorithm does not allow
for readers or writers to abort. Aborting allows readers and writers waiting for the
resource to stop trying to access the resource and to quickly return to the Remainder
Section of the code, where the process performs tasks unrelated to the shared resource.
This thesis presents an abortable constant-RMR reader-priority multi-reader single-writer
algorithm for CC systems. Additionally, we show how to generalize the algorithm
into a multi-reader multi-writer algorithm using any given abortable mutual exclusion
algorithm. The algorithm is proven rigorously by invariants and tested using a system
of mathematical specification and model-checking tools (PlusCal/TLA+/TLC).
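For readers unfamiliar with the problem statement in the abstract above, the following sketch shows the classical reader-preference lock in the style of Courtois et al. It only illustrates the exclusion rules (many readers, or exactly one writer). It is not the thesis's algorithm: it is neither constant-RMR nor abortable, and the class name `ReaderPriorityLock` and its methods are hypothetical.

```python
import threading


class ReaderPriorityLock:
    """Classical reader-preference reader-writer lock (illustrative only)."""

    def __init__(self):
        self._readers = 0                    # number of active readers
        self._count_lock = threading.Lock()  # protects the reader count
        self._resource = threading.Lock()    # held by the reader group or one writer

    def acquire_read(self):
        with self._count_lock:
            self._readers += 1
            if self._readers == 1:
                # The first reader locks writers out on behalf of all readers.
                self._resource.acquire()

    def release_read(self):
        with self._count_lock:
            self._readers -= 1
            if self._readers == 0:
                # The last reader lets writers back in.
                self._resource.release()

    def acquire_write(self, blocking=True):
        # Returns True if the write lock was obtained.
        return self._resource.acquire(blocking)

    def release_write(self):
        self._resource.release()
```

While any reader is active, newly arriving readers enter immediately even if a writer is waiting, which is the sense in which readers have priority in this sketch. A waiting thread here cannot give up cleanly once it has started a blocking acquire, which is the gap that abortability addresses.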
Reader-Writer Exclusion Supporting Upgrade and Downgrade with Starvation Freedom
Dartmouth Technical Report TR2011-684
Matthew Elkherj
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/555/TR2011-684.pdf (274KB)
Abstract:
In this thesis we give a constant Remote-Memory-Reference (on CC systems) reader-writer
exclusion algorithm supporting upgrade and downgrade, built from a reader-writer
exclusion algorithm by Jayanti and Liu. The algorithm is starvation-free, and allows
for repeated upgrades and downgrades.
Reader-Writer Exclusion Supporting Upgrade and Downgrade with Reader-Priority
Dartmouth Technical Report TR2011-683
Michael I. Diamond
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/554/TR2011-683.pdf (283KB)
Abstract:
The Reader-Writer Exclusion problem seeks to provide a lock that protects some critical
section of code for two classes of processes, readers and writers, where multiple
readers are permitted to hold the lock at a time, but only one writer can hold the
lock to the exclusion of all other processes. The difficulties in solving this problem
lie not only in developing a good algorithm, but in rigorously formulating desirable
properties for such an algorithm to have. Recently, Bhatt and Jayanti accomplished
both of these tasks for several variants of the Reader-Writer Exclusion problem.
We seek to extend their work by augmenting one of their algorithms (the one giving
readers priority over writers) with the notions of upgrading and downgrading. We
augment the algorithm by allowing processes in the critical section that are only
permitted to read to attempt to acquire permission to write by upgrading, and by
allowing processes that are permitted to write to relinquish their permission to
write--but still remain in the critical section as readers--by downgrading.
A Solution to k-Exclusion with O(log k) RMR Complexity
Dartmouth Technical Report TR2011-682
Jonathan H. Choi
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/553/TR2011-682.pdf (271KB)
Abstract:
We specify and prove an algorithm solving k-Exclusion, a generalization of the Mutual
Exclusion problem. k-Exclusion requires that at most k processes be in the Critical
Section (CS) at once; in addition, we require bounded exit, starvation freedom and
fairness properties. The goal within this framework is to minimize the number of
Remote Memory References (RMRs) made. Previous algorithms have required Omega(k)
RMRs in the worst case. Our algorithm requires O(log k) RMRs in the worst case under
the Cache-Coherent (CC) model, a considerable improvement in time complexity.
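The safety requirement of k-Exclusion stated above (at most k processes in the Critical Section at once) can be illustrated with a counting semaphore. This sketch is only a statement of the problem in code: it says nothing about RMR complexity and does not guarantee the fairness or starvation-freedom properties the thesis requires. The class name `KExclusion` is hypothetical.

```python
import threading


class KExclusion:
    """Admit at most k threads into the critical section (illustrative only)."""

    def __init__(self, k):
        self._slots = threading.BoundedSemaphore(k)

    def enter(self, blocking=True):
        # Returns True if a slot in the critical section was obtained.
        return self._slots.acquire(blocking)

    def leave(self):
        self._slots.release()
```

With k = 1 this reduces to ordinary mutual exclusion, which is why k-Exclusion is described as a generalization of that problem.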
Obstruction-free Snapshot, Obstruction-free Consensus, and Fetch-and-add Modulo k
Dartmouth Technical Report TR2011-681
Jack R. Bowman
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/552/TR2011-681.pdf (108KB)
Abstract:
In this thesis we design algorithms for three problems: snapshot, consensus, and
fetch-and-add modulo k. Our solutions for snapshot and consensus are non-anonymous
and obstruction-free, and our solution for Fetch-and-add Modulo k is wait-free.
We also conjecture an anonymous, obstruction-free solution to consensus.
A New Artificial Intelligence for Auralux
Dartmouth Technical Report TR2011-692
Edward B. McNeill
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/551/TR2011-692.pdf (1156KB)
Abstract:
This project focused on developing a more challenging artificial intelligence for
the real-time strategy game Auralux. I designed and implemented an AI framework
named FlexibleAI that could be configured with various parameters controlling different
aspects of the overall algorithm. In this way, the AI could be tuned to be more
successful. I then created a testing framework called AuraSim that simplified Auralux
into an easily-simulated turn-based format. After testing various configurations
and tuning the FlexibleAI's parameters to be more successful, the AI eventually
achieved a victory rate several times better than its average opponent. This provides
the basis for a more challenging Auralux AI that will likely prove more satisfying
to play against.
Autoscopy Jr.: Intrusion Detection for Embedded Control Systems
Dartmouth Technical Report TR2011-704
Jason O. Reeves
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/550/TR2011-704-rev1.pdf (655KB)
Abstract:
Securing embedded control systems within the power grid presents a unique challenge:
on top of the resource restrictions inherent to these devices, SCADA systems must
also accommodate strict timing requirements that are non-negotiable, and their massive
scale greatly amplifies costs such as power consumption. These constraints make
the conventional approach to host intrusion detection--namely, employing virtualization
in some manner--too costly or impractical for embedded control systems within critical
infrastructure. Instead, we take an in-kernel approach to system protection, building
upon the Autoscopy system developed by Ashwin Ramaswamy that places probes on indirectly-called
functions and uses them to monitor its host system for behavior characteristic of
control-flow-altering malware, such as rootkits. In this thesis, we attempt to show
that such a method would indeed be a viable method of protecting embedded control
systems. We first identify several issues with the original prototype, and present
a new version of the program (dubbed Autoscopy Jr.) that uses trusted location lists
to verify that control is coming from a known, trusted location inside our kernel.
Although we encountered additional performance overhead when testing our new design,
we developed a kernel profiler that allowed us to identify the probes responsible
for this overhead and discard them, leaving us with a final probe list that generated
less than 5% overhead on every one of our benchmark tests. Finally, we attempted
to run Autoscopy Jr. on two specialized kernels (one with an optimized probing framework,
and another with a hardening patch installed), finding that the former did not produce
enough performance benefits to preclude using our profiler, and that the latter
required a different method of scanning for indirect functions for Autoscopy Jr.
to operate. We argue that Autoscopy Jr. is indeed a feasible intrusion detection
system for embedded control systems, as it can adapt easily to a variety of system
architectures and allows us to intelligently balance security and performance on
these critical devices.
Tackling Latency Using FG
Dartmouth Technical Report TR2011-706
Priya Natarajan
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/549/TR2011-706.pdf (806KB)
Abstract:
Applications that operate on datasets which are too big to fit in main memory, known
in the literature as external-memory or out-of-core applications, store their data
on one or more disks. Several of these applications make multiple passes over the
data, where each pass reads data from disk, operates on it, and writes data back
to disk. Compared with an in-memory operation, a disk-I/O operation takes orders
of magnitude (approx. 100,000 times) longer; that is, disk-I/O is a high-latency
operation. Out-of-core algorithms often run on a distributed-memory cluster to take
advantage of a cluster's computing power, memory, disk space, and bandwidth. By
doing so, however, they introduce another high-latency operation: interprocessor
communication. Efficient implementations of these algorithms access data in blocks
to amortize the cost of a single data transfer over the disk or the network, and
they introduce asynchrony to overlap high-latency operations and computations. FG,
short for Asynchronous Buffered Computation Design and Engineering Framework Generator,
is a programming framework that helps to mitigate latency in out-of-core programs
that run on distributed-memory clusters. An FG program is composed of a pipeline
of stages operating on buffers. FG runs the stages asynchronously so that stages
performing high-latency operations can overlap their work with other stages. FG
supplies the code to create a pipeline, synchronize the stages, and manage data
buffers; the user provides a straightforward function, containing only synchronous
calls, for each stage. In this thesis, we use FG to tackle latency and exploit
the available parallelism in out-of-core and distributed-memory programs. We show
how FG helps us design out-of-core programs and think about parallel computing in
general using three instances: an out-of-core, distribution-based sorting program;
an implementation of external-memory suffix arrays; and a scientific-computing application
called the fast Gauss transform. FG's interaction with these real-world programs
is symbiotic: FG enables efficient implementations of these programs, and the design
of the first two of these programs pointed us toward further extensions for FG.
Today's era of multicore machines compels us to harness all opportunities for parallelism
that are available in a program, and so in the latter two applications, we combine
FG's multithreading capabilities with the routines that OpenMP offers for in-core
parallelism. In the fast Gauss transform application, we use this strategy to realize
a performance improvement of up to 20-fold compared with an alternate fast Gauss transform
implementation. In addition, we use our experience with designing programs in FG
to provide some suggestions for the next version of FG.
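The pipeline structure described in this abstract (stages operating on buffers, run asynchronously, with the user writing only synchronous stage functions) can be illustrated with a minimal sketch. This is not FG's API: the names `run_stage`, `run_pipeline`, and the three stage functions are hypothetical, and Python threads with FIFO queues stand in for whatever FG actually uses.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the buffer stream


def run_stage(func, inbox, outbox):
    """Apply one synchronous stage function to each buffer, then pass it on."""
    while True:
        buf = inbox.get()
        if buf is _DONE:
            outbox.put(_DONE)
            return
        outbox.put(func(buf))


def run_pipeline(stage_funcs, buffers):
    """Run every stage in its own thread, linked by FIFO queues."""
    queues = [queue.Queue() for _ in range(len(stage_funcs) + 1)]
    threads = [
        threading.Thread(target=run_stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(stage_funcs)
    ]
    for t in threads:
        t.start()
    for buf in buffers:
        queues[0].put(buf)
    queues[0].put(_DONE)

    results = []
    while True:
        out = queues[-1].get()
        if out is _DONE:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results


# Hypothetical stages standing in for "read a block", "compute", "write".
def read_stage(block_id):
    return list(range(block_id * 4, block_id * 4 + 4))


def sort_stage(block):
    return sorted(block, reverse=True)


def write_stage(block):
    return sum(block)


pipeline_output = run_pipeline([read_stage, sort_stage, write_stage], range(3))
```

Because each stage runs in its own thread, a stage blocked on a slow operation does not stop the other stages from working on different buffers, which is the overlap the abstract describes.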
Hide-n-Sense: Privacy-aware secure mHealth sensing
Dartmouth Technical Report TR2011-702
Shrirang Mare
Jacob Sorber
Minho Shin
Cory Cornelius
David Kotz
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/548/TR2011-702.pdf (384KB)
Abstract:
As healthcare in many countries faces an aging population and rising costs, mobile
sensing technologies promise a new opportunity. Using mobile health (mHealth) sensing,
which uses medical sensors to collect data about the patients, and mobile phones
to act as a gateway between sensors and electronic health record systems, caregivers
can continuously monitor the patients and deliver better care. Furthermore, individuals
can become better engaged in monitoring and managing their own health. Although
some work on mHealth sensing has addressed security, achieving strong privacy for
low-power sensors remains a challenge.
We make three contributions. First, we propose an mHealth sensing protocol that
provides strong security and privacy properties with low energy overhead, suitable
for low-power sensors. The protocol uses three novel techniques: adaptive security,
to dynamically modify transmission overhead; MAC striping, to make forgery difficult
even for small-sized MACs; and an asymmetric resource requirement. Second, we demonstrate
a prototype on a Chronos wrist device, and evaluate it experimentally. Third, we
provide a security, privacy, and energy analysis of our system.
BGrep and BDiff: UNIX Tools for High-Level Languages
Dartmouth Technical Report TR2011-705
Gabriel A. Weaver
Sean W. Smith
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/547/TR2011-705.pdf (201KB)
Abstract:
The rise in high-level languages for system administrators requires us to rethink
traditional UNIX tools, which were designed for older, line-oriented data formats.
We propose new block-oriented tools, bgrep and bdiff, operating on syntactic blocks
of code rather than the line, the traditional information container of UNIX. Transcending
the line number allows us to introduce longitudinal diff, a mode of bdiff that lets
us track changes across arbitrary blocks of code. We present a detailed implementation
roadmap and evaluation framework for the full version of this paper. In addition,
we demonstrate how the design of our tools already addresses several real-world problems
that network administrators face in maintaining security policy.
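As a rough illustration of searching by syntactic block rather than by line, the sketch below returns whole brace-delimited blocks that contain a match. It is not the bgrep implementation from the report: the function names, the brace-based notion of a block, and the sample configuration syntax are all assumptions made for the example.

```python
import re


def top_level_blocks(text):
    """Yield each top-level '{ ... }' block together with its header text."""
    depth = 0
    start = 0
    for i, ch in enumerate(text):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                yield text[start:i + 1].strip()
                start = i + 1


def bgrep_sketch(pattern, text):
    """Return the whole blocks (not just the lines) that match the pattern."""
    regex = re.compile(pattern)
    return [block for block in top_level_blocks(text) if regex.search(block)]


# Hypothetical configuration syntax, used only for illustration.
config = """
interface eth0 {
    address 10.0.0.1
    acl allow-ssh
}
interface eth1 {
    address 10.0.1.1
}
"""

matches = bgrep_sketch(r"acl\s+allow-ssh", config)
```

A line-oriented grep would report only the `acl allow-ssh` line; returning the enclosing block also shows which interface the rule belongs to. The sketch assumes balanced braces.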
Large-scale Wireless Local-area Network Measurement and Privacy Analysis
Dartmouth Technical Report TR2011-703
Keren Tan
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/545/TR2011-703.pdf (1362KB)
Abstract:
The edge of the Internet is increasingly becoming wireless. Understanding the wireless
edge is therefore important for understanding the performance and security aspects
of the Internet experience. This understanding is especially important for enterprise-wide
wireless local-area networks (WLANs) as organizations increasingly depend on WLANs
for mission-critical tasks. To study a live production WLAN, especially a large-scale
network, is a difficult undertaking. Two fundamental difficulties involved are (1)
building a scalable network measurement infrastructure to collect traces from a
large-scale production WLAN, and (2) preserving user privacy while sharing these
collected traces with the network research community. In this dissertation, we present
our experience in designing and implementing one of the largest distributed WLAN
measurement systems in the United States, the Dartmouth Internet Security Testbed
(DIST), with a particular focus on our solutions to the challenges of efficiency,
scalability, and security. We also present an extensive evaluation of the DIST system.
To understand the severity of some potential trace-sharing risks for an enterprise-wide
large-scale wireless network, we conduct privacy analysis on one kind of wireless
network traces, a user-association log, collected from a large-scale WLAN. We introduce
a machine-learning based approach that can extract and quantify sensitive information
from a user-association log, even though it is sanitized. Finally, we present a
case study that evaluates the tradeoff between utility and privacy on WLAN trace
sanitization.
Exploiting the Hard-Working DWARF: Trojans with no Native Executable Code
Dartmouth Technical Report TR2011-680
James M.H. Oakley
Sergey Bratus
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/525/TR2011-680.pdf (356KB)
Abstract:
All binaries compiled by recent versions of GCC from C++ programs include complex
data and dedicated code for exception handling support. The data structures describe
the call stack frame layout in the DWARF format bytecode. The dedicated code includes
an interpreter of this bytecode and logic to implement the call stack unwinding.
Despite being present in a large class of programs -- and therefore potentially
providing a huge attack surface -- this mechanism is not widely known or studied.
Of particular interest to us is that the exception handling mechanism provides the
means for fundamentally altering the flow of a program. DWARF is designed specifically
for calculating call frame addresses and register values. DWARF expressions are
Turing-complete and may calculate register values based on any readable data in
the address space of the process. The exception handling data is in effect an embedded
program residing within every C++ process. This talk explores what can be accomplished
with control of the exception handling information without modifying the program's
text or data. We also examine the exception handling mechanism and argue that it
is ripe for vulnerability finding, not least because the error states of a program
are often those least well tested.
Privacy Analysis of User Association Logs in a Large-scale Wireless LAN
Dartmouth Technical Report TR2011-679
Keren Tan
Guanhua Yan
Jihwang Yeo
David Kotz
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/524/TR2011-679.pdf (300KB)
Abstract:
User association logs collected from a large-scale wireless LAN record where and
when a user has used the network. Such information plays an important role in wireless
network research. One concern of sharing these data with other researchers, however,
is that the logs pose potential privacy risks for the network users. Today, the
common practice in sanitizing these data before releasing them to the public is
to anonymize users' sensitive information, such as their devices' MAC addresses
and their exact association locations. In this work, we demonstrate that such sanitization
measures are insufficient to protect user privacy because the differences between
user association behaviors can be modeled and many are distinguishable. By simulating
an adversary's role, we propose a novel type of correlation attack in which the
adversary uses the anonymized association log to build signatures against each user,
and when combined with auxiliary information, such signatures can help to identify
users within the anonymized log. On a user association log that contains more than
four thousand users and millions of association records, we demonstrate that this
attack technique is able to pinpoint the victim's identity exactly with a probability
as high as 70%, and narrow it down to a set of 20 candidates with a probability
close to 100%. We further evaluate the effectiveness of standard anonymization techniques,
including generalization and perturbation, in mitigating this correlation attack;
our experimental results reveal only limited success of these methods, suggesting
that more thorough treatment is needed when anonymizing wireless user association
logs before public release.
Out of the Depths: Image Statistics of Space, Water, and the Minuscule World
Dartmouth Technical Report TR2011-678
Nimit S. Dhulekar
Date: January 2010
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/523/TR2011-678.pdf (727KB)
Abstract:
In images of natural scenes, a consistent relationship exists between spectral power
and spatial frequency. The power spectrum falls off with a form 1/f^p as spatial
frequency f increases, with values of p approximately equal to 2. To quantify the
extent to which this statistical characteristic is exhibited by other classes of
images, we examined astronomical, underwater, and microscale images. It was found
that this property holds for all three categories of images, although the value
of p varies in the range 1.76 to 2.37. The second statistical characteristic computed
was the angular spread of the power spectrum. This metric is a means to verify whether
the image categories investigated tend to display more power in the horizontal and
vertical orientations, akin to natural images. It was found that these image categories
have primarily isotropic spectral signatures with a much reduced anisotropy as compared
to natural images. Along similar lines, we introduce a new measure called the anisotropy
index which compares the power in the horizontal and vertical orientations with
power in oblique orientations. The statistics thus presented are for the ensemble
power spectrum. We also construct 4 classifiers to distinguish between natural images
and astronomical, microscale, and underwater images. The k-nearest neighbor classifier
with Mahalanobis distance had the best accuracy of 70.5% on the training set and
66.9% on the test set, for correctly identifying natural images. From these classifiers,
we can not only view the confusion in classification among the investigated image
categories, but also the difference in statistics as compared to natural images.
These classifiers also make it possible to verify that the images in a particular
class display statistics similar to those of the ensemble image.
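The 1/f^p relationship in this abstract implies that log power is linear in log frequency with slope -p, so p can be estimated by a least-squares line fit in log-log space. The sketch below shows only that fitting step on synthetic data; the function name and the sample values are assumptions, and it does not reproduce the report's image processing or its measured exponents.

```python
import math


def fit_power_law_exponent(freqs, powers):
    """Estimate p in power = A / f**p as the negated log-log regression slope."""
    xs = [math.log(f) for f in freqs]
    ys = [math.log(p) for p in powers]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx


# Synthetic spectrum with a known exponent of 2 (illustrative values only).
freqs = [float(f) for f in range(1, 65)]
powers = [3.0 / f ** 2.0 for f in freqs]
p_estimate = fit_power_law_exponent(freqs, powers)
```

On real images the power values would come from a radially averaged Fourier power spectrum, and the fit would be noisy rather than exact.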
A 3-D Lighting and Shadow Analysis of the JFK Zapruder Film (Frame 317)
Dartmouth Technical Report TR2010-677
Hany Farid
Date: January 2011
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/522/TR2010-677.pdf (15944KB)
Abstract:
Claims of a broader conspiracy behind U.S. President John F. Kennedy's assassination
have persisted for nearly five decades. The Zapruder film is considered
to be the most complete recording of JFK's assassination. Many have claimed that
this 8mm film was manipulated to conceal evidence of a second shooter, which would
invalidate the claim that a lone gunman, Lee Harvey Oswald, was responsible for
JFK's assassination. Here we consider the viability of one specific claim of postproduction
tampering in the Zapruder film.
Numerical methods for fMRI data analysis
Dartmouth Technical Report TR2010-676
Geethmala Sridaran
Date: January 2010
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/521/TR2010-676.pdf (1334KB)
Abstract:
Brain imaging data are increasingly analyzed via a range of machine-learning methods.
In this thesis, we discuss three specific contributions to the field of neuroimaging
analysis methods: (1) applying a recently developed technique for identifying and
viewing similarity structure in neuroimaging data, in which candidate representational
structures are ranked; (2) providing side-by-side analyses of neuroimaging data by
typical non-hierarchical (SVM) and hierarchical (Decision Tree) machine-learning
classification methods; and (3) developing a novel programming environment for PyMVPA,
a currently popular analysis toolbox, such that users can type a small
number of packaged commands to carry out a range of standard analyses. We carried
out our analysis with an fMRI data set generated using auditory stimuli. "Tree"
and "Ring" were the best-voted structural representations we obtained by applying
Kemp's algorithm. Machine-learning classification resulted in accuracy values
that were similar for both decision tree and SVM algorithms. Coding for different
sound categories primarily occurred in the temporal lobes of the brain. We discovered
a few non-temporal regions of the brain coding for these auditory sounds as well.
Graph algorithms for NMR resonance assignment and cross-link experiment planning
Dartmouth Technical Report TR2010-675
Fei Xiong
Date: January 2010
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/520/TR2010-675.pdf (2298KB)
Abstract:
The study of three-dimensional protein structures produces insights into protein
function at the molecular level. Graphs provide a natural representation of protein
structures and associated experimental data, and enable the development of graph
algorithms to analyze the structures and data. This thesis develops such graph representations
and algorithms for two novel applications: structure-based NMR resonance assignment
and disulfide cross-link experiment planning for protein fold determination. The
first application seeks to identify correspondences between spectral peaks in NMR
data and backbone atoms in a structure (from x-ray crystallography or homology modeling),
by computing correspondences between a contact graph representing the structure
and an analogous but very noisy and ambiguous graph representing the data. The assignment
then supports further NMR studies of protein dynamics and protein-ligand interactions.
A hierarchical grow-and-match algorithm was developed for smaller assignment problems,
ensuring completeness of assignment, while a random graph approach was developed
for larger problems, provably determining unique matches in polynomial time with
high probability. Test results show that our algorithms are robust to typical levels
of structural variation, noise, and missing data, and achieve very good overall assignment
accuracy. The second application aims to rapidly determine the overall organization
of secondary structure elements of a target protein by probing it with a set of
planned disulfide cross-links. A set of informative pairs of secondary structure
elements is selected from graphs representing topologies of predicted structure
models. For each pair in this ``fingerprint'', a set of informative disulfide probes
is selected from graphs representing residue proximity in the models. Information-theoretic
planning algorithms were developed to maximize information gain while minimizing
experimental complexity, and Bayes error plan assessment frameworks were developed
to characterize the probability of making correct decisions given experimental data.
Evaluation of the approach on a number of structure prediction case studies shows
that the optimized plans have low risk of error while testing only a very small
portion of the quadratic number of possible cross-link candidates.
Virtual Container Attestation: Customized trusted containers for on-demand computing.
Dartmouth Technical Report TR2010-674
Katelin A. Bailey
Date: January 2010
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/519/TR2010-674.pdf (589KB)
Abstract:
In today's computing environment, data is moving to central locations and most computers
are merely used to access the data. Today is the era of cloud computing and distributed
computing, where users have control over neither data nor computation. As this trend
continues there is an increasing frequency of mutually distrustful parties being
forced to interact and share resources with each other in potentially dangerous
situations.
Therefore, there is an urgent need for a means of creating trust between two entities,
or at the very least providing some means of determining the trust level of a given
machine.
Current approaches to the trust problem focus on various forms of isolation and
attestation, but most have high overheads or are overly rigid in the requirements
they impose on users. I propose and implement an alternative solution which provides flexible,
on-demand containers for untrusted applications, and enforcement of requested security
properties. Together these provide assurance to the remote parties that the machines
behave as required or are quickly shut down.
Block Sensitivity versus Sensitivity
Dartmouth Technical Report TR2010-673
Karn Seth
Date: January 2010
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/518/TR2010-673.pdf (253KB)
Abstract:
Sensitivity and block sensitivity are useful and well-studied measures of computational
complexity, but in spite of their similarities, the largest possible gap between
them is still unknown. Rubinstein showed that this gap must be at least quadratic,
and Kenyon and Kutin showed that it is at worst exponential, but many strongly suspect
that the gap is indeed quadratic, or at worst polynomial. Our work shows that for
a large class of functions, which includes Rubinstein's function, the quadratic
gap between sensitivity and block sensitivity is the best we can possibly do.
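For readers unfamiliar with the two measures, the brute-force sketch below computes both for small Boolean functions using the standard definitions: sensitivity counts single bits whose flip changes the output, and block sensitivity counts disjoint blocks of bits whose flip changes the output, each maximized over all inputs. The function names and the two test functions are chosen for illustration and are not taken from the report; the search is exponential in n and is practical only for very small n.

```python
from functools import lru_cache
from itertools import product


def flip(x, mask):
    """Flip the bits of tuple x selected by the integer bitmask."""
    return tuple(b ^ ((mask >> i) & 1) for i, b in enumerate(x))


def sensitivity(f, n):
    """Max over inputs of the number of single-bit flips that change f."""
    best = 0
    for x in product((0, 1), repeat=n):
        count = sum(f(flip(x, 1 << i)) != f(x) for i in range(n))
        best = max(best, count)
    return best


def block_sensitivity(f, n):
    """Max over inputs of the number of disjoint blocks whose flip changes f."""
    best = 0
    full = (1 << n) - 1
    for x in product((0, 1), repeat=n):
        # Every nonempty block (as a bitmask) whose flip changes the output.
        blocks = [m for m in range(1, full + 1) if f(flip(x, m)) != f(x)]

        @lru_cache(maxsize=None)
        def pack(avail):
            """Largest number of disjoint sensitive blocks inside `avail`."""
            if avail == 0:
                return 0
            low = avail & -avail
            result = pack(avail & ~low)  # leave the lowest bit unused
            for b in blocks:
                if (b & low) and (b & ~avail) == 0:
                    result = max(result, 1 + pack(avail & ~b))
            return result

        best = max(best, pack(full))
    return best


def or3(x):
    return int(any(x))


def maj3(x):
    return int(sum(x) >= 2)
```

For these two small functions the measures coincide; the gap discussed in the abstract appears only for specially constructed functions on more variables.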
Optimization Algorithms for Site-directed Protein Recombination Experiment Planning
Dartmouth Technical Report TR2010-672
Wei Zheng
Date: January 2010
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/517/TR2010-672.pdf (6986KB)
Abstract:
Site-directed protein recombination produces improved and novel protein variants
by recombining sequence fragments from parent proteins. The resulting hybrids accumulate
multiple mutations that have been evolutionarily accepted together. Subsequent screening
or selection identifies hybrids with desirable characteristics. In order to increase
the "hit rate" of good variants, this thesis develops experiment planning algorithms
to optimize protein recombination experiments. First, to improve the frequency of
generating novel hybrids, a metric is developed to assess the diversity among hybrids
and parent proteins. Dynamic programming algorithms are then created to optimize
the selection of breakpoint locations according to this metric. Second, the trade-off
between diversity and stability in recombination experiment planning is studied,
recognizing that diversity requires changes from parent proteins, which may also
disrupt important residue interactions necessary for protein stability. Accordingly,
methods based on dynamic programming are developed to provide combined optimization
of diversity and stability, finding optimal breakpoints such that no other experiment
plan has better performance in both aspects simultaneously. Third, in order to support
protein recombination with heterogeneous structures and focus on functionally important
regions, a general framework for protein fragment swapping is developed. Differentiating
source and target parents, and swappable regions within them, fragment swapping
enables asymmetric, selective site-directed recombination. Two applications of protein
fragment swapping are studied. In order to generate hybrids inheriting functionalities
from both source and target proteins by fragment swapping, a method based on integer
programming selects optimal swapping fragments to maximize the predicted stability
and activity of hybrids in the resulting library. In another application, human
source protein fragments are swapped into a therapeutic exogenous target protein to
minimize the occurrence of peptides that trigger immune response. A dynamic programming
method is developed to optimize fragment selection for both humanity and functionality,
resulting in therapeutically active variants with decreased immunogenicity.
The Curious Timekeeper: Creative Thesis in Interactive Sculpture
Dartmouth Technical Report TR2010-670
Kate I. Schnippering
Date: January 2010
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/516/TR2010-670.pdf (30288KB)
Abstract:
When we interact with computers, we have set expectations about our interactive experience,
operating a mouse and keyboard to elicit predictable responses on a screen. Intersecting
the world of Computing with Fine Art gains us potential to innovate outside these
bounds by restricting the expected performance of a computer--setting it to a particular
purpose rather than allowing it to run anyone's software. To challenge standard
human-computer interaction, this work set out to create an interesting and unusual
interactive experience, fully integrated into a sculpture. The approach was to design
a system to form a small environment, having many components rather than risking
everything on any one feature. To push the bounds of Fine Art interactions, the
work addresses time-control using video in a manner that painting or static sculpture
cannot achieve. The result is The Curious Timekeeper, a large sculpture of a whimsical
bird that hopes to bewilder viewers, to encourage them to consider the role of computing
in their lives and in art.
A 3-D Photo Forensic Analysis of the Lee Harvey Oswald Backyard Photo
Dartmouth Technical Report TR2010-669
Hany Farid
Date: January 2010
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/515/TR2010-669.pdf (1779KB)
Abstract:
More than forty-five years after the assassination of U.S. President Kennedy, theories
continue to circulate suggesting that the accused assassin, Lee Harvey Oswald, acted
as part of a larger conspiracy. It has been argued, for example, that incriminating
photographs of Oswald were manipulated, and hence are evidence of a broader plot. We
describe a detailed 3-D analysis of the Oswald photos to determine if such claims
of tampering are warranted.
Creating Large Disturbances in the Power Grid: Methods of Attack After Cyber Infiltration
Dartmouth Technical Report TR2010-668
Loren D. Sands-Ramshaw
Date: January 2010
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/514/TR2010-668.pdf (771KB)
Abstract:
Researchers are pursuing methods of securing the cyber aspect of the U.S. power grid,
one of the country's most critical infrastructures. An attacker who is able to infiltrate
an Energy Management System (EMS) can instruct elements of the grid to function
improperly or can skew the state information received by the control programs or
operators. In addition, a cyber attack can combine multiple attacks and affect many
physical locations at once. A study of the possible adverse effects an attack could
generate can underline the urgency of improving grid security, contribute to a roadmap
and priority list for security researchers, and advise on how defending against
cyber attacks can differ from defending against point failures and physical attacks.
In this paper I discuss the physical and cyber systems that compose the power grid,
and I explore ways in which a compromise of the cyber system can affect the physical
system, with a particular emphasis on the best means of creating large disturbances.
Further, I consider ways in which cyber attacks differ from physical attacks.
A Note on Randomized Streaming Space Bounds for the Longest Increasing Subsequence Problem
Dartmouth Technical Report TR2010-667
Amit Chakrabarti
Date: January 2010
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/513/TR2010-667.pdf (98KB)
Abstract:
The deterministic space complexity of approximating the length of the longest increasing
subsequence of a stream of N integers is known to be Theta~(sqrt N). However, the
randomized complexity is wide open. We show that the technique used in earlier work
to establish the Omega(sqrt N) deterministic lower bound fails strongly under randomization:
specifically, we show that the communication problems on which the lower bound is
based have very efficient randomized protocols. The purpose of this note is to guide
and alert future researchers working on this very interesting problem.
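As background for the problem itself, the sketch below computes the exact length of the longest strictly increasing subsequence in one pass using the well-known patience-sorting technique. It is a baseline, not the approximation setting studied in the note: its working space grows with the length of the answer, which can be linear in N, whereas the bounds in the abstract concern approximation in roughly sqrt N space. The function name is chosen for illustration.

```python
from bisect import bisect_left


def lis_length(stream):
    """One-pass exact LIS length; tails[k] is the smallest possible last
    element of a strictly increasing subsequence of length k + 1."""
    tails = []
    for value in stream:
        pos = bisect_left(tails, value)
        if pos == len(tails):
            tails.append(value)   # extends the longest subsequence so far
        else:
            tails[pos] = value    # found a smaller tail for length pos + 1
    return len(tails)


example_length = lis_length([3, 1, 4, 1, 5, 9, 2, 6])
```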
NeuroPhone: Brain-Mobile Phone Interface using a Wireless EEG Headset
Dartmouth Technical Report TR2010-666
Matthew K. Mukerjee
Date: January 2010
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/512/TR2010-666.pdf (4675KB)
Abstract:
Neural signals are everywhere, just like mobile phones. We propose to use neural signals
to control mobile phones for hands-free, silent and effortless human-mobile interaction.
Until recently, devices for detecting neural signals have been costly, bulky and
fragile. We present the design, implementation and evaluation of the NeuroPhone
system, which allows neural signals to drive mobile phone applications on the iPhone
using cheap off-the-shelf wireless electroencephalography (EEG) headsets. We demonstrate
a mind-controlled address book dialing app, which works on similar principles to
P300-speller brain-computer interfaces: the phone flashes a sequence of photos of
contacts from the address book and a P300 brain potential is elicited when the flashed
photo matches the person whom the user wishes to dial. EEG signals from the headset
are transmitted wirelessly to an iPhone, which natively runs a lightweight classifier
to discriminate P300 signals from noise. When a person's contact-photo triggers
a P300, his/her phone number is automatically dialed. NeuroPhone breaks new ground
as a brain-mobile phone interface for ubiquitous pervasive computing. We discuss
the challenges in making our initial prototype more practical, robust, and reliable
as part of our on-going research.
Predictive YASIR: High Security with Lower Latency in Legacy SCADA
Dartmouth Technical Report TR2010-665
Rouslan V. Solomakhin
Date: January 2010
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/511/TR2010-665.pdf (289KB)
Abstract:
Message authentication with low latency is necessary to ensure secure operations
in legacy industrial control networks, such as power grid networks. Previous authentication
solutions by our lab and others looked at single messages and incurred noticeable
latency. To reduce this latency, we develop Predictive YASIR, a bump-in-the-wire
device that looks at broader patterns of messages. The device (1) predicts the incoming
plaintext based on previous observations; (2) compresses, encrypts, and authenticates
data online; and (3) pre-sends a part of ciphertext before receiving the whole plaintext.
I demonstrate the performance properties of this approach by implementing it in
the Scalable Simulation Framework and testing it on the Modbus/ASCII protocol, which
is widely used in the power grid, oil and gas, manufacturing, and water treatment
control networks. By looking at broader message patterns and using predictive analysis,
my results demonstrate a 15.48 +/- 0.35% improvement in latency over the previous
most efficient solution. The simulation code is available from http://www.cs.dartmouth.edu/~pyasir/.
Flexible Object Manipulation
Dartmouth Technical Report TR2010-663
Matthew P. Bell
Date: January 2010
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/510/TR2010-663.pdf (2212KB)
Abstract:
Flexible objects are a challenge to manipulate. Their motions are hard to predict,
and the high number of degrees of freedom makes sensing, control, and planning difficult.
Additionally, they have more complex friction and contact issues than rigid bodies,
and they may stretch and compress. In this thesis, I explore two major types of
flexible materials: cloth and string. For rigid bodies, one of the most basic problems
in manipulation is the development of immobilizing grasps. The same problem exists
for flexible objects. I have shown that a simple polygonal piece of cloth can be
fully immobilized by grasping all convex vertices and no more than one third of
the concave vertices. I also explored simple manipulation methods that make use
of gravity to reduce the number of fingers necessary for grasping. I have built
a system for folding a T-shirt using a 4 DOF arm and a fixed-length iron bar which
simulates two fingers. The main goal with string manipulation has been to tie knots
without the use of any sensing. I have developed single-piece fixtures capable of
tying knots in fishing line, solder, and wire, along with a more complex track-based
system for autonomously tying a knot in steel wire. I have also developed a series
of different fixtures that use compressed air to tie knots in string. Additionally,
I have designed four-piece fixtures, which demonstrate a way to fully enclose a
knot during the insertion process, while guaranteeing that extraction will always
succeed.
Constant RMR Solutions to Reader Writer Synchronization
Dartmouth Technical Report TR2010-662
Vibhor Bhatt
Prasad Jayanti
Date: January 2010
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/509/TR2010-662.pdf (234KB)
Abstract:
We study Reader-Writer Exclusion, a well-known variant of the Mutual Exclusion problem
where processes are divided into two classes--readers and writers--and multiple
readers can be in the Critical Section (CS) at the same time, although no process
may be in the CS at the same time as a writer. Since readers don't conflict with
each other, they should not obstruct each other. Specifically, the concurrent entering
property must be satisfied: if all writers are in the remainder section, each reader
should be able to enter the CS in a bounded number of its own steps. Three versions
of the Reader-Writer Exclusion problem are commonly studied--one where writers have
priority over readers, another where readers have priority, and the last where neither
class has priority over the other and no process may starve.
To ensure high performance on Cache-Coherent (CC) and Distributed Shared Memory
(DSM) multiprocessors, algorithms should be designed to generate as few remote memory
references (RMRs) as possible. The ideal would be to achieve constant RMR complexity,
i.e., the worst case number of RMRs that a process generates in order to enter and
exit the CS once is a constant, independent of the number of processes.
Constant RMR complexity algorithms have existed for Mutual Exclusion for two decades,
but none exists for Reader-Writer Exclusion. Danek and Hadzilacos' lower bound proof
implies that it is impossible to achieve sublinear RMR complexity for DSM machines.
For CC machines, the best existing bound, also due to Danek and Hadzilacos, is
O(log n), where n is the number of processes. In this work, we present the first
constant RMR complexity algorithms for all three versions of the Reader-Writer Exclusion
problem (for CC machines).
On the Reliability of Wireless Fingerprinting using Clock Skews
Dartmouth Technical Report TR2010-661
Chrisil Arackaparambil
Sergey Bratus
Anna Shubina
David Kotz
Date: January 2010
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/508/TR2010-661.pdf (225KB)
Abstract:
Determining whether a client station should trust an access point is a known problem
in wireless security. Traditional approaches to solving this problem resort to cryptography.
But cryptographic exchange protocols are complex and therefore induce potential
vulnerabilities in themselves. We show that measurement of clock skews of access
points in an 802.11 network can be useful in this regard, since it provides fingerprints
of the devices. Such fingerprints can be used to establish the first point of trust
for client stations wishing to connect to an access point. Fingerprinting can also
be used in the detection of fake access points.
We demonstrate deficiencies of previously studied methods that measure clock skews
in 802.11 networks by means of an attack that spoofs clock skews. We then provide
means to overcome those deficiencies, thereby improving the reliability of fingerprinting.
Finally, we show how to perform the clock-skew arithmetic that enables network providers
to publish clock skews of their access points for use by clients.
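To make the idea of a clock-skew fingerprint concrete, here is a minimal sketch that estimates the skew of an access point's clock relative to the observer's clock from pairs of (local receive time, beacon timestamp). The ordinary least-squares fit and the function name are assumptions chosen for illustration; the abstract does not say which estimator the report uses.

```python
from typing import Sequence


def estimate_clock_skew_ppm(
    local_times: Sequence[float], beacon_timestamps: Sequence[float]
) -> float:
    """Estimate relative clock skew in parts per million.

    The offset between the sender's clock and the observer's clock is
    fitted as a linear function of elapsed local time. The slope of
    that line is the skew. Both inputs are in seconds.
    """
    t0, b0 = local_times[0], beacon_timestamps[0]
    xs = [t - t0 for t in local_times]
    offsets = [(b - b0) - x for b, x in zip(beacon_timestamps, xs)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_o = sum(offsets) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (o - mean_o) for x, o in zip(xs, offsets))
    return (sxy / sxx) * 1e6
```

A least-squares slope is sensitive to outliers such as delayed beacons, so a real measurement tool would likely need a more robust fit.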
AnonyTL Specification
Dartmouth Technical Report TR2010-660
Daniel Peebles
Cory Cornelius
Apu Kapadia
David Kotz
Minho Shin
Nikos Triandopoulos
Date: January 2010
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/507/TR2010-660.pdf (71KB)
Abstract:
We provide a specification of AnonyTL, a domain-specific language that describes
sensing tasks for mobile devices in a manner that facilitates automated reasoning
about privacy.
Hardware-Assisted Secure Computation
Dartmouth Technical Report TR2009-659
Alexander Iliev
Date: January 2009
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/506/TR2009-659.pdf (1659KB)
Abstract:
The theory community has worked on Secure Multiparty Computation (SMC) for more than
two decades, and has produced many protocols for many settings. One common thread
in these works is that the protocols cannot use a Trusted Third Party (TTP), even
though this is conceptually the simplest and most general solution. Thus, current
protocols involve only the direct players---we call such protocols self-reliant.
They often use blinded boolean circuits, which have several sources of overhead,
some due to the circuit representation and some due to the blinding.
However, secure coprocessors like the IBM 4758 have actual security properties
similar to ideal TTPs. They also have little RAM and a slow CPU. We call such devices
Tiny TTPs. The availability of real tiny TTPs opens the door for a different approach
to SMC problems. One major challenge with this approach is how to execute large
programs on large inputs using the small protected memory of a tiny TTP, while preserving
the trust properties that an ideal TTP provides. In this thesis we have investigated
the use of real TTPs to help with the solution of SMC problems.
We start with the use of such TTPs to solve the Private Information Retrieval (PIR)
problem, which is one important instance of SMC. Our implementation utilizes a 4758.
The rest of the thesis is targeted at general SMC. Our SMC system, Faerieplay, moves
some functionality into a tiny TTP, and thus avoids the blinded circuit overhead.
Faerieplay consists of a compiler from high-level code to an arithmetic circuit
with special gates for efficient indirect array access, and a virtual machine to
execute this circuit on a tiny TTP while maintaining the typical SMC trust properties.
We report on Faerieplay's security properties, the specification of its components,
and our implementation and experiments. These include comparisons with the Fairplay
circuit-based two-party system, and an implementation of the Dijkstra graph shortest
path algorithm. We also provide an implementation of an oblivious RAM which supports
similar tiny TTP-based SMC functionality but using a standard RAM program. Performance
comparisons show Faerieplay's circuit approach to be considerably faster, at the
expense of a more constrained programming environment when targeting a circuit.
User survey regarding the needs of network researchers in trace-anonymization tools
Dartmouth Technical Report TR2009-658
Jihwang Yeo
Keren Tan
David Kotz
Date: January 2009
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/505/TR2009-658.pdf (398KB)
Abstract:
To understand the needs of network researchers in an anonymization tool, we conducted
a survey of network researchers. We invited researchers worldwide to take the survey
by sending invitation emails to well-known mailing lists whose subscribers
may be interested in network research that involves collecting, sharing, and sanitizing
network traces.
Katana: A Hot Patching Framework for ELF Executables
Dartmouth Technical Report TR2009-657
Ashwin Ramaswamy
Sergey Bratus
Michael E. Locasto
Sean W. Smith
Date: January 2009
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/504/TR2009-657.pdf (197KB)
Abstract:
Despite advances in software modularity, security, and reliability, offline patching
remains the predominant form of updating or protecting commodity software. Unfortunately,
the mechanics of hot patching (the process of upgrading a program while it executes)
remain understudied, even though such a capability offers practical benefits for
both consumer and mission-critical systems.
A reliable hot patching procedure would serve particularly well by reducing the
downtime necessary for critical functionality or security upgrades. Yet, hot patching
also carries the risk -- real or perceived -- of leaving the system in an inconsistent
state, which leads many owners to forgo its benefits as too risky.
In this paper, we propose a novel method for hot patching ELF binaries that supports
(a) synchronized global data and code updates and (b) reasoning about the results
of applying the hot patch. We propose a format, which we call a Patch Object, for
encoding patches as a special type of ELF relocatable object file. Our tool, Katana,
automatically creates these patch objects as a by-product of the standard source
build process. Katana also allows an end-user to apply the Patch Objects to a running
process. In essence, our method can be viewed as an extension of the Application
Binary Interface (ABI), and we argue for its inclusion in future ABI standards.
Detecting Photographic Composites of Famous People
Dartmouth Technical Report TR2009-656
Eric Kee
Hany Farid
Date: January 2010
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/503/TR2009-656.pdf (1945KB)
Abstract:
Photos are commonly falsified by compositing two or more people into a single image.
We describe how such composites can be detected by estimating a camera's intrinsic
parameters. Differences in these parameters across the image are then used as evidence
of tampering. Expanding on earlier work, this approach is more applicable to low-resolution
images, but requires, for each person in the photo, a reference image in which they
directly face the camera. When considering composites of famous people, such a reference
photo is easily obtained from an on-line image search.
Activity-Aware Electrocardiogram-based Passive Ongoing Biometric Verification
Dartmouth Technical Report TR2009-655
Janani C. Sriram
Date: January 2009
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/502/TR2009-655.pdf (1823KB)
Abstract:
Identity fraud due to lost, stolen or shared information or tokens that represent
an individual's identity is a growing security concern. Biometric recognition,
the identification or verification of claimed identity, shows great potential
in bridging some of the existing security gaps. It has been shown that the human
Electrocardiogram (ECG) exhibits sufficiently unique patterns for use in biometric
recognition. But it also exhibits significant variability due to stress or activity,
and signal artifacts due to movement. In this thesis, we develop a novel activity-aware
ECG-based biometric recognition scheme that can verify/identify under different
activity conditions. From a pattern recognition standpoint, we develop algorithms
for preprocessing, feature extraction and probabilistic classification. We pay particular
attention to the applicability of the proposed scheme in ongoing biometric verification
of claimed identity. Finally we propose a wearable prototype architecture of our
scheme.
Semantic and Visual Encoding of Diagrams
Dartmouth Technical Report TR2009-654
Gabriel A. Weaver
Date: January 2009
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/501/TR2009-654.pdf (342KB)
Abstract:
Constructed geometric diagrams capture a dynamic relationship between text and image
that played a central role in ancient science and mathematics. Euclid, Theodosius,
Ptolemy, Archimedes and others constructed diagrams to geometrically model optics,
astronomy, cartography, and hydrostatics. Each derived geometric properties from
their models and interpreted their results with respect to the model's underlying
semantics. Although diagram construction is a dynamic process, the media in which
these works were published (manuscripts and books) forced scholars to either view
a snapshot of that process (a static image) or manually perform the entire construction.
Mainstream approaches to digitization represent constructed diagrams as they appear
in print, as static images. Such representations fail to capture the dynamic nature
of constructed diagrams and so we designed and implemented a computational framework
for dynamically interacting with them. Our architecture for representing, retrieving,
and interacting with diagrams has already been used to produce a publicly-available,
archival-quality digital corpus of diagrams for the Archimedes Palimpsest Project,
establishing our approach's viability in the real world. After using our system
to study diagrams in Archimedes, we discuss the generality of our approach and its
application to other domains including circuit design, software engineering, and
patent databases.
Distributed Monitoring of Conditional Entropy for Network Anomaly Detection
Dartmouth Technical Report TR2009-653
Chrisil Arackaparambil
Sergey Bratus
Joshua Brody
Anna Shubina
Date: January 2009
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/500/TR2009-653.pdf (492KB)
Abstract:
Monitoring the empirical Shannon entropy of a feature in a network packet stream
has previously been shown to be useful in detecting anomalies in the network traffic.
Entropy is an information-theoretic statistic that measures the variability of the
feature under consideration. Anomalous activity in network traffic can be captured
by detecting changes in this variability.
There are several challenges, however, in monitoring this statistic. Computing
the statistic efficiently is non-trivial. Further, when monitoring multiple features,
the streaming algorithms proposed previously would likely fail to keep up with the
ever-increasing channel bandwidth of network traffic streams. There is also the
concern that an adversary could attempt to mask the effect of his attacks on variability
by a mimicry attack disguising his traffic to mimic the distribution of normal traffic
in the network, thus avoiding detection by an entropy monitoring sensor. Also, the
high rate of false positives is a big problem with Intrusion Detection Systems,
and the case of entropy monitoring is no different.
In this work we propose a way to address the above challenges. First, we leverage
recent progress in sketching algorithms to develop a distributed approach for computing
entropic statistics accurately, at reasonable memory costs. Secondly, we propose
monitoring not only regular entropy, but the related statistic of conditional entropy,
as a more reliable measure in detecting anomalies. Lastly, we implement our approach
and evaluate it with real data collected at the link layer of an 802.11 wireless
network. To our knowledge, this is the first time entropy-based approaches have
been considered for this kind of traffic.
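The two statistics named in this abstract can be illustrated with an exact, in-memory computation. This is only a sketch of the definitions: it is not the distributed sketching algorithm the report describes, and the function names are hypothetical. Conditional entropy is computed from the identity H(Y | X) = H(X, Y) - H(X).

```python
import math
from collections import Counter
from typing import Hashable, Iterable, Tuple


def empirical_entropy(values: Iterable[Hashable]) -> float:
    """Empirical Shannon entropy (in bits) of the observed values."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conditional_entropy(pairs: Iterable[Tuple[Hashable, Hashable]]) -> float:
    """Empirical H(Y | X) for (x, y) observations, via H(X, Y) - H(X)."""
    pairs = list(pairs)
    return empirical_entropy(pairs) - empirical_entropy(x for x, _ in pairs)
```

For example, `x` could be a source address and `y` a destination address taken from each frame. A low H(Y | X) then means that knowing the source largely determines the destination.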
The Effects of Introspection on Computer Security Policies
Dartmouth Technical Report TR2009-652
Stephanie A. Trudeau
Date: January 2009
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/499/TR2009-652.pdf (40880KB)
Abstract:
What does it mean to be an expert? And what makes an expert more capable than a non-expert
when it comes to evaluating and articulating their impressions about something as
commonly practiced as food tasting? How do we explain those behaviors that humans
perform very well, but don't quite know why? Studies have shown that there exists
a class of activities that we as humans execute well intuitively, but that we perform
much worse upon introspection. Evidence supports the claim that the act of introspection
actually causes us to do more poorly at these tasks.
My goal is to apply this idea to computer security. At present, designs for most
security policy interfaces leave much to be desired. This lack of usability leaves
these systems in need of improvement, possibly causing users to become more vulnerable
than they otherwise would be. My research includes a user study on the privacy
policies of the interface for a social networking website similar to Facebook. Evidence
from the study supports the claim that the act of introspecting upon one's personal
security policy actually makes one worse at making policy decisions.
Developing an Improved, Web-Based Classroom Response System with Web Services
Dartmouth Technical Report TR2009-651
Oleg B. Seletsky
Date: January 2009
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/498/TR2009-651.pdf (1203KB)
Abstract:
Classroom Response Systems (CRS) are an in-class technology used to poll students
and instantly display an aggregate representation of their responses. CRS have been
around since the 1970s and have become increasingly popular in higher education
lecture halls. Even though technology, specifically computers and communications,
has improved significantly since the 1970s, CRS have remained surprisingly unchanged.
The purpose of this project was to develop an innovative web-based CRS using web
services. The web-based aspect utilizes Dartmouth's wireless campus while the web
services back-end makes the product more extensible. Lastly, we added a set of out-of-class
learning tools for students as well as an in-class tool called the Confusion Meter
to enhance student-to-instructor communication. With these features, our goal was
to create a free, open-source system that enhances the teaching and learning experience
and remains extensible and developer-friendly, unlike any commercial CRS currently
available.
A Computational Framework for Certificate Policy Operations
Dartmouth Technical Report TR2009-650
Gabriel A. Weaver
Scott Rea
Sean W. Smith
Date: January 2009
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/497/TR2009-650.pdf (265KB)
Abstract:
The trustworthiness of any Public Key Infrastructure (PKI) rests upon the expectations
for trust, and the degree to which those expectations are met. Policies, whether
implicit as in PGP and SDSI/SPKI or explicitly required as in X.509, document expectations
for trust in a PKI. The widespread use of X.509 in the context of global e-Science
infrastructures, financial institutions, and the U.S. Federal government demands
efficient, transparent, and reproducible policy decisions. Since current manual
processes fall short of these goals, we designed, built, and tested computational
tools to process the citation schemes of X.509 certificate policies defined in RFC
2527 and RFC 3647. Our PKI Policy Repository, PolicyBuilder, and PolicyReporter
improve the consistency of certificate policy operations as actually practiced in
compliance audits, grid accreditation, and policy mapping for bridging PKIs. Anecdotal
and experimental evaluation of our tools on real-world tasks establishes their actual
utility and suggests how machine-actionable policy might empower individuals to
make informed trust decisions in the future.
Applying Domain Knowledge from Structured Citation Formats to Text and Data Mining: Examples Using the CITE Architecture
Dartmouth Technical Report TR2009-649
D. Neel Smith
Gabriel A. Weaver
Date: January 2009
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/496/TR2009-649.pdf (128KB)
Abstract:
Domain knowledge expressed in structured citation formats can be exploited in data
mining. We propose four structural properties of canonically cited texts, then look
at two classic problems in the study of the scholia, or ancient scholarly commentary,
found in the manuscripts of the Iliad. We cluster citations of scholia to analyze
their distribution in different manuscripts; this leads to a revised view of how
the manuscripts' scribes drew on their source material. Correlated frequencies of
named entities suggest that one group of manuscripts had access to material more
closely based on the work of the greatest Hellenistic editor of Homer, Aristarchus
of Samothrace.
Surface Reconstruction through Time
Dartmouth Technical Report TR2009-648
LeeAnn T. Brash
Date: January 2009
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/495/TR2009-648.pdf (731KB)
Abstract:
Surface reconstruction is an area of computational geometry that has been progressing
rapidly over the last decade. Current algorithms and their implementations can reconstruct
surfaces from a variety of inputs, and the accuracy and precision improve with each
new development. These all make use of various heuristics to achieve a reconstruction.
Much of this work consists of reconstructing a still object from point samples taken
from the object's surface.
We examine reconstructing an n-dimensional object and its motion by treating time
as an (n + 1)st axis. Our input consists of (n-1)-dimensional scans taken over time
and at different positions on the original object. This input is mapped into (n +
1) dimensions where the (n + 1)st dimension is a scaled time axis and then fed into
an existing surface reconstruction algorithm. A cross section of the reconstructed
surface perpendicular to the time axis yields an approximation to the shape of the
n-dimensional surface at the corresponding point in time.
The intended application for this work is the reconstruction of medical images
from scanning technology such as MRI or CT into moving 3d surfaces. We investigate
reconstructing 2d moving surfaces through time as a preliminary step towards the
moving 3d problem.
We spend most of our efforts in this thesis on the problem of computing a scaling
factor for mapping time into the (n + 1)st axis to minimize the number of scans
needed to meet the sampling requirements for an existing surface reconstruction
algorithm. We give three bounds, based on features of the 2d moving object, that
are necessary to accomplish this.
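The mapping of time onto an extra axis can be sketched for the 2d moving case discussed above. This is an illustration under stated assumptions: the function name, the input layout (a list of timestamped scans, each a list of 2d sample points), and the `time_scale` parameter are hypothetical, and choosing a good scaling factor is the problem the thesis studies, not something this sketch solves.

```python
from typing import Iterable, List, Sequence, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


def lift_scans(
    scans: Iterable[Tuple[float, Sequence[Point2D]]], time_scale: float
) -> List[Point3D]:
    """Map timestamped 2d samples to 3d points (x, y, time_scale * t).

    The resulting point cloud could then be passed to an existing
    surface reconstruction algorithm. That step is not shown here.
    """
    return [
        (x, y, time_scale * t)
        for t, points in scans
        for (x, y) in points
    ]
```

A cross section of the reconstructed surface at height `time_scale * t` would then approximate the object's shape at time `t`.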
Hawk: 3D Gestured-Based Interactive Bird Flight Simulation
Dartmouth Technical Report TR2009-647
Thomas Y. Eastman
Date: January 2009
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/494/TR2009-647.pdf (5038KB)
Abstract:
Control interfaces provide the most tangible connection between human users and computer
software. This link is especially important in interactive real-time applications,
like games and simulations, because users desire efficient controls that allow them
to maximize their interactivity and immersion with the software. Traditionally,
interfaces have been largely limited to keyboards and mice. Recently, however, technological
advances have not only made motion-sensitive devices available to mainstream consumers
but have also lifted restrictions limiting devices to two-dimensional motion. This
work presents a 3-dimensional motion-sensitive interface alongside a natural application.
Players can control a soaring red-tailed hawk and perform various intuitive flight
maneuvers using two Nintendo Wii Remotes (Wiimotes).
An Information Complexity Approach to the Inner Product Problem
Dartmouth Technical Report TR2009-646
Amit Chakrabarti
William B. Henderson-Frost
Ranganath Kondapally
Date: January 2009
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/493/TR2009-646.pdf (254KB)
Abstract:
We prove a lower bound on the randomized communication complexity of the inner product
function on the uniform distribution.
Automated Tracking of Dividing Nuclei in Microscopy Videos of Living Cells
Dartmouth Technical Report TR2009-645
Evan L. Tice
Date: January 2009
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/492/TR2009-645.pdf (339KB)
Abstract:
Many cell biologists perform analysis of multinucleated cell data in order to better
understand the mechanisms that regulate cell division. Sbalzarini et al. have
developed methods for automatically tracking nuclei in cell data in order to aid
in this time-consuming analysis. In this paper, we present an implementation of
the Sbalzarini tracking algorithm, introduce a new algorithm we developed which
is able to identify mitosis events, and present other software tools we have developed
to aid in the automated detection of nucleus data.
Autoscopy: Detecting Pattern-Searching Rootkits via Control Flow Tracing
Dartmouth Technical Report TR2009-644
Ashwin Ramaswamy
Date: January 2009
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/491/TR2009-644.pdf (735KB)
Abstract:
Traditional approaches to rootkit detection assume the execution of code at a privilege
level below that of the operating system kernel, with the use of virtual machine
technologies to enable the detection system itself to be immune from the virus or
rootkit code. In this thesis, we approach the problem of rootkit detection from
the standpoint of tracing and instrumentation techniques, which work from within
the kernel and also modify the kernel's run-time state to detect aberrant control
flows. We wish to investigate the role of emerging tracing frameworks (Kprobes,
DTrace etc.) in enforcing operating system security without the reliance on a full-blown
virtual machine just for the purposes of such policing. We first build a novel rootkit
prototype that uses pattern-searching techniques to hijack hooks embedded in dynamically
allocated memory, which we present as a showcase of emerging attack techniques.
We then build an intrusion detection system, Autoscopy, atop Kprobes, that detects
anomalous control flow patterns typically exhibited by rootkits within a running
kernel. Furthermore, to validate our approach, we show that we were able to successfully
detect 15 existing Linux rootkits. We also conduct performance analyses, which show
the overhead of our system to range from 2% to 5% on a wide range of standard benchmarks.
Thus by leveraging tracing frameworks within operating systems, we show that it
is possible to introduce real-world security in devices where performance and resource
constraints are as important as security considerations.
Dynamic Universal Accumulators for DDH Groups and Their Application to Attribute-Based Anonymous Credential Systems
Dartmouth Technical Report TR2009-643
Man Ho Au
Patrick P. Tsang
Willy Susilo
Yi Mu
Date: January 2009
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/490/TR2009-643.pdf (367KB)
Abstract:
We present the first dynamic universal accumulator that allows (1) the accumulation
of elements in a DDH-hard group G and (2) one who knows x such that y=g^x has ---
or has not --- been accumulated, where g generates G, to efficiently prove her knowledge
of such x in zero knowledge, and hence without revealing, e.g., x or y.
We introduce the Attribute-Based Anonymous Credential System (ABACS), which allows
the verifier to authenticate anonymous users according to any access control policy
expressible as a formula of possibly negated boolean user attributes. We construct
the system from our accumulator.
Approximability of the Unsplittable Flow Problem on Trees
Dartmouth Technical Report TR2009-642
Chrisil Arackaparambil
Amit Chakrabarti
Chien-Chung Huang
Date: January 2009
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/489/TR2009-642.pdf (156KB)
Abstract:
We consider the approximability of the Unsplittable Flow Problem (UFP) on tree graphs,
and give a deterministic quasi-polynomial time approximation scheme for the problem
when the number of leaves in the tree graph is at most poly-logarithmic in $n$ (the
number of demands), and when all edge capacities and resource requirements are suitably
bounded. Our algorithm generalizes a recent technique that obtained the first such
approximation scheme for line graphs. Our results show that the problem is not APX-hard
for such graphs unless $NP \subseteq DTIME(2^{polylog(n)})$. Further, a reduction from
the Demand Matching Problem shows that UFP is APX-hard when the number of leaves
is $\Omega(n^\epsilon)$ for any constant $\epsilon > 0$.
Together, the two results give a nearly tight characterization of the approximability
of the problem on tree graphs in terms of the number of leaves, and show the structure
of the graph that results in hardness of approximation.
A Combined Routing Method for Ad Hoc Wireless Networks
Dartmouth Technical Report TR2009-641
Soumendra Nanda
Zhenhui Jiang
David Kotz
Date: January 2009
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/488/TR2009-641.pdf (1906KB)
Abstract:
Several simulation and real world studies show that certain ad hoc routing protocols
perform better than others under specific mobility and traffic patterns. In order
to exploit this phenomenon, we propose a novel approach to adapt a network to changing
conditions; we introduce "a combined routing method" that allows the network to
seamlessly swap from one routing protocol to another protocol dynamically, while
routing continues uninterrupted. By creating a thin new virtual layer, we enable
the nodes in the ad hoc wireless network to notify each other about the protocol swap
and we do not make any changes to existing routing protocols. To ensure that routing
works efficiently after the protocol swap, we reuse information from the previous
protocol's routing table while initializing the data structures for the new routing
protocol. We study the feasibility of our technique and the overheads incurred while
swapping between AODV, ODMRP and APRL under different network topologies and traffic
patterns through detailed simulations. Our results show that the swap latency is
related to the nature of the destination protocol and the topology of the network.
We also find that the control packet ratio of a routing protocol during and after
a swap is close to that of the protocol running before a swap, thus indicating that
our approach does not add excessive overhead.
Authenticated Streamwise On-line Encryption
Dartmouth Technical Report TR2009-640
Patrick P. Tsang
Rouslan V. Solomakhin
Sean W. Smith
Date: January 2009
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/487/TR2009-640.pdf (343KB)
Abstract:
In Blockwise On-line Encryption, encryption and decryption return an output block
as soon as the next input block is received. In this paper, we introduce Authenticated
Streamwise On-line Encryption (ASOE), which operates on plaintexts and ciphertexts
as streams of arbitrary length (as opposed to fixed-sized blocks), and thus significantly
reduces message expansion and end-to-end latency. Also, ASOE provides data authenticity
as an option. ASOE can therefore be used to efficiently secure resource-constrained
communications with real-time requirements such as those in the electric power grid
and wireless sensor networks.
We investigate and formalize ASOE's strongest achievable notion of security, and
present a construction that is secure under that notion. An instantiation of our
construction incurs zero end-to-end latency due to buffering and only 48 bytes of
message expansion, regardless of the plaintext size.
Functional Monitoring Without Monotonicity
Dartmouth Technical Report TR2008-639
Chrisil Arackaparambil
Joshua Brody
Amit Chakrabarti
Date: January 2008
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/486/TR2008-639.pdf (149KB)
Abstract:
The notion of distributed functional monitoring was recently introduced by Cormode,
Muthukrishnan and Yi to initiate a formal study of the communication cost of certain
fundamental problems arising in distributed systems, especially sensor networks.
In this model, each of k sites reads a stream of tokens and is in communication
with a central coordinator, who wishes to continuously monitor some function f of
$\sigma$, the union of the k streams. The goal is to minimize the number of bits communicated
by a protocol that correctly monitors $f(\sigma)$, to within some small error. As in
previous work, we focus on a threshold version of the problem, where the coordinator's
task is simply to maintain a single output bit, which is 0 whenever $f(\sigma) \leq
\tau(1 - \epsilon)$ and 1 whenever $f(\sigma) \geq \tau$. Following Cormode et al., we term
this the $(k, f, \tau, \epsilon)$ functional monitoring problem.
In previous work, some upper and lower bounds were obtained for this problem, with
f being a frequency moment function, e.g., F_0, F_1, F_2. Importantly, these functions
are monotone. Here, we further advance the study of such problems, proving three
new classes of results. First, we prove new lower bounds on this problem when f
= F_p, for several values of p. Second, we study the effect of non-monotonicity
of f on our ability to give nontrivial monitoring protocols, by considering f =
F_p with deletions allowed, as well as f = H, the empirical Shannon entropy of a
stream. Third, we provide nontrivial monitoring protocols when f is either H, or
any of a related class of entropy functions (Tsallis entropies). These are the first
nontrivial algorithms for distributed monitoring of non-monotone functions.
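The threshold condition in this abstract can be stated as a small correctness check. The sketch below is not a monitoring protocol and says nothing about communication cost. It only encodes which output bits are acceptable for a given value of f, using F_1 (the total number of tokens across the k streams) as the example function. The function names are hypothetical.

```python
from typing import Sequence, Set


def allowed_outputs(f_value: float, tau: float, epsilon: float) -> Set[int]:
    """Output bits a correct coordinator may hold for this value of f.

    The bit must be 1 when f >= tau and 0 when f <= tau * (1 - epsilon).
    In the gap between the two thresholds either bit is acceptable.
    """
    if f_value >= tau:
        return {1}
    if f_value <= tau * (1 - epsilon):
        return {0}
    return {0, 1}


def f1(streams: Sequence[Sequence[object]]) -> int:
    """F_1 of the union of the streams: the total number of tokens."""
    return sum(len(stream) for stream in streams)
```

The gap of width tau * epsilon is what gives a protocol room to save communication: sites need not report every token while f stays clearly below the threshold.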
Digital Image Ballistics from JPEG Quantization: A Followup Study
Dartmouth Technical Report TR2008-638
Hany Farid
Date: January 2008
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/485/TR2008-638.pdf (258KB)
Abstract:
The lossy JPEG compression scheme employs a quantization table that controls the
amount of compression achieved. Because different cameras typically employ different
tables, a comparison of an image's quantization scheme to a database of known cameras
affords a simple technique for confirming or denying an image's source. This report
describes the analysis of quantization tables extracted from 1,000,000 images downloaded
from Flickr.com.
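The comparison described in this abstract amounts to a lookup of an image's quantization table in a database of known tables. A minimal sketch follows; the camera names and table values are hypothetical placeholders, not data from the report, and extracting the table from a JPEG file is left out.

```python
def identify_sources(image_table, known_tables):
    """Return the sorted names of known cameras whose table equals the image's table."""
    target = tuple(image_table)
    return sorted(name for name, table in known_tables.items() if table == target)

# Hypothetical database: each entry is a 64-value quantization table.
known_tables = {
    "camera_A": tuple(range(1, 65)),
    "camera_B": tuple(2 * v for v in range(1, 65)),
    "camera_C": tuple(range(1, 65)),  # shares a table with camera_A
}
```

An empty result denies every camera in the database as the source, while a match only narrows the candidates, since several cameras may share a table.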
Nymble: Blocking Misbehaving Users in Anonymizing Networks
Dartmouth Technical Report TR2008-637
Patrick P. Tsang
Apu Kapadia
Cory Cornelius
Sean W. Smith
Date: January 2008
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/484/TR2008-637.pdf (723KB)
Abstract:
Anonymizing networks such as Tor allow users to access Internet services privately
by using a series of routers to hide the client's IP address from the server. The
success of such networks, however, has been limited by users employing this anonymity
for abusive purposes such as defacing popular websites. Website administrators routinely
rely on IP-address blocking for disabling access to misbehaving users, but blocking
IP addresses is not practical if the abuser routes through an anonymizing network.
As a result, administrators block all known exit nodes of anonymizing networks,
denying anonymous access to misbehaving and behaving users alike. To address this
problem, we present Nymble, a system in which servers can ``blacklist'' misbehaving
users, thereby blocking users without compromising their anonymity. Our system
is thus agnostic to different servers' definitions of misbehavior --- servers can
blacklist users for whatever reason, and the privacy of blacklisted users is maintained.
Toward Evaluating Lighting Design Interface Paradigms for Novice Users
Dartmouth Technical Report TR2008-636
William Brandon Kerr
Fabio Pellacini
Date: January 2008
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/483/TR2008-636.pdf (3642KB)
Abstract:
Lighting design is a complex and fundamental task in computer cinematography, involving
adjustment of light parameters to define final scene appearance. Many lighting interfaces
have been proposed to improve lighting design work flow. These interfaces fall into
three paradigm categories: direct light parameter manipulation, indirect light feature
manipulation (e.g., shadow dragging), and goal-based optimization of light through
painting. To this date, no formal evaluation of the relative effectiveness of these
methods has been performed. In this paper, we present a first step toward evaluating
the three paradigms in the form of a user study with novice users. We focus our
evaluation on simple tasks that directly affect lighting features, such as highlights,
shadows and intensity gradients, in scenes with up to 2 point lights and 5 objects
under direct illumination. We perform quantitative experiments to measure relative
efficiency between interfaces together with qualitative input to explore the intuitiveness
of the paradigms. Our results indicate that paint-based goal specification is more
cumbersome than either direct or indirect manipulation. Furthermore, our investigation
suggests improvements to not only the implementation of the paradigms, but also
overall paradigm structure for further exploration.
BLAC: Revoking Repeatedly Misbehaving Anonymous Users Without Relying on TTPs
Dartmouth Technical Report TR2008-635
Patrick P. Tsang
Man Ho Au
Apu Kapadia
Sean W. Smith
Date: January 2008
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/482/TR2008-635.pdf (335KB)
Abstract:
Several credential systems have been proposed in which users can authenticate to
service providers anonymously. Since anonymity can give users the license to misbehave,
some variants allow the selective deanonymization (or linking) of misbehaving users
upon a complaint to a trusted third party (TTP). The ability of the TTP to revoke
a user's privacy at any time, however, is too strong a punishment for misbehavior.
To limit the scope of deanonymization, systems have been proposed in which users
are deanonymized if they authenticate ``too many times,'' such as ``double spending''
with electronic cash. While useful in some applications, it is not possible to generalize
such techniques to more subjective definitions of misbehavior, e.g., it is not possible
to block users who ``deface too many webpages'' on a website. We present BLAC, the
first anonymous credential system in which service providers can revoke the credentials
of repeatedly misbehaving users without relying on a TTP. Since revoked users remain
anonymous, misbehaviors can be judged subjectively without users fearing arbitrary
deanonymization by a TTP. Finally, our construction supports a $d$-strikes-out revocation
policy, whereby users who have been subjectively judged to have repeatedly misbehaved
at least $d$ times are revoked from the system.
LZfuzz: a fast compression-based fuzzer for poorly documented protocols
Dartmouth Technical Report TR2008-634
Sergey Bratus
Axel Hansen
Anna Shubina
Date: January 2008
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/481/TR2008-634.pdf (396KB)
Abstract:
Real-world infrastructure offers many scenarios where protocols (and other details)
are not released due to being considered too sensitive or for other reasons. This
situation makes it hard to apply fuzzing techniques to test their security and reliability,
since their full documentation is only available to their developers, and domain
developer expertise does not necessarily intersect with fuzz-testing expertise (nor
deployment responsibility). State-of-the-art fuzzing techniques, however, work best
when protocol specifications are available. Still, operators whose networks include
equipment communicating via proprietary protocols should be able to reap the benefits
of fuzz-testing them. In particular, administrators should be able to test proprietary
protocols in the absence of end-to-end application-level encryption to understand
whether they can withstand injection of bad traffic, and thus be able to plan adequate
network protection measures. Such protocols can be observed in action prior to fuzzing,
and packet captures can be used to learn enough about the structure of the protocol
to make fuzzing more efficient. Various machine learning approaches, e.g. bioinformatics
methods, have been proposed for learning models of the targeted protocols. The problem
with most of these approaches to date is that, although sometimes quite successful,
they are very computationally heavy and thus are hardly practical for application
by network administrators and equipment owners who cannot easily dedicate a compute
cluster to such tasks. We propose a simple method that, despite its roughness, allowed
us to learn facts useful for fuzzing from protocol traces at much smaller CPU and
time costs. Our fuzzing approach proved itself empirically in testing actual proprietary
SCADA protocols in an isolated control network test environment, and was also successful
in triggering flaws in implementations of several popular commodity Internet protocols.
Our fuzzer, LZfuzz (pronounced ``lazy-fuzz''), relies on a variant of the Lempel-Ziv
compression algorithm to guess boundaries between the structural units of the protocol,
and builds on the well-known free software GPF fuzzer.
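To give a feel for how a Lempel-Ziv style parse can suggest structural boundaries in a byte stream, here is a minimal sketch of textbook LZ78 phrase parsing. It is not the variant used in LZfuzz, whose details the abstract does not give; treating phrase ends as candidate boundaries is an assumption made purely for illustration.

```python
def lz78_phrases(data: bytes):
    """Split data into LZ78 phrases: each phrase is the shortest prefix of the
    remaining input that has not been seen as a phrase before."""
    seen = set()
    phrases = []
    current = b""
    for byte in data:
        current += bytes([byte])
        if current not in seen:
            seen.add(current)
            phrases.append(current)
            current = b""
    if current:
        # Trailing bytes that match an earlier phrase form the final phrase.
        phrases.append(current)
    return phrases

def candidate_boundaries(data: bytes):
    """Byte offsets at which LZ78 phrases end (hypothetical boundary guesses)."""
    offsets, position = [], 0
    for phrase in lz78_phrases(data):
        position += len(phrase)
        offsets.append(position)
    return offsets
```

Repeated substrings in captured traffic produce longer phrases, so fields that recur across packets tend to be kept together rather than split.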
Attribute-Based, Usefully Secure Email
Dartmouth Technical Report TR2008-633
Christopher P. Masone
Date: January 2008
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/480/TR2008-633.pdf (4705KB)
Abstract:
A secure system that cannot be used by real users to secure real-world processes
is not really secure at all. While many believe that usability and security are
diametrically opposed, a growing body of research from the field of Human-Computer
Interaction and Security (HCISEC) refutes this assumption. All researchers in this
field agree that focusing on aligning usability and security goals can enable the
design of systems that will be more secure under actual usage.
We bring to bear tools from the social sciences (economics, sociology, psychology,
etc.) not only to help us better understand why deployed systems fail, but also
to enable us to accurately characterize the problems that we must solve in order
to build systems that will be secure in the real world. Trust, a critically important
facet of any socio-technical secure system, is ripe for analysis using the tools
provided for us by the social sciences. There are a variety of scopes in which issues
of trust in secure systems can be studied. We have chosen to focus on how humans
decide to trust new correspondents. Current secure email systems, such as S/MIME
and PGP/MIME, are not expressive enough to capture the real ways that trust flows
in these sorts of scenarios. To solve this problem, we begin by applying concepts
from social science research to a variety of such cases from interesting application
domains; primarily, crisis management in the North American power grid. We have
examined transcripts of telephone calls made between grid management personnel
during the August 2003 North American blackout and extracted several different classes
of trust flows from these real-world scenarios. Combining this knowledge with
some design patterns from HCISEC, we develop criteria for a system that will enable
humans to apply these same methods of trust-building in the digital world. We then
present Attribute-Based, Usefully Secure Email (ABUSE) and not only show that it
meets our criteria, but also provide empirical evidence that real users are helped
by the system.
TwoKind Authentication: Protecting Private Information in Untrustworthy Environments (Extended Version)
Dartmouth Technical Report TR2008-632
Katelin Bailey
Apu Kapadia
Linden Vongsathorn
Sean W. Smith
Date: January 2008
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/479/TR2008-632.pdf (416KB)
Abstract:
We propose and evaluate TwoKind Authentication, a simple and effective technique
that allows users to limit access to their private information in untrustworthy
environments. Users often log in to Internet sites from insecure computers, and
more recently have started divulging their email passwords to social-networking
sites, thereby putting their private communications at risk. To mitigate this problem,
we explore the use of multiple authenticators for the same account that are associated
with specific sets of privileges. In its simplest form, TwoKind features two modes
of authentication, a low and a high authenticator. By using a low authenticator,
users can signal to the server they are in an untrusted environment, following which
the server restricts the user's actions, including access to private data. In this
paper, we seek to evaluate the effectiveness of multiple authenticators in promoting
safer behavior in users. We demonstrate the effectiveness of this approach through
a user experiment --- we find that users make a distinction between the two authenticators
and generally behave in a security-conscientious way, protecting their high authenticator
a majority of the time. Our study suggests that TwoKind will be beneficial to several
Internet applications, particularly if the privileges can be customized to a user's
security preferences.
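The idea of two authenticators tied to different privilege sets can be sketched as follows. The action names, the class, and the use of PBKDF2 for storing the secrets are assumptions chosen for illustration; the abstract does not specify how the authenticators are implemented.

```python
import hashlib
import hmac
import os

# Hypothetical privilege sets: the low authenticator cannot reach private data.
LOW_ACTIONS = {"read_public", "post_status"}
HIGH_ACTIONS = LOW_ACTIONS | {"read_private_mail", "change_settings"}

def _derive(secret: str, salt: bytes) -> bytes:
    """Derive a stored verifier from a secret (illustrative parameters only)."""
    return hashlib.pbkdf2_hmac("sha256", secret.encode("utf-8"), salt, 100_000)

class Account:
    def __init__(self, low_secret: str, high_secret: str):
        self.salt = os.urandom(16)
        self._low = _derive(low_secret, self.salt)
        self._high = _derive(high_secret, self.salt)

    def login(self, secret: str):
        """Return 'high', 'low', or None depending on which authenticator matched."""
        candidate = _derive(secret, self.salt)
        if hmac.compare_digest(candidate, self._high):
            return "high"
        if hmac.compare_digest(candidate, self._low):
            return "low"
        return None

def allowed(mode, action: str) -> bool:
    """Server-side check: restrict actions according to the login mode."""
    if mode == "high":
        return action in HIGH_ACTIONS
    if mode == "low":
        return action in LOW_ACTIONS
    return False
```

A user on an untrusted machine would type only the low secret, so a keylogger there never sees the high authenticator.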
Pas de Deux avec les Microrobots (Video)
Dartmouth Technical Report TR2008-631
Bruce R. Donald
Christopher G. Levey
Igor Paprotny
Date: January 2008
Abstract:
Video captured through an optical microscope, showing simultaneous control and operation
of two stress-engineered microrobots. The dimensions of our microrobots are 260
x 60 x 10 micrometers; each robot consists of an untethered scratch-drive actuator
that provides forward motion, and a steering-arm actuator that controls whether
the robot moves in a straight line or turns. Our stress-engineered microrobots
are electrostatically powered via a global control signal transmitted to all the
robots regardless of their position and orientation within their operating environment.
Hence, a single control and power-delivery signal must be used to simultaneously
control all robots within the same operating environment, resulting in a highly
underactuated system. Despite this high level of underactuation we are able to achieve
independent control of the individual microrobots by designing their steering-arms
to respond to different voltage levels of the supplied control signal. This example
uses nested hysteresis gaps. A hysteresis gap is the difference between the snap-down
and release voltages for a steering-arm actuator. Nested hysteresis gaps allow us
to set the states of the steering-arms (up or down) to any configuration. As shown
in this video, all four states of the two microrobot steering-arms are used to choreograph
their motion. A disadvantage of nested hysteresis gaps is that they are control-voltage
bandwidth intensive, limiting the number of simultaneously-controllable devices.
An alternative multi-microrobot control scheme that minimizes control-bandwidth
is described in [1].
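The way nested hysteresis gaps let one shared signal reach every combination of arm states can be shown with a toy model. The voltage values below are hypothetical, and the model assumes an arm snaps down at or above its snap-down voltage, releases at or below its release voltage, and otherwise holds its state.

```python
class SteeringArm:
    def __init__(self, release_v: float, snap_down_v: float):
        if not release_v < snap_down_v:
            raise ValueError("release voltage must be below snap-down voltage")
        self.release_v = release_v
        self.snap_down_v = snap_down_v
        self.down = False

    def apply(self, voltage: float) -> bool:
        """Update the arm state for one value of the global control signal."""
        if voltage >= self.snap_down_v:
            self.down = True
        elif voltage <= self.release_v:
            self.down = False
        # Inside the hysteresis gap the arm keeps its previous state.
        return self.down

def make_arms():
    # Arm B's gap (40..70) is nested inside arm A's gap (10..100).
    return [SteeringArm(10, 100), SteeringArm(40, 70)]

def drive(arms, voltages):
    """Broadcast each voltage to every arm; return the final (A, B) down-states."""
    for voltage in voltages:
        for arm in arms:
            arm.apply(voltage)
    return tuple(arm.down for arm in arms)

# One voltage sequence per target state of (arm A, arm B).
SEQUENCES = {
    (False, False): [10],
    (True, True): [100],
    (True, False): [100, 40],
    (False, True): [10, 70],
}
```

Each additional independently controlled arm needs its own gap nested among the others, which is the control-voltage bandwidth cost mentioned in the abstract.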
Planar Microassembly by Parallel Actuation of MEMS Microrobots (Microassembly Video)
Dartmouth Technical Report TR2008-630
Bruce R. Donald
Christopher G. Levey
Igor Paprotny
Date: January 2008
Abstract:
Movie of a representative microassembly experiment using devices from species 1, 3, 4
and 5, recorded through an optical microscope. The robots are initially arranged
along the corners of a rectangle with sides 1 by 0.9 mm. The assembly experiment
is divided into three stages. During stage 1, devices 4 and 5 dock together to form
the initial stable shape. In stage 2, device 3 docks with the initial stable shape,
while during stage 3, device 1 docks with the stable shape, forming the final assembly.
Lighting and Optical Tools for Image Forensics
Dartmouth Technical Report TR2008-629
Micah K. Johnson
Date: January 2008
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/476/TR2008-629.pdf (5377KB)
Abstract:
We present new forensic tools that are capable of detecting traces of tampering in
digital images without the use of watermarks or specialized hardware. These tools
operate under the assumption that images contain natural properties from a variety
of sources, including the world, the lens, and the sensor. These properties may
be disturbed by digital tampering and by measuring them we can expose the forgery.
In this context, we present the following forensic tools: (1) illuminant direction,
(2) specularity, (3) lighting environment, and (4) chromatic aberration. The common
theme of these tools is that they exploit lighting or optical properties of images.
Although each tool is not applicable to every image, they add to a growing set of
image forensic tools that together will complicate the process of making a convincing
forgery.
Key Management for Secure Power SCADA
Dartmouth Technical Report TR2008-628
Manya K. Sleeper
Date: January 2008
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/475/TR2008-628.pdf (926KB)
Abstract:
This thesis proposes a key management protocol for secure power SCADA systems that
seeks to take advantage of the full security capacity of a given network by allowing
devices to use public key cryptography for key management if they are capable of
doing so and reverting to symmetric key cryptography only when such use is necessitated
by the weakness of a given device. Allowing devices to obtain different levels of
security permits SCADA networks to maximize their security in the decades before
such networks are capable of implementing fully public key-based key management
protocols. Such a system is obtained through the use of a protocol based on a modified
version of SSL using X.509 certificates containing encrypted symmetric keys that
allow master devices the option of using the symmetric keys for encrypting the shared
secret used to create keying material, instead of using a slave device's public
key. This thesis presents the protocol and uses proof-of-concept code to carry out
a performance evaluation of the key management scheme.
Detecting kernel rootkits
Dartmouth Technical Report TR2008-627
Ashwin Ramaswamy
Date: January 2008
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/474/TR2008-627.pdf (327KB)
Abstract:
Kernel rootkits are a special category of malware that are deployed directly in the
kernel and hence have unmitigated reign over the functionalities of the kernel itself.
We seek to detect such rootkits that are deployed in the real world by first observing
how the majority of kernel rootkits operate. To this end, comparable to how rootkits
function in the real world, we write our own kernel rootkit that manipulates the
network driver, thus giving us control over all packets sent into the network. We
then implement a mechanism to thwart the attacks of such rootkits by noticing that
a large number of the rootkits deployed today rely heavily on the redirection of
function pointers within the kernel. By overwriting the desired function pointer
to its own function, a rootkit can perform a proverbial man-in-the-middle attack.
Our goal is not just the detection of kernel rootkits, but also to levy as little
an impact on system performance as possible. Hence our technique is to leverage
existing kernel functionalities (in the case of Linux) such as kprobes to identify
potential attack scenarios from within the system rather than from outside it (such
as a VMM). We hope to introduce real-world security in devices where performance
and resource constraints are tantamount to security considerations.
Anchor-Free Localization in Mixed Wireless Sensor Network Systems
Dartmouth Technical Report TR2008-626
Yurong Xu
Date: January 2008
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/473/TR2008-626.pdf (1647KB)
Abstract:
Recent technological advances have fostered the emergence of Wireless Sensor Networks
(WSNs), which consist of tiny, wireless, battery-powered nodes that are expected
to revolutionize the ways in which we understand and construct complex physical
systems. A fundamental property needed to use and maintain these WSNs is ``localization'',
which allows the establishment of spatial relationships among nodes over time. This
dissertation presents a series of Geographic Distributed Localization (GDL) algorithms
for mixed WSNs, in which both static and mobile nodes can coexist. The GDL algorithms
provide a series of useful methods for localization in mixed WSNs. First, GDL provides
an approximation called ``hop-coordinates'', which improves the accuracy of both
hop-counting and connectivity-based measurement techniques. Second, GDL utilizes
a distributed algorithm to compute the locations of all nodes in static networks
with the help of the hop-coordinates approximation. Third, GDL integrates a sensor
component into this localization paradigm for possible mobility and as a result
allows for a more complex deployment of WSNs as well as lower costs. In addition,
the development of GDL
incorporated the possibility of manipulated communications, such as wormhole attacks.
Simulations show that such a localization system can provide fundamental support
for security by detecting and localizing wormhole attacks. Although several localization
techniques have been proposed in the past few years, none currently satisfies our
requirements to provide an accurate, efficient and reliable localization for mixed
WSNs. The contributions of this dissertation are: (1) our measurement technique
achieves better accuracy both in measurement and localization than other methods;
(2) our method significantly improves the efficiency of localization in updating
location in mixed WSNs by incorporating sensors into the method; (3) our method
can detect and locate the communication that has been manipulated by a wormhole
in a network without relying on a central server.
Making RBAC Work in Dynamic, Fast-Changing Corporate Environments
Dartmouth Technical Report TR2008-624
Ruslan Y. Dimov
Date: January 2008
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/472/TR2008-624.pdf (469KB)
Abstract:
In large organizations with tens of thousands of employees, managing individual people's
permissions is tedious and error prone, and thus a possible source of security risks.
Role-Based Access Control addresses this problem by grouping users into roles, which
reflect job functions in the corporation. Permissions are assigned to roles instead
of directly to users, which means that all users assigned to a role have the same
set of permissions with respect to that role. However, adoption of RBAC in organizations
such as investment banks is hindered by two main factors: first, it is costly and
time-consuming to define roles. Second, there are certain job functions (such as
consultant) that cannot be expressed as RBAC roles, because their users need to
have different permission sets. The topic of this thesis is to investigate whether
roles can be applied to domains that exhibit the peculiarities of the investment
bank example. We introduce a new framework for roles that allows us to separately
represent what the role means as a job function, and what permissions its individual
users have. That way we maintain the key property of RBAC - that the number of roles
is small, while allowing for variations among users. We have also investigated machine
learning approaches in order to figure out whether roles are concepts that can be
learned or approximated by a function. We present our findings that certain learning
schemes, such as Probably Approximately Correct (PAC) learning and Instance-based
learning are not applicable to roles, while others - such as decision-tree learning,
might be useful.
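The limitation this abstract describes can be seen in a small sketch: under plain RBAC, two users in the same role necessarily get the same permissions. The per-user table at the end is only a hypothetical way of keeping individual variation separate from the role's meaning; it is not the framework defined in the thesis, and all names are illustrative.

```python
# Plain RBAC: permissions attach to roles, and users attach to roles.
ROLE_PERMISSIONS = {
    "trader": {"view_positions", "submit_order"},
    "consultant": {"view_reports"},
}
USER_ROLES = {
    "alice": {"trader"},
    "bob": {"consultant"},
    "carol": {"consultant"},
}

# Hypothetical per-user variation, stored apart from the role definitions.
USER_EXTRA = {"bob": {"view_positions"}}

def rbac_permissions(user: str) -> set:
    """Permissions a user gets purely from role membership."""
    permissions = set()
    for role in USER_ROLES.get(user, set()):
        permissions |= ROLE_PERMISSIONS.get(role, set())
    return permissions

def effective_permissions(user: str) -> set:
    """Role permissions plus the user's individual additions."""
    return rbac_permissions(user) | USER_EXTRA.get(user, set())
```

Here the number of roles stays at two even though bob and carol, both consultants, end up with different effective permissions.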
Linkability in Activity Inference Data Sets
Dartmouth Technical Report TR2008-623
Jeffrey Fielding
Date: January 2008
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/471/TR2008-623.pdf (242KB)
Abstract:
Activity inference is an active area of ubiquitous computing research. By training
machine learning algorithms on data from sensors worn by volunteers, researchers
hope to develop software that can interact more naturally with the user by inferring
what the user is doing. In this thesis, we use the same sensor data to infer which
volunteer is carrying the sensors. Such inference could be useful -- for example,
a mobile device might infer who is carrying it and adapt to that user's preferences.
It also raises some privacy concerns, since an attacker could learn more about a
user by linking together several sensor traces from the same user. We develop a
model to differentiate users based on their sensor data, and examine its accuracy
as well as the potential benefits and pitfalls.
Group-Aware Stream Filtering
Dartmouth Technical Report TR2008-621
Ming Li
Date: January 2008
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/470/TR2008-621.pdf (2125KB)
Abstract:
Recent years have witnessed a new class of monitoring applications that need to continuously
collect information from remote data sources. Those data sources, such as web click-streams,
stock quotes, and sensor data, are often characterized as fast-rate high-volume
``streams''. Distributed stream-processing systems are thus designed to efficiently
use system resources to serve the data-acquisition needs of the applications. Most
of the state-of-the-art stream-processing systems assume an Ethernet-based network
whose bandwidth is abundant, and focus on mechanisms to save computational power
and memory. For applications involving wireless networks, particularly multi-hop
mesh networks, we recognize that the most limiting factor in efficiently processing
streams lies in the network's highly constrained bandwidth. Hence, this dissertation
proposes a group-aware stream filtering approach that saves bandwidth at the cost
of increased CPU time, for low-bandwidth data-streaming systems.
This approach, used together with multicasting, exploits two overlooked properties
of monitoring applications: 1) many of them can tolerate some degree of ``slack''
in their data quality requirements, and 2) there may exist multiple subsets of the
source data satisfying the quality needs of an application. We can thus choose the
``best alternative'' subset for each application to maximize the data overlap within
the group to best benefit from multicasting. After proving the problem NP-hard,
we introduce a suite of heuristics-based algorithms that ensure data quality, specifically
data granularity and timeliness, in addition to preserving network bandwidth.
Our framework for group-aware stream filtering is extensible and supports a diverse
range of filtering needs of monitoring applications. We evaluate this approach with
a prototype system based on real-world data sets. The results show that quality-managed
group-aware filtering is effective in trading CPU time for bandwidth savings, compared
with self-interested stream filtering. We also evaluate the effect of each algorithm
on temporal freshness of the data. Finally, we discuss other application realms
that might benefit from group-aware stream filtering.
A Dynamically Refocusable Sampling Infrastructure for 802.11 Networks
Dartmouth Technical Report TR2008-620
Udayan Deshpande
Date: January 2008
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/469/TR2008-620.pdf (1874KB)
Abstract:
The edge of the Internet is increasingly wireless. Enterprises large and small, homeowners,
and even whole cities have deployed Wi-Fi networks for their users, and many users
never need to, or never bother to, use the wired network. With the advent of
high-throughput wireless networks (such as 802.11n) some new construction, even
of large enterprise buildings, may no longer be wired for Ethernet. To understand
Internet traffic, then, we need to understand the wireless edge.
Measuring Wi-Fi traffic, however, is challenging. It is insufficient to capture
traffic in the access points, or upstream of the access points, because the activity
of neighboring networks, ad hoc networks, and physical interference cannot be seen
at that level. To truly understand the MAC-layer behavior, we need to capture frames
from the air using Air Monitors (AMs) placed in the vicinity of the network. Such
a capture is always a sample of the network activity, since it is physically impossible
to capture a full trace: all frames from all channels at all times in all places.
We have built a monitoring infrastructure that captures frames from the 802.11 network.
This infrastructure includes several "channel sampling" strategies that will capture
representative traffic from the network. Further, the monitoring infrastructure
needs to modify its behavior according to feedback received from the downstream
consumers of the captured traffic in case the analysis needs traffic of a certain
type. We call this technique "refocusing". The "coordinated sampling" technique
improves the efficiency of the monitoring by utilizing the AMs intelligently.
Finally, we deployed this measurement infrastructure within our Computer Science
building to study the performance of the system with real network traffic.
Mesh-Mon: a Monitoring and Management System for Wireless Mesh Networks
Dartmouth Technical Report TR2008-619
Soumendra Nanda
Date: January 2008
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/468/TR2008-619.pdf (5302KB)
Abstract:
A mesh network is a network of wireless routers that employ multi-hop routing and
can be used to provide network access for mobile clients. Mobile mesh networks can
be deployed rapidly to provide an alternate communication infrastructure for emergency
response operations in areas with limited or damaged infrastructure. In this dissertation,
we present Dart-Mesh: a Linux-based layer-3 dual-radio two-tiered mesh network that
provides complete 802.11b coverage in the Sudikoff Lab for Computer Science at Dartmouth
College. We faced several challenges in building, testing, monitoring and managing
this network. These challenges motivated us to design and implement Mesh-Mon, a
network monitoring system to aid system administrators in the management of a mobile
mesh network. Mesh-Mon is a scalable, distributed and decentralized management system
in which mesh nodes cooperate in a proactive manner to help detect, diagnose and
resolve network problems automatically. Mesh-Mon is independent of the routing protocol
used by the mesh routing layer and can function even if the routing protocol fails.
We demonstrate this feature by running Mesh-Mon on two versions of Dart-Mesh, one
running on AODV (a reactive mesh routing protocol) and the second running on OLSR
(a proactive mesh routing protocol) in separate experiments. Mobility can cause
links to break, leading to disconnected partitions. We identify critical nodes in
the network, whose failure may cause a partition. We introduce two new metrics based
on social-network analysis: the Localized Bridging Centrality (LBC) metric and the
Localized Load-aware Bridging Centrality (LLBC) metric, that can identify critical
nodes efficiently and in a fully distributed manner. We run a monitoring component
on client nodes, called Mesh-Mon-Ami, which also assists Mesh-Mon nodes in the dissemination
of management information between physically disconnected partitions, by acting
as carriers for management data.
We conclude, from our experimental evaluation on our 16-node Dart-Mesh testbed,
that our system solves several management challenges in a scalable manner, and is
a useful and effective tool for monitoring and managing real-world mesh networks.
The Weakest Failure Detector to Solve Mutual Exclusion
Dartmouth Technical Report TR2008-618
Vibhor Bhatt
Nicholas Christman
Prasad Jayanti
Date: January 2008
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/467/TR2008-618.pdf (189KB)
Abstract:
Mutual exclusion is not solvable in an asynchronous message-passing system where
processes are subject to crash failures. Delporte-Gallet et al. determined the
weakest failure detector to solve this problem when a majority of processes are
correct. Here we identify the weakest failure detector to solve mutual exclusion
in any environment, i.e., regardless of the number of faulty processes.
We also show a relation between mutual exclusion and consensus, arguably the two
most fundamental problems in distributed computing. Specifically, we show that a
failure detector that solves mutual exclusion is sufficient to solve non-uniform
consensus but not necessarily uniform consensus.
YASIR: A Low-Latency, High-Integrity Security Retrofit for Legacy SCADA Systems (Extended Version)
Dartmouth Technical Report TR2008-617
Patrick P. Tsang
Sean W. Smith
Date: January 2008
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/466/TR2008-617.pdf (983KB)
Abstract:
We construct a bump-in-the-wire (BITW) solution that retrofits security into time-critical
communications over bandwidth-limited serial links between devices in legacy Supervisory
Control And Data Acquisition (SCADA) systems, on which the proper operations of
critical infrastructures such as the electric power grid rely. Previous BITW solutions
do not provide the necessary security within timing constraints; the previous solution
that does is not BITW. At a hardware cost comparable to existing solutions, our
BITW solution provides sufficient security, and yet incurs minimal end-to-end communication
latency.
Bounded Unpopularity Matchings
Dartmouth Technical Report TR2008-616
Chien-Chung Huang
Kavitha Telikepalli
Dimitrios Michail
Meghana Nasre
Date: January 2008
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/465/TR2008-616.pdf (110KB)
Abstract:
We investigate the following problem: given a set of jobs and a set of people with
preferences over the jobs, what is the optimal way of matching people to jobs? Here
we consider the notion of \emph{popularity}. A matching $M$ is popular if there is
no matching $M'$ such that more people prefer $M'$ to $M$ than the other way around.
Determining whether a given instance admits a popular matching and, if so, finding
one, was studied in \cite{AIKM05}. If there is no popular matching, a reasonable
substitute is a matching whose {\em unpopularity} is bounded. We consider two measures
of unpopularity: the {\em unpopularity factor}, denoted by $u(M)$, and the {\em unpopularity
margin}, denoted by $g(M)$. McCutchen recently showed that computing a matching $M$
with the minimum value of $u(M)$ or $g(M)$ is NP-hard, and that if $G$ does not
admit a popular matching, then we have $u(M) \ge 2$ for all matchings $M$ in $G$.
Here we show that a matching $M$ that achieves $u(M) = 2$ can be computed in $O(m\sqrt{n})$
time (where $m$ is the number of edges in $G$ and $n$ is the number of nodes) provided
a certain graph $H$ admits a matching that matches all people. We also describe
a sequence of graphs: $H = H_2, H_3,\ldots,H_k$ such that if $H_k$ admits a matching
that matches all people, then we can compute in $O(km\sqrt{n})$ time a matching $M$
such that $u(M) \le k-1$ and $g(M) \le n(1-\frac{2}{k})$. Simulation results suggest
that our algorithm finds a matching with low unpopularity.
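The definition of popularity above can be checked by brute force on tiny instances. The sketch below is illustrative only and is not the algorithm of the report: it enumerates every matching, assumes strict preference lists, treats being unmatched as worse than any listed job, and reads the unpopularity margin as the largest vote surplus that any rival matching achieves. All function names are hypothetical.

```python
def all_matchings(prefs):
    """Yield every matching as a dict person -> job (or None if unmatched)."""
    people = list(prefs)

    def extend(i, used):
        if i == len(people):
            yield {}
            return
        person = people[i]
        for job in [None] + prefs[person]:
            if job is not None and job in used:
                continue  # each job goes to at most one person
            new_used = used if job is None else used | {job}
            for rest in extend(i + 1, new_used):
                yield {person: job, **rest}

    yield from extend(0, frozenset())


def rank(prefs, person, job):
    """Position of job in the person's list; unmatched ranks below every job."""
    return len(prefs[person]) if job is None else prefs[person].index(job)


def votes(prefs, m1, m2):
    """Number of people who strictly prefer matching m1 to matching m2."""
    return sum(1 for p in prefs if rank(prefs, p, m1[p]) < rank(prefs, p, m2[p]))


def unpopularity_margin(prefs, m):
    """Largest surplus of votes that any rival matching wins against m."""
    return max(votes(prefs, other, m) - votes(prefs, m, other)
               for other in all_matchings(prefs))


def is_popular(prefs, m):
    """m is popular when no rival matching wins a majority against it."""
    return unpopularity_margin(prefs, m) == 0


# Three people with identical preferences: no matching is popular.
same = {"p1": ["a", "b", "c"], "p2": ["a", "b", "c"], "p3": ["a", "b", "c"]}
print(is_popular(same, {"p1": "a", "p2": "b", "p3": "c"}))
```

The enumeration is exponential in the number of people, so it is only useful for sanity-checking small examples against the definition.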
PPAA: Peer-to-Peer Anonymous Authentication (Extended Version)
Dartmouth Technical Report TR2008-615
Patrick P. Tsang
Sean W. Smith
Date: January 2008
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/464/TR2008-615.pdf (1265KB)
Abstract:
In the pursuit of authentication schemes that balance user privacy and accountability,
numerous anonymous credential systems have been constructed. However, existing systems
assume a client-server architecture in which only the clients, but not the servers,
care about their privacy. In peer-to-peer (P2P) systems where both clients and servers
are peer users with privacy concerns, no existing system correctly strikes that
balance between privacy and accountability. In this paper, we provide this missing
piece: a credential system in which peers are {\em pseudonymous} to one another (that
is, two who interact more than once can recognize each other via pseudonyms) but
are otherwise anonymous and unlinkable across different peers. Such a credential
system finds applications in, e.g., Vehicular Ad-hoc Networks (VANets) and P2P networks.
We formalize the security requirements of our proposed credential system, provide
a construction for it, and prove the security of our construction. Our solution
is efficient: its complexities are independent of the number of users in the system.
Experiment Planning for Protein Structure Elucidation and Site-Directed Protein Recombination
Dartmouth Technical Report TR2008-614
Xiaoduan Ye
Date: January 2007
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/463/TR2008-614.pdf (4527KB)
Abstract:
In order to most effectively investigate protein structure and improve protein function,
it is necessary to carefully plan appropriate experiments. The combinatorial number
of possible experiment plans demands effective criteria and efficient algorithms
to choose the one that is in some sense optimal. This thesis addresses experiment
planning challenges in two significant applications. The first part of this thesis
develops an integrated computational-experimental approach for rapid discrimination
of predicted protein structure models by quantifying their consistency with relatively
cheap and easy experiments (cross-linking and site-directed mutagenesis followed
by stability measurement). In order to obtain the most information from noisy and
sparse experimental data, rigorous Bayesian frameworks have been developed to analyze
the information content. Efficient algorithms have been developed to choose the
most informative, least expensive, and most robust experiments. The effectiveness
of this approach has been demonstrated using existing experimental data as well
as simulations, and it has been applied to discriminate predicted structure models
of the pTfa chaperone protein from bacteriophage lambda.
The second part of this thesis seeks to choose optimal breakpoint locations for
protein engineering by site-directed recombination. In order to increase the possibility
of obtaining folded and functional hybrids in protein recombination, it is necessary
to retain the evolutionary relationships among amino acids that determine protein
stability and functionality. A probabilistic hypergraph model has been developed
to model these relationships, with edge weights representing their statistical significance
derived from a database and a protein family. The effectiveness of this model has
been validated by showing its ability to distinguish functional hybrids from non-functional
ones in existing experimental data. It has been proved to be NP-hard in general
to choose the optimal breakpoint locations for recombination that minimize the total
perturbation to these relationships, but exact and approximate algorithms have been
developed for a number of important cases.
Complete Configuration Space Analysis for Structure Determination of Symmetric Homo-oligomers by NMR
Dartmouth Technical Report TR2008-613
Shobha Potluri
Date: January 2008
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/462/TR2008-613.pdf (3319KB)
Abstract:
Symmetric homo-oligomers (protein complexes with similar subunits arranged symmetrically)
play pivotal roles in complex biological processes such as ion transport and cellular
regulation. Structure determination of these complexes is necessary in order to
gain valuable insights into their mechanisms. Nuclear Magnetic Resonance (NMR) spectroscopy
is an experimental technique used for structural studies of such complexes. The
data available for structure determination of symmetric homo-oligomers by NMR is
often sparse and ambiguous in nature, raising concerns about existing heuristic
approaches for structure determination. We have developed an approach that is complete
in that it identifies all consistent conformations, data-driven in that it separately
evaluates the consistency of structures to data and biophysical constraints, and
efficient in that it avoids explicit consideration of each of the possible structures
separately. By being complete, we ensure that native conformations are not missed.
By being data-driven, we are able to separately quantify the information content
in the data alone versus data and biophysical modeling. We take a configuration
space (degree-of-freedom) approach that provides a compact representation of the
conformation space and enables us to efficiently explore the space of possible conformations.
This thesis demonstrates that the configuration space-based method is robust to
sparsity and ambiguity in the data and enables complete, data-driven and efficient
structure determination of symmetric homo-oligomers.
Localized Bridging Centrality for Distributed Network Analysis
Dartmouth Technical Report TR2008-612
Soumendra Nanda
David Kotz
Date: January 2008
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/461/TR2008-612.pdf (336KB)
Abstract:
Centrality is a concept often used in social network analysis to study different
properties of networks that are modeled as graphs. We present a new centrality metric
called Localized Bridging Centrality (LBC). LBC is based on the Bridging Centrality
(BC) metric that Hwang et al. recently introduced. Bridging nodes are nodes that
are located in between highly connected regions. LBC is capable of identifying bridging
nodes with an accuracy comparable to that of the BC metric for most networks. As
the name suggests, we use only local information from surrounding nodes to compute
the LBC metric, whereas global knowledge is required to calculate the BC metric.
The main difference between LBC and BC is that LBC uses the egocentric definition
of betweenness centrality to identify bridging nodes, while BC uses the sociocentric
definition of betweenness centrality. Thus, our LBC metric is suitable for distributed
computation and has the benefit of being an order of magnitude faster to calculate
in computational complexity. We compare the results produced by BC and LBC in three
examples. We applied our LBC metric for network analysis of a real wireless mesh
network. Our results indicate that the LBC metric is as powerful as the BC metric
at identifying bridging nodes that have a higher flow of information through them
(assuming a uniform distribution of network flows) and are important for the robustness
of the network.
Evaluating Mobility Predictors in Wireless Networks for Improving Handoff and Opportunistic Routing
Dartmouth Technical Report TR2008-611
Libo Song
Date: January 2008
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/460/TR2008-611.pdf (3097KB)
Abstract:
We evaluate mobility predictors in wireless networks. Handoff prediction in wireless
networks has long been considered as a mechanism to improve the quality of service
provided to mobile wireless users. Most prior studies, however, were based on theoretical
analysis, simulation with synthetic mobility models, or small wireless network traces.
We study the effect of mobility prediction for a large realistic wireless situation.
We tackle the problem by using traces collected from a large production wireless
network to evaluate several major families of handoff-location prediction techniques,
a set of handoff-time predictors, and a predictor that jointly predicts handoff
location and time. We also propose a fallback mechanism, which uses a lower-order
predictor whenever a higher-order predictor fails to predict.
We found that low-order Markov predictors, with our proposed fallback mechanisms,
performed as well as or better than the more complex and more space-consuming compression-based
handoff-location predictors. Although our handoff-time predictor had modest prediction
accuracy, in the context of mobile voice applications we found that bandwidth reservation
strategies can benefit from the combined location and time handoff predictor, significantly
reducing the call-drop rate without significantly increasing the call-block rate.
We also developed a prediction-based routing protocol for mobile opportunistic networks.
We evaluated and compared our protocol's performance to five existing routing protocols,
using simulations driven by real mobility traces. We found that the basic routing
protocols are not practical for large-scale opportunistic networks. Prediction-based
routing protocols trade off the message delivery ratio against resource usage and
performed well and comparably to one another.
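The fallback mechanism described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the class name is hypothetical, and it assumes that the chain of fallbacks ends at an order-0 predictor that simply returns the most frequently visited location.

```python
from collections import Counter, defaultdict


class FallbackMarkovPredictor:
    """Order-k Markov location predictor that falls back to lower orders."""

    def __init__(self, order):
        self.order = order
        # counts[k][context] counts the locations seen right after a length-k context.
        self.counts = [defaultdict(Counter) for _ in range(order + 1)]
        self.history = []

    def _context(self, k):
        # The last k locations; an empty tuple when k == 0.
        return tuple(self.history[len(self.history) - k:])

    def update(self, location):
        """Record that the device moved to `location`."""
        for k in range(self.order + 1):
            if len(self.history) >= k:
                self.counts[k][self._context(k)][location] += 1
        self.history.append(location)

    def predict(self):
        """Predict the next location, trying the highest order first."""
        for k in range(self.order, -1, -1):
            if len(self.history) < k:
                continue
            seen = self.counts[k].get(self._context(k))
            if seen:
                return seen.most_common(1)[0][0]
        return None  # nothing observed yet


predictor = FallbackMarkovPredictor(order=2)
for ap in ["A", "B", "A", "B", "A"]:
    predictor.update(ap)
print(predictor.predict())  # the order-2 context ("B", "A") was followed by "B"
```

When the current context has never been seen, as after a move to a previously unvisited access point, the order-2 and order-1 tables have no entry and the predictor falls through to the order-0 table.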
Active Behavioral Fingerprinting of Wireless Devices
Dartmouth Technical Report TR2008-610
Sergey Bratus
Cory Cornelius
Daniel Peebles
David Kotz
Date: January 2008
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/459/TR2008-610.pdf (290KB)
Abstract:
We propose a simple active method for discovering facts about the chipset, the firmware
or the driver of an 802.11 wireless device by observing its responses (or lack thereof)
to a series of crafted non-standard or malformed 802.11 frames. We demonstrate that
such responses can differ significantly enough to distinguish between a number of
popular chipsets and drivers. We expect to significantly expand the number of recognized
device types through community contributions of signature data for the proposed
open fingerprinting framework. Our method complements known fingerprinting approaches,
and can be used to interrogate and spot devices that may be spoofing their MAC addresses
in order to conceal their true architecture from other stations, such as a fake
AP seeking to engage clients in complex protocol frame exchange (e.g., in order
to exploit a driver vulnerability). In particular, it can be used to distinguish
rogue APs from legitimate APs before association.
Settling for limited privacy: how much does it help?
Dartmouth Technical Report TR2008-609
Anna M. Shubina
Date: January 2007
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/458/TR2008-609.pdf (650KB)
Abstract:
This thesis explores practical and theoretical aspects of several privacy-providing
technologies, including tools for anonymous web-browsing, verifiable electronic
voting schemes, and private information retrieval from databases. State-of-the-art privacy-providing
schemes are frequently impractical for implementational reasons or for sheer information-theoretical
reasons due to the amount of information that needs to be transmitted. We have been
researching the question of whether relaxing the requirements on such schemes, in
particular settling for privacy that is imperfect but sufficient in real-world situations,
as opposed to perfect privacy, may be helpful in producing more practical or more
efficient schemes. This thesis presents three results. The first result is the
introduction of caching as a technique for providing anonymous web-browsing at the
cost of sacrificing some functionality provided by anonymizing systems that do not
use caching. The second result is a coercion-resistant electronic voting scheme
with nearly perfect privacy and nearly perfect voter verifiability. The third result
consists of some lower bounds and some simple upper bounds on the amount of communication
in nearly private information retrieval schemes; our work is the first in-depth
exploration of private information retrieval schemes with imperfect privacy.
Exclusion and Object Tracking in a Network of Processes
Dartmouth Technical Report TR2007-608
Yih-Kuen Tsay
Chien-Chung Huang
Date: January 2007
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/457/TR2007-608.pdf (279KB)
Abstract:
This paper concerns two fundamental problems in distributed computing---mutual exclusion
and mobile object tracking. For a variant of the mutual exclusion problem where
the network topology is taken into account, all existing distributed solutions make
use of tokens. It turns out that these token-based solutions for mutual exclusion
can also be adapted for object tracking, as the token behaves very much like a mobile
object. To handle objects with replication, we go further to consider the more general
$k$-exclusion problem which has not been as well studied in a network setting. A
strong fairness property for $k$-exclusion requires that a process trying to enter
the critical section will eventually succeed even if \emph{up to} $k-1$ processes
stay in the critical section indefinitely. We present a comparative survey of existing
token-based mutual exclusion algorithms, which have provided much inspiration for
later $k$-exclusion algorithms. We then propose two solutions to the $k$-exclusion
problem, the second of which meets the strong fairness requirement. Fault-tolerance
issues are also discussed along with the suggestion of a third algorithm that is
also strongly fair. Performances of the three algorithms are compared by simulation.
Finally, we show how the various exclusion algorithms can be adapted for tracking
mobile objects.
The Quality of Open Source Production: Zealots and Good Samaritans in the Case of Wikipedia
Dartmouth Technical Report TR2007-606
Denise Anthony
Sean W. Smith
Tim Williamson
Date: January 2007
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/456/TR2007-606.pdf (311KB)
Abstract:
New forms of production based in electronic technology, such as open-source and open-content
production, convert private commodities (typically software) into essentially public
goods. A number of studies find that, as with other collective goods, incentives
for reputation and group identity motivate contributions to open source goods, thereby
overcoming the social dilemma inherent in producing such goods. In this paper we
examine how contributor motivations affect the quality of contributions to the open-content
online encyclopedia Wikipedia. We find that quality is associated with contributor
motivations, but in a surprisingly inconsistent way. Registered users' quality increases
with more contributions, consistent with the idea of participants motivated by reputation
and commitment to the Wikipedia community. Surprisingly, however, we find the highest
quality from the vast numbers of anonymous "Good Samaritans" who contribute only
once. Our findings that Good Samaritans as well as committed "zealots" contribute
high quality content to Wikipedia suggest that it is the quantity as well as the
quality of contributors that positively affects the quality of open source production.
Video Stabilization and Enhancement
Dartmouth Technical Report TR2007-605
Hany Farid
Jeffrey B. Woodward
Date: January 2007
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/455/TR2007-605.pdf (2127KB)
Abstract:
We describe a simple and computationally efficient approach for video stabilization
and enhancement. By combining multiple low-quality video frames, it is possible
to extract a high-quality still image. This technique is particularly helpful in
identifying people, license plates, etc. from low-quality video surveillance cameras.
YASIR: A Low-Latency, High-Integrity Security Retrofit for Legacy SCADA Systems
Dartmouth Technical Report TR2007-603
Patrick P. Tsang
Sean W. Smith
Date: January 2007
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/454/TR2007-603.pdf (514KB)
Abstract:
We construct a bump-in-the-wire (BITW) solution that retrofits security into time-critical
communications over bandwidth-limited serial links between devices in Supervisory
Control And Data Acquisition (SCADA) systems. Previous BITW solutions fail to provide
the necessary security within timing constraints; the previous solution that does
provide the necessary security is not BITW. At a comparable hardware cost, our BITW
solution provides sufficient security, and yet incurs minimal end-to-end communication
latency. A microcontroller prototype of our solution is under development.
Fast-Converging Tatonnement Algorithms for the Market Problem
Dartmouth Technical Report TR2007-602
Richard Cole
Lisa Fleischer
Date: January 2007
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/453/TR2007-602.pdf (244KB)
Abstract:
Why might markets tend toward and remain near equilibrium prices? In an effort to
shed light on this question from an algorithmic perspective, this paper defines
and analyzes two simple tatonnement algorithms that differ from previous algorithms
that have been subject to asymptotic analysis in three significant respects: the
price update for a good depends only on the price, demand, and supply for that good,
and on no other information; the price update for each good occurs distributively
and asynchronously; the algorithms work (and the analyses hold) from an arbitrary
starting point.
Our algorithm introduces a new and natural update rule. We show that this update
rule leads to fast convergence toward equilibrium prices in a broad class of markets
that satisfy the weak gross substitutes property. These are the first analyses for
computationally and informationally distributed algorithms that demonstrate polynomial
convergence.
Our analysis identifies three parameters characterizing the markets, which govern
the rate of convergence of our protocols. These parameters are, broadly speaking:
1. A bound on the fractional rate of change of demand for each good with respect
to fractional changes in its price.
2. A bound on the fractional rate of change of demand for each good with respect
to fractional changes in wealth.
3. The relative demand for money at equilibrium prices.
We give two protocols. The first assumes global knowledge of only the first parameter.
For this protocol, we also provide a matching lower bound in terms of these parameters.
Our second protocol assumes no global knowledge whatsoever.
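The abstract does not state the update rule itself, so the sketch below only illustrates the three properties it lists: each price update uses only that good's price, demand, and supply; goods are updated one at a time in arbitrary order; and the starting prices are arbitrary. The multiplicative rule, the step size, and the Cobb-Douglas test market (which has no money, unlike the setting the abstract describes) are all assumptions made for illustration.

```python
import random


def tatonnement(budgets, shares, supply, prices, step=0.25, rounds=100, seed=0):
    """Asynchronous price adjustment in a toy Cobb-Douglas market.

    budgets[b]   : money held by buyer b
    shares[b][i] : fraction of buyer b's budget spent on good i
    supply[i]    : fixed supply of good i
    prices[i]    : arbitrary positive starting price of good i
    """
    rng = random.Random(seed)
    prices = list(prices)
    goods = list(range(len(prices)))
    for _ in range(rounds):
        rng.shuffle(goods)  # goods are updated one at a time, in arbitrary order
        for i in goods:
            # Demand for good i depends only on its own price in this market.
            demand = sum(b * s[i] for b, s in zip(budgets, shares)) / prices[i]
            excess = (demand - supply[i]) / supply[i]
            # Assumed rule: raise the price under excess demand, lower it otherwise,
            # capping the upward move so one update never more than scales by 1 + step.
            prices[i] *= 1 + step * min(1.0, excess)
    return prices


final = tatonnement(
    budgets=[10, 20],
    shares=[[0.5, 0.5], [0.25, 0.75]],
    supply=[1, 2],
    prices=[1.0, 50.0],
)
print(final)  # both prices approach the market-clearing value 10
```

In this toy market the clearing price of each good is total spending on it divided by its supply, which makes it easy to check that the iteration settles where it should.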
Blacklistable Anonymous Credentials: Blocking Misbehaving Users without TTPs (Extended Version)
Dartmouth Technical Report TR2007-601
Patrick P. Tsang
Man Ho Au
Apu Kapadia
Sean W. Smith
Date: January 2007
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/452/TR2007-601.pdf (337KB)
Abstract:
Several credential systems have been proposed in which users can authenticate to
services anonymously. Since anonymity can give users the license to misbehave, some
variants allow the selective deanonymization (or linking) of misbehaving users upon
a complaint to a trusted third party (TTP). The ability of the TTP to revoke a user's
privacy at any time, however, is too strong a punishment for misbehavior. To limit
the scope of deanonymization, systems such as ``e-cash'' have been proposed in which
users are deanonymized under only certain types of well-defined misbehavior such
as ``double spending.'' While useful in some applications, it is not possible to
generalize such techniques to more subjective definitions of misbehavior. We present
the first anonymous credential system in which services can ``blacklist'' misbehaving
users without contacting a TTP. Since blacklisted users remain anonymous, misbehaviors
can be judged subjectively without users fearing arbitrary deanonymization by a
TTP.
Light-Based Sample Reduction Methods for Interactive Relighting of Scenes with Minute Geometric Scale
Dartmouth Technical Report TR2007-600
William B. Kerr
Fabio Pellacini
Date: January 2007
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/451/TR2007-600.pdf (5431KB)
Abstract:
Rendering production-quality cinematic scenes requires high computational and temporal
costs. From an artist's perspective, one must wait for several hours for feedback
on even minute changes of light positions and parameters. Previous work approximates
scenes so that adjustments on lights may be carried out with interactive feedback,
so long as geometry and materials remain constant. We build on these methods by
proposing means by which objects with high geometric complexity at the subpixel
level, such as hair and foliage, can be approximated for real-time cinematic relighting.
Our methods make no assumptions about the geometry or shaders in a scene, and as
such are fully generalized. We show that clustering techniques can greatly reduce
multisampling, while still maintaining image fidelity at an error significantly
lower than sparsely sampling without clustering, provided that no shadows are computed.
Scenes that produce noise-like shadow patterns when sparse shadow samples are taken
suffer from additional error introduced by those shadows. We present a viable solution
to scalable scene approximation for lower sampling resolutions, provided a robust
solution to shadow approximation for sub-pixel geometry can be provided in the future.
Two's Company, Three's a Crowd: Stable Family and Threesome Roommates Problems
Dartmouth Technical Report TR2007-598
Chien-Chung Huang
Date: January 2007
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/450/TR2007-598.pdf (264KB)
Abstract:
We investigate Knuth's eleventh open question on stable matchings. In the stable
family problem, sets of women, men, and dogs are given, all of whom state their
preferences among the other two groups. The goal is to organize them into family
units, so that no three of them have the incentive to desert their assigned family
members to form a new family. A similar problem, called the threesome roommates
problem, assumes that a group of persons, each with their preferences among the
combinations of two others, are to be partitioned into triples. Similarly, the goal
is to make sure that no three persons want to break up with their assigned roommates.
Ng and Hirschberg were the first to investigate these two problems. In their formulation,
each participant provides a strictly-ordered list of all combinations. They proved
that under this scheme, both problems are NP-complete. Their paper reviewers pointed
out that their reduction exploits \emph{inconsistent} preference lists and wondered
whether these two problems remain NP-complete if preferences are required to be
consistent. We answer in the affirmative. In order to give these two problems a
broader outlook, we also consider the possibility that participants can express
indifference, on the condition that the preference consistency has to be maintained.
As an example, we propose a scheme in which all participants submit two (or just
one in the roommates case) lists ranking the other two groups separately. The order
of the combinations is decided by the sum of their ordinal numbers. Combinations
are tied when the sums are equal. By introducing indifference, a hierarchy of stabilities
can be defined. We prove that all stability definitions lead to NP-completeness
for existence of a stable matching.
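The example scheme above, in which a combination is ordered by the sum of its ordinal numbers and equal sums are tied, can be sketched as follows. The function names are hypothetical, and the sketch assumes 1-based ordinals with smaller sums preferred.

```python
from itertools import product


def combination_scores(first_list, second_list):
    """Score each (a, b) combination by the sum of its 1-based ordinal positions."""
    return {
        (a, b): i + j
        for (i, a), (j, b) in product(enumerate(first_list, start=1),
                                      enumerate(second_list, start=1))
    }


def indifference_tiers(scores):
    """Group combinations with equal sums; earlier tiers are preferred."""
    tiers = {}
    for combo, score in scores.items():
        tiers.setdefault(score, []).append(combo)
    return [sorted(tiers[s]) for s in sorted(tiers)]


# One woman's separate rankings of the men and of the dogs.
scores = combination_scores(["m1", "m2"], ["d1", "d2"])
for tier in indifference_tiers(scores):
    print(tier)
```

Here the combinations (m1, d2) and (m2, d1) both sum to 3, so the participant is indifferent between them, which is the kind of tie the abstract refers to.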
A Security Assessment of Trusted Platform Modules
Dartmouth Technical Report TR2007-597
Evan R. Sparks
Date: January 2007
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/449/TR2007-597.ps.Z (170KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/449/TR2007-597.pdf (264KB)
Abstract:
Trusted Platform Modules (TPMs) are becoming ubiquitous devices included in newly
released personal computers. Broadly speaking, the aim of this technology is to
provide a facility for authenticating the platform on which they are running: they
are able to measure and attest to the authenticity of a hardware and software configuration.
Designed to be cheap, commodity devices which motherboard and processor vendors
can include in their products with minimal marginal cost, these devices have a good
theoretical design. Unfortunately, there exist several practical constraints on
the effectiveness of TPMs and the architectures which employ them which leave them
open to attack. We demonstrate some hardware and software attacks against these
devices and architectures. These attacks include Time of Check/Time of Use attacks
on the Integrity Measurement Architecture, and a bus attack against the Low Pin Count
bus. Further, we explore the possibility of side-channel attacks against TPMs.
When One Pipeline Is Not Enough
Dartmouth Technical Report TR2007-596
Thomas H. Cormen
Priya Natarajan
Elena Riccio Davidson
Date: January 2007
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/448/TR2007-596.pdf (158KB)
Abstract:
Pipelines that operate on buffers often work well to mitigate the high latency inherent
in interprocessor communication and in accessing data on disk. Running a single
pipeline on each node works well when each pipeline stage consumes and produces
data at the same rate. If a stage might consume data faster or slower than it produces
data, a single pipeline becomes unwieldy.
We describe how we have extended the FG programming environment to support multiple
pipelines in two forms. When a node might send and receive data at different rates
during interprocessor communication, we use disjoint pipelines that send and receive
on each node. When a node consumes and produces data from different streams on the
node, we use multiple pipelines that intersect at a particular stage. Experimental
results for two out-of-core sorting algorithms---one based on columnsort and the
other a distribution-based sort---demonstrate the value of multiple pipelines.
Exploring the Integration of Memory Management and Trusted Computing
Dartmouth Technical Report TR2007-594
Nihal A. D'Cunha
Date: January 2007
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/447/TR2007-594.pdf (550KB)
Abstract:
This thesis addresses vulnerabilities in current Trusted Computing architecture by
exploring a design for a better Trusted Platform Module (TPM); one that integrates
more closely with the CPU's Memory Management Unit (MMU). We establish that software-based
attacks on trusted memory can be carried out undetectably by an adversary on current
TCG/TPM implementations. We demonstrate that an attacker with sufficient privileges
can compromise the integrity of a TPM-protected system by modifying critical loaded
code and static data after measurement has taken place. More specifically, these
attacks illustrate the Time Of Check vs. Time of Use (TOCTOU) class of attacks.
We propose to enhance the MMU, enabling it to detect when memory containing trusted
code or data is being maliciously modified at run-time. On detection, it should
be able to notify the TPM of these modifications. We seek to use the concepts of
selective memory immutability as a security tool to harden the MMU, which will result
in a more robust TCG/TPM implementation. To substantiate our ideas for this proposed
hardware feature, we designed and implemented a software prototype system, which
employs the monitoring capabilities of the Xen virtual machine monitor.
We performed a security evaluation of our prototype and validated that it can detect
all our software-based TOCTOU attacks. We applied our prototype to verify the integrity
of data associated with an application, as well as suggested and implemented ways
to prevent unauthorized use of data by associating it with its owner process. Our
performance evaluation reveals minimal overhead.
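The gap between time of check and time of use can be illustrated with a toy model. This is not the Xen-based prototype described above: a byte array stands in for a memory region, a SHA-256 digest stands in for the TPM measurement, and re-measuring stands in for the write monitoring that the thesis proposes. All names are hypothetical.

```python
import hashlib


def measure(region: bytearray) -> str:
    """Digest of the region's current contents (stand-in for a TPM measurement)."""
    return hashlib.sha256(bytes(region)).hexdigest()


def modified_since_measurement(region: bytearray, recorded_digest: str) -> bool:
    """True when the region no longer matches the digest taken at load time."""
    return measure(region) != recorded_digest


region = bytearray(b"trusted code and static data")
recorded = measure(region)   # time of check: measured once, at load
region[0] ^= 0xFF            # later write by a privileged attacker
# Time of use: the recorded digest still describes the old contents.
print(modified_since_measurement(region, recorded))
```

A system that consults only `recorded` continues to trust the region after the write, which is why the thesis argues for detecting modifications as they happen.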
Closest and Farthest-Line Voronoi Diagrams in the Plane
Dartmouth Technical Report TR2007-593
Mark C. Henle
Date: January 2007
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/446/TR2007-593.pdf (5467KB)
Abstract:
Voronoi diagrams are a geometric structure containing proximity information useful
in efficiently answering a number of common geometric problems associated with a
set of points in the plane. They have applications in fields ranging from crystallography
to biology. Diagrams of sites other than points and with different distance metrics
have been studied. This paper examines the Voronoi diagram of a set of lines, which
has escaped study in the computational geometry literature.
The combinatorial and topological properties of the closest and farthest Voronoi
diagrams are analyzed and O(n^2) and O(n log n) algorithms are presented for their
computation, respectively.
SCML: A Structural Representation for Chinese Characters
Dartmouth Technical Report TR2007-592
Daniel G. Peebles
Date: January 2007
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/445/TR2007-592.pdf (1474KB)
Abstract:
Chinese characters are used daily by well over a billion people. They constitute
the main writing system of China and Taiwan, form a major part of written Japanese,
and are also used in South Korea. Anything more than a cursory glance at these characters
will reveal a high degree of structure to them, but computing systems do not currently
have a means to operate on this structure. Existing character databases and dictionaries
treat them as numerical code points, and associate with them additional `hand-computed'
data, such as stroke count, stroke order, and other information to aid in specific
searches. Searching by a character's `shape' is effectively impossible in these
systems.
I propose a new approach to representing these characters, through an XML-based
language called SCML. This language, by encoding an abstract form of a character,
allows the direct retrieval of important information such as stroke count and stroke
order, and permits useful but previously impossible automated analysis of characters.
In addition, the system allows the design of a view that takes abstract SCML representations
as character models and outputs glyphs based on an aesthetic, facilitating the creation
of 'meta-fonts' for Chinese characters. Finally, through the creation of a specialized
database, SCML allows for efficient structural character queries to be performed
against the body of inserted characters, thus allowing people to search by the most
obvious of a character's characteristics: its shape.
Dumbots: Unexpected Botnets through Networked Embedded Devices
Dartmouth Technical Report TR2007-591
Kwang-Hyun Baek
Sergey Bratus
Sara Sinclair
Sean W. Smith
Date: January 2007
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/444/TR2007-591.pdf (189KB)
Abstract:
Currently, work on botnets focuses primarily on PCs. However, as lightweight computing
devices with embedded operating systems become more ubiquitous, they present a new
and very disturbing target for botnet developers. In this paper, we present both
an empirical demonstration on a widely deployed multimedia box and an evaluation
of the deeper potential of these dumbots.
Secure Cryptographic Precomputation with Insecure Memory
Dartmouth Technical Report TR2007-590
Patrick P. Tsang
Sean W. Smith
Date: January 2007
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/443/TR2007-590.pdf (262KB)
Abstract:
Precomputation dramatically reduces the execution latency of many cryptographic algorithms.
To sustain the reduced latency over time during which these algorithms are routinely
invoked, however, a pool of precomputation results must be stored and be readily
available. While precomputation is an old and well-known technique, how to securely
and yet efficiently store these precomputation results has largely been ignored.
For instance, requiring tamper-proof memory would be too expensive, if not unrealistic,
for precomputation to be cost-effective. In this paper, we propose an architecture
that provides secure storage for cryptographic precomputation using only insecure
memory, which may be eavesdropped or even tampered with. Specifically, we design
a small tamper-resistant hardware module that we call the Queue Security Proxy
(QSP), which sits on the data path between the processor and the insecure memory.
Our analysis shows that our design is secure, efficient, flexible and yet inexpensive.
In particular, our design's timing overhead and hardware cost are independent of
the storage size. We also discuss in this paper several interesting extensions to
our proposed architecture. We plan to prototype our design assuming the scenario
of precomputing DSA signatures, effectively building a cost-effective low-latency
DSA signing secure coprocessor.
Lighting with Sketches
Dartmouth Technical Report TR2007-589
Alexander Wakefield Steinberg
Date: January 2007
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/442/TR2007-589.pdf (3435KB)
Abstract:
Lighting design is a fundamental aspect of computer cinematography, where it is used
to support storytelling by affecting the mood, style, and believability of a scene.
Traditionally, lighting has required the tedious adjustment of large sets of parameters
that describe complex lighting setups, including light positions, colors, shapes,
etc. This work presents an interactive user interface that facilitates lighting
workflow by using a sketching paradigm for light creation. Lights are specified
by a series of strokes that define various properties of illumination such as shape
of the light and position of illuminated and shadowed areas. The system will then
perform a nonlinear optimization over all the light parameters to find a match to
the controlling sketches. To demonstrate our prototype system, we lit a simple scene
fully with our application, showing that sketching paradigms are a promising way to facilitate
the lighting workflow.
A Combined Routing Method for Wireless Ad Hoc Networks
Dartmouth Technical Report TR2007-588
Soumendra Nanda
Zhenhui Jiang
David Kotz
Date: January 2007
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/441/TR2007-588.pdf (335KB)
Abstract:
To make ad hoc wireless networks adaptive to different mobility and traffic patterns,
this paper proposes an approach to swap from one protocol to another protocol dynamically,
while routing continues. By the insertion of a thin new layer, we were able to make
each node in the ad hoc wireless network notify each other about the protocol swap.
To ensure that routing works efficiently after the protocol swap, we initialized
the destination routing protocol's data structures and reused the previous routing
information to build the new routing table. We also tested our approach under different
network topologies and traffic patterns in static networks to learn whether the
swap was fast and whether the swap incurred too much overhead. We found that the
swap latency was related to the nature of the destination protocol and the topology
of the network. We also found that the control packet ratio after swap was close
to that of the protocol running without swap, which indicates that our method does
not incur too much overhead for the swap.
Protein Design by Mining and Sampling an Undirected Graphical Model of Evolutionary Constraints
Dartmouth Technical Report TR2007-587
John Thomas
Naren Ramakrishnan
Chris Bailey-Kellogg
Date: January 2007
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/440/TR2007-587.pdf (378KB)
Abstract:
Evolutionary pressures on proteins to maintain structure and function have constrained
their sequences over time and across species. The sequence record thus contains
valuable information regarding the acceptable variation and covariation of amino
acids in members of a protein family. When designing new members of a protein family,
with an eye toward modified or improved stability or functionality, it is incumbent
upon a protein engineer to uncover such constraints and design conforming sequences.
This paper develops such an approach for protein design: we first mine an undirected
probabilistic graphical model of a given protein family, and then use the model
generatively to sample new sequences. While sampling from an undirected model is
difficult in general, we present two complementary algorithms that effectively sample
the sequence space constrained by our protein family model. One algorithm focuses
on the high-likelihood regions of the space. Sequences are generated by sampling
the cliques in a graphical model according to their likelihood while maintaining
neighborhood consistency. The other algorithm designs a fixed number of high-likelihood
sequences that are reflective of the amino acid composition of the given family.
A set of shuffled sequences is iteratively improved so as to increase their mean
likelihood under the model. Tests for two important protein families, WW domains
and PDZ domains, show that both sampling methods converge quickly and generate diverse
high-quality sets of sequences for further biological study.
People-Centric Urban Sensing: Security Challenges for the New Paradigm
Dartmouth Technical Report TR2007-586
Peter Johnson
Apu Kapadia
David Kotz
Nikos Triandopoulos
Date: January 2007
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/439/TR2007-586.pdf (155KB)
Abstract:
We study the security challenges that arise in people-centric urban sensing,
a new sensor-networking paradigm that leverages humans as part of the sensing infrastructure.
Most prior work on sensor networks has focused on collecting and processing ephemeral
data about the environment using a static topology and an application-aware infrastructure.
People-centric urban sensing, however, involves collecting, storing, processing
and fusing large volumes of data related to everyday human activities. Sensing
is performed in a highly dynamic and mobile environment, and supports (among other
things) pervasive computing applications that are focused on enhancing the user's
experience. In such a setting, where humans are the central focus, there are new
challenges for information security; not only because of the complex and dynamic
communication patterns, but also because the data originates from sensors that are
carried by a person---not a tiny sensor thrown in the forest or mounted on the neck
of an animal. In this paper we aim to instigate discussion about this critical issue---because
people-centric sensing will never succeed without adequate provisions for security
and privacy. To that end, we outline several important challenges and suggest general
solutions that hold promise in this new paradigm of sensor networks.
Path Planning Algorithms under the Link-Distance Metric
Dartmouth Technical Report TR2006-585
David P. Wagner
Date: January 2006
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/438/TR2006-585.pdf (1569KB)
Abstract:
The Traveling Salesman Problem and the Shortest Path Problem are famous problems
in computer science which have been well studied when the objective is measured
using the Euclidean distance. Here we examine these geometric problems under a different
set of optimization criteria. Rather than considering the total distance traversed
by a path, this thesis looks at reducing the number of times a turn is made along
that path, or equivalently, at reducing the number of straight lines in the path.
Minimizing this objective value, known as the link-distance, is useful in situations
where continuing in a given direction is cheap, while turning is a relatively expensive
operation. Applications exist in VLSI, robotics, wireless communications, space
travel, and other fields where it is desirable to reduce the number of turns.
This thesis examines rectilinear and non-rectilinear variants of the Traveling Salesman
Problem under this metric. The objective of these problems is to find a path visiting
a set of points which has the smallest number of bends. A 2-approximation algorithm
is given for the rectilinear problem, while for the non-rectilinear problem, an
O(log n)-approximation algorithm is given. The latter problem is also shown to be
NP-Complete.
Next, the Rectilinear Minimum Link-Distance Problem, also known as the Minimum Bends
Path Problem, is considered. Here the objective is to find a rectilinear path between
two points among rectilinear obstacles which has the minimum number of bends, while
avoiding passing through any of the obstacles. The problem has been well studied
in two dimensions, but is relatively unexplored in higher dimensions. A main result
of this thesis is an O(n^{5/2} log n) time algorithm solving this problem in three
dimensions. Previously known algorithms have had worst-case running times of Omega(n^3).
This algorithm requires a data structure that supports efficient operations on pointsets
within rectangular regions of the Euclidean plane. We design a new data structure,
which is a variation on the segment tree, in order to support these operations.
Finally, an implementation of the data structure and of the algorithm solving the
Minimum Link-Distance Problem demonstrates their experimental running times and
ease of implementation.
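As a small illustration of the link-distance objective described in this abstract, the sketch below counts the links (maximal straight segments) and bends of a polygonal path. It only evaluates the metric for a given path and does not implement any of the thesis's algorithms. The function names are illustrative assumptions, and integer coordinates are assumed so that the collinearity test is exact.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def _continues_straight(a: Point, b: Point, c: Point) -> bool:
    """True if the step b->c continues in the same direction as a->b."""
    ux, uy = b[0] - a[0], b[1] - a[1]
    vx, vy = c[0] - b[0], c[1] - b[1]
    cross = ux * vy - uy * vx   # zero when the two steps are parallel
    dot = ux * vx + uy * vy     # positive when they point the same way
    return cross == 0 and dot > 0


def link_count(path: List[Point]) -> int:
    """Number of maximal straight segments (links) in a polygonal path."""
    # Drop repeated consecutive points so zero-length steps are ignored.
    pts = [p for i, p in enumerate(path) if i == 0 or p != path[i - 1]]
    if len(pts) < 2:
        return 0
    links = 1
    for a, b, c in zip(pts, pts[1:], pts[2:]):
        if not _continues_straight(a, b, c):
            links += 1
    return links


def bend_count(path: List[Point]) -> int:
    """Number of turns along the path; one fewer than the number of links."""
    return max(link_count(path) - 1, 0)
```

For example, the rectilinear path `[(0, 0), (1, 0), (2, 0), (2, 1), (2, 3), (5, 3)]` has three links and two bends, even though it is listed with six vertices.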
Tools and algorithms to advance interactive intrusion analysis via Machine Learning and Information Retrieval
Dartmouth Technical Report TR2006-584
Javed Aslam
Sergey Bratus
Virgil Pavlu
Date: January 2006
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/437/TR2006-584.pdf (1907KB)
Abstract:
We consider typical tasks that arise in the intrusion analysis of log data from the
perspectives of Machine Learning and Information Retrieval, and we study a number
of data organization and interactive learning techniques to improve the analyst's
efficiency. In doing so, we attempt to translate intrusion analysis problems into
the language of the abovementioned disciplines and to offer metrics to evaluate
the effect of proposed techniques. The Kerf toolkit contains prototype implementations
of these techniques, as well as data transformation tools that help bridge the gap
between the real world log data formats and the ML and IR data models.
We also describe the log representation approach that Kerf prototype tools are based
on. In particular, we describe the connection between decision trees, automatic
classification algorithms and log analysis techniques implemented in Kerf.
Digital Image Ballistics from JPEG Quantization
Dartmouth Technical Report TR2006-583
Hany Farid
Date: January 2006
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/436/TR2006-583.pdf (108KB)
Abstract:
Most digital cameras export images in the JPEG file format. This lossy compression
scheme employs a quantization table that controls the amount of compression achieved.
Different cameras typically employ different tables. A comparison of an image's
quantization scheme to a database of known cameras affords a simple technique for
confirming or denying an image's source. Similarly, comparison to a database of
photo-editing software can be used in a forensic setting to determine if an image
was edited after its original recording.
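The comparison described in this abstract can be sketched as an exact-match lookup of an image's quantization tables against a database. Everything below is a hypothetical illustration: the function name, the database entries, and the shortened four-entry tables are assumptions (real JPEG quantization tables hold 64 entries each), and extracting the tables from a file is left out.

```python
from typing import Dict, List, Sequence, Tuple

QuantTable = Tuple[int, ...]


def matching_sources(image_tables: Sequence[QuantTable],
                     database: Dict[str, Sequence[QuantTable]]) -> List[str]:
    """Return the names of database entries whose tables equal the image's."""
    key = tuple(tuple(t) for t in image_tables)
    return sorted(name for name, tables in database.items()
                  if tuple(tuple(t) for t in tables) == key)


# Hypothetical, shortened tables used only for illustration.
KNOWN_TABLES: Dict[str, List[QuantTable]] = {
    "camera-A": [(16, 11, 10, 16), (17, 18, 24, 47)],
    "camera-B": [(8, 6, 5, 8), (9, 9, 12, 24)],
    "editor-X": [(2, 1, 1, 2), (3, 3, 4, 8)],
}
```

Calling `matching_sources([(8, 6, 5, 8), (9, 9, 12, 24)], KNOWN_TABLES)` returns `["camera-B"]`. A match only shows that the image is consistent with that source, since several devices or programs may share a table; an empty result shows that none of the listed sources uses those tables.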
Cheating to Get Better Roommates in a Random Stable Matching
Dartmouth Technical Report TR2006-582
Chien-Chung Huang
Date: January 2006
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/435/TR2006-582.pdf (162KB)
Abstract:
This paper addresses strategies for the stable roommates problem, assuming that a
stable matching is chosen at random. We investigate how a cheating man should permute
his preference list so that he has a higher-ranking roommate probabilistically.
In the first part of the paper, we identify a necessary condition for creating
a new stable roommate for the cheating man. This condition precludes any possibility
of his getting a new roommate ranking higher than all his stable roommates when
everyone is truthful. Generalizing to the case that multiple men collude, we derive
another impossibility result: given any stable matching in which a subset of men
get their best possible roommates, they cannot cheat to create a new stable matching
in which they all get strictly better roommates than in the given matching.
Our impossibility result, considered in the context of the stable marriage problem,
easily re-establishes the celebrated Dubins-Freedman Theorem. The more generalized
Demange-Gale-Sotomayor Theorem states that a coalition of men and women cannot cheat
to create a stable matching in which every one of them gets a strictly better partner
than in the Gale-Shapley algorithm (with men proposing). We give a sharper result:
a coalition of men and women cannot cheat together so that, in a newly-created stable
matching, every man in the coalition gets a strictly better partner than in the
Gale-Shapley algorithm while none of the women in the coalition is worse off.
In the second part of the paper, we present two cheating strategies that guarantee
that the cheating man's new probability distribution over stable roommates majorizes
the original one. These two strategies do not require the knowledge of the probability
distribution of the cheating man. This is important because the problem of counting
stable matchings is #P-complete. Our strategies only require knowing the set of
stable roommates that the cheating man has and can be formulated in polynomial time.
Our second cheating strategy has an interesting corollary in the context of stable
marriage with the Gale-Shapley algorithm. Any woman-optimal strategy will ensure
that every woman, cheating or otherwise, ends up with a partner at least as good
as when everyone is truthful.
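For readers unfamiliar with the Gale-Shapley algorithm (with men proposing) that this abstract refers to, the following is a minimal sketch of the standard procedure. It assumes equal numbers of men and women and complete preference lists; the function and variable names are illustrative, and none of the cheating strategies from the paper are implemented.

```python
from typing import Dict, List


def gale_shapley(men_prefs: Dict[str, List[str]],
                 women_prefs: Dict[str, List[str]]) -> Dict[str, str]:
    """Men-proposing deferred acceptance; returns a man -> woman matching."""
    # rank[w][m] is the position of m in w's list (lower is better).
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}
    next_choice = {m: 0 for m in men_prefs}   # next woman each man proposes to
    fiance: Dict[str, str] = {}               # woman -> man she currently holds
    free = list(men_prefs)
    while free:
        m = free.pop()
        w = men_prefs[m][next_choice[m]]
        next_choice[m] += 1
        current = fiance.get(w)
        if current is None:
            fiance[w] = m                     # w was unmatched and accepts
        elif rank[w][m] < rank[w][current]:
            fiance[w] = m                     # w trades up; her old partner is free
            free.append(current)
        else:
            free.append(m)                    # w rejects m; he proposes again later
    return {m: w for w, m in fiance.items()}
```

With men `a` and `b` both preferring `x` over `y`, and `x` preferring `b` while `y` prefers `a`, the result matches `b` with `x` and `a` with `y`. Questions of cheating arise when a participant submits a list other than the true one as input to a procedure like this.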
Visualizing Paths in Context
Dartmouth Technical Report TR2006-580
Fabio Pellacini
Lori Lorigo
Geri Gay
Date: January 2006
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/434/TR2006-580.pdf (293KB)
Abstract:
Data about movement through a space is increasingly becoming available for capture
and analysis. In many applications, this data is captured or modeled as transitions
between a small number of areas of interest, or a finite set of states, and these
transitions constitute paths in the space. Similarities and differences between
paths are of great importance to such analyses, but can be difficult to assess.
In this work we present a visualization approach for representing paths in context,
where individual paths can be compared to other paths or to a group of paths. Our
approach summarizes path behavior using a simple circular layout, including information
about state and transition likelihood using Markov random models, together with
information about specific path and state behavior. The layout avoids line crossovers
entirely, making it easy to observe patterns while reducing visual clutter. In our
tool, paths can either be compared in their natural sequence or by aligning multiple
paths using Multiple Sequence Alignment, which can better highlight path similarities.
We applied our technique to eye tracking data and cell phone tower data used to
capture human movement.
Metric Measurements on a Plane from a Single Image
Dartmouth Technical Report TR2006-579
Micah K. Johnson
Hany Farid
Date: January 2006
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/433/TR2006-579.pdf (3572KB)
Abstract:
The past decade has seen considerable advances in the application of principles from
projective geometry to problems in image analysis and computer vision. In this paper,
we review a subset of this work, and leverage these results for the purpose of forensic
analysis. Specifically, we review three techniques for making metric measurements
on planar surfaces from a single image. The resulting techniques should prove useful
in forensic settings where real-world measurements are required.
Wait-Free and Obstruction-Free Snapshot
Dartmouth Technical Report TR2006-578
Khanh Do Ba
Date: January 2006
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/432/TR2006-578.pdf (178KB)
Abstract:
The snapshot problem was first proposed over a decade ago and has since been well-studied
in the distributed algorithms community. The challenge is to design a data structure
consisting of $m$ components, shared by up to $n$ concurrent processes, that supports
two operations. The first, $Update(i,v)$, atomically writes $v$ to the $i$th component.
The second, $Scan()$, returns an atomic snapshot of all $m$ components. We consider
two termination properties: wait-freedom, which requires a process to always terminate
in a bounded number of its own steps, and the weaker obstruction-freedom, which
requires such termination only for processes that eventually execute uninterrupted.
First, we present a simple, time and space optimal, obstruction-free solution to
the single-writer, multi-scanner version of the snapshot problem (wherein concurrent
Updates never occur on the same component). Second, we assume hardware support for
compare&swap (CAS) to give a time-optimal, wait-free solution to the multi-writer,
single-scanner snapshot problem (wherein concurrent Scans never occur). This algorithm
uses only $O(mn)$ space and has optimal CAS, write and remote-reference complexities.
Additionally, it can be augmented to implement a general snapshot object with the
same time and space bounds, thus improving the space complexity of $O(mn^2)$ of
the only previously known time-optimal solution.
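To make the Update/Scan interface concrete, here is a sketch of the classic double-collect technique for the single-writer case. It is a textbook approach and is not claimed to be the algorithm of this report. The class and method names are assumptions, and the sketch relies on storing a reference into a list slot being atomic, which holds in CPython but is an assumption elsewhere.

```python
from typing import Any, List, Tuple


class DoubleCollectSnapshot:
    """Single-writer snapshot object using a double-collect scan."""

    def __init__(self, m: int, initial: Any = None) -> None:
        # Each component holds an immutable (value, sequence number) pair.
        self._cells: List[Tuple[Any, int]] = [(initial, 0)] * m

    def update(self, i: int, v: Any) -> None:
        """Write v to component i; assumes a single writer per component."""
        _, seq = self._cells[i]
        self._cells[i] = (v, seq + 1)   # one reference store publishes the pair

    def _collect(self) -> List[Tuple[Any, int]]:
        return [self._cells[i] for i in range(len(self._cells))]

    def scan(self) -> List[Any]:
        """Retry until two consecutive collects agree, then return the values."""
        previous = self._collect()
        while True:
            current = self._collect()
            if current == previous:
                return [value for value, _ in current]
            previous = current
```

A scan returns once it runs long enough without a concurrent update, which corresponds to obstruction-freedom. It can retry indefinitely if updates keep arriving, which is why this sketch is not wait-free.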
SAMPLED: Shared Anonymous Music PLayback using wirelEss Devices
Dartmouth Technical Report TR2006-577
Constantinos C. Neophytou
Date: January 2006
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/431/TR2006-577.pdf (177KB)
Abstract:
Recent advances in mobile computing enable many new applications, yet at the same
time create privacy implications caused by the increasing amount of data that becomes
available. This thesis will explore the possibilities of wireless-enabled portable
devices and their attending privacy implications.
We will describe how such a device containing personal information about the musical
preferences of its user can help improve the user's experience in a social setting
where music is played for all, and at the same time preserve each user's privacy.
Computation Reuse in Statics and Dynamics Problems for Assemblies of Rigid Bodies
Dartmouth Technical Report TR2006-576
Anne Loomis
Date: January 2006
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/430/TR2006-576.pdf (25992KB)
Abstract:
The problem of determining the forces among contacting rigid bodies is fundamental
to many areas of robotics, including manipulation planning, control, and dynamic
simulation. For example, consider the question of how to unstack an assembly, or
how to find stable regions of a rubble pile. In considering problems of this type
over discrete or continuous time, we often encounter a sequence of problems with
similar substructure. The primary contribution of our work is the observation that
in many cases, common physical structure can be exploited to solve a sequence of
related problems more efficiently than if each problem were considered in isolation.
We examine three general problems concerning rigid-body assemblies: dynamic simulation,
assembly planning, and assembly stability given limited knowledge of the structure's
geometry.
To approach the dynamic simulation and assembly planning applications, we have
optimized a known method for solving the system dynamics. The accelerations of and
forces among contacting rigid bodies may be computed by formulating the dynamics
equations and contact constraints as a complementarity problem. Dantzig's algorithm,
when applicable, takes n or fewer major cycles to find a solution to the linear
complementarity problem corresponding to an assembly with n contacts. We show that
Dantzig's algorithm will find a solution in n - k or fewer major cycles if the algorithm
is initialized with a solution to the dynamics problem for a subassembly with k
internal contacts.
Finally, we show that if we have limited knowledge of a structure's geometry, we
can still learn about stable regions of its surface by physically pressing on it.
We present an approach for finding stable regions of planar assemblies: sample presses
on the surface to identify a stable cone in wrench space, partition the space of
applicable wrenches into stable and unstable regions, and map these back to the
surface of the structure.
Limited Delegation (Without Sharing Secrets) in Web Applications
Dartmouth Technical Report TR2006-574
Nicholas J. Santos
Date: January 2006
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/429/TR2006-574.pdf (608KB)
Abstract:
Delegation is the process wherein an entity Alice designates an entity Bob to speak
on her behalf. In password-based security systems, delegation is easy: Alice gives
Bob her password. This is a useful feature, and is used often in the real world.
But it's also problematic. When Alice shares her password, she must delegate all
her permissions, but she may wish to delegate a limited set. Also, as we move towards
PKI-based systems, secret-sharing becomes impractical. This thesis explores one
solution to these problems. We use proxy certificates in a non-standard way so that
user Alice can delegate a subset of her privileges to user Bob in a secure, decentralized
way for web applications.
We identify how delegation changes the semantics of access control, then build
a system to demonstrate these possibilities in action. An extension on top of Mozilla's
Firefox web browser allows a user to create and use proxy certificates for delegation,
and a module on top of the Apache web server accepts multiple chains of these certificates.
This is done in a modified SSL session that should not break current SSL implementations.
A simple computational method for the identification of disease-associated loci in complex, incomplete pedigrees
Dartmouth Technical Report TR2006-573
Gregory Leibon
Dan Rockmore
Martin R. Pollak
Date: January 2006
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/428/TR2006-573.pdf (806KB)
Abstract:
We present an approach, called the Shadow Method, for the identification of disease
loci from dense genetic marker maps in complex, potentially incomplete pedigrees.
Shadow is a simple method based on an analysis of the patterns of obligate meiotic
recombination events in genotypic data. This method can be applied to any high density
marker map and was specifically designed to exploit the fact that extremely dense
marker maps are becoming more readily available. We also describe how to interpret
and associate meaningful P-values with the results. Shadow has significant advantages
over traditional parametric linkage analysis methods in that it can be readily applied
even in cases in which the topology of a pedigree or pedigrees can only be partially
determined. In addition, Shadow is robust to variability in a range of parameters
and in particular does not require prior knowledge of mode of inheritance, penetrance,
or clinical misdiagnosis rate. Shadow can be used for any SNP data, but is especially
effective when applied to dense samplings. Our primary example uses data from Affymetrix
100k SNPChip samples in which we illustrate our approach by analyzing simulated
data as well as genome-wide SNP data from two pedigrees with inherited forms of
kidney failure, one of which is compared with a typical LOD score analysis.
Secure Context-sensitive Authorization
Dartmouth Technical Report TR2006-571
Kazuhiro Minami
Date: January 2006
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/427/TR2006-571.pdf (826KB)
Abstract:
Pervasive computing leads to an increased integration between the real world and
the computational world, and many applications in pervasive computing adapt to the
user's context, such as the location of the user and relevant devices, the presence
of other people, light or sound conditions, or available network bandwidth, to meet
a user's continuously changing requirements without taking explicit input from the
users. We consider a class of applications that wish to consider a user's context
when deciding whether to authorize a user's access to important physical or information
resources. Such a context-sensitive authorization scheme is necessary when a mobile
user moves across multiple administrative domains where they are not registered
in advance. Also, users interacting with their environment need a non-intrusive
way to access resources, and clues about their context may be useful input into
authorization policies for these resources. Existing systems for context-sensitive
authorization take a logic-based approach, because a logical language makes it possible
to define a context model where a contextual fact is expressed with a boolean predicate
and to derive higher-level context information and authorization decisions from
contextual facts. However, those existing context-sensitive authorization systems
have a central server that collects context information, and evaluates policies
to make authorization decisions on behalf of a resource owner. A centralized solution
assumes that all resource owners trust the server to make correct decisions, and
all users trust the server not to disclose private context information. In many
realistic applications of pervasive computing, however, the resources, users, and
sources of context information are inherently distributed among many organizations
that do not necessarily trust each other. Resource owners may not trust the integrity
of context information produced by another domain, and context sensors may not trust
others with the confidentiality of data they provide about users. In this thesis,
we present a secure distributed proof system for context-sensitive authorization.
Our system enables multiple hosts to evaluate an authorization query in a peer-to-peer
way, while preserving the confidentiality and integrity policies of mutually untrusted
principals running those hosts. We also develop a novel caching and revocation mechanism
to support context-sensitive policies that refer to information in dozens of different
administrative domains. Contributions of this thesis include the definition of fine-grained
security policies that specify trust relations among principals in terms of information
confidentiality and integrity, the design and implementation of a secure distributed
proof system, a proof for the correctness of our algorithm, and a performance evaluation
showing that the amortized performance of our system scales to dozens of servers
in different domains.
A Novel Minimized Dead-End Elimination Criterion and Its Application to Protein Redesign in a Hybrid Scoring and Search Algorithm for Computing Partition Functions over Molecular Ensembles
Dartmouth Technical Report TR2006-570
Ivelin Georgiev
Ryan H. Lilien
Bruce R. Donald
Date: January 2006
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/426/TR2006-570.pdf (590KB)
Abstract:
Novel molecular function can be achieved by redesigning an enzyme's active site so
that it will perform its chemical reaction on a novel substrate. One of the main
challenges for protein redesign is the efficient evaluation of a combinatorial number
of candidate structures. The modeling of protein flexibility, typically by using
a rotamer library of commonly-observed low-energy side-chain conformations, further
increases the complexity of the redesign problem. A dominant algorithm for protein
redesign is Dead-End Elimination (DEE), which prunes the majority of candidate conformations
by eliminating rigid rotamers that provably are not part of the Global Minimum Energy
Conformation (GMEC). The identified GMEC consists of rigid rotamers (i.e., rotamers
that have not been energy-minimized) and is thus referred to as the rigid-GMEC.
As a post-processing step, the conformations that survive DEE may be energy-minimized.
When energy minimization is performed after pruning with DEE, the combined protein
design process becomes heuristic, and is no longer provably accurate: a conformation
that is pruned using rigid-rotamer energies may subsequently minimize to a lower
energy than the rigid-GMEC. That is, the rigid-GMEC and the conformation with the
lowest energy among all energy-minimized conformations (the minimized-GMEC) are
likely to be different. While the traditional DEE algorithm succeeds in not pruning
rotamers that are part of the rigid-GMEC, it makes no guarantees regarding the identification
of the minimized-GMEC. In this paper we derive a novel, provable, and efficient
DEE-like algorithm, called minimized-DEE (MinDEE), that guarantees that rotamers
belonging to the minimized-GMEC will not be pruned, while still pruning a combinatorial
number of conformations. We show that MinDEE is useful not only in identifying the
minimized-GMEC, but also as a filter in an ensemble-based scoring and search algorithm
for protein redesign that exploits energy-minimized conformations. We compare our
results both to our previous computational predictions of protein designs and to
biological activity assays of predicted protein mutants. Our provable and efficient
minimized-DEE algorithm is applicable in protein redesign, protein-ligand binding
prediction, and computer-aided drug design.
A Combined Routing Method for Ad hoc Wireless Networks
Dartmouth Technical Report TR2005-566
Zhenhui Jiang
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/425/TR2005-566.pdf (471KB)
Abstract:
To make ad hoc wireless networks adaptive to different mobility and traffic patterns,
we studied in this thesis an approach to swap from one protocol to another protocol
dynamically, while routing continues. By inserting a new layer, we were able
to make each node in the ad hoc wireless network notify the others about the protocol
swap. To ensure that routing works efficiently after the protocol swap, we initialized
the destination routing protocol's data structures and reused the previous routing
information to build the new routing table. We also tested our approach under different
network topologies and traffic patterns in static networks to learn whether the
swap is fast and whether the swap incurs too much overhead. We found that the swap
latency is related to the destination protocol and the topology of the network.
We also found that the control packet ratio after the swap is close to that of the
protocol running without a swap, which means our method does not incur too many
control packets for the swap.
How hard is it to cheat in the Gale-Shapley Stable Matching Algorithm
Dartmouth Technical Report TR2005-565
Chien-Chung Huang
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/424/TR2005-565.pdf (116KB)
Abstract:
We study strategy issues surrounding the stable marriage problem. Under the Gale-Shapley
algorithm (with men proposing), a classical theorem says that it is impossible for
every liar to get a better partner. We try to challenge this theorem. First, observing
a loophole in the statement of the theorem, we devise a coalition strategy in which
a non-empty subset of the liars gets a better partner and no man is worse off than
before. This strategy is restricted in that not everyone has the incentive to cheat.
We attack the classical theorem further by means of randomization. However, this
theorem shows surprising robustness: it is impossible that every liar has the chance
to improve while no one gets hurt. Hence, this impossibility result indicates that
it is always hard to induce some people to falsify their lists. Finally, to overcome
the problem of lacking motivation, we exhibit another randomized lying strategy
in which every liar can expect to get a better partner, though with a chance of
getting a worse one.
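For readers unfamiliar with the mechanism being attacked, the following is a minimal sketch of the standard men-proposing Gale-Shapley procedure. It is not code from the report. It assumes equal numbers of men and women and complete, strict preference lists; the function name and the dictionary layout are illustrative choices.

```python
from collections import deque


def gale_shapley(men_prefs, women_prefs):
    """Men-proposing deferred acceptance.

    men_prefs / women_prefs map each person to a list of the other side,
    most preferred first. Assumes complete lists and equal group sizes.
    Returns a dict mapping each woman to her final partner.
    """
    # rank[w][m] is the position of m in w's list (lower is better).
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}
    next_choice = {m: 0 for m in men_prefs}  # index of next woman to propose to
    engaged_to = {}                          # woman -> man
    free_men = deque(men_prefs)

    while free_men:
        m = free_men.popleft()
        w = men_prefs[m][next_choice[m]]
        next_choice[m] += 1
        current = engaged_to.get(w)
        if current is None:
            engaged_to[w] = m                # w was free and accepts
        elif rank[w][m] < rank[w][current]:
            engaged_to[w] = m                # w trades up; her old partner is free again
            free_men.append(current)
        else:
            free_men.append(m)               # w rejects m; he proposes again later
    return engaged_to


if __name__ == "__main__":
    men = {"a": ["x", "y"], "b": ["x", "y"]}
    women = {"x": ["b", "a"], "y": ["a", "b"]}
    print(gale_shapley(men, women))
```

A lying strategy in the sense of the abstract corresponds to calling the function with falsified entries in `men_prefs` and comparing the resulting partners against the truthful run.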
A Steerable, Untethered, 250x60 micron MEMS Mobile Micro-Robot
Dartmouth Technical Report TR2005-564
Bruce R. Donald
Christopher G. Levey
Craig D. McGray
Igor Paprotny
Daniela Rus
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/423/TR2005-564.pdf (17402KB)
Abstract:
We present a steerable, electrostatic, untethered, MEMS micro-robot, with dimensions
of 60 µm by 250 µm by 10 µm. This micro-robot is 1 to 2 orders
of magnitude smaller in size than previous micro-robotic systems. The device
consists of a curved, cantilevered steering arm, mounted on an untethered scratch
drive actuator. These two components are fabricated monolithically from the same
sheet of conductive polysilicon, and receive a common power and control signal through
a capacitive coupling with an underlying electrical grid. All locations on the grid
receive the same power and control signal, so that the devices can be operated without
knowledge of their position on the substrate and without constraining rails or tethers.
Control and power delivery waveforms are broadcast to the device through the capacitive
power coupling, and are decoded by the electromechanical response of the device
body. Individual control of the component actuators provides two distinct motion
gaits (forward motion and turning), which together allow full coverage of a planar
workspace (the robot is globally controllable). These MEMS micro-robots demonstrate
turning error of less than 3.7 °/mm during forward motion, turn with radii as
small as 176 µm, and achieve speeds of over 200 µm/sec, with an average
step size of 12 nm. They have been shown to operate open-loop for distances exceeding
35 cm without failure, and can be controlled through teleoperation to navigate complex
paths.
Computation reuse in stacking and unstacking
Dartmouth Technical Report TR2005-563
Anne Loomis
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/422/TR2005-563.pdf (502KB)
Abstract:
Algorithms for dynamic simulation and control are fundamental to many applications,
including computer games and movies, medical simulation, and mechanical design.
I propose to explore efficient algorithms for finding a stable unstacking sequence
-- an order in which we can remove every object from a structure without causing
the structure to collapse under gravity at any step.
We begin with a basic unstacking sequence algorithm: consider the set of all objects
in a structure. Collect all possible subsets into a disassembly graph. Search the
graph, testing the stability of each node as it is visited. Any path of stable nodes
from start to goal is a stable unstacking sequence.
I propose to show how we can improve the performance of individual stability tests
for three-dimensional structures with Coulomb friction, and give effective methods
for searching the disassembly graph. I will also analyze the computational complexity
of stable unstacking problems, and explore a classification of structures based
on characteristics of their stable unstacking sequences.
In preliminary work, I have shown that we can reuse computation from one stability
test of a planar subassembly to the next. The implementation, which solves the system
dynamics as a linear complementarity problem (LCP), outperforms an implementation
that solves the system statics as a linear program (LP). This is surprising because
LCPs are more complex than LPs, and dynamics equations are more complex than statics
equations.
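The basic search described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: `is_stable` is a hypothetical caller-supplied predicate standing in for the physical stability test (which the abstract formulates as an LCP or LP), and the sketch explores the disassembly graph lazily by depth-first search instead of collecting every subset up front.

```python
def find_unstacking_sequence(objects, is_stable):
    """Return a removal order in which every intermediate structure is stable.

    objects   -- iterable of hashable, sortable object identifiers
    is_stable -- hypothetical predicate: frozenset of remaining objects -> bool
    Returns a list of objects in removal order, or None if no order exists.
    """
    start = frozenset(objects)
    if not is_stable(start):
        return None
    dead_ends = set()  # subsets already shown to have no stable continuation

    def search(remaining):
        if not remaining:
            return []                      # goal node: everything removed
        if remaining in dead_ends:
            return None
        for obj in sorted(remaining):
            rest = remaining - {obj}
            if is_stable(rest):            # test the node as it is visited
                tail = search(rest)
                if tail is not None:
                    return [obj] + tail
        dead_ends.add(remaining)
        return None

    return search(start)


if __name__ == "__main__":
    # Toy stability model (an assumption): each block needs its supporter present.
    supports = {"top": "middle", "middle": "base"}

    def toy_stable(remaining):
        return all(supports[o] in remaining for o in remaining if o in supports)

    print(find_unstacking_sequence(["base", "middle", "top"], toy_stable))
```

Memoizing dead ends keeps the search from revisiting the same subset, which matters because a structure with n objects has up to 2^n nodes in its disassembly graph.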
A Quasi-PTAS for Unsplittable Flow on Line Graphs
Dartmouth Technical Report TR2005-561
Nikhil Bansal
Amit Chakrabarti
Amir Epstein
Baruch Schieber
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/421/TR2005-561.pdf (118KB)
Abstract:
We study the Unsplittable Flow Problem (UFP) on a line graph, focusing on the long-standing
open question of whether the problem is APX-hard. We describe a deterministic quasi-polynomial
time approximation scheme for UFP on line graphs, thereby ruling out an APX-hardness
result, unless NP is contained in DTIME(2^polylog(n)). Our result requires a quasi-polynomial
bound on all edge capacities and demands in the input instance. Earlier results
on this problem included a polynomial time (2+epsilon)-approximation under the assumption
that no demand exceeds any edge capacity (the "no-bottleneck assumption") and a
super-constant integrality gap if this assumption did not hold. Unlike most earlier
work on UFP, our results do not require a no-bottleneck assumption.
Combinatorial Theorems about Embedding Trees on the Real Line
Dartmouth Technical Report TR2005-560
Amit Chakrabarti
Subhash Khot
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/420/TR2005-560.pdf (201KB)
Abstract:
We consider the combinatorial problem of embedding a tree metric into the real line
with low distortion. For two special families of trees --- the family of complete
binary trees and the family of subdivided stars --- we provide embeddings whose
distortion is provably optimal, up to a constant factor. We also prove that the
optimal distortion of a linear embedding of a tree can be arbitrarily low or high
even when it has bounded degree.
Performance Evaluation of Distributed Security Protocols Using Discrete Event Simulation
Dartmouth Technical Report TR2005-559
Meiyuan Zhao
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/419/TR2005-559.pdf (930KB)
Abstract:
The Border Gateway Protocol (BGP) that manages inter-domain routing on the Internet
lacks security. Protective measures using public key cryptography introduce complexities
and costs.
To support authentication and other security functionality in large networks, we
need public key infrastructures (PKIs). Protocols that distribute and validate certificates
introduce additional complexities and costs. The certification path building algorithm
that helps users establish trust in certificates in the distributed network environment
is particularly complicated.
Neither routing security nor PKI comes for free. Prior to this work, research
on the performance issues of these large-scale distributed security systems was
minimal. In this thesis, we evaluate the performance of BGP security protocols and
PKI systems. We answer the questions about how the performance affects protocol
behaviors and how we can improve the efficiency of these distributed protocols to
bring them one step closer to reality.
The complexity of the Internet makes an analytical approach difficult, and the scale
of the Internet makes empirical approaches unworkable as well. Consequently, we take the
approach of simulation. We have built simulation frameworks to model a number
of BGP security protocols and the PKI system. We have identified performance problems
of Secure BGP (S-BGP), a primary BGP security protocol, and proposed and evaluated
Signature Amortization (S-A) and Aggregated Path Authentication (APA) schemes that
significantly improve efficiency of S-BGP without compromising security. We have
also built a simulation framework for general PKI systems and evaluated certification
path building algorithms, a critical part of establishing trust in Internet-scale
PKI, and used this framework to improve algorithm performance.
Improving Large-Scale Network Traffic Simulation with Multi-Resolution Models
Dartmouth Technical Report TR2005-558
Guanhua Yan
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/418/TR2005-558.pdf (1026KB)
Abstract:
Simulating a large-scale network like the Internet is a challenging undertaking because
of the sheer volume of its traffic. Packet-oriented representation provides high-fidelity
details but is computationally expensive; fluid-oriented representation offers high
simulation efficiency at the price of losing packet-level details. Multi-resolution
modeling techniques exploit the advantages of both representations by integrating
them in the same simulation framework. This dissertation presents solutions to the
problems regarding the efficiency, accuracy, and scalability of the traffic simulation
models in this framework. The "ripple effect" is a well-known problem inherent
in event-driven fluid-oriented traffic simulation, causing explosion of fluid rate
changes. Integrating multi-resolution traffic representations requires estimating
arrival rates of packet-oriented traffic, calculating the queueing delay upon a
packet arrival, and computing packet loss rate under buffer overflow. Real time
simulation of a large or ultra-large network demands efficient background traffic
simulation. The dissertation includes a rate smoothing technique that provably mitigates
the "ripple effect", an accurate and efficient approach that integrates traffic
models at multiple abstraction levels, a sequential algorithm that achieves real
time simulation of the coarse-grained traffic in a network with 3 tier-1 ISP (Internet
Service Provider) backbones using an ordinary PC, and a highly scalable parallel
algorithm that simulates network traffic at coarse time scales.
Natural Image Statistics for Digital Image Forensics
Dartmouth Technical Report TR2005-557
Siwei Lyu
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/417/TR2005-557.pdf (3233KB)
Abstract:
We describe a set of natural image statistics that are built upon two multi-scale
image decompositions, the quadrature mirror filter pyramid decomposition and the
local angular harmonic decomposition. These image statistics consist of first- and
higher-order statistics that capture certain statistical regularities of natural
images. We propose to apply these image statistics, together with classification
techniques, to three problems in digital image forensics: (1) differentiating photographic
images from computer-generated photorealistic images, (2) generic steganalysis, and
(3) rebroadcast image detection. We also apply these image statistics to the traditional
art authentication for forgery detection and identification of artists in an art
work. For each application we show the effectiveness of these image statistics and
analyze their sensitivity and robustness.
Efficient Wait-Free Algorithms for Implementing LL/SC Objects
Dartmouth Technical Report TR2005-556
Srdjan Petrovic
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/416/TR2005-556.pdf (913KB)
Abstract:
Over the past decade, a pair of instructions called load-linked (LL) and store-conditional
(SC) have emerged as the most suitable synchronization instructions for the design
of lock-free algorithms. However, current architectures do not support these instructions;
instead, they support either CAS (e.g., UltraSPARC, Itanium, Pentium) or restricted
versions of LL/SC (e.g., POWER4, MIPS, Alpha). Thus, there is a gap between what
algorithm designers want (namely, LL/SC) and what multiprocessors actually support
(namely, CAS or restricted LL/SC). To bridge this gap, this thesis presents a series
of efficient, wait-free algorithms that implement LL/SC from CAS or restricted LL/SC.
The Theory of Trackability with Applications to Sensor Networks
Dartmouth Technical Report TR2005-555
Valentino Crespi
George V. Cybenko
Guofei Jiang
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/415/TR2005-555.pdf (439KB)
Abstract:
In this paper, we formalize the concept of tracking in a sensor network and develop
a rigorous theory of trackability that investigates the rate of growth of the
number of consistent tracks given a sequence of observations made by the sensor
network. The phenomenon being tracked is modelled by a nondeterministic finite automaton
and the sensor network is modelled by an observer capable of detecting events related,
typically ambiguously, to the states of the underlying automaton.
More formally, an input string, $Z^t$, of $t+1$ symbols (the sensor network observations)
that is presented to a nondeterministic finite automaton, $M$, (the model) determines
a set, $\mathcal{H}_M(Z^t)$, of state sequences (the tracks or hypotheses) that are
capable of generating the input string $Z^t$. We study the growth of the size of
this set, $|\mathcal{H}_M(Z^t)|$, as a function of the length of the input string, $t+1$.
Our main result is that for a given automaton and sensor coverage, the worst-case
rate of growth is either polynomial or exponential in $t$, indicating a kind of
phase transition in tracking accuracy.
The techniques we use include the Joint Spectral Radius, $\rho(\Sigma)$, of a finite
set, $\Sigma$, of $(0,1)$-matrices derived from $M$. Specifically, we construct a
set of matrices, $\Sigma$, corresponding to $M$ with the property that $\rho(\Sigma)
\leq 1$ if and only if $|\mathcal{H}_M(Z^t)|$ grows polynomially in $t$. We also prove
that for $(0,1)$-matrices, the decision problem $\rho(\Sigma) \leq 1$ is Turing decidable
and, therefore, so is the problem of deciding whether worst case state sequence
growth for a given automaton is polynomial or exponential. These results have applications
in sensor networks, computer network security and autonomic computing as well as
various tracking problems of recent interest involving detecting phenomena using
noisy observations of hidden states.
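The quantity studied in this abstract, the number of state sequences consistent with an observation string, can be computed by a simple dynamic program. The sketch below is illustrative only and makes assumptions not stated in the abstract: any state may start a track, `transitions` maps a state to its allowed successors, and `emits` maps a state to the set of observation symbols it may produce. Each update step is equivalent to multiplying by a $(0,1)$-matrix selected by the observed symbol.

```python
def count_tracks(transitions, emits, observations):
    """Count state sequences (tracks) consistent with a sequence of observations.

    transitions  -- dict: state -> iterable of successor states
    emits        -- dict: state -> set of symbols observable in that state
    observations -- non-empty list of observed symbols
    Assumes (for illustration) that a track may begin in any state.
    """
    # counts[s] = number of consistent tracks that currently end in state s
    counts = {s: 1 if observations[0] in emits[s] else 0 for s in emits}
    for symbol in observations[1:]:
        updated = {s: 0 for s in emits}
        for state, n in counts.items():
            if n == 0:
                continue
            for nxt in transitions.get(state, ()):
                if symbol in emits[nxt]:
                    updated[nxt] += n
        counts = updated
    return sum(counts.values())


if __name__ == "__main__":
    emits = {"A": {"o"}, "B": {"o"}}              # fully ambiguous sensor
    dense = {"A": ["A", "B"], "B": ["A", "B"]}    # every move allowed
    chain = {"A": ["A", "B"], "B": ["B"]}         # B is absorbing
    for length in range(1, 6):
        obs = ["o"] * length
        print(length, count_tracks(dense, emits, obs), count_tracks(chain, emits, obs))
```

In this toy pair, the fully connected automaton doubles its track count with every observation, while the automaton with an absorbing state grows only linearly, which illustrates the polynomial versus exponential dichotomy the abstract describes.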
Efficiently Implementing a Large Number of LL/SC Objects
Dartmouth Technical Report TR2005-554
Prasad Jayanti
Srdjan Petrovic
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/414/TR2005-554.pdf (380KB)
Abstract:
Over the past decade, a pair of instructions called load-linked (LL) and store-conditional
(SC) have emerged as the most suitable synchronization instructions for the design
of lock-free algorithms. However, current architectures do not support these instructions;
instead, they support either CAS (e.g., UltraSPARC, Itanium) or restricted versions
of LL/SC (e.g., POWER4, MIPS, Alpha). Thus, there is a gap between what algorithm
designers want (namely, LL/SC) and what multiprocessors actually support (namely,
CAS or RLL/RSC). To bridge this gap, a flurry of algorithms that implement LL/SC
from CAS have appeared in the literature. The two most recent algorithms are due
to Doherty, Herlihy, Luchangco, and Moir (2004) and Michael (2004). To implement
M LL/SC objects shared by N processes, Doherty et al.'s algorithm uses only O(N
+ M) space, but is only non-blocking and not wait-free. Michael's algorithm, on
the other hand, is wait-free, but uses O(N^2 + M) space. The main drawback of his
algorithm is the time complexity of the SC operation: although the expected amortized
running time of SC is only O(1), the worst-case running time of SC is O(N^2). The
algorithm in this paper overcomes this drawback. Specifically, we design a wait-free
algorithm that achieves a space complexity of O(N^2 + M), while still maintaining
the O(1) worst-case running time for LL and SC operations.
An Untethered, Electrostatic, Globally Controllable MEMS Micro-Robot: Supplementary videos
Dartmouth Technical Report TR2005-553
Bruce R. Donald
Christopher G. Levey
Craig D. McGray
Igor Paprotny
Daniela Rus
Date: January 2005
Abstract:
We present a steerable, electrostatic, untethered, MEMS micro-robot, with dimensions
of 60 µm by 250 µm by 10 µm. This micro-robot is
1 to 2 orders of magnitude smaller in size than previous micro-robotic systems.
The device consists of a curved, cantilevered steering arm, mounted on an untethered
scratch drive actuator. These two components are fabricated monolithically from
the same sheet of conductive polysilicon, and receive a common power and control
signal through a capacitive coupling with an underlying electrical grid. All locations
on the grid receive the same power and control signal, so that the devices can be
operated without knowledge of their position on the substrate and without constraining
rails or tethers. Control and power delivery waveforms are broadcast to the device
through the capacitive power coupling, and are decoded by the electromechanical
response of the device body. Individual control of the component actuators provides
two distinct motion gaits (forward motion and turning), which together allow full
coverage of a planar workspace (the robot is globally controllable). These MEMS
micro-robots demonstrate turning error of less than 3.7 °/mm during forward
motion, turn with radii as small as 176 µm, and achieve speeds of over
200 µm/sec, with an average step size of 12 nm. They have been shown
to operate open-loop for distances exceeding 35 cm without failure, and can
be controlled through teleoperation to navigate complex paths. This document contains
movies showing the actuation of the micro-robots during open-loop actuation and
teleoperation experiments. The videos have been sped up for ease of viewing. On
each video, the time-scale is noted in the lower-right corner of the screen.
On the Design of an Immersive Environment for Security-Related Studies
Dartmouth Technical Report TR2005-552
Yougu Yuan
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/412/TR2005-552.pdf (381KB)
Abstract:
The Internet has become an essential part of normal operations of both public and
private sectors. Many security issues were not addressed in the original Internet
design, and security has now become a large concern for networking research and
study. There is an imperative need for a simulation environment that can be
used to help study security-related research problems. In this thesis we present
our effort to build such an environment: the Real-time Immersive Network Simulation
Environment (RINSE). RINSE features flexible configuration of models using various
networking protocols and real-time user interaction. We also present the Estimate
Next Infection (ENI) model we developed for Internet scanning worms using RINSE,
and the effort of combining multiple resolutions in worm modeling.
More Efficient Secure Function Evaluation Using Tiny Trusted Third Parties
Dartmouth Technical Report TR2005-551
Alexander Iliev
Sean Smith
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/411/TR2005-551.pdf (145KB)
Abstract:
We investigate the use of trustworthy devices, which function as trusted third parties
(TTPs), to solve general two-party Secure Function Evaluation (SFE) problems. We
assume that a really trustworthy TTP device will have very limited protected memory
and computation environment---a tiny TTP. This precludes trivial solutions
like "just run the function in the TTP".
Traditional scrambled circuit evaluation approaches to SFE have a very high overhead
in using indirectly-addressed arrays---every array access's cost is linear in the
array size. The main gain in our approach is that array access can be provided with
much smaller overhead---$O(\sqrt{N}\log N)$. This expands the horizon of problems
which can be efficiently solved using SFE. Additionally, our technique provides
a simple way to deploy arbitrary programs on tiny TTPs.
In our prototype, we use a larger (and expensive) device, the IBM 4758 secure coprocessor,
but we also speculate on the design of future tiny devices that could greatly improve
the current prototype's efficiency by being optimized for the operations prevalent
in our algorithms.
We have prototyped a compiler for the secure function definition language (SFDL)
developed in the Fairplay project. Our compiler produces an arithmetic circuit,
augmented with array access gates which provide more efficient secure access
to arrays. We then have a circuit interpreter in the 4758 to evaluate such a circuit
on given inputs. It does this gate by gate, requiring very little protected space.
We report on the performance of this prototype, which confirms our approach's strength
in handling indirectly-addressed arrays.
Mining Frequent and Periodic Association Patterns
Dartmouth Technical Report TR2005-550
Guanling Chen
Heng Huang
Minkyong Kim
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/410/TR2005-550.pdf (131KB)
Abstract:
Profiling the clients' movement behaviors is useful for mobility modeling, anomaly
detection, and location prediction. In this paper, we study clients' frequent and
periodic movement patterns in a campus wireless network. We use offline data-mining
algorithms to discover patterns from clients' association history, and analyze the
reported patterns using statistical methods. Many of our results reflect the common
characteristics of a typical academic campus, though we also observed some unusual
association patterns. There are two challenges: one is to remove noise from data
for efficient pattern discovery, and the other is to interpret discovered patterns.
We address the first challenge using a heuristic-based approach applying domain
knowledge. The second issue is harder to address because we do not have knowledge
of people's activities, but nonetheless we could make reasonable interpretations
of the common patterns.
Structural Analysis of Social Networks with Wireless Users
Dartmouth Technical Report TR2005-549
Guanling Chen
David Kotz
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/409/TR2005-549.pdf (94KB)
Abstract:
Online interactions between computer users form Internet-based social networks. In
this paper we present a structural analysis of two such networks with wireless users.
In one network the wireless users participate in a global file-sharing system, and
in the other they interact with each other through a local music-streaming application.
Lower Bounds on the Communication Complexity of Shifting
Dartmouth Technical Report TR2005-548
Marco D. Adelfio
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/408/TR2005-548.pdf (110KB)
Abstract:
We study the communication complexity of the SHIFT (equivalently, SUM-INDEX) function
in a 3-party simultaneous message model. Alice and Bob share an n-bit string x and
Alice holds an index i and Bob an index j. They must send messages to a referee
who knows only n, i and j, enabling him to determine x[(i+j) mod n]. Surprisingly,
it is possible to achieve nontrivial savings even with such a strong restriction:
Bob can now make do with only ceil(n/2) bits. Here we show that this bound is completely
tight, for all n. This is an exact lower bound, with no asymptotics involved.
Towards Tiny Trusted Third Parties
Dartmouth Technical Report TR2005-547
Alexander Iliev
Sean Smith
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/407/TR2005-547.pdf (444KB)
Abstract:
Many security protocols hypothesize the existence of a trusted third party (TTP)
to ease handling of computation and data too sensitive for the other parties involved.
Subsequent discussion usually dismisses these protocols as hypothetical or impractical,
under the assumption that trusted third parties cannot exist. However, the last
decade has seen the emergence of hardware-based devices that, to high assurance,
can carry out computation unmolested; emerging research promises more. In theory,
such devices can perform the role of a trusted third party in real-world problems.
In practice, we have found problems. The devices aspire to be general-purpose processors
but are too small to accommodate real-world problem sizes. The small size forces
programmers to hand-tune each algorithm anew, if possible, to fit inside the small
space without losing security. This tuning heavily uses operations that general-purpose
processors do not perform well. Furthermore, perhaps by trying to incorporate too
much functionality, current devices are also too expensive to deploy widely.
Our current research attempts to overcome these barriers, by focusing on the effective
use of tiny TTPs (T3Ps). To eliminate the programming obstacle, we used
our experience building hardware TTP apps to design and prototype an efficient way
to execute arbitrary programs on T3Ps while preserving the critical trust properties.
To eliminate the performance and cost obstacles, we are currently examining the
potential hardware design for a T3P optimized for these operations.
In previous papers, we reported our work on the programming obstacle. In this paper,
we examine the potential hardware designs. We estimate that such a T3P could outperform
existing devices by several orders of magnitude, while also having a gate-count
of only 30K-60K, one to three orders of magnitude smaller than existing devices.
Boolean operations with two dimensional segment trees
Dartmouth Technical Report TR2005-546
Robert Scot Drysdale
David P. Wagner
Date: January 2005
Abstract:
Automatic Image Orientation Determination with Natural Image Statistics
Dartmouth Technical Report TR2005-545
Siwei Lyu
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/405/TR2005-545.pdf (332KB)
Abstract:
In this paper, we propose a new method for automatically determining image orientations.
This method is based on a set of natural image statistics collected from a multi-scale
multi-orientation image decomposition (e.g., wavelets). From these statistics, a
two-stage hierarchal classification with multiple binary SVM classifiers is employed
to determine image orientation. The proposed method is evaluated and compared
to existing methods with experiments performed on 18040 natural images, where it
showed promising performance.
Managing Access Control in Virtual Private Networks
Dartmouth Technical Report TR2005-544
Twum Djin
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/404/TR2005-544.pdf (587KB)
Abstract:
Virtual Private Network technology allows remote network users to benefit from resources
on a private network as if their host machines actually resided on the network.
However, each resource on a network may also have its own access control policies,
which may be completely unrelated to network access. Thus users' access to a network
(even by VPN technology) does not guarantee their access to the sought resources.
With the introduction of more complicated access privileges, such as delegated access,
it is conceivable for a scenario to arise where a user can access a network remotely
(because of direct permissions from the network administrator or by delegated permission)
but cannot access any resources on the network. There is, therefore, a need for
a network access control mechanism that understands the privileges of each remote
network user on one hand, and the access control policies of various network resources
on the other hand, and so can aid a remote user in accessing these resources based
on the user's privileges.
This research presents a software solution in the form of a centralized access
control framework called an Access Control Service (ACS), that can grant remote
users network presence and simultaneously aid them in accessing various network
resources with varying access control policies. At the same time, the ACS provides
a centralized framework for administrators to manage access to their resources.
The ACS achieves these objectives using VPN technology, network address translation
and by proxying various authentication protocols on behalf of remote users.
On-line Metasearch, Pooling, and System Evaluation
Dartmouth Technical Report TR2005-543
Robert A. Savell
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/403/TR2005-543.pdf (2437KB)
Abstract:
This thesis presents a unified method for simultaneous solution of three problems
in Information Retrieval--- metasearch (the fusion of ranked lists returned by retrieval
systems to elicit improved performance), efficient system evaluation (the accurate
evaluation of retrieval systems with small numbers of relevance judgements), and
pooling or "active sample selection" (the selection of documents for manual judgement
in order to develop sample pools of high precision or pools suitable for assessing
system quality). The thesis establishes a unified theoretical framework for addressing
these three problems and naturally generalizes their solution to the on-line context
by incorporating feedback in the form of relevance judgements. The algorithm---
Rankhedge for on-line retrieval, metasearch and system evaluation--- is the first
to address these three problems simultaneously and also to generalize their solution
to the on-line context.
Optimality of the Rankhedge algorithm is developed via Bayesian and maximum entropy
interpretations. Results of the algorithm prove to be significantly superior to
previous methods when tested over a range of TREC (Text REtrieval Conference) data.
In the absence of feedback, the technique equals or exceeds the performance of benchmark
metasearch algorithms such as CombMNZ and Condorcet. The technique then dramatically
improves on this performance during the on-line metasearch process. In addition,
the technique generates pools of documents which include more relevant documents
and produce more accurate system evaluations than previous techniques. The thesis
includes an information-theoretic examination of the original Hedge algorithm as
well as its adaptation to the context of ranked lists. The work also addresses the
concept of information-theoretic similarity within the Rankhedge context and presents
a method for decorrelating the predictor set to improve worst case performance.
Finally, an information-theoretically optimal method for probabilistic "active
sampling" is presented with possible application to a broad range of practical and
theoretical contexts.
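The abstract above builds on the original Hedge algorithm. As a rough illustration only, the generic Hedge multiplicative-weights update (due to Freund and Schapire) can be sketched as follows. This is not Rankhedge itself: the function name, the `beta` default, and the idea of treating each retrieval system as an "expert" whose loss is derived from relevance judgements are assumptions made for this sketch.

```python
def hedge(loss_rounds, beta=0.5):
    """Run the generic Hedge update and return the final normalized weights.

    loss_rounds: list of rounds; each round is a list of per-expert losses in [0, 1].
    beta: update parameter in (0, 1); smaller values penalize loss more sharply.
    (Both parameters are illustrative; the thesis may define them differently.)
    """
    n = len(loss_rounds[0])
    weights = [1.0 / n] * n  # start from a uniform distribution over experts
    for losses in loss_rounds:
        # Multiplicative update: an expert with loss l keeps a beta**l share of its weight.
        weights = [w * beta ** l for w, l in zip(weights, losses)]
        total = sum(weights)
        weights = [w / total for w in weights]  # renormalize to a distribution
    return weights


# Three hypothetical systems judged over two rounds (1 = wrong, 0 = right).
final_weights = hedge([[0, 1, 1], [0, 1, 0]])
```

In this toy run the first system, which never incurs loss, ends with the largest weight, and the second, which is wrong in both rounds, ends with the smallest.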
A toy rock climbing robot
Dartmouth Technical Report TR2005-542
Matthew P. Bell
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/402/TR2005-542.pdf (1082KB)
Abstract:
The goal of this thesis was to build a simple toy rock climbing robot, and to explore
problems related to grasping, path planning, and robot control. The robot is capable
of climbing a wall of pegs either under manual control through a host system and
an infrared interface, or on the basis of a set of pre-recorded keyframes. In addition,
the robot can climb certain peg configurations using a cyclic gait. The robot climbs
in an open-loop mode without sensor feedback. All communications are sent through
the IR connection, and the tether to the robot consists only of two power wires.
Aggregated Path Authentication for Efficient BGP Security
Dartmouth Technical Report TR2005-541
Meiyuan Zhao
Sean W. Smith
David M. Nicol
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/401/TR2005-541.pdf (171KB)
Abstract:
The border gateway protocol (BGP) controls inter-domain routing in the Internet.
BGP is vulnerable to many attacks, since routers rely on hearsay information from
neighbors. Secure BGP (S-BGP) uses DSA to provide route authentication and mitigate
many of these risks. However, many performance and deployment issues prevent S-BGP's
real-world deployment. Previous work has explored improving S-BGP processing latencies,
but space problems, such as increased message size and memory cost, remain the major
obstacles. In this paper, we combine two efficient cryptographic techniques---signature
amortization and aggregate signatures---to design new aggregated path authentication
schemes. We propose six constructions for aggregated path authentication that substantially
improve efficiency of S-BGP's path authentication on both speed and space criteria.
Our performance evaluation shows that the new schemes are efficient enough
that they may overcome the space obstacles and provide a practical real-world solution
for BGP security.
Classifying the Mobility of Users and the Popularity of Access Points
Dartmouth Technical Report TR2005-540
Minkyong Kim
David Kotz
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/400/TR2005-540.pdf (1532KB)
Abstract:
There is increasing interest in location-aware systems and applications. It is important
for any designer of such systems and applications to understand the nature of user
and device mobility. Furthermore, an understanding of the effect of user mobility
on access points (APs) is also important for designing, deploying, and managing
wireless networks. Although various studies of wireless networks have provided insights
into different network environments and user groups, it is often hard to apply these
findings to other situations, or to derive useful abstract models.
In this paper, we present a general methodology for extracting mobility information
from wireless network traces, and for classifying mobile users and APs. We used
the Fourier transform to convert time-dependent location information to the frequency
domain, then chose the two strongest periods and used them as parameters to a classification
system based on Bayesian theory. To classify mobile users, we computed diameter
(the maximum distance between any two APs visited by a user during a fixed time
period) and observed how this quantity changes or repeats over time. We found that
user mobility had a strong period of one day, but there was also a large group of
users that had either a much smaller or much bigger primary period. Both primary
and secondary periods had important roles in determining classes of mobile users.
Users with one day as their primary period and a smaller secondary period were most
prevalent; we expect that they were mostly students taking regular classes. To classify
APs, we counted the number of users who visited each AP. The primary period did not
play a critical role because it was equal to one day for most of the APs; the secondary
period was the determining parameter. APs with one day as their primary period and
one week as their secondary period were most prevalent. By plotting the classes
of APs on our campus map, we discovered that this periodic behavior of APs seemed
to be independent of their geographical locations, but may depend on the relative
locations of nearby APs. Ultimately, we hope that our study can help the design
of location-aware services by providing a base for user mobility models that reflect
the movements of real users.
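The two measurements described in the abstract above, a user's diameter and the strongest periods of a time series, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the planar AP coordinates, the hourly sampling, and the naive O(n^2) discrete Fourier transform are all choices made for this sketch.

```python
import cmath
import math
from itertools import combinations


def diameter(ap_positions):
    """Maximum distance between any two APs visited in one time window.

    ap_positions: list of (x, y) coordinates (a hypothetical planar layout).
    """
    return max((math.dist(a, b) for a, b in combinations(ap_positions, 2)),
               default=0.0)


def strongest_periods(signal, k=2):
    """Return the k periods (in samples) with the largest DFT magnitude."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the constant (DC) component
    magnitudes = []
    for f in range(1, n // 2 + 1):  # frequency f corresponds to a period of n / f samples
        coeff = sum(x * cmath.exp(-2j * cmath.pi * f * t / n)
                    for t, x in enumerate(centered))
        magnitudes.append((abs(coeff), n / f))
    magnitudes.sort(reverse=True)
    return [period for _, period in magnitudes[:k]]


# Synthetic hourly series over four weeks: a strong daily cycle plus a weaker weekly one.
hours = 24 * 28
series = [2 * math.cos(2 * math.pi * t / 24) + math.cos(2 * math.pi * t / 168)
          for t in range(hours)]
primary, secondary = strongest_periods(series)
```

For this synthetic series the primary period comes out as 24 samples (one day) and the secondary as 168 samples (one week). A classifier such as the Bayesian one the abstract mentions would take these two values as its input parameters.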
Preventing Theft of Quality of Service on Open Platforms
Dartmouth Technical Report TR2005-539
Kwang-Hyun Baek
Sean W. Smith
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/399/TR2005-539.pdf (597KB)
Abstract:
As multiple types of traffic converge onto one network (frequently wireless), enterprises
face a tradeoff between effectiveness and security. Some types of traffic, such
as voice-over-IP (VoIP), require certain quality of service (QoS) guarantees to
be effective. The end client platform is in the best position to know which packets
deserve this special handling. In many environments (such as universities), end
users relish having control over their own machines. However, if end users administer
their own machines, nothing stops dishonest ones from marking undeserving traffic
for high QoS. How can an enterprise ensure that only appropriate traffic receives
high QoS, while also allowing end users to retain control over their own machines?
In this paper, we present the design and prototype of a solution, using SELinux,
TCPA/TCG hardware, Diffserv, 802.1x, and EAP-TLS.
An O(n^{5/2} log n) Algorithm for the Rectilinear Minimum Link-Distance Problem in Three Dimensions (Extended Abstract)
Dartmouth Technical Report TR2005-538
Robert Scot Drysdale
Clifford Stein
David P. Wagner
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/398/TR2005-538.pdf (109KB)
Abstract:
In this paper we consider the Rectilinear Minimum Link-Distance Problem in Three
Dimensions. The problem is well studied in two dimensions, but is relatively unexplored
in higher dimensions. We solve the problem in O(B n log n) time, where n is the
number of corners among all obstacles, and B is the size of a BSP decomposition
of the space containing the obstacles. It has been shown that in the worst case
B = Theta(n^{3/2}), giving us an overall worst case time of O(n^{5/2} log n). Previously
known algorithms have had worst-case running times of Omega(n^3).
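The worst-case bound quoted in the abstract above follows by substituting the worst-case BSP size into the running time. A short worked derivation, using only the quantities named in the abstract, also shows that the new bound grows strictly more slowly than n^3:

```latex
\[
O(B\,n\log n)\ \text{with}\ B=\Theta(n^{3/2})
\;\Longrightarrow\;
O\!\left(n^{3/2}\cdot n\log n\right)=O\!\left(n^{5/2}\log n\right),
\qquad
\frac{n^{5/2}\log n}{n^{3}}=\frac{\log n}{\sqrt{n}}\xrightarrow[n\to\infty]{}0 .
\]
```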
Detection of Covert Channel Encoding in Network Packet Delays
Dartmouth Technical Report TR2005-536
Vincent Berk
Annarita Giani
George Cybenko
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/397/TR2005-536-rev1.pdf (137KB)
Abstract:
Covert channels are mechanisms for communicating information in ways that are difficult
to detect. Data exfiltration can be an indication that a computer has been compromised
by an attacker even when other intrusion detection schemes have failed to detect
a successful attack. Covert timing channels use packet inter-arrival times, not
header or payload embedded information, to encode covert messages. This paper investigates
the channel capacity of Internet-based timing channels and proposes a methodology
for detecting covert timing channels based on how close a source comes to achieving
that channel capacity. A statistical approach is then used for the special case
of binary codes.
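To make the idea of a binary timing code in the abstract above concrete, the sketch below encodes bits purely in packet inter-arrival gaps and recovers them with a threshold. It illustrates the channel only, not the detection method the paper proposes. The gap lengths, the threshold, and all function names are assumptions chosen for illustration, and network jitter is ignored.

```python
def encode_bits(bits, short_gap=0.02, long_gap=0.08):
    """Map each bit to an inter-packet delay in seconds (0 -> short, 1 -> long)."""
    return [long_gap if bit else short_gap for bit in bits]


def arrival_times(delays, start=0.0):
    """Turn a list of delays into absolute packet timestamps."""
    times = [start]
    for delay in delays:
        times.append(times[-1] + delay)
    return times


def decode_arrivals(times, threshold=0.05):
    """Recover bits from timestamps by thresholding the inter-arrival gaps."""
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return [1 if gap > threshold else 0 for gap in gaps]


message = [1, 0, 1, 1, 0, 0, 1]
recovered = decode_arrivals(arrival_times(encode_bits(message)))
```

No header or payload field carries the message here: a receiver that sees only the timestamps recovers it exactly. That is why detection has to examine timing statistics rather than packet contents.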
Graphical Models of Residue Coupling in Protein Families
Dartmouth Technical Report TR2005-535
John Thomas
Naren Ramakrishnan
Chris Bailey-Kellogg
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/396/TR2005-535.pdf (282KB)
Abstract:
Identifying residue coupling relationships within a protein family can provide important
insights into intrinsic molecular processes, and has significant applications in
modeling structure and dynamics, understanding function, and designing new or modified
proteins. We present the first algorithm to infer an undirected graphical model
representing residue coupling in protein families. Such a model serves as a compact
description of the joint amino acid distribution, and can be used for predictive
(will this newly designed protein be folded and functional?), diagnostic (why is
this protein not stable or functional?), and abductive reasoning (what if I attempt
to graft features of one protein family onto another?). Unlike current correlated
mutation algorithms that are focused on assessing dependence, which can conflate
direct and indirect relationships, our algorithm focuses on assessing independence,
which modularizes variation and thus enables efficient reasoning of the types described
above. Further, our algorithm can readily incorporate, as priors, hypotheses regarding
possible underlying mechanistic/energetic explanations for coupling. The resulting
approach constitutes a powerful and discriminatory mechanism to identify residue
coupling from protein sequences and structures. Analysis results on the G-protein
coupled receptor (GPCR) and PDZ domain families demonstrate the ability of our approach
to effectively uncover and exploit models of residue coupling.
Department of Computer Science Activity 1998-2004
Dartmouth Technical Report TR2005-534
David Kotz
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/395/TR2005-534.pdf (871KB)
Abstract:
This report summarizes much of the research and teaching activity of the Department
of Computer Science at Dartmouth College between late 1998 and late 2004. The material
for this report was collected as part of the final report for NSF Institutional
Infrastructure award EIA-9802068, which funded equipment and technical staff during
that six-year period. This equipment and staff supported essentially all of the
department's research activity during that period.
Spatial Multipath Location Aided Routing
Dartmouth Technical Report TR2005-533
Soumendra Nanda
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/394/TR2005-533.pdf (1945KB)
Abstract:
Mobile ad-hoc networks (MANETs) are infrastructure-free networks of mobile nodes
that communicate with each other wirelessly. Several routing schemes have been
proposed for such networks, and several of these have already been extensively simulated
or implemented. The primary applications of such networks have been in disaster
relief operations, military use, conferencing and environment sensing. There are
several ad hoc routing algorithms at present that utilize position information (usually
in two dimensional terms) to make routing decisions at each node. Our goal is to
utilize three-dimensional (3D) position information to provide more reliable as
well as efficient routing for certain applications. We thus describe extensions
to various location aware routing algorithms to work in 3D. We propose a new hierarchical,
zone-based 3D routing algorithm, based on GRID by Liao, Tseng and Sheu. Our new
algorithm called "Hyper-GRID" is a hybrid algorithm that uses multipath routing
(alternate path caching) in 3D. We propose replacing LAR with Multipath LAR (MLAR)
in GRID. We have implemented MLAR and are validating MLAR through simulation using
ns-2 and studying its efficiency, scalability and other properties. We use a random
waypoint mobility model and compare our MLAR approach versus LAR, AODV and AOMDV
in both 2D and 3D for a range of traffic and mobility scenarios. Our simulation
results demonstrate the performance benefits of MLAR over LAR and AODV in most mobility
situations. AOMDV delivers more packets than MLAR consistently, but does so at the
cost of more frequent flooding of control packets and thus higher bandwidth usage
than MLAR.
SHEMP: Secure Hardware Enhanced MyProxy
Dartmouth Technical Report TR2005-532
John Marchesini
Sean Smith
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/393/TR2005-532.pdf (171KB)
Abstract:
While PKI applications differ in how they use keys, all applications share one assumption:
users have keypairs. In previous work, we established that desktop keystores are
not safe places to store private keys, because the TCB is too large. These keystores
are also immobile, difficult to use, and make it impossible for relying parties
to make reasonable trust judgments. Since we would like to use desktops as PKI clients
and cannot realistically expect to redesign the entire desktop, this paper presents
a system that works within the confines of modern desktops to shrink the TCB needed
for PKI applications. Our system (called Secure Hardware Enhanced MyProxy (SHEMP))
shrinks the TCB in space and allows the TCB's size to vary over time and over various
application sensitivity levels, thus making desktops usable for PKI.
Statistical Tools for Digital Image Forensics
Dartmouth Technical Report TR2005-531
Alin C. Popescu
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/392/TR2005-531.pdf (8204KB)
Abstract:
A digitally altered image, often leaving no visual clues of having been tampered
with, can be indistinguishable from an authentic image. The tampering, however,
may disturb some underlying statistical properties of the image. Under this assumption,
we propose five techniques that quantify and detect statistical perturbations found
in different forms of tampered images: (1) re-sampled images (e.g., scaled or rotated);
(2) manipulated color filter array interpolated images; (3) double JPEG compressed
images; (4) images with duplicated regions; and (5) images with inconsistent noise
patterns. These techniques work in the absence of any embedded watermarks or signatures.
For each technique we develop the theoretical foundation, show its effectiveness
on credible forgeries, and analyze its sensitivity and robustness to simple counter-attacks.
High-Throughput Inference of Protein-Protein Interaction Sites from Unassigned NMR Data by Analyzing Arrangements Induced By Quadratic Forms on 3-Manifolds
Dartmouth Technical Report TR2005-530
Ramgopal R. Mettu
Ryan H. Lilien
Bruce R. Donald
Date: January 2005
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/391/TR2005-530.pdf (1492KB)
Abstract:
We cast the problem of identifying protein-protein interfaces, using only unassigned
NMR spectra, into a geometric clustering problem. Identifying protein-protein interfaces
is critical to understanding inter- and intra-cellular communication, and NMR allows
the study of protein interaction in solution. However, it is often the case that
NMR studies of a protein complex are very time-consuming, mainly due to the bottleneck
in assigning the chemical shifts, even if the apo structures of the constituent
proteins are known. We study whether it is possible, in a high-throughput manner,
to identify the interface region of a protein complex using only unassigned chemical
shift and residual dipolar coupling (RDC) data. We introduce a geometric optimization
problem where we must cluster the cells in an arrangement on the boundary of a 3-manifold.
The arrangement is induced by a spherical quadratic form, which in turn is parameterized
by SO(3)xR^2. We show that this formalism derives directly from the physics of RDCs.
We present an optimal algorithm for this problem that runs in O(n^3 log n) time
for an n-residue protein. We then use this clustering algorithm as a subroutine
in a practical algorithm for identifying the interface region of a protein complex
from unassigned NMR data. We present the results of our algorithm on NMR data for
7 proteins from 5 protein complexes and show that our approach is useful for high-throughput
applications in which we seek to rapidly identify the interface region of a protein
complex.
Secure Context-sensitive Authorization
Dartmouth Technical Report TR2004-529
Kazuhiro Minami
David Kotz
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/390/TR2004-529.pdf (343KB)
Abstract:
There is a recent trend toward rule-based authorization systems to achieve flexible
security policies. Also, new sensing technologies in pervasive computing make it
possible to define context-sensitive rules, such as ``allow database access only
to staff who are currently located in the main office.'' However, these rules, or
the facts that are needed to verify authority, often involve sensitive context information.
This paper presents a secure context-sensitive authorization system that protects
confidential information in facts or rules. Furthermore, our system allows multiple
hosts in a distributed environment to perform the evaluation of an authorization
query in a collaborative way; we do not need a universally trusted central host
that maintains all the context information. The core of our approach is to decompose
a proof for making an authorization decision into a set of sub-proofs produced on
multiple different hosts, while preserving the integrity and confidentiality policies
of the mutually untrusted principals operating these hosts. We prove the correctness
of our algorithm.
Discrete-Time Fractional Differentiation from Integer Derivatives
Dartmouth Technical Report TR2004-528
Hany Farid
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/389/TR2004-528.pdf (642KB)
Abstract:
Discrete-time fractional derivative filters (1-D and 2-D) are shown to be well approximated
from a small set of integer derivatives. A fractional derivative of arbitrary order
(and, in 2-D, of arbitrary orientation) can therefore be efficiently computed from
a linear combination of integer derivatives of the underlying signal or image.
Type-Safe Operating System Abstractions
Dartmouth Technical Report TR2004-526
Lea Wittie
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/388/TR2004-526.pdf (1349KB)
Abstract:
Operating systems and low-level applications are usually written in languages like
C and assembly, which provide access to low-level abstractions. These languages
have unsafe type systems that allow many bugs to slip by programmers. For example,
in 1988, the Internet Worm exploited several insecure points in Unix including the
finger command. A call to finger with an unexpected argument caused a buffer overflow,
leading to the shutdown of most Internet traffic. A finger application written in
a type-safe language would have prevented its exploit and limited the points the
Internet Worm could attack. Such vulnerabilities are unacceptable in security-critical
applications such as the secure coprocessors of the Marianas network, secStore key
storage from Plan 9, and self-securing storage.
This research focuses on safe language techniques for building OS components that
cannot cause memory or IO errors. For example, an Ethernet device driver communicates
with its device through IO operations. The device depends on FIFO queues to send
and receive packets. A mistake in an IO operation can overflow or underflow the
FIFO queues, cause memory errors, or cause configuration inconsistencies on the
device. Data structures such as FIFO queues can be written safely in safe languages
such as Java and ML, but these languages do not allow access to the low-level
resources that an OS programmer needs. Therefore, safe OS components require a language
that combines the safety of Java with the low-level control of C.
My research formalizes the concurrency, locks, and system state needed by the safety-critical
areas of a device driver. These formal concepts are built on top of an abstract
syntax and rules that guarantee basic memory safety using linear and singleton
types to implement safe memory load and store operations. I proved that the improved
abstract machine retains the property of soundness, which means that all well-typed
programs will be able to execute until they reach an approved end-state. Together,
the concurrency, locks, and state provide safety for IO operations and data structures.
Using the OSKit from the University of Utah as a starting point, I developed a
small operating system. I ported the 3c509 Ethernet device driver from C to Clay,
a C-like type-safe language that uses a type system powerful enough to enforce invariants
about low-level devices and data structures. The resulting driver works safely in
a multi-threaded environment. It is guaranteed to obtain locks before using shared
data. It cannot cause a FIFO queue to overflow or underflow and it will only call
IO operations when invariants are satisfied.
This type-safe driver demonstrates an actual working application of the theoretical
components of my research. The abstract machine is powerful enough to encode a given
OS specification and enforce a provably matching implementation. These results lead
towards fundamentally secure computing environments.
Secure Hardware Enhanced MyProxy
Dartmouth Technical Report TR2004-525
John Marchesini
Sean W. Smith
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/387/TR2004-525.pdf (202KB)
Abstract:
In 1976, Whitfield Diffie and Martin Hellman demonstrated how "New Directions In
Cryptography" could enable secure information exchange between parties that do not
share secrets. In order for public key cryptography to work in modern distributed
environments, we need an infrastructure for finding and trusting other parties'
public keys (i.e., a PKI). A number of useful applications become possible with
PKI. While the applications differ in how they use keys (e.g., S/MIME uses the key
for message encryption and signing, while client-side SSL uses the key for authentication),
all applications share one assumption: users have keypairs.
In previous work, we examined the security aspects of some of the standard keystores
and their interaction with the OS. We concluded that desktops are not safe places
to store private keys, and we demonstrated the permeability of keystores such as
the default Microsoft keystore and the Mozilla keystore. In addition to being unsafe,
these desktop keystores have the added disadvantage of being immobile.
In other previous work, we examined trusted computing. In industry, a new trusted
computing initiative has emerged: the Trusted Computing Platform Alliance (TCPA)
(now renamed the Trusted Computing Group (TCG)). The goal of the TCG design is lower-assurance
security that protects an entire desktop platform and is cheap enough to be commercially
feasible. Last year, we built a trusted computing platform based on the TCG specifications
and hardware.
The picture painted by these previous projects suggests that common desktops are
not secure enough for use as PKI clients, and trusted computing can improve the
security of client machines. The question that I propose to investigate is: "Can
I build a system which applies trusted computing hardware in a reasonable manner
in order to make desktops usable for PKI?" My design begins with the Grid community's
"MyProxy" credential repository, and enhances it to take advantage of secure hardware
on the clients, at the repository, and in the policy framework. The result is called
"Secure Hardware Enhanced MyProxy".
A Survey of WPA and 802.11i RSN Authentication Protocols
Dartmouth Technical Report TR2004-524
Kwang-Hyun Baek
Sean W. Smith
David Kotz
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/386/TR2004-524.pdf (410KB)
Abstract:
In the new standards for WLAN security, many choices exist for the authentication
process. In this paper, we list eight desired properties of WLAN authentication
protocols, survey eight recent authentication protocols, and analyze the protocols
according to the desired properties.
Efficient Wait-Free Implementation of Multiword LL/SC Variables
Dartmouth Technical Report TR2004-523
Prasad Jayanti
Srdjan Petrovic
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/385/TR2004-523.pdf (195KB)
Abstract:
Since the design of lock-free data structures often poses a formidable intellectual
challenge, researchers are constantly in search of abstractions and primitives that
simplify this design. The multiword LL/SC object is such a primitive: many existing
algorithms are based on this primitive, including the nonblocking and wait-free
universal constructions of Anderson and Moir (1995), the closed objects construction
of Chandra et al. (1998) and the snapshot algorithms of Jayanti (2002, 2004).
In this paper, we consider the problem of implementing a W-word LL/SC object shared
by N processes. The previous best algorithm, due to Anderson and Moir (1995), is
time optimal (LL and SC operations run in O(W) time), but has a space complexity
of O(N^2W). We present an algorithm that uses novel buffer management ideas to cut
down the space complexity by a factor of N to O(NW), while still being time optimal.
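For readers unfamiliar with the primitive in the abstract above, the sketch below gives a sequential specification of a W-word LL/SC object shared by N processes. It captures only the intended semantics: an SC by process p succeeds exactly when no successful SC has occurred since p's most recent LL. It is not the wait-free algorithm of the paper, and the class and method names are illustrative assumptions.

```python
class MultiwordLLSC:
    """Sequential specification of a W-word LL/SC object (not a concurrent implementation)."""

    def __init__(self, num_processes, initial_words):
        self.value = tuple(initial_words)      # the W words currently stored
        self.valid = [False] * num_processes   # valid[p]: no successful SC since p's last LL

    def ll(self, p):
        """Load-linked: return all W words and link process p to the current value."""
        self.valid[p] = True
        return self.value

    def sc(self, p, new_words):
        """Store-conditional: write new_words only if p's link is still valid."""
        if not self.valid[p]:
            return False
        self.value = tuple(new_words)
        self.valid = [False] * len(self.valid)  # a successful SC breaks every link
        return True


obj = MultiwordLLSC(num_processes=2, initial_words=[0, 0])
obj.ll(0)
obj.ll(1)
first = obj.sc(0, [1, 2])    # succeeds: nothing was written since process 0's LL
second = obj.sc(1, [9, 9])   # fails: process 0's SC intervened
```

The difficulty the paper addresses is providing these semantics wait-free from single-word hardware primitives while keeping the space to O(NW) rather than O(N^2W).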
Composing a Well-Typed Region
Dartmouth Technical Report TR2004-521
Chris Hawblitzel
Heng Huang
Lea Wittie
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/384/TR2004-521.pdf (477KB)
Abstract:
Efficient low-level systems need more control over memory than safe high-level languages
usually provide. In particular, safe languages usually prohibit explicit deallocation,
in order to prevent dangling pointers. Regions provide one safe deallocation mechanism;
indeed, many region calculi have appeared recently, each with its own set of operations
and often complex rules. This paper encodes regions from lower-level typed primitives
(linear memory, coercions, and delayed types), so that programmers can design their
own region operations and rules.
Mercer Kernels for Object Recognition with Local Features
Dartmouth Technical Report TR2004-520
Siwei Lyu
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/383/TR2004-520.pdf (210KB)
Abstract:
In this paper, we propose a new class of kernels for object recognition based on
local image feature representations. Formal proofs are given to show that these
kernels satisfy the Mercer condition and reflect similarities between sets of local
features. In addition, multiple types of local features and semilocal constraints
are incorporated to reduce mismatches between local features, thus further improving
the classification performance. Experimental results of SVM classifiers coupled
with the proposed kernels are reported on recognition tasks with the standard COIL-100
database and compared with existing methods. The proposed kernels achieved satisfactory
performance and were robust to changes in object configurations and image degradations.
Heterogeneous Self-Reconfiguring Robotics
Dartmouth Technical Report TR2004-519
Robert C. Fitch
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/382/TR2004-519.pdf (11757KB)
Abstract:
Self-reconfiguring (SR) robots are modular systems that can autonomously change shape,
or reconfigure, for increased versatility and adaptability in unknown environments.
In this thesis, we investigate planning and control for systems of non-identical
modules, known as heterogeneous SR robots. Although previous approaches rely on
module homogeneity as a critical property, we show that the planning complexity
of fundamental algorithmic problems in the heterogeneous case is equivalent to that
of systems with identical modules. Primarily, we study the problem of how to plan
shape changes while considering the placement of specific modules within the structure.
We characterize this key challenge in terms of the amount of free space available
to the robot and develop a series of decentralized reconfiguration planning algorithms
that assume progressively more severe free space constraints and support reconfiguration
among obstacles. In addition, we compose our basic planning techniques in different
ways to address problems in the related task domains of positioning modules according
to function, locomotion among obstacles, self-repair, and recognizing the achievement
of distributed goal-states. We also describe the design of a novel simulation environment,
implementation results using this simulator, and experimental results in hardware
using a planar SR system called the Crystal Robot. These results encourage development
of heterogeneous systems. Our algorithms enhance the versatility and adaptability
of SR robots by enabling them to use functionally specialized components to match
capability, in addition to shape, to the task at hand.
Creating and Detecting Doctored and Virtual Images: Implications to The Child Pornography Prevention Act
Dartmouth Technical Report TR2004-518
Hany Farid
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/381/TR2004-518.pdf (4612KB)
Abstract:
The 1996 Child Pornography Prevention Act (CPPA) extended the existing federal criminal
laws against child pornography to include certain types of "virtual porn". In 2002,
the United States Supreme Court found that portions of the CPPA, being overly broad
and restrictive, violated First Amendment rights. The Court ruled that images containing
an actual minor or portions of a minor are not protected, while computer generated
images depicting a fictitious "computer generated" minor are constitutionally protected.
In this report I outline various forms of digital tampering, placing them in the
context of this recent ruling. I also review computational techniques for detecting
doctored and virtual (computer generated) images.
Parallel Out-of-Core Sorting: The Third Way
Dartmouth Technical Report TR2004-517
Geeta Chaudhry
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/380/TR2004-517.pdf (619KB)
Abstract:
Sorting very large datasets is a key subroutine in almost any application that is
built on top of a large database. Two ways to sort out-of-core data dominate the
literature: merging-based algorithms and partitioning-based algorithms. Within these
two paradigms, all the programs that sort out-of-core data on a cluster rely on
assumptions about the input distribution.
We propose a third way of out-of-core sorting: oblivious algorithms. In all, we
have developed six programs that sort out-of-core data on a cluster. The first three
programs, based completely on Leighton's columnsort algorithm, have a restriction
on the maximum problem size that they can sort. The other three programs relax this
restriction; two are based on our original algorithmic extensions to columnsort.
We present experimental results to show that our algorithms perform well. To the
best of our knowledge, the programs presented in this thesis are the first to sort
out-of-core data on a cluster without making any simplifying assumptions about the
distribution of the data to be sorted.
Exposing Digital Forgeries by Detecting Duplicated Image Regions
Dartmouth Technical Report TR2004-515
Alin C. Popescu
Hany Farid
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/379/TR2004-515.pdf (5887KB)
Abstract:
We describe an efficient technique that automatically detects duplicated regions
in a digital image. This technique works by first applying a principal component
analysis to small fixed-size image blocks to yield a reduced dimension representation.
This representation is robust to minor variations in the image due to additive noise
or lossy compression. Duplicated regions are then detected by lexicographically
sorting all of the image blocks. We show the efficacy of this technique on credible
forgeries, and quantify its robustness and sensitivity to additive noise and lossy
JPEG compression.
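The sort-and-compare idea in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' method: it substitutes coarse quantization of raw pixel values for the principal component analysis step, and the function name, the parameters (block, quant, min_pairs), and the offset-counting rule are hypothetical choices made here.

```python
from collections import Counter


def find_duplicated_offsets(image, block=4, quant=4, min_pairs=5):
    """Return {(dy, dx): count} for offsets shared by many matching blocks.

    image is a list of rows of integer pixel values. Quantization stands in
    for the PCA-based reduced representation described in the abstract.
    """
    height, width = len(image), len(image[0])
    entries = []
    for y in range(height - block + 1):
        for x in range(width - block + 1):
            # Signature: the block's pixels, coarsely quantized.
            sig = tuple(image[y + dy][x + dx] // quant
                        for dy in range(block) for dx in range(block))
            entries.append((sig, (y, x)))

    # Lexicographic sort brings identical signatures next to each other.
    entries.sort()

    offsets = Counter()
    for (sig_a, (ya, xa)), (sig_b, (yb, xb)) in zip(entries, entries[1:]):
        if sig_a == sig_b:
            offsets[(yb - ya, xb - xa)] += 1

    # A copied region yields many block pairs with the same offset.
    return {off: n for off, n in offsets.items() if n >= min_pairs}
```

Requiring several block pairs to agree on one offset is a simple way to suppress isolated coincidental matches; the threshold used here is arbitrary.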
Solar: Building A Context Fusion Network for Pervasive Computing
Dartmouth Technical Report TR2004-514
Guanling Chen
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/378/TR2004-514.pdf (2534KB)
Abstract:
The complexity of developing context-aware pervasive-computing applications calls
for distributed software infrastructures that assist applications to collect, aggregate,
and disseminate contextual data. In this dissertation, we present a Context Fusion
Network (CFN), called Solar, which is built with a scalable and self-organized service
overlay. Solar is flexible and allows applications to select distributed data sources
and compose them with customized data-fusion operators into a directed acyclic information
flow graph. Such a graph represents how an application computes high-level understandings
of its execution context from low-level sensory data. To manage application-specified
operators on a set of overlay nodes called Planets, Solar provides several unique
services such as application-level multicast with policy-driven data reduction to
handle buffer overflow, context-sensitive resource discovery to handle environment
dynamics, and proactive monitoring and recovery to handle common failures. Experimental
results show that these services perform well on a typical DHT-based peer-to-peer
routing substrate. In this dissertation, we also discuss experience, insights, and
lessons learned from our quantitative analysis of the input sensors, a detailed
case study of a Solar application, and development of other applications in different
domains.
Performance Evaluation of a Resource Discovery Service
Dartmouth Technical Report TR2004-513
Jue Wang
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/377/TR2004-513.pdf (971KB)
Abstract:
In a pervasive computing environment, the number and variety of resources (services,
devices, and contextual information resources) make it necessary for applications
to accurately discover the best ones quickly. Thus a resource-discovery service,
which locates specific resources and establishes network connections as better resources
become available, is necessary for those applications. The performance of the resource-discovery
service is important when the applications are in a dynamic and mobile environment.
In this thesis, however, we do not focus on the resource-discovery technology itself,
but the evaluation of the scalability and mobility of the resource discovery module
in Solar, a context fusion middleware. Solar has a naming service that provides
resource discovery, since the resource names encode static and dynamic attributes.
The results of our experiments show that Solar's resource discovery performed generally
well in a typical dynamic environment, although Solar cannot be scaled as well
as it should. We identify the implementation issues related to that problem.
We also discuss experience, insights, and lessons learned from our quantitative
analysis of the experiment results.
Testing the Greenpass Wireless Security System
Dartmouth Technical Report TR2004-512
Kimberly S. Powell
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/376/TR2004-512.pdf (228KB)
Abstract:
Greenpass, developed by Nick Goffee, Sung Hoon Kim, Meiyuan Zhao and John Marchesini
under the supervision of Sean Smith and Punch Taylor, is a wireless security solution
that implements SPKI/SDSI delegation on top of X.509 keypairs within the EAP-TLS
authentication protocol. This system aims to model the decentralized way that authorization
flows in real-world enterprise settings and provide a seamless solution that allows
for easy access to all resources in the network by both registered users and authorized
guests. These goals are achieved through the deployment of a delegation tool, which
allows an active entity associated to the organization's network to grant authorization
to another entity previously unauthorized to use the network.
This paper describes the testing process of the first prototype for this system.
It examines trust and usability issues of the Greenpass Wireless Security System
and determines the accuracy of the system's implementation in relation to its objectives.
It then addresses the planning and execution of a small-scale demo for this prototype
based on the examined issues and makes projections for further tests on a larger
scale.
Outdoor Experimental Comparison of Four Ad Hoc Routing Algorithms
Dartmouth Technical Report TR2004-511
Robert S. Gray
David Kotz
Calvin Newport
Nikita Dubrovsky
Aaron Fiske
Jason Liu
Christopher Masone
Susan McGrath
Yougu Yuan
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/375/TR2004-511.pdf (219KB)
Abstract:
Most comparisons of wireless ad hoc routing algorithms involve simulated or indoor
trial runs, or outdoor runs with only a small number of nodes, potentially leading
to an incorrect picture of algorithm performance. In this paper, we report on the
results of an outdoor trial run of four different routing algorithms, APRL, AODV,
GPSR, and STARA, running on top of thirty-three 802.11-enabled laptops moving randomly
through an athletic field. The laptops generated random traffic according to the
traffic patterns observed in a prototype application, and ran each routing algorithm
for a fifteen-minute period over the course of the hour-long trial run. The 33-laptop
experiment represents one of the largest outdoor tests of wireless routing algorithms,
and three of the algorithms each come from a different algorithmic class, providing
insight into the behavior of ad hoc routing algorithms at larger real-world scales
than have been considered so far. In addition, we compare the outdoor results with
both indoor ("tabletop") and simulation results for the same algorithms, examining
the differences between the indoor results and the outdoor reality. The paper also
describes the software infrastructure that allowed us to implement the ad hoc routing
algorithms in a comparable way, and use the same codebase for indoor, outdoor, and
simulated trial runs.
Greenpass RADIUS Tools for Delegated Authorization in Wireless Networks
Dartmouth Technical Report TR2004-510
Sung Hoon Kim
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/374/TR2004-510.pdf (7922KB)
Abstract:
Dartmouth's Greenpass project extends how public key cryptography can be used to
secure the wireless LAN with a RADIUS (Remote Authentication Dial In User Service)
server that is responsible for handling authentication requests from clients (called
supplicants in the 802.1x authentication model). This thesis describes the design
and implementation of the authentication process of Greenpass, specifically what
decisions are made in determining who is granted access and how a small modification
of already existing protocols can be used to provide guest access in a way that
better reflects how delegation of authority works in the real world.
Greenpass takes advantage of the existing PKI to authenticate local Dartmouth users
via X.509 identity certificates using EAP-TLS. We use the flexibility of SPKI/SDSI
(Simple Public Key Infrastructure/Simple Distributed Security Infrastructure) authorization
certificates to distribute the responsibility of delegating access to guests to
certain authorized delegators, avoiding some of the necessary steps and paperwork
associated with having a large centralized entity responsible for the entire institution.
This thesis also discusses how our solution can be adapted to support different
methods of guest delegation and investigates the possibility of eliminating the
cumbersome central entity and administrative overhead traditionally associated with
public key cryptography.
Greenpass Client Tools for Delegated Authorization in Wireless Networks
Dartmouth Technical Report TR2004-509
Nicholas C. Goffee
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/373/TR2004-509.pdf (798KB)
Abstract:
Dartmouth's Greenpass project seeks to provide strong access control to a wireless
network while simultaneously providing flexible guest access; to do so, it augments
the Wi-Fi Alliance's existing WPA standard, which offers sufficiently strong user
authentication and access control, with authorization based on SPKI certificates.
SPKI allows certain local users to delegate network access to guests by issuing
certificates that state, in essence, "he should get access because I said it's okay."
The Greenpass RADIUS server described in Kim's thesis [55] performs an authorization
check based on such statements so that guests can obtain network access without
requiring a busy network administrator to set up new accounts in a centralized database.
To our knowledge, Greenpass is the first working delegation-based solution to Wi-Fi
access control.
My thesis describes the Greenpass client tools, which allow a guest to introduce
himself to a delegator and allow the delegator to issue a new SPKI certificate to
the guest. The guest does not need custom client software to introduce himself or
to connect to the Wi-Fi network. The guest and delegator communicate using a set
of Web applications. The guest obtains a temporary key pair and X.509 certificate
if needed, then sends his public key value to a Web server we provide. The delegator
looks up her guest's public key and runs a Java applet that lets her verify her
guest's identity using visual hashing and issue a new SPKI certificate to him. The
guest's new certificate chain is stored as an HTTP cookie to enable him to "push"
it to an authorization server at a later time. I also describe how Greenpass can
be extended to control access to a virtual private network (VPN) and suggest several
interesting future research and development directions that could build on this
work.
PPL: a Packet Processing Language
Dartmouth Technical Report TR2004-508
Eric G. Krupski
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/372/TR2004-508.pdf (146KB)
Abstract:
Any computing device or system that uses the internet needs to analyze and identify
the contents of network packets. Code that does this is often written in C, but
reading, identifying, and manipulating network packets in C requires writing tricky
and tedious code. Previous work has offered specification languages for describing
the format of network packets, which would allow packet type identification without
the hassles of doing this task in C. For example, McCann and Chandra's Packet Types
[3] system allows the programmer to define arbitrary packet types and generates
C functions which match given data against a specified packet type. This paper will
present a packet processing language named PPL, which extends McCann and Chandra's
Packet Types to allow the programmer to not only describe arbitrary packet types,
but also to control when and how a matching is attempted, with ML-style pattern
matching. PPL is intended for multiple applications, such as intrusion detection
systems, quick prototypes of new protocols, and IP de-multiplexing code.
Experimental evaluation of wireless simulation assumptions
Dartmouth Technical Report TR2004-507
David Kotz
Calvin Newport
Robert S. Gray
Jason Liu
Yougu Yuan
Chip Elliott
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/371/TR2004-507.pdf (549KB)
Abstract:
All analytical and simulation research on ad hoc wireless networks must necessarily
model radio propagation using simplifying assumptions. Although it is tempting to
assume that all radios have circular range, have perfect coverage in that range,
and travel on a two-dimensional plane, most researchers are increasingly aware of
the need to represent more realistic features, including hills, obstacles, link
asymmetries, and unpredictable fading. Although many have noted the complexity of
real radio propagation, and some have quantified the effect of overly simple assumptions
on the simulation of ad hoc network protocols, we provide a comprehensive review
of six assumptions that are still part of many ad hoc network simulation studies.
In particular, we use an extensive set of measurements from a large outdoor routing
experiment to demonstrate the weakness of these assumptions, and show how these
assumptions cause simulation results to differ significantly from experimental results.
We close with a series of recommendations for researchers, whether they develop
protocols, analytic models, or simulators for ad hoc wireless networks.
Access Control in a Distributed Decentralized Network: An XML Approach to Network Security using XACML and SAML
Dartmouth Technical Report TR2004-506
Paul J. Mazzuca
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/370/TR2004-506.pdf (666KB)
Abstract:
The development of eXtensible Distributed Access Control (XDAC) systems is influenced
by the transference of data access and storage from the local computer to the network.
In this distributed system, access control is determined by independent components
which transmit requests and decisions over a network, utilizing XML signing capabilities
found in the Security Assertion Markup Language (SAML). All resources in the XDAC
system are protected by the first component, a Policy Enforcement Point (PEP), which
acts as the main divider between the requesting entity and the requested resource.
The PEP grants access to a resource only if the second component, a Policy Decision
Point (PDP), returns a permit response after consulting a set of applicable policies
based on the requester's attributes, the resource, the action that the requester
desires to apply to that resource, and optionally the environment.
With Sun's eXtensible Access Control Markup Language (XACML), the XML encoded policies
can be combined among multiple nodes across a network using XACML rules and algorithms
to formulate a single decision based on an XACML request. In this thesis project,
I build a secure and efficient XDAC System based on XACML, implement an extension
to the SAML Assertion design by including XACML Attributes and Results, describe
in detail the many features that an XDAC System should embody, and show how
an XDAC System would be effectively used in modern-day computing.
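The division of labor between the two components described above can be sketched as follows. This is a toy, in-process illustration and not the thesis's XDAC implementation: it omits SAML, XML signing, and networking entirely. The class names, the example policies, the attribute names ("role", "hours"), the resource name "grades.xml", and the choice of a deny-overrides combining rule are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

PERMIT, DENY, NOT_APPLICABLE = "Permit", "Deny", "NotApplicable"


@dataclass
class Request:
    subject: Dict[str, str]            # requester's attributes
    resource: str                      # what is being accessed
    action: str                        # what the requester wants to do
    environment: Dict[str, str] = field(default_factory=dict)  # optional


Policy = Callable[[Request], str]


class PolicyDecisionPoint:
    """Consults every policy and combines the results (deny-overrides)."""

    def __init__(self, policies: List[Policy]):
        self.policies = policies

    def decide(self, request: Request) -> str:
        decisions = [policy(request) for policy in self.policies]
        if DENY in decisions:
            return DENY
        if PERMIT in decisions:
            return PERMIT
        return NOT_APPLICABLE


class PolicyEnforcementPoint:
    """Stands between the requester and the resources it protects."""

    def __init__(self, pdp: PolicyDecisionPoint, resources: Dict[str, str]):
        self.pdp = pdp
        self.resources = resources

    def access(self, request: Request) -> str:
        # Release the resource only on an explicit permit.
        if self.pdp.decide(request) != PERMIT:
            raise PermissionError(f"access to {request.resource} refused")
        return self.resources[request.resource]


# Hypothetical example policies.
def faculty_may_read_grades(req: Request) -> str:
    if req.resource == "grades.xml" and req.action == "read":
        if req.subject.get("role") == "faculty":
            return PERMIT
    return NOT_APPLICABLE


def nothing_while_closed(req: Request) -> str:
    return DENY if req.environment.get("hours") == "closed" else NOT_APPLICABLE
```

Treating anything other than an explicit permit as a refusal keeps the enforcement point fail-closed, which matches the abstract's statement that access is granted only if the PDP returns a permit response.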
Technological Implications for Privacy
Dartmouth Technical Report TR2004-505
David Kotz
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/369/TR2004-505.pdf (132KB)
Abstract:
The World-Wide Web is increasingly used for commerce and access to personal information
stored in databases. Although the Web is "just another medium" for information
exchange, the fact that all the information is stored in computers, and all of the
activity happens in computers and computer networks, makes it easier (cheaper) than
ever to track users' activities. By recording and analyzing users' activities in
the Web, activities that may seem to be quite private to many users, it is more
likely than ever before that a person's privacy may be threatened. In this paper
I examine some of the technology in the Web, and how it affects the privacy of Web
users. I also briefly summarize some of the efforts to regulate privacy on the Internet.
Simulating mobile ad hoc networks: a quantitative evaluation of common MANET simulation models
Dartmouth Technical Report TR2004-504
Calvin Newport
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/368/TR2004-504.pdf (731KB)
Abstract:
Because it is difficult and costly to conduct real-world mobile ad hoc network experiments,
researchers commonly rely on computer simulation to evaluate their routing protocols.
However, simulation is far from perfect. A growing number of studies indicate that
simulated results can be dramatically affected by several sensitive simulation parameters.
It is also commonly noted that most simulation models make simplifying assumptions
about radio behavior. This situation casts doubt on the reliability and applicability
of many ad hoc network simulation results.
In this study, we begin with a large outdoor routing experiment testing the performance
of four popular ad hoc algorithms (AODV, APRL, ODMRP, and STARA). We present a detailed
comparative analysis of these four implementations. Then, using the outdoor results
as a baseline of reality, we disprove a set of common assumptions used in simulation
design, and quantify the impact of these assumptions on simulated results. We also
more specifically validate a group of popular radio models with our real-world data,
and explore the sensitivity of various simulation parameters in predicting accurate
results. We close with a series of specific recommendations for simulation and ad
hoc routing protocol designers.
Enhancing Expressiveness of Speech through Animated Avatars for Instant Messaging and Mobile Phones
Dartmouth Technical Report TR2004-503
Joseph E. Pechter
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/367/TR2004-503.pdf (562KB)
Abstract:
This thesis aims to create a chat program that allows users to communicate via an
animated avatar that provides believable lip-synchronization and expressive emotion.
Currently many avatars do not attempt to do lip-synchronization. Those that do are
not well synchronized and have little or no emotional expression. Most avatars with
lip synch use realistic looking 3D models or stylized rendering of complex models.
This work utilizes images rendered in a cartoon style and lip-synchronization rules
based on traditional animation. The cartoon style, as opposed to a more realistic
look, makes the mouth motion more believable and the characters more appealing.
The cartoon look and image-based animation (as opposed to a graphic model animated
through manipulation of a skeleton or wireframe) also allows for fewer key frames
resulting in faster speed with more room for expressiveness. When text is entered
into the program, the Festival Text-to-Speech engine creates a speech file and extracts
phoneme and phoneme duration data. Believable and fluid lip-synchronization is then
achieved by means of a number of phoneme-to-image rules. Alternatively, phoneme
and phoneme duration data can be obtained for speech dictated into a microphone
using Microsoft SAPI and the CSLU Toolkit. Once lip synchronization has been completed,
rules for non-verbal animation are added. Emotions are appended to the animation
of speech in two ways: automatically, by recognition of key words and punctuation,
or deliberately, by user-defined tags. Additionally, rules are defined for idle-time
animation. Preliminary results indicate that the animated avatar program offers
an improvement over currently available software. It aids in the understandability
of speech, combines easily recognizable and expressive emotions with speech, and
successfully enhances overall enjoyment of the chat experience. Applications for
the program include use in cell phones for the deaf or hearing impaired, instant
messaging, video conferencing, instructional software, and speech and animation
synthesis.
A Secure Network Node Approach to the Policy Decision Point in Distributed Access Control
Dartmouth Technical Report TR2004-502
Geoffrey H. Stowe
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/366/TR2004-502.pdf (340KB)
Abstract:
To date, the vast majority of access control research and development has been on
gathering, managing, and exchanging information about users. But an equally important
component which has yet to be fully developed is the Policy Decision Point - the
system that decides whether an access request should be granted given certain attributes
of the requestor. This paper describes the research and implementation of a new
PDP system for an undergraduate honors project. This PDP system employs three unique
features which differentiate it from existing technology: collaboration capabilities,
trusted management, and interoperability with other access control systems. Security
considerations and future research areas are also discussed.
Synchronizing Keyframe Facial Animation to Multiple Text-to-Speech Engines and Natural Voice with Fast Response Time
Dartmouth Technical Report TR2004-501
William Pechter
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/365/TR2004-501.pdf (2265KB)
Abstract:
This thesis aims to create an automated lip-synchronization system for real-time
applications. Specifically, the system is required to be fast, consist of a limited
number of keyframes with small memory requirements, and create fluid and believable
animations that synchronize with text-to-speech engines as well as raw voice data.
The algorithms utilize traditional keyframe animation and a novel method of keyframe
selection. Additionally, phoneme-to-keyframe mapping, synchronization, and simple
blending rules are employed. The algorithms provide blending between keyframe images,
borrow information from neighboring phonemes, accentuate phonemes b, p and m, differentiate
between keyframes for phonemes with allophonic variations, and provide prosodromic
variation by including emotion while speaking. The lip-sync animation synchronizes
with multiple synthesized voices and human speech. A fast and versatile online real-time
Java chat interface is created to exhibit vivid facial animation.
Results show that the animation algorithms are fast and show accurate lip-synchronization.
Additionally, surveys showed that the animations are visually pleasing and improve
speech understandability 96% of the time. Applications for this project include
internet chat capabilities, interactive teaching of foreign languages, animated
news broadcasting, enhanced game technology, and cell phone messaging.
Scheduling Pipelined, Multi-Threaded Programs in Linux
Dartmouth Technical Report TR2004-500
Brunn W. Roysden
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/364/TR2004-500.pdf (67KB)
Abstract:
A process causes latency when it performs I/O or communication. Pipelined processes
mitigate latency by concurrently executing multiple threads (sequences of operations)
and overlapping computation, communication, and I/O. When more than one thread is
ready to run, the scheduler determines which thread in fact runs. This paper presents
techniques for scheduling pipelines, with the following three findings.
First, using Linux kernel version 2.6 and the NPTL threads package, we observe a
3-6% performance improvement over kernel version 2.4 and the LinuxThreads package.
Second, we test techniques that both take advantage of prior knowledge about whether
a program is I/O-bound or compute-bound and raise and lower priorities before the
pipeline begins working. These techniques, referred to as fixed scheduling, further
improve performance by 5% in the case of the compute-bound columnsort algorithm.
In the I/O-bound algorithm, fixed scheduling failed to yield better performance
than the default scheduling.
Third, we test simple, adjusting methods that do not take advantage of prior knowledge
about whether a program is compute-bound or I/O-bound but rather adjust scheduling
as the pipeline progresses. These techniques, called adjusting scheduling, fail
to yield better performance than the default scheduling in any of our test cases.
In addition, we suggest new scheduling calls and other operating-system improvements.
Mobile Agents Simulation with DaSSF
Dartmouth Technical Report TR2004-499
Nikita E Dubrovsky
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/363/TR2004-499.pdf (353KB)
Abstract:
Mobile agents are programs that can migrate from machine to machine in a network
of computers and have complete control over their movement. Since the performance
space of mobile agents has not been characterized fully, assessing the effectiveness
of using mobile agents over a traditional client/server approach currently requires
implementing an agent system and running time-consuming experiments.
This report presents a simple mobile-agent simulation that can provide quick information
on the performance and scalability of a generic information retrieval (IR) mobile-agent
system under different network configurations. The simulation is built using the
DaSSF and DaSSFNet frameworks, resulting in high performance and great configuration
flexibility. This report also implements a real D'Agents mobile-agent IR system,
measuring the performance of the system. A comparison of these real-world performance
results and those given by the simulation suggests that the simulation has good accuracy
in predicting the scalability of a mobile-agent system. Thus this report argues
that simulation provides a good way to quickly assess the performance and scalability
of an IR mobile-agent system under different network configurations.
Efficient Wait-Free Implementation of Atomic Multi-Word Buffer
Dartmouth Technical Report TR2004-498
Rachel B. Ringel
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/362/TR2004-498.pdf (149KB)
Abstract:
This thesis proposes algorithms for implementing an atomic multi-word buffer, which
can be accessed concurrently by multiple readers and a single writer, from the hardware-supported
shared memory. The algorithms are required to be wait-free: each process reads or
writes the multi-word buffer in a bounded number of its own steps, regardless of
whether other processes are fast, slow or have crashed. Our first algorithm is
built from multi-writer, multi-reader variables whereas the second algorithm is
built from single-writer, multi-reader variables. For either algorithm, the worst-case
running time of a read or a write operation on the m-word buffer is O(m). The space
complexity of the algorithms is O(mn). Neither algorithm requires hardware support
for any special synchronization instructions; the ability to read or write into
any machine word is sufficient. The algorithms significantly improve on Peterson's
algorithm, which has O(mn) time complexity for the write operation on the buffer.
A Holesome File System
Dartmouth Technical Report TR2004-497
Darren Erik Vengroff
David Kotz
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/361/TR2004-497.pdf (242KB)
Abstract:
We present a novel approach to fully dynamic management of physical disk blocks in
Unix file systems. By adding a single system call, zero, to an existing file system,
we permit applications to create holes, that is, regions of files to which no physical
disk blocks are allocated, far more flexibly than previously possible. zero can
create holes in the middle of existing files.
Using zero, it is possible to efficiently implement applications including a variety
of databases and I/O-efficient computation systems on top of the Unix file system.
zero can also be used to implement an efficient file-system-based paging mechanism.
In some I/O-efficient computations, the availability of zero effectively doubles
disk capacity by allowing blocks of temporary files to be reallocated to new files
as they are read.
Experiments on a Linux ext2 file system augmented by zero demonstrate that where
their functionality overlaps, zero is more efficient than ftruncate(). Additional
experiments reveal that in exchange for added effective disk capacity, I/O-efficient
code pays only a small performance penalty.
The Changing Usage of a Mature Campus-wide Wireless Network
Dartmouth Technical Report TR2004-496
Tristan Henderson
David Kotz
Ilya Abyzov
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/360/TR2004-496.pdf (535KB)
Abstract:
Wireless Local Area Networks (WLANs) are now common on academic and corporate campuses.
As "Wi-Fi" technology becomes ubiquitous, it is increasingly important to understand
trends in the usage of these networks.
This paper analyzes an extensive network trace from a mature 802.11 WLAN, including
more than 550 access points and 7000 users over seventeen weeks. We employ several
measurement techniques, including syslogs, telephone records, SNMP polling and tcpdump
packet sniffing. This is the largest WLAN study to date, and the first to look at
a large, mature WLAN and consider geographic mobility. We compare this trace to
a trace taken after the network's initial deployment two years ago.
We found that the applications used on the WLAN changed dramatically. Initial WLAN
usage was dominated by Web traffic; our new trace shows significant increases in
peer-to-peer, streaming multimedia, and voice over IP (VoIP) traffic. On-campus
traffic now exceeds off-campus traffic, a reversal of the situation at the WLAN's
initial deployment. Our study indicates that VoIP has been used little on the wireless
network thus far, and most VoIP calls are made on the wired network. Most calls
last less than a minute.
We saw more heterogeneity in the types of clients used, with more embedded wireless
devices such as PDAs and mobile VoIP clients. We define a new metric for mobility,
the "session diameter." We use this metric to show that embedded devices have
different mobility characteristics than laptops, and travel further and roam to
more access points. Overall, users were surprisingly non-mobile, with half remaining
close to home about 98% of the time.
Dependency management in distributed settings
Dartmouth Technical Report TR2004-495
Guanling Chen
David Kotz
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/359/TR2004-495.pdf (100KB)
Abstract:
Ubiquitous-computing environments are heterogeneous and volatile in nature. Systems
that support ubicomp applications must be self-managed, to reduce human intervention.
In this paper, we present a general service that helps distributed software components
to manage their dependencies. Our service proactively monitors the liveness of components
and recovers them according to supplied policies. Our service also tracks the state
of components, on behalf of their dependents, and may automatically select components
for the dependent to use based on evaluations of customized functions. We believe
that our approach is flexible and abstracts away many of the complexities encountered
in ubicomp environments. In particular, we show how we applied the service to manage
dependencies of context-fusion operators and present some experimental results.
An Improved Nuclear Vector Replacement Algorithm for Nuclear Magnetic Resonance Assignment
Dartmouth Technical Report TR2004-494
Christopher J. Langmead
Bruce R. Donald
Date: January 2003
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/358/TR2004-494.pdf (166KB)
Abstract:
We report an improvement to the Nuclear Vector Replacement (NVR) algorithm for high-throughput
Nuclear Magnetic Resonance (NMR) resonance assignment. The new algorithm improves
upon our earlier result in terms of accuracy and computational complexity. In particular,
the new NVR algorithm assigns backbone resonances without error (100% accuracy)
on the same test suite examined in [Langmead and Donald J. Biomol. NMR 2004], and
runs in $O(n^{5/2} \log(cn))$ time where $n$ is the number of amino acids in the
primary sequence of the protein, and $c$ is the maximum edge weight in an integer-weighted
bipartite graph.
The Kerf toolkit for intrusion analysis
Dartmouth Technical Report TR2004-493
Javed Aslam
Sergey Bratus
David Kotz
Ron Peterson
Daniela Rus
Brett Tofel
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/357/TR2004-493.pdf (4257KB)
Abstract:
We consider the problem of intrusion analysis and present the Kerf Toolkit, whose
purpose is to provide an efficient and flexible infrastructure for the analysis
of attacks. The Kerf Toolkit includes a mechanism for securely recording host and
network logging information for a network of workstations, a domain-specific language
for querying this stored data, and an interface for viewing the results of such
a query, providing feedback on these results, and generating new queries in an iterative
fashion. We describe the architecture of Kerf, present examples to demonstrate the
power of our query language, and discuss the performance of our implementation of
this system.
Identification of Novel Small Molecule Inhibitors of Core-Binding Factor Dimerization by Computational Screening against NMR Molecular Ensembles
Dartmouth Technical Report TR2004-492
Ryan H. Lilien
Mohini Sridharan
Bruce R. Donald
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/356/TR2004-492.pdf (6325KB)
Abstract:
The long development process of novel pharmaceutical compounds begins with the identification
of a lead inhibitor compound. Computational screening to identify those ligands,
or small molecules, most likely to inhibit a target protein may benefit the pharmaceutical
development process by reducing the time required to identify a lead compound. Typically,
computational ligand screening utilizes high-resolution structural models of both
the protein and ligand to fit or `dock' each member of a ligand database into the
binding site of the protein. Ligands are then ranked by the number and quality of
interactions formed in the predicted protein-ligand complex. It is currently believed
that proteins in solution do not assume a single rigid conformation but instead
tend to move through a small region of conformation space. Therefore, docking ligands
against a static snapshot of protein structure has predictive limitations because
it ignores the inherent flexibility of the protein. A challenge, therefore, has
been the development of docking algorithms capable of modeling protein flexibility
while balancing computational feasibility. In this paper, we present our initial
development and work on a molecular ensemble-based algorithm to model protein flexibility
for protein-ligand binding prediction. First, a molecular ensemble is generated
from molecular structures satisfying experimentally-measured NMR constraints. Second,
traditional protein-ligand docking is performed on each member of the protein's
molecular ensemble. This step generates lists of ligands predicted to bind to each
individual member of the ensemble. Finally, lists of top predicted binders are consolidated
to identify those ligands predicted to bind multiple members of the protein's molecular
ensemble. We applied our algorithm to identify inhibitors of Core Binding Factor
(CBF) among a subset of approximately 70,000 ligands of the Available Chemicals
Directory. Our 26 top-predicted binding ligands are currently being tested experimentally
in the wetlab by both NMR-binding experiments (15N-edited Heteronuclear Single-Quantum
Coherence (HSQC)) and Electrophoretic Gel Mobility Shift Assays (EMSA). Preliminary
results indicate that of approximately 26 ligands tested, three induce perturbations
in the protein's NMR chemical shifts indicative of ligand binding and one ligand
(2-amino-5-cyano-4-tertbutyl thiazole) causes a band pattern in the EMSA indicating
the disruption of CBF dimerization.
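The consolidation step described in the abstract above (keeping ligands predicted to bind several members of the ensemble) can be illustrated with a short sketch. This is not the authors' implementation: the function name, the `min_members` threshold, and the toy ligand identifiers are assumptions made for illustration only.

```python
from collections import Counter


def consolidate_binders(top_lists, min_members=2):
    """Return (ligand, count) pairs for ligands that appear in the
    top-binder lists of at least `min_members` ensemble members.

    `top_lists` holds one list of ligand IDs per ensemble member.
    Results are sorted by descending count, then by ligand ID.
    """
    counts = Counter()
    for ligands in top_lists:
        # Count each ligand at most once per ensemble member.
        counts.update(set(ligands))
    hits = [(lig, n) for lig, n in counts.items() if n >= min_members]
    return sorted(hits, key=lambda item: (-item[1], item[0]))


# Hypothetical docking results for a three-member ensemble.
example_lists = [["L1", "L2", "L3"], ["L2", "L3"], ["L3", "L4"]]
consensus = consolidate_binders(example_lists, min_members=2)
```

In this toy input, `L3` is listed for all three members and `L2` for two, so only those two ligands survive consolidation.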
Evaluating next-cell predictors with extensive Wi-Fi mobility data
Dartmouth Technical Report TR2004-491
Libo Song
David Kotz
Ravi Jain
Xiaoning He
Date: January 2004
Abstract:
Location is an important feature for many applications, and wireless networks can
better serve their clients by anticipating client mobility. As a result, many location
predictors have been proposed in the literature, though few have been evaluated
with empirical evidence. This paper reports on the results of the first extensive
empirical evaluation of location predictors, using a two-year trace of the mobility
patterns of over 6,000 users on Dartmouth's campus-wide Wi-Fi wireless network.
We implemented and compared the prediction accuracy of several location predictors
drawn from four major families of domain-independent predictors, namely Markov-based,
compression-based, PPM, and SPM predictors. We found that low-order Markov predictors
performed as well as or better than the more complex and more space-consuming compression-based
predictors. Predictors of both families fail to make a prediction when the recent
context has not been previously seen. To overcome this drawback, we added a simple
fallback feature to each predictor and found that it significantly enhanced its
accuracy in exchange for modest effort. Thus the Order-2 Markov predictor with fallback
was the best predictor we studied, obtaining a median accuracy of about 72% for
users with long trace lengths. We also investigated a simplification of the Markov
predictors, where the prediction is based not on the most frequently seen context
in the past, but the most recent, resulting in significant space and computational
savings. We found that Markov predictors with this recency semantics can rival the
accuracy of standard Markov predictors in some cases. Finally, we considered several
seemingly obvious enhancements, such as smarter tie-breaking and aging of context
information, and discovered that they had little effect on accuracy. The paper ends
with a discussion and suggestions for further work.
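The idea of an order-k Markov predictor with fallback, as described in the abstract above, can be sketched as follows. This is a minimal illustration, not the code evaluated in the report: the class name, the method names, and the choice to fall back all the way to order 0 (the most frequent location overall) are assumptions.

```python
from collections import Counter, defaultdict


class MarkovPredictor:
    """Order-k Markov location predictor with fallback to shorter contexts."""

    def __init__(self, order=2):
        self.order = order
        # counts[k][context] maps each length-k context to a Counter of
        # the locations that followed it.
        self.counts = [defaultdict(Counter) for _ in range(order + 1)]
        self.history = []

    def _context(self, k):
        # The last k locations as a tuple; the empty tuple when k == 0.
        return tuple(self.history[len(self.history) - k:])

    def update(self, location):
        """Record a move to `location`."""
        for k in range(self.order + 1):
            if len(self.history) >= k:
                self.counts[k][self._context(k)][location] += 1
        self.history.append(location)

    def predict(self):
        """Predict the next location, or return None if nothing is known."""
        for k in range(self.order, -1, -1):
            if len(self.history) < k:
                continue
            followers = self.counts[k].get(self._context(k))
            if followers:
                # Most frequent follower of the longest context seen before.
                return followers.most_common(1)[0][0]
        return None


predictor = MarkovPredictor(order=2)
for cell in ["A", "B", "A", "C", "B"]:
    predictor.update(cell)
next_cell = predictor.predict()
```

In this trace the order-2 context `("C", "B")` has never been seen, so the predictor falls back to the order-1 context `("B",)`, whose only recorded follower is `"A"`.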
A Case Study of Four Location Traces
Dartmouth Technical Report TR2004-490
Guanling Chen
David Kotz
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/354/TR2004-490.pdf (676KB)
Abstract:
Location is one of the most important pieces of context information that a ubiquitous-computing
application may leverage. Thus understanding the location systems and how location-aware
applications interact with them is critical for design and deployment of both the
location systems and location-aware applications. In this paper, we analyze a set
of traces collected from two small-scale one-building location systems and two large-scale
campus-wide location systems. Our goal is to study characteristics of these location
systems and how these factors should be taken into account by a potentially large
number of location-aware applications with different needs. We make empirical measurements
of several important metrics and compare the results across these location systems.
We discuss the implication of these results on location-aware applications and their
supporting software infrastructure, and how location systems could be improved to
better serve applications' needs. Where possible, we use location-aware
applications discussed in the existing literature as illustrating examples.
Keyjacking: The Surprising Insecurity of Client-side SSL
Dartmouth Technical Report TR2004-489
John Marchesini
Sean W. Smith
Meiyuan Zhao
Date: January 2013
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/353/TR2004-489.pdf (141KB)
Abstract:
In theory, PKI can provide a flexible and strong way to authenticate users in distributed
information systems. In practice, much is being invested in realizing this vision
via client-side SSL and various client keystores. However, whether this works depends
on whether what the machines do with the private keys matches what the humans think
they do: whether a server operator can conclude from an SSL request authenticated
with a user's private key that the user was aware of and approved that request.
Exploring this vision, we demonstrate via a series of experiments that this assumption
does not hold with standard desktop tools, even if the browser user does all the
right things. A fundamental rethinking of the trust, usage, and storage model might
result in more effective tools for achieving the PKI vision.
Application-controlled loss-tolerant data dissemination
Dartmouth Technical Report TR2004-488
Guanling Chen
David Kotz
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/352/TR2004-488.pdf (256KB)
Abstract:
Reactive or proactive mobile applications require continuous monitoring of their
physical and computational environment to make appropriate decisions in time. These
applications need to monitor data streams produced by sensors and react to changes.
When mobile sensors and applications are connected by low-bandwidth wireless networks,
sensor data rates may overwhelm the capacity of network links or of the applications.
In traditional networks and distributed systems, flow-control and congestion-control
policies either drop data or force the sender to pause. When the data sender is
sensing the physical environment, however, a pause is equivalent to dropping data.
Arbitrary data drops are not necessarily acceptable to the reactive mobile applications
receiving sensor data. Data distribution systems must support application-specific
policies that selectively drop data objects when network or application buffers
overflow.
In this paper we present a data-dissemination service, PACK, which allows applications
to specify customized data-reduction policies. These policies define how to discard
or summarize data flows wherever buffers overflow on the dissemination path, notably
at the mobile hosts where applications often reside. The PACK service provides an
overlay infrastructure to support mobile data sources and sinks, using application-specific
data-reduction policies where necessary along the data path. We uniformly apply
the data-stream ``packing'' abstraction to buffer overflow caused by network congestion,
slow receivers, and the temporary disconnections caused by end-host mobility. We
demonstrate the effectiveness of our approach with an application example and experimental
measurements.
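The notion of an application-specified data-reduction policy that runs when a buffer overflows can be sketched in a few lines. This is not the PACK API: `PolicyBuffer`, `keep_latest`, and `average_oldest` are hypothetical names, and numeric sensor readings are assumed so that a summarizing policy can average them.

```python
class PolicyBuffer:
    """Bounded buffer that calls an application-supplied policy on overflow."""

    def __init__(self, capacity, policy):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.policy = policy  # callable: (items, capacity) -> reduced items
        self.items = []

    def put(self, item):
        self.items.append(item)
        if len(self.items) > self.capacity:
            # Let the application decide what to drop or summarize.
            self.items = self.policy(self.items, self.capacity)
            if len(self.items) > self.capacity:
                raise ValueError("policy did not reduce the buffer to capacity")


def keep_latest(items, capacity):
    """Drop policy: discard the oldest readings."""
    return items[-capacity:]


def average_oldest(items, capacity):
    """Summarize policy: merge the oldest readings into one average."""
    overflow = len(items) - capacity + 1
    head = items[:overflow]
    return [sum(head) / len(head)] + items[overflow:]


dropping = PolicyBuffer(3, keep_latest)
summarizing = PolicyBuffer(3, average_oldest)
for reading in [1, 2, 3, 4, 5]:
    dropping.put(reading)
    summarizing.put(reading)
```

After the five readings, the dropping buffer holds the three newest values, while the summarizing buffer holds one averaged value followed by the two newest. The same buffer code serves both applications; only the policy differs.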
High-Throughput 3D Homology Detection via NMR Resonance Assignment
Dartmouth Technical Report TR2004-487
Christopher J. Langmead
Bruce R. Donald
Date: January 2003
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/351/TR2004-487.pdf (284KB)
Abstract:
One goal of the structural genomics initiative is the identification of new protein
folds. Sequence-based structural homology prediction methods are an important means
for prioritizing unknown proteins for structure determination. However, an important
challenge remains: two highly dissimilar sequences can have similar folds --- how
can we detect this rapidly, in the context of structural genomics? High-throughput
NMR experiments, coupled with novel algorithms for data analysis, can address this
challenge. We report an automated procedure, called HD, for detecting 3D structural
homologies from sparse, unassigned protein NMR data. Our method identifies 3D models
in a protein structural database whose geometries best fit the unassigned experimental
NMR data. HD does not use, and is thus not limited by, sequence homology. The method
can also be used to confirm or refute structural predictions made by other techniques
such as protein threading or homology modelling. The algorithm runs in $O(pn^{5/2}
\log(cn) + p \log p)$ time, where $p$ is the number of proteins in the database,
$n$ is the number of residues in the target protein and $c$ is the maximum edge
weight in an integer-weighted bipartite graph. Our experiments on real NMR data
from 3 different proteins against a database of 4,500 representative folds demonstrate
that the method identifies closely related protein folds, including sub-domains
of larger proteins, with as little as 10-30% sequence homology between the target
protein (or sub-domain) and the computed model. In particular, we report no false negatives
or false positives despite significant percentages of missing experimental data.
A meeting detector and its applications
Dartmouth Technical Report TR2004-486
Jue Wang
Guanling Chen
David Kotz
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/350/TR2004-486.pdf (331KB)
Abstract:
In this paper we present a context-sensing component that recognizes meetings in
a typical office environment. Our prototype detects the meeting start and end by
combining outputs from pressure and motion sensors installed on the chairs. We developed
a telephone controller application that transfers incoming calls to voice-mail when
the user is in a meeting. Our experiments show that it is feasible to detect high-level
context changes with ``good enough'' accuracy, using low-cost, off-the-shelf hardware,
and simple algorithms without complex training. We also note the need for better
metrics to measure context detection performance, other than just accuracy. We propose
several metrics appropriate for our application in this paper. It may be useful,
however, for the community to define a set of general metrics as a basis to compare
different approaches to context detection.
Using SPKI/SDSI for Distributed Maintenance of Attribute Release Policies in Shibboleth
Dartmouth Technical Report TR2004-485
Sidharth Nazareth
Sean Smith
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/349/TR2004-485.pdf (456KB)
Abstract:
The Shibboleth middleware from Internet2 provides a way for users at higher-education
institutions to access remote electronic content in compliance with the inter-institutional
license agreements that govern such access. To protect end-user privacy, Shibboleth
permits users to construct attribute release policies that control what user credentials
a given content provider can obtain. However, Shibboleth leaves unspecified how
to construct these policies.
To be effective, a solution needs to accommodate the typical nature of a university:
a set of decentralized fiefdoms. This need argues for a public-key infrastructure
(PKI) approach---since public-key cryptography does not require parties to agree
on a secret beforehand, and parties distributed throughout the institution are unlikely
to agree on anything. However, this need also argues against the strict hierarchical
structure of traditional PKI---policy in different fiefdoms will be decided differently,
and originate within the fiefdom, rather than from an overall root.
This paper presents our design and prototype of a system that uses the decentralized
public-key framework of SPKI/SDSI to solve this problem.
Greenpass: Flexible and Scalable Authorization for Wireless Networks
Dartmouth Technical Report TR2004-484
Sean Smith
Nicholas C. Goffee
Sung Hoon Kim
Punch Taylor
Meiyuan Zhao
John Marchesini
Date: January 2004
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/348/TR2004-484.pdf (119KB)
Abstract:
Wireless networks break the implicit assumptions that supported authorization in
wired networks (that is: if one could connect, then one must be authorized). However,
ensuring that only authorized users can access a campus-wide wireless network creates
many challenges: we must permit authorized guests to access the same network resources
that internal users do; we must accommodate the de-centralized way that authority
flows in real universities; we also must work within standards, and accommodate
the laptops and systems that users already have, without requiring additional software
or plug-ins.
This paper describes our ongoing project to address this problem, using SPKI/SDSI
delegation on top of X.509 keypairs within EAP-TLS. Within the ``living laboratory''
of Dartmouth's wireless network, this project lets us solve real problems with wireless
networking, while also experimenting with trust flows and testing the limits of
current tools.
A Subgroup Algorithm to Identify Cross-Rotation Peaks Consistent with Non-Crystallographic Symmetry
Dartmouth Technical Report TR2003-481
Ryan H. Lilien
Chris Bailey-Kellogg
Amy A. Anderson
Bruce R. Donald
Date: January 2003
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/347/TR2003-481.pdf (1772KB)
Abstract:
Molecular replacement (MR) often plays a prominent role in determining initial phase
angles for structure determination by X-ray crystallography. In this paper, an efficient
quaternion-based algorithm is presented for analyzing peaks from a cross-rotation
function to identify model orientations consistent with non-crystallographic symmetry
(NCS), and to generate NCS-consistent orientations missing from the list of cross-rotation
peaks. Our algorithm, CRANS, analyzes the rotation differences between each pair
of cross-rotation peaks to identify finite subgroups of NCS. Sets of rotation differences
satisfying the subgroup axioms correspond to orientations compatible with the correct
NCS. The CRANS algorithm was first tested using cross-rotation peaks computed from
structure factor data for three test systems, and then used to assist in the de
novo structure determination of dihydrofolate reductase-thymidylate synthase (DHFR-TS)
from Cryptosporidium hominis. In every case, the CRANS algorithm runs in seconds
to identify orientations consistent with the observed NCS and to generate missing
orientations not present in the cross-rotation peak list. The CRANS algorithm has
application in every molecular replacement phasing effort with NCS.
Problems with the Dartmouth wireless SNMP data collection
Dartmouth Technical Report TR2003-480
Tristan Henderson
David Kotz
Date: January 2003
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/346/TR2003-480-rev3.pdf (52KB)
Abstract:
The original Dartmouth wireless network study used SNMP to query the college's Cisco
802.11b access points. The perl scripts that performed the SNMP queries suffered
from some problems, in that they queried inappropriate SNMP values, or misunderstood
the meaning of other values. This data was also used in a subsequent analysis. The
same scripts were used to collect data for a subsequent study of another wireless
network. This document outlines these problems and indicates which of the data collected
by the original scripts may be invalid.
Experimenting with TCPA/TCG Hardware, Or: How I Learned to Stop Worrying and Love The Bear
Dartmouth Technical Report TR2003-476
John Marchesini
Sean W. Smith
Omen Wild
Rich MacDonald
Date: January 2003
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/345/TR2003-476.pdf (154KB)
Abstract:
Over the last few years, our group has been working on applications of secure coprocessors---but
has been frustrated by the limited computational environment and high expense of
such devices. Over the last few years, the TCPA (now TCG) has produced a specification
for a trusted platform module (TPM)---a small hardware addition intended to improve
the overall security of a larger machine (and tied up with a still-murky vision
of Windows-based trusted computing). Some commodity desktops now come with these
TPMs.
Consequently, we began an experiment to see if (in the absence of a Non-Disclosure
Agreement) we could use this hardware to transform a desktop Linux machine into
a virtual secure coprocessor: more powerful but less secure than higher-end devices.
This experiment has several purposes: to provide a new platform for secure coprocessor
applications, to see how well the TCPA/TCG approach works, and (by working in open
source) to provide a platform for the broader community to experiment with alternative
architectures in the contentious area of trusted computing.
This paper reports what we have learned so far: the approach is feasible, but effective
deployment requires a more thorough look at OS security.
On the Complexity of Implementing Certain Classes of Shared Objects
Dartmouth Technical Report TR2003-475
King Y. Tan
Date: January 2003
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/344/TR2003-475.ps.Z (528KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/344/TR2003-475.pdf (647KB)
Abstract:
We consider shared memory systems in which asynchronous processes cooperate with
each other by communicating via shared data objects, such as counters, queues, stacks,
and priority queues. The common approach to implementing such shared objects is
based on locking: To perform an operation on a shared object, a process obtains
a lock, accesses the object, and then releases the lock. Locking, however, has several
drawbacks, including convoying, priority inversion, and deadlocks. Furthermore,
lock-based implementations are not fault-tolerant: if a process crashes while holding
a lock, other processes can end up waiting forever for the lock. Wait-free linearizable
implementations were conceived to overcome most of the above drawbacks of locking.
A wait-free implementation guarantees that if a process repeatedly takes steps,
then its operation on the implemented data object will eventually complete, regardless
of whether other processes are slow, or fast, or have crashed. In this thesis, we
first present an efficient wait-free linearizable implementation of a class of object
types, called closed and closable types, and then prove time and space lower bounds
on wait-free linearizable implementations of another class of object types, called
perturbable types. (1) We present a wait-free linearizable implementation of n-process
closed and closable types (such as swap, fetch&add, fetch&multiply, and fetch&L,
where L is any of the boolean operations and, or, or complement) using registers
that support load-link (LL) and store-conditional (SC) as base objects. The time
complexity of the implementation grows linearly with contention, but is never more
than O(log^2 n). We believe that this is the first implementation of a class of
types (as opposed to a specific type) to achieve a sub-linear time complexity. (2)
We prove linear time and space lower bounds on the wait-free linearizable implementations
of n-process perturbable types (such as increment, fetch&add, modulo k counter,
LL/SC bit, k-valued compare&swap (for any k >= n), single-writer snapshot) that
use resettable consensus and historyless objects (such as registers that support
read and write) as base objects. This improves on some previously known Omega(sqrt{n})
space complexity lower bounds. It also shows the near space optimality of some known
wait-free linearizable implementations.
A Probability-Based Similarity Measure for Saupe Alignment Tensors with Applications to Residual Dipolar Couplings in NMR Structural Biology
Dartmouth Technical Report TR2003-474
Anthony K. Yan
Christopher J. Langmead
Bruce Randall Donald
Date: January 2003
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/343/TR2003-474.pdf (452KB)
Abstract:
High-throughput NMR structural biology and NMR structural genomics pose a fascinating
set of geometric challenges. A key bottleneck in NMR structural biology is the resonance
assignment problem. We seek to accelerate protein NMR resonance assignment and structure
determination by exploiting a priori structural information. In particular,
a method known as Nuclear Vector Replacement (NVR) has been proposed as a method
for solving the assignment problem given a priori structural information [24,25].
Among several different kinds of input data, NVR uses a particular type of NMR data
known as residual dipolar couplings (RDCs). The basic physics of residual
dipolar couplings tells us that the data should be explainable by a structural model
and set of parameters contained within the Saupe alignment tensor.
In the NVR algorithm, one estimates the Saupe alignment tensors and then proceeds
to refine those estimates. We would like to quantify the accuracy of such estimates,
where we compare the estimated Saupe matrix to the correct Saupe matrix. In this
work, we propose a way to quantify this comparison. Given a correct Saupe matrix
and an estimated Saupe matrix, we compute an upper bound on the probability that
a randomly rotated Saupe tensor would have an error smaller than the estimated Saupe
matrix. This has the advantage of being a quantified upper bound which also has
a clear interpretation in terms of geometry and probability. While the specific
application of our rotation probability results is given to NVR, our novel methods
can be used for any RDC-based algorithm to bound the accuracy of the estimated alignment
tensors. Furthermore, they could also be used in X-ray crystallography or molecular
docking to quantitate the accuracy of calculated rotations of proteins, protein
domains, nucleic acids, or small molecules.
Evaluating Location Predictors with Extensive Wi-Fi Mobility Data
Dartmouth Technical Report TR2003-472
Libo Song
David Kotz
Ravi Jain
Xiaoning He
Date: January 2003
Abstract:
Bear: An Open-Source Virtual Secure Coprocessor based on TCPA
Dartmouth Technical Report TR2003-471
Rich MacDonald
Sean W. Smith
John Marchesini
Omen Wild
Date: January 2003
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/341/TR2003-471.ps.Z (376KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/341/TR2003-471.pdf (103KB)
Abstract:
This paper reports on our ongoing project to use TCPA to transform a desktop Linux
machine into a virtual secure coprocessor: more powerful but less secure than higher-end
devices. We use TCPA hardware and modified boot loaders to protect fairly static
components, such as a trusted kernel; we use an enforcer module---configured as
a Linux Security Module---to protect more dynamic system components; we use an encrypted
loopback filesystem to protect highly dynamic components.
All our code is open source and available under GPL from http://enforcer.sourceforge.net/
Using caching for browsing anonymity
Dartmouth Technical Report TR2003-470
Anna M. Shubina
Sean W. Smith
Date: January 2003
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/340/TR2003-470.ps.Z (221KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/340/TR2003-470.pdf (318KB)
Abstract:
Privacy-providing tools, including tools that provide anonymity, are gaining popularity
in the modern world. Among the goals of their users is avoiding tracking and profiling.
While some businesses are unhappy with the growth of privacy-enhancing technologies,
others can use lack of information about their users to avoid unnecessary liability
and even possible harassment by parties with contrary business interests, and to
gain a competitive market edge.
Currently, users interested in anonymous browsing have the choice only between single-hop
proxies and the few more complex systems that are available. These still leave the
user vulnerable to long-term intersection attacks.
In this paper, we propose a caching proxy system for allowing users to retrieve
data from the World-Wide Web in a way that would provide recipient unobservability
by a third party and sender unobservability by the recipient and thus dispose of
intersection attacks, and report on the prototype we built using Google.
MEMS for Infosecurity
Dartmouth Technical Report TR2003-469
Bruce R. Donald
Craig McGray
Daniela Rus
Date: January 2003
Abstract:
Formal Properties of Linear Memory Types
Dartmouth Technical Report TR2003-468
Heng Huang
Lea Wittie
Chris Hawblitzel
Date: January 2003
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/338/TR2003-468.ps.Z (442KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/338/TR2003-468.pdf (624KB)
Abstract:
Efficient low-level systems need more control over memory than safe high-level languages
usually provide. As a result, run-time systems are typically written in unsafe languages
such as C. This report describes an abstract machine designed to give type-safe
code more control over memory. It includes complete definitions and proofs of preservation,
progress, strong normalization, erasure, and translation correctness.
The mistaken axioms of wireless-network research
Dartmouth Technical Report TR2003-467
David Kotz
Calvin Newport
Chip Elliott
Date: January 2003
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/337/TR2003-467.ps.Z (17442KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/337/TR2003-467.pdf (3673KB)
Abstract:
Most research on ad-hoc wireless networks makes simplifying assumptions about radio
propagation. The ``Flat Earth'' model of the world is surprisingly popular: all
radios have circular range, have perfect coverage in that range, and travel on a
two-dimensional plane. CMU's ns-2 radio models are better but still fail to represent
many aspects of realistic radio networks, including hills, obstacles, link asymmetries,
and unpredictable fading. We briefly argue that key ``axioms'' of these types of
propagation models lead to simulation results that do not adequately reflect real
behavior of ad-hoc networks, and hence to network protocols that may not work well
(or at all) in reality. We then present a set of 802.11 measurements that clearly
demonstrate that these ``axioms'' are contrary to fact. The broad chasm between
simulation and reality calls into question many of the results from prior papers, and
we summarize with a series of recommendations for researchers considering analytic
or simulation models of wireless networks.
Digital Art Forensics
Dartmouth Technical Report TR2003-466
Siwei Lyu
Daniel Rockmore
Hany Farid
Date: January 2003
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/336/TR2003-466.pdf (240KB)
Abstract:
We describe a computational technique for digitally authenticating works of art.
This approach builds statistical models of an artist from a set of authenticated
works. Additional works are then authenticated against this model. The statistical
model consists of first- and higher-order wavelet statistics. We show preliminary
results from our analysis of thirteen drawings by Pieter Bruegel the Elder. We also
present preliminary results showing how these techniques may be applicable to determining
how many hands contributed to a single painting.
Using Low Level Linear Memory Management for Type-Preserving Mark-Sweep Garbage Collector
Dartmouth Technical Report TR2003-465
Edward Wei
Date: January 2003
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/335/TR2003-465.ps.Z (217KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/335/TR2003-465.pdf (336KB)
Abstract:
Efficient low-level systems such as garbage collectors need more control over memory
than safe high-level languages usually provide. Due to this constraint, garbage
collectors are typically written in unsafe languages such as C. A collector of this
form usually resides as a trusted primitive runtime service outside the model of
the programming language. The type safety of these languages depends on the assumption
that the garbage collector will not violate any typing invariants. However, no realistic
systems provide proof of this assumption.
A garbage collector written in a strongly typed language can guarantee not only
the safety of the garbage collector and the program being garbage collected (mutator),
but also the interaction between the collector and the mutator. Removing the garbage
collector from the trusted computing base has many additional benefits: Untrusted
code could be given more control over memory management without sacrificing security.
Low-level code such as device drivers could interface in a safe way with a garbage
collector. These benefits, together with the growing prevalence of garbage collectors in typical
programming systems, call for a safe solution.
Previous research by Wang et al. introduced a safe copying collector based on regions,
where the live graph structure of the heap is copied from an old region to a newer
region. This paper seeks to improve the efficiency of type-preserving garbage collection
with the introduction of a type-preserving mark and sweep garbage collector.
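As a rough illustration of the mark and sweep idea named in this abstract (not the type-preserving collector of the report itself), the sketch below marks every object reachable from a root set and then sweeps away the rest. The Obj class, the heap list, and the roots argument are hypothetical names chosen for this example.

```python
class Obj:
    """A heap object that may reference other objects (hypothetical model)."""
    def __init__(self, name):
        self.name = name
        self.refs = []
        self.marked = False

def mark(roots):
    """Mark phase: flag every object reachable from the roots."""
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if not obj.marked:
            obj.marked = True
            stack.extend(obj.refs)

def sweep(heap):
    """Sweep phase: keep marked objects, reset their marks, drop the rest."""
    live = [obj for obj in heap if obj.marked]
    for obj in live:
        obj.marked = False
    return live

def collect(heap, roots):
    """One full collection cycle over an explicit list of heap objects."""
    mark(roots)
    return sweep(heap)

a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
a.refs.append(b)
c.refs.append(d)   # c and d are unreachable from the root a
d.refs.append(c)   # a cycle among unreachable objects is still garbage
heap = collect([a, b, c, d], roots=[a])
print([obj.name for obj in heap])  # ['a', 'b']
```

Unlike the copying collector mentioned above, this scheme leaves live objects in place rather than moving them to a new region.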
A Surface-based Approach for Classification of 3D Neuroanatomic Structures
Dartmouth Technical Report TR2003-464
Li Shen
James Ford
Fillia Makedon
Andrew Saykin
Date: January 2003
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/334/TR2003-464.ps.Z (1189KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/334/TR2003-464.pdf (847KB)
Abstract:
We present a new framework for 3D surface object classification that combines a powerful
shape description method with suitable pattern classification techniques. Spherical
harmonic parameterization and normalization techniques are used to describe a surface
shape and derive a dual high dimensional landmark representation. A point distribution
model is applied to reduce the dimensionality. Fisher's linear discriminants and
support vector machines are used for classification. Several feature selection schemes
are proposed for learning better classifiers. After showing the effectiveness of
this framework using simulated shape data, we apply it to real hippocampal data
in schizophrenia and perform extensive experimental studies by examining different
combinations of techniques. We achieve best leave-one-out cross-validation accuracies
of 93% (whole set, N=56) and 90% (right-handed males, N=39), respectively, which
are competitive with the best results in previous studies using different techniques
on similar types of data. Furthermore, to help medical diagnosis in practice, we
employ a threshold-free receiver operating characteristic (ROC) approach as an alternative
evaluation of classification results as well as propose a new method for visualizing
discriminative patterns.
Investigation of Third Party Rights Service and Shibboleth Modification to Introduce the Service
Dartmouth Technical Report TR2003-463
Sanket Agrawal
Date: January 2003
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/333/TR2003-463.ps.Z (919KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/333/TR2003-463.pdf (477KB)
Abstract:
Shibboleth is an architecture to support inter-institutional sharing of electronic
resources that are subject to access control. Codifying copyright in Shibboleth
authorization policies is difficult because of the copyright exceptions which can
be highly subjective. Third Party Rights Service is a high-level concept that has
been suggested as a solution to approximate the exceptions of copyright law. In
this thesis, I investigate the components of the Third Party Rights Service. I design
and analyze a modified Shibboleth architecture based on these components. The resulting
architecture allows for the phased addition of the resources to make use of the
Third Party Rights Service, while keeping the existing resources in Shibboleth.
Distributed planning and control for modular robots with unit-compressible modules
Dartmouth Technical Report TR2003-462
Zack Butler
Daniela Rus
Date: January 2003
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/332/TR2003-462.ps.Z (2338KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/332/TR2003-462.pdf (1331KB)
Abstract:
Self-reconfigurable robots are versatile systems consisting of large numbers of independent
modules. Effective use of these systems requires parallel actuation and planning,
both for efficiency and independence from a central controller. This paper presents
the PacMan algorithm, a technique for distributed actuation and planning for systems
with two- or three-dimensional unit-compressible modules. We give two versions of
the algorithm along with correctness analysis. We also analyze the parallel actuation
capability of the algorithm, showing that it will not deadlock and will avoid disconnecting
the robot. We have implemented PacMan on the Crystal robot, a hardware system developed
in our lab, and we present experiments and discuss the feasibility of large-scale
implementation.
Trusted S/MIME Gateways
Dartmouth Technical Report TR2003-461
Mindy J. Pereira
Date: January 2003
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/331/TR2003-461.ps.Z (749KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/331/TR2003-461.pdf (163KB)
Abstract:
The utility of Web-based email clients is clear: a user is able to access their email
account from any computer anywhere at any time. However, this option is unavailable
to users whose security depends on their key pair being stored either on their local
computer or in their browser. Our implementation seeks to solve two problems with
secure email services. The first is that of mobility: users must have access to their
key pairs in order to perform the necessary cryptographic operations. The second
is one of transition: initially, users would not want to give up their regular email
clients. Keeping these two restrictions in mind, we decided on the implementation
of a secure gateway system that works in conjunction with an existing mail server
and client. Our result is PKIGate, an S/MIME gateway that uses the DigitalNet (formerly
Getronics) S/MIME Freeware Library and IBM's 4758 secure coprocessor. This thesis
presents motivations for the project, a comparison with similar existing products,
software and hardware selection, the design, use case scenarios, a discussion of
implementation issues, and suggestions for future work.
Enhancing Asynchronous Parallel Computing
Dartmouth Technical Report TR2003-460
Elizabeth A. Hamon
Date: January 2003
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/330/TR2003-460.ps.Z (117KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/330/TR2003-460.pdf (195KB)
Abstract:
In applications using large amounts of data, hiding the latency inherent in accessing
data far from the processor is often necessary in order to achieve high performance.
Several researchers have observed that one way to address the challenge of latency
is by using a common structure: in a series of passes, the program reads in the
data, performs various operations on it, and writes out the data. Passes often consist
of a pipeline structure composed of different stages. In order to achieve high performance,
the stages are frequently overlapped, for example, by using asynchronous threads.
Out-of-core parallel programs provide one such example of this pattern. The development
and debugging time resulting from coordinating overlapping stages, however, can
be substantial. Moreover, modifying the structure of the overlap in an attempt to
achieve higher performance can require significant additional time on the part of
the programmer. This thesis presents FG, a Framework Generator designed to coordinate
the stages of a pipeline and allow the programmer to easily experiment with the
pipeline's structure, thus significantly reducing time to solution. We also discuss
preliminary results of using FG in an out-of-core sorting program.
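The overlapped pipeline pattern this abstract describes can be sketched with threads and queues. This is only an illustration of the general pattern, not FG's interface: the stage and run_pipeline functions and the SENTINEL marker are hypothetical names, and each stage here runs in its own thread and passes buffers downstream.

```python
import queue
import threading

SENTINEL = object()  # unique end-of-stream marker

def stage(func, inbox, outbox):
    """Apply func to each buffer from inbox until the sentinel arrives."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate shutdown to the next stage
            break
        outbox.put(func(item))

def run_pipeline(buffers, funcs):
    """Chain one thread per stage; stages overlap on different buffers."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    threads = [
        threading.Thread(target=stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(funcs)
    ]
    for t in threads:
        t.start()
    for buf in buffers:
        queues[0].put(buf)
    queues[0].put(SENTINEL)
    results = []
    while True:
        item = queues[-1].get()
        if item is SENTINEL:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results

# Two stages: sort each buffer, then double every element.
result = run_pipeline([[3, 1, 2], [9, 7, 8]],
                      [sorted, lambda buf: [x * 2 for x in buf]])
print(result)  # [[2, 4, 6], [14, 16, 18]]
```

Changing the pipeline's structure in this sketch means editing only the list of stage functions, which is the kind of experimentation the abstract says a framework should make easy.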
Efficient I/O for Computational Grid Applications
Dartmouth Technical Report TR2003-459
Ron A. Oldfield
Date: January 2003
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/329/TR2003-459.ps.Z (4118KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/329/TR2003-459.pdf (1852KB)
Abstract:
High-performance computing increasingly occurs on "computational grids" composed
of heterogeneous and geographically distributed systems of computers, networks,
and storage devices that collectively act as a single "virtual" computer. A key
challenge in this environment is to provide efficient access to data distributed
across remote data servers. This dissertation explores some of the issues associated
with I/O for wide-area distributed computing and describes an I/O system, called
Armada, with the following features: a framework to allow application and dataset
providers to flexibly compose graphs of processing modules that describe the distribution,
application interfaces, and processing required of the dataset before or after computation;
an algorithm to restructure application graphs to increase parallelism and to improve
network performance in a wide-area network; and a hierarchical graph-partitioning
scheme that deploys components of the application graph in a way that is both beneficial
to the application and sensitive to the administrative policies of the different
administrative domains. Experiments show that applications using Armada perform
well in both low- and high-bandwidth environments, and that our approach does an
exceptional job of hiding the network latency inherent in grid computing.
Power Conservation in the Network Stack of Wireless Sensors
Dartmouth Technical Report TR2003-458
Michael De Rosa
Date: January 2003
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/328/TR2003-458.pdf (271KB)
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/328/TR2003-458.ps.Z (705KB)
Abstract:
Wireless sensor networks have recently become an incredibly active research area
in the networking community. Much attention has been given to the construction of
power-conserving protocols and techniques, as battery life is the one factor that
prevents successful wide-scale deployment of such networks. These techniques concentrate
on the optimization of network behavior, as the wireless transmission of data is
the most expensive operation performed by a sensor node. Very little work has been
published on the integration of such techniques, and their suitability to various
application domains. This paper presents an exhaustive power consumption analysis
of network stacks constructed with common algorithms, to determine the interactions
between such algorithms and the suitability of the resulting network stack for various
applications.
Electronic Documents and Digital Signatures
Dartmouth Technical Report TR2003-457
Kunal Kain
Date: January 2003
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/327/TR2003-457.pdf (5002KB)
Abstract:
Often, the main motivation for using PKI in business environments is to streamline
workflow, by enabling humans to digitally sign electronic documents, instead of
manually signing paper ones. However, this application fails if adversaries can
construct electronic documents whose viewed contents can change in useful ways,
without invalidating the digital signature. In this paper, we examine the space
of such attacks, and describe how many popular electronic document formats and PKI
packages permit them.
Discovery, Visualization and Analysis of Gene Regulatory Sequence Elements in Genomes
Dartmouth Technical Report TR2003-456
Daniel F. Simola
Date: January 2003
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/326/TR2003-456.pdf (2706KB)
Abstract:
The advent of rapid DNA sequencing has produced an explosion in the amount of available
sequence information, permitting us to ask many new questions about DNA. There is
a pressing need to design algorithms that can provide answers to questions related
to the control of gene expression, and thus to the structure, function, and behavior
of organisms. Such algorithms must filter through massive amounts of informational
noise to identify meaningful conserved regulatory DNA sequence elements. We are
approaching these questions with the notion that visualization is a key to exploring
data relationships. Understanding the exact nature of these relationships can be
very difficult by simply interpreting raw data. The ability to look at data in a
graphical form allows us to apply our innate capacity to think visually to discern
the subtle relationships that might not be recognizable otherwise. This thesis
provides computational tools to visually identify and analyze candidate motifs in
the DNA of a species. This includes a parsing utility to store genomic data and
an application to search for and visually identify motifs. Using these tools, novel
and previously compiled gene sets were identified using the genome of the plant
species Arabidopsis thaliana.
Persistence and Prevalence in the Mobility of Dartmouth Wireless Network Users
Dartmouth Technical Report TR2003-455
Clara E. Lee
Date: January 2003
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/325/TR2003-455.ps.Z (224KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/325/TR2003-455.pdf (152KB)
Abstract:
Wireless local-area networks (WLANs) are increasing in popularity. As more people
use WLANs it is important to understand how these users behave. We analyzed data
collected over three months of 2002 to measure the persistence and prevalence of
users of the Dartmouth wireless network.
We found that most of the users of Dartmouth's network have short association times
and a high rate of mobility. This observation fits with the predominantly student
population of Dartmouth College, because students do not have a fixed workplace
and are moving to and from classes all day.
Discrete-Event Fluid Modeling of Background TCP Traffic
Dartmouth Technical Report TR2003-454
David M. Nicol
Guanhua Yan
Date: January 2003
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/324/TR2003-454.ps.Z (2390KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/324/TR2003-454.pdf (846KB)
Abstract:
TCP is the most widely used transport-layer protocol in the Internet today.
A TCP session adapts the demands it places on the network to observations of bandwidth
availability on the network. Because TCP is adaptive, any model of its behavior
that aspires to be accurate must be influenced by other network traffic. This point
is especially important in the context of using simulation to evaluate some new
network algorithm of interest (e.g. reliable multi-cast) in an environment where
the background traffic affects---and is affected by---its behavior. We need to generate
background traffic efficiently in a way that captures the salient features of TCP,
while the reference and background traffic representations interact with each other.
This paper describes a fluid model of TCP and a switching model that has flows represented
by fluids interacting with packet-oriented flows. We describe conditions under which
a fluid model produces exactly the same behavior as a packet-oriented model, and
we quantify the performance advantages of the approach both analytically and empirically.
We observe that very significant speedups may be attained while keeping high accuracy.
SPADE: SPKI/SDSI for Attribute Release Policies in a Distributed Environment
Dartmouth Technical Report TR2003-453
Sidharth P. Nazareth
Date: January 2003
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/323/TR2003-453.ps.Z (772KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/323/TR2003-453.pdf (669KB)
Abstract:
Shibboleth is a federated administration system that supports inter-institutional
authentication and authorization for sharing of resources. SPKI/SDSI is a public
key infrastructure whose creation was motivated by the perception that X.509 is
too complex and flawed. This thesis addresses the problem of how users that are
part of a Public Key Infrastructure in a distributed computing system can effectively
specify, create, and disseminate their Attribute Release Policies for Shibboleth
using SPKI/SDSI. This thesis explores existing privacy mechanisms, as well as distributed
trust management and policy based systems. My work describes the prototype for a
Trust Management Framework called SPADE (SPKI/SDSI for Attribute Release Policies
in a Distributed Environment) that I have designed, developed and implemented. The
principal result of this research has been the demonstration that SPKI/SDSI is a
viable approach for trust management and privacy policy specification, especially
for minimalistic policies in a distributed environment.
An Analysis of Convergence Properties of the Border Gateway Protocol Using Discrete Event Simulation
Dartmouth Technical Report TR2003-452
Brian J. Premore
Date: January 2003
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/322/TR2003-452.ps.Z (837KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/322/TR2003-452.pdf (2384KB)
Abstract:
The Internet is an enormous internetwork formed by connecting tens of thousands of
independently managed computer networks. Though the Internet has no central authority
and is highly heterogeneous, a universally adopted addressing scheme---defined by
the Internet Protocol (IP)---makes interaction between the individual networks possible.
Complementing IP is the Border Gateway Protocol (BGP), which facilitates communication
between parts of the internetwork by determining paths by which data can get from
one network to any other. Just as IP is used ubiquitously as an addressing scheme,
BGP is used ubiquitously for the purpose of network-to-network routing.
Because BGP is universal, its well-being is the concern of everyone. In other words,
when BGP suffers, everyone suffers. Even when just one instance of BGP on one router
is ill-behaved, it can have global effects. Unfortunately, as the Internet has grown,
the amount of stress put on BGP has increased. For a long time, the behavior of
inter-domain routing was studied minimally and was assumed to be working just fine.
Research eventually showed, however, that routing was not actually functioning so
smoothly, and the highly dynamic nature of the Internet was taking its toll on the
routing infrastructure. This discovery prompted a closer look at the behavior of
BGP.
Though its underlying premise is a simple distributed shortest-path algorithm, the
dynamic nature of the Internet, combined with some additional constraints in the
protocol, has made analytical approaches to studying the protocol infeasible. Measurement-based
approaches have been taken, but they are difficult to implement and have minimal
leeway for allowing exploration of the protocol's behavior under different conditions.
For these reasons we have taken the approach of simulation in order to begin to
understand some of the complex ways in which BGP behaves. Simulation allows one
to explore the protocol more fully, testing it under various conditions and modifying
the protocol itself to explore the consequences of its fundamental design.
We have studied BGP behavior with respect to several parameters, some external (network
characteristics) and some internal (protocol characteristics). We show that there
is room for improvement in the protocol, in particular with respect to convergence
following changes in availability of an address in the network. The rate-limiting
mechanism of the protocol is a particular parameter of concern. Although it was
previously thought to help improve convergence, we found that in some cases it can
have drastic degrading effects. As a result of our work, we suggest ways in which
BGP could be modified in practice to reduce the instability of the protocol.
802.11b Wireless Network Visualization and Radiowave Propagation Modeling
Dartmouth Technical Report TR2003-451
Chris Lentz
Date: January 2003
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/321/TR2003-451.pdf (2247KB)
Abstract:
This paper outlines the methods of creating detailed coverage maps of 802.11b networks,
with an emphasis on minimizing the expenses and time involved. The goal of this
work is to develop and present a streamlined, reproducible approach to wireless
visualization as well as techniques for predicting coverage area before conducting
network installations.
After evaluating these coverage maps, a repeated series of field measurements will
be checked against interpolated values in order to improve techniques for extrapolation
of data for unsampled regions. If successful, these extrapolation techniques will
provide additional guidelines for, and assist modeling of, new wireless network
installations. However, this paper demonstrates that due to the microcellular structure
of indoor/outdoor 802.11b networks, accurate interpolation and propagation prediction
techniques do not exist independent of highly specific location models. In lieu
of the creation of extensive simulation environments, best practice guidelines for
municipal wireless network planning and deployment are presented.
An Evaluation of the Impact of Models for Radio Propagation on the Simulation of 802.11b Wireless Networks
Dartmouth Technical Report TR2003-450
Evan W. Richardson
Date: January 2003
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/320/TR2003-450.ps.Z (4632KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/320/TR2003-450.pdf (2937KB)
Abstract:
Working with an existing wireless network simulator, we describe the addition of
methods both for modeling arbitrary terrain and for calculating signal attenuation
with the Irregular Terrain Model (ITM). We also investigate ITM's effects on the upper
protocol layers in comparison to the Two-Ray Ground Reflection model. Upon examination,
it was found that aside from the terrain between the transmitter and receiver, ITM's
various parameters are of little significance in the computed signal attenuation.
Further, examination of the behavior of the upper protocol layers revealed that
at high traffic levels, choice of propagation model can have significant effects
on the results of the simulation.
An Active Learning Approach to Efficiently Ranking Retrieval Engines
Dartmouth Technical Report TR2003-449
Lisa A. Torrey
Date: January 2003
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/319/TR2003-449.ps.Z (469KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/319/TR2003-449.pdf (795KB)
Abstract:
Evaluating retrieval systems, such as those submitted to the annual TREC competition,
usually requires a large number of documents to be read and judged for relevance
to query topics. Test collections are far too big to be exhaustively judged, so
only a subset of documents is selected to form the judgment "pool." The selection
method that TREC uses produces pools that are still quite large. Research has indicated
that it is possible to rank the retrieval systems correctly using substantially
smaller pools.
This paper introduces an active learning algorithm whose goal is to reach the correct
rankings using the smallest possible number of relevance judgments. It adds one
document to the pool at a time, always trying to select the document with the highest
information gain. Several variants of this algorithm are described, each with improvements
on the one before. Results from experiments are included for comparison with the
traditional TREC pooling method. The best version of the algorithm reliably outperforms
the traditional method, although its degree of improvement varies.
Billiards Adviser as a Search in a Continuous Domain with Significant Uncertainty
Dartmouth Technical Report TR2003-448
Thomas Mueller
Date: January 2003
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/318/TR2003-448.pdf (507KB)
Abstract:
Typical search algorithms are limited to problems in which there is a certain number
of moves for any given state, and the effect of each move is well known. In order
to overcome this limitation, we consider the problem of determining the optimal
shot given the positions of balls on a billiards table. Our solution includes the
image recognition necessary to determine each ball's position, the calculation of
the optimal shot, and the presentation of that shot to the player. The focus of
the paper is on the second part - determining the angle and force with which the
player should attempt to hit the cue ball for each shot in order to sink all of
the other balls with the fewest shots. The solution to this problem differs from
other game search algorithms in that it must take into account the infinite number
of possible shots given any configuration of balls as well as the fact that the
player is not likely to hit the ball exactly as intended. We compare
the performance of our algorithm with one that ignores the latter fact to show that
our modifications do in fact improve performance for a search in a continuous domain
with significant uncertainty.
Efficient and Practical Constructions of LL/SC Variables
Dartmouth Technical Report TR2003-446
Prasad Jayanti
Srdjan Petrovic
Date: January 2003
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/317/TR2003-446.pdf (309KB)
Abstract:
Over the past decade, LL/SC have emerged as the most suitable synchronization instructions
for the design of lock-free algorithms. However, current architectures do not support
these instructions; instead, they support either CAS or RLL/RSC (e.g. POWER4, MIPS,
SPARC, IA-64). To bridge this gap, this paper presents two efficient wait-free algorithms
for implementing 64-bit LL/SC objects from 64-bit CAS or RLL/RSC objects.
Our first algorithm is practical: it has a small, constant time complexity (of
4 for LL and 5 for SC) and a space overhead of only 4 words per process. This algorithm
uses unbounded sequence numbers. For theoretical interest, we also present a more
complex bounded algorithm that still guarantees constant time complexity and O(1)
space overhead per process.
The LL/SC primitive is free of the well-known ABA problem that afflicts CAS. By
efficiently implementing LL/SC words from CAS words, this work presents an efficient
general solution to the ABA problem.
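To make the ABA problem and the role of sequence numbers concrete, here is a deliberately simplified sketch. It is not the report's wait-free construction: it keeps the value and an unbounded sequence number side by side and uses a lock as a stand-in for hardware atomicity. The VersionedWord class and its ll and sc methods are hypothetical names for illustration.

```python
import threading

class VersionedWord:
    """Toy LL/SC word: a value paired with an unbounded sequence number."""
    def __init__(self, value):
        self._value = value
        self._seq = 0
        self._lock = threading.Lock()  # stands in for atomic hardware steps
        self._linked = {}              # per-process sequence number seen at LL

    def ll(self, pid):
        """Load-linked: read the value and remember the current version."""
        with self._lock:
            self._linked[pid] = self._seq
            return self._value

    def sc(self, pid, new_value):
        """Store-conditional: succeed only if no write occurred since LL."""
        with self._lock:
            if self._linked.get(pid) != self._seq:
                return False
            self._value = new_value
            self._seq += 1
            return True

w = VersionedWord("A")
w.ll(0)                      # process 0 links to "A"
w.ll(1)
first = w.sc(1, "B")         # process 1 changes A -> B
w.ll(1)
second = w.sc(1, "A")        # ... and back to A
stale = w.sc(0, "C")         # process 0's link is stale, so this fails
print(first, second, stale)  # True True False
```

A compare-and-swap on the value alone would have let process 0 succeed, because the word holds "A" again; the sequence number is what exposes the intervening writes.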
Relaxing the Problem-Size Bound for Out-of-Core Columnsort
Dartmouth Technical Report TR2003-445
Geeta Chaudhry
Elizabeth A. Hamon
Thomas H. Cormen
Date: January 2003
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/316/TR2003-445.ps.Z (88KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/316/TR2003-445.pdf (123KB)
Abstract:
Previous implementations of out-of-core columnsort limit the problem size to $N \leq
\sqrt{(M/P)^3 / 2}$, where $N$ is the number of records to sort, $P$ is the number
of processors, and $M$ is the total number of records that the entire system can
hold in its memory (so that $M/P$ is the number of records that a single processor
can hold in its memory). We implemented two variations to out-of-core columnsort
that relax this restriction. Subblock columnsort is based on an algorithmic modification
of the underlying columnsort algorithm, and it improves the problem-size bound to
$N \leq (M/P)^{5/3} / 4^{2/3}$ but at the cost of additional disk I/O. $M$-columnsort
changes the notion of the column size in columnsort, improving the maximum problem
size to $N \leq \sqrt{M^3 / 2}$ but at the cost of additional computation and communication.
Experimental results on a Beowulf cluster show that both subblock columnsort and
$M$-columnsort run well but that $M$-columnsort is faster. A further advantage of
$M$-columnsort is that it handles a wider range of problem sizes than subblock columnsort.
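The three problem-size bounds quoted in this abstract are easy to compare numerically. The sketch below simply evaluates the formulas; the function name problem_size_bounds and the sample values of M and P are assumptions made for illustration, not figures from the report.

```python
import math

def problem_size_bounds(M, P):
    """Maximum number of records N under each bound from the abstract.

    M is the total number of records the system's memory holds,
    P is the number of processors.
    """
    per_proc = M / P
    return {
        "original": math.sqrt(per_proc ** 3 / 2),
        "subblock": per_proc ** (5 / 3) / 4 ** (2 / 3),
        "m_columnsort": math.sqrt(M ** 3 / 2),
    }

# Hypothetical configuration: 2^24 records of total memory on 16 processors.
b = problem_size_bounds(M=2 ** 24, P=16)
for name, limit in b.items():
    print(f"{name:>12}: N <= {limit:.3e}")
```

For this configuration the subblock bound exceeds the original bound, and the $M$-columnsort bound exceeds both.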
Stupid Columnsort Tricks
Dartmouth Technical Report TR2003-444
Geeta Chaudhry
Thomas H. Cormen
Date: January 2003
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/315/TR2003-444.ps.Z (130KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/315/TR2003-444.pdf (190KB)
Abstract:
Leighton's columnsort algorithm sorts on an $r \times s$ mesh, subject to the restrictions
that $s$ is a divisor of $r$ and that $r \geq 2s^2$ (so that the mesh is tall and
thin). We show how to mitigate both of these restrictions. One result is that the
requirement that $s$ is a divisor of $r$ is unnecessary; columnsort sorts correctly
whether or not $s$ divides $r$. We present two algorithms that, as long as $s$ is
a perfect square, relax the restriction that $r \geq 2s^2$; both reduce the exponent
of $s$ to $3/2$. One algorithm requires $r \geq 4s^{3/2}$ if $s$ divides $r$ and
$r \geq 6s^{3/2}$ if $s$ does not divide $r$. The other algorithm requires $r \geq
4s^{3/2}$, and it requires $s$ to be a divisor of $r$. Both algorithms have applications
in increasing the maximum problem size in out-of-core sorting programs.
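As a small worked check of these mesh restrictions, the sketch below tests whether a given r by s mesh satisfies Leighton's original conditions and the conditions of the first relaxed algorithm described above. The function names are hypothetical, and only the inequalities stated in the abstract are encoded; nothing here performs the sort itself.

```python
import math

def original_ok(r, s):
    """Leighton's restrictions: s divides r and r >= 2 s^2."""
    return r % s == 0 and r >= 2 * s * s

def relaxed_ok(r, s):
    """First relaxed algorithm: s a perfect square, and
    r >= 4 s^(3/2) if s divides r, else r >= 6 s^(3/2)."""
    root = math.isqrt(s)
    if root * root != s:
        return False                 # s must be a perfect square
    factor = 4 if r % s == 0 else 6
    return r >= factor * s * root    # s * sqrt(s) equals s^(3/2)

# With s = 16 columns: 2 s^2 = 512, 4 s^(3/2) = 256, 6 s^(3/2) = 384.
print(original_ok(256, 16), relaxed_ok(256, 16))  # False True
print(original_ok(390, 16), relaxed_ok(390, 16))  # False True
print(original_ok(380, 16), relaxed_ok(380, 16))  # False False
```

Using integer arithmetic for s^(3/2) avoids floating-point rounding at the boundary cases.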
Keyjacking: Risks of the Current Client-side Infrastructure
Dartmouth Technical Report TR2003-443
John C. Marchesini
Sean W. Smith
Meiyuan Zhao
Date: January 2003
URL (application/x-gzip) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/314/TR2003-443.ps.Z (57KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/314/TR2003-443.pdf (104KB)
Abstract:
In theory, PKI can provide a flexible and strong way to authenticate users in distributed
information systems. In practice, much is being invested in realizing this vision
via client-side SSL and browser-based keystores. Exploring this vision, we demonstrate
that browsers will use personal certificates to authenticate requests that the person
neither knew of nor approved (and which password-based systems would have defeated),
and we demonstrate the easy permeability of these keystores (including new attacks
on medium and high-security IE/XP keys). We suggest some countermeasures, but also
suggest that a fundamental rethinking of the trust, usage, and storage model might
result in a more effective PKI.
Privacy-enhanced credential services
Dartmouth Technical Report TR2003-442
Alex Iliev
Sean Smith
Date: January 2003
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/313/TR2003-442.ps.Z (133KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/313/TR2003-442.pdf (290KB)
Abstract:
The use of credential directories in PKI and authorization systems such as Shibboleth
introduces a new privacy risk: an insider at the directory can learn much about
otherwise protected interactions by observing who makes queries, and what they ask
for. Recent advances in Practical Private Information Retrieval provide promising
countermeasures. In this paper, we extend this technology to solve this new privacy
problem, and present a design and preliminary prototype for an LDAP-based credential
service that can prevent even an insider from learning anything more than the fact
a query was made. Our preliminary performance analysis suggests that the complete
prototype may be sufficiently robust for academic enterprise settings.
Flexible and Scalable Public Key Security for SSH
Dartmouth Technical Report TR2003-441
Yasir Ali
S. W. Smith
Date: January 2003
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/312/TR2003-441.pdf (609KB)
Abstract:
A standard tool for secure remote access, the SSH protocol uses public-key cryptography
to establish an encrypted and integrity-protected channel with a remote server.
However, widely-deployed implementations of the protocol are vulnerable to man-in-the-middle
attacks, where an adversary substitutes her public key for the server's. This danger
particularly threatens a traveling user Bob borrowing a client machine.
Imposing a traditional X.509 PKI on all SSH servers and clients is neither flexible
nor scalable nor (in the foreseeable future) practical. Requiring extensive work
or an SSL server at Bob's site is also not practical for many users.
This paper presents our experiences designing and implementing an alternative scheme
that solves the public-key security problem in SSH without requiring such an a priori
universal trust structure or extensive sysadmin work--although it does require a
modified SSH client. (The code is available for public download.)
Efficient Security for BGP Route Announcements
Dartmouth Technical Report TR2003-440
David M. Nicol
Sean W. Smith
Meiyuan Zhao
Date: January 2003
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/311/TR2003-440.R2.ps.Z (71KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/311/TR2003-440.R2.pdf (132KB)
Abstract:
The Border Gateway Protocol (BGP) determines how Internet traffic is routed throughout
the entire world; malicious behavior by one or more BGP speakers could create serious
security issues. Since the protocol depends on a speaker honestly reporting path
information sent by previous speakers and involves a large number of independent
speakers, the Secure BGP (S-BGP) approach uses public-key cryptography to ensure
that a malicious speaker cannot fabricate this information. However, such public-key
cryptography is expensive: S-BGP requires a digital signature operation on each
announcement sent to each peer, and a linear (in the length of the path) number
of verifications on each receipt. We use simulation of a 110 AS system derived from
the Internet to evaluate the impact that the processing costs of cryptography have
on BGP convergence time. We find that under heavy load the convergence time using
ordinary S-BGP is nearly twice as large as under BGP. We examine the impact of highly
aggressive caching and pre-computation optimizations for S-BGP, and find that convergence
time is much closer to that of BGP. However, these optimizations may be unrealistic, and
are certainly expensive in memory. We consequently use the structure of BGP processing
to design optimizations that reduce cryptographic overhead by amortizing the cost
of private-key signatures over many messages. We call this method Signature-Amortization
(S-A). We find that S-A provides convergence times as good as or better than those of the
highly optimized S-BGP, but without the cost and complications of caching and pre-computation.
It is therefore possible to minimize the impact route validation has on convergence
by being careful with signatures, rather than by consuming memory.
3D-Structural Homology Detection via Unassigned Residual Dipolar Couplings
Dartmouth Technical Report TR2003-439
Chris J. Langmead
Bruce R. Donald
Date: January 2003
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/310/TR2003-439.pdf (280KB)
Abstract:
Recognition of a protein's fold provides valuable information about its function.
While many sequence-based homology prediction methods exist, an important challenge
remains: two highly dissimilar sequences can have similar folds --- how can we detect
this rapidly, in the context of structural genomics? High-throughput NMR experiments,
coupled with novel algorithms for data analysis, can address this challenge. We
report an automated procedure for detecting 3D-structural homologies from sparse,
unassigned protein NMR data.
Our method identifies the 3D-structural models in a protein structural database
whose geometries best fit the unassigned experimental NMR data. It does not use
sequence information and is thus not limited by sequence homology. The method can
also be used to confirm or refute structural predictions made by other techniques
such as protein threading or sequence homology. The algorithm runs in O(pnk^3)
time, where p is the number of proteins in the database, n is the number of residues
in the target protein, and k is the resolution of a rotation search. The method
requires only uniform 15N-labelling of the protein and processes unassigned 1H-15N
residual dipolar couplings, which can be acquired in a couple of hours. Our experiments
on NMR data from 5 different proteins demonstrate that the method identifies closely
related protein folds, despite low sequence homology between the target protein
and the computed model.
Exact formulae for the Lovasz Theta Function of sparse Circulant Graphs
Dartmouth Technical Report TR2002-438
Valentino Crespi
Date: January 2002
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/309/TR2002-438.ps.Z (153KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/309/TR2002-438.pdf (290KB)
Abstract:
The Lovasz theta function has attracted a lot of attention for its connection with
diverse issues, such as communicating without errors and computing large cliques
in graphs. Indeed this function enjoys the remarkable property of being computable
in polynomial time, despite being sandwiched between the clique number and the chromatic number,
two well-known hard-to-compute quantities.
In this paper I provide a closed formula for the Lovasz function of a specific class
of sparse circulant graphs, thus generalizing Lovasz's results on cycle graphs (circulant
graphs of degree 2).
Proofs of Soundness and Strong Normalization for Linear Memory Types
Dartmouth Technical Report TR2002-437
Heng Huang
Chris Hawblitzel
Date: January 2002
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/308/TR2002-437.ps.Z (275KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/308/TR2002-437.pdf (271KB)
Abstract:
Efficient low-level systems need more control over memory than safe high-level languages
usually provide. As a result, run-time systems are typically written in unsafe languages
such as C. This report describes an abstract machine designed to give type-safe
code more control over memory. It includes complete definitions and proofs.
Heterogeneous Self-Reconfiguring Robotics
Dartmouth Technical Report TR2002-436
Robert C. Fitch
Date: January 2002
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/307/TR2002-436.ps.Z (9524KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/307/TR2002-436.pdf (971KB)
Abstract:
Self-reconfiguring robots are modular systems that can change shape, or "reconfigure,"
to match structure to task. They comprise many small, discrete, often identical
modules that connect together and that are minimally actuated. Global shape transformation
is achieved by composing local motions. Systems with a single module type, known
as "homogeneous" systems, gain fault tolerance, robustness and low production cost
from module interchangeability. However, we are interested in "heterogeneous" systems,
which include multiple types of modules such as those with sensors, batteries or
wheels. We believe that heterogeneous systems offer the same benefits as homogeneous
systems with the added ability to match not only structure to task, but also capability
to task.
Although significant results have been achieved in understanding homogeneous systems,
research in heterogeneous systems is challenging as key algorithmic issues remain
unexplored. We propose in this thesis to investigate questions in four main areas:
1) how to classify heterogeneous systems, 2) how to develop efficient heterogeneous
reconfiguration algorithms with desired characteristics, 3) how to characterize
the complexity of key algorithmic problems, and 4) how to apply these heterogeneous
algorithms to perform useful new tasks in simulation and in the physical world.
Our goal is to develop an algorithmic basis for heterogeneous systems. This has
theoretical significance in that it addresses a major open problem in the field,
and practical significance in providing self-reconfiguring robots with increased
capabilities.
Distributed Algorithms for Guiding Navigation across a Sensor Network
Dartmouth Technical Report TR2002-435
Qun Li
Michael De Rosa
Daniela Rus
Date: January 2002
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/306/TR2002-435.ps.Z (3500KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/306/TR2002-435.pdf (373KB)
Abstract:
We develop distributed algorithms for self-reconfiguring sensor networks that respond
to directing a target through a region. The sensor network models the danger levels
sensed across its area and has the ability to adapt to changes. It represents the
dangerous areas as obstacles. A protocol that combines the artificial potential
field of the sensors with the goal location for the moving object guides the object
incrementally across the network to the goal, while maintaining the safest distance
to the danger areas. We report on hardware experiments using a physical sensor network
consisting of Mote sensors.
Probabilistic Disease Classification of Expression-Dependent Proteomic Data from Mass Spectrometry of Human Serum
Dartmouth Technical Report TR2002-434
Ryan H. Lilien
Hany Farid
Bruce R. Donald
Date: January 2002
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/305/TR2002-434.ps.Z (1175KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/305/TR2002-434.pdf (4480KB)
Abstract:
We have developed an algorithm called Q5 for probabilistic classification of healthy
vs. disease whole serum samples using mass spectrometry. The algorithm employs Principal
Components Analysis (PCA) followed by Linear Discriminant Analysis (LDA) on whole
spectrum Surface-Enhanced Laser Desorption/Ionization Time of Flight (SELDI-TOF)
Mass Spectrometry (MS) data, and is demonstrated on four real datasets from complete,
complex SELDI spectra of human blood serum.
Q5 is a closed-form, exact solution to the problem of classification of complete
mass spectra of a complex protein mixture. Q5 employs a novel probabilistic classification
algorithm built upon a dimension-reduced linear discriminant analysis. Our solution
is computationally efficient; it is non-iterative and computes the optimal linear
discriminant using closed-form equations. The optimal discriminant is computed and
verified for datasets of complete, complex SELDI spectra of human blood serum. Replicate
experiments of different training/testing splits of each dataset are employed to
verify robustness of the algorithm. The probabilistic classification method achieves
excellent performance. We achieve sensitivity, specificity, and positive predictive
values above 97% on three ovarian cancer datasets and one prostate cancer dataset.
The Q5 method outperforms previous full-spectrum complex sample spectral classification
techniques, and can provide clues as to the molecular identities of differentially-expressed
proteins and peptides.
Using the Emulab network testbed to evaluate the Armada I/O framework for computational grids
Dartmouth Technical Report TR2002-433
Ron Oldfield
David Kotz
Date: January 2002
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/304/TR2002-433.ps.Z (156KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/304/TR2002-433.pdf (87KB)
Abstract:
This short report describes our experiences using the Emulab network testbed at the
University of Utah to test performance of the Armada framework for parallel I/O
on computational grids.
Analysis of a Campus-wide Wireless Network
Dartmouth Technical Report TR2002-432
David Kotz
Kobby Essien
Date: January 2002
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/303/TR2002-432.ps.Z (564KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/303/TR2002-432.pdf (169KB)
Abstract:
Understanding usage patterns in wireless local-area networks (WLANs) is critical
for those who develop, deploy, and manage WLAN technology, as well as those who
develop systems and application software for wireless networks. This paper presents
results from the largest and most comprehensive trace of network activity in a large,
production wireless LAN. For eleven weeks we traced the activity of nearly two thousand
users drawn from a general campus population, using a campus-wide network of 476
access points spread over 161 buildings. Our study expands on those done by Tang
and Baker, with a significantly larger and broader population.
We found that residential traffic dominated all other traffic, particularly in residences
populated by newer students; students are increasingly choosing a wireless laptop
as their primary computer. Although web protocols were the single largest component
of traffic volume, network backup and file sharing contributed an unexpectedly large
amount to the traffic. Although there was some roaming within a network session,
we were surprised by the number of situations in which cards roamed excessively,
unable to settle on one access point. Cross-subnet roams were an especial problem,
because they broke IP connections, indicating the need for solutions that avoid
or accommodate such roams.
Analysis of Protein Sequences Using Time Frequency and Kolmogorov-Smirnov Methods
Dartmouth Technical Report TR2002-431
Kobby Essien
Date: January 2002
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/302/TR2002-431.ps.Z (7203KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/302/TR2002-431.pdf (8338KB)
Abstract:
The plethora of genomic data currently available has resulted in a search for new
algorithms and analysis techniques to interpret genomic data. In this two-fold study
we explore techniques for locating critical amino acid residues in protein sequences
and for estimating the similarity between proteins. We demonstrate the use of the
Short-Time Fourier Transform and the Continuous Wavelet Transform together with
amino acid hydrophobicity in locating important amino acid domains in proteins and
also show that the Kolmogorov-Smirnov statistic can be used as a metric of protein
similarity.
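The Kolmogorov-Smirnov statistic mentioned in this abstract can be illustrated with a minimal sketch. The function below computes the standard two-sample statistic (the largest gap between two empirical distribution functions); the name `ks_statistic` and the sample values are hypothetical, and how the report derives its samples from protein sequences is not stated here, so feeding in per-residue hydrophobicity values would be an assumption.

```python
def ks_statistic(xs, ys):
    """Two-sample Kolmogorov-Smirnov statistic: the largest absolute gap
    between the empirical CDFs of xs and ys (0 = identical, 1 = disjoint)."""
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    xs, ys = sorted(xs), sorted(ys)
    i = j = 0
    largest_gap = 0.0
    while i < len(xs) and j < len(ys):
        value = min(xs[i], ys[j])
        # Step past every observation equal to the current value in both samples.
        while i < len(xs) and xs[i] == value:
            i += 1
        while j < len(ys) and ys[j] == value:
            j += 1
        # i / len(xs) and j / len(ys) are the empirical CDFs evaluated at `value`.
        largest_gap = max(largest_gap, abs(i / len(xs) - j / len(ys)))
    return largest_gap


# Hypothetical numeric samples; smaller values mean more similar distributions.
sample_a = [1, 2, 3, 4]
sample_b = [3, 4, 5, 6]
distance = ks_statistic(sample_a, sample_b)
```

A value near 0 suggests the two samples follow similar distributions, which is what makes the statistic usable as a similarity measure.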
Building Trusted Paths for Web Browsers
Dartmouth Technical Report TR2002-430
Eileen Zishuang Ye
Date: January 2002
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/301/TR2002-430.ps.Z (397KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/301/TR2002-430.pdf (410KB)
Abstract:
The communication between the Web browser and the human user is one component of
the server-client channel. It is not the user but the browser that receives all
server information and establishes the secure connection. The browser's user interface
signals, such as the SSL lock icon and the https protocol header, indicate whether the browser-server
communication is currently secure. The user should be able to understand these signals
clearly and correctly.
A survey of modern Web browsers shows that the information current browsers provide
is insufficient for users to make trust judgments. Our Web spoofing work further
proved that the browser status information is not reliable either.
We discuss the criteria for trusted paths between a browser and a human user, and how
to build them. We present an open-source implementation of one of the designs, the synchronized
random dynamic (SRD) boundary, based on modified Mozilla source code, together with
its usability study results.
XSLT and XQuery as Operator Languages
Dartmouth Technical Report TR2002-429
A. Abram White
Date: January 2002
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/300/TR2002-429.ps.Z (54KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/300/TR2002-429.pdf (88KB)
Abstract:
Ubiquitous computing promises to integrate computers into our physical environment,
surrounding us with applications that are able to adapt to our dynamics. Solar is
a software infrastructure designed to deliver contextual information to these applications.
Solar represents context data as events, and uses small programs called operators
to filter, merge, aggregate, or transform event streams. This paper explores the
possibility of using XSLT and XQuery to build language-neutral Solar operators.
Information-theoretic Bounds on the Training and Testing Error of Boosting
Dartmouth Technical Report TR2002-428
Sebastien M. Lahaie
Date: January 2002
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/299/TR2002-428.ps.Z (111KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/299/TR2002-428.pdf (246KB)
Abstract:
Boosting is a means of using weak learners as subroutines to produce a strong learner
with markedly better accuracy. Recent results showing the connection between logistic
regression and boosting provide the foundation for an information-theoretic analysis
of boosting. We describe the analogy between boosting and gambling, which allows
us to derive a new upper bound on training error. This upper bound explicitly describes
the effect of noisy data on training error. We also use information-theoretic techniques
to derive an alternative upper-bound on testing error which is independent of the
size of the weak-learner space.
Performance and Interoperability In Solar
Dartmouth Technical Report TR2002-427
A. Abram White
Date: January 2002
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/298/TR2002-427.ps.Z (983KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/298/TR2002-427.pdf (352KB)
Abstract:
Ubiquitous computing promises to integrate computers into our physical environment,
surrounding us with applications that are able to adapt to our dynamics. Solar is
a software infrastructure designed to deliver contextual information to these applications.
To serve the large number and wide variety of context-aware devices envisioned by
ubiquitous computing, Solar must exhibit both high performance and the ability to
interoperate with many computing platforms. We created a testing framework to measure
the performance of distributed systems such as Solar, as well as a pluggable data-transfer
mechanism to support the dissemination of information to heterogeneous applications.
This paper explores the testing framework developed, analyzes its findings concerning
the performance of the current Solar prototype, presents several optimizations to
Solar and their effects, and finally discusses the design of the pluggable data-transfer
mechanism.
Role Definition Language (RDL): A Language to Describe Context-Aware Roles
Dartmouth Technical Report TR2002-426
Christopher P. Masone
Date: January 2002
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/297/TR2002-426.ps.Z (109KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/297/TR2002-426.pdf (83KB)
Abstract:
As wireless networks become more prevalent, a widening array of computational resources
becomes available to the mobile user. Since not all users should have unrestricted
access to these resources, a method of access control must be devised. In a context-aware
environment, context information can be used to supplement more conventional password-based
access control systems. We believe the best way to achieve this is through the use
of Context-Aware Role-Based Access Control, a model in which permissions are assigned
to entities called roles, each principal is a member of one or more roles, and a
role's membership is determined using context information. We designed and implemented
RDL (Role-Definition Language), a simple, expressive and somewhat extensible programming
language to facilitate the description of roles in terms of context information.
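The model described in this abstract (permissions attached to roles, role membership decided from context) can be sketched as follows. This is not RDL syntax, which is not shown here; the `Role` class, the `is_allowed` helper, and the location-based context are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

# Hypothetical context: maps each principal to a current location.
Context = Dict[str, str]


@dataclass
class Role:
    name: str
    permissions: Set[str]
    # Membership is a predicate over context rather than a static user list.
    is_member: Callable[[str, Context], bool]


def is_allowed(principal: str, permission: str,
               context: Context, roles: List[Role]) -> bool:
    """Grant access if some role carries the permission and the principal
    currently qualifies as a member of that role."""
    return any(
        permission in role.permissions and role.is_member(principal, context)
        for role in roles
    )


# Example role: anyone currently located in the lecture hall may use the projector.
lecture_hall_occupant = Role(
    name="lecture-hall-occupant",
    permissions={"use-projector"},
    is_member=lambda who, ctx: ctx.get(who) == "lecture-hall",
)
roles = [lecture_hall_occupant]
context = {"alice": "lecture-hall", "bob": "cafeteria"}
```

Because membership is re-evaluated against the current context on every check, a principal gains or loses a role as the context changes, without any edit to the role definition.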
The Future of Cryptography Under Quantum Computers
Dartmouth Technical Report TR2002-425
Marco A. Barreno
Date: January 2002
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/296/TR2002-425.ps.Z (148KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/296/TR2002-425.pdf (233KB)
Abstract:
Cryptography is an ancient art that has passed through many paradigms, from simple
letter substitutions to polyalphabetic substitutions to rotor machines to digital
encryption to public-key cryptosystems. With the possible advent of quantum computers
and the strange behaviors they exhibit, a new paradigm shift in cryptography may
be on the horizon. Quantum computers could hold the potential to render most modern
encryption useless against a quantum-enabled adversary. The aim of this thesis is
to characterize this convergence of cryptography and quantum computation.
We provide definitions for cryptographic primitives that frame them in general terms
with respect to complexity. We explore the various possible relationships between
BQP, the primary quantum complexity class, and more familiar classes, and we analyze
the possible implications for cryptography.
Metasearch: Data Fusion for Document Retrieval
Dartmouth Technical Report TR2002-424
Mark H. Montague
Date: January 2002
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/295/TR2002-424.ps.Z (501KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/295/TR2002-424.pdf (853KB)
Abstract:
The metasearch problem is to optimally merge the ranked lists output by an arbitrary
number of search systems into one ranked list. In this work:
(1) We show that metasearch improves upon not just the raw performance of the input
search engines, but also upon the consistency of the input search engines from query
to query.
(2) We experimentally prove that simply weighting input systems by their average
performance can dramatically improve fusion results.
(3) We show that score normalization is an important component of a metasearch
engine, and that dependence upon statistical outliers appears to be the problem
with the standard technique.
(4) We propose a Bayesian model for metasearch that outperforms the best input
system on average and has performance competitive with standard techniques.
(5) We introduce the use of Social Choice Theory to the metasearch problem, modeling
metasearch as a democratic election. We adapt a positional voting algorithm, the
Borda Count, to create a metasearch algorithm, achieving reasonable performance.
(6) We propose a metasearch model adapted from a majoritarian voting procedure,
the Condorcet algorithm. The resulting algorithm is the best performing algorithm
in a number of situations.
(7) We propose three upper bounds for the problem, each bounding a different class
of algorithms.
We present experimental results for each algorithm using two types of experiments
on each of four data sets.
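The Borda Count idea in item (5) can be sketched as follows: each input system acts as a voter, and each document earns points according to its position in that system's ranked list. This is a minimal illustration of positional voting, not the thesis's algorithm; the function name `borda_fuse`, the zero-point treatment of unranked documents, and the alphabetical tie-break are assumptions.

```python
from collections import defaultdict


def borda_fuse(ranked_lists):
    """Merge ranked lists of document ids with a simple Borda count."""
    # Candidate pool: every document returned by at least one system.
    pool = {doc for ranking in ranked_lists for doc in ranking}
    n = len(pool)
    scores = defaultdict(int)
    for ranking in ranked_lists:
        for position, doc in enumerate(ranking):
            # The top-ranked document earns n points, the next n - 1, and so on.
            scores[doc] += n - position
        # Assumption: documents a system did not return earn 0 points from it.
    # Highest total first; ties broken by document id for determinism.
    return sorted(pool, key=lambda doc: (-scores[doc], doc))


# Three hypothetical search systems ranking four documents.
fused = borda_fuse([
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c", "a"],
])
```

Here `b` collects 11 points, `a` 9, `c` 5, and `d` 2, so the fused ranking is `b, a, c, d`.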
Characterizing Usage of a Campus-wide Wireless Network
Dartmouth Technical Report TR2002-423
David Kotz
Kobby Essien
Date: January 2002
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/294/TR2002-423.ps.Z (229KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/294/TR2002-423.pdf (196KB)
Abstract:
Wireless local-area networks (WLANs) are increasingly common, but little is known
about how they are used. A clear understanding of usage patterns in real WLANs is
critical information to those who develop, deploy, and manage WLAN technology, as
well as those who develop systems and application software for wireless networks.
This paper presents results from the largest and most comprehensive trace of network
activity in a large, production wireless LAN. For eleven weeks we traced the activity
of nearly two thousand users drawn from a general campus population, using a campus-wide
network of 476 access points spread over 161 buildings. Our study expands on those
done by Tang and Baker, with a significantly larger and broader population.
We found that residential traffic dominated all other traffic, particularly in
residences populated by newer students; students are increasingly choosing a wireless
laptop as their primary computer. Although web protocols were the single largest
component of traffic volume, network backup and file sharing contributed an unexpectedly
large amount to the traffic. Although there was some roaming within a network session,
we were surprised by the number of situations in which cards roamed excessively,
unable to settle on one access point. Cross-subnet roams were an especial problem,
because they broke IP connections, indicating the need for solutions that avoid
or accommodate such roams.
Controlling access to pervasive information in the "Solar" system
Dartmouth Technical Report TR2002-422
Kazuhiro Minami
David Kotz
Date: January 2002
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/293/TR2002-422.ps.Z (355KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/293/TR2002-422.pdf (139KB)
Abstract:
Pervasive-computing infrastructures necessarily collect a lot of context information
to disseminate to their context-aware applications. Due to the personal or proprietary
nature of much of this context information, however, the infrastructure must limit
access to context information to authorized persons. In this paper we propose a
new access-control mechanism for event-based context-distribution infrastructures.
The core of our approach is based on a conservative information-flow model of access
control, but users may express discretionary relaxation of the resulting access-control
list (ACL) by specifying relaxation functions. This combination of automatic ACL
derivation and user-specified ACL relaxation allows access control to be determined
and enforced in a decentralized, distributed system with no central administrator
or central policy maker. It also allows users to express their personal balance
between functionality and privacy. Finally, our infrastructure allows access-control
policies to depend on context-sensitive roles, allowing great flexibility.
We describe our approach in terms of a specific context-dissemination framework,
the Solar system, although the same principles would apply to systems with similar
properties.
Solar: A pervasive-computing infrastructure for context-aware mobile applications
Dartmouth Technical Report TR2002-421
Guanling Chen
David Kotz
Date: January 2002
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/292/TR2002-421.ps.Z (347KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/292/TR2002-421.pdf (91KB)
Abstract:
Emerging pervasive computing technologies transform the way we live and work by embedding
computation in our surrounding environment. To avoid increasing complexity, and
allow the user to concentrate on her tasks, applications must automatically adapt
to their changing context, the physical and computational environment in which
they run. To support these "context-aware" applications we propose a graph-based
abstraction for collecting, aggregating, and disseminating context information.
The abstraction models context information as events, which are produced by
sources, flow through a directed acyclic graph of event-processing operators,
and are delivered to subscribing applications. Applications describe their desired
event stream as a tree of operators that aggregate low-level context information
published by existing sources into the high-level context information needed by
the application. The operator graph is thus the dynamic combination of all
applications' subscription trees. In this paper, we motivate our graph abstraction
by discussing several applications under development, sketch the architecture of
our system ("Solar") that implements our abstraction, report some early experimental
results from the prototype, and outline issues for future research.
Context Aggregation and Dissemination in Ubiquitous Computing Systems
Dartmouth Technical Report TR2002-420
Guanling Chen
David Kotz
Date: January 2002
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/291/TR2002-420.ps.Z (100KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/291/TR2002-420.pdf (88KB)
Abstract:
Many "ubiquitous computing" applications need a constant flow of information about
their environment to be able to adapt to their changing context. To support these
"context-aware" applications we propose a graph-based abstraction for collecting,
aggregating, and disseminating context information. The abstraction models context
information as events, produced by sources and flowing through a directed acyclic
graph of event-processing operators and delivered to subscribing applications. Applications
describe their desired event stream as a tree of operators that aggregate low-level
context information published by existing sources into the high-level context information
needed by the application. The operator graph is thus the dynamic combination of
all applications' subscription trees.
In this paper, we motivate and describe our graph abstraction, and discuss a variety
of critical design issues. We also sketch our Solar system, an implementation that
represents one point in the design space for our graph abstraction.
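The graph abstraction in this abstract (sources publish events, operators filter or transform them, applications subscribe to the result) can be sketched as follows. This is an illustrative toy, not Solar's API; the `Node` class, the badge-reader source, and the event fields are hypothetical.

```python
class Node:
    """A source or operator: receives an event and emits zero or more events."""

    def __init__(self, fn=None):
        # A node with no function acts as a source and forwards events unchanged.
        self.fn = fn or (lambda event: [event])
        self.subscribers = []

    def subscribe(self, subscriber):
        """Attach a downstream node or an application callback; return it for chaining."""
        self.subscribers.append(subscriber)
        return subscriber

    def publish(self, event):
        for out in self.fn(event):
            for subscriber in self.subscribers:
                if isinstance(subscriber, Node):
                    subscriber.publish(out)   # pass to the next operator
                else:
                    subscriber(out)           # deliver to the application


# Hypothetical source of badge sightings, with a filter and a transform operator.
badge_source = Node()
in_lab = Node(lambda e: [e] if e["room"] == "lab" else [])   # filter
to_name = Node(lambda e: [e["user"]])                        # transform

received = []  # stands in for a subscribing application
badge_source.subscribe(in_lab).subscribe(to_name).subscribe(received.append)

badge_source.publish({"user": "alice", "room": "lab"})
badge_source.publish({"user": "bob", "room": "hall"})
```

Only the first event passes the filter, so the application receives `"alice"`. Several applications subscribing to the same operators would share those nodes, which is how the combined operator graph arises from individual subscription trees.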
FFTs for the 2-Sphere - Improvements and Variations
Dartmouth Technical Report TR2002-419
Dennis M. Healy
Daniel N. Rockmore
Peter J. Kostelec
Sean S. B. Moore
Date: January 2002
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/290/TR2002-419.ps.Z (1499KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/290/TR2002-419.pdf (1333KB)
Abstract:
Earlier work by Driscoll and Healy has produced an efficient algorithm for computing
the Fourier transform of band-limited functions on the 2-sphere. In this paper we
present a reformulation and variation of the original algorithm which results in
a greatly improved inverse transform, and consequent improved convolution algorithm
for such functions. All require at most $O(N \log^2 N)$ operations where $N$ is the
number of sample points. We also address implementation considerations and give
heuristics for allowing reliable and computationally efficient floating point implementations
of slightly modified algorithms.
These claims are supported by extensive numerical experiments from our implementation
in C on DEC, HP, SGI and Linux Pentium platforms. These results indicate that variations
of the algorithm are both reliable and efficient for a large range of useful problem
sizes. Performance appears to be architecture-dependent. The paper concludes with
a brief discussion of a few potential applications.
Trusted Paths for Browsers: An Open-Source Solution to Web Spoofing
Dartmouth Technical Report TR2002-418
Eileen Ye
Sean Smith
Date: January 2002
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/289/TR2002-418.ps.Z (303KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/289/TR2002-418.pdf (78KB)
Abstract:
The security of the vast majority of "secure" Web services rests on SSL server
PKI. However, this PKI doesn't work if the adversary can trick the browser into
appearing to tell the user the wrong thing about the certificates and cryptography.
The seminal web spoofing work of Felten et al. demonstrated the potential, in 1996,
for malicious servers to impersonate honest servers. Our recent follow-up work explicitly
shows how malicious servers can still do this---and can also forge the existence
of an SSL session and the contents of the alleged server certificate.
This paper reports the results of our work to systematically defend against Web
spoofing, by creating a trusted path from the browser to the user. Starting with
the Mozilla source, we have implemented techniques that protect a wide variety of
browser-user communications, that require little participation by the user and minimal
disruption of the displayed server content. We have prepared shell scripts that
install these modifications on the Mozilla source, to enable others to replicate
this work.
In on-going work, we are cleaning up and fine-tuning our code. In future work,
we hope to examine more deeply the role of user interfaces in enabling users to
make effective trust judgments.
Web Spoofing Revisited: SSL and Beyond
Dartmouth Technical Report TR2002-417
Eileen Ye
Yougu Yuan
Sean Smith
Date: January 2002
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/288/TR2002-417.ps.Z (1431KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/288/TR2002-417.pdf (281KB)
Abstract:
Can users believe what their browsers tell them? Even sophisticated Web users decide
whether or not to trust a server based on browser cues such as location bar information,
SSL icons, SSL warnings, certificate information, and response time. In their seminal
work on Web spoofing, Felten et al. showed how, in 1996, a malicious server could
forge some of these cues. However, this work used genuine SSL sessions, and Web
technology has evolved much since 1996.
The Web has since become the pre-eminent medium for electronic service delivery
to remote users, and the security of many commerce, government, and academic network
applications critically rests on the assumption that users can authenticate the
servers with which they interact. This situation raises the question: is the browser-user
communication model today secure enough to warrant this assumption?
In this paper, we answer this question by systematically showing how a malicious
server can forge every one of the above cues. Our work extends the prior results
by examining contemporary browsers, and by forging all of the SSL information a
client sees, including the very existence of an SSL session (thus providing a cautionary
tale about the security of one of the most common applications of PKI). We have
made these techniques available for public demonstration, because anything less
than working code would not convincingly answer the question. We also discuss implications
and potential countermeasures, both short-term and long-term.
Virtual Hierarchies - An Architecture for Building and Maintaining Efficient and Resilient Trust Chains.
Dartmouth Technical Report TR2002-416
John C. Marchesini
Sean W. Smith
Date: January 2002
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/287/TR2002-416.ps.Z (382KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/287/TR2002-416.pdf (95KB)
Abstract:
In Public Key Infrastructure (PKI), the simple, monopolistic CA model works fine
until we consider the real world. Then, issues such as scalability and mutually
suspicious organizations create the need for a multiplicity of CAs, which immediately
introduces the problem of how to organize them to balance resilience to compromise
against efficiency of path discovery.
However, security has given us tools such as secure coprocessing, secret splitting,
secret sharing, and threshold cryptography for securely carrying out computations
among multiple trust domains; distributed computing has given us peer-to-peer networking,
for creating self-organizing distributed systems.
In this paper, we use these latter tools to address the former problem by overlaying
a virtual hierarchy on a mesh architecture of peer CAs, and achieving both resilience
and efficiency.
Future Directions for Mobile-Agent Research
Dartmouth Technical Report TR2002-415
David Kotz
Robert S. Gray
Daniela Rus
Date: January 2002
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/286/TR2002-415.ps.Z (1868KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/286/TR2002-415.pdf (201KB)
Abstract:
During a discussion in September 2000 the authors examined the future of research
on mobile agents and mobile code. (A mobile agent is a running program that can
move from host to host in a network at times and to places of its own choosing.) In
this paper we summarize and reflect on that discussion. It became clear that the
field should shift its emphasis toward mobile code, in all its forms, rather than
continue its narrow focus on mobile agents. Furthermore, we encourage the development
of modular components, so that application designers may take advantage of code
mobility without needing to rewrite their application to fit in a monolithic mobile-agent
system. There are many potential applications that may productively use mobile code,
but there is no ``killer application'' for mobile agents. Finally, we note that
although security is an important and challenging problem, there are many applications
and environments with security requirements well within the capability of existing
mobile-code and mobile-agent frameworks.
Decentralized Control for Coordinated flow of Multi-Agent Systems
Dartmouth Technical Report TR2002-414
Valentino Crespi
George Cybenko
Massimo Santini
Daniela Rus
Date: January 2002
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/285/TR2002-414.ps.Z (86KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/285/TR2002-414.pdf (188KB)
Abstract:
This paper describes a distributed algorithm for coordinating the flow of a mass
of vehicles approaching a highway exit or a tollbooth. We provide the problem formulation,
a general methodology for distributed control and an instantiation of this methodology
to the coordinated flow problem. We analyze our algorithm and provide experimental
data.
Differential Elastic Image Registration
Dartmouth Technical Report TR2001-413
Senthil Periaswamy
Hany Farid
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/284/TR2001-413.ps.Z (2138KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/284/TR2001-413.pdf (1447KB)
Abstract:
We have applied techniques from differential motion estimation to the problem of
automatic elastic registration of medical images. This method models the mapping
between images as a locally affine but globally smooth warp. The mapping also explicitly
accounts for variations in image intensities. This approach is simple and highly
effective across a broad range of medical images. We show the efficacy of this approach
on several synthetic and clinical images.
Detecting Steganographic Messages in Digital Images
Dartmouth Technical Report TR2001-412
Hany Farid
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/283/TR2001-412.ps.Z (586KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/283/TR2001-412.pdf (603KB)
Abstract:
Techniques and applications for information hiding have become increasingly
sophisticated and widespread. With high-resolution digital images as carriers, detecting
the presence of hidden messages has also become considerably more difficult. It
is sometimes possible, nevertheless, to detect (but not necessarily decipher) the
presence of embedded messages. The basic approach taken here works by finding predictable
higher-order statistics of ``natural'' images within a multi-scale decomposition,
and then showing that embedded messages alter these statistics.
Write Once, Move Anywhere: Toward Dynamic Interoperability of Mobile Agent Systems
Dartmouth Technical Report TR2001-411
Arne Grimstrup
Robert S. Gray
David Kotz
Thomas Cowin
Greg Hill
Niranjan Suri
Daria Chacon
Martin Hofmann
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/282/TR2001-411.ps.Z (198KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/282/TR2001-411.pdf (214KB)
Abstract:
Mobile agents are an increasingly popular paradigm, and in recent years there has
been a proliferation of mobile-agent systems. These systems are, however, largely
incompatible with each other. In particular, agents cannot migrate to a host that
runs a different mobile-agent system. Prior approaches to interoperability have
tried to force agents to use a common API, and so far none have succeeded. Our goal,
summarized in the catch phrase ``Write Once, Move Anywhere,'' led to our efforts
to develop mechanisms that support dynamic runtime interoperability of mobile-agent
systems. This paper describes the Grid Mobile-Agent System, which allows agents
to migrate to different mobile-agent systems.
Securing Web Servers against Insider Attack
Dartmouth Technical Report TR2001-410
Shan Jiang
Sean Smith
Kazuhiro Minami
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/281/TR2001-410.ps.Z (109KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/281/TR2001-410.pdf (115KB)
Abstract:
Too often, ``security of Web transactions'' reduces to ``encryption of the channel''---and
neglects to address what happens at the server on the other end. This oversight
forces clients to trust the good intentions and competence of the server operator---but
gives clients no basis for that trust. Furthermore, despite academic and industrial
research in secure coprocessing, many in the computer science community still regard
``secure hardware'' as a synonym for ``cryptographic accelerator.'' This oversight
neglects the real potential of COTS secure coprocessing technology to establish
trusted islands of computation in hostile environments---such as at web servers
with risk of insider attack.
In this paper, we apply secure coprocessing and cryptography to solve this real
problem in Web technology. We present a vision: using secure coprocessors to establish
trusted co-servers at Web servers and moving sensitive computations inside these
co-servers. We present a prototype implementation of this vision that scales to
realistic workloads. Finally, we validate this approach by building a simple E-voting
application on top of our prototype.
From our experience, we conclude that this approach provides a practical and effective
way to enhance the security of Web servers against insider attack.
Web Spoofing 2001
Dartmouth Technical Report TR2001-409
Yougu Yuan
Eileen Zishuang Ye
Sean W. Smith
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/280/TR2001-409.ps.Z (776KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/280/TR2001-409.pdf (486KB)
Abstract:
The Web is currently the pre-eminent medium for electronic service delivery to remote
users. As a consequence, authentication of servers is more important than ever.
Even sophisticated users base their decision whether or not to trust a site on browser
cues---such as location bar information, SSL icons, SSL warnings, certificate information,
response time, etc.
In their seminal work on web spoofing, Felten et al. showed how a malicious server
could forge some of these cues---but using approaches that are no longer reproducible.
However, subsequent evolution of Web tools has not only patched security holes---it
has also added new technology to make pages more interactive and vivid. In this
paper, we explore the feasibility of web spoofing using this new technology---and
we show how, in many cases, every one of the above cues can be forged.
In particular, we show how a malicious server can forge all the SSL information
a client sees---thus providing a cautionary tale about the security of one of the
most common applications of PKI.
We stress that these techniques have been implemented, and are available for public
demonstration.
Market-based Control of Mobile-agent Systems
Dartmouth Technical Report TR2001-408
Jonathan L. Bredin
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/279/TR2001-408.ps.Z (698KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/279/TR2001-408.pdf (980KB)
Abstract:
Modern distributed systems scatter sensors, storage, and computation throughout the
environment. Ideally these devices communicate and share resources, but there is
seldom motivation for a device's owner to yield control to another user. We establish
markets for computational resources to motivate principals to share resources with
arbitrary users, to enforce priority in distributed systems, to provide flexible
and rational limitations on the potential of an application, and to provide a lightweight
structure to balance the workload over time and between devices. As proof of concept,
we implement a structure software agents can use to discover and negotiate access
to networked resources. The structure treats discovery, authentication, and consumption
enforcement as orthogonal issues to give system designers flexibility.
Mobile agents represent informational and computational flow. We develop mechanisms
that distributively allocate computation among mobile agents in two settings. The
first models a situation where users collectively own networked computing resources
and require priority enforcement. We extend the allocation mechanism to allow resource
reservation to mitigate utility volatility. The second, more general model relaxes
the ownership assumption. We apply our computational market to an open setting where
a principal's chief concern is revenue maximization. Our simulations compare the
performance of market-based allocation policies to traditional policies and relate
the cost of ownership and consumption separation. We observe that our markets effectively
prioritize applications' performance, can operate under uncertainty and network
delay, provide metrics to balance network load, and allow measurement of market-participation
risk versus reservation-based computation. In addition to allocation problems,
we investigate resource selection to optimize execution time. The problem is NP-complete
if the costs and latencies are constant. Both metrics' dependence on the chosen
set complicates matters. We study how a greedy approach, a novel heuristic, and
a shortest-constrained-path strategy perform in mobile-agent applications. Market-based
computational-resource allocation fertilizes applications where previously there
was a dearth of motive for or means of cooperation. The rationale behind mobile-agent
performance optimization is also useful for resource allocation in general distributed
systems where an application has a sequence of dependent tasks or when data collection
is expensive.
TCP/IP Implementation within the Dartmouth Scalable Simulation Framework
Dartmouth Technical Report TR2001-407
Michael G. Khankin
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/278/TR2001-407.ps.Z (1595KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/278/TR2001-407.pdf (477KB)
Abstract:
This paper discusses TCP/IP networking, and in particular, the DaSSF implementation
of TCP/IP. The paper reviews the protocols, outlines the implementation design,
and demonstrates some tests. In addition, some performance and memory usage analysis
is performed. We find DaSSF TCP/IP to be a viable alternative to the existing SSF. DaSSF
TCP/IP is faster and uses less memory, so we can simulate larger, more complex models.
Fastab: Solving the Pitch to Notation Problem
Dartmouth Technical Report TR2001-406
Jeremy I. Robin
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/277/TR2001-406.ps.Z (1211KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/277/TR2001-406.pdf (174KB)
Abstract:
I have always been frustrated with the length of time necessary to notate a piece
of music. Computers have simplified so many other aspects of our lives; it seems
that they should be able to simplify this task as well. In fact, there are already
two distinct ways that engineers have attempted to attack this problem. The first
analyzes the waveform generated by microphone input and relies on Fourier Analysis
and other similar methods. The other examines the analog signal generated by an electric
guitar-like pickup placed beneath the strings. The method used by Fastab relies
much less on the musical properties of an instrument. Instead, Fastab records where
and when the fingers and pick contact the instrument using digital electronics and
microprocessor technology. Fastab provides a solution to the pitch to notation problem
which is cheaper and more accurate than any other device available today.
DaSSFNet: An Extension to DaSSF for High-Performance Network Modeling
Dartmouth Technical Report TR2001-405
Mehmet Iyigun
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/276/TR2001-405.ps.Z (1177KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/276/TR2001-405.pdf (391KB)
Abstract:
Scalable Simulation Framework (SSF) is a discrete-event simulation framework providing
a unified programming interface geared towards network simulation. Dartmouth SSF
(DaSSF) is a C++ implementation of SSF, designed for simulating very large-scale
multi-protocol communication networks. As of the latest release, DaSSF lacks many
features present in SSF and this prevents it from achieving mainstream use. To alleviate
this shortcoming we designed and implemented DaSSFNet which extends DaSSF to the
levels of functionality found in SSF. In this paper, we show that DaSSFNet and SSFNet
are identical in operation given the same input. We also show that DaSSFNet is about
twice as fast and has one third the memory consumption of SSFNet when simulating
identical networks. Therefore, we argue that the DaSSF simulation package with
DaSSFNet now offers a viable alternative to SSF in high-performance network simulation.
Efficient Compression of Generic Function Dispatch Tables
Dartmouth Technical Report TR2001-404
Eric Kidd
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/275/TR2001-404.ps.Z (193KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/275/TR2001-404.pdf (223KB)
Abstract:
A generic function is similar to an overloaded operator, but provides a way to select
an appropriate behavior at run-time instead of compile-time. Dujardin and colleagues
have proposed an algorithm for building and compressing generic function dispatch
tables.
We present several modifications to their algorithm, including an improvement to
Pseudo-Closest-Poles and two new algorithms for compressing pole tables. The two
new compression algorithms are simple and fast, and one produces smaller output
than the original.
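To illustrate what "selecting a behavior at run-time" means in the abstract above, here is a minimal sketch of an uncompressed dispatch table for a generic function. It is not the algorithm of Dujardin and colleagues and does no compression; the class name GenericFunction, the left-to-right precedence rule, and the example types are assumptions made for illustration only.

```python
from itertools import product


class GenericFunction:
    """Toy generic function: picks a method from the run-time classes of all arguments."""

    def __init__(self):
        # Uncompressed dispatch table: tuple of classes -> implementation.
        self.table = {}

    def register(self, *types):
        def decorator(fn):
            self.table[types] = fn
            return fn
        return decorator

    def __call__(self, *args):
        # Walk each argument's method resolution order, most specific class first.
        # product() varies the last argument fastest, so earlier arguments take
        # precedence (an assumed, simplistic precedence rule).
        for key in product(*(type(a).__mro__ for a in args)):
            if key in self.table:
                return self.table[key](*args)
        raise TypeError("no applicable method")


class Shape:
    pass


class Circle(Shape):
    pass


collide = GenericFunction()


@collide.register(Shape, Shape)
def _collide_generic(a, b):
    return "generic"


@collide.register(Circle, Circle)
def _collide_circles(a, b):
    return "circles"
```

Calling collide(Circle(), Circle()) selects the specialized method, while collide(Circle(), Shape()) falls back to the (Shape, Shape) entry. A real implementation would precompute and compress this table rather than search it on every call.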
EcomRISK.org : A site to classify and organize the risks of performing business on the Internet
Dartmouth Technical Report TR2001-403
Aidan S. Marcuss
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/274/TR2001-403.ps.Z (366KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/274/TR2001-403.pdf (324KB)
Abstract:
As the use of the Internet and other computer networks to transact business grows,
there is an ever-increasing need for those taking part in those transactions to
understand the risks of doing so. While there are many web sites that have created
valuable databases of specific vulnerabilities for certain types of hardware and
software, there is a lack of focus on attempting to analyze the interaction of businesses,
their systems, computer networks, and their customers and the risks that are created
by either intended or unintended interactions. EcomRISK.org is a web site that presents
a clear taxonomy to classify these risks and provides other features to aid in the
general discussion of e-commerce risk. The site, and the taxonomy at the center
of it, creates a database of these incidents so they can be clearly searched. This
paper discusses the creation of EcomRISK.org, from vision to birth.
Optimizing the Dimensional Method for Performing Multidimensional, Multiprocessor, Out-of-Core FFTs
Dartmouth Technical Report TR2001-402
Jeremy T. Fineman
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/273/TR2001-402.ps.Z (234KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/273/TR2001-402.pdf (290KB)
Abstract:
We present an improved version of the Dimensional Method for computing multidimensional
Fast Fourier Transforms (FFTs) on a multiprocessor system when the data consist
of too many records to fit into memory. Data are spread across parallel disks and
processed in sections. We use the Parallel Disk Model for analysis.
The simple Dimensional Method performs the 1-dimensional FFTs for each dimension
in turn. Between dimensions, an out-of-core permutation is used to rearrange
the data to contiguous locations. The improved Dimensional Method processes multiple
dimensions at a time.
We show that determining an optimal sequence and groupings of dimensions is NP-complete.
We then analyze the effects of two modifications to the Dimensional Method independently:
processing multiple dimensions at one time, and processing single dimensions in
a different order.
Finally, we show a lower bound on the I/O complexity of the Dimensional Method
and present an algorithm that is approximately asymptotically optimal.
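As an in-memory illustration of the simple Dimensional Method described in the abstract above (not the report's out-of-core, multiprocessor implementation), the sketch below computes a 2-dimensional transform by applying 1-dimensional transforms along each dimension in turn. A naive quadratic-time DFT stands in for the FFT to keep the example short, and a transpose stands in for the out-of-core permutation; all function names are hypothetical.

```python
import cmath


def dft_1d(xs):
    """Naive 1-D discrete Fourier transform (a stand-in for a real FFT)."""
    n = len(xs)
    return [
        sum(xs[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def transpose(grid):
    """Rearrange data so the next dimension becomes contiguous."""
    return [list(col) for col in zip(*grid)]


def dft_2d_dimensional(grid):
    """Transform each dimension in turn, permuting the data between passes."""
    rows_done = [dft_1d(row) for row in grid]            # pass over dimension 2
    cols_done = [dft_1d(col) for col in transpose(rows_done)]  # pass over dimension 1
    return transpose(cols_done)                           # restore original layout


def dft_2d_direct(grid):
    """Reference 2-D DFT computed straight from the definition."""
    r, c = len(grid), len(grid[0])
    return [
        [
            sum(
                grid[a][b] * cmath.exp(-2j * cmath.pi * (a * u / r + b * v / c))
                for a in range(r)
                for b in range(c)
            )
            for v in range(c)
        ]
        for u in range(r)
    ]
```

For a small input such as [[1, 2, 3], [4, 5, 6]], the two functions agree up to floating-point rounding, which is the separability property the Dimensional Method relies on.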
Outbound Authentication for Programmable Secure Coprocessors
Dartmouth Technical Report TR2001-401
Sean W. Smith
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/272/TR2001-401.ps.Z (79KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/272/TR2001-401.pdf (116KB)
Abstract:
A programmable secure coprocessor platform can help solve many security problems
in distributed computing. These solutions usually require that coprocessor applications
be able to participate as full-fledged parties in distributed cryptographic protocols.
Thus, to fully enable these solutions, a generic platform must not only provide
programmability, maintenance, and configuration in the hostile field---it must also
provide outbound authentication for the entities that result. A particular application
on a particular untampered device must be able to prove who it is to a party on
the other side of the Internet.
To be effective, a secure outbound authentication service must closely mesh with
the overall security architecture. Our initial architecture only sketched a rough
design for this service, and did not complete it. This paper presents our research
and development experience in refining and implementing this design, to provide
PKI-based outbound authentication for the IBM 4758 Model 2 secure coprocessor platform.
An Armored Data Vault
Dartmouth Technical Report TR2001-400
Alex Iliev
Date: January 2001
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/271/TR2001-400.pdf (249KB)
Abstract:
We consider the problem of secure long-term archiving of network traffic, an instance
of the problem of storing data securely. We approach the problem using secure hardware,
which enables the enforcement of flexible access policy. The policy cannot be circumvented
by anyone, even insiders, and so we are assured that access to the data is as originally
intended. The policy can be expressed as any feasible computation, as it will be
checked inside the secure hardware without possibility of interference. We discuss
our design of a device to perform such network data archiving and have implemented
a prototype device. We discuss other possible application areas of the design.
WebALPS Implementation and Performance Analysis: Using Trusted Co-servers to Enhance Privacy and Security of Web Interactions
Dartmouth Technical Report TR2001-399
Shan Jiang
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/270/TR2001-399.ps.Z (266KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/270/TR2001-399.pdf (354KB)
Abstract:
The client-server model of the Web poses a fundamental trust issue: clients are forced
to trust in secrecy and correctness of computation occurring at a remote server
of unknown credibility. The current solution for this problem is to use a PKI (Public
Key Infrastructure) system and SSL (Secure Sockets Layer) digital certificates to
prove the claimed identity of a server and establish an authenticated, encrypted
channel between the client and this server. However, this approach does not address
the security risks posed by potential malicious server operators or any third parties
who may penetrate the server sites.
The WebALPS (Web Applications with Lots of Privacy and Security) approach is proposed
to address these weaknesses by moving sensitive computations at the server side into
trusted co-servers running inside high-assurance secure coprocessors.
In this report, we examine the foundations of the credibility of WebALPS co-servers.
Then we will describe our work of designing and building a prototype WebALPS co-server,
which is integrated into the widely-deployed, commercial-grade Apache server. We
will also present the performance test results of our system, which support the argument
that the WebALPS approach provides a systematic and practical way to address the remote
trust issue.
A System for Audio Personalization with Applications on Wireless Devices
Dartmouth Technical Report TR2001-398
David Marmaros
Date: January 2001
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/269/TR2001-398.pdf (759KB)
Abstract:
We present and analyze a system for dynamically tailoring discrete audio content
for numerous users based on aggregate data and intuitive feedback mechanisms. The
framework for this system utilizes a flexible client-server architecture to facilitate
audio dissemination, with particular attention to distribution over wireless networks.
We discuss the requirements and specifications of such a system. We further analyze
the algorithms and protocols required for its operation. Finally, we outline and
provide data from a demonstration of this application.
Supporting Adaptive Ubiquitous Applications with the SOLAR System
Dartmouth Technical Report TR2001-397
Guanling Chen
David Kotz
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/268/TR2001-397.ps.Z (206KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/268/TR2001-397.pdf (234KB)
Abstract:
As we embed more computers into our daily environment, ubiquitous computing promises
to make them less noticeable and help to prevent information overload. We see, however,
few ubiquitous applications that are able to adapt to the dynamics of user, physical,
and computational context. We believe that there are two challenges causing this
lack of ubiquitous applications: there is no flexible and scalable way to support
information collection and dissemination in a ubiquitous and mobile environment,
and there is no general approach to building adaptive applications given heterogeneous
contextual information. We propose a system infrastructure, Solar, to meet these
challenges. Solar uses a subscription-based operator graph abstraction and allows
dynamic composition of stackable operators to manage ubiquitous information sources.
After developing a set of diverse adaptive applications, we expect to identify fundamental
techniques for context-aware adaptation. Our expectation is that Solar's end-to-end
support for information collection, dissemination, and utilization will make it
easy to build adaptive applications for a ubiquitous mobile environment with many
users and devices.
Implementing a Database Information System for an Electronic Baseball Scorecard
Dartmouth Technical Report TR2001-396
Tiffany M. Wong
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/267/TR2001-396.ps.Z (135KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/267/TR2001-396.pdf (99KB)
Abstract:
We present our design and implementation of a database system of information storage
and retrieval for an electronic baseball scorecard. The program uses the relational
MySQL database to hold information and a Tcl API to handle interactions between
the database and the user interface code. This paper discusses the inner workings
of how information storage was broken down inside the database, how queries were
internally constructed in accordance with the user's input, and how statistics for
players and teams were calculated and returned to the user. Finally, we discuss
some limitations attached to our current implementation of the program and propose
improvements that can be made in future versions.
An Implementation of Object-Oriented Program Transformation for Thought-Guided Debugging
Dartmouth Technical Report TR2001-395
Tiffany M. Wong
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/266/TR2001-395.ps.Z (33KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/266/TR2001-395.pdf (55KB)
Abstract:
This paper presents our design and implementation of program transformation for C++
that will be used in the context of a thought-guided debugging system. The program
uses a lexical analyzer written in Flex and a grammar written in Bison that work
in conjunction to scan the inputted C++ code for function definitions and class
definitions. The code is then transformed to produce trace information for each
defined function, while the original functionality of the code is left untouched.
We also implement two additional data structures that are used for information storage
during the course of the program.
An Empirical Study of Training and Testing Error in Boosting
Dartmouth Technical Report TR2001-394
David D. Latham
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/265/TR2001-394.ps.Z (457KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/265/TR2001-394.pdf (435KB)
Abstract:
Bounds have been proven for both training and testing error for the boosting algorithm
AdaBoost, but in practice neither seems to produce a particularly tight bound. In
this paper we share some observations of these bounds from empirical results, and
then explore some properties of the algorithm with an eye towards finding an improved
bound for the performance of AdaBoost. Based on our empirical evidence, the error
of a hypothesis which labels examples probabilistically based upon the confidence
of the vote of the weak hypotheses forms a tighter bound for the training error.
Measuring early usage of Dartmouth's wireless network
Dartmouth Technical Report TR2001-393
Pablo Stern
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/264/TR2001-393.ps.Z (1410KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/264/TR2001-393.pdf (330KB)
Abstract:
In Spring 2001, Dartmouth College installed a campus-wide 802.11b wireless network.
To understand how that network is used, we examined the usage characteristics of
the network over a five-week period. We monitored access points to determine user
behavior, and user and network traffic characteristics. Because our study coincided
with the deployment of the access points, our analysis captures the growth of a
wireless network. The results of this study help in understanding the behavior of mobile
users and provide a reference to network engineers wishing to deploy and expand
similar wireless networks.
SmartReminder: A Case Study on Context-Sensitive Applications
Dartmouth Technical Report TR2001-392
Arun Mathias
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/263/TR2001-392.ps.Z (457KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/263/TR2001-392.pdf (399KB)
Abstract:
Designing context-sensitive applications is challenging. We design and implement
SmartReminder to explore designing context-sensitive applications and to demonstrate
how the SOLAR system can be used in developing such applications. SmartReminder
is an application that reminds the user based on contextual information. Current
appointment-reminder applications remind the user about their appointments at an
arbitrarily specified time. For instance, they might remind the user ten minutes
before each appointment. SmartReminder, on the other hand, uses contextual information,
like location, to better estimate the appropriate reminder time for each appointment.
It reminds the user based on where they are, where they need to be, and how long
it will take them to get there. This paper presents SmartReminder as an illustration
of how context-sensitive applications can be designed using the SOLAR system for
dissemination of contextual information.
Notes:
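The reminder rule described in the SmartReminder abstract (remind based on where the user is, where they need to be, and how long the trip takes) can be sketched as follows. This is only an illustration, not the report's implementation: the function names, the straight-line travel estimate, the walking speed, and the safety margin are all assumptions, and the SOLAR system itself is not modeled.

```python
import math
from datetime import datetime, timedelta

# Hypothetical constants; the report does not specify these values.
WALKING_SPEED_M_PER_S = 1.5
SAFETY_MARGIN = timedelta(minutes=2)


def travel_time(current_xy, destination_xy, speed=WALKING_SPEED_M_PER_S):
    """Estimate travel time from straight-line distance (an assumption)."""
    dx = destination_xy[0] - current_xy[0]
    dy = destination_xy[1] - current_xy[1]
    return timedelta(seconds=math.hypot(dx, dy) / speed)


def reminder_time(appointment_start, current_xy, destination_xy):
    """Remind at: appointment start - estimated travel time - safety margin."""
    return appointment_start - travel_time(current_xy, destination_xy) - SAFETY_MARGIN


start = datetime(2001, 1, 15, 14, 0)
near = reminder_time(start, (0.0, 0.0), (90.0, 0.0))    # 90 m away: 1 minute walk
far = reminder_time(start, (0.0, 0.0), (900.0, 0.0))    # 900 m away: 10 minute walk
print(near)  # 2001-01-15 13:57:00
print(far)   # 2001-01-15 13:48:00
```

A fixed "ten minutes before" rule would fire at 13:50 in both cases; the location-based rule fires later when the user is close and earlier when the user is far away.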
A Directory Infrastructure to Support Mobile Services
Dartmouth Technical Report TR2001-391
Ammar Khalid
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/262/TR2001-391.ps.Z (1667KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/262/TR2001-391.pdf (364KB)
Abstract:
Traditional Voice-over-IP applications such as Microsoft NetMeeting assume that the
user is on a machine with a fixed IP address. If, however, the user connects to
the Internet, via a wireless network, on a handheld device, his IP address frequently
changes as he moves from one subnet to another. In such a situation, we need a service
that can be queried for the most current IP address of a person whom we wish to
contact. In this project, we design and implement such a directory service. The
service authenticates all callers and callees, is robust against most host failures,
and scales to several thousand registered users.
Notes:
Mobile Voice Over IP (MVOIP): An Application-level Protocol
Dartmouth Technical Report TR2001-390
G. Ayorkor Mills-Tettey
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/261/TR2001-390.ps.Z (1542KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/261/TR2001-390.pdf (427KB)
Abstract:
Current Voice over Internet Protocol (VOIP) protocols require participating hosts
to have fixed IP addresses for the duration of a VOIP call. When using a wireless-enabled
host, such as a tablet computer on an 802.11 wireless network, it is possible for
a participant in a VOIP call to roam around the network, moving from one subnet
to another and needing to change IP addresses. This address change creates the need
for mobility support in VOIP applications. We present the design of Mobile Voice
over IP (MVOIP), an application-level protocol that enables such mobility in a VOIP
application based on the ITU H.323 protocol stack. An MVOIP application uses hints
from the surrounding network to determine that it has switched subnets. It then
initiates a hand-off procedure that comprises pausing its current calls, obtaining
a valid IP address for the current subnet, and reconnecting to the remote party
with whom it was in a call. Testing the system shows that on a Windows 2000 platform
there is a perceivable delay in the hand-off process, most of which is spent in
the Windows API for obtaining DHCP addresses. Despite this bottleneck, MVOIP works
well on a wireless network.
Notes:
Improving a Brokering System for Linking Distributed Simulations
Dartmouth Technical Report TR2001-389
Thomas B. Stephens
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/260/TR2001-389.ps.Z (105KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/260/TR2001-389.pdf (237KB)
Abstract:
The Agent Based Environment for Linking Simulations (ABELS) is a software framework
designed to provide disparate simulations with dynamically updated data sources.
It allows simulations and other agents to join a "cloud" of interacting producers
and consumers of data. Once they have joined the cloud, they can publish services
to other members and use methods published by others. This paper presents the initial
design of a set of matchmaking components for the ABELS framework. These components
dictate how services describe their abilities and requirements to ABELS. Furthermore,
they help ABELS successfully match data producing services to the requests of data
consuming clients. We begin by describing a system for a data producing service
to describe itself to the ABELS cloud, as well as a corresponding system for a data
consumer to describe its needs. We then describe in detail the three components
that make up the ABELS matchmaking system: the match ranker, which ranks a data
producer's ability to fill the request of a data consumer; the thesaurus, which
helps the match ranker recognize closely related terms; and the unit database, which
allows participants in the ABELS system to translate between related data units.
We also discuss how these basic components can be built upon and improved in future
versions of the ABELS framework.
Notes:
Applying the Vector Radix Method to Multidimensional, Multiprocessor, Out-of-Core Fast Fourier Transforms
Dartmouth Technical Report TR2001-388
Michael F. Ringenburg
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/259/TR2001-388.ps.Z (263KB)
Abstract:
We describe an efficient algorithm for calculating Fast Fourier Transforms on matrices
of arbitrarily high dimension using the vector-radix method when the problem size
is out-of-core (i.e., when the size of the data set is larger than the total available
memory of the system). The algorithm takes advantage of multiple processors when
they are present, but it is also efficient on single-processor systems. Our work
is an extension of work done by Lauren Baptist in [Bapt99], which applied the vector-radix
method to 2-dimensional out-of-core matrices.
To determine the effectiveness of the algorithm, we present empirical results as
well as an analysis of the I/O, communication, and computational complexity. We
perform the empirical tests on a DEC 2100 server and on a cluster of Pentium-based
Linux workstations. We compare our results with the traditional dimensional method
of calculating multidimensional FFTs, and show that as the number of dimensions
increases, the vector-radix-based algorithm becomes increasingly effective relative
to the dimensional method.
In order to calculate the complexity of the algorithm, it was necessary to develop
a method for analyzing the interprocessor communication costs of the BMMC data-permutation
algorithm (presented in [CSW98]) used by our FFT algorithms. We present this analysis
method and show how it was derived.
Notes:
A simple bound on the expected height of a randomly built binary search tree
Dartmouth Technical Report TR2001-387
Javed A. Aslam
Date: January 2001
Abstract:
Notes:
Mobile-Agent versus Client/Server Performance: Scalability in an Information-Retrieval Task
Dartmouth Technical Report TR2001-386
Robert S. Gray
David Kotz
Ronald A. Peterson
Peter Gerken
Martin Hofmann
Daria Chacon
Greg Hill
Niranjan Suri
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/257/TR2001-386.ps.Z (1072KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/257/TR2001-386.pdf (1153KB)
Abstract:
Mobile agents are programs that can jump from host to host in the network, at times
and to places of their own choosing. Many groups have developed mobile-agent software
platforms, and several mobile-agent applications. Experiments show that mobile agents
can, among other things, lead to faster applications, reduced bandwidth demands,
or less dependence on a reliable network connection. There are few if any studies
of the scalability of mobile-agent servers, particularly as the number of clients
grows. We present some recent performance and scalability experiments that compare
three mobile-agent platforms with each other and with a traditional client/server
approach. The experiments show that mobile agents often outperform client/server
solutions, but also demonstrate the deep interaction between environmental and application
parameters. The three mobile-agent platforms have similar behavior but their absolute
performance varies with underlying implementation choices.
Notes:
Lock-free Scheduling of Logical Processes in Parallel Simulation
Dartmouth Technical Report TR2001-385
Xiaowen Liu
David M. Nicol
King Tan
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/256/TR2001-385.ps.Z (135KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/256/TR2001-385.pdf (264KB)
Abstract:
With fixed lookahead information in a simulation model, the overhead of asynchronous
conservative parallel simulation lies in the mechanism used for propagating time
updates in order for logical processes to safely advance their local simulation
clocks. Studies have shown that a good scheduling algorithm should preferentially
schedule processes containing events on the critical path. This paper introduces
a lock-free algorithm for scheduling logical processes in conservative parallel
discrete-event simulation on shared-memory multiprocessor machines. The algorithm
uses fetch&add operations that help avoid inefficiencies associated with using locks.
The lock-free algorithm is robust. Experiments show that, compared with the scheduling
algorithm using locks, the lock-free algorithm exhibits better performance when
the number of logical processes assigned to each processor is small or when the
workload becomes significant. In models with a large number of logical processes,
our algorithm shows only a modest increase in execution time due to the overhead in
the algorithm for extra bookkeeping.
Notes:
Ambiguity-Directed Sampling for Qualitative Analysis of Sparse Data from Spatially-Distributed Physical Systems
Dartmouth Technical Report TR2001-384
Chris Bailey-Kellogg
Naren Ramakrishnan
Date: January 2001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/255/TR2001-384.ps.Z (346KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/255/TR2001-384.pdf (240KB)
Abstract:
A number of important scientific and engineering applications, such as fluid dynamics
simulation and aircraft design, require analysis of spatially-distributed data from
expensive experiments and complex simulations. In such data-scarce applications,
it is advantageous to use models of given sparse data to identify promising regions
for additional data collection. This paper presents a principled mechanism for applying
domain-specific knowledge to design focused sampling strategies. In particular,
our approach uses ambiguities identified in a multi-level qualitative analysis of
sparse data to guide iterative data collection. Two case studies demonstrate that
this approach leads to highly effective sampling decisions that are also explainable
in terms of problem structures and domain knowledge.
Reconstructing Ancient Egyptian Tombs
Dartmouth Technical Report TR2000-383
Hany Farid
Date: January 2000
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/254/TR2000-383.ps.Z (3809KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/254/TR2000-383.pdf (2803KB)
Abstract:
From the pyramids of Giza to the tombs of Thebes (modern Luxor), ancient Egypt's
glorious history has produced remarkable architecture. Sadly, the nearly four million
yearly tourists have taken a heavy toll on many of these ancient structures. Of
particular concern are many of the tombs located opposite to Luxor on the western
bank of the Nile. Digital reconstruction of these tombs has the potential to help
document and preserve these important historical structures. Photographing and reconstructing
these tombs poses new and unique problems that this paper begins to address.
Techniques for removing image distortions, recovering 3-D shape, and correcting
for lighting imbalances are discussed. A complete reconstruction of the tomb of
Sennedjem is shown.
Bayes Optimal Metasearch: A Probabilistic Model for Combining the Results of Multiple Retrieval Systems
Dartmouth Technical Report TR2000-382
Javed A. Aslam
Mark Montague
Date: January 2000
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/253/TR2000-382.ps.Z (106KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/253/TR2000-382.pdf (204KB)
Abstract:
We introduce a new, probabilistic model for combining the outputs of an arbitrary
number of query retrieval systems. By gathering simple statistics on the average
performance of a given set of query retrieval systems, we construct a Bayes optimal
mechanism for combining the outputs of these systems. Our construction yields a
metasearch strategy whose empirical performance nearly always exceeds the performance
of any of the constituent systems. Our construction is also robust in the sense
that if "good" and "bad" systems are combined, the performance of the composite
is still on par with, or exceeds, that of the best constituent system. Finally,
our model and theory provide theoretical and empirical avenues for the improvement
of this metasearch strategy.
Notes:
A Survey of Context-Aware Mobile Computing Research
Dartmouth Technical Report TR2000-381
Guanling Chen
David Kotz
Date: January 2000
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/252/TR2000-381.ps.Z (879KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/252/TR2000-381.pdf (125KB)
Abstract:
Context-aware computing is a mobile computing paradigm in which applications can
discover and take advantage of contextual information (such as user location, time
of day, nearby people and devices, and user activity). Since it was proposed about
a decade ago, many researchers have studied this topic and built several context-aware
applications to demonstrate the usefulness of this new technology. Context-aware
applications (or the system infrastructure to support them), however, have never
been widely available to everyday users. In this survey of research on context-aware
systems and applications, we looked in depth at the types of context used and models
of context information, at systems that support collecting and disseminating context,
and at applications that adapt to the changing context. Through this survey, it
is clear that context-aware computing is an old but rich area for research. The difficulties
and possible solutions we outline serve as guidance for researchers hoping to make
context-aware computing a reality.
Naming and sharing resources across administrative boundaries (errata)
Dartmouth Technical Report TR2000-380
Jon Howell
Date: January 2000
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/251/TR2000-380.ps.Z (33KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/251/TR2000-380.pdf (285KB)
Abstract:
I tackle the problem of naming and sharing resources across administrative boundaries.
Conventional systems manifest the hierarchy of typical administrative structure
in the structure of their own mechanism. While natural for communication that follows
hierarchical patterns, such systems interfere with naming and sharing that cross
administrative boundaries, and therefore cause headaches for both users and administrators.
I propose to organize resource naming and security, not around administrative domains,
but around the sharing patterns of users.
The dissertation is organized into four main parts. First, I discuss the challenges
and tradeoffs involved in naming resources and consider a variety of existing approaches
to naming.
Second, I consider the architectural requirements for user-centric sharing. I evaluate
existing systems with respect to these requirements.
Third, to support the sharing architecture, I develop a formal logic of sharing
that captures the notion of restricted delegation. Restricted delegation ensures
that users can use the same mechanisms to share resources consistently, regardless
of the origin of the resource, or with whom the user wishes to share the resource
next. A formal semantics gives unambiguous meaning to the logic. I apply the formalism
to the Simple Public Key Infrastructure and discuss how the formalism either supports
or discourages potential extensions to such a system.
Finally, I use the formalism to drive a user-centric sharing implementation for
distributed systems. I show how this implementation enables end-to-end authorization,
a feature that makes heterogeneous distributed systems more secure and easier to
audit. Conventionally, gateway services that bridge administrative domains, add
abstraction, or translate protocols typically impede the flow of authorization information
from client to server. In contrast, end-to-end authorization enables us to build
gateway services that preserve authorization information, hence we reduce the size
of the trusted computing base and enable more effective auditing. I demonstrate
my implementation and show how it enables end-to-end authorization across various
boundaries. I measure my implementation and argue that its performance tracks that
of similar authorization mechanisms without end-to-end structure.
I conclude that my user-centric philosophy of naming and sharing benefits both users
and administrators.
Notes:
Naming and sharing resources across administrative boundaries (Volume II)
Dartmouth Technical Report TR2000-379
Jon Howell
Date: January 2000
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/250/TR2000-379.ps.Z (485KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/250/TR2000-379.pdf (1181KB)
Abstract:
I tackle the problem of naming and sharing resources across administrative boundaries.
Conventional systems manifest the hierarchy of typical administrative structure
in the structure of their own mechanism. While natural for communication that follows
hierarchical patterns, such systems interfere with naming and sharing that cross
administrative boundaries, and therefore cause headaches for both users and administrators.
I propose to organize resource naming and security, not around administrative domains,
but around the sharing patterns of users.
The dissertation is organized into four main parts. First, I discuss the challenges
and tradeoffs involved in naming resources and consider a variety of existing approaches
to naming.
Second, I consider the architectural requirements for user-centric sharing. I evaluate
existing systems with respect to these requirements.
Third, to support the sharing architecture, I develop a formal logic of sharing
that captures the notion of restricted delegation. Restricted delegation ensures
that users can use the same mechanisms to share resources consistently, regardless
of the origin of the resource, or with whom the user wishes to share the resource
next. A formal semantics gives unambiguous meaning to the logic. I apply the formalism
to the Simple Public Key Infrastructure and discuss how the formalism either supports
or discourages potential extensions to such a system.
Finally, I use the formalism to drive a user-centric sharing implementation for
distributed systems. I show how this implementation enables end-to-end authorization,
a feature that makes heterogeneous distributed systems more secure and easier to
audit. Conventionally, gateway services that bridge administrative domains, add
abstraction, or translate protocols typically impede the flow of authorization information
from client to server. In contrast, end-to-end authorization enables us to build
gateway services that preserve authorization information, hence we reduce the size
of the trusted computing base and enable more effective auditing. I demonstrate
my implementation and show how it enables end-to-end authorization across various
boundaries. I measure my implementation and argue that its performance tracks that
of similar authorization mechanisms without end-to-end structure.
I conclude that my user-centric philosophy of naming and sharing benefits both users
and administrators.
Notes:
Naming and sharing resources across administrative boundaries (Volume I)
Dartmouth Technical Report TR2000-378
Jon Howell
Date: January 2000
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/249/TR2000-378.ps.Z (922KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/249/TR2000-378.pdf (1316KB)
Abstract:
I tackle the problem of naming and sharing resources across administrative boundaries.
Conventional systems manifest the hierarchy of typical administrative structure
in the structure of their own mechanism. While natural for communication that follows
hierarchical patterns, such systems interfere with naming and sharing that cross
administrative boundaries, and therefore cause headaches for both users and administrators.
I propose to organize resource naming and security, not around administrative domains,
but around the sharing patterns of users.
The dissertation is organized into four main parts. First, I discuss the challenges
and tradeoffs involved in naming resources and consider a variety of existing approaches
to naming.
Second, I consider the architectural requirements for user-centric sharing. I evaluate
existing systems with respect to these requirements.
Third, to support the sharing architecture, I develop a formal logic of sharing
that captures the notion of restricted delegation. Restricted delegation ensures
that users can use the same mechanisms to share resources consistently, regardless
of the origin of the resource, or with whom the user wishes to share the resource
next. A formal semantics gives unambiguous meaning to the logic. I apply the formalism
to the Simple Public Key Infrastructure and discuss how the formalism either supports
or discourages potential extensions to such a system.
Finally, I use the formalism to drive a user-centric sharing implementation for
distributed systems. I show how this implementation enables end-to-end authorization,
a feature that makes heterogeneous distributed systems more secure and easier to
audit. Conventionally, gateway services that bridge administrative domains, add
abstraction, or translate protocols typically impede the flow of authorization information
from client to server. In contrast, end-to-end authorization enables us to build
gateway services that preserve authorization information, hence we reduce the size
of the trusted computing base and enable more effective auditing. I demonstrate
my implementation and show how it enables end-to-end authorization across various
boundaries. I measure my implementation and argue that its performance tracks that
of similar authorization mechanisms without end-to-end structure.
I conclude that my user-centric philosophy of naming and sharing benefits both users
and administrators.
Notes:
Performance Analysis of Mobile Agents for Filtering Data Streams on Wireless Networks
Dartmouth Technical Report TR2000-377
David Kotz
George Cybenko
Robert S. Gray
Guofei Jiang
Ronald A. Peterson
Martin O. Hofmann
Daria A. Chacon
Kenneth R. Whitebread
James Hendler
Date: January 2000
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/248/TR2000-377.ps.Z (162KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/248/TR2000-377.pdf (195KB)
Abstract:
Wireless networks are an ideal environment for mobile agents, since their mobility
allows them to move across an unreliable link to reside on a wired host, next to
or closer to the resources that they need to use. Furthermore, client-specific data
transformations can be moved across the wireless link and run on a wired gateway
server, reducing bandwidth demands. In this paper we examine the tradeoffs faced
when deciding whether to use mobile agents in a data-filtering application where
numerous wireless clients filter information from a large data stream arriving across
the wired network. We develop an analytical model and use parameters from filtering
experiments conducted during a U.S. Navy Fleet Battle Experiment (FBE) to explore
the model's implications.
Notes:
The complexity of planning with partially-observable Markov decision processes
Dartmouth Technical Report TR2000-376
Martin Mundhenk
Date: January 2000
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/247/TR2000-376.ps.Z (464KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/247/TR2000-376.pdf (652KB)
Abstract:
This work surveys results on the complexity of planning under uncertainty. The planning
model considered is the partially-observable Markov decision process. The general
planning problems are, given such a process, (a) to calculate its performance under
a given control policy, (b) to find an optimal or approximate optimal control policy,
and (c) to decide whether a good policy exists. The complexity of these and related
problems depends on a variety of factors, including the observability of the process
state, the compactness of the process representation, the type of policy, or even the number
of actions relative to the number of states. In most cases, the problem can be shown
to be complete for some known complexity class.
The skeleton of this survey consists of results from Littman, Goldsmith and Mundhenk (Journal
of Artificial Intelligence Research 1998), Mundhenk (Mathematics of Operations Research
2000), Mundhenk, Goldsmith, Lusena and Allender (Journal of the ACM 2000), and Lusena,
Goldsmith and Mundhenk (University of KY CS TR). There are also some new results.
An Economic CPU-Time Market for D'Agents
Dartmouth Technical Report TR2000-375
Ezra E. K. Cooper
Robert S. Gray
Date: January 2000
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/246/TR2000-375.ps.Z (96KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/246/TR2000-375.pdf (223KB)
Abstract:
A usable and efficient resource-management system has been created for use with D'Agents.
The software dynamically negotiates a price rate for CPU time, using the competitive
bids of mobile agents that offer currency in return for fast computation. The system
allows mobile agents to plan their expenditures across many hosts while minimizing
the time needed for their tasks. The ability to price CPU time opens the door for
service owners to be compensated for the computation consumed by agents and provides
an incentive for servers to allow anonymous agents. We discuss the theoretical background
which makes a CPU market system possible and the performance of the D'Agents market
system.
Notes:
Depth from Flash
Dartmouth Technical Report TR2000-373
David B. Martin
Date: January 2000
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/245/TR2000-373.ps.Z (4921KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/245/TR2000-373.pdf (3694KB)
Abstract:
Digital camera technology has recently seen substantial improvements in image quality
while lower prices have made it affordable to the average consumer. Camera manufacturers,
however, are not taking full advantage of this new medium for image capture. By
filtering the already digitized image produced by these cameras through on-board
image processing algorithms we can dramatically increase the power of digital cameras.
For example, according to experts in the photographic industry, most people simply
take bad pictures. Classic examples of this phenomenon are photographs taken indoors
with a point-and-shoot style camera using its built-in flash. The subjects of these
photographs often seem to have a spotlight on them, making them look bright and
washed out while the rest of the photograph is dark and indistinct. This can primarily
be accounted for by a well known property of point light sources: falloff in brightness
is inversely proportional to the square of the distance between the light and the
object being illuminated. A technique first introduced in the field of computer
vision has been shown to successfully recover information about the distance between
the light source and objects in the world. We propose using this technique, which
is readily implementable in hardware, to correct for a variety of poorly illuminated
digital images.
Notes:
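The inverse-square falloff mentioned in the Depth from Flash abstract, and the kind of correction it makes possible once distances are known, can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, per-pixel distances are simply given rather than recovered by the computer-vision technique the report relies on, and brightness is treated as a single linear value per pixel.

```python
def flash_brightness(source_intensity, distance_m):
    """Brightness from a point light source: falls off as 1 / distance^2."""
    return source_intensity / distance_m ** 2


def compensation_gain(distance_m, reference_distance_m):
    """Gain that undoes the falloff relative to a chosen reference distance."""
    return (distance_m / reference_distance_m) ** 2


def equalize(pixel_values, distances_m, reference_distance_m):
    """Rescale each pixel as if it had been lit from the reference distance."""
    return [
        value * compensation_gain(distance, reference_distance_m)
        for value, distance in zip(pixel_values, distances_m)
    ]


# Hypothetical scene: a subject 1 m from the flash and a background 4 m away.
distances = [1.0, 4.0]
observed = [flash_brightness(160.0, d) for d in distances]
print(observed)                             # [160.0, 10.0]
print(equalize(observed, distances, 1.0))   # [160.0, 160.0]
```

The background receives one sixteenth of the subject's light, which is the "spotlight" effect the abstract describes; a distance-dependent gain evens the two out.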
Personal Radio
Dartmouth Technical Report TR2000-372
John C. Artz
Date: January 2000
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/244/TR2000-372.ps.Z (2752KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/244/TR2000-372.pdf (512KB)
Abstract:
With the development of new technologies that allow the broadcast of digital data
over radio signals, there are many possibilities for improving upon the traditional
radio station model for content delivery. The idea of Personal Radio is a system
that tailors content to meet the needs of each individual. Using Global Positioning
System (GPS) technology to play location specific content, the listening history
to play content an appropriate number of times, and user feedback to learn personal
preferences, the Personal Radio provides the listener with the content that is the
most useful/interesting to them. This paper will examine the general design of such
a system and present solutions developed in the implementation of several pieces
of the design.
Notes:
An Infrastructure for a Mobile-Agent System that Provides Personalized Services to Mobile Devices
Dartmouth Technical Report TR2000-370
Debbie O. Chyi
Date: January 2000
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/243/TR2000-370.ps.Z (882KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/243/TR2000-370.pdf (367KB)
Abstract:
In this paper, we present the design of a mobile-agent system that provides a mobile
user with a personalized information retrieval service and we describe the implementation
of the infrastructure for such a system. This "Personal Agent System" gathers information
from the Internet and uses context-aware mechanisms to manage the information according
to a mobile user's needs and preferences. The user's schedule and location are the
context indicators in this system. These indicators are critical in ensuring that
users obtain only the information they want, receive information in a form that
is most useful for viewing on their mobile device, and are notified of new information
in a minimally intrusive manner. The system incorporates a rule-based learning system
to enhance the personalization achieved by the system.
Notes:
Registration of Images with Dissimilar Contrast using a Hybrid Method Employing Correlation and Mutual Information
Dartmouth Technical Report TR2000-369
Karolyn A. Abram
Date: January 2000
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/242/TR2000-369.ps.Z (1731KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/242/TR2000-369.pdf (1433KB)
Abstract:
The problem of fitting one image into another is commonly called "registration."
Finding the best possible translation and rotation necessary to align two images
is one approach to solving this problem. Registration is a crucial component of
many remote sensing and medical image interpretation applications. Image alignment
techniques aid in volumetric estimations of complicated structures and allow radiologists
to accurately identify changes between sequential images. Radiologists require image
alignment capabilities to correct for patient motion and/or content displacement
between images.
Numerous image registration techniques exist for correcting the alignment problems
mentioned above. Unfortunately, most of these techniques, such as Correlation, fail
to find a good alignment when dealing with images that differ in contrast. The Mutual
Information method is able to align images independently of contrast, but it is
computationally intensive. We explore a hybrid technique that utilizes both Correlation
and Mutual Information. The Hybrid technique hopes to gain greater contrast independence
than Correlation alone while achieving a lower running time than a pure Mutual Information
technique.
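As an illustration of why the two measures in the abstract above behave differently under a contrast change, the following minimal sketch compares Pearson correlation with histogram-based mutual information on a toy "image" and its contrast-inverted copy. It is not the report's hybrid method; the function names, the bin count, and the flat-list image representation are all assumptions made for this example.

```python
import math
from collections import Counter


def correlation(a, b):
    """Pearson correlation between two equal-length intensity lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant image carries no correlation signal
    return cov / math.sqrt(var_a * var_b)


def mutual_information(a, b, bins=8, max_value=256):
    """Mutual information in bits, estimated from a joint intensity histogram.

    `bins` and `max_value` are illustrative choices, not values from the report.
    """
    if len(a) != len(b) or not a:
        raise ValueError("images must be non-empty and the same size")
    width = max_value / bins
    qa = [min(int(v / width), bins - 1) for v in a]  # quantize intensities
    qb = [min(int(v / width), bins - 1) for v in b]
    n = len(qa)
    joint = Counter(zip(qa, qb))
    hist_a = Counter(qa)
    hist_b = Counter(qb)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) )
        mi += (count / n) * math.log2(count * n / (hist_a[x] * hist_b[y]))
    return mi


# Toy data: eight gray levels repeated four times, and the inverted copy.
image = [0, 32, 64, 96, 128, 160, 192, 224] * 4
inverted = [255 - v for v in image]

same_corr = correlation(image, image)
inv_corr = correlation(image, inverted)
same_mi = mutual_information(image, image)
inv_mi = mutual_information(image, inverted)
```

In this toy case the correlation flips sign when the contrast is inverted, while the mutual information of the aligned pair is unchanged, which is the kind of contrast independence the abstract attributes to Mutual Information. The extra cost comes from building the joint histogram at every candidate alignment.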
A Simulation of Auroral Absorption
Dartmouth Technical Report TR2000-368
Eric Michael Greenberg
Date: January 2000
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/241/TR2000-368.pdf (3023KB)
Abstract:
HF radio transmissions propagate long distances by reflecting off the ionosphere.
At high latitudes radio propagation is strongly affected by the northern lights
(aurora borealis), which causes ionization at low altitudes and hence the absorption
of radio waves. Models of this process are still in a primitive state. A simulation
of radio wave propagation was created in order to test Foppiano and Bradley's empirical
model of auroral absorption. The simulation attempts to predict the net absorption
of signals at a receiver by simulating a large number of transmitters, even though
the exact sources of the signals are unknown. Although the simulation takes into
account auroral and nonauroral absorption as well as other sources of path loss,
the analysis focuses on the nighttime aurora. An intelligent search algorithm is
used in order to efficiently adjust the model to best fit the data. The output of
the simulation is qualitatively and quantitatively compared to signal levels observed
with HF radio receivers located in northern Canada. The analysis allows us to develop
alternative models of auroral absorption which account for the level of geomagnetic
activity, and these are compared to the standard Foppiano and Bradley model.
Approximation Algorithms for the Minimum Bends Traveling Salesman Problem
Dartmouth Technical Report TR2000-367
Cliff Stein
David P. Wagner
Date: January 2000
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/240/TR2000-367.ps.Z (102KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/240/TR2000-367.pdf (95KB)
Abstract:
The problem of traversing a set of points in the order that minimizes the total distance
traveled (traveling salesman problem) is one of the most famous and well-studied
problems in combinatorial optimization. It has many applications, and has been a
testbed for many of the most useful ideas in algorithm design and analysis. The
usual metric, minimizing the total distance traveled, is an important one, but many
other metrics are of interest.
In this paper, we introduce the metric of minimizing the number of turns in the
tour, given that the input points are in the Euclidean plane. To our knowledge this
metric has not been studied previously. It is motivated by applications in robotics
and in the movement of other heavy machinery: for many such devices turning is an
expensive operation. We give approximation algorithms for several variants of the
traveling salesman problem for which the metric is to minimize the number of turns.
We call this the minimum bends traveling salesman problem.
For the case of an arbitrary set of n points in the Euclidean plane, we give an
O(lg z)-approximation algorithm, where z is the maximum number of collinear points.
In the worst case z can be as big as n, but z will often be much smaller. For the
case when the lines are restricted to being either horizontal or vertical, we give
a 2-approximation algorithm. If we have the further restriction that no two points
are allowed to have the same x- or y-coordinate, we give an algorithm that finds
a tour which makes at most two turns more than the optimal tour. Thus we have an
approximation algorithm with an additive, rather than a multiplicative error bound.
Beyond the additive error bound, our algorithm for this problem introduces several
interesting algorithmic techniques for decomposing sets of points in the Euclidean
plane that we believe to be of independent interest.
Performance Analysis of Mobile Agents for Filtering Data Streams on Wireless Networks
Dartmouth Technical Report TR2000-366
David Kotz
Guofei Jiang
Robert S. Gray
George Cybenko
Ronald A. Peterson
Date: January 2000
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/239/TR2000-366.ps.Z (387KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/239/TR2000-366.pdf (375KB)
Abstract:
Wireless networks are an ideal environment for mobile agents, because their mobility
allows them to move across an unreliable link to reside on a wired host, next to
or closer to the resources they need to use. Furthermore, client-specific data transformations
can be moved across the wireless link, and run on a wired gateway server, with the
goal of reducing bandwidth demands. In this paper we examine the tradeoffs faced
when deciding whether to use mobile agents to support a data-filtering application,
in which numerous wireless clients filter information from a large data stream arriving
across the wired network. We develop an analytical model and use parameters from
our own experiments to explore the model's implications.
Mobile Agents: Motivations and State-of-the-Art Systems
Dartmouth Technical Report TR2000-365
Robert S. Gray
David Kotz
George Cybenko
Daniela Rus
Date: January 2000
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/238/TR2000-365.ps.Z (291KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/238/TR2000-365.pdf (397KB)
Abstract:
A mobile agent is an executing program that can migrate, at times of its own choosing,
from machine to machine in a heterogeneous network. On each machine, the agent interacts
with stationary service agents and other resources to accomplish its task. In this
chapter, we first make the case for mobile agents, discussing six strengths of mobile
agents and the applications that benefit from these strengths. Although none of
these strengths are unique to mobile agents, no competing technique shares all six.
In other words, a mobile-agent system provides a single general framework in which
a wide range of distributed applications can be implemented efficiently and easily.
We then present a representative cross-section of current mobile-agent systems.
Landmarks for absolute localization
Dartmouth Technical Report TR2000-364
Jon Howell
Keith Kotay
Date: January 2000
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/237/TR2000-364.ps.Z (925KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/237/TR2000-364.pdf (367KB)
Abstract:
For certain experiments in mobile robotics, it is convenient to eliminate positional
estimation error in the interest of analyzing other parts of the experiment. We
designed and implemented a simple, accurate scheme for encoding and recovering absolute
position information. The encoding is a two-dimensional image printed on the plane
of the floor, and the absolute position information is recovered using a downward-looking
video camera mounted on a mobile robot.
A Formal Semantics for SPKI
Dartmouth Technical Report TR2000-363
Jon Howell
David Kotz
Date: January 2000
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/236/TR2000-363.ps.Z (372KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/236/TR2000-363.pdf (462KB)
Abstract:
We extend the logic and semantics of authorization due to Abadi, Lampson, et al.
to support restricted delegation. Our formal model provides a simple interpretation
for the variety of constructs in the Simple Public Key Infrastructure (SPKI), and
lends intuition about possible extensions. We discuss both extensions that our semantics
supports and extensions that it cautions against.
Reducing Mass Degeneracy in SAR by MS by Stable Isotopic Labeling
Dartmouth Technical Report TR2000-362
Chris Bailey-Kellogg
John J. Kelley
Clifford Stein
Bruce Randall Donald
Date: January 2000
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/235/TR2000-362.ps.Z (212KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/235/TR2000-362.pdf (394KB)
Abstract:
Mass spectrometry (MS) promises to be an invaluable tool for functional genomics,
by supporting low-cost, high-throughput experiments. However, large-scale MS faces
the potential problem of mass degeneracy -- indistinguishable masses for multiple
biopolymer fragments (e.g. from a limited proteolytic digest). This paper studies
the tasks of planning and interpreting MS experiments that use selective isotopic
labeling, thereby substantially reducing potential mass degeneracy. Our algorithms
support an experimental-computational protocol called Structure-Activity Relation
by Mass Spectrometry (SAR by MS), for elucidating the function of protein-DNA and
protein-protein complexes. SAR by MS enzymatically cleaves a crosslinked complex
and analyzes the resulting mass spectrum for mass peaks of hypothesized fragments.
Depending on binding mode, some cleavage sites will be shielded; the absence of
anticipated peaks implicates corresponding fragments as either part of the interaction
region or inaccessible due to conformational change upon binding. Thus different
mass spectra provide evidence for different structure-activity relations. We address
combinatorial and algorithmic questions in the areas of data analysis (constraining
binding mode based on mass signature) and experiment planning (determining an isotopic
labeling strategy to reduce mass degeneracy and aid data analysis). We explore the
computational complexity of these problems, obtaining upper and lower bounds. We
report experimental results from implementations of our algorithms.
An Access-Control Calculus for Spanning Administrative Domains
Dartmouth Technical Report PCS-TR99-361
Jon Howell
David Kotz
Date: January 1999
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/234/TR99-361.ps.Z (368KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/234/TR99-361.pdf (434KB)
Abstract:
In our quest to give users uniform access to resources unimpeded by administrative
boundaries, we discovered that we needed transitive sharing among users, with the
possibility of restricted access along each sharing link. To achieve that goal,
we extend Lampson et al.'s calculus for access control to support restricted delegations.
We discuss the advantages of our extension, including the simplification of constructs
like ACLs and statement expiration. We also apply our extension to model the Simple
Public Key Infrastructure and make suggestions about its future development. Our
extended calculus exposes some surprising consequences in such systems that use
restricted delegation.
A Game-Theoretic Formulation of Multi-Agent Resource Allocation
Dartmouth Technical Report PCS-TR99-360
Jonathan Bredin
Rajiv T. Maheswaran
Cagri Imer
Tamer Basar
David Kotz
Daniela Rus
Date: January 1999
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/233/TR99-360.ps.Z (265KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/233/TR99-360.pdf (230KB)
Abstract:
This paper considers resource allocation in a network with mobile agents competing
for computational priority. We formulate this problem as a multi-agent game with
the players being agents purchasing service from a common server. We show that there
exists a computable Nash equilibrium when agents have perfect information into the
future. We simulate a network of hosts and agents using our strategy to show that
our resource-allocation mechanism effectively prioritizes agents according to their
endowments.
SAR by MS for Functional Genomics (Structure-Activity Relation by Mass Spectrometry)
Dartmouth Technical Report PCS-TR99-359
Bruce Randall Donald
Chris Bailey-Kellogg
John J. Kelley
Clifford Stein
Date: January 1999
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/232/TR99-359.ps.Z (254KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/232/TR99-359.pdf (240KB)
Abstract:
Large-scale functional genomics will require fast, high-throughput experimental techniques,
coupled with sophisticated computer algorithms for data analysis and experiment
planning. In this paper, we introduce a combined experimental-computational protocol
called Structure-Activity Relation by Mass Spectrometry (SAR by MS), which
can be used to elucidate the function of protein-DNA or protein-protein complexes.
We present algorithms for SAR by MS and analyze their complexity. Carefully-designed
Matrix-Assisted Laser Desorption/Ionization Time-Of-Flight (MALDI TOF) and Electrospray
Ionization (ESI) assays require only femtomolar samples, take only microseconds
per spectrum to record, enjoy a resolution of up to one dalton in $10^6$, and (in
the case of MALDI) can operate on protein complexes up to a megadalton in mass.
Hence, the technique is attractive for high-throughput functional genomics.
In SAR by MS, selected residues or nucleosides are first 2H-, 13C-, and/or 15N-labeled.
Second, the complex is crosslinked. Third, the complex is cleaved with proteases
and/or endonucleases. Depending on the binding mode, some cleavage sites will be
shielded by the crosslinking. Finally, a mass spectrum of the resulting fragments
is obtained and analyzed. The last step is the Data Analysis phase, in
which the mass signatures are interpreted to obtain constraints on the functional
binding mode. Experiment Planning entails deciding what labeling strategy
and cleaving agents to employ, so as to minimize mass degeneracy and spectral overlap,
in order that the constraints derived in data analysis yield a small number of binding
hypotheses.
A number of combinatorial and algorithmic questions arise in deriving algorithms
for both Experiment Planning and Data Analysis. We explore the complexity of these
problems, obtaining upper and lower bounds. Experimental results are reported from
an implementation of our algorithms.
The NOESY Jigsaw: Automated Protein Secondary Structure and Main-Chain Assignment from Sparse, Unassigned NMR Data
Dartmouth Technical Report PCS-TR99-358
Chris Bailey-Kellogg
Alik Widge
John J. Kelley
Marcelo J. Berardi
John H. Bushweller
Bruce Randall Donald
Date: January 1999
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/231/TR99-358.ps.Z (268KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/231/TR99-358.pdf (421KB)
Abstract:
High-throughput, data-directed computational protocols for Structural Genomics (or
Proteomics) are required in order to evaluate the protein products of genes for
structure and function at rates comparable to current gene-sequencing technology.
This paper presents the Jigsaw algorithm, a novel high-throughput, automated approach
to protein structure characterization with nuclear magnetic resonance (NMR). Jigsaw
consists of two main components: (1) graph-based secondary structure pattern identification
in unassigned heteronuclear NMR data, and (2) assignment of spectral peaks by probabilistic
alignment of identified secondary structure elements against the primary sequence.
Jigsaw's deferment of assignment until after secondary structure identification
differs greatly from traditional approaches, which begin by correlating peaks among
dozens of experiments. By deferring assignment, Jigsaw not only eliminates this
bottleneck, it also allows the number of experiments to be reduced from dozens to
four, none of which requires 13C-labeled protein. This in turn dramatically reduces
the amount and expense of wet lab molecular biology for protein expression and purification,
as well as the total spectrometer time to collect data.
Our results for three test proteins demonstrate that we are able to identify and
align approximately 80 percent of alpha-helical and 60 percent of beta-sheet structure.
Jigsaw is extremely fast, running in minutes on a Pentium-class Linux workstation.
This approach yields quick and reasonably accurate (as opposed to the traditional
slow and extremely accurate) structure calculations, utilizing a suite of graph
analysis algorithms to compensate for the data sparseness. Jigsaw could be used
for quick structural assays to speed data to the biologist early in the process
of investigation, and could in principle be applied in an automation-like fashion
to a large fraction of the proteome.
Investigating Measures for Pairwise Document Similarity
Dartmouth Technical Report PCS-TR99-357
Jeffrey D. Isaacs
Javed A. Aslam
Date: January 1999
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/230/TR99-357.ps.Z (29KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/230/TR99-357.pdf (29KB)
Abstract:
The need for a more effective similarity measure is growing as a result of the astonishing
amount of information being placed online. Most existing similarity measures are
defined by empirically derived formulas and cannot easily be extended to new applications.
We present a pairwise document similarity measure based on Information Theory, and
present corpus dependent and independent applications of this measure. When ranked
with existing similarity measures over TREC FBIS data, our corpus dependent information
theoretic similarity measure ranked first.
An Environment for the Facilitation of Robotic Programming
Dartmouth Technical Report PCS-TR99-356
Artem Lifschitz
Date: January 1999
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/229/TR99-356.ps.Z (116KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/229/TR99-356.pdf (173KB)
Abstract:
I have developed, tested, and evaluated a robot programming environment organized
as a library of flexible data structures to facilitate the creation of robotics
programs. Abstractions are the basis of all of the achievements of Computer Science,
and if it were possible to create a truly flexible, generic abstraction for the
programming of robots, the science of robotics could advance at a faster pace.
For this reason, I have attempted to implement the abstraction of low-level commands,
and the assembling of them into hierarchies of higher-level actions. My libraries
provide mechanisms for the manipulation and queuing of actions, as well as for the
timing of low-level sensing and actuation. I have tested these libraries by implementing
a cooperative multi-agent formation algorithm in the SoccerServer environment. Often,
when a new level of abstraction is introduced, significant overhead is added.
However, I have conducted experiments to show that there is no significant overhead
in using this system by comparing the performance of SoccerServer agents that have
been programmed using it against the performance of other agents.
The Implementation of DaSSF OTcl APIs
Dartmouth Technical Report PCS-TR99-355
Hongxia Quan
Date: January 1999
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/228/TR99-355.pdf (365KB)
Abstract:
As an extension of Tcl, OTcl provides basic functionality for object-oriented programming
in the scripting language Tcl. We implemented the OTcl APIs for DaSSF (parallel simulation
software written in C++ at Dartmouth College) using the Tclcl software package written
at the University of California at Berkeley. This document discusses the issues involved
in the implementation, especially the communications between C++ objects and OTcl
objects required by DaSSF and the naming problems.
Computers, Art and Smart Rooms: A Smart Picture Frame that Senses the Weather and Genetically Evolves Images
Dartmouth Technical Report PCS-TR99-354
Marisa E. Kolodny
Date: January 1999
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/227/TR99-354.ps.Z (1841KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/227/TR99-354.pdf (1055KB)
Abstract:
By using sensors to sense the environment and genetic programming to evolve images,
this thesis explores two methods for developing smart pictures that can be integrated
with a living space. The system presented senses the weather and indoor conditions,
displays current weather and forecast information retrieved from the web, and displays
genetically evolved images. Sensing the weather not only provides the user with
information they might find useful, but also allows the computer to gain a better
understanding of the user which in turn allows the computer to respond more accurately.
Genetic programming allows the computer to better respond to its environment by
evolving fitter programs.
Using Haptic Vector Fields for Animation Motion Control
Dartmouth Technical Report PCS-TR99-353
Bruce Randall Donald
Frederick Henle
Date: January 1999
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/226/TR99-353.ps.Z (803KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/226/TR99-353.pdf (563KB)
Abstract:
We are exploring techniques for animation authoring and editing using a haptic force-feedback
device. In our system, a family of animations is encoded by a bundle of trajectories.
This bundle in turn defines a time-varying, higher-order vector field on a configuration
space for the animation. A haptic input device provides a low-dimensional parameterization
of the resulting dynamical system, and the haptic force feedback permits browsing
and editing of the space of animations, by allowing the user to experience the vector
field as physical forces.
An Application of Word Sense Disambiguation to Information Retrieval
Dartmouth Technical Report PCS-TR99-352
Jason M. Whaley
Date: January 1999
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/225/TR99-352.ps.Z (152KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/225/TR99-352.pdf (172KB)
Abstract:
The problems of word sense disambiguation and document indexing for information retrieval
have been extensively studied. It has been observed that indexing using disambiguated
meanings, rather than word stems, should improve information retrieval results.
We present a new corpus-based algorithm for performing word sense disambiguation.
The algorithm does not need to train on many senses of each word; it uses instead
the probability that certain concepts will occur together. That algorithm is then
used to index several corpora of documents. Our indexing algorithm does not generally
outperform the traditional stem-based tf.idf model.
Fast Out-of-Core Sorting on Parallel Disk Systems
Dartmouth Technical Report PCS-TR99-351
Matthew D. Pearson
Date: January 1999
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/224/TR99-351.ps.Z (904KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/224/TR99-351.pdf (134KB)
Abstract:
This paper discusses our implementation of Rajasekaran's (l,m)-mergesort algorithm
(LMM) for sorting on parallel disks. LMM is asymptotically optimal for large problems
and has the additional advantage of a low constant in its I/O complexity. Our implementation
is written in C using the ViC* I/O API for parallel disk systems. We compare the
performance of LMM to that of the C library function qsort on a DEC Alpha server.
qsort makes a good benchmark because it is fast and performs comparatively well
under demand paging. Since qsort fails when the swap disk fills up, we can only
compare these algorithms on a limited range of inputs. Still, on most out-of-core
problems, our implementation of LMM runs between 1.5 and 1.9 times faster than qsort,
with the gap widening with increasing problem size.
Two Algorithms for Performing Multidimensional, Multiprocessor, Out-of-Core FFTs
Dartmouth Technical Report PCS-TR99-350
Lauren M. Baptist
Date: January 1999
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/223/TR99-350.ps.Z (756KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/223/TR99-350.pdf (574KB)
Abstract:
We show two algorithms for computing multidimensional Fast Fourier Transforms (FFTs)
on a multiprocessor system with distributed memory when problem sizes are so large
that the data do not fit in the memory of the entire system. Instead, data reside
on a parallel disk system and are brought into memory in sections. We use the Parallel
Disk Model for implementation and analysis. The first method is a straightforward
out-of-core variant of a well-known method for in-core, multidimensional FFTs. It
performs 1-dimensional FFT computations on each dimension in turn. This method is
easy to generalize to any number of dimensions, and it also readily permits the
individual dimensions to be of any sizes that are integer powers of 2. The key step
is an out-of-core transpose operation that places the data along each dimension
into contiguous positions on the parallel disk system so that the data for the 1-dimensional
FFTs are contiguous. The second method is an adaptation of another well-known
method for in-core, multidimensional FFTs. This method computes all dimensions simultaneously.
It is more difficult to generalize to arbitrary radices and number of dimensions
in this method than in the first method. Our present implementation is therefore
limited to two dimensions of equal size, which are again integer powers of 2. We
present I/O complexity analyses for both methods as well as empirical results for
a DEC 2100 server and an SGI Origin 2000, each of which has a parallel disk system.
Our results indicate that the methods are comparable in speed in two dimensions.
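The first method described in the abstract above (1-dimensional FFTs along each dimension in turn, with a transpose making the next dimension contiguous) can be sketched in-core as follows. This is only an illustration of the dimension-by-dimension idea under simplifying assumptions: the data fit in memory, there are two dimensions, and the transpose is an ordinary in-memory one rather than the out-of-core parallel-disk transpose the report is about. The function names are hypothetical.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT of a sequence whose length is a power of 2."""
    n = len(x)
    if n == 0:
        raise ValueError("input must be non-empty")
    if n == 1:
        return [complex(x[0])]
    if n % 2:
        raise ValueError("length must be an integer power of 2")
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def transpose(matrix):
    """Swap rows and columns so the other dimension becomes contiguous."""
    return [list(column) for column in zip(*matrix)]


def fft2(matrix):
    """2-D FFT: 1-D FFTs over rows, transpose, 1-D FFTs again, transpose back."""
    after_rows = [fft(row) for row in matrix]
    after_cols = [fft(col) for col in transpose(after_rows)]
    return transpose(after_cols)


# A unit impulse at the origin transforms to a constant spectrum.
impulse_spectrum = fft2([[1, 0], [0, 0]])

# A constant 2x4 input concentrates all energy in the zero-frequency term.
constant_spectrum = fft2([[1, 1, 1, 1], [1, 1, 1, 1]])
```

Because every pass touches the data one row at a time, the same structure extends to more dimensions by repeating the transform-then-transpose step, which matches the abstract's remark that this method generalizes easily.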
Improved Computer Detection and Mapping of Cerebral Oxygenation
Dartmouth Technical Report PCS-TR99-349
David H. Kung
Date: January 1999
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/222/TR99-349.ps.Z (1696KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/222/TR99-349.pdf (460KB)
Abstract:
Near-infrared (NIR) optical image reconstruction that incorporates blood oxygen level
dependent (BOLD) magnetic resonance imaging has the potential to improve both quantifiable
measurement of oxygenation and the spatial resolution involved in such mapping.
My thesis continues some preliminary work in this area through development of an
analytic diffusion parameter estimation algorithm for use with a NIR imaging array
and development of a finite element mesh utility to read a priori BOLD images and
tag them with property elements for NIR image resolution improvement.
A Two Dimensional Crystalline Atomic Unit Modular Self-reconfigurable Robot
Dartmouth Technical Report PCS-TR99-348
Marsette A. Vona
Date: January 1999
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/221/TR99-348.ps.Z (6075KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/221/TR99-348.pdf (1819KB)
Abstract:
Self-reconfigurable robots are designed so that they can change their external shape
without human intervention. One general way to achieve such functionality is to
build a robot composed of multiple, identical unit modules. If the modules are designed
so that they can be assembled into rigid structures, and so that individual units
within such structures can be relocated within and about the structure, then self-reconfiguration
is possible. We propose the Crystalline Atomic unit modular self-reconfigurable
robot, where each unit is called an Atom. In two dimensions, an Atom is square.
Connectors at the faces of each Atom support structure formation (such structures
are called Crystals). Centrally placed prismatic degrees of freedom give Atoms the
ability to contract their outer side-length by a constant factor. By contracting
and expanding groups of Atoms in a coordinated way, Atoms can relocate within and
about Crystals. Thus Atoms are shown to satisfy the two properties necessary to
function as modules of a self-reconfigurable robot. A powerful software simulator
for Crystalline Atomic robots in two and three dimensions, called xtalsim, is presented.
Xtalsim includes a high-level language interface for specifying reconfigurations,
an engine which expands implicit reconfiguration plans into explicit Crystal state
sequences, and an interactive animator which displays the results in a virtual environment.
An automated planning algorithm for generating reconfigurations, called the Melt-Grow
planner, is described. The Melt-Grow planner is fast (O(n^2) for Crystals of n Atoms)
and complete for a fully general subset of Crystals. The Melt-Grow planner is implemented
and interfaced to xtalsim, and an automatically planned reconfiguration is simulated.
Finally, the mechanics, electronics, and software for an Atom implementation are
developed. Two Atoms are constructed and experiments are performed which indicate
that, with some hardware improvements, an interesting self-reconfiguration could
be demonstrated by a group of Atoms.
Existence Theorems for Scheduling to Meet Two Objectives
Dartmouth Technical Report PCS-TR99-347
April M. Rasala
Date: January 1999
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/220/TR99-347.ps.Z (273KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/220/TR99-347.pdf (369KB)
Abstract:
We will look at the existence of schedules which are simultaneously near-optimal
for two criteria. First, we will present some techniques for proving existence theorems,
in a very general setting, for bicriterion scheduling problems. We will then use
these techniques to prove existence theorems for a large class of problems. We will
consider the relationship between objective functions based on completion time,
flow time, lateness and the number of on-time jobs. We will also present negative
results first for the problem of simultaneously minimizing the maximum flow time
and average weighted flow time and second for minimizing the maximum flow time and
simultaneously maximizing the number of on-time jobs. In some cases we will also
present lower bounds and algorithms that approach our bicriterion existence theorems.
Finally we will improve upon our general existence results in one more specific
environment.
Parallel DaSSF Discrete-Event Simulation without Shared Memory
Dartmouth Technical Report PCS-TR99-346
James D. Chalfant
Date: January 1999
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/219/TR99-346.ps.Z (223KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/219/TR99-346.pdf (302KB)
Abstract:
The Dartmouth implementation of the Scalable Simulation Framework (DaSSF) is a discrete-event
simulator used primarily in the simulation of networks. It achieves high performance
through parallel processing. DaSSF 1.22 requires shared memory between all processors
in order to operate. This limits the number of processors available and the hardware
platforms that can exploit parallelism. We are interested in extending parallel
DaSSF operation to architectures without shared memory. We explore the requirements
of this by implementing parallel DaSSF using MPI as the sole form of interaction
between processors. The approaches used to achieve this can be abstracted and applied
to the current version of DaSSF. This would allow parallel simulation using shared
memory by processors within a single machine, and also at a higher level between
separate machines using distributed memory.
Mobile-Agent Planning in a Market-Oriented Environment
Dartmouth Technical Report PCS-TR99-345
Jonathan Bredin
David Kotz
Daniela Rus
Date: January 1999
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/218/TR99-345.ps.Z (297KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/218/TR99-345.pdf (432KB)
Abstract:
We propose a method for increasing incentives for sites to host arbitrary mobile
agents in which mobile agents purchase their computing needs from host sites. We
present a scalable market-based CPU allocation policy and an on-line algorithm that
plans a mobile agent's expenditure over a multihop ordered itinerary. The algorithm
chooses a set of sites at which to execute and computational priorities at each
site to minimize execution time while preserving a prespecified budget constraint.
We present simulation results of our algorithm to show that our allocation policy
and planning algorithm scale well as more agents are added to the system.
Greedy Approximation Algorithms for K-Medians by Randomized Rounding
Dartmouth Technical Report PCS-TR99-344
Neal E. Young
Date: January 1999
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/217/TR99-344.ps.Z (140KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/217/TR99-344.pdf (336KB)
Abstract:
We give an improved approximation algorithm for the general k-medians problem. Given
any epsilon>0, the algorithm finds a solution of total distance at most D(1+epsilon)
using at most k ln(n+n/epsilon) medians (a.k.a. sites), provided some solution of
total distance D using k medians exists. This improves over the best previous bound
(w.r.t. the number of medians) by a factor of Omega(1/epsilon) provided 1/epsilon=n^O(1).
The algorithm is a greedy algorithm, derived using the method of oblivious randomized
rounding. It requires at most k ln(n+n/epsilon) linear-time iterations. We also
derive algorithms for fractional and weighted variants of the problem.
Snowflake: Spanning administrative domains
Dartmouth Technical Report PCS-TR98-343
Jon Howell
David Kotz
Date: January 1998
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/216/TR98-343.ps.Z (137KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/216/TR98-343.pdf (128KB)
Abstract:
Many distributed systems provide a "single-system image" to their users, so the
user has the illusion that they are using a single system when in fact they are
using many distributed resources. It is a powerful abstraction that helps users
to manage the complexity of using distributed resources. The goal of the Snowflake
project is to discover how single-system images can be made to span administrative
domains. Our current prototype organizes resources in namespaces and distributes
them using Java Remote Method Invocation. Challenging issues include how much flexibility
should be built into the namespace interface, and how transparent the network and
persistent storage should be. We outline future work on making Snowflake administrator-friendly.
Abstractions for Simplifying Planning in Self-Reconfigurable Robotic Systems
Dartmouth Technical Report PCS-TR98-342
Craig McGray
Daniela Rus
Date: January 1998
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/215/TR98-342.pdf (814KB)
Abstract:
In [KVRM], we described a three-dimensional self-reconfiguring robot module called
the Molecule Robot.
In this paper, we provide a system of abstractions for modules in self-reconfigurable
robotic systems, and show how this system can be used to simplify the motion planning
of the Molecule Robot system.
Boosting a Simple Weak Learner For Classifying Handwritten Digits
Dartmouth Technical Report PCS-TR98-341
Matthew P. Carter
Date: January 1998
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/214/TR98-341.ps.Z (143KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/214/TR98-341.pdf (101KB)
Abstract:
A weak PAC learner is one which takes labeled training examples and produces a classifier
which can label test examples more accurately than random guessing. A strong learner
(also known as a PAC learner), on the other hand, is one which takes labeled training
examples and produces a classifier which can label test examples arbitrarily accurately.
Schapire has constructively proved that a strong PAC learner can be derived from
a weak PAC learner. A performance boosting algorithm takes a set of training examples
and a weak PAC learning algorithm and generates a strong PAC learner.
Our research attempts to solve the problem of learning a multi-valued function and
then boosting the performance of this learner.
We implemented the AdaBoost.M2 boosting algorithm. We developed a problem-general
weak learning algorithm, capable of running under AdaBoost.M2, for learning a multi-valued
function of uniform length bit sequences.
We applied our learning algorithms to the problem of classifying handwritten digits.
For training and testing data, we used the MNIST dataset.
Our experiments demonstrate the underlying weak learner's ability to achieve a fairly
low error rate on the testing data, as well as the boosting algorithm's ability
to reduce the error rate of the weak learner.
Hey, You Got Your Language In My Operating System!
Dartmouth Technical Report PCS-TR98-340
Jon Howell
Mark Montague
Date: January 1998
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/213/TR98-340.ps.Z (150KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/213/TR98-340.pdf (145KB)
Abstract:
Several projects in the operating systems research community suggest a trend of convergence
among features once divided between operating systems and languages. We describe
how partial evaluation and transformational programming systems apply to this trend
by providing a general framework for application support, from compilation to run-time
services. We contend that the community will no longer think of implementing a static
collection of services and calling it an operating system; instead, this general
framework will allow applications to be flexibly configured, and the "operating
system" will simply be the application support that is supplied at run-time.
Avoiding Conflicts Dynamically in Direct Mapped Caches with Minimal Hardware Support
Dartmouth Technical Report PCS-TR98-339
Peter N. DeSantis
Date: January 1998
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/212/TR98-339.ps.Z (770KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/212/TR98-339.pdf (664KB)
Abstract:
The memory system is often the weakest link in the performance of today's computers.
Cache design has received increasing attention in recent years as increases in CPU
performance continue to outpace decreases in memory latency. Bershad et al. proposed
a hardware modification called the Cache Miss Lookaside buffer which attempts to
dynamically identify data which is conflicting in the cache and remap pages to avoid
future conflicts. In a follow-up paper, Bershad et al. tried to modify this idea
to work with standard hardware but had less success than with their dedicated hardware.
In this thesis, we focus on a modification of these ideas, using less complicated
hardware and focusing more on sampling policies. The hardware support is reduced
to a buffer of recent cache misses and a cache miss counter. Because determination
of remapping candidates is moved to software, sampling policies are studied to reduce
overhead which will most likely fall on the OS. Our results show that sampling can
be highly effective in identifying conflicts that should be remapped. Finally, we
show that the theoretical performance of such a system can compare favorably with
more costly higher associativity caches.
The Effects of Singular Value Decomposition on Collaborative Filtering
Dartmouth Technical Report PCS-TR98-338
Michael H. Pryor
Date: January 1998
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/211/TR98-338.ps.Z (1439KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/211/TR98-338.pdf (287KB)
Abstract:
As the information on the web increases exponentially, so do the efforts to automatically
filter out useless content and to search for interesting content. Through both explicit
and implicit actions, users define where their interests lie. Recent efforts have
tried to group similar users together in order to better use this data to provide
the best overall filtering capabilities to everyone. This thesis discusses ways
in which linear algebra, specifically the singular value decomposition, can be used
to augment these filtering capabilities to provide better user feedback. The goal
is to modify the way users are compared with one another, so that we can more efficiently
predict similar users. Using data collected from the PhDs.org website, we tested
our hypothesis on both explicit web page ratings and implicit visits data.
Applications of Parallel I/O
Dartmouth Technical Report PCS-TR98-337
Ron Oldfield
David Kotz
Date: January 1998
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/210/TR98-337.ps.Z (88KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/210/TR98-337.pdf (166KB)
Abstract:
Scientific applications are increasingly being implemented on massively parallel
supercomputers. Many of these applications have intense I/O demands, as well as
massive computational requirements. This paper is essentially an annotated bibliography
of papers and other sources of information about scientific applications using parallel
I/O. It will be updated periodically.
Distributed Route Planning Using Partial Map Building
Dartmouth Technical Report PCS-TR98-336
Christine J. Alvarado
Date: January 1998
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/209/TR98-336.ps.Z (711KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/209/TR98-336.pdf (159KB)
Abstract:
Our goal is to manipulate and guide an object across an unknown environment toward
a goal in a known location in space. Our tools include a system of manipulation
robots, which are "blind," and one mobile scout robot that relies on a series of sonar
sensors for information about the environment. Previous solutions to this problem
have taken a simultaneous guiding and manipulating approach, moving the whole system
under the scout's guidance. My approach, however, presents a separate scouting algorithm
that can return a series of coordinates through which the manipulation system can
safely pass to reach the goal in a static environment. This new approach produces
more nearly optimal paths to the goal and avoids the question of what actions to
take should the entire system reach a dead end. In this paper I will present both
the algorithm and the experimental results I obtained when I built the scouting
system.
Multiple Media Correlation: Theory and Applications
Dartmouth Technical Report PCS-TR98-335
Charles B. Owen
Date: January 1998
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/208/TR98-335.ps.Z (1253KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/208/TR98-335.pdf (1588KB)
Abstract:
This thesis introduces multiple media correlation, a new technology for the automatic
alignment of multiple media objects such as text, audio, and video. This research
began with the question: what can be learned when multiple multimedia components
are analyzed simultaneously? Most ongoing research in computational multimedia has
focused on queries, indexing, and retrieval within a single media type. Video is
compressed and searched independently of audio; text is indexed without regard to
temporal relationships it may have to other media data.
Multiple media correlation provides a framework for locating and exploiting correlations
between multiple, potentially heterogeneous, media streams. The goal is computed
synchronization, the determination of temporal and spatial alignments that optimize
a correlation function and indicate commonality and synchronization between media
objects. The model also provides a basis for comparison of media in unrelated domains.
There are many real-world applications for this technology, including speaker localization,
musical score alignment, and degraded media realignment. Two applications, text-to-speech
alignment and parallel text alignment, are described in detail with experimental
validation. Text-to-speech alignment computes the alignment between a textual transcript
and speech-based audio. The presented solutions are effective for a wide variety
of content and are useful not only for retrieval of content, but in support of automatic
captioning of movies and video. Parallel text alignment provides a tool for the
comparison of alternative translations of the same document that is particularly
useful to the classics scholar interested in comparing translation techniques or
styles.
The results presented in this thesis include (a) new media models more useful in
analysis applications, (b) a theoretical model for multiple media correlation, (c)
two practical application solutions that have widespread applicability, and (d)
Xtrieve, a multimedia database retrieval system that demonstrates this new technology
and the application of multiple media correlation to information retrieval.
This thesis demonstrates that computed alignment of media objects is practical
and can provide immediate solutions to many information retrieval and content presentation
problems. It also introduces a new area for research in media data analysis.
C Compiler Targeting the Java Virtual Machine
Dartmouth Technical Report PCS-TR98-334
Jack Pien
Date: January 1998
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/207/TR98-334.ps.Z (121KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/207/TR98-334.pdf (114KB)
Abstract:
One of the major drawbacks in the field of computer software development has been
the inability of applications to compile once and execute across many different
platforms. With the emergence of the Internet and the networking of many different
platforms, the Java programming language and the Java Platform were created by Sun
Microsystems to address this "Write Once, Run Anywhere" problem. What sets a compiled
Java program apart from programs compiled from other high level languages is the
ability of a Java Virtual Machine to execute the compiled Java program on any platform,
as long as the Java Virtual Machine is running on top of that platform. Java's cross
platform capabilities can be extended to other high level languages such as C. The
main objective of our project is to implement a compiler targeting the Java Platform
for a subset of the C language. This will allow code written in that subset of C
to be compiled into Java Virtual Machine instructions, also known as JVM bytecode,
which can then be executed on a Java Virtual Machine running on any platform. The
reader is assumed to be intimately familiar with compiler construction, the use
of the flex scanner generator, the use of the GNU bison parser generator, and the
structure and implementation of the Java Virtual Machine.
An Implementation of External-Memory Depth-First Search
Dartmouth Technical Report PCS-TR98-333
Christopher S. Leon
Date: January 1998
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/206/TR98-333.ps.Z (103KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/206/TR98-333.pdf (106KB)
Abstract:
In many different areas of computing, problems can arise which are too large to fit
in main memory. For these problems, the I/O cost of moving data between main memory
and secondary storage (for example, disks) becomes a significant bottleneck affecting
the performance of the program. Since most algorithms do not take into account the
size of main memory, new algorithms have been developed to optimize the number of
I/Os performed. This paper details the implementation of one such algorithm, for
external-memory depth-first search. Depth-first search is a basic tool for solving
many problems in graph theory, and since graph theory is applicable to many large
computational problems, it is important to make sure that such a basic tool is designed
to avoid the bottleneck of main memory to secondary storage I/Os. The algorithm
whose implementation is described in this paper is sketched out in an extended abstract
by Chiang et al. We attempt to improve the given algorithm by minimizing I/Os performed,
and to extend the algorithm by finding disjoint trees, and by classifying all the
edges in the problem.
Multiscouting: Guiding distributed manipulation with multiple mobile sensors
Dartmouth Technical Report PCS-TR98-332
Michael G. Ross
Date: January 1998
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/205/TR98-332.ps.Z (849KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/205/TR98-332.pdf (287KB)
Abstract:
This thesis investigates the use of multiple mobile sensors to guide the motion of
a distributed manipulation system. In our system, multiple robots cooperatively
place a large object at a goal in a dynamic, unstructured, unmapped environment.
We take the system developed in [Rus, Kabir, Kotay, Soutter 1996], which employs
a single mobile sensor for navigational tasks, and extend it to allow the use of
multiple mobile sensors. This allows the system to perform successful manipulations
in a larger class of spaces than was possible in the single scout model. We focus
on the development of a negotiation protocol that enables multiple scouts to cooperatively
plan system motion. This algorithm enhances the previous system's scalability and
adds greater fault-tolerance. Two alternative algorithms for cooperation, a modification
of negotiation and a bidding protocol, are also discussed. Finally, an implementation
of the negotiation protocol is described and experimental data produced by the implementation
is analyzed.
Utility Driven Mobile-Agent Scheduling
Dartmouth Technical Report PCS-TR98-331
Jonathan Bredin
David Kotz
Daniela Rus
Date: January 1998
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/204/TR98-331.ps.Z (132KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/204/TR98-331.pdf (235KB)
Abstract:
We investigate the possibility of using markets to regulate mobile agents, computer
programs that are capable of migrating from one machine to another. Market participation
requires quantitative information about resource consumption to define demand and
calculate utility.
We create a formal utility model to derive user-demand functions, allowing agents
to efficiently plan expenditure and deal with price fluctuations. By quantifying
demand and utility, resource owners can precisely set a value for a good. We simulate
our model in a mobile agent scheduling environment and show how prices fluctuate,
compounding uncertainty in an agent's plans. To solve this problem, we propose that
resource owners sell options to allow agents to trade away their risk.
Straightforward Java Persistence Through Checkpointing
Dartmouth Technical Report PCS-TR98-330
Jon Howell
Date: January 1998
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/203/TR98-330.ps.Z (138KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/203/TR98-330.pdf (107KB)
Abstract:
Several techniques have been proposed for adding persistence to the Java language
environment. This paper describes a scheme based on checkpointing the Java Virtual
Machine, and compares the scheme to other techniques. Checkpointing offers two unique
advantages: first, the implementation is independent of the JVM implementation,
and therefore survives JVM updates; second, because checkpointing saves and restores
execution state, even threads become persistent entities.
An Implementation of the Vesta Parallel File System API on the Galley Parallel File System
Dartmouth Technical Report PCS-TR98-329
Matthew P. Carter
David Kotz
Date: January 1998
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/202/TR98-329.ps.Z (202KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/202/TR98-329.pdf (139KB)
Abstract:
To demonstrate the flexibility of the Galley parallel file system and to analyze
the efficiency and flexibility of the Vesta parallel file system interface, we implemented
Vesta's application-programming interface on top of Galley. We implemented the Vesta
interface using Galley's file-access methods, whose design arose from extensive
testing and characterization of the I/O requirements of scientific applications
for high-performance multiprocessors. We used a parallel CPU, parallel I/O, out-of-core
matrix-multiplication application to test the Vesta interface both in its ability
to specify data access patterns and in its run-time efficiency. In spite of its
powerful ability to specify the distribution of regular, non-overlapping data access
patterns across disks, we found that the Vesta interface has some significant limitations.
We discuss these limitations in detail in the paper, along with the performance
results.
A framework for World Wide Web client-authentication protocols
Dartmouth Technical Report PCS-TR98-328
Cem Paya
Date: January 1998
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/201/TR98-328.ps.Z (542KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/201/TR98-328.pdf (156KB)
Abstract:
Existing client-authentication protocols deployed on the World Wide Web today are
based on conventional distributed systems and fail to address the problems specific
to the application domain. Some of the protocols restrict the mobility of the client
by equating user identity to a machine or network address, others depend on sound
password management strategies, and yet others compromise the privacy of the user
by transmitting personal information for authentication. We introduce a new framework
for client-authentication by separating two goals that current protocols achieve
simultaneously:
1. Maintain a persistent sense of identity across different sessions.
2. Prove facts about the user to the site.
These problems are independent, in the sense that any protocol for solving the first
problem can be combined with any protocol for solving the second. Separation of
the two purposes opens up the possibility of designing systems which balance two
conflicting goals, authentication and anonymity. We propose a solution to the first
problem, based on the Digital Signature Standard. The implications of this framework
from the point of view of user privacy are examined. The paper concludes with
suggestions for integrating the proposed scheme into the existing WWW architecture.
Agent Tcl: A flexible and secure mobile-agent system
Dartmouth Technical Report PCS-TR98-327
Robert S. Gray
Date: January 1998
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/200/TR98-327.ps.Z (1156KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/200/TR98-327.pdf (1409KB)
Abstract:
A mobile agent is an autonomous program that can migrate under its own control from
machine to machine in a heterogeneous network. In other words, the program can suspend
its execution at an arbitrary point, transport itself to another machine, and then
resume execution from the point of suspension. Mobile agents have the potential
to provide a single, general framework in which a wide range of distributed
applications can be implemented efficiently and easily. Several challenges must
be faced, however, most notably reducing migration overhead, protecting a machine
from malicious agents (and an agent from malicious machines), and insulating the
agent against network and machine failures. Agent Tcl is a mobile-agent system under
development at Dartmouth College that has evolved from a Tcl-only system into a
multiple-language system that currently supports Tcl, Java, and Scheme. In this
thesis, we examine the motivation behind mobile agents, describe the base Agent
Tcl system and its security mechanisms for protecting a machine against malicious
agents, and analyze the system's current performance. Finally, we discuss the security,
fault-tolerance and performance enhancements that will be necessary for Agent Tcl
and mobile agents in general to realize their full potential.
Utility Driven Mobile-Agent Scheduling
Dartmouth Technical Report PCS-TR98-311
Jonathan Bredin
David Kotz
Daniela Rus
Date: January 0005
Abstract:
Notes:
Market-based Resource Control for Mobile Agents
Dartmouth Technical Report PCS-TR97-326
Jonathan Bredin
David Kotz
Daniela Rus
Date: January 1997
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/198/TR97-326.ps.Z (132KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/198/TR97-326.pdf (220KB)
Abstract:
Mobile agents are programs that can migrate from machine to machine in a heterogeneous,
partially disconnected network. As mobile agents move across a network, they consume
resources. We discuss a system for controlling the activities of mobile agents that
uses electronic cash, a banking system, and a set of resource managers. We describe
protocols for transactions between agents. We present fixed-pricing and dynamic-pricing
policies for resources. We focus on and analyze the sealed-bid second-price auction
as a mechanism for dynamic pricing.
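The sealed-bid second-price auction named in the abstract above can be illustrated with a minimal sketch. This is not the report's implementation: the function name, the optional reserve price, and the alphabetical tie-breaking rule are assumptions made only for illustration. It shows the defining rule of the mechanism: the highest bidder wins but pays the second-highest bid.

```python
from typing import Dict, Optional, Tuple


def second_price_auction(
    bids: Dict[str, float], reserve: float = 0.0
) -> Optional[Tuple[str, float]]:
    """Return (winner, price) for one sealed-bid second-price auction.

    bids maps a (hypothetical) agent name to its sealed bid.
    Returns None if no bid meets the reserve price.
    """
    # Only bids at or above the reserve take part.
    eligible = {agent: bid for agent, bid in bids.items() if bid >= reserve}
    if not eligible:
        return None
    # Rank by descending bid; break ties by agent name (an assumed rule).
    ranked = sorted(eligible.items(), key=lambda item: (-item[1], item[0]))
    winner = ranked[0][0]
    # The winner pays the second-highest eligible bid, or the reserve
    # price if there is no competing bid.
    price = ranked[1][1] if len(ranked) > 1 else reserve
    return winner, price
```

For example, with bids of 5.0, 8.0, and 3.0 from agents "a", "b", and "c", agent "b" wins and pays 5.0.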
Approximating Disjoint-Path Problems Using Greedy Algorithms and Packing Integer Programs
Dartmouth Technical Report PCS-TR97-325
Stavros G. Kolliopoulos
Clifford Stein
Date: January 1997
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/197/TR97-325.pdf (1369KB)
Abstract:
In the edge(vertex)-disjoint path problem we are given a graph $G$ and a set ${\cal
T}$ of connection requests. Every connection request in ${\cal T}$ is a vertex pair
$(s_i,t_i),$ $1 \leq i \leq K.$ The objective is to connect a maximum number of the
pairs via edge(vertex)-disjoint paths. The edge-disjoint path problem can be generalized
to the multiple-source unsplittable flow problem where connection request $i$ has
a demand $\rho_i$ and every edge $e$ a capacity $u_e.$ All these problems are NP-hard
and have a multitude of applications in areas such as routing, scheduling and bin
packing.
Given the hardness of the problem, we study polynomial-time approximation algorithms.
In this context, a $\rho$-approximation algorithm is able to route at least a $1/\rho$
fraction of the connection requests. Although the edge- and vertex-disjoint path
problems, and more recently the unsplittable flow generalization, have been extensively
studied, they remain notoriously hard to approximate with a bounded performance
guarantee. For example, even for the simple edge-disjoint path problem, no $o(\sqrt{|E|})$-approximation
algorithm is known. Moreover some of the best existing approximation ratios are
obtained through sophisticated and non-standard randomized rounding schemes.
In this paper we introduce techniques which yield algorithms for a wide range of
disjoint-path and unsplittable flow problems. For the general unsplittable flow
problem, even with weights on the commodities, our techniques lead to the first
approximation algorithm and obtain an approximation ratio that matches, to within
logarithmic factors, the $O(\sqrt{|E|})$ approximation ratio for the simple edge-disjoint
path problem. In addition to this result and to improved bounds for several disjoint-path
problems, our techniques simplify and unify the derivation of many existing approximation
results.
We use two basic techniques. First, we propose simple greedy algorithms for edge-
and vertex-disjoint paths and second, we propose the use of a framework based on
packing integer programs for more general problems such as unsplittable flow. A
packing integer program is of the form maximize $c^{T}\cdot x,$ subject to $Ax \leq
b,$ $A,b,c \geq 0.$ As part of our tools we develop improved approximation algorithms
for a class of packing integer programs, a result that we believe is of independent
interest.
Computing Dense Clusters On-line for Information Organization
Dartmouth Technical Report PCS-TR97-324
J. Aslam
K. Pelekhov
Daniela Rus
Date: January 1997
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/196/TR97-324.ps.Z (122KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/196/TR97-324.pdf (387KB)
Abstract:
We present and analyze the off-line star algorithm for clustering static information
systems and the on-line star algorithm for clustering dynamic information systems.
These algorithms partition a document collection into a number of clusters that
is naturally induced by the collection. We show a lower bound on the accuracy of
the clusters produced by these algorithms. We use the random graph model to show
that both star algorithms produce correct clusters in time Theta(V + E). Finally,
we provide data from extensive experiments.
ViC*: A Compiler for Virtual-Memory C*
Dartmouth Technical Report PCS-TR97-323
Alex Colvin
Thomas H. Cormen
Date: January 1997
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/195/TR97-323.ps.Z (107KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/195/TR97-323.pdf (272KB)
Abstract:
This paper describes the functionality of ViC*, a compiler for a variant of the data-parallel
language C* with support for out-of-core data. The compiler translates C* programs
with shapes declared outofcore, which describe parallel data stored on disk. The
compiler output is an SPMD-style program in standard C with I/O and library calls
added to efficiently access out-of-core parallel data. The ViC* compiler also applies
several program transformations to improve out-of-core data layout and access.
Determining an Out-of-Core FFT Decomposition Strategy for Parallel Disks by Dynamic Programming
Dartmouth Technical Report PCS-TR97-322
Thomas H. Cormen
Date: January 1997
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/194/TR97-322.ps.Z (97KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/194/TR97-322.pdf (250KB)
Abstract:
We present an out-of-core FFT algorithm based on the in-core FFT method developed
by Swarztrauber. Our algorithm uses a recursive divide-and-conquer strategy, and
each stage in the recursion presents several possibilities for how to split the
problem into subproblems. We give a recurrence for the algorithm's I/O complexity
on the Parallel Disk Model and show how to use dynamic programming to determine
optimal splits at each recursive stage. The algorithm to determine the optimal splits
takes only Theta(lg^2 N) time for an N-point FFT, and it is practical. The out-of-core
FFT algorithm itself takes considerably longer.
Multiple Media Stream Data Analysis: Theory and Applications (Extended version)
Dartmouth Technical Report PCS-TR97-321
Charles B. Owen
Fillia Makedon
Date: January 2010
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/193/TR97-321.ps.Z (41KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/193/TR97-321.pdf (134KB)
Abstract:
This paper presents a new model for multiple media stream data analysis as well as
descriptions of some applications of this model in development at Dartmouth College.
This model formalizes the exploitation of correlations between multiple, potentially
heterogeneous, media streams in support of numerous application areas. The goal
of the technique is to determine temporal and spatial alignments which optimize
a correlation function and indicate commonality and synchronization between media
streams. It also provides a framework for comparison of media in unrelated domains.
Applications such as text-to-speech alignment, functional magnetic resonance imaging,
speaker localization, and degraded media realignment are described.
On-Line File Caching
Dartmouth Technical Report PCS-TR97-320
Neal E. Young
Date: January 1997
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/192/TR97-320.ps.Z (58KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/192/TR97-320.pdf (165KB)
Abstract:
Consider the following file caching problem: in response to a sequence of requests
for files, where each file has a specified size and retrieval cost, maintain a cache
of files of total size at most some specified k so as to minimize the total retrieval
cost. Specifically, when a requested file is not in the cache, bring it into the
cache, pay the retrieval cost, and choose files to remove from the cache so that
the total size of files in the cache is at most k. This problem generalizes previous
paging and caching problems by allowing objects of arbitrary size and cost, both
important attributes when caching files for world-wide-web browsers, servers, and
proxies. We give a simple deterministic on-line algorithm that generalizes many
well-known paging and weighted-caching strategies, including least-recently-used,
first-in-first-out, flush-when-full, and the balance algorithm. On any request sequence,
the total cost incurred by the algorithm is at most k/(k-h+1) times the minimum
possible using a cache of size h <= k. For any algorithm satisfying the latter bound,
we show it is also the case that for most choices of k, the retrieval cost is either
insignificant or the competitive ratio is constant. This helps explain why competitive
ratios of many on-line paging algorithms have been typically observed to be constant
in practice.
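
As an illustration of the problem statement above, the following minimal Python sketch models a cache of capacity k that holds files with arbitrary sizes and retrieval costs. The abstract does not spell out its algorithm, so the eviction rule shown here (a credit-based rule in the style of the policy usually called Landlord), the class name `CreditCache`, and the choice to reset a file's credit to its full cost on a hit are all assumptions made for illustration.

```python
class CreditCache:
    """Hypothetical file cache with a credit-based eviction rule (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity   # the bound k on total cached size
        self.size = {}             # file name -> size (must be positive)
        self.credit = {}           # file name -> remaining credit
        self.total_cost = 0        # sum of retrieval costs paid on misses

    def used(self):
        return sum(self.size.values())

    def request(self, name, size, cost):
        """Serve one request; return True on a hit, False on a miss."""
        if name in self.credit:
            # Assumption: a hit restores the file's credit to its full cost.
            self.credit[name] = cost
            return True
        if size > self.capacity:
            raise ValueError("file is larger than the cache")
        self.total_cost += cost    # a miss pays the retrieval cost
        while self.used() + size > self.capacity:
            # Charge every cached file "rent" proportional to its size until
            # the file with the least credit per unit size runs out, then evict it.
            victim = min(self.credit, key=lambda f: self.credit[f] / self.size[f])
            delta = self.credit[victim] / self.size[victim]
            for f in self.credit:
                self.credit[f] -= delta * self.size[f]
            del self.credit[victim]
            del self.size[victim]
        self.size[name] = size
        self.credit[name] = cost
        return False


cache = CreditCache(capacity=2)
for name, cost in [("a", 3), ("b", 1), ("c", 2)]:
    cache.request(name, size=1, cost=cost)
print(sorted(cache.credit), cache.total_cost)
```

With unit sizes and unit costs, how ties are broken decides which classical paging policy this rule imitates; the sketch breaks ties arbitrarily.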
Generating, Visualizing and Evaluating High Quality Clusters for Information Organization
Dartmouth Technical Report PCS-TR97-319
J. Aslam
K. Pelekhov
Daniela Rus
Date: January 1997
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/191/TR97-319.ps.Z (189KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/191/TR97-319.pdf (227KB)
Abstract:
We present and analyze the star clustering algorithm. We discuss an implementation
of this algorithm that supports browsing and document retrieval through information
organization. We define three parameters for evaluating a clustering algorithm to
measure the topic separation and topic aggregation achieved by the algorithm. In
the absence of benchmarks, we present a method for randomly generating clustering
data. Data from our user study shows evidence that the star algorithm is effective
for organizing information.
An Information Retrieval System for Performing Hierarchical Document Clustering
Dartmouth Technical Report PCS-TR97-318
Eric Hagen
Date: January 1997
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/190/TR97-318.ps.Z (501KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/190/TR97-318.pdf (141KB)
Abstract:
This thesis presents a system for web-based information retrieval that supports precise
and informative post-query organization (automated document clustering by topic)
to decrease real search time on the part of the user. Most existing Information
Retrieval systems depend on the user to perform intelligent, specific queries with
Boolean operators in order to minimize the set of returned documents. The user essentially
must guess the appropriate keywords before performing the query. Other systems use
a vector space model which is more suitable to performing the document similarity
operations which permit hierarchical clustering of returned documents by topic.
This allows "post query" refinement by the user. The system we propose is a hybrid
between these two systems, compatible with the former while providing the enhanced
document organization permitted by the latter.
Performing BMMC Permutations Efficiently on Distributed-Memory Multiprocessors with MPI
Dartmouth Technical Report PCS-TR97-317
Thomas H. Cormen
Date: January 1997
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/189/TR97-317.ps.Z (88KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/189/TR97-317.pdf (269KB)
Abstract:
This paper presents an architecture-independent method for performing BMMC permutations
on multiprocessors with distributed memory. All interprocessor communication uses
the MPI function MPI_Sendrecv_replace(). The number of elements and number of processors
must be powers of 2, with at least one element per processor, and there is no inherent
upper bound on the ratio of elements per processor.
Our method transmits only data without transmitting any source or target indices,
which conserves network bandwidth. When data is transmitted, the source and target
processors implicitly agree on each other's identity and the indices of the elements
being transmitted.
A C-callable implementation of our method is available from Netlib. The implementation
allows preprocessing (which incurs a modest cost) to be factored out for multiple
runs of the same permutation, even if on different data. Data may be laid out in
any one of several ways: processor-major, processor-minor, or anything in between.
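
The abstract does not define a BMMC (bit-matrix-multiply/complement) permutation. Under the usual definition, which is an assumption here rather than something stated above, the element at n-bit source index x moves to target index y = Ax XOR c, where A is a nonsingular n-by-n bit matrix over GF(2) and c is an n-bit complement vector. The sketch below computes that mapping; because every processor can evaluate it locally, it suggests how source and target can agree on indices without transmitting them. The function name `bmmc_target` and the row-bitmask encoding of A are hypothetical, and no MPI communication is shown.

```python
def bmmc_target(x, rows, c=0):
    """Map source index x to its target index under y = A x XOR c over GF(2).

    rows[i] is a bitmask holding row i of the bit matrix A, so bit i of the
    product is the parity of (rows[i] AND x). A must be nonsingular for the
    mapping to be a permutation.
    """
    y = 0
    for i, row in enumerate(rows):
        parity = bin(row & x).count("1") & 1
        y |= parity << i
    return y ^ c


# Example: 3-bit bit reversal, a well-known special case (target bit i = source bit 2 - i).
bit_reversal = [0b100, 0b010, 0b001]
targets = [bmmc_target(x, bit_reversal) for x in range(8)]
print(targets)
```

Checking that the list of targets contains every index exactly once is a quick way to confirm that a chosen matrix is nonsingular for small n.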
The Complexity Of Clerkship Scheduling
Dartmouth Technical Report PCS-TR97-316
Jon Feldman
Date: January 1997
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/188/TR97-316.ps.Z (317KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/188/TR97-316.pdf (142KB)
Abstract:
Medical students must complete a clerkship program in their fourth year. Individual
students have preferences for the clerkships to which they are assigned. However,
individual hospitals also have capacities on how many students may be assigned to
each clerkship. The problem of scheduling medical students to clerkships is formalized.
The problem is then placed in a theoretical framework, and the most general case
of Clerkship Scheduling is proven NP-hard. A detailed approximation algorithm is
given, and an implementation of this algorithm is discussed and tested.
Admission Control Policies for Internet File Transfer Protocols
Dartmouth Technical Report PCS-TR97-315
Simon Holmes a Court
Date: January 1997
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/187/TR97-315.ps.Z (604KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/187/TR97-315.pdf (216KB)
Abstract:
Server congestion is a major cause of frustration with the Internet. It is not uncommon
for a server with a new release of popular software to be swamped by many times
more clients than it can possibly handle. Current Internet file transfer protocols,
namely FTP and HTTP, do not have any policy to regulate client admission. In this
thesis we are concerned with server admission policies that will improve clients'
experience with servers under heavy load. Using a purpose-built network simulator,
we compare the prevalent protocols with two new protocols that include policies
taken from processor scheduling.
By applying more intelligent client admission policies it is hoped that the quality
of service on the Internet can be improved.
KLZ: A Prototype X Protocol Compression System
Dartmouth Technical Report PCS-TR97-314
Ka-Tak Lo
Date: January 1997
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/186/TR97-314.ps.Z (411KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/186/TR97-314.pdf (139KB)
Abstract:
One of the most commonly used graphics protocols is the X Protocol, enabling programs
to display graphics images. When running the X Protocol over the network, a lot
of structured data (messages with fields) need to be transmitted. Delays can be
detected by human users when connected through a low-bandwidth network. The solution
is to compress the X protocol. XRemote, a network version of the X Protocol, uses
Dictionary-based compression. In XRemote, strings are recorded in the dictionary.
When a string repeats, its index in the dictionary is transmitted. Higher Bandwidth
X (HBX) uses statistical modeling techniques instead. A context model, which depends
on the nature of the field in a particular type of message and the frequencies of
the values of the field, is associated with each field. XRemote is much faster than
HBX, but HBX achieves better compression than XRemote. The KLZ system is developed
to take advantage of our knowledge about the fields in the XMotionNotify packet
(what X sends when the mouse moves) and fast Dictionary (LZW) compression. In essence,
KLZ reorders and rewrites fields in the XMotionNotify packet so that the fields
will be more easily compressed by the fast LZ coder. My experiments show that KLZ
compresses this packet nearly as well as HBX, and 5 times better than pure LZ. KLZ
is slightly faster than pure LZ, and 10 times faster than HBX. Since many modems
already implement LZ compression, KLZ could also be used to reorder data before
passing them to the modem with LZ compression for transmission. This reordering
would lead to vastly improved compression almost for free.
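
The dictionary scheme described above (record strings, then transmit the index of a string when it repeats) can be illustrated with a textbook LZW coder. This is a generic sketch, not the code used by XRemote or KLZ; the function names are hypothetical, codes are left as plain integers rather than packed into bits, and the field reordering that KLZ performs is not shown because the abstract does not specify it.

```python
def lzw_compress(data: bytes) -> list:
    """Return a list of dictionary indices for data (textbook LZW)."""
    table = {bytes([i]): i for i in range(256)}  # start with all single bytes
    out = []
    w = b""
    for b in data:
        wc = w + bytes([b])
        if wc in table:
            w = wc                       # keep extending the current match
        else:
            out.append(table[w])         # emit the index of the longest match
            table[wc] = len(table)       # record the new string
            w = bytes([b])
    if w:
        out.append(table[w])
    return out


def lzw_decompress(codes: list) -> bytes:
    """Invert lzw_compress by rebuilding the same dictionary on the fly."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    prev = table[codes[0]]
    out = [prev]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            entry = prev + prev[:1]      # string defined by the step that uses it
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = prev + entry[:1]
        prev = entry
    return b"".join(out)


codes = lzw_compress(b"ABABABA")
print(codes, lzw_decompress(codes))
```

Repetitive input yields fewer codes than input bytes, which is consistent with the abstract's point that arranging fields so that repeats line up helps a fast LZ coder.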
Self-Organizing File Cabinet
Dartmouth Technical Report PCS-TR97-313
Dawn Lawrie
Date: January 1997
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/185/TR97-313.ps.Z (496KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/185/TR97-313.pdf (147KB)
Abstract:
This thesis presents a self-organized file cabinet. This file cabinet uses electronic
information to augment the physical world. By using a scanner to transform documents
into electronic files, the self-organized file cabinet can index the documents on
visual and textual information. The self-organized file cabinet helps the user find
the documents at a later date. The focus of this thesis is on the design and evaluation
of the self-organized file cabinet. User studies show that this tool is natural
to use.
A Split-Phase Interface for Parallel File Systems
Dartmouth Technical Report PCS-TR97-312
Sanjay Khanna
David Kotz
Date: January 1997
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/184/TR97-312.ps.Z (49KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/184/TR97-312.pdf (157KB)
Abstract:
We describe the effects of a new user-level library for the Galley Parallel File
System. This library allows some pre-existing sequential programs to make use of
the Galley Parallel File System with minimal modification. It permits programs to
efficiently use the parallel file system because the user-level library groups accesses
together. We examine the performance of our library, and we show how code needs
to be modified to use the library.
On the Power of Multi-Objects
Dartmouth Technical Report PCS-TR97-311
Prasad Jayanti
Sanjay Khanna
Date: January 1997
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/183/TR97-311.pdf (815KB)
Abstract:
In the standard "single-object" model of shared-memory computing, it is assumed
that a process accesses at most one shared object in each of its steps. In this
paper, we consider a more powerful variant---the "multi-object" model---in which
each process may access *any* finite number of shared objects atomically in each
of its steps. We present results that relate the synchronization power of a type
in the multi-object model to its synchronization power in the single-object model.
Although the types fetch&add and swap have the same synchronization power in the
single-object model, Afek, Merritt, and Taubenfeld showed that their synchronization
powers differ in the multi-object model. We prove that this divergence phenomenon
is exhibited *only* by types at levels 1 and 2; all higher level types have
the same unbounded synchronization power in the multi-object model stated above.
This paper identifies all possible relationships between a type's synchronization
power in the single-object model and its synchronization power in the multi-object
model.
Multimedia Data Analysis using ImageTcl (Extended version)
Dartmouth Technical Report PCS-TR97-310
Charles B. Owen
Fillia Makedon
Date: January 2010
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/182/TR97-310.ps.Z (41KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/182/TR97-310.pdf (135KB)
Abstract:
ImageTcl is a new system which provides powerful Tcl/Tk based media scripting capabilities
similar to those of the ViewSystem and Rivl in a unique environment that allows
rapid prototyping and development of new components in the C++ language. Powerful
user tools automate the creation of new components as well as the addition of new
data types and file formats. Applications using ImageTcl at the Dartmouth Experimental
Visualization Laboratory (DEVLAB) include multiple stream media data analysis, automatic
image annotation, and image sequence motion analysis. ImageTcl combines the high
speed of compiled languages with the testing and parameterization advantages of
scripting languages.
ASML: Automatic Site Markup Language 1.03
Dartmouth Technical Report PCS-TR97-309
Charles B. Owen
Fillia Makedon
Glen Frank
Michael Kenyon
Date: January 1997
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/181/TR97-309.ps.Z (1252KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/181/TR97-309.pdf (582KB)
Abstract:
Creation of large and complex World Wide Web sites is hampered by the "page at a
time" approach of many tools and the programming knowledge and custom software development
required for automated solutions. This report describes the development of the Automatic
Site Markup Language (ASML). ASML is a new markup language designed to produce large,
complicated web sites which can include dynamic content.
ASML extends HTML with new, high-level features while still preserving complete
compatibility with common browser and server technologies. It has powerful indexing
and searching facilities, and enables the automatic translation of document formats.
Most importantly, ASML provides HTML-like features at the site level rather than
just the page level.
An Efficient Scheme for a Distributed Video Retrieval System for Remote Users
Dartmouth Technical Report PCS-TR97-308
Fillia Makedon
James Matthews
Charles B. Owen
Samuel A. Rebelsky
Date: January 1997
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/180/TR97-308.ps.Z (29KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/180/TR97-308.pdf (37KB)
Abstract:
The new era of digital video and multimedia technologies has created the potential
for large libraries of digital video. With this new technology come the challenges
of creating usable means by which such large and diverse depositories of digital
information (digital libraries) can be efficiently queried and accessed so that
(a) the response is fast, (b) the communication over the Internet is minimal and
(c) the retrieval is characterized by high precision and recall.
In this paper we discuss how existing digital video editing tools, together with
data compression techniques, can be combined to create a fast, accurate and cost
effective video retrieval system for remote users. The traditional approaches employed
in text databases, such as keyword searching and volume browsing, are inadequate
mechanisms for a video retrieval system for remote users because (a) they don't
apply to video at all, or (b) they are not practical due to the amounts of data
involved, or (c) they have insufficient resolution to be useful in a video archive.
New techniques must be developed that facilitate the query and selection of digital
video. This paper presents one such scheme.
Automatic Video Pause Detection Filter
Dartmouth Technical Report PCS-TR97-307
Xiaowen Liu
Charles B. Owen
Fillia Makedon
Date: January 1997
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/179/TR97-307.ps.Z (179KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/179/TR97-307.pdf (231KB)
Abstract:
Interest in multimedia research has increasingly focused on the development of
video indexing and content-based image retrieval techniques. In this report, we
propose several pause detection algorithms which, instead of searching for significant
visual transitions, detect significant pauses in video streams. The
algorithms were implemented using the ImageTcl toolkit developed at the
Dartmouth Experimental Visualization Laboratory. In addition to proposing and studying
the effectiveness of the pause detection algorithms, another major goal is
to incorporate our algorithms into ImageTcl and test the stability and applicability
of the ImageTcl environment. Preliminary experiments showed relatively good results
for our pause detection algorithms.
AGDB: A Debugger for Agent Tcl
Dartmouth Technical Report PCS-TR97-306
Melissa Hirschl
David Kotz
Date: January 1997
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/178/TR97-306.ps.Z (139KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/178/TR97-306.pdf (206KB)
Abstract:
The Agent Tcl language is an extension of Tcl/Tk that supports distributed programming
in the form of transportable agents. AGDB is a debugger for the Agent Tcl language.
AGDB mixes traditional and distributed debugging facilities. Traditional debugging
features include breakpoints (line-specific, conditional, and once-only), watch
conditions and variables, and interrupts. Distributed-debugging features address
issues inherent in distributed programming such as migration and communication.
These capabilities make debugging distributed programs difficult because they add
complexities like race conditions to the set of problems a program can encounter.
This paper discusses how AGDB uses distributed debugging features to debug agents.
The Self-Organizing Desk
Dartmouth Technical Report PCS-TR97-305
Daniela Rus
P. deSantis
Date: January 1997
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/177/TR97-305.ps.Z (962KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/177/TR97-305.pdf (624KB)
Abstract:
The self-organizing desk is a system that enhances a physical desk-top with electronic
information. It can remember, organize, update, and manipulate the information contained
in the documents on a desk. The system consists of a simple robot eye that can survey
the desk, a module for smart extraction of information from the images taken by
the robot, a module for representing this information in multiple views, and a module
that allows a user to interact with this information.
Automated Parallelization of Discrete State-space Generation
Dartmouth Technical Report PCS-TR97-304
David M. Nicol
Gianfranco F. Ciardo
Date: January 1997
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/176/TR97-304.ps.Z (164KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/176/TR97-304.pdf (273KB)
Abstract:
We consider the problem of generating a large state-space in a distributed fashion.
Unlike previously proposed solutions that partition the set of reachable states
according to a hashing function provided by the user, we explore heuristic methods
that completely automate the process. The first step is an initial random walk through
the state space to initialize a search tree, duplicated in each processor. Then,
the reachability graph is built in a distributed way, using the search tree to assign
each newly found state to classes assigned to the available processors. Furthermore,
we explore two remapping criteria that attempt to balance memory usage or future
workload, respectively. We show how the cost of computing the global snapshot required
for remapping will scale up for system sizes in the foreseeable future. An extensive
set of results is presented to support our conclusions that remapping is extremely
beneficial.
Multiprocessor Out-of-Core FFTs with Distributed Memory and Parallel Disks
Dartmouth Technical Report PCS-TR97-303
Thomas H. Cormen
Jake Wegmann
David M. Nicol
Date: January 1997
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/175/TR97-303.ps.Z (152KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/175/TR97-303.pdf (309KB)
Abstract:
This paper extends an earlier out-of-core Fast Fourier Transform (FFT) method for
a uniprocessor with the Parallel Disk Model (PDM) to use multiple processors. Four
out-of-core multiprocessor methods are examined. Operationally, these methods differ
in the size of "mini-butterfly" computed in memory and how the data are organized
on the disks and in the distributed memory of the multiprocessor. The methods also
perform differing amounts of I/O and communication. Two of them have the remarkable
property that even though they are computing the FFT on a multiprocessor, all interprocessor
communication occurs outside the mini-butterfly computations. Performance results
on a small workstation cluster indicate that except for unusual combinations of
problem size and memory size, the methods that do not perform interprocessor communication
during the mini-butterfly computations require approximately 86% of the time of
those that do. Moreover, the faster methods are much easier to implement.
Cross-input Amortization Captures the Diffuse Adversary
Dartmouth Technical Report PCS-TR96-302
Neal E. Young
Date: January 1996
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/174/TR96-302.ps.Z (61KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/174/TR96-302.pdf (156KB)
Abstract:
Koutsoupias and Papadimitriou recently raised the question of how well deterministic
on-line paging algorithms can do against a certain class of adversarially biased
random inputs. Such an input is given in an on-line fashion; the adversary determines
the next request probabilistically, subject to the constraint that no page may be
requested with probability more than a fixed $\epsilon>0$.
In this paper, we answer their question by estimating, within a factor of two, the
optimal competitive ratio of any deterministic on-line strategy against this adversary.
We further analyze randomized on-line strategies, obtaining upper and lower bounds
within a factor of two. These estimates reveal the qualitative changes as $\epsilon$
ranges continuously from 1 (the standard model) towards 0 (a severely handicapped
adversary).
The key to our upper bounds is a novel charging scheme that is appropriate for adversarially
biased random inputs. The scheme adjusts the costs of each input so that the expected
cost of a random input is unchanged, but working with adjusted costs, we can obtain
worst-case bounds on a per-input basis. This lets us use worst-case analysis techniques
while still thinking of some of the costs as expected costs.
High Quality Alias Free Image Rotation
Dartmouth Technical Report PCS-TR96-301
Charles B. Owen
Fillia Makedon
Date: January 1996
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/173/TR96-301.ps.Z (2125KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/173/TR96-301.pdf (496KB)
Abstract:
This paper presents new algorithms for the rotation of images. The primary design
criterion for these algorithms is very high quality. Common methods for image rotation,
including convolutional and separable approaches, are examined and shown to exhibit
significant high frequency aliasing problems. A new resampling filter design methodology
is presented which minimizes the problem for conventional convolution-based image
rotation. The paper also presents a new separable image rotation algorithm which
exhibits improved performance in terms of reduction in artifacts and an efficient
$O(N^{2} \log N)$ running time.
Galley: A New Parallel File System for Parallel Applications
Dartmouth Technical Report PCS-TR96-300
Nils Nieuwejaar
Date: January 1996
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/172/TR96-300.ps.Z (493KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/172/TR96-300.pdf (765KB)
Abstract:
Most current multiprocessor file systems are designed to use multiple disks in parallel,
using the high aggregate bandwidth to meet the growing I/O requirements of parallel
scientific applications. Most multiprocessor file systems provide applications with
a conventional Unix-like interface, allowing the application to access those multiple
disks transparently. This interface conceals the parallelism within the file system,
increasing the ease of programmability, but making it difficult or impossible for
sophisticated application and library programmers to use knowledge about their I/O
to exploit that parallelism. In addition to providing an insufficient interface,
most current multiprocessor file systems are optimized for a different workload
than they are being asked to support. In this work we examine current multiprocessor
file systems, as well as how those file systems are used by scientific applications.
Contrary to the expectations of the designers of current parallel file systems,
the workloads on those systems are dominated by requests to read and write small
pieces of data. Furthermore, rather than being accessed sequentially and contiguously,
as in uniprocessor and supercomputer workloads, files in multiprocessor file systems
are accessed in regular, structured, but non-contiguous patterns. Based on our observations
of multiprocessor workloads, we have designed Galley, a new parallel file system
that is intended to efficiently support realistic scientific multiprocessor workloads.
In this work, we introduce Galley and discuss its design and implementation. We
describe Galley's new three-dimensional file structure and discuss how that structure
can be used by parallel applications to achieve higher performance. We introduce
several new data-access interfaces, which allow applications to explicitly describe
the regular access patterns we found to be common in parallel file system workloads.
We show how these new interfaces allow parallel applications to achieve tremendous
increases in I/O performance. Finally, we discuss how Galley's new file structure
and data-access interfaces can be useful in practice.
A Critique of the Telecommunications Description Language (TeD)
Dartmouth Technical Report PCS-TR96-299
Brian J. Premore
David M. Nicol
Xiaowen Liu
Date: January 1996
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/171/TR96-299.ps.Z (69KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/171/TR96-299.pdf (227KB)
Abstract:
TeD is an object-oriented description language designed to facilitate the modeling
of large scale telecommunication networks, with simulation on parallel and distributed
platforms. TeD models are mapped to the Georgia Tech Time Warp engine (GTW) for
execution. In this paper we outline the features of TeD, pointing out its strengths
and identifying characteristics that gave us trouble as we used TeD to model detailed
networks. Our issues are motivated specifically by a model of TCP and a model of
multicast resource allocation. Our intention is to illustrate by example what TeD
can do, and characteristics that a potential TeD user should be aware of.
The Dark Side of Risk (What your mother never told you about Time Warp)
Dartmouth Technical Report PCS-TR96-298
David M. Nicol
Xiaowen Liu
Date: January 1996
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/170/TR96-298.ps.Z (77KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/170/TR96-298.pdf (197KB)
Abstract:
This paper is a reminder of the danger of allowing "risk" when synchronizing a
parallel discrete-event simulation: a simulation code that runs correctly on a serial
machine may, when run in parallel, fail catastrophically. This can happen when Time
Warp presents an "inconsistent" message to an LP, a message that makes absolutely
no sense given the LP's state. Failure may result if the simulation modeler did
not anticipate the possibility of this inconsistency. While the problem is not new,
there has been little discussion of how to deal with it; furthermore the problem
may not be evident to new users or potential users of parallel simulation. This
paper shows how the problem may occur, and the damage it may cause. We show how
one may eliminate inconsistencies due to lagging rollbacks and stale state, but
then show that so long as risk is allowed it is still possible for an LP to be placed
in a state that is inconsistent with model semantics, again making it vulnerable
to failure. We finally show how simulation code can be tested to ensure safe execution
under a risk-free protocol. Whether risky or risk-free, we conclude that under current
practice the development of correct and safe parallel simulation code is not transparent
to the modeler; certain protections must be included in model code or model testing
that are not rigorously necessary if the simulation were executed only serially.
Applications of Parallel I/O
Dartmouth Technical Report PCS-TR96-297
David Kotz
Date: January 1996
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/169/TR96-297.ps.Z (58KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/169/TR96-297.pdf (158KB)
Abstract:
Scientific applications are increasingly being implemented on massively parallel
supercomputers. Many of these applications have intense I/O demands, as well as
massive computational requirements. This paper is essentially an annotated bibliography
of papers and other sources of information about scientific applications using parallel
I/O. It will be updated periodically.
Tuning STARFISH
Dartmouth Technical Report PCS-TR96-296
David Kotz
Date: January 1996
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/168/TR96-296.ps.Z (145KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/168/TR96-296.pdf (279KB)
Abstract:
STARFISH is a parallel file-system simulator we built for our research into the concept
of disk-directed I/O. In this report, we detail steps taken to tune the file systems
supported by STARFISH, which include a traditional parallel file system (with caching)
and a disk-directed I/O system. In particular, we added support for two-phase I/O, adopted
smarter disk scheduling, increased the maximum number of outstanding requests that
a compute processor may make to each disk, and added gather/scatter block transfer.
We also present results of the experiments driving the tuning effort.
On the Existence of Schedules that are Near-Optimal for both Makespan and Total Weighted Completion time
Dartmouth Technical Report PCS-TR96-295
Clifford Stein
Joel Wein
Date: January 1996
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/167/TR96-295.ps.Z (56KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/167/TR96-295.pdf (173KB)
Abstract:
We give a simple proof that, for any instance of a very general class of scheduling
problems, there exists a schedule whose makespan is at most twice the optimum and
whose total weighted completion time is at most twice the optimum. We then refine
the analysis, yielding variants of this theorem with improved
constants, and give some algorithmic consequences of the technique.
Performing Out-of-Core FFTs on Parallel Disk Systems
Dartmouth Technical Report PCS-TR96-294
Thomas H. Cormen
David M. Nicol
Date: January 1997
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/166/TR96-294.ps.Z (119KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/166/TR96-294.pdf (289KB)
Abstract:
The Fast Fourier Transform (FFT) plays a key role in many areas of computational
science and engineering. Although most one-dimensional FFT problems can be solved
entirely in main memory, some important classes of applications require out-of-core
techniques. For these, use of parallel I/O systems can improve performance considerably.
This paper shows how to perform one-dimensional FFTs using a parallel disk system
with independent disk accesses. We present both analytical and experimental results
for performing out-of-core FFTs in two ways: using traditional virtual memory with
demand paging, and using a provably asymptotically optimal algorithm for the Parallel
Disk Model (PDM) of Vitter and Shriver. When run on a DEC 2100 server with a large
memory and eight parallel disks, the optimal algorithm for the PDM runs up to 144.7
times faster than in-core methods under demand paging. Moreover, even including
I/O costs, the normalized times for the optimal PDM algorithm are competitive with, or
better than, those for in-core methods even when they run entirely in memory.
Early Experiences in Evaluating the Parallel Disk Model with the ViC* Implementation
Dartmouth Technical Report PCS-TR96-293
Thomas H. Cormen
Melissa Hirschl
Date: January 1996
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/165/TR96-293.ps.Z (177KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/165/TR96-293.pdf (363KB)
Abstract:
Although several algorithms have been developed for the Parallel Disk Model (PDM),
few have been implemented. Consequently, little has been known about the accuracy
of the PDM in measuring I/O time and total time to perform an out-of-core computation.
This paper analyzes timing results on a uniprocessor with several disks for two
PDM algorithms, out-of-core radix sort and BMMC permutations, to determine the strengths
and weaknesses of the PDM. The results indicate the following. First, good PDM
algorithms are usually not I/O bound. Second, of the four PDM parameters, two (problem
size and memory size) are good indicators of I/O time and running time, but the
other two (block size and number of disks) are not. Third, because PDM algorithms
tend not to be I/O bound, asynchronous I/O effectively hides I/O times. The software
interface to the PDM is part of the ViC* run-time library. The interface is a set
of wrappers that are designed to be both efficient and portable across several parallel
file systems and target machines.
FFTs for the 2-Sphere-Improvements and Variations
Dartmouth Technical Report PCS-TR96-292
Dennis M. Healy
Daniel N. Rockmore
Sean S. B. Moore
Date: January 1996
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/164/TR96-292.pdf (2404KB)
Abstract:
Object Oriented Scenes for Virtual Light
Dartmouth Technical Report PCS-TR96-291
Jonathan A. Moore
Date: January 1995
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/163/TR96-291.ps.Z (75KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/163/TR96-291.pdf (139KB)
Abstract:
Ray tracing is one of many ways to use a computer to generate an image. Ray tracers
produce images by simulating light. Eliminating the details that might distract
one from the interesting parts of ray tracing algorithms was the purpose of my thesis
project. The software I have written can be divided into three parts: the virtual
frame buffer, the support classes and the ray tracing abstract base classes. The
virtual frame buffer class, vfb, provides a simple means of rendering and studying
the final image produced by a graphical algorithm. The support classes provide an
elegant notation for the equations involved in ray tracing. The ray tracing base
classes and associated classes provide an object-oriented structure for defining
the objects that make up a scene.
MRI On the Fly: Accelerating MRI Imaging Using LDA Classification with LDB Feature Extraction
Dartmouth Technical Report PCS-TR96-290
Y. Joy Ko
Michael B. Taylor
Date: January 1996
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/162/TR96-290.ps.Z (2334KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/162/TR96-290.pdf (2222KB)
Abstract:
To improve MRI acquisition time, we explored the uses of linear discriminant analysis
(LDA), and local discriminant bases (LDB) for the task of classifying MRI images
using a minimal set of signal acquisitions.
Our algorithm has both off-line and on-line components. The off-line component
uses the k-basis algorithm to partition a set of training images (all from a particular
region of a patient) into classes. For each class, we find a basis by applying the
best basis algorithm on the images in that class. We keep these bases to be used
by the on-line process. We then apply LDB to the training set with the class assignments,
determining the best discriminant basis for the set. We rank the basis coordinates
according to discriminating power and retain the top M coordinates, which index the
basis functions with the most discriminating capability, for the on-line algorithm.
Finally, we train LDA on these
transformed coordinates, producing a classifier for the images.
With the off-line requirements complete, we can take advantage of the simplicity
and speed of the on-line mechanism to acquire an image in a similar region of the
patient. We need acquire only the M important coordinates of the image in the discriminant
basis to create a "scout image." This image, which can be acquired quickly since
M is much smaller than the number of measurements needed to fill in the values
of the 256 by 256 pixels, is then sent through the map furnished by LDA which in
turn assigns a class to the image. Returning to the list of bases that we kept from
the k-basis algorithm, we find the optimal basis for the particular class at hand.
We then acquire the image using that optimal basis, omitting the coefficients with
the least truncation error. The complete image can then be quickly reconstructed
using the inverse wavelet packet transform.
The power of our algorithm is that the on-line task is fast and simple, while the
computational complexity lies mostly in the off-line task that needs to be done
only once for images in a certain region. In addition, our algorithm only makes
use of the flexibility of MRI hardware, so no modifications in hardware design are
needed.
Segmenting Workstation Screen Images
Dartmouth Technical Report PCS-TR96-289
Denis M. Serenyi
Date: January 1996
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/161/TR96-289.ps.Z (927KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/161/TR96-289.pdf (398KB)
Abstract:
None.
The Panda Array I/O Library on the Galley Parallel File System
Dartmouth Technical Report PCS-TR96-288
Joel T. Thomas
Date: January 1996
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/160/TR96-288.ps.Z (457KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/160/TR96-288.pdf (103KB)
Abstract:
The Panda Array I/O library, created at the University of Illinois, Urbana-Champaign,
was built especially to address the needs of high-performance scientific applications.
I/O has been one of the most frustrating bottlenecks to high performance for quite
some time, and the Panda project is an attempt to ameliorate this problem while
still providing the user with a simple, high-level interface. The Galley File System,
with its hierarchical structure of files and strided requests, is another attempt
at addressing the performance problem. My project was to redesign the Panda Array
library for use on the Galley file system. This project involved porting Panda's
three main functions: a checkpoint function for writing a large array periodically
for 'safekeeping,' a restart function that would allow a checkpointed file to be
read back in, and finally a timestep function that would allow the user to write
a group of large arrays several times in a sequence. Panda supports several different
distributions in both the compute-node memories and I/O-node disks.
We have found that the Galley File System provides a good environment on which to
build high-performance libraries, and that the mesh of Panda and Galley was a successful
combination.
Implementation and Analysis of Software Based Fault Isolation
Dartmouth Technical Report PCS-TR96-287
Scott M. Silver
Date: January 1996
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/159/TR96-287.ps.Z (132KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/159/TR96-287.pdf (98KB)
Abstract:
Extensible applications rely upon user-supplied, untrusted modules to extend their
functionality. To remain reliable, applications must isolate themselves from user
modules. One method places each user module in a separate address space (process),
which uses hardware virtual memory support to isolate the user process. Costly inter-process
communication, however, prohibits frequent communication between the application
and the untrusted module. We implemented and analyzed a software method for isolating
an application from user modules. The technique uses a single address space. We
provide a logical address space and per-module access to system resources for each
module. Our software technique is a two-step process. First, we augment a module's
code so that it cannot access any address outside of an assigned range. Second,
we prevent the module from using system calls to access resources outside of its
fault domain.
This method for software isolation has two particular advantages over processes.
First, for frequently communicating modules, we significantly reduce context switch
time. Thus, we demonstrate near-optimal inter-module communication using software
fault isolation. Second, our software-based techniques provide an efficient and
expedient solution in situations where only one address space is available (e.g.,
kernel, or a single-address-space operating system).
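As an illustration of the first step described in the abstract above (confining a module's memory accesses to an assigned range), the following Python sketch uses address masking, a technique common in the software fault isolation literature. It is not taken from the report: the segment size, the names sandbox_address and in_segment, and the use of Python integers in place of machine addresses are all assumptions made for illustration.

```python
# Hypothetical layout: each module owns one aligned region of 2**SEGMENT_BITS bytes.
SEGMENT_BITS = 16  # assumed segment size of 64 KiB


def sandbox_address(addr: int, segment_id: int, segment_bits: int = SEGMENT_BITS) -> int:
    """Force addr into the segment owned by segment_id.

    The low bits (the offset) are kept; the high bits are overwritten with the
    segment identifier, so the result always lies inside the assigned range.
    """
    offset_mask = (1 << segment_bits) - 1
    base = segment_id << segment_bits
    return base | (addr & offset_mask)


def in_segment(addr: int, segment_id: int, segment_bits: int = SEGMENT_BITS) -> bool:
    """Return True if addr lies inside the segment owned by segment_id."""
    return (addr >> segment_bits) == segment_id


if __name__ == "__main__":
    wild = 0xDEADBEEF                      # an address far outside segment 3
    safe = sandbox_address(wild, 3)
    print(hex(safe), in_segment(safe, 3))  # the masked address is inside segment 3
```

An address that is already inside the segment passes through unchanged, so well-behaved code is unaffected, while a stray address is redirected into the module's own region rather than trapping.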
The Galley Parallel File System
Dartmouth Technical Report PCS-TR96-286
Nils Nieuwejaar
David Kotz
Date: January 1996
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/158/TR96-286.ps.Z (160KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/158/TR96-286.pdf (274KB)
Abstract:
Most current multiprocessor file systems are designed to use multiple disks in parallel,
using the high aggregate bandwidth to meet the growing I/O requirements of parallel
scientific applications. Many multiprocessor file systems provide applications with
a conventional Unix-like interface, allowing the application to access multiple
disks transparently. This interface conceals the parallelism within the file system,
increasing the ease of programmability, but making it difficult or impossible for
sophisticated programmers and libraries to use knowledge about their I/O needs to
exploit that parallelism. In addition to providing an insufficient interface, most
current multiprocessor file systems are optimized for a different workload than
they are being asked to support. We introduce Galley, a new parallel file system
that is intended to efficiently support realistic scientific multiprocessor workloads.
We discuss Galley's file structure and application interface, as well as the performance
advantages offered by that interface.
Mobile agents for mobile computing
Dartmouth Technical Report PCS-TR96-285
Robert S. Gray
David Kotz
Saurab Nog
Daniela Rus
George Cybenko
Date: January 1996
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/157/TR96-285.ps.Z (94KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/157/TR96-285.pdf (212KB)
Abstract:
Mobile agents are programs that can move through a network under their own control,
migrating from host to host and interacting with other agents and resources on each.
We argue that these mobile, autonomous agents have the potential to provide a convenient,
efficient and robust programming paradigm for distributed applications, particularly
when partially connected computers are involved. Partially connected computers include
mobile computers such as laptops and personal digital assistants as well as modem-connected
home computers, all of which are often disconnected from the network. In this paper,
we describe the design and implementation of our mobile-agent system, Agent Tcl,
and the specific features that support mobile computers and disconnected operation.
These features include network-sensing tools and a docking system that allows an
agent to transparently move between mobile computers, regardless of when the computers
connect to the network.
DartFlow: A Workflow Management System on the Web using Transportable Agents
Dartmouth Technical Report PCS-TR96-283
Ting Cai
Peter A. Gloor
Saurab Nog
Date: January 1996
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/156/TR96-283.ps.Z (237KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/156/TR96-283.pdf (111KB)
Abstract:
Workflow management systems help streamline business processes and increase productivity.
This paper describes the design and implementation of the DartFlow workflow management
system. DartFlow uses Web-browser embedded Java applets as its front end and transportable
agents as the backbone. While Java applets provide a safe and platform-independent
GUI, the use of transportable agents makes DartFlow highly flexible and scalable.
We also describe a workflow application that exploits DartFlow's agent-based design.
Some Applications of Generalized FFTs
Dartmouth Technical Report PCS-TR96-282
Daniel N. Rockmore
Date: January 1996
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/155/TR96-282.pdf (1732KB)
Abstract:
Generalized FFTs are efficient algorithms for computing a Fourier transform of a
function defined on a finite group, or a bandlimited function defined on a compact
group. The development of such algorithms has been accompanied and motivated by
a growing number of both potential and realized applications. This paper will attempt
to survey some of these applications. Appendices include some more detailed examples.
Generalized FFTs - A Survey of Some Recent Results
Dartmouth Technical Report PCS-TR96-281
David K. Maslen
Daniel N. Rockmore
Date: January 1996
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/154/TR96-281.ps.Z (231KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/154/TR96-281.pdf (554KB)
Abstract:
In this paper we survey some recent work directed towards generalizing the fast Fourier
transform (FFT). We work primarily from the point of view of group representation
theory. In this setting the classical FFT can be viewed as a family of efficient
algorithms for computing the Fourier transform of either a function defined on a
finite abelian group, or a bandlimited function on a compact abelian group. We discuss
generalizations of the FFT to arbitrary finite groups and compact Lie groups.
An RPC Mechanism for Transportable Agents
Dartmouth Technical Report PCS-TR96-280
Saurab Nog
Sumit Chawla
David Kotz
Date: January 1996
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/153/TR96-280.ps.Z (85KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/153/TR96-280.pdf (230KB)
Abstract:
Transportable agents are autonomous programs that migrate from machine to machine,
performing complex processing at each step to satisfy client requests. As part of
their duties agents often need to communicate with other agents. We propose to use
remote procedure call (RPC) along with a flexible interface definition language
(IDL), to add structure to inter-agent communication. The real power of our Agent
RPC comes from a client-server binding mechanism based on flexible IDL matching
and from support for multiple simultaneous bindings. Our agents are programmed in
Agent Tcl; we describe how the Tcl implementation made RPC particularly easy to
implement. Finally, although our RPC is designed for Agent Tcl programs, the concepts
would also work for standard Tcl programs.
Fast compression of transportable Tcl scripts
Dartmouth Technical Report PCS-TR96-279
Robert S. Gray
Date: January 1996
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/152/TR96-279.ps.Z (46KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/152/TR96-279.pdf (127KB)
Abstract:
An information agent is charged with the task of searching a collection of electronic
resources for information that is relevant to the user's current needs. These resources
are often distributed across a network and can contain tremendous quantities of
data. One of the paradigms that has been suggested for allowing efficient access
to such resources is transportable agents -- the agent is sent to the machine that
maintains the information resource; the agent executes on this remote machine and
then returns its results to the local machine. We have implemented a transportable
agent system that uses the Tool Command Language (Tcl) as the agent language.
Each Tcl script can suspend its execution at an arbitrary point, transport itself
to another machine and resume execution on the new machine. The execution state
of the script -- which includes the commands that have not been executed -- must
be transmitted to the new machine. Although the execution state tends to be small,
there will be a large number of agents moving across the network in a large-scale
system. Thus it is desirable to compress the execution state as much as possible.
Furthermore any compression scheme must be fast so that it does not become a bottleneck
between the transportable agent system and the network routines. In this paper we
explore several fast compression methods.
Transportable Information Agents
Dartmouth Technical Report PCS-TR96-278
Robert S. Gray
Daniela Rus
David Kotz
Date: January 1996
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/151/TR96-278.ps.Z (182KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/151/TR96-278.pdf (214KB)
Abstract:
We have designed and implemented autonomous software agents. Autonomous software
agents navigate independently through a heterogeneous network. They are capable
of sensing the network configuration, monitoring software conditions, and interacting
with other agents. Autonomous agents are implemented as transportable programs,
i.e., programs that are capable of suspending execution, moving to a different machine,
and starting from where they left off. We illustrate the intelligent behavior of
autonomous agents in the context of distributed information-gathering tasks.
Compositional Reasoning is not possible in Determining the Solvability of Consensus
Dartmouth Technical Report PCS-TR96-277
Prasad Jayanti
Date: January 1996
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/150/TR96-277.pdf (503KB)
Abstract:
Consensus, which requires processes with different input values to eventually agree
on one of these values, is a fundamental problem in fault-tolerant computing. We
study this problem in the context of asynchronous shared-memory systems. In our
model, shared-memory consists of a sequence of cells and supports a specific set
of operations. Prior research on consensus focussed on its solvability in shared-memories
supporting specific operations. In this paper, we investigate the following general
question: Let OP1 and OP2 be any two sets of operations such that each set includes
read and write operations. Suppose there is no consensus protocol for N processes
in a shared-memory that supports only operations in OP1 and in a shared-memory that
supports only operations in OP2. Does it follow that there is no consensus protocol
for N processes in a shared-memory that supports all operations in OP1 and all operations
in OP2? This question is in the same spirit as the robustness question, but there
are significant differences, both conceptually and in the models of shared-memory
for which the two questions are studied. For deterministic types, the robustness
question has been known to have a positive answer. In contrast, we prove that the
answer to the question posed above is negative even if operations are deterministic.
A Fast Parallel Implementation of the Wavelet Packet Best Basis Algorithm on the MP-2 for Real-Time MRI
Dartmouth Technical Report PCS-TR96-271
Sumit Chawla
Dennis M. Healy
Date: January 1996
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/149/TR96-271.ps.Z (381KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/149/TR96-271.pdf (233KB)
Abstract:
Adaptive signal representations such as those determined by best-basis type algorithms
have found extensive application in image processing, although their use in real-time
applications may be limited by the complexity of the algorithm. In contrast to the
wavelet transform which can be computed in O(n) time, the full wavelet packet expansion
required for the standard best basis search takes O(n log n) time to compute. In
the parallel world, however, both transforms take O(log n) to compute when the number
of processors equals the number of data elements, making the wavelet packet expansion
attractive to implement. This note describes near real-time performance obtained
with a parallel implementation of best basis algorithms for Wavelet Packet bases.
The platform for our implementation is a DECmpp 12000/Sx 2000, a parallel machine
identical to the MasPar MP-2. The DECmpp is a single instruction, multiple data
(SIMD) system; such systems support a data parallel programming model, a model well
suited to the task at hand. We have implemented the 1D and the 2D WPT on this machine
and our results show a significant speedup over the sequential counterparts. In
the 1D case we almost attain the theoretical speedup, while in the 2D case we increase
execution speed by about two orders of magnitude. The current implementation of
the 1D transform is limited to signals of length 2048, and the 2D transform is limited
to images of size 32x32, 64x64, and 128x128. We are currently working on extending
our transform to handle signals and images of larger size.
A Queuing Analysis of Bandwidth Allocation Schemes for Compressed Video
Dartmouth Technical Report PCS-TR96-257
Saurab Nog
Carl J. Beckmann
Date: January 1996
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/148/TR96-257.ps.Z (133KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/148/TR96-257.pdf (328KB)
Abstract:
Video and audio compression techniques allow continuous media streams to be transmitted
at bit rates that are a function of the delivered quality of service. Digital networks
will be increasingly used for the transmission of such continuous media streams.
This paper describes an admission control policy in which the quality of service
is negotiated at stream initiation, and is a function of both the desired quality
of service and the available bandwidth resources. The advantage of this approach
is the ability to robustly service large numbers of users, while providing increased
quality of service during low usage periods. Several simple algorithms for implementing
this policy are described and evaluated using queuing model analysis applied to
video-on-demand. The queuing model results are compared with simulation results
to validate their accuracy.
Information retrieval, information structure, and information agents
Dartmouth Technical Report PCS-TR96-255
Daniela Rus
Devika Subramanian
Date: January 1996
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/147/TR96-255.pdf (2869KB)
Abstract:
This paper presents a customizable architecture for software agents that capture
and access information in large, heterogeneous, distributed electronic repositories.
The key idea is to exploit underlying structure at various levels of granularity
to build high-level indices with task-specific interpretations. Information agents
construct such indices and are configured as a network of reusable modules called
structure detectors and segmenters. We illustrate our architecture with the design
and implementation of smart information filters in two contexts: retrieving stock
market data from Internet newsgroups, and retrieving technical reports from Internet
ftp sites.
A Performance Comparison of TCP/IP and MPI on FDDI, Fast Ethernet, and Ethernet
Dartmouth Technical Report PCS-TR95-273
Saurab Nog
David Kotz
Date: January 1996
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/146/TR95-273.ps.Z (116KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/146/TR95-273.pdf (304KB)
Abstract:
Communication is a very important factor affecting distributed applications. Getting
a close handle on network performance (both bandwidth and latency) is thus crucial
to understanding overall application performance. We benchmarked some of the metrics
of network performance using two sets of experiments, namely roundtrip and datahose.
The tests were designed to measure a combination of network latency, bandwidth,
and contention. We repeated the tests for two protocols (TCP/IP and MPI) and three
networks (100 Mbit FDDI (Fiber Distributed Data Interface), 100 Mbit Fast Ethernet,
and 10 Mbit Ethernet). The performance results provided interesting insights into
the behaviour of these networks under different load conditions and the software
overheads associated with an MPI implementation (MPICH). This document presents
details about the experiments, their results, and our analysis of the performance.
Revised on 1/8/96 to emphasize our use of a particular MPI implementation, MPICH.
Finding Real-Valued Single-Source Shortest Paths in o(n^3) Expected Time
Dartmouth Technical Report PCS-TR95-272
Stavros G. Kolliopoulos
Clifford Stein
Date: January 1995
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/145/TR95-272.pdf (798KB)
Abstract:
Given an $n$-vertex directed network $G$ with real costs on the edges and a designated
source vertex $s$, we give a new algorithm to compute shortest paths from $s$. Our
algorithm is a simple deterministic one with $O(n^2 \log n)$ expected running time
over a large class of input distributions.
The shortest path problem is an old and fundamental problem with a host of applications.
Our algorithm is the first strongly-polynomial algorithm in over 35 years to improve
upon some aspect of the running time of the celebrated Bellman-Ford algorithm for
arbitrary networks, with any type of cost assignments.
A Fast Parallel Implementation of the Wavelet Packet Best Basis Algorithm on the MP-2 for Real-Time MRI
Dartmouth Technical Report PCS-TR95-271
Sumit Chawla
Dennis M. Healy
Date: January 1996
Abstract:
This technical report is equivalent to technical report PCS-TR96-271, issued in October
1996.
Interfaces for Disk-Directed I/O
Dartmouth Technical Report PCS-TR95-270
David Kotz
Date: January 1995
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/143/TR95-270.ps.Z (43KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/143/TR95-270.pdf (142KB)
Abstract:
In other papers I propose the idea of disk-directed I/O for multiprocessor file systems.
Those papers focus on the performance advantages and capabilities of disk-directed
I/O, but say little about the application-programmer's interface or about the interface
between the compute processors and I/O processors. In this short note I discuss
the requirements for these interfaces, and look at many existing interfaces for
parallel file systems. I conclude that many of the existing interfaces could be
adapted for use in a disk-directed I/O system.
An API for Choreographing Data Accesses
Dartmouth Technical Report PCS-TR95-267
Elizabeth A. M. Shriver
Leonard F. Wisniewski
Date: January 1995
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/142/TR95-267.ps.Z (120KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/142/TR95-267.pdf (279KB)
Abstract:
Current APIs for multiprocessor multi-disk file systems are not easy to use in developing
out-of-core algorithms that choreograph parallel data accesses. Consequently, the
efficiency of these algorithms is hard to achieve in practice. We address this deficiency
by specifying an API that includes data-access primitives for data choreography.
With our API, the programmer can easily access specific blocks from each disk in
a single operation, thereby fully utilizing the parallelism of the underlying storage
system. Our API supports the development of libraries of commonly-used higher-level
routines such as matrix-matrix addition, matrix-matrix multiplication, and BMMC
(bit-matrix-multiply/complement) permutations. We illustrate our API in implementations
of these three high-level routines to demonstrate how easy it is to use.
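For readers unfamiliar with the BMMC permutations mentioned in the abstract above, a minimal Python sketch follows. It relies on the standard definition: a BMMC permutation maps each source address x, viewed as a bit vector, to the target address A x XOR c, where A is a nonsingular bit matrix over GF(2) and c is a complement vector. The function names, the encoding of matrix rows as integer bitmasks, and the 3-bit example matrix are assumptions made for illustration and are not the API proposed in the report.

```python
def parity(value: int) -> int:
    """Return 1 if value has an odd number of set bits, else 0."""
    return bin(value).count("1") & 1


def bmmc_target(x: int, rows: list[int], complement: int) -> int:
    """Compute y = A x XOR c over GF(2).

    rows[i] is row i of A encoded as a bitmask, so bit i of the product is the
    parity of (rows[i] AND x). A must be nonsingular for the map to be a
    permutation; this sketch does not check that.
    """
    y = 0
    for i, row in enumerate(rows):
        y |= parity(row & x) << i
    return y ^ complement


if __name__ == "__main__":
    # Assumed 3-bit example: A is the Gray-code matrix (y_i = x_i XOR x_{i+1}).
    rows = [0b011, 0b110, 0b100]
    complement = 0b001
    targets = [bmmc_target(x, rows, complement) for x in range(8)]
    print(targets)
    # Every target address appears exactly once, so the map is a permutation.
    print(sorted(targets) == list(range(8)))
```

Encoding each matrix row as a bitmask keeps the matrix-vector product to one AND and one parity per output bit, which is convenient for small illustrative cases.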
Complexity Analysis of Two Permutations Used by Fast Cosine Transform Algorithms
Dartmouth Technical Report PCS-TR95-266
Sean S. B. Moore
Leonard F. Wisniewski
Date: January 1995
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/141/TR95-266.ps.Z (98KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/141/TR95-266.pdf (272KB)
Abstract:
Recently developed fast cosine transform (FCT) algorithms require fewer operations
than any other known general algorithm. Similar to related fast transform algorithms
(e.g., the FFT), these algorithms permute the data before, during, or after the
computation of the transform. The choice of this permutation may be an important
consideration in reducing the complexity of the permutation algorithm. In this paper,
we derive the complexity to generate the permutation mappings used in these FCT
algorithms for power-of-2 data sets by representing them as linear index transformations
and translating them into combinational circuits. Moreover, we show that one of
these permutations not only allows efficient implementation, but is also self-invertible,
i.e., we can use the same circuit to generate the permutation mapping for both the
fast cosine transform and its inverse, like the bit-reversal permutation used by
FFT algorithms. These results may be useful to designers of low-level algorithms
for implementing fast cosine transforms.
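The self-invertibility that the abstract attributes to the FFT's bit-reversal permutation can be illustrated with a minimal Python sketch. The function names are hypothetical and the code is not taken from the report; it covers only bit reversal, not the FCT permutations the report analyzes.

```python
def bit_reverse(index: int, bits: int) -> int:
    """Reverse the low `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # shift in the lowest remaining bit
        index >>= 1
    return result


def bit_reversal_permutation(bits: int) -> list[int]:
    """Target position of every index in a data set of size 2**bits."""
    return [bit_reverse(i, bits) for i in range(1 << bits)]


perm = bit_reversal_permutation(3)
# Applying the mapping twice returns every index to its starting position,
# so the same routine serves the forward and the inverse permutation.
is_involution = all(perm[perm[i]] == i for i in range(len(perm)))
```

Because the mapping is its own inverse, a single circuit (or routine) suffices for both directions, which is the property the abstract highlights.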
Structured Permuting in Place on Parallel Disk Systems
Dartmouth Technical Report PCS-TR95-265
Leonard F. Wisniewski
Date: January 1995
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/140/TR95-265.ps.Z (146KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/140/TR95-265.pdf (288KB)
Abstract:
The ability to perform permutations of large data sets in place reduces the amount
of necessary available disk storage. The simplest way to perform a permutation often
is to read the records of a data set from a source portion of data storage, permute
them in memory, and write them to a separate target portion of the same size. It
can be quite expensive, however, to provide disk storage that is twice the size
of very large data sets. Permuting in place reduces the expense by using only a
small amount of extra disk storage beyond the size of the data set. This paper features
in-place algorithms for commonly used structured permutations. We have developed
an asymptotically optimal algorithm for performing BMMC (bit-matrix-multiply/complement)
permutations in place that requires at most $\frac{2N}{BD}\left( 2\left\lceil \frac{\operatorname{rank}\gamma}{\lg (M/B)} \right\rceil + \frac{7}{2}\right)$
parallel disk accesses, as long as $M \geq 2BD$, where
$N$ is the number of records in the data set, $M$ is the number of records that
can fit in memory, $D$ is the number of disks, $B$ is the number of records in a
block, and $\gamma$ is the lower left $\lg (N/B) \times \lg B$ submatrix of the characteristic
matrix for the permutation. This algorithm uses $N+M$ records of disk storage and
requires only a constant factor more parallel disk accesses and insignificant additional
computation than a previously published asymptotically optimal algorithm that uses
$2N$ records of disk storage. We also give algorithms to perform mesh and torus
permutations on a $d$-dimensional mesh. The in-place algorithm for mesh permutations
requires at most $3\lceil N/BD \rceil$ parallel I/Os and the in-place algorithm for torus
permutations uses at most $4dN/BD$ parallel I/Os. The algorithms for mesh and torus
permutations require no extra disk space as long as the memory size $M$ is at least $3BD$.
The torus algorithm improves upon the previous best algorithm in terms of both time
and space.
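As a rough illustration of what a BMMC permutation does, the Python sketch below assumes the usual definition: a source address x, viewed as a bit vector, maps to the target address y = Ax XOR c over GF(2), where A is a nonsingular characteristic matrix and c is a complement vector. The function names and the bitmask encoding of A are assumptions made for this example. The sketch shows only the address mapping and the access bound quoted in the abstract; it does not implement the in-place disk algorithm.

```python
import math


def bmmc_target(x: int, rows: list[int], c: int) -> int:
    """Map source address x to y = A x XOR c over GF(2).

    rows[i] packs row i of the characteristic matrix A as a bitmask
    (bit j of rows[i] is the entry A[i][j]); c is the complement vector.
    """
    y = 0
    for i, row in enumerate(rows):
        parity = bin(row & x).count("1") & 1  # dot product over GF(2)
        y |= parity << i
    return y ^ c


def bmmc_permutation(rows: list[int], c: int) -> list[int]:
    """Target address for every source address 0 .. 2**n - 1."""
    return [bmmc_target(x, rows, c) for x in range(1 << len(rows))]


def in_place_access_bound(N: int, B: int, D: int, M: int, rank_gamma: int) -> float:
    """Upper bound on parallel disk accesses as stated in the abstract (needs M >= 2BD)."""
    if M < 2 * B * D:
        raise ValueError("the bound is stated only for M >= 2BD")
    passes = 2 * math.ceil(rank_gamma / math.log2(M / B)) + 7 / 2
    return (2 * N / (B * D)) * passes


# An antidiagonal A with c = 0 yields the bit-reversal permutation on 3-bit addresses.
reversal = bmmc_permutation([0b100, 0b010, 0b001], 0)
# The identity A with c = 0b111 complements every address bit.
complement = bmmc_permutation([0b001, 0b010, 0b100], 0b111)
```

For example, with N = 2^20, B = 2^4, D = 2^2, M = 2^10 and a rank of 3, the bound evaluates to 32768 * 5.5 = 180224 parallel disk accesses.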
Process Migration for Heterogeneous Distributed Systems
Dartmouth Technical Report PCS-TR95-264
Matt Bishop
Mark Valence
Leonard F. Wisniewski
Date: January 1995
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/139/TR95-264.ps.Z (100KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/139/TR95-264.pdf (273KB)
Abstract:
The policies and mechanisms for migrating processes in a distributed system become
more complicated in a heterogeneous environment, where the hosts may differ in their
architecture and operating systems. These distributed systems include a large quantity
and great diversity of resources which may not be fully utilized without the means
to migrate processes to the idle resources. In this paper, we present a graph model
for single process migration which can be used for load balancing as well as other
non-traditional scenarios such as migration during the graceful degradation of a
host. The graph model provides the basis for a layered approach to implementing
the mechanisms for process migration in a Heterogeneous Migration Facility (HMF).
HMF provides the user with a library to automatically migrate processes and checkpoint
data.
File-Access Characteristics of Parallel Scientific Workloads
Dartmouth Technical Report PCS-TR95-263
Nils Nieuwejaar
David Kotz
Apratim Purakayastha
Carla Schlatter Ellis
Michael Best
Date: January 1995
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/138/TR95-263.ps.Z (178KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/138/TR95-263.pdf (273KB)
Abstract:
Phenomenal improvements in the computational performance of multiprocessors have
not been matched by comparable gains in I/O system performance. This imbalance has
resulted in I/O becoming a significant bottleneck for many scientific applications.
One key to overcoming this bottleneck is improving the performance of parallel file
systems.
The design of a high-performance parallel file system requires a comprehensive understanding
of the expected workload. Unfortunately, until recently, no general workload studies
of parallel file systems have been conducted. The goal of the CHARISMA project was
to remedy this problem by characterizing the behavior of several production workloads,
on different machines, at the level of individual reads and writes. The first set
of results from the CHARISMA project describes the workloads observed on an Intel
iPSC/860 and a Thinking Machines CM-5. This paper is intended to compare and contrast
these two workloads for an understanding of their essential similarities and differences,
isolating common trends and platform-dependent variances. Using this comparison,
we are able to gain more insight into the general principles that should guide parallel
file-system design.
A 2-2/3 Approximation for the Shortest Superstring Problem
Dartmouth Technical Report PCS-TR95-262
Chris Armen
Clifford Stein
Date: January 1995
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/137/TR95-262.ps.Z (137KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/137/TR95-262.pdf (302KB)
Abstract:
Given a collection of strings S={s_1, ..., s_n} over an alphabet Sigma, a superstring
alpha of S is a string containing each s_i as a substring; that is, for each i,
1<=i<=n, alpha contains a block of |s_i| consecutive characters that match s_i exactly.
The shortest superstring problem is the problem of finding a superstring alpha of
minimum length.
The shortest superstring problem has applications in both data compression and
computational biology. In data compression, the problem is a part of a general model
of string compression proposed by Gallant, Maier and Storer (JCSS '80). Much of
the recent interest in the problem is due to its application to DNA sequence assembly.
The problem has been shown to be NP-hard; in fact, it was shown by Blum et al. (JACM
'94) to be MAX SNP-hard. The first O(1)-approximation was also due to Blum et al.,
who gave an algorithm that always returns a superstring no more than 3 times the
length of an optimal solution. Several researchers have published results that improve
on the approximation ratio; of these, the best previous result is our algorithm
ShortString, which achieves a 2 3/4-approximation (WADS '95).
We present our new algorithm, G-ShortString, which achieves a ratio of 2 2/3. It
generalizes the ShortString algorithm, but the analysis differs substantially from
that of ShortString. Our previous work identified classes of strings that have a
nested periodic structure, and which must be present in the worst case for our algorithms.
We introduced machinery to describe these strings and proved strong structural properties
about them. In this paper we extend this study to strings that exhibit a more relaxed
form of the same structure, and we use this understanding to obtain our improved
result.
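The superstring definition in the abstract can be made concrete with a short Python sketch. The greedy merge below is a well-known baseline heuristic swapped in for illustration only; it is not ShortString or G-ShortString, and all function names are hypothetical.

```python
def is_superstring(alpha: str, strings: list[str]) -> bool:
    """True if alpha contains every s_i as a contiguous substring."""
    return all(s in alpha for s in strings)


def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_superstring(strings: list[str]) -> str:
    """Repeatedly merge the pair of strings with the largest overlap."""
    unique = set(strings)
    # Strings contained in another string never need their own merge step.
    pool = sorted(s for s in unique if not any(s != t and s in t for t in unique))
    while len(pool) > 1:
        best_k, best_i, best_j = -1, 0, 1
        for i, a in enumerate(pool):
            for j, b in enumerate(pool):
                if i != j and overlap(a, b) > best_k:
                    best_k, best_i, best_j = overlap(a, b), i, j
        merged = pool[best_i] + pool[best_j][best_k:]
        pool = [s for idx, s in enumerate(pool) if idx not in (best_i, best_j)]
        pool.append(merged)
    return pool[0] if pool else ""


fragments = ["abc", "bcd", "cde"]
candidate = greedy_superstring(fragments)
```

On this small input the greedy merge produces a superstring of length 5, which is optimal here; in general the heuristic offers no such guarantee, which is why approximation ratios like those discussed in the abstract matter.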
Ph.D. Thesis Proposal: Transportable Agents
Dartmouth Technical Report PCS-TR95-261
Robert S. Gray
Date: January 1995
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/136/TR95-261.ps.Z (170KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/136/TR95-261.pdf (366KB)
Abstract:
One of the paradigms that has been suggested for allowing efficient access to remote
resources is transportable agents. A transportable agent is a named program that
can migrate from machine to machine in a heterogeneous network. The program chooses
when and where to migrate. It can suspend its execution at an arbitrary point, transport
to another machine and resume execution on the new machine. Transportable agents
have several advantages over the traditional client/server model. Transportable
agents consume less network bandwidth and do not require a connection between communicating
machines -- this is attractive in all networks and particularly attractive in wireless
networks. Transportable agents are a convenient paradigm for distributed computing
since they hide the communication channels but not the location of the computation.
Transportable agents allow clients and servers to program each other. However, transportable
agents pose numerous challenges such as security, privacy and efficiency. Existing
transportable agent systems do not meet all of these challenges. In addition, there
has been no formal characterization of the performance of transportable agents.
This thesis addresses these weaknesses. The thesis has two parts -- (1) formally characterize
the performance of transportable agents through mathematical analysis and network
simulation and (2) implement a complete transportable agent system.
Simulation of a Video-on-Demand System
Dartmouth Technical Report PCS-TR95-260
Song Bac Toh
Date: January 1995
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/135/TR95-260.ps.Z (494KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/135/TR95-260.pdf (122KB)
Abstract:
This paper presents a simulation study of a video-on-demand system. The focus of
the study is the effectiveness of different caching strategies on a video-on-demand
system with two levels of cache, RAM and disks, in front of a tape library. Using
an event-driven simulator, I show that caching was helpful in increasing the service
capacity of the system. On-demand caching showed its advantages especially when
the requests were clustered around a few popular titles (in other words, there was
temporal locality).
A Multiple Discrete Pass Algorithm on a DEC Alpha 2100
Dartmouth Technical Report PCS-TR95-259
Scott R. Cushman
Date: January 1995
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/134/TR95-259.ps.Z (93KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/134/TR95-259.pdf (218KB)
Abstract:
TIAS: A Transportable Intelligent Agent System
Dartmouth Technical Report PCS-TR95-258
Kenneth Harker
Date: January 1995
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/133/TR95-258.ps.Z (43KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/133/TR95-258.pdf (70KB)
Abstract:
In recent years, there has been an explosive growth in the amount of information
available to our society. In particular, the amount of information available on-line
through vast networks like the global Internet has been growing at a staggering
rate. This growth rate has by far exceeded the rate of growth in network speeds,
as has the number of individuals and organizations seeking access to this information.
There is thus a motivation to find abstract methods of manipulating this on-line
data in ways that both serve the needs of end users efficiently and use network
resources intelligently. In lieu of a traditional client-server model of information
processing, which is both inflexible and potentially very inefficient, a Transportable
Intelligent Agent system has the potential to achieve a more efficient and flexible
network system. An intelligent agent is a program that models the information space
for a user, and allows the user to specify how the information is to be processed.
A transportable agent can suspend its execution, transport itself to a new location
on a network, and resume execution at the new location. This is a particularly attractive
model for both wireless and dialup networks where a user might not be able to maintain
a permanent network connection, as well as for situations where the amount of information
to be processed is large relative to the network bandwidth. Preliminary work in
the field has shown that such agent systems are possible and deserve further study.
This thesis describes a prototype transportable intelligent agent system that extends
work already done in the field. Agents are written in a modified version of the
Tcl programming language and transported using TCP/IP connections. Several simple
examples demonstrate the properties of the system.
Issues and Obstacles with Multimedia Authoring (renumbered)
Dartmouth Technical Report PCS-TR95-256
Fillia Makedon
Samuel A. Rebelsky
Matthew Cheyney
Charles B. Owen
Peter A. Gloor
Date: January 1995
Abstract:
Expanding the Potential for Disk-Directed I/O
Dartmouth Technical Report PCS-TR95-254
David Kotz
Date: January 1995
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/131/TR95-254.ps.Z (88KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/131/TR95-254.pdf (166KB)
Abstract:
As parallel computers are increasingly used to run scientific applications with large
data sets, and as processor speeds continue to increase, it becomes more important
to provide fast, effective parallel file systems for data storage and for temporary
files. In an earlier work we demonstrated that a technique we call disk-directed
I/O has the potential to provide consistent high performance for large, collective,
structured I/O requests. In this paper we expand on this potential by demonstrating
the ability of a disk-directed I/O system to read irregular subsets of data from
a file, and to filter and distribute incoming data according to data-dependent functions.
Low-level Interfaces for High-level Parallel I/O
Dartmouth Technical Report PCS-TR95-253
Nils Nieuwejaar
David Kotz
Date: January 1995
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/130/TR95-253.ps.Z (95KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/130/TR95-253.pdf (212KB)
Abstract:
As the I/O needs of parallel scientific applications increase, file systems for multiprocessors
are being designed to provide applications with parallel access to multiple disks.
Many parallel file systems present applications with a conventional Unix-like interface
that allows the application to access multiple disks transparently. By tracing all
the activity of a parallel file system in a production, scientific computing environment,
we show that many applications exhibit highly regular, but non-consecutive I/O access
patterns. Since the conventional interface does not provide an efficient method
of describing these patterns, we present three extensions to the interface that
support strided, nested-strided, and nested-batched I/O requests. We show how these
extensions can be used to express common access patterns.
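To make the idea of regular but non-consecutive access concrete, the Python sketch below expands a strided and a nested-strided request into the list of (offset, length) pairs it would touch. The function names and parameter layout are assumptions for illustration; they are not the interface proposed in the report.

```python
def strided_request(start: int, record_size: int, stride: int, count: int) -> list[tuple[int, int]]:
    """Expand one strided request into (offset, length) pairs.

    Reads `count` records of `record_size` bytes whose starting offsets
    are `stride` bytes apart, beginning at `start`.
    """
    return [(start + i * stride, record_size) for i in range(count)]


def nested_strided_request(start: int, record_size: int,
                           levels: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Expand a nested-strided request.

    `levels` lists (stride, count) pairs from the outermost level to the
    innermost; each level repeats the pattern beneath it.
    """
    offsets = [start]
    for stride, count in levels:
        offsets = [base + i * stride for base in offsets for i in range(count)]
    return [(offset, record_size) for offset in offsets]


# Three 8-byte records, 32 bytes apart.
simple = strided_request(0, 8, 32, 3)
# The same pattern repeated twice, 1024 bytes apart.
nested = nested_strided_request(0, 8, [(1024, 2), (32, 3)])
```

With a conventional read/seek interface the nested pattern above would take six separate requests; describing it in one call is what lets a file system see and optimize the whole access pattern.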
Content-based image retrieval: color and edges
Dartmouth Technical Report PCS-TR95-252
Robert S. Gray
Date: January 1995
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/129/TR95-252.ps.Z (643KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/129/TR95-252.pdf (474KB)
Abstract:
One of the tools that will be essential for future electronic publishing is a powerful
image retrieval system. The author should be able to search an image database for
images that convey the desired information or mood; a reader should be able to search
a corpus of published work for images that are relevant to his or her needs. Most
commercial image retrieval systems associate keywords or text with each image and
require the user to enter a keyword or textual description of the desired image.
This text-based approach has numerous drawbacks -- associating keywords or text
with each image is a tedious task; some image features may not be mentioned in the
textual description; some features are "nearly impossible to describe with text";
and some features can be described in widely different ways [Niblack, 1993].
In an effort to overcome these problems and improve retrieval performance, researchers
have focused more and more on content-based image retrieval in which retrieval is
accomplished by comparing image features directly rather than textual descriptions
of the image features. Features that are commonly used in content-based retrieval
include color, shape, texture and edges. In this report we describe a simple content-based
system that retrieves color images on the basis of their color distributions and
edge characteristics. The system uses two retrieval techniques that have been described
in the literature -- i.e. histogram intersection to compare color distributions
and sketch comparison to compare edge characteristics.
The performance of the system is evaluated and various extensions to the existing
techniques are proposed.
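Histogram intersection, one of the two techniques named in the abstract, can be sketched in a few lines of Python. The version below follows the commonly cited formulation (sum of bin-wise minima, normalized by the model histogram's total); whether the report uses exactly this normalization is an assumption, and the function name is hypothetical.

```python
def histogram_intersection(image_hist: list[float], model_hist: list[float]) -> float:
    """Similarity in [0, 1] between two color histograms with identical bins.

    Sums the bin-wise minima and normalizes by the total count of the
    model histogram; 1.0 means the model's colors are fully matched.
    """
    if len(image_hist) != len(model_hist):
        raise ValueError("histograms must use the same bins")
    total = sum(model_hist)
    if total == 0:
        return 0.0
    matched = sum(min(a, b) for a, b in zip(image_hist, model_hist))
    return matched / total


identical = histogram_intersection([4, 2, 0, 2], [4, 2, 0, 2])
disjoint = histogram_intersection([8, 0, 0, 0], [0, 0, 0, 8])
partial = histogram_intersection([2, 2, 2, 2], [4, 4, 0, 0])
```

A retrieval system along these lines would compute the score between the query histogram and each stored image histogram and return the highest-scoring images.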
Disk-directed I/O for an Out-of-Core Computation
Dartmouth Technical Report PCS-TR95-251
David Kotz
Date: January 1995
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/128/TR95-251.ps.Z (109KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/128/TR95-251.pdf (248KB)
Abstract:
New file systems are critical to obtain good I/O performance on large multiprocessors.
Several researchers have suggested the use of collective file-system operations,
in which all processes in an application cooperate in each I/O request. Others have
suggested that the traditional low-level interface (read, write, seek) be augmented
with various higher-level requests (e.g., read matrix), allowing the programmer
to express a complex transfer in a single (perhaps collective) request. Collective,
high-level requests permit techniques like two-phase I/O and disk-directed I/O to
significantly improve performance over traditional file systems and interfaces.
Neither of these techniques has been tested on anything other than simple benchmarks
that read or write matrices. Many applications, however, intersperse computation
and I/O to work with data sets that cannot fit in main memory. In this paper, we
present the results of experiments with an "out-of-core" LU-decomposition program,
comparing a traditional interface and file system with a system that has a high-level,
collective interface and disk-directed I/O. We found that a collective interface
was awkward in some places, and forced additional synchronization. Nonetheless,
disk-directed I/O was able to obtain much better performance than the traditional
system.
DartCVL: The Dartmouth C Vector Library
Dartmouth Technical Report PCS-TR95-250
Thomas H. Cormen
Sumit Chawla
Preston Crow
Melissa Hirschl
Roberto Hoyle
Keith D. Kotay
Rolf H. Nelson
Nils Nieuwejaar
Scott M. Silver
Michael B. Taylor
Date: January 1995
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/127/TR95-250.ps.Z (95KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/127/TR95-250.pdf (244KB)
Abstract:
As a class project, we implemented a version of CVL, the C Vector Library, on a DECmpp
12000/Sx 2000, which is equivalent to the MasPar MP-2 massively parallel computer.
We compare our implementation, DartCVL, to the University of North Carolina implementation,
UnCvl.
DartCVL was designed for the MP-2 architecture and UnCvl was designed for the MP-1.
Because the MasPar MP-1 and MP-2 are functionally equivalent, both DartCVL and UnCvl
will run on either. Differences in the designs of the two machines, however, may
lead to different software design decisions. DartCVL differs from UnCvl in two key
ways. First, DartCVL uses hierarchical virtualization, whereas UnCvl uses cut-and-stack.
Second, DartCVL runs as much serial code as possible on the console, whereas UnCvl
runs all serial code on the Array Control Unit (ACU). The console (a DECstation
5000/240 at Dartmouth) has a significantly faster serial processor than the ACU.
DartCVL is optimized for the MP-2, and our timing results indicate that it usually
runs faster than UnCvl on the 2048-processor machine at Dartmouth.
ViC*: A Preprocessor for Virtual-Memory C*
Dartmouth Technical Report PCS-TR94-243
Thomas H. Cormen
Alex Colvin
Date: January 1994
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/126/TR94-243.ps.Z (81KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/126/TR94-243.pdf (241KB)
Abstract:
This paper describes the functionality of ViC*, a compiler-like preprocessor for
out-of-core C*. The input to ViC* is a C* program but with certain shapes declared
"outofcore", which means that all parallel variables of these shapes reside
on disk. The output is a standard C* program with the appropriate I/O and library
calls added for efficient access to out-of-core parallel variables.
Building Multimedia Proceedings: The Roles of Video in Interactive Electronic Conference Proceedings
Dartmouth Technical Report PCS-TR94-241
Samuel A. Rebelsky
Fillia Makedon
James Matthews
Charles B. Owen
Laura Bright
Kenneth Harker
Nancy Toth
Panagiotis Metaxas
Date: January 1994
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/125/TR94-241.pdf (2969KB)
Abstract:
Incremental Equational Programming
Dartmouth Technical Report PCS-TR94-240
Samuel A. Rebelsky
Date: January 1994
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/124/TR94-240.pdf (821KB)
Abstract:
The Design and Development of Interactive Multimedia Conference Proceedings
Dartmouth Technical Report PCS-TR94-239
Samuel A. Rebelsky
James Ford
Kenneth Harker
Fillia Makedon
Panagiotis Metaxas
Charles B. Owen
Date: January 1994
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/123/TR94-239.pdf (914KB)
Abstract:
Efficient Parallel Algorithms for Closest Point Problems
Dartmouth Technical Report PCS-TR94-238
Peter Su
Date: January 1994
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/122/TR94-238.pdf (5998KB)
Abstract:
This dissertation develops and studies fast algorithms for solving closest point
problems. Algorithms for such problems have applications in many areas including
statistical classification, crystallography, data compression, and finite element
analysis. In addition to a comprehensive empirical study of known sequential methods,
I introduce new parallel algorithms for these problems that are both efficient and
practical. I present a simple and flexible programming model for designing and analyzing
parallel algorithms. Also, I describe fast parallel algorithms for nearest-neighbor
searching and constructing Voronoi diagrams. Finally, I demonstrate that my algorithms
actually obtain good performance on a wide variety of machine architectures.
The key algorithmic ideas that I examine are exploiting spatial locality and random
sampling. Spatial decomposition allows many concurrent threads to work
independently of one another in local areas of a shared data structure. Random sampling
provides a simple way to adaptively decompose irregular problems, and to balance
workload among many threads. Used together, these techniques result in effective
algorithms for a wide range of geometric problems.
The key experimental ideas used in my thesis are simulation and animation. I use
algorithm animation to validate algorithms and gain intuition about their behavior.
I model the expected performance of algorithms using simulation experiments, and
some knowledge as to how much critical primitive operations will cost on a given
machine. In addition, I do this without the burden of esoteric computational models
that attempt to cover every possible variable in the design of a computer system.
An iterative process of design, validation, and simulation delays the actual implementation
until as many details as possible are accounted for. Then, further experiments are
used to tune the implementations for better performance.
Distributed Scheduling in Finite Capacity Networks
Dartmouth Technical Report PCS-TR94-236
Perry Fizzano
Clifford Stein
Date: January 1994
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/121/TR94-236.ps.Z (54KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/121/TR94-236.pdf (127KB)
Abstract:
We consider the problem of scheduling unit-sized jobs in a distributed network of
processors. Each processor only knows the number of jobs it and its neighbors have.
We give an analysis of an intuitive algorithm and prove that the algorithm produces
schedules that are within a logarithmic factor of the length of the optimal schedule
given that the optimal schedule is sufficiently long.
A DAta-Parallel Programming Library for Education (DAPPLE)
Dartmouth Technical Report PCS-TR94-235
David Kotz
Date: January 1994
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/120/TR94-235.ps.Z (55KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/120/TR94-235.pdf (191KB)
Abstract:
In the context of our overall goal to bring the concepts of parallel computing into
the undergraduate curriculum, we set out to find a parallel-programming language
for student use. To make it accessible to students at all levels, and to be independent
of any particular hardware platform, we chose to design our own language, based
on a data-parallel model and on C++. The result, DAPPLE, is a C++ class library
designed to provide the illusion of a data-parallel programming language on conventional
hardware and with conventional compilers. DAPPLE defines Vectors and Matrices as
basic classes, with all the usual C++ operators overloaded to provide elementwise
arithmetic. In addition, DAPPLE provides typical data-parallel operations like scans,
permutations, and reductions. Finally, DAPPLE provides a parallel if-then-else statement
to restrict the scope of the above operations to partial vectors or matrices.
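The data-parallel operations listed in the abstract (scans, permutations, reductions, and masked elementwise updates) can be illustrated in plain Python. DAPPLE itself is a C++ class library; the sketch below does not reproduce its API, and every function name here is hypothetical.

```python
from itertools import accumulate


def plus_scan(values: list[int]) -> list[int]:
    """Inclusive prefix sum: element i holds the sum of values[0..i]."""
    return list(accumulate(values))


def plus_reduce(values: list[int]) -> int:
    """Reduction: combine all elements into a single sum."""
    return sum(values)


def permute(values: list, targets: list[int]) -> list:
    """Send values[i] to position targets[i] of the result."""
    out = [None] * len(values)
    for source, target in enumerate(targets):
        out[target] = values[source]
    return out


def masked_add(values: list[int], mask: list[bool], amount: int) -> list[int]:
    """Elementwise add applied only where the mask is true.

    This mimics restricting an operation to part of a vector, in the
    spirit of a parallel if-then-else.
    """
    return [v + amount if m else v for v, m in zip(values, mask)]


data = [1, 2, 3, 4]
running = plus_scan(data)
total = plus_reduce(data)
shuffled = permute(["a", "b", "c"], [2, 0, 1])
bumped = masked_add(data, [v % 2 == 0 for v in data], 10)
```

Each function treats the whole vector as one operand, which is the programming style a data-parallel library aims to teach even when the underlying hardware runs the loop sequentially.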
Hypergraph Partitioning Algorithms
Dartmouth Technical Report PCS-TR94-233
Tom Leighton
Fillia Makedon
Spyros Tragoudas
Date: January 1994
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/119/TR94-233.pdf (1610KB)
Abstract:
We present the first polynomial time approximation algorithms for the balanced hypergraph
partitioning problem. The approximations are within polylogarithmic factors of the
optimal solutions. The choice of algorithm involves a time complexity/approximation
bound tradeoff. We employ a two step methodology. First we approximate the flux
of the input hypergraph. This involves an approximate solution to a concurrent flow
problem on the hypergraph. In the second step we use the approximate flux to obtain
approximations for the balanced bipartitioning problem. Our results extend the approximation
algorithms by Leighton-Rao on graphs to hypergraphs. We also give the first polylogarithmic
times optimal approximation algorithms for multiway (graph and hypergraph) partitioning
problems into bounded size sets.
A better approximation algorithm for the latter problem is finally presented for
the special case of bounded sets of size at most O(log n) on planar graphs and hypergraphs,
where n is the number of nodes of the input instance.
Exploring the Use of I/O Nodes for Computation in a MIMD Multiprocessor
Dartmouth Technical Report PCS-TR94-232
David Kotz
Ting Cai
Date: January 1994
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/118/TR94-232.ps.Z (72KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/118/TR94-232.pdf (186KB)
Abstract:
As parallel systems move into the production scientific computing world, the emphasis
will be on cost-effective solutions that provide high throughput for a mix of applications.
Cost-effective solutions demand that a system make effective use of all of its resources.
Many MIMD multiprocessors today, however, distinguish between "compute" and "I/O"
nodes, the latter having attached disks and being dedicated to running the file-system
server. This static division of responsibilities simplifies system management but
does not necessarily lead to the best performance in workloads that need a different
balance of computation and I/O.
Of course, computational processes sharing a node with a file-system service may
receive less CPU time, network bandwidth, and memory bandwidth than they would on
a computation-only node. In this paper we examine this issue experimentally. We
found that high-performance I/O does not necessarily require substantial CPU time,
leaving plenty of time for application computation. There were some complex file-system
requests, however, which left little CPU time available to the application. (The
impact on network and memory bandwidth still needs to be determined.) For applications
(or users) that cannot tolerate an occasional interruption, we recommend that they
continue to use only compute nodes. For tolerant applications needing more cycles
than those provided by the compute nodes, we recommend that they take full advantage
of both compute and I/O nodes for computation, and that operating systems should
make this possible.
Multimedia authoring, development environments, and digital video editing.
Dartmouth Technical Report PCS-TR94-231
Fillia Makedon
James Matthews
Charles B. Owen
Samuel A. Rebelsky
Date: January 0001
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/117/TR94-231.ps.Z (777KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/117/TR94-231.pdf (236KB)
Abstract:
Multimedia systems integrate text, audio, video, graphics, and other media and allow
them to be utilized in a combined and interactive manner. Using this exciting and
rapidly developing technology, multimedia applications can provide extensive benefits
in a variety of arenas, including research, education, medicine, and commerce. While
there are many commercial multimedia development packages, the easy and fast creation
of a useful, full-featured multimedia document is not yet a straightforward task.
This paper addresses issues in the development of multimedia documents, ranging
from user-interface tools that manipulate multimedia documents to multimedia communication
technologies such as compression, digital video editing and information retrieval.
It outlines the basic steps in the multimedia authoring process and some of the
requirements that need to be met by multimedia development environments. It also
presents the role of video, an essential component of multimedia systems and the
role of programming in digital video editing. A model is described for remote access
of distributed video. The paper concludes with a discussion of future research directions
and new uses of multimedia documents.
A Multiprocessor Extension to the Conventional File System Interface
Dartmouth Technical Report PCS-TR94-230
Nils Nieuwejaar
David Kotz
Date: January 1994
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/116/TR94-230.ps.Z (67KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/116/TR94-230.pdf (174KB)
Abstract:
As the I/O needs of parallel scientific applications increase, file systems for multiprocessors
are being designed to provide applications with parallel access to multiple disks.
Many parallel file systems present applications with a conventional Unix-like interface
that allows the application to access multiple disks transparently. By tracing all
the activity of a parallel file system in a production, scientific computing environment,
we show that many applications exhibit highly regular, but non-consecutive I/O access
patterns. Since the conventional interface does not provide an efficient method
of describing these patterns, we present an extension which supports strided and
nested-strided I/O requests.
A New Approach to the Minimum Cut Problem
Dartmouth Technical Report PCS-TR94-229
David R. Karger
Clifford Stein
Date: January 1994
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/115/TR94-229.pdf (2900KB)
Abstract:
Deciding Finiteness for Matrix Groups Over Function Fields
Dartmouth Technical Report PCS-TR94-227
Robert Beals
Daniel N. Rockmore
Ki-Seng Tan
Date: January 1995
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/114/TR94-227.pdf (620KB)
Abstract:
Let S be any finite subset of GLn(F(t)), where F is a field. In this paper we give algorithms
to decide if the group generated by S is finite. In the case of characteristic zero,
slight modifications of earlier work of Babai, Beals and Rockmore [1] give polynomial
time deterministic algorithms to solve this problem. The case of positive characteristic
turns out to be more subtle and our algorithms depend on a structure theorem proved
here, generalizing a theorem of Weil. We also present a fairly detailed analysis
of the size of finite subgroups in this case and give bounds which depend upon the
number of generators. To this end we also introduce the notion of the diameter of
a finitely generated algebra and derive some upper bounds related to this quantity.
In positive characteristic the deterministic algorithms we present are exponential.
A randomized algorithm based on ideas of the Meat-Axe is also given. While not provably
efficient, the success of the Meat-Axe suggests the randomized algorithm will be
useful.
Disk-directed I/O for MIMD Multiprocessors
Dartmouth Technical Report PCS-TR94-226
David Kotz
Date: January 1994
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/113/TR94-226.ps.Z (196KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/113/TR94-226.pdf (377KB)
Abstract:
Many scientific applications that run on today's multiprocessors are bottlenecked
by their file I/O needs. Even if the multiprocessor is configured with sufficient
I/O hardware, the file-system software often fails to provide the available bandwidth
to the application. Although libraries and improved file-system interfaces can make
a significant improvement, we believe that fundamental changes are needed in the
file-server software. We propose a new technique, disk-directed I/O, that flips
the usual relationship between server and client to allow the disks (actually, disk
servers) to determine the flow of data for maximum performance. Our simulations
show that tremendous performance gains are possible. Indeed, disk-directed I/O provided
consistent high performance that was largely independent of data distribution, and
close to the maximum disk bandwidth.
Human Creativity Through Computer Gaming
Dartmouth Technical Report PCS-TR94-225
Christine McGavran
Date: January 1994
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/112/TR94-225.pdf (7396KB)
Abstract:
BMMC Permutations on a DECmpp 12000/sx 2000
Dartmouth Technical Report PCS-TR94-224
Kristin Bruhl
Date: January 1994
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/111/TR94-224.ps.Z (187KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/111/TR94-224.pdf (353KB)
Abstract:
Increasingly, modern computing problems, including many scientific and business applications,
require huge amounts of data to be examined, modified, and stored. Parallel computers
can be used to decrease the time needed to operate on such large data sets, by allowing
computations to be performed on many pieces of data at once. For example, on the
DECmpp machine used in our research, there are 2048 processors in the parallel processor
array. The DECmpp can read data into each of these processors, perform a computation
in parallel on all of it, and write the data out again, theoretically decreasing
the execution time by a factor of 2048 over the time required by one of its processors.
Often, the computations that occur after the data is in the processors involve rearranging,
or permuting, the data within the array of parallel processors. Information moves
between processors by means of a network connecting them. Communication through
the network can be very expensive, especially if there are many collisions--simultaneous
contentions for the same network resource--between items of data moving from one
processor to another. When a program performs hundreds or even thousands of these
permutations during its execution, a bottleneck can occur, impeding the overall
performance of the program.
Effective algorithms that decrease the time required to permute the data within
a parallel computer can yield a significant speed increase in running programs with
large data sets. Cormen has designed algorithms to improve performance when the
data movement is defined by certain classes of permutations. This thesis will examine
the performance of one of these classes, the bit-matrix-multiply/complement (BMMC)
permutation, when implemented on the DECmpp. Although Cormen's algorithm was designed
for parallel disk systems, this thesis adapts it to permutations of data residing
in the memory of the parallel processors.
The DECmpp network follows the model of an Extended Delta Network (EDN). One characteristic
of an EDN is that it has a set of input and output ports to the network, each of
which can carry only one item of data at a time. If more than one item needs to
travel over a given port, a collision occurs. The data must access the port serially,
which slows down the entire operation. Cormen's algorithm reduces these collisions
by computing a schedule for sending the data over the network.
For small data sets, it is not worthwhile to perform the extra operations to generate
such a schedule, because the overhead associated with computing the schedule outweighs
the time gained by preventing collisions at the network ports. As the size of the
data set increases, eliminating collisions becomes more and more valuable. On the
DECmpp, when the data permutation involves more than 128 elements per processor,
our algorithm beats the more naive and obvious method for permuting in the parallel
processor array.
Asymptotically Tight Bounds for Performing BMMC Permutations on Parallel Disk Systems
Dartmouth Technical Report PCS-TR94-223
Thomas H. Cormen
Thomas Sundquist
Leonard F. Wisniewski
Date: January 1994
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/110/TR94-223.ps.Z (159KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/110/TR94-223.pdf (390KB)
Abstract:
We give asymptotically equal lower and upper bounds for the number of parallel I/O
operations required to perform bit-matrix-multiply/complement (BMMC) permutations
on parallel disk systems. In a BMMC permutation on N records, where N is a power
of 2, each (lg N)-bit source address x maps to a corresponding (lg N)-bit target
address y by the matrix equation y = Ax XOR c, where matrix multiplication is performed
over GF(2). The characteristic matrix A is (lg N) x (lg N) and nonsingular over
GF(2). Under the Vitter-Shriver parallel-disk model with N records, D disks, B records
per block, and M records of memory, we show a universal lower bound of $\Omega \left(
\frac{N}{BD} \left( 1 + \frac{\operatorname{rank} \gamma}{\lg (M/B)} \right) \right)$ parallel I/Os for
performing a BMMC permutation, where gamma is the lower left (lg (N/B)) x (lg B)
submatrix of the characteristic matrix. We adapt this lower bound to show that the
algorithm for bit-permute/complement (BPC) permutations in Cormen93a is asymptotically
optimal. We also present an algorithm that uses at most $\frac{2N}{BD} \left( 4 \left\lceil \frac{\operatorname{rank} \gamma}{\lg
(M/B)} \right\rceil + 4 \right)$ parallel I/Os, which asymptotically matches the lower bound
and improves upon the BMMC algorithm in Cormen93a. When rank (gamma) is low, this
method is an improvement over the general-permutation bound of $\Theta \left( \frac{N}{BD}
\frac{\lg (N/B)}{\lg (M/B)} \right)$.
We introduce a new subclass of BMMC permutations, called memory-load-dispersal (MLD)
permutations, which can be performed in one pass. This subclass, which is used in
the BMMC algorithm, extends the catalog of one-pass permutations appearing in Cormen93a.
Although many BMMC permutations of practical interest fall into subclasses that
might be explicitly invoked within the source code, we show how to detect in at
most $N/BD + \left\lceil \frac{\lg (N/B) + 1}{D} \right\rceil$ parallel I/Os whether a given vector of
target addresses specifies a BMMC permutation. Thus, one can determine efficiently
at run time whether a permutation to be performed is BMMC and then avoid the general-permutation
algorithm and save parallel I/Os by using our algorithm.
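The address mapping y = Ax XOR c over GF(2) can be illustrated with a short in-memory sketch. The representation of A as a list of row bitmasks and the function names are assumptions made for this example; the sketch only evaluates the mapping and does not model the parallel-disk algorithm or its I/O bounds.

```python
def bmmc_target(x, A, c, n_bits):
    """Map source address x to y = A x XOR c, with arithmetic over GF(2).

    A is given as n_bits row bitmasks (an assumed encoding):
    row i determines bit i of the product A x.
    """
    y = 0
    for i in range(n_bits):
        # Dot product over GF(2) is the parity of (row AND x).
        bit = bin(A[i] & x).count("1") & 1
        y |= bit << i
    return y ^ c


def is_permutation(A, c, n_bits):
    """Brute-force check that the mapping is one-to-one on all 2**n_bits addresses."""
    targets = {bmmc_target(x, A, c, n_bits) for x in range(1 << n_bits)}
    return len(targets) == 1 << n_bits
```

For example, with n_bits = 3, the rows A = [0b100, 0b010, 0b001] and c = 0 give the bit-reversal permutation, while a singular matrix such as [0b001, 0b001, 0b100] fails the is_permutation check.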
Efficiency and Stability Issues in the Numerical Computation of Fourier Transforms and Convolutions on the 2-Sphere
Dartmouth Technical Report PCS-TR94-222
Dennis M. Healy
Sean S. B. Moore
Daniel N. Rockmore
Date: January 1994
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/109/TR94-222.pdf (3727KB)
Abstract:
A Detailed Simulation Model of the HP 97560 Disk Drive
Dartmouth Technical Report PCS-TR94-220
David Kotz
Song Bac Toh
Sriram Radhakrishnan
Date: January 1994
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/108/TR94-220.ps.Z (71KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/108/TR94-220.pdf (180KB)
Abstract:
We implemented a detailed model of the HP 97560 disk drive, to replicate a model
devised by Ruemmler and Wilkes (both of Hewlett-Packard, HP). Our model simulates
one or more disk drives attached to one or more SCSI buses. The design is broken
into three components: a test driver, the disk model itself, and the discrete-event
simulation support. Thus, the disk model can be easily extracted and used in other
simulation environments. We validated our model using traces obtained from HP, using
the same "demerit" measure as Ruemmler and Wilkes. We obtained a demerit percentage
of 3.9%, indicating that our model was extremely accurate. This paper describes
our implementation, and is meant for those wishing to use our model, see our validation,
or understand our code.
Fast Spherical Transforms on Distance Transitive Graphs
Dartmouth Technical Report PCS-TR94-219
James R. Driscoll
Dennis M. Healy
Daniel N. Rockmore
Date: January 1994
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/107/TR94-219.pdf (1417KB)
Abstract:
SPEDE: Simple Programming Environment for Distributed Execution
Dartmouth Technical Report PCS-TR94-218
James Gochee
Date: January 1994
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/106/TR94-218.ps.Z (21KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/106/TR94-218.pdf (29KB)
Abstract:
One of the main goals for people who use computer systems, particularly computational
scientists, is speed. In the quest for ways to make applications run faster, engineers
have developed parallel computers, which use more than one CPU to solve a task.
However, many institutions already possess significant computational power in networks
of workstations. Through software, it is possible to glue together clusters of machines
to simulate a parallel environment. SPEDE is one such system, designed to place
the potential of local machines at the fingertips of the programmer. Through a simple
interface, users design computational objects that can be linked and run in parallel.
The goal of the project is to have a small portable environment that allows various
types of computer systems to interact. SPEDE requires no altering of the kernel
and does not require system privileges to use. Using SPEDE, programmers can get
significant speedup for computationally intensive problems. As an example, a Mandelbrot
image generator was implemented that attained a five-fold speedup with eight processors.
SPEDE: A Simple Programming Environment for Distributed Execution (Users' Manual)
Dartmouth Technical Report PCS-TR94-217
James Gochee
Date: January 1994
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/105/TR94-217.ps.Z (44KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/105/TR94-217.pdf (63KB)
Abstract:
Traditional single-processor computers are quickly reaching their full computational
potential. The quest for faster and faster chips has brought technology to the
point where the laws of physics are hampering future gains. Significant gains in
speed must therefore come from using multiple processors instead of a single processor.
This technology usually takes the form of a parallel computer, such
as the Connection Machine Model 5. Recently, however, much interest has been focused
on software that organizes single-processor computers to behave like a parallel
computer. This is desirable for sites that have large installations of workstations,
since the cost of new parallel systems is prohibitive. SPEDE, a Simple Programming
Environment for Distributed Execution, was designed for this purpose. It allows
UNIX-based machines of varying hardware types to be organized and utilized by a
programmer of parallel applications. SPEDE is a user-level system in that it requires
no special privileges to run. Every user keeps a separate copy of the system so
that security issues are covered by the normal UNIX operating environment. SPEDE
is characterized as a large-grained distributed environment. This means that applications
that have a large processing-to-I/O ratio will be much more effective than those
with a small ratio. SPEDE allows users to coordinate the use of many computers
through a straightforward interface. Machines are organized by classes, which are
terms that can be used to label and group them into more manageable units. For example,
users might want to create a class based on the byte ordering of machines, or by
their location. Users can then specify more completely which machines they want
to use for a particular session. Sessions are essentially the interaction between
objects in the SPEDE environment. A user creates an object to perform a certain
task, such as constructing part of a fractal image. Objects can send and receive
messages from other objects using a simple interface provided with SPEDE. Objects
are machine independent, which means that the same object can be run simultaneously
on different platforms. This is achieved by translating all messages into standard
network byte ordering. However, if user data is being passed between objects, it
is the user's responsibility to make sure byte ordering is correct. The SPEDE system
involves several major components. These components help control and manage object
interactions. Figure 1 shows a session running on three machines (each
surrounded by an oval rectangle). There are also three objects running, two named
MandComp and one named Mand. Each object is on a different machine, although it
is possible to have multiple objects on a single machine. In the figure, the lines
connecting the various entities represent socket connections. UNIX sockets are the
transport mechanism used in SPEDE, although one could implement a lower level protocol
for more efficient communication. Sockets can also be a problem because some machines
have strict limits on the number of connections a user can have open at any given
time.
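The byte-ordering point above can be illustrated with a small sketch that packs a message header in standard network (big-endian) byte order, so that machines with different native orderings decode it identically. The header layout (an object id followed by a payload length) is purely hypothetical and is not SPEDE's actual message format; as the abstract notes, the payload itself is left to the user.

```python
import struct

# Hypothetical header: two unsigned 32-bit integers in network byte order ("!").
HEADER = struct.Struct("!II")


def encode_message(obj_id, payload):
    """Prefix the raw payload bytes with a network-byte-order header."""
    return HEADER.pack(obj_id, len(payload)) + payload


def decode_message(data):
    """Recover (obj_id, payload); the payload bytes are returned untouched."""
    obj_id, length = HEADER.unpack(data[:HEADER.size])
    return obj_id, data[HEADER.size:HEADER.size + length]
```

Only the header fields are converted here; any multi-byte values inside the payload would need the same treatment by the sender and receiver.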
Scheduling in a Ring with Unit Capacity Links
Dartmouth Technical Report PCS-TR94-216
Perry Fizzano
Clifford Stein
Date: January 1994
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/104/TR94-216.ps.Z (54KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/104/TR94-216.pdf (147KB)
Abstract:
We consider the problem of scheduling unit-sized jobs on a ring of processors with
the objective of minimizing the completion time of the last job. Unlike much previous
work, we place restrictions on the capacity of the network links connecting processors.
We give a polynomial time centralized algorithm that produces optimal length schedules.
We also give a simple distributed 2-approximation algorithm.
Fast Greedy Triangulation Algorithms
Dartmouth Technical Report PCS-TR94-215
Matthew T. Dickerson
Robert L. Scot Drysdale
Scott A. McElfresh
Emo Welzl
Date: January 1994
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/103/TR94-215.ps.Z (100KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/103/TR94-215.pdf (273KB)
Abstract:
The greedy triangulation of a set $S$ of $n$ points in the plane is the triangulation
obtained by starting with the empty set and at each step adding the shortest compatible
edge between two of the points, where a compatible edge is defined to be an edge
that crosses none of the previously added edges. In this paper we present a simple,
practical algorithm that computes the greedy triangulation in expected time $O(n
\log n)$ and space $O(n)$ for points uniformly distributed over any convex shape.
A variant of this algorithm should be fast for some other distributions. As part
of this algorithm we give an edge compatibility test that requires $O(n)$ time for
both tests and updates to the underlying data structure. We also prove properties
about the expected lengths of edges in greedy and Delaunay triangulations of uniformly
distributed points.
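The definition of the greedy triangulation translates directly into a brute-force procedure, sketched below for reference. This is not the fast algorithm of the report: it tests every candidate edge against all accepted edges and is far slower than the expected $O(n \log n)$ bound. It assumes points are given as coordinate tuples in general position (no three collinear); the helper names are illustrative.

```python
from itertools import combinations
from math import dist


def _orient(p, q, r):
    """Twice the signed area of triangle pqr (positive if counterclockwise)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def _crosses(a, b, c, d):
    """True if segments ab and cd cross at a point interior to both."""
    if len({a, b, c, d}) < 4:
        return False  # edges that share an endpoint are compatible
    return (_orient(a, b, c) * _orient(a, b, d) < 0 and
            _orient(c, d, a) * _orient(c, d, b) < 0)


def greedy_triangulation(points):
    """Add candidate edges shortest-first, keeping each one that crosses no kept edge."""
    edges = []
    for a, b in sorted(combinations(points, 2), key=lambda e: dist(*e)):
        if not any(_crosses(a, b, c, d) for c, d in edges):
            edges.append((a, b))
    return edges
```

On the four corners of a rectangle this keeps the four sides and one diagonal, rejecting the second diagonal because it crosses the first.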
A 2-3/4-Approximation Algorithm for the Shortest Superstring Problem
Dartmouth Technical Report PCS-TR94-214
Chris Armen
Clifford Stein
Date: January 1994
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/102/TR94-214.pdf (760KB)
Abstract:
Given a collection of strings S={s_1,...,s_n} over an alphabet Sigma, a superstring
alpha of S is a string containing each s_i as a substring, that is, for each i,
1<=i<=n, alpha contains a block of |s_i| consecutive characters that match s_i exactly.
The shortest superstring problem is the problem of finding a superstring alpha of
minimum length.
The shortest superstring problem has applications in both computational biology
and data compression. The problem is NP-hard [GallantMS80]; in fact, it was recently
shown to be MAX SNP-hard [BlumJLTY91]. Given the importance of the applications,
several heuristics and approximation algorithms have been proposed.
Constant factor approximation algorithms have been given in [BlumJLTY91] (factor
of 3), [TengY93] (factor of 2-8/9), [CzumajGPR94] (factor of 2-5/6) and [KosarajuPS94]
(factor of 2-50/63). Informally, the key to any algorithm for the shortest superstring
problem is to identify sets of strings with large amounts of similarity, or overlap.
While the previous algorithms and their analyses have grown increasingly sophisticated,
they reveal remarkably little about the structure of strings with large amounts
of overlap. In this sense, they are solving a more general problem than the one
at hand. In this paper, we study the structure of strings with large amounts of
overlap and use our understanding to give an algorithm that finds a superstring
whose length is no more than 2-3/4 times that of the optimal superstring. We prove
several interesting properties about short periodic strings, allowing us to answer
questions of the following form: given a string with some periodic structure, characterize
all the possible periodic strings that can have a large amount of overlap with the
first string.
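The role of overlap can be seen in the classical greedy merging heuristic, sketched below. This is a well-known baseline and not the 2-3/4-approximation algorithm of the report; the function names are illustrative.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_superstring(strings):
    """Repeatedly merge the ordered pair of strings with the largest overlap."""
    # Drop duplicates and strings already contained in another input string.
    strs = sorted(s for s in set(strings)
                  if not any(s != t and s in t for t in strings))
    while len(strs) > 1:
        k, i, j = max((overlap(a, b), i, j)
                      for i, a in enumerate(strs)
                      for j, b in enumerate(strs) if i != j)
        merged = strs[i] + strs[j][k:]
        strs = [s for idx, s in enumerate(strs) if idx not in (i, j)] + [merged]
    return strs[0] if strs else ""
```

The result always contains every input string as a substring, but it is not guaranteed to be a shortest superstring.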
Job Scheduling in Rings
Dartmouth Technical Report PCS-TR94-213
Perry Fizzano
Clifford Stein
David R. Karger
Joel Wein
Date: January 1994
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/101/TR94-213.ps.Z (89KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/101/TR94-213.pdf (219KB)
Abstract:
We give distributed approximation algorithms for job scheduling in a ring architecture.
In contrast to almost all other parallel scheduling models, the model we consider
captures the influence of the underlying communications network by specifying that
task migration from one processor to another takes time proportional to the distance
between those two processors in the network. As a result, our algorithms must balance
both computational load and communication time. The algorithms are simple, require
no global control, and work in a variety of settings. All come with small constant-factor
approximation guarantees; the basic algorithm yields schedules of length at most
4.22 times optimal. We also give a lower bound on the performance of any distributed
algorithm, some results for a simple capacitated case, and the results of simulation
experiments, which give better results than our worst-case analysis.
Dynamic File-Access Characteristics of a Production Parallel Scientific Workload
Dartmouth Technical Report PCS-TR94-211
David Kotz
Nils Nieuwejaar
Date: January 1994
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/100/TR94-211.ps.Z (97KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/100/TR94-211.pdf (235KB)
Abstract:
Multiprocessors have permitted astounding increases in computational performance,
but many cannot meet the intense I/O requirements of some scientific applications.
An important component of any solution to this I/O bottleneck is a parallel file
system that can provide high-bandwidth access to tremendous amounts of data in parallel
to hundreds or thousands of processors.
Most successful systems are based on a solid understanding of the characteristics
of the expected workload, but until now there have been no comprehensive workload
characterizations of multiprocessor file systems. We began the CHARISMA project
in an attempt to fill that gap. We instrumented the common node library on the iPSC/860
at NASA Ames to record all file-related activity over a two-week period. Our instrumentation
is different from previous efforts in that it collects information about every read
and write request and about the mix of jobs running in the machine (rather than
from selected applications).
The trace analysis in this paper leads to many recommendations for designers of
multiprocessor file systems. First, the file system should support simultaneous
access to many different files by many jobs. Second, it should expect to see many
small requests, predominantly sequential and regular access patterns (although of
a different form than in uniprocessors), little or no concurrent file-sharing between
jobs, significant byte- and block-sharing between processes within jobs, and strong
interprocess locality. Third, our trace-driven simulations showed that these characteristics
led to great success in caching, both at the compute nodes and at the I/O nodes.
Finally, we recommend supporting strided I/O requests in the file-system interface,
to reduce overhead and allow more performance optimization by the file system.
Videoscheme: A Research, Authoring, and Teaching Tool for Multimedia
Dartmouth Technical Report PCS-TR94-209
J. Matthews
F. Makedon
P. Gloor
Date: January 1994
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/99/TR94-209.pdf (371KB)
Abstract:
The availability of digital multimedia technology poses new challenges to researchers,
authors, and educators, even as it creates new opportunities for rich communication.
This paper suggests interactive computer programming as a fruitful approach to these
challenges. VideoScheme, a prototype video programming environment, is described
along with promising applications.
Conference on a Disk: A Successful Experiment in Hypermedia Publishing (Extended Abstract)
Dartmouth Technical Report PCS-TR94-208
M. Cheyney
P. Gloor
D. B. Johnson
F. Makedon
J. Matthews
P. Metaxas
Date: January 1994
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/98/TR94-208.pdf (332KB)
Abstract:
Academic conferences are a long-standing and effective form of multimedia communication.
Conference participants can transmit and receive information through sight, speech,
gesture, text, and touch. This same-time, same-place communication is sufficiently
valuable to justify large investments in time and travel funds. Printed conference
proceedings are attempts to recapture the value of a live conference, but they are
limited by a fragmented and inefficient approach to the problem. We addressed this
problem in the multimedia proceedings of the DAGS'92 conference. The recently published
CD-ROM delivers text, graphic, audio, and video information as an integrated whole,
with extensive provisions for random access and hypermedia linking. We believe that
this project provides a model for future conference publications and highlights
some of the research issues that must be resolved before similar publications can
be quickly and inexpensively produced.
Issues and Obstacles with Multimedia Authoring
Dartmouth Technical Report PCS-TR94-207
Fillia Makedon
Samuel A. Rebelsky
Matthew Cheyney
Charles B. Owen
Peter A. Gloor
Date: January 1995
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/97/TR94-207.ps.Z (26KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/97/TR94-207.pdf (37KB)
Abstract:
Unlike traditional authoring, multimedia authoring involves making hard choices,
forecasting technological evolution and adapting to software and hardware technology
changes. It is, perhaps, an unstable field of endeavor for an academic to be in.
Yet, it is important that academics are, in fact, part of this process. This paper
discusses some of the common threads shared by three dissimilar cases of multimedia
authoring with which we have experimented: multimedia conference proceedings,
multimedia courseware development, and multimedia information kiosks. We consider
these applications from an academic point of view and review the benefits and pitfalls
of academic development while sharing points of hard-learned wisdom. We draw on
experiences from some of the projects run at the Dartmouth Experimental Visualization
Laboratory (DEVlab), where we have been developing different types of multimedia
applications.
Efficient Sequential and Parallel Algorithms for the Negative Cycle Problem
Dartmouth Technical Report PCS-TR94-206
Dimitrios Kavvadias
Grammati E. Pantziou
Paul G. Spirakis
Christos D. Zaroliagis
Date: January 1994
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/96/TR94-206.pdf (612KB)
Abstract:
We present here an algorithm for detecting (and outputting, if one exists) a negative
cycle in an $n$-vertex planar digraph $G$ with real edge weights. Its running time
ranges from $O(n)$ up to $O(n^{1.5}\log n)$ as a certain topological measure of $G$
varies from $1$ up to $\Theta(n)$. Moreover, an efficient CREW PRAM implementation
is given. Our algorithm applies also to digraphs whose genus $\gamma$ is $o(n)$.
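For comparison, the textbook way to detect and output a negative cycle in a general digraph is Bellman-Ford, sketched below. It runs in $O(nm)$ time and does not exploit planarity, so it is a baseline rather than the algorithm of the report. Vertices are assumed to be numbered 0..n-1 and edges given as (u, v, weight) triples.

```python
def find_negative_cycle(n, edges):
    """Return the vertices of some negative cycle in order, or None if there is none."""
    dist = [0.0] * n   # as if a virtual source reached every vertex at cost 0
    pred = [None] * n
    last = None
    for _ in range(n):
        last = None
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                pred[v] = u
                last = v
        if last is None:
            return None  # a full pass with no relaxation: no negative cycle
    # A relaxation in the n-th pass implies a negative cycle.
    # Walking back n predecessor steps is guaranteed to land on it.
    v = last
    for _ in range(n):
        v = pred[v]
    cycle = [v]
    u = pred[v]
    while u != v:
        cycle.append(u)
        u = pred[u]
    cycle.reverse()
    return cycle
```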
Quickest Paths: Faster Algorithms and Dynamization
Dartmouth Technical Report PCS-TR94-204
Dimitrios Kagaris
Grammati E. Pantziou
Spyros Tragoudas
Christos D. Zaroliagis
Date: January 1994
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/95/TR94-204.pdf (1196KB)
Abstract:
Given a network $N=(V,E,{c},{l})$, where $G=(V,E)$, $|V|=n$ and $|E|=m$, is a directed
graph, ${c}(e) > 0$ is the capacity and ${l}(e) \ge 0$ is the lead time (or delay)
for each edge $e \in E$, the quickest path problem is to find a path for a given source--destination
pair such that the total lead time plus the inverse of the minimum edge capacity
of the path is minimal. The problem has applications to fast data transmissions
in communication networks. The best previous algorithm for the single--pair quickest
path problem runs in time $O(r m+r n \log n)$, where $r$ is the number of distinct
capacities of $N$ \cite{ROS}. In this paper, we present algorithms for general, sparse
and planar networks that have significantly lower running times. For general networks,
we show that the time complexity can be reduced to $O(r^{\ast} m+r^{\ast} n \log n)$,
where $r^{\ast}$ is at most the number of capacities greater than the capacity of
the shortest (with respect to lead time) path in $N$. For sparse networks, we present
an algorithm with time complexity $O(n \log n + r^{\ast} n + r^{\ast} \tilde{\gamma}
\log \tilde{\gamma})$, where $\tilde{\gamma}$ is a topological measure of $N$. Since
for sparse networks $\tilde{\gamma}$ ranges from $1$ up to $\Theta(n)$, this constitutes
an improvement over the previously known bound of $O(r n \log n)$ in all cases that
$\tilde{\gamma}=o(n)$. For planar networks, the complexity becomes $O(n \log n + n \log^3
\tilde{\gamma}+ r^{\ast} \tilde{\gamma})$. Similar improvements are obtained for the
all--pairs quickest path problem. We also give the first algorithm for solving the
dynamic quickest path problem.
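The objective can be made concrete with a simple enumeration sketch: for each distinct capacity c, find the path of least lead time that uses only edges of capacity at least c, and add the transmission term for c. This does one shortest-path computation per distinct capacity, which is consistent with the form of the earlier bound quoted above; it is not one of the improved algorithms of the report. The sigma parameter (amount of data, defaulting to 1 so the term is the inverse of the capacity) and all names are assumptions of this example.

```python
import heapq

INF = float("inf")


def shortest_lead_time(n, adj, s, t, min_cap):
    """Dijkstra on lead times, ignoring edges with capacity below min_cap."""
    dist = [INF] * n
    dist[s] = 0.0
    heap = [(0.0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        if u == t:
            return d
        for v, cap, lead in adj[u]:
            if cap >= min_cap and d + lead < dist[v]:
                dist[v] = d + lead
                heapq.heappush(heap, (dist[v], v))
    return INF


def quickest_path_time(n, edges, s, t, sigma=1.0):
    """Minimum over paths of (total lead time + sigma / minimum capacity).

    edges: (u, v, capacity, lead_time) tuples for directed edges.
    """
    adj = [[] for _ in range(n)]
    for u, v, cap, lead in edges:
        adj[u].append((v, cap, lead))
    capacities = {cap for _, _, cap, _ in edges}
    return min((shortest_lead_time(n, adj, s, t, c) + sigma / c
                for c in capacities), default=INF)
```

With a slow direct edge and a faster two-hop route, the best path changes with sigma: small transfers favor short lead time, large transfers favor high capacity.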
Parallel h-v Drawings of Binary Trees
Dartmouth Technical Report PCS-TR93-202
Panagiotis Metaxas
Grammati E. Pantziou
Antonios Symvonis
Date: January 1993
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/94/TR93-202.pdf (455KB)
Abstract:
In this paper we present a method to obtain optimal h-v and inclusion drawings in
parallel. Based on parallel tree contraction, our method computes optimal (with
respect to a class of cost functions of the enclosing rectangle) drawings in $O(\log^2
n)$ parallel time by using a polynomial number of EREW processors. The number of
processors reduces substantially when we study minimum area drawings. The method
can be extended to compute optimal inclusion layouts in the case where each leaf
$l$ of the tree is represented by a rectangle $l_x \times l_y$ (the dimensions of which
are part of the input). For polynomial area layouts, our work places the problem
of obtaining optimal size h-v or inclusion drawings in NC, presenting the first
algorithm with polylogarithmic time complexity. Our method also yields an NC algorithm
for the slicing floorplanning problem. Whether this problem was in NC was an open
question~\cite{CT90}.
Parallel Max Cut Approximations
Dartmouth Technical Report PCS-TR93-201
Grammati E. Pantziou
Paul G. Spirakis
Christos D. Zaroliagis
Date: January 1993
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/93/TR93-201.pdf (368KB)
Abstract:
Given a graph with positive integer edge weights one may ask whether there exists
an edge cut whose weight is bigger than a given number. This problem is NP-complete.
We present here an approximation algorithm in NC which provides tight upper bounds
to the proportion of edge cuts whose size is bigger than a given number. Our technique
is based on the methods to convert randomized parallel algorithms into deterministic
ones introduced by Karp and Wigderson. The basic idea of those methods is to replace
an exponentially large sample space by one of polynomial size. In this work, we
prove the interesting result that the statistical distance of random variables of
the small sample space is bigger than the statistical distance of corresponding
variables of the exponentially large space, which is the space of all edge cuts
taken equiprobably.
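The idea of replacing an exponentially large sample space by one of polynomial size can be illustrated on max cut with a standard pairwise-independent construction, sketched below. This is a textbook derandomization and is not claimed to be the specific technique or analysis of the report. Each vertex v gets the nonzero label v + 1, and each seed r assigns v to the side given by the parity of (r AND label). Over all seeds, every edge is cut with probability 1/2, so the best seed cuts at least half of the total weight, and only about 2n seeds are tried instead of 2^n assignments.

```python
def cut_weight(edges, side):
    """Total weight of edges whose endpoints lie on different sides."""
    return sum(w for u, v, w in edges if side[u] != side[v])


def pairwise_independent_cut(n, edges):
    """Search a sample space of size 2**k (k = bit length of n) for a good cut.

    edges: (u, v, weight) triples with vertices numbered 0..n-1.
    """
    k = n.bit_length()  # labels 1..n all fit in k bits and are nonzero
    best, best_side = -1, None
    for r in range(1 << k):
        side = [bin(r & (v + 1)).count("1") & 1 for v in range(n)]
        w = cut_weight(edges, side)
        if w > best:
            best, best_side = w, side
    return best, best_side
```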
On-Line and Dynamic Shortest Paths Through Graph Decompositions
Dartmouth Technical Report PCS-TR93-200
Hristo N. Djidjev
Grammati E. Pantziou
Christos D. Zaroliagis
Date: January 1993
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/92/TR93-200.pdf (1646KB)
Abstract:
We describe algorithms for finding shortest paths and distances in a planar digraph
which exploit the particular topology of the input graph. An important feature of
our algorithms is that they can work in a dynamic environment, where the cost of
any edge can be changed or the edge can be deleted. For outerplanar digraphs, for
instance, the data structures can be updated after any such change in only $O(\log
n)$ time, where $n$ is the number of vertices of the digraph. We also describe the
first parallel algorithms for solving the dynamic version of the shortest path problem.
Our results can be extended to hold for digraphs of genus $o(n)$.
The Expected Lifetime of "Single-Address-Space" Operating Systems
Dartmouth Technical Report PCS-TR93-198
David Kotz
Preston Crow
Date: January 1993
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/91/TR93-198.ps.Z (559KB)
Abstract:
Trends toward shared-memory programming paradigms, large (64-bit) address spaces,
and memory-mapped files have led some to propose the use of a single virtual-address
space, shared by all processes and processors. Typical proposals require the single
address space to contain all process-private data, shared data, and stored files.
To simplify management of an address space where stale pointers make it difficult
to re-use addresses, some have claimed that a 64-bit address space is sufficiently
large that there is no need to ever re-use addresses. Unfortunately, there has been
no data to either support or refute these claims, or to aid in the design of appropriate
address-space management policies. In this paper, we present the results of extensive
kernel-level tracing of the workstations in our department, and discuss the implications
for single-address-space operating systems. We found that single-address-space systems
will not outgrow the available address space, but only if reasonable space-allocation
policies are used, and only if the system can adapt as larger address spaces become
available.
Wavelet Localization of the Radon Transform
Dartmouth Technical Report PCS-TR93-196
Tim Olson
Joe Destefano
Date: January 1993
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/90/TR93-196.pdf (2214KB)
Abstract:
Vector Layout in Virtual-Memory Systems for Data-Parallel Computing
Dartmouth Technical Report PCS-TR93-194
Thomas H. Cormen
Date: January 1993
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/89/TR93-194.pdf (292KB)
Abstract:
Asymptotically Tight Bounds for Performing BMMC Permutations on Parallel Disk Systems
Dartmouth Technical Report PCS-TR93-193
Thomas H. Cormen
Leonard F. Wisniewski
Date: January 1993
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/88/TR93-193.pdf (263KB)
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/88/TR93-193.ps.Z (91KB)
Abstract:
Off-line Cursive Handwriting Recognition Using Style Parameters
Dartmouth Technical Report PCS-TR93-192
Berrin A. Yanikoglu
Peter A. Sandon
Date: January 1993
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/87/TR93-192.ps.Z (201KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/87/TR93-192.pdf (396KB)
Abstract:
(No abstract available).
Accurate Verification of Five-Axis Numerically Controlled Machining
Dartmouth Technical Report PCS-TR93-191
Jerome L. Quinn
Date: January 1993
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/86/TR93-191.pdf (4727KB)
Abstract:
Throughput of Existing Multiprocessor File Systems (An Informal Study)
Dartmouth Technical Report PCS-TR93-190
David Kotz
Date: January 1993
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/85/TR93-190.ps.Z (33KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/85/TR93-190.pdf (93KB)
Abstract:
Fast file systems are critical for high-performance scientific computing, since many
scientific applications have tremendous I/O requirements. Many parallel supercomputers
have only recently obtained fully parallel I/O architectures and file systems, which
are necessary for scalable I/O performance. Scalability aside, I show here that
many systems lack sufficient absolute performance. I do this by surveying the performance
reported in the literature, summarized in an informal table.
Efficient Parallel Algorithms for some Tree Layout Problems
Dartmouth Technical Report PCS-TR93-189
J. Diaz
A. Gibbons
Grammati E. Pantziou
M. Serna
Paul G. Spirakis
J. Toran
Date: January 1993
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/84/TR93-189.pdf (593KB)
Abstract:
The minimum cut and minimum length linear arrangement problems usually occur in solving
wiring problems and have a lot in common with job sequencing questions. Both problems
are NP-complete for general graphs and in P for trees. We present here two algorithms
in NC. The first solves the minimum length linear arrangement problem for unrooted
trees in $O(\log^2 n)$ time and $O(n^2 3^{\log n})$ CREW PRAM processors. The second
algorithm solves the minimum cut arrangement for unrooted trees of maximum degree
$d$ in $O(d \log^2 n)$ time and $O(n^2 / \log n)$ CREW PRAM processors.
Integrating Theory and Practice in Parallel File Systems
Dartmouth Technical Report PCS-TR93-188
Thomas H. Cormen
David Kotz
Date: January 1993
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/83/TR93-188-update.ps.Z (67KB)
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/83/TR93-188-update.pdf (188KB)
Abstract:
Several algorithms for parallel disk systems have appeared in the literature recently,
and they are asymptotically optimal in terms of the number of disk accesses. Scalable
systems with parallel disks must be able to run these algorithms. We present a list
of capabilities that must be provided by the system to support these optimal algorithms:
control over declustering, querying about the configuration, independent I/O, turning
off file caching and prefetching, and bypassing parity. We summarize recent theoretical
and empirical work that justifies the need for these capabilities.
VideoScheme: A Programmable Video Editing System for Automation and Media Recognition
Dartmouth Technical Report PCS-TR93-187
James Matthews
Peter A. Gloor
Fillia Makedon
Date: January 1993
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/82/TR93-187.pdf (800KB)
Abstract:
The recent development of powerful, inexpensive hardware and software support has
made digital video editing possible on personal computers and workstations. To date
the video editing application category has been dominated by visual, easy-to-use,
direct manipulation interfaces. These systems bring high-bandwidth human-computer
interaction to a task formerly characterized by slow, inflexible, indirectly-operated
machines. However, the direct manipulation computer interfaces are limited by their
manual nature, and cannot easily accommodate algorithmically defined operations.
This paper proposes a melding of the common direct manipulation interfaces with
a programming language which we have enhanced to manipulate digital audio and video.
The result is a system which can automate routine tasks as well as perform tasks
based on sophisticated media recognition algorithms.
Formal Implementation of High-Level Languages for Data-Parallel Programming
Dartmouth Technical Report PCS-TR92-186
Deb Banerjee
Date: January 1992
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/81/TR92-186.pdf (3948KB)
Abstract:
The success of parallel architectures has been limited by the lack of high-level
parallel programming languages and useful programming models. The data-parallel
model of programming has been demonstrated to be useful and natural on a wide variety
of parallel architectures. This dissertation presents a set of formal techniques
for compiling high-level languages based on data-parallelism.
Algorithms for Closest Point Problems: Practice and Theory
Dartmouth Technical Report PCS-TR92-185
Peter Su
Date: January 1992
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/80/TR92-185.pdf (1144KB)
Abstract:
This paper describes and evaluates known sequential algorithms for constructing planar
Voronoi diagrams and Delaunay triangulations. In addition, it describes a new incremental
algorithm which is simple to understand and implement, but whose performance is
competitive with all known methods. The experiments in this paper are more than
just simple benchmarks; they evaluate the expected performance of the algorithms
in a precise and machine independent fashion. Thus, the paper also illustrates how
to use experimental tools to both understand the behaviour of different algorithms
and to guide the algorithm design process.
Building Segment Trees in Parallel
Dartmouth Technical Report PCS-TR92-184
Peter Su
Robert L. Scot Drysdale
Date: January 1992
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/79/TR92-184.pdf (635KB)
Abstract:
The segment tree is a simple and important data structure in computational geometry
[7,11]. We present an experimental study of parallel algorithms for building segment
trees. We analyze the algorithms in the context of both the PRAM (Parallel Random
Access Machine) and hypercube architectures. In addition, we present performance
data for implementations developed on the Connection Machine. We compare two different
parallel algorithms, and we also compare our parallel algorithms to a good sequential
algorithm for doing the same job. In this way, we evaluate the overall efficiency
of our parallel methods. Our performance results illustrate the problems involved
in using popular machine models (PRAM) and analysis techniques (asymptotic efficiency)
to predict the performance of parallel algorithms on real machines. We present two
different analyses of our algorithms and show that neither is effective in predicting
the actual performance numbers that we obtained.
Concurrent Local Search for Fast Proximity Algorithms on Parallel and Vector Architectures
Dartmouth Technical Report PCS-TR92-183
Peter Su
Date: January 1992
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/78/TR92-183.pdf (725KB)
Abstract:
This paper presents a fast algorithm for solving the all-nearest-neighbors problem.
The algorithm uses a data parallel style of programming which can be efficiently
utilized on a variety of parallel and vector architectures [4,21,26]. I have implemented
the algorithm in C on one such architecture, the Cray Y-MP. On one Cray CPU, the
implementation is about 19 times faster than a fast sequential algorithm running
on a Sparc workstation. The main idea in the algorithm is to divide the plane up
into a fixed grid of cells, or buckets. When the points are well distributed, the
algorithm processes each query point, q, by searching a small number of cells close
to q. Bentley, Weide and Yao first presented this idea for conventional architectures
[3], but the technique works equally well on parallel and vector machines, leading
to a simple, efficient algorithm. We can also use the cell technique to solve a
wide variety of basic computational problems such as finding closest pairs, sorting
and constructing Voronoi diagrams and Delaunay triangulations.
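The cell technique can be sketched sequentially as follows. This is an illustrative sketch only, not the report's data-parallel Cray implementation: the function name `all_nearest_neighbors` and the `cell` size parameter are assumptions, and the input is assumed to be a non-empty list of (x, y) tuples. Each query searches rings of cells around its own cell and stops once the best distance found cannot be beaten by any point in a farther ring.

```python
import math
from collections import defaultdict


def all_nearest_neighbors(points, cell):
    """For each point, return the index of its nearest other point (or None)."""
    # Bucket every point into a square grid cell of side `cell`.
    grid = defaultdict(list)
    for i, (x, y) in enumerate(points):
        grid[(math.floor(x / cell), math.floor(y / cell))].append(i)

    xs = [c[0] for c in grid]
    ys = [c[1] for c in grid]
    max_ring = max(max(xs) - min(xs), max(ys) - min(ys))

    result = []
    for i, (x, y) in enumerate(points):
        cx, cy = math.floor(x / cell), math.floor(y / cell)
        best, best_d = None, math.inf
        for r in range(max_ring + 1):
            for dx in range(-r, r + 1):
                for dy in range(-r, r + 1):
                    if max(abs(dx), abs(dy)) != r:
                        continue            # visit only the cells on ring r
                    for j in grid.get((cx + dx, cy + dy), ()):
                        if j != i:
                            d = math.dist(points[i], points[j])
                            if d < best_d:
                                best, best_d = j, d
            # Any point in ring r + 1 or beyond is farther than r * cell away.
            if best_d <= r * cell:
                break
        result.append(best)
    return result


# Illustrative input (made up for this sketch).
points = [(0.1, 0.1), (0.2, 0.15), (0.9, 0.9), (0.5, 0.5), (0.85, 0.8), (3.0, 3.0)]
neighbors = all_nearest_neighbors(points, 0.25)
```

When the points are well distributed and the cell size is chosen to hold a few points per cell, most queries stop after the first ring or two, which is the behavior the abstract relies on.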
How to Encrypt /usr/dict/words in About a Second
Dartmouth Technical Report PCS-TR92-182
Peter Su
Matt Bishop
Date: January 1992
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/77/TR92-182.pdf (283KB)
Abstract:
We present an implementation of the Data Encryption Standard on the Connection Machine
architecture. The DES encryption algorithm is ideally suited to the Connection Machine
because it consists of bit serial operations, and thousands of encryptions can be
done in parallel, independently of one another. Thus, our code encrypts passwords
about ten times faster than the fastest competition that we know about.
In addition, the nature of the Connection Machine's architecture is such that some
of the optimizations that make DES run much faster on conventional architectures
have no effect on the performance of the Connection Machine. Our comparison of a
simple implementation along with one that uses many optimizations illustrates this
fact.
On The De Bruijn Torus Problem
Dartmouth Technical Report PCS-TR92-181
Glenn Hurlbert
Garth Isaak
Date: January 1992
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/76/TR92-181.pdf (421KB)
Abstract:
A $(k^n; n)_k$-de Bruijn Cycle is a cyclic $k$-ary sequence with the property that every
$k$-ary $n$-tuple appears exactly once contiguously on the cycle. A $(k^r, k^s; m, n)_k$-de
Bruijn Torus is a $k$-ary $k^r \times k^s$ toroidal array with the property that every $k$-ary
$m \times n$ matrix appears exactly once contiguously on the torus. As is the case with
de Bruijn cycles, the 2-dimensional version has many interesting applications, from
coding and communications to pseudo-random arrays, spectral imaging, and robot self-location.
J.C. Cock proved the existence of such tori for all $m$, $n$, and $k$, and Chung, Diaconis,
and Graham asked if it were possible that $r = s$ and $m = n$ for $n$ even. Fan, Fan,
Ma and Siu showed this was possible for $k = 2$. Combining new techniques with old,
we prove the result for $k > 2$ and show that actually much more is possible. The
cases in 3 or more dimensions remain.
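The one-dimensional definition above can be checked concretely. The sketch below covers only de Bruijn cycles, not the tori studied in the report. It uses the well-known construction that concatenates Lyndon words, and the function names `de_bruijn` and `is_de_bruijn_cycle` are chosen here for illustration.

```python
def de_bruijn(k, n):
    """Return a k-ary de Bruijn cycle of order n as a list of k**n symbols."""
    a = [0] * (k * n)
    seq = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                seq.extend(a[1:p + 1])      # emit a Lyndon word whose length divides n
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return seq


def is_de_bruijn_cycle(seq, k, n):
    """True if every k-ary n-tuple occurs exactly once, reading cyclically."""
    length = len(seq)
    windows = {tuple(seq[(i + j) % length] for j in range(n)) for i in range(length)}
    return (length == k ** n
            and len(windows) == k ** n
            and all(0 <= s < k for s in seq))


cycle = de_bruijn(2, 3)
```

Because the cycle has exactly $k^n$ positions and the check finds $k^n$ distinct windows, each $n$-tuple must appear exactly once.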
A Visualization System for Correctness Proofs of Graph Algorithms
Dartmouth Technical Report PCS-TR92-180
Peter A. Gloor
Donald B. Johnson
Fillia Makedon
Panagiotis Metaxas
Date: January 1992
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/75/TR92-180.pdf (1010KB)
Abstract:
In this paper we describe a system for visualizing correctness proofs of graph algorithms.
The system has been demonstrated for a greedy algorithm, Prim's algorithm for finding
a minimum spanning tree of an undirected, weighted graph. We believe that our system
is particularly appropriate for greedy algorithms, though much of what we discuss
can guide visualization of proofs in other contexts. While an example is not a proof,
our system provides concrete examples to illustrate the operation of the algorithm.
These examples can be referred to by the user interactively and alternatively with
the visualization of the proof where the general case is portrayed abstractly.
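For readers who want the algorithm itself in front of them, here is a minimal sketch of Prim's algorithm. It is not taken from the report or its visualization system; it assumes a connected undirected graph with vertices 0..n-1 and edges given as (u, v, weight) triples, and the name `prim_mst` is illustrative.

```python
import heapq


def prim_mst(n, edges):
    """Return (total_weight, tree_edges) for a connected undirected graph."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((w, u, v))
        adj[v].append((w, v, u))

    in_tree = [False] * n
    in_tree[0] = True                       # grow the tree from vertex 0
    heap = list(adj[0])
    heapq.heapify(heap)

    total, tree = 0, []
    while heap and len(tree) < n - 1:
        w, u, v = heapq.heappop(heap)       # lightest edge leaving the tree so far
        if in_tree[v]:
            continue                        # stale entry: both ends already in the tree
        in_tree[v] = True
        total += w
        tree.append((u, v, w))
        for edge in adj[v]:
            if not in_tree[edge[2]]:
                heapq.heappush(heap, edge)
    return total, tree


# Illustrative graph (made up for this sketch).
total, tree = prim_mst(4, [(0, 1, 4), (0, 2, 1), (1, 2, 2), (1, 3, 5), (2, 3, 8)])
```

The greedy step is the single `heappop`: the algorithm always adds the lightest edge that leaves the current tree, which is the choice a correctness proof has to justify.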
Multiprocessor File System Interfaces
Dartmouth Technical Report PCS-TR92-179
David Kotz
Date: January 1992
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/74/TR92-179.ps.Z (72KB)
Abstract:
Increasingly, file systems for multiprocessors are designed with parallel access
to multiple disks, to keep I/O from becoming a serious bottleneck for parallel applications.
Although file system software can transparently provide high-performance access
to parallel disks, a new file system interface is needed to facilitate parallel
access to a file from a parallel application. We describe the difficulties faced
when using the conventional (Unix-like) interface in parallel applications, and
then outline ways to extend the conventional interface to provide convenient access
to the file for parallel programs, while retaining the traditional interface for
programs that have no need for explicitly parallel file access. Our interface includes
a single naming scheme, a multiopen operation, local and global file pointers, mapped
file pointers, logical records, multifiles, and logical coercion for backward compatibility.
Parallel Algorithms For Graph Problems (Thesis)
Dartmouth Technical Report PCS-TR92-178
Panagiotis Metaxas
Date: January 1992
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/73/TR92-178.pdf (3227KB)
Abstract:
In this thesis we examine three problems in graph theory and propose efficient parallel
algorithms for solving them. We also introduce a number of parallel algorithmic
techniques.
Multiplicatively Weighted Crystal Growth Voronoi Diagrams (Thesis)
Dartmouth Technical Report PCS-TR92-177
Barry F. Schaudt
Date: January 1992
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/72/TR92-177.pdf (4057KB)
Abstract:
Voronoi diagrams and variants of Voronoi diagrams have been used for many years to
model crystal growth. If the boundaries of the growing crystals are circular and all
the crystals start at the same time and have the same constant growth rate, then
the Voronoi diagram is used to model the growth. If the crystals start at different
times, the additively weighted Voronoi diagram is used to model the crystal growth.
In this thesis, I propose a new type of Voronoi diagram called the multiplicatively
weighted crystal growth Voronoi diagram, that can be used to model crystal growth
when the crystals have different constant growth rates. In this new model, the distance
from a site to a point in its region is measured along a shortest path lying entirely
within the region. In the multiplicatively weighted crystal growth Voronoi diagram,
a growing crystal (or region) may "wrap around" another site's region. When a region
wraps around, distances from the site are in part measured along the boundary of
the two regions, treating one of the regions as an obstacle, rather than along a
straight line that passes through the region.
The worst case size of the multiplicatively weighted crystal growth Voronoi diagram
is $O(n^2)$. To construct the diagram, techniques from numerical analysis are used
to approximate and to intersect curves described by a system of first order differential
equations. Numerical methods to approximate a curve construct a polygonal approximation
of the curve. One step of the numerical methods constructs an edge of the polygonal
approximation. In the new Voronoi diagram, a step may require $O(n)$ constant time
operations. Let $S$ be the number of steps required by the numerical method used just
to draw the diagram. In the worst case, the algorithm presented in this thesis requires
$O(n^3)$ intersection calculations plus $O(nS \lg S)$ time using $O(n^3 + S)$ space.
A variant of this algorithm requires $O(n^3)$ intersection calculations plus
$O(nS^2 + n^2 S)$ time using $O(n^2)$ space.
Also presented are some variants of the new Voronoi diagram. One of these variants
uses a convex polygon distance function. The multiplicatively weighted crystal growth
Voronoi diagram using a convex polygon distance function does not require numerical
methods to construct.
Parallel Computer Needs at Dartmouth College
Dartmouth Technical Report PCS-TR92-176
David Kotz
Fillia Makedon
Matt Bishop
Robert L. Scot Drysdale
Donald B. Johnson
Panagiotis Metaxas
Date: January 1992
URL (application/x-compress) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/71/TR92-176.ps.Z (63KB)
Abstract:
To determine the need for a parallel computer on campus, a committee of the Graduate
Program in Computer Science surveyed selected Dartmouth College faculty and students
in December, 1991, and January, 1992. We hope that the information in this report
can be used by many groups on campus, including the Computer Science graduate program
and DAGS summer institute, Kiewit's NH Supercomputer Initiative, and by numerous
researchers hoping to collaborate with people in other disciplines.
We found significant interest in parallel supercomputing on campus. An on-campus
parallel supercomputing facility would not only support numerous courses and research
projects, but would provide a locus for intellectual activity in parallel computing,
encouraging interdisciplinary collaboration. We believe that this report is a first
step in that direction.
Optimal Algorithms for Multipacket Routing Problems on Rings
Dartmouth Technical Report PCS-TR91-174
Fillia Makedon
Antonios Symvonis
Date: January 1991
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/70/TR91-174.pdf (761KB)
Abstract:
We study multipacket routing problems. We divide the multipacket routing problem
into two classes, namely, distance limited and bisection limited routing problems.
Then, we concentrate on rings of processors. We prove a new lower bound of 2n/3
routing steps for the case of distance limited routing problems. We also give an
algorithm that tightens this lower bound. For bisection limited problems the lower
bound is kn/4, k > 2, where k is the number of packets per processor. The trivial
algorithm needs in the worst case k |n/2| steps to terminate. An algorithm that
completes the routing in kn/4 + 2.5n routing steps is given. We define the class
of pure routing algorithms and we demonstrate that new lower bounds hold if the
routing is to be done by an algorithm in this class.
Effects of Replication on the Duration of Failure in Distributed Databases
Dartmouth Technical Report PCS-TR91-169
Donald B. Johnson
Larry Raab
Date: January 1991
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/69/TR91-169.pdf (780KB)
Abstract:
Replicating data objects has been suggested as a means of increasing the performance
of a distributed database system in a network subject to link and site failures.
Since a network may partition as a consequence of such failures, a data object may
become unavailable from a given site for some period of time. In this paper we study
the duration of failure, which we define as the length of time, once the object becomes
unavailable from a particular site, that the object remains unavailable. We show
that, for networks composed of highly-reliable components, replication does not
substantially reduce the duration of failure. We model a network as a collection
of sites and links, each failing and recovering independently according to a Poisson
process. Using this model, we demonstrate via simulation that the duration of failure
incurred using a non-replicated data object is nearly as short as that incurred
using a replicated object and a replication control protocol, including an unrealizable
protocol which is optimal with respect to availability. We then examine analytically
a simplified system in which the sites but not the links are subject to failure.
We prove that if each site operates with probability p, then the optimal replication
protocol, Available Copies [5,26], reduces the duration of failure by at most a
factor of (1-p)/(1+p). Lastly, we present bounds for general systems, those in which
both the sites and the communications between the sites may fail. We prove, for
example, that if sites are 95% reliable and a communications failure is sufficiently
short (either infallible or satisfying a function specified in the paper) then replication
can improve the duration of failure by at most 2.7% of that experienced using a
single copy. These results show that replication has only a small effect on the
duration of failure in present-day partitionable networks comprised of realistically
reliable components.
Availability Issues in Data Replication in Distributed Database
Dartmouth Technical Report PCS-TR91-168
Donald B. Johnson
Larry Raab
Date: January 1991
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/68/TR91-168.pdf (549KB)
Abstract:
Replication of data at more than one site in a distributed database has been reported
to increase the availability of data in systems where sites and links are subject
to failure. We have shown in results summarized in this paper that in many interesting
cases the advantage is slight. A well-placed single copy is available to transactions
almost as much of the time as is correct replicated data no matter how ingeniously
it is managed. We explain these findings in terms of the behavior of the partitions
that form in networks where components fail. We also show that known and rather
simple protocols for the maintenance of multiple copies are essentially best possible
by comparing them against an unrealizable "protocol" that knows the future. We complete
our study of these questions by reporting that while computing the availability
of data is #P-complete, nonetheless there is a tight analytical bound on the amount
replication can improve over a well-located single copy. We close with some observations
regarding system design motivated by this work.
Complexity of Network Reliability and Optimal Database Placement Problems
Dartmouth Technical Report PCS-TR91-167
Donald B. Johnson
Larry Raab
Date: January 1991
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/67/TR91-167.pdf (468KB)
Abstract:
A fundamental problem of distributed database design in an existing network where
components can fail is finding an optimal location at which to place the database
in a centralized system or copies of each data item in a decentralized or replicated
system. In this paper it is proved for the first time exactly how hard this placement
problem is under the measure of data availability. Specifically, we show that the
optimal placement problem for availability is #P-complete, a measure of intractability
at least as severe as NP-completeness. Given the anticipated computational difficulty
of finding an exact solution, we go on to describe an effective, practical method
for approximating the optimal copy placement. To obtain these results, we model
the environment in which a distributed database operates by a probabilistic graph,
which is a set of fully-reliable vertices representing sites, and a set of edges
representing communication links, each operational with a rational probability.
We prove that finding the optimal copy placement in a probabilistic graph is #P-complete
by giving a sequence of reductions from #Satisfiability. We generalize this result
to networks in which each site and each link has an independent, rational operational
probability and to networks in which all the sites or all the links have a fixed,
uniform operational probability.
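The probabilistic-graph model can be made concrete with a brute-force sketch. The availability measure used below is an assumption made for illustration (the expected fraction of sites connected to the site holding the single copy); the report's exact definition may differ. The names `reachable`, `availability`, and `best_single_copy_site` are hypothetical. The enumeration visits all $2^m$ link states for $m$ links, which is feasible only for tiny networks and is in keeping with the hardness result stated above.

```python
from fractions import Fraction
from itertools import product


def reachable(n, up_edges, source):
    """Set of vertices connected to `source` using only operational links."""
    adj = {v: [] for v in range(n)}
    for u, v in up_edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {source}, [source]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def availability(n, edges, site):
    """Expected fraction of sites that can reach `site` (assumed measure)."""
    total = Fraction(0)
    for states in product((False, True), repeat=len(edges)):
        prob = Fraction(1)
        up = []
        for (u, v, p), is_up in zip(edges, states):
            prob *= p if is_up else 1 - p   # links fail independently
            if is_up:
                up.append((u, v))
        total += prob * len(reachable(n, up, site)) / n
    return total


def best_single_copy_site(n, edges):
    """Site that maximizes the assumed availability measure."""
    return max(range(n), key=lambda s: availability(n, edges, s))


# Illustrative three-site path 0 - 1 - 2 with links that are up half the time.
half = Fraction(1, 2)
path = [(0, 1, half), (1, 2, half)]
best = best_single_copy_site(3, path)
```

Using `Fraction` keeps the link probabilities rational, matching the model described in the abstract, and makes the computed availabilities exact.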
A Parallel Algorithm for the Minimum Spanning Tree
Dartmouth Technical Report PCS-TR91-166
Donald B. Johnson
Panagiotis Metaxas
Date: January 1991
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/66/TR91-166.pdf (537KB)
Abstract:
An Object-Oriented Learning/Design Support Environment
Dartmouth Technical Report PCS-TR91-165
Jill P. David
Julie C. Jumes
Fillia Makedon
Date: January 1991
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/65/TR91-165.pdf (439KB)
Abstract:
We present an object-oriented experimental learning and design support environment,
called AVT, for Algorithm Visualization Tool, implemented in Digitalk's Smalltalk/V
on a Macintosh II. AVT provides a domain-independent visualization tool, an exploratory
learning environment, and an experimental heuristic design environment. Algorithm
visualization is the exploration of ways to visualize intuitively the computational
behavior of an algorithm using multiple views, some of which are visual in the graphical
sense [2,4]. AVT employs other views (combining text and graphics) to explain the
problem, the strategy, the heuristics, and the reasoning process behind the solutions.
User interaction in AVT includes not only passive viewing of the animated algorithmic
process but also active participation in the design of the steps of the algorithm.
Object-Oriented Programming (OOP) offers an attractive paradigm for rapidly implementing
heuristics as well as more coherent and understandable code [1,12]. Inheritance
properties of OOP languages capture natural mechanisms such as specialization, abstraction,
and evolution, allowing us to model our environment in a more natural manner [11].
Ilona: An advanced CAI Tutorial System for the Fundamentals of Logic
Dartmouth Technical Report PCS-TR91-164
Otto Mayer
Graham E. Oberem
Fillia Makedon
Date: January 1991
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/64/TR91-164.pdf (691KB)
Abstract:
An advanced tutorial system for teaching the fundamentals of logic has been developed
to run on UNIX workstations and commonly available micro-computers. An important
part of this tutorial is the intelligent problem solving environment which allows
students to practise writing logical sentences in mathematical notation. A natural
language system for intelligent logic narrative analysis (ILONA) allows students
to type in their own logical sentences in plain English and then have the computer
check their working when they write these in mathematical form. ILONA is an intelligent
tutoring system which allows students a great deal of initiative in problem solving
and provides a degree of flexibility in answer evaluation not found in traditional
CAI systems. The concepts and structures used in the development of ILONA are easily
transferable to other domains.
Multipacket Routing on Rings
Dartmouth Technical Report PCS-TR91-163
Fillia Makedon
Adononios Simvonis
Date: January 1991
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/63/TR91-163.pdf (622KB)
Abstract:
We study multipacket routing problems. We divide the multipacket routing problem
into two classes, namely, distance limited and bisection limited routing problems.
Then, we concentrate on rings of processors. Having a full understanding of the
multipacket routing problem on rings is essential before trying to attack the problem
for the more general case of r-dimensional meshes and tori. We prove a new lower
bound of 2n/3 routing steps for the case of distance limited routing problems. We
also give an algorithm that tightens this lower bound. For bisection limited problems,
we present an algorithm that completes the routing in near optimal time.
A Metric Towards Efficient Exhaustive Test Pattern Generation
Dartmouth Technical Report PCS-TR91-162
Dimitrios Kagaris
Fillia Makedon
Date: January 1991
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/62/TR91-162.pdf (543KB)
Abstract:
A viable technique [7] in built-in self-test (BIST) [2] is to generate test patterns
pseudo-exhaustively by using linear feedback shift registers (LFSRs). The goal
is to find an appropriate primitive polynomial of degree $d$ that will generate $2^d$
test patterns in order to exercise all circuit outputs simultaneously. In an attempt
to reduce the degree d of the polynomial the following strategy was proposed in
[6,5]. In the first phase, partition the circuit into segments by inserting a small
number of register cells, so that the input dependency of any circuit element in
the segments is no more than d. Then, obtain an appropriate primitive polynomial
of degree d by inserting additional register cells. In [12] we have proposed a heuristic
for phase one that does not necessarily partition the circuit. Extensive experimentation
has shown that this results in a considerably smaller cell overhead.
In this paper we extend our heuristic in [12], so that the minimization of the
number of register cells is done in conjunction with a quantity that naturally reflects
the difficulty of deriving an appropriate primitive polynomial of degree d. Experimentation
shows that the proposed heuristic results again in an overall smaller number of
register cells than a partition based approach and in an efficient framework for
test pattern generation.
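As background for the LFSR-based pattern generation described above, here is a minimal sketch of a 4-bit LFSR. It is not the report's heuristic. The feedback taps are an assumption chosen to realize a primitive polynomial of degree 4 ($x^4 + x + 1$ or its reciprocal $x^4 + x^3 + 1$, depending on the bit-ordering convention; both are primitive). Note that a plain LFSR of degree $d$ cycles through the $2^d - 1$ nonzero states, so the all-zero pattern has to be supplied separately.

```python
def lfsr_patterns(seed=0b0001):
    """Yield the states of a 4-bit Fibonacci LFSR until it returns to the seed."""
    state = seed
    while True:
        yield state
        feedback = (state ^ (state >> 1)) & 1   # XOR of bits 0 and 1
        state = (state >> 1) | (feedback << 3)  # shift right, insert feedback at bit 3
        if state == seed:
            return


patterns = list(lfsr_patterns())
```

Starting from any nonzero seed, the register visits every nonzero 4-bit pattern once before repeating, which is what makes a primitive polynomial suitable for pseudo-exhaustive testing.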
On Minimizing Hardware Overhead for Exhaustive Circuit Testability
Dartmouth Technical Report PCS-TR91-161
Dimitrios Kagaris
Fillia Makedon
Date: January 1991
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/61/TR91-161.pdf (452KB)
Abstract:
Exhaustive built-in self testing is given much attention as a viable technique in
the context of VLSI technology. In this paper, we present a heuristic to
make exhaustive testing of combinational circuits practical. The goal is to place
a small number of register cells on the nets of the input circuit so that the input
dependency of combinational elements in the circuit is less than a small given integer
k. Our heuristic guarantees that each output can be individually tested with $2^k$
test patterns and can be used as a subroutine to generate efficient test patterns
to test all the outputs of the circuit simultaneously. For example, we can connect
the register cells in a Linear Feedback Shift Register (LFSR).
Minimizing the number of the inserted register cells reduces the hardware overhead
as well as the upper bound on the number of test patterns generated. A heuristic
approach has been proposed only for the case when an element in the circuit schematic
denotes a boolean gate. An element may, however, also be used to represent a combinatorial
circuit model. Our heuristic applies to this case as well. Extensive experimentation
indicates that the proposed technique is very efficient.
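The LFSR arrangement mentioned in this abstract can be illustrated with a minimal sketch. It assumes a 4-cell register and the primitive polynomial x^4 + x^3 + 1; the polynomial, the width, and the name lfsr_states are illustrative choices, not taken from the report.

```python
def lfsr_states(seed=0b0001):
    """Yield one full cycle of a 4-bit Fibonacci LFSR for x^4 + x^3 + 1.

    The width and polynomial are illustrative assumptions, not the report's.
    """
    if not 1 <= seed <= 15:
        raise ValueError("seed must be a nonzero 4-bit value")
    state = seed
    while True:
        yield state
        # Feedback is the XOR of the two lowest bits (taps 4 and 3).
        feedback = (state ^ (state >> 1)) & 1
        # Shift right and feed the new bit in at the top.
        state = (state >> 1) | (feedback << 3)
        if state == seed:
            return


patterns = list(lfsr_states())
```

Because the polynomial is primitive, the register steps through every nonzero 4-bit state before repeating. The all-zero pattern is never produced by such a register and would have to be applied separately.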
Connected Components in O(lg^{3/2} |V|) Parallel Time for the CREW PRAM
Dartmouth Technical Report PCS-TR91-160
Donald B. Johnson
Panagiotis Metaxas
Date: January 1991
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/60/TR91-160.pdf (1272KB)
Abstract:
Computing the connected components of an undirected graph G = (V,E) on |V| = n vertices
and |E| = m edges is a fundamental computational problem. The best known parallel
algorithm for the CREW PRAM model runs in O(lg^2 n) time using n^2/lg^2 n processors
[CLC82,HCS79]. For the CRCW PRAM model in which concurrent writing is permitted,
the best known algorithm runs in O(lg n) time using almost (n+m)/lg n processors
[SV82,CV86,AS87]. Unfortunately, simulating this algorithm on the weaker CREW model
increases its running time to O(lg^2 n) [CDR86,KR90,Vis83]. We present here an efficient
and simple algorithm that runs in O(lg^{3/2} n) time using n+m CREW processors.
Optimal Parallel and Sequential Algorithms for the Vertex Updating Problem of a Minimum Spanning Tree
Dartmouth Technical Report PCS-TR91-159
Donald B. Johnson
Panagiotis Metaxas
Date: January 1991
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/59/TR91-159.pdf (983KB)
Abstract:
We present a set of rules that can be used to give optimal solutions to the vertex
updating problem for a minimum spanning tree: Update a given MST when a new vertex
z is introduced, along with weighted edges that connect z with the vertices of
the graph. These rules lead to simple parallel algorithms that run in O(lg n) parallel
time using n/lg n EREW PRAMs. They can also be used to derive simple linear-time
sequential algorithms for the same problem. Furthermore, we show how our solution
can be used to solve the multiple vertex updating problem.
Implementation Notes on bdes(1)
Dartmouth Technical Report PCS-TR91-158
Matt Bishop
Date: January 1991
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/58/TR91-158.pdf (433KB)
Abstract:
This note describes the implementation of bdes, the file encryption program being
distributed in the 4.4 release of the Berkeley Software Distribution. It implements
all modes of the Data Encryption Standard program.
An Overview of Computer Viruses in a Research Environment
Dartmouth Technical Report PCS-TR91-156
Matt Bishop
Date: January 1991
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/57/TR91-156.pdf (1551KB)
Abstract:
The threat of attack by computer viruses is in reality a very small part of a much
more general threat, specifically attacks aimed at subverting computer security.
This paper examines computer viruses as malicious logic in a research and development
environment, relates them to various models of security and integrity, and examines
current research techniques aimed at controlling the threats viruses in particular,
and malicious logic in general, pose to computer systems. Finally, a brief examination
of the vulnerabilities of research and development systems that malicious logic
and computer viruses may exploit is undertaken.
A Security Analysis of Version 2 of the Network Time Protocol NTP: A Report to the Privacy and Security Research Group
Dartmouth Technical Report PCS-TR91-154
Matt Bishop
Date: January 1991
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/56/TR91-154.pdf (999KB)
Abstract:
The Network Time Protocol is being used throughout the Internet to provide an accurate
time service. This paper examines the security requirements of such a service, analyzes
version 2 of the NTP protocol to determine how well it meets these requirements,
and suggests improvements where appropriate.
Privacy-Enhanced Electronic Mail
Dartmouth Technical Report PCS-TR91-150
Matt Bishop
Date: January 1991
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/55/TR91-150.pdf (2070KB)
Abstract:
(Revision 3). The security of electronic mail sent through the Internet may be described
in exactly three words: there is none. The Privacy and Security Research Group has
recommended implementing mechanisms designed to provide security enhancements. The
first set of mechanisms provides a protocol to provide privacy, integrity, and authentication
for electronic mail; the second provides a certificate-based key management infrastructure
to support key distribution throughout the internet, to support the first set of
mechanisms. This paper describes these mechanisms, as well as the reasons behind
their selection and how these mechanisms can be used to provide some measure of
security in the exchange of electronic mail.
Finding Optimal Quorum Assignments for Distributed Databases
Dartmouth Technical Report PCS-TR90-158
Donald B. Johnson
Larry Raab
Date: January 1990
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/54/TR90-158.pdf (691KB)
Abstract:
Replication has been studied as a method of increasing the availability of a data
item in a distributed database subject to component failures and consequent partitioning.
The potential for partitioning requires that a protocol be employed which guarantees
that any access to a data item is aware of the most recent update to that data item.
By minimizing the number of access requests denied due to this constraint, we maximize
availability. In the event that all access requests are reads, placing one copy
of the data item at each site clearly leads to maximum availability. The other extreme,
all access requests are write requests or are treated as such, has been studied
extensively in the literature. In this paper we investigate the performance of systems
with both read and write requests. We describe a distributed on-line algorithm for
determining the optimal parameters, or optimal quorum assignments, for a commonly
studied protocol, the quorum consensus protocol [9]. We also show how to incorporate
these optimization techniques into a dynamic quorum reassignment protocol. In addition,
we demonstrate via simulation both the value of this algorithm and the effect of
various read-write ratios on availability. This simulation, on 101 sites and up
to 5050 links (fully connected), demonstrates that the techniques described here
can greatly increase data availability, and that the best quorum assignments are
frequently realized at the extreme values of the quorum parameters.
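The notion of a quorum assignment can be made concrete with a small sketch. It assumes the usual formulation of quorum consensus with one vote per site, where a read quorum r and a write quorum w over n sites must satisfy r + w > n and 2w > n; the function name minimal_quorums is hypothetical and the report's own parameterization may differ.

```python
def minimal_quorums(n_sites):
    """List (read_quorum, write_quorum) pairs with the smallest valid read
    quorum for each write quorum, assuming one vote per site.

    Constraints assumed: r + w > n (reads see the latest write) and
    2 * w > n (two writes cannot proceed in disjoint partitions).
    """
    if n_sites < 1:
        raise ValueError("need at least one site")
    return [(n_sites - w + 1, w) for w in range(n_sites // 2 + 1, n_sites + 1)]


# For five sites: majority reads and writes at one extreme,
# read-one/write-all at the other.
assignments = minimal_quorums(5)
```

The first and last entries of the list correspond to the extreme values of the quorum parameters that the abstract refers to.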
A Tight Upper Bound on the Benefits of Replication and Consistency Control Protocols
Dartmouth Technical Report PCS-TR90-157
Donald B. Johnson
Larry Raab
Date: January 1990
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/53/TR90-157.pdf (610KB)
Abstract:
We present an upper bound on the performance provided by a protocol guaranteeing
mutually exclusive access to a replicated resource in a network subject to component
failure and subsequent partitioning. The bound is presented in terms of the performance
of a single resource in the same network. The bound is tight and is the first such
bound known to us. Since mutual exclusion is one of the requirements for maintaining
the consistency of a database object, this bound provides an upper limit on the
availability provided by any database consistency control protocol, including those
employing dynamic data relocation and replication. We show that if a single copy
provides availability A for 0 <= A <= 1, then no scheme can achieve availability
greater than sqrt(A) in the same network. We show this bound to be the best possible
for any network with availability greater than .25. Although, as we proved, the
problem of calculating A is #P-complete, we describe a method for approximating
the optimal location for a single copy which adjusts dynamically to current network
characteristics. This bound is most useful for high availabilities, which tend to
be obtainable with modern networks and their constituent components.
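As a worked illustration of the sqrt(A) bound stated in this abstract, the following sketch computes the bound and the largest possible gain over a single copy. The function names are hypothetical, and the sample availabilities are chosen only for illustration.

```python
import math


def replication_bound(single_copy_availability):
    """Upper bound sqrt(A) on availability under any replication scheme,
    given single-copy availability A in [0, 1] (as stated in the abstract)."""
    a = single_copy_availability
    if not 0.0 <= a <= 1.0:
        raise ValueError("availability must lie in [0, 1]")
    return math.sqrt(a)


def max_gain(single_copy_availability):
    """Largest improvement over a single copy that the bound permits."""
    return replication_bound(single_copy_availability) - single_copy_availability


# A single copy with availability 0.81 can be improved to at most about 0.90.
bound_at_081 = replication_bound(0.81)
```

The permitted gain sqrt(A) - A shrinks toward zero as A approaches 1, so for highly available networks the bound leaves little room for improvement.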
Effects of Replication on Data Availability
Dartmouth Technical Report PCS-TR90-155
Donald B. Johnson
Larry Raab
Date: January 1990
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/52/TR90-155.pdf (768KB)
Abstract:
In this paper we examine the effects of replication on the availability of data in
a large network. This analysis differs from previous analyses in that it compares
the performance of a dynamic consistency control protocol not only to that of other
consistency control protocols, but also to the performance of non-replication and
to an upper bound on data availability. This analysis also differs in that we gather
extensive simulations on large networks subject to partitions at realistically high
component reliabilities. We examine the dynamic consistency protocol presented by
Jajodia and Mutchler [9,12] and by Long and Paris [18] along with two proposed enhancements
to this protocol [10,11]. We study networks of 101 sites and up to 5050 links (fully connected)
in which all components, although highly reliable, are subject to failure. We demonstrate
the importance in this realistic environment of an oft neglected parameter of the
system model, the ratio of transaction submissions to component failures. We also
show the impact of the number of copies on both the protocol performance and the
potential of replication as measured by the upper bound. Our simulations show that
the majority of current protocol performs optimally for topologies that yield availabilities
of at least 65%. On the other hand, the availability provided by non-replication
is inferior to that of the majority of current protocol by at most 5.9 percentage
points for these same topologies. At this point of maximum difference, the primary
copy protocol yields availability 59.1% and the majority of current protocol yields
availability 65.0%. We discuss the characteristics of the model limiting the performance
of replication.
Administrator's Guide to the Digital Signature Facility "Rover"
Dartmouth Technical Report PCS-TR90-153
Matt Bishop
Date: January 1990
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/51/TR90-153.pdf (798KB)
Abstract:
This document describes the installation and maintenance of the rover utility, which
provides a digital signature capability for internet messages.
A Proactive Password Checker
Dartmouth Technical Report PCS-TR90-152
Matt Bishop
Date: January 1990
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/50/TR90-152.pdf (655KB)
Abstract:
Password selection has long been a difficult issue; traditionally, passwords are
either assigned by the computer or chosen by the user. When the computer does the
assignments, the passwords are often hard to remember; when the user makes the selection,
the passwords are often easy to guess. This paper describes a technique, and a mechanism,
to allow users to select passwords which to them are easy to remember but to others
would be very difficult to guess. The technique is site, user, and group configurable,
and allows rapid changing of constraints imposed upon the passwords. Although experience
with this technique has been limited, it appears to have much promise.
Applying the Take-Grant Protection Model
Dartmouth Technical Report PCS-TR90-151
Matt Bishop
Date: January 1990
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/49/TR90-151.pdf (1021KB)
Abstract:
The Take-Grant Protection Model has in the past been used to model multilevel security
hierarchies and simple protection systems. The models are extended to include theft
of rights and sharing of information, and additional security policies are examined.
The analysis suggests that in some cases the basic rules of the Take-Grant Protection
Model should be augmented to represent the policy properly; when appropriate, such
modifications are made and their effects with respect to the policy and its Take-Grant
representations are discussed.
Term Reduction Using Directed Congruence Closure
Dartmouth Technical Report PCS-TR90-149
L. Paul Chew
Date: January 1990
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/48/TR90-149.pdf (2809KB)
Abstract:
Many problems in computer science can be described in terms of reduction rules that
tell how to transform terms. Problems that can be handled in this way include interpreting
programs, implementing abstract data types, and proving certain kinds of theorems.
A term is said to have a normal form if it can be transformed, using the reduction
rules, into a term to which no further reduction rules apply. In this paper, we
extend the Congruence Closure Algorithm, an algorithm for finding the consequences
of a finite set of equations, to develop Directed Congruence Closure, a technique
for finding the normal form of a term provided the reduction rules satisfy the conditions
for a regular term rewriting system. This technique is particularly efficient because
it inherits, from the Congruence Closure Algorithm, the ability to remember all
objects that have already been proved equivalent.
There is a Planar Graph Almost as Good as the Complete Graph
Dartmouth Technical Report PCS-TR90-148
L. Paul Chew
Date: January 1990
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/47/TR90-148.pdf (460KB)
Abstract:
Given a set S of points in the plane, there is a triangulation of S such that a path
found within this triangulation has length bounded by a constant times the straight-line
distance between the endpoints of the path. Specifically, for any two points a and
b of S there is a path along edges of the triangulation with length less than 10
times [ab], where [ab] is the straight-line Euclidean distance between a and b.
The triangulation that has this property is the L1 metric Delaunay triangulation
for the set S. This result can be applied to motion planning in the plane. Given
a source, a destination, and a set of polygonal obstacles of size n, an O(n) size
data structure can be used to find a reasonable approximation to the shortest path
between the source and the destination in O(n log n) time.
Building Voronoi Diagrams for Convex Polygons in Linear Expected Time
Dartmouth Technical Report PCS-TR90-147
L. Paul Chew
Date: January 1990
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/46/TR90-147.pdf (313KB)
Abstract:
Let P be a list of points in the plane such that the points of P taken in order form
the vertices of a convex polygon. We introduce a simple, linear expected-time algorithm
for finding the Voronoi diagram of the points in P. Unlike previous results on expected-time
algorithms for Voronoi diagrams, this method does not require any assumptions about
the distribution of points. With minor modifications, this method can be used to
design fast algorithms for certain problems involving unrestricted sets of points.
For example, fast expected-time algorithms can be designed to delete a point from
a Voronoi diagram, to build an order k Voronoi diagram for an arbitrary set of points,
and to determine the smallest enclosing circle for points at the vertices of a convex
hull.
Planar Graphs and Sparse Graphs for Efficient Motion Planning in the Plane
Dartmouth Technical Report PCS-TR90-146
L. Paul Chew
Date: January 1990
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/45/TR90-146.pdf (398KB)
Abstract:
Given a source, a destination, and a number of obstacles in the plane, the Motion
Planning Problem is to determine the best path to move an object (a robot) from
the source to the destination without colliding with any of the obstacles. For us,
motion is restricted to the plane, the robot is represented by a point, and the
obstacles are represented by a set of polygons with a total of n vertices among
all the polygonal obstacles.
A Bound of Data Availability when Networks Partition
Dartmouth Technical Report PCS-TR90-145
Michael Goldweber
Donald B. Johnson
Date: January 1990
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/44/TR90-145.pdf (683KB)
Abstract:
Many consistency or replication control schemes that increase data availability in
distributed systems exist, and the search for improvements continues, though there
has been no good nontrivial upper bound demonstrating how much improvement is possible.
We present a new upper bound for data availability under replication for general
networks. In addition we also describe a new technique that yields near optimal
levels of data availability with respect to this bound.
Matching Multiple Patterns From Right to Left
Dartmouth Technical Report PCS-TR90-143
Samuel W. Bent
Date: January 1990
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/43/TR90-143.pdf (836KB)
Abstract:
We address the problem of matching multiple pattern strings against a text string.
Just as the Aho-Corasick algorithm generalizes the Knuth-Morris-Pratt single-pattern
algorithm to handle multiple patterns, we exhibit two generalizations of the Boyer-Moore
algorithm to handle multiple patterns. In order to obtain worst-case time bounds
better than quadratic, our algorithms remember some of the previous history of the
matching.
A Comparison of Consistency Control Protocols
Dartmouth Technical Report PCS-TR89-142
Michael Goldweber
Donald B. Johnson
Larry Raab
Date: January 1989
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/42/TR89-142.pdf (685KB)
Abstract:
In this paper we analyze three protocols for maintaining the mutual consistency of
replicated objects in a distributed computing environment and compare their performance
with that of an oracle protocol whose performance is optimal. We examine these protocols,
two dynamic protocols and the majority consensus protocol, via simulations using
two measures of availability. The analysis shows that the dynamic protocols, under
realistic assumptions, do not perform significantly better than the static voting
scheme. Finally we demonstrate that none of these approaches perform as well as
our oracle protocol which is shown to be an upper bound on availability.
Asymptotically Fast Algorithms for Spherical and Related Transforms
Dartmouth Technical Report PCS-TR89-141
James R. Driscoll
Dennis M. Healy
Date: January 1989
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/41/TR89-141.pdf (787KB)
Abstract:
This paper considers the problem of computing the harmonic expansion of functions
defined on the sphere. We begin by proving convolution theorems that relate the
convolution of two functions on the sphere to a "multiplication" in the spectral
domain, as well as the multiplication of two functions on the sphere to a "convolution"
in the spectral domain. These convolution theorems are then used to develop a sampling
theorem on the sphere.
On the Worst Case of Three Algorithms for Computing the Jacobi Symbol
Dartmouth Technical Report PCS-TR89-140
Jeffrey Shallit
Date: January 1989
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/40/TR89-140.pdf (587KB)
Abstract:
We study the worst-case behavior of three iterative algorithms - Eisenstein's algorithm,
Lebesgue's algorithm, and the "ordinary" Jacobi symbol algorithm - for computing
the Jacobi symbol. Each algorithm is similar in format to the Euclidean algorithm
for computing gcd(u,v).
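For orientation, the following is a minimal sketch of the commonly taught reciprocity-based Jacobi symbol algorithm, whose loop structure resembles the Euclidean algorithm. It is assumed to correspond to what the abstract calls the "ordinary" algorithm, but the report's exact formulation may differ; the function name jacobi is an illustrative choice.

```python
def jacobi(a, n):
    """Return the Jacobi symbol (a/n) for odd positive n: one of -1, 0, 1."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a %= n
    result = 1
    while a != 0:
        # Remove factors of two: (2/n) = -1 exactly when n = 3 or 5 (mod 8).
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Quadratic reciprocity: swapping flips the sign only when
        # both values are congruent to 3 modulo 4.
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    # A remaining n > 1 means the original arguments shared a factor.
    return result if n == 1 else 0
```

Each pass through the outer loop performs one reduction a mod n, as the Euclidean algorithm does, which is why worst-case analyses of the two are closely related.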
Learning Object-Centered Representations
Dartmouth Technical Report PCS-TR88-139
Peter A. Sandon
Date: January 1988
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/39/TR88-139.pdf (8568KB)
Abstract:
When we look at a familiar object from a novel viewpoint, we are usually able to
recognize it. In this thesis, we address the problem of learning to recognize objects
under transformations associated with viewpoint. Our vision model combines a hierarchical
representation of shape features with an explicit representation of the transformation.
Shape features are represented in a layered pyramid-shaped subnetwork, while the
transformation is explicitly represented in an auxiliary subnetwork. The two connectionist
networks are conjunctively combined to allow object-centered shape features to
be computed in the upper layers of the network. A simulation of a 2-D translation
subnetwork demonstrates the ability to learn to recognize shapes in different locations
in an image, such that those same shapes can be recognized in novel locations. Two
new learning methods are presented, which provide improved behavior over previous
backpropagation methods. Both methods involve competitive interactions among clusters
of nodes. The new learning methods demonstrate improved learning over the generalized
delta rule when applied to a number of network tasks. In the first method, called
error modification, competition is based on the error signals computed from the
gradient of the output error. The result of this competition is a set of modified
error signals representing a contrast enhanced version of the original errors. The
error modification method reduces the occurrence of network configurations that
correspond to local error minima. In the second method, called error augmentation,
competition is based on the activations of the nodes in the cluster. Network changes
resulting from this competition augment those specified by the error gradient computation.
This competition is implemented by the trace comparison rule, a new self-organizing
mechanism that is effective in developing highly discriminating features within
the cluster. The error augmentation method improves learning in the lower network
layers when backpropagated error is weak.
An Application of a Fast Data Encryption Standard Implementation
Dartmouth Technical Report PCS-TR88-138
Matt Bishop
Date: January 1988
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/38/TR88-138.pdf (1024KB)
Abstract:
The Data Encryption Standard is used as the basis for the UNIX password encryption
scheme. Some of the security of that scheme depends on the speed of the implementation.
This paper presents a mathematical formulation of a fast implementation of the DES
in software, discusses how the mathematics can be translated into code, and then
analyzes the UNIX password scheme to show how these results can be used to implement
it. Experimental results are provided for several computers to show that the given
method speeds up the computation of a password by roughly 20 times (depending on
the specific computer).
Theft of Information in the Take-Grant Protection Model
Dartmouth Technical Report PCS-TR88-137
Matt Bishop
Date: January 1988
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/37/TR88-137.pdf (1100KB)
Abstract:
(Revised 5/90). Questions of information flow are in many ways more important than
questions of access control, because the goal of many security policies is to thwart
the unauthorized release of information, not merely the illicit obtaining of access
rights to that information. The Take-Grant Protection Model is an excellent theoretical
tool for examining such issues because conditions necessary and sufficient for
information to flow between two objects, and for rights to objects to be obtained
or stolen, are known. In this paper we extend these results by examining the question
of information flow from an object the owner of which is unwilling to release that
information. Necessary and sufficient conditions for such "theft of information"
to occur are derived, and bounds on the number of subjects that must take action
for the theft to occur are presented. To emphasize the usefulness of these results,
the security policies of complete isolation, transfer of rights with the cooperation
of an owner, and transfer of information (but not rights) with the cooperation of
the owner are presented; the last is used to model a simple reference monitor guarding
a resource.
The Sharing of Rights and Information in a Capability-Based Protection System
Dartmouth Technical Report PCS-TR88-136
Matt Bishop
Date: January 1988
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/36/TR88-136.pdf (868KB)
Abstract:
The paper examines the question of sharing of rights and information in the Take-Grant
Protection Model by concentrating on the similarities between the two; in order
to do this, we state and prove new theorems for each that specifically show the
similarities. The proof for one of the original theorems is also provided. These
statements of necessary and sufficient conditions are contrasted to illustrate the
proposition that transferring rights and transferring information are fundamentally
the same, as one would expect in a capability-based system. We then discuss directions
for future research in light of these results.
Learning Object-Centered Representations
Dartmouth Technical Report PCS-TR87-139
Peter A. Sandon
Date: January 1987
Abstract:
Making Mail Friendlier: Adding Macintosh features and multimedia documents in UNIX mail
Dartmouth Technical Report PCS-TR86-135
John R. Meier
Date: January 1986
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/34/TR86-135.pdf (1850KB)
Abstract:
This paper describes a Macintosh application which acts as a front-end to Unix mail.
Features of the Macintosh interface such as icons, menus, and windows replace the
command driven interface. Complicated editing commands are replaced with mouse selection
and cut, copy, and paste. Messages can be composed of text, pictures, and any Macintosh
file, because they are encoded into plain text, sent through the mail system, and
then decoded by the receiving end. The designs of the mail server and communications
interface are such that mail servers and communications other than Unix mail and
a serial line may be easily implemented.
Producing Software Using Tools in a Workstation Environment
Dartmouth Technical Report PCS-TR86-134
Mark Sherman
Robert L. Scot Drysdale
Date: January 1986
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/33/TR86-134.pdf (881KB)
Abstract:
We discuss how we taught students to build and use translation, interpretive, editing
and monitoring tools in an undergraduate software engineering course. Students used
the tools on low-cost workstations (Macintoshes) to build large, group projects.
The students' projects used all available features of workstation environments,
including graphics, windows, fonts, mice, networks, and sound generators. We found
that 1) the use of tools increased student productivity, 2) a shift in data structure
and algorithm topics is needed to cover material relevant for workstation environments,
3) new topics in system design are required for a workstation environment, 4) traditional
material can be easily illustrated with a workstation environment and 5) students
enjoyed being able to manipulate the advanced features of workstations in their
work, which in turn increased their motivation for and concentration on the course
material.
Voronoi Diagrams Based on Convex Distance Functions
Dartmouth Technical Report PCS-TR86-132
L. Paul Chew
Robert L. Scot Drysdale
Date: January 1986
Abstract:
We present an "expanding waves" view of Voronoi diagrams that allows such diagrams
to be defined for very general metrics and for distance measures that do not qualify
as metrics. If a pebble is dropped into a still pond, circular waves move out from
the point of impact. If n pebbles are dropped simultaneously, the places where wave
fronts meet define the Voronoi diagram on the n points of impact. The Voronoi diagram
for any normed metric, including the Lp metrics, can be obtained by changing the
shape of the wave front from a circle to the shape of the "circle" in that metric.
(For example, the "circle" in the L1 metric is diamond shaped.) For any convex wave
shape there is a corresponding convex distance function. Even if the shape is not
symmetric about its center (a triangle, for example), although the resulting distance
function is not a metric, it can still be used to define a Voronoi diagram. Like
Voronoi diagrams based on the Euclidean metric, the Voronoi diagrams based on other
normed metrics can be used to solve various closest-point problems (all-nearest-neighbors,
minimum spanning trees, etc.). Some of these problems also make sense for convex
distance functions which are not metrics. In particular, the "largest empty circle"
problem becomes the "largest empty convex shape" problem, and "motion planning for
a disc" becomes "motion planning for a convex shape". These problems can both be
solved quickly given the Voronoi diagram. We present an asymptotically optimal algorithm
for computing Voronoi diagrams based on convex distance functions.
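The idea of a convex distance function can be sketched as follows. The sketch assumes the wave shape is a convex polygon containing the origin, described by half-planes a*x + b*y <= 1; the distance from p to q is then the smallest scale factor at which the shape, centered at p, reaches q. The representation and the names convex_distance, DIAMOND, and TRIANGLE are illustrative assumptions, not taken from the report.

```python
def convex_distance(p, q, normals):
    """Distance from p to q for a convex polygon {v : a*vx + b*vy <= 1}.

    `normals` lists the (a, b) pair of each bounding half-plane; the
    polygon is assumed to contain the origin in its interior.
    """
    vx, vy = q[0] - p[0], q[1] - p[1]
    return max(0.0, *(a * vx + b * vy for a, b in normals))


# The unit "circle" of the L1 metric is a diamond.
DIAMOND = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
# A triangle with vertices (1, 1), (1, -2), (-2, 1): not symmetric
# about its center, so the resulting distance is not a metric.
TRIANGLE = [(1, 0), (0, 1), (-1, -1)]

l1 = convex_distance((0, 0), (3, -4), DIAMOND)       # |3| + |-4|
forward = convex_distance((0, 0), (1, 1), TRIANGLE)
backward = convex_distance((1, 1), (0, 0), TRIANGLE)
```

With the triangle, the forward and backward distances differ, which illustrates the abstract's point that an asymmetric shape gives a distance function that is not a metric yet is still well defined.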
The Pairwise Intersection Problem for Monotone Polygons
Dartmouth Technical Report PCS-TR86-131
David B. Levine
Date: January 1986
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/31/TR86-131.pdf (1903KB)
Abstract:
Geometric intersection problems arise in a number of areas of computer science including
graphics and VLSI design rule checking. Previous work has concentrated on solving
the pairwise intersection problem for line segments and iso-oriented rectangles.
This thesis extends that work by presenting efficient algorithms to solve the pairwise
intersection problem for monotone polygons. For general segments, the problem has
been solved in O((N+I)*logN) time using a sweeping line technique, where N is the
number of segments and I is the number of intersections reported. We combine this
technique with approaches taken to solve the iso-oriented rectangle problem to yield
an algorithm which solves the pairwise intersection problem for monotone polygons
in the same asymptotic time. In addition, there are certain classes of line segments
for which the pairwise intersection problem may be solved in O(N*logN + I) time,
the best possible. We generalize each such class of line segments to a class of
polygons and present algorithms to solve the associated polygon problem. Finally,
we discuss the impacts which possible improvements to the line segment problem would
have on our results.
Finding Largest Empty Circles with Location Constraints
Dartmouth Technical Report PCS-TR86-130
L. Paul Chew
Robert L. Scot Drysdale
Date: January 1986
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/30/TR86-130.pdf (444KB)
Abstract:
Let S be a set of n points in the plane and let CH(S) represent the convex hull of
S. The Largest Empty Circle (LEC) problem is the problem of finding the largest
circle centered within CH(S) such that no point of S lies within the circle. Shamos
and Hoey [SH75] outlined an algorithm for solving this problem in time O(n log n)
by first computing the Voronoi diagram, V(S), in time O(n log n), then using V(S)
and CH(S) to compute the largest empty circle in time O(n). In a recent paper [Tou83],
Toussaint pointed out some problems with the algorithm as outlined by Shamos and
presented an algorithm which, given V(S) and CH(S), solves the LEC problem in time
O(n log n). In this note we show that Shamos' original claim was correct: given
V(S) and CH(S), the LEC problem can be solved in time O(n). More generally, given
V(S) and a convex k-gon P, the LEC centered within P can be found in time O(k+n).
We also improve on an algorithm given by Toussaint for computing the LEC when the
center is constrained to lie within an arbitrary simple polygon. Given a set S of
n points and an arbitrary simple k-gon P, the largest empty circle centered within
P can be found in time O(kn + n log n). This becomes O(kn) if the Voronoi diagram
of S is already given.
An Algorithm for Resource Allocation Requiring Low Overhead Communication
Dartmouth Technical Report PCS-TR86-129
Ann Marks
Date: January 1986
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/29/TR86-129.pdf (2169KB)
Abstract:
A heuristic algorithm for allocating resource units to sites in a distributed system
is presented. Starting with a given allocation of sites, the algorithm performs
a series of optimizations involving pairs of sites in an attempt to improve the
worst pair-wise imbalance present in the system; termination occurs when no further
improvement is possible. After outlining the general form of the algorithm, which
effectively defines an entire family of algorithms, we present theoretical results
that speak to the performance of the algorithm as measured in the number of optimizations
that can be done, the amount of control communication required and the worst case
imbalance of the resulting allocation. Subsequently, two particular algorithms in
the family are given and the results of a simulation study of their performance
are presented.
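The general shape of such a pairwise heuristic can be sketched in Python as follows. This is a centralized toy under stated assumptions, not the report's algorithm: the abstract does not specify the optimization step, the imbalance measure, or the communication pattern, so the sketch assumes that imbalance is the difference in unit counts and that one optimization evens out the two most imbalanced sites. The name `balance_pairwise` is hypothetical.

```python
def balance_pairwise(allocation):
    """Repeatedly even out the most imbalanced pair of sites.

    allocation maps a site name to its number of resource units.
    Returns the final allocation and the number of pairwise optimizations.
    """
    alloc = dict(allocation)
    steps = 0
    while True:
        hi = max(alloc, key=alloc.get)  # site holding the most units
        lo = min(alloc, key=alloc.get)  # site holding the fewest units
        if alloc[hi] - alloc[lo] <= 1:
            break  # no pairwise optimization can improve the worst imbalance
        total = alloc[hi] + alloc[lo]
        alloc[hi] = (total + 1) // 2    # split the pair's units as evenly as possible
        alloc[lo] = total // 2
        steps += 1
    return alloc, steps

final, steps = balance_pairwise({"a": 10, "b": 0, "c": 2})
print(final, steps)
```

The loop terminates because each optimization strictly reduces the sum of squared unit counts while leaving the total number of units unchanged.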
An Image Processing Software Package for the Laser Scanning Phase Modulation Microscope
Dartmouth Technical Report PCS-TR86-128
William J. Murray
Date: January 1986
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/28/TR86-128.pdf (3462KB)
Abstract:
This thesis documents the most recent effort to develop a user-friendly image processing
software package for the Laser Scanning Phase Modulation Microscope (LSPMM). The
LSPMM is composed of three integrated subsystems: the Laser Scanning (LS) system,
the Phase Modulation (PM) system, and the Digital Image Acquisition (DIA) system. Under
the control of the image processing software, the DIA system can receive and store
the digital image data, display the image on a monochrome monitor, and process the
image to provide the microscopist with quantitative information regarding the image.
The implementation of this image processing software package required the specification
of a four level software hierarchy to serve as an organizational framework, with
the highest level interacting with the LSPM microscopist, and the lowest level performing
hardware control. This framework should prove useful for the development and implementation
of additional software in the future. The programs that were developed accept command
line arguments; however, most will interactively query the user if the command line
arguments are not known. This software provides the microscopist with the capability
to scan, save, and display a 512 by 512 pixel image. The image may be scanned to,
saved from, or displayed in either of the two DeAnza image display memory planes.
Considerable effort has been made to incorporate all of the devices useful for image
processing into a single operating system kernel. This alleviates the problem of
taking down one operating system and bringing up another version in order to dump
image files on magnetic tape.
Binary Trees (v. 2.1 -- September 25, 1985)
Dartmouth Technical Report DCS-TR86-127
John Glenn
Date: January 1986
Abstract:
Binary Tree is a program that uses animation to illustrate insertion, deletion and
searching of nodes in a binary search tree. Pre-order, in-order, and post-order
tree walks are supported. Binary Tree can also perform a recursive descent parse
of arithmetic expressions and show the resulting abstract syntax tree. Lisa Pascal
source and application available on a single-sided, MFS format microdisk.
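For readers unfamiliar with the operations the program animates, a minimal Python sketch of binary search tree insertion, search, and the three tree walks follows. It is unrelated to the Lisa Pascal sources on the disk; the names `Node`, `insert`, `search`, and the walk functions are illustrative, and deletion is omitted for brevity.

```python
class Node:
    """One node of a binary search tree."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert key and return the (possibly new) root; duplicates are ignored."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root

def search(root, key):
    """Return True if key is stored in the tree."""
    while root is not None and root.key != key:
        root = root.left if key < root.key else root.right
    return root is not None

def preorder(root):
    """Visit the node, then its left subtree, then its right subtree."""
    return [] if root is None else [root.key] + preorder(root.left) + preorder(root.right)

def inorder(root):
    """Visit the left subtree, then the node, then the right subtree (sorted order)."""
    return [] if root is None else inorder(root.left) + [root.key] + inorder(root.right)

def postorder(root):
    """Visit both subtrees before the node itself."""
    return [] if root is None else postorder(root.left) + postorder(root.right) + [root.key]

root = None
for k in [5, 3, 8, 1, 4]:
    root = insert(root, k)
print(inorder(root))
```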
Using Low-Cost Workstations to Investigate Computer Networks and Distributed Systems
Dartmouth Technical Report PCS-TR86-126
Mark Sherman
Ann Marks
Date: January 1986
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/26/TR86-126.pdf (832KB)
Abstract:
A quality education in contemporary computer science requires that students gain
experience with realistic systems. Many efforts at bringing empirical computer science
to undergraduates focus on rather old technologies, for example, building a compiler
or simulating a disk scheduler. Although efforts are being made to use some newer
technologies, the efforts are concentrating on teaching traditional material in
a new medium. However, the medium itself -- networked workstations in a server environment
-- is worthy of exploration by undergraduate students in a laboratory setting. At
Dartmouth, we developed a Computer Network Laboratory to let students experiment
with computer networks, protocols and distributed systems. Through this article,
we wish to share our experiences in the design of the laboratory and give an example
of how the laboratory was used in a computer network course.
Functions Returning Values of Dynamic Size
Dartmouth Technical Report PCS-TR86-125
Mark Sherman
Andy Hisgen
Jonathan Rosenberg
David Alex Lamb
Date: January 1986
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/25/TR86-125.pdf (656KB)
Abstract:
Modern programming languages, such as Ada (Ichbiah 80), permit the definition of
functions that return values whose size cannot be determined until the function
returns. This paper discusses five implementation techniques that can be used to
implement this capability. Comparisons of the techniques are provided and guidelines
for selecting a particular technique for a compiler are given.
A Network Package for the Macintosh Using the DoD Internet Protocols
Dartmouth Technical Report PCS-TR86-124
Mark Sherman
Date: January 1986
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/24/TR86-124.pdf (1814KB)
Abstract:
This memo describes the design, implementation and use of the MacIP libraries and
programs for running DoD Internet Protocols on the Macintosh over AppleTalk. It
is preliminary documentation and out of date, but it is all that exists.
BRUCE: A Graphics System with Hidden Line and Hidden Surface Algorithms
Dartmouth Technical Report PCS-TR86-123
Keith Vetter
Christopher Roche
Date: January 1986
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/23/TR86-123.pdf (2665KB)
Abstract:
Accurately representing the physical world by computer is a topic which has direct
benefits to fields like chemistry and architecture, and is a source of much research
in computer science. This paper examines the steps necessary to develop and implement
a graphical system that will allow for the modeling of physical world objects. In
particular, this is a description of BRUCE: a graphical system that will describe
a world of three dimensional polyhedra, implementing algorithms for hidden line
and hidden surface removal. This paper also deals with the problems encountered along
the way and suggestions for further improvement of BRUCE.
Havoc V.85 Software
Dartmouth Technical Report DCS-TR86-122
David Cohn
Stephen Madancy
Date: January 1986
Abstract:
This disk contains alpha release 85 of the HAVOC system, with examples. It is written
in C and 68000 assembler.
Creating Havoc: Havoc Development Program
Dartmouth Technical Report PCS-TR86-121
David Cohn
Stephen Madancy
Date: January 1986
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/21/TR86-121.pdf (1785KB)
Abstract:
One area where use of the computer is essential is in the modern scientific laboratory.
High speed computation, data storage and data analysis enable scientists to perform
experiments that would otherwise be impractical. A problem inherent to the effective
use of special purpose laboratory computers, however, is the fact that this equipment
has generally been developed for highly specific uses, and has either tried to cope
with existing high-level languages or has abandoned the attempt and required the
user to program in a low-level assembly or machine language. Our idea was to design,
develop and implement a programming language that is suited to the needs of a laboratory
scientist. Our results have led us to believe that the best way to achieve our goals
is to use an interpretive/compiled programming environment (similar in spirit to
FORTH) in which large programs could be built in small, coherent pieces, that could
easily be tested on as high or low a level as the programmer desired. Our language,
Havoc, adheres to these principles while providing many of the more widespread and
useful language features not found in FORTH. Besides motivating the language, this
preliminary report describes the current design and implementation status of the
HAVOC system. The current version of the HAVOC system is available for the Macintosh.
View 3: A Programming Environment for Distributed Programming
Dartmouth Technical Report PCS-TR86-120
Ann Kratzer
Date: January 1986
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/20/TR86-120.pdf (469KB)
Abstract:
View 3 is an experimental programming environment to support the development and
use of distributed programs. It builds upon three major concepts. First, distributed
programs and distributed processes are basic objects. Second, the port mechanism
allows a process to exchange information with another process, a file or an I/O
device without concern for the type of the object on the other end of the port.
Third, test and use of distributed programs are facilitated by the user interface
program screen format that allows the user to control both the format and contents
of the physical terminal.
Task Queues: A General Model for the Implementation of Communications Protocols
Dartmouth Technical Report PCS-TR86-119
Ann Kratzer
Date: January 1986
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/19/TR86-119.pdf (661KB)
Abstract:
When any computer communications network is built, its communications protocol must
always be implemented. The protocol is implemented on the switching nodes of the
network. The node software must respond in real time to events generated external
to the switching node. Thus, the software running on a switching node constitutes
a concurrent program; this complicates the design, implementation and testing of
the switching node software. The task queue model presented in this paper defines
a structure for this software that facilitates the design, implementation and testing
of communications protocols.
A Distributed Strategy for Resource Allocation
Dartmouth Technical Report PCS-TR86-118
Ann Kratzer
Date: January 1986
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/18/TR86-118.pdf (263KB)
Abstract:
In this paper, we present a decentralized algorithm for determining how resources
should be allocated to sites. This algorithm is general in the sense that it can
be used for allocating different kinds of resources. This algorithm can be applied
either statically, on a quiescent system, or dynamically, as the system
runs. Throughout, we consider only systems in which a resource has a fixed cost
associated with it regardless of where it may be located (i.e. the system is homogeneous).
View-3 and Ada: Tools for Building Systems with Many Tasks
Dartmouth Technical Report PCS-TR86-117
Ann Kratzer
Mark Sherman
Date: January 1986
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/17/TR86-117.pdf (549KB)
Abstract:
This paper discusses some useful features for tools that are intended to be used
for developing systems with multiple tasks. We include a description of one tool
that has been built, View-3. We also describe some problems that might be encountered
when trying to fit this kind of tool into an APSE system.
A Practical, Distributed Environment for Macintosh Software Development
Dartmouth Technical Report PCS-TR86-116
Mark Sherman
Ann Marks
Rob Collins
Heather Anderson
Jerry Godes
Denis Devlin
Leonid Spector
Vivian Sewelson
Date: January 1986
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/16/TR86-116.pdf (439KB)
Abstract:
We describe a development environment we created for prototyping software for the
Macintosh. The programs are developed and executed on a large time-shared computer
but can use the full facilities of the Macintosh. By using this system, we combine
the advantages of the large system, such as large amounts of disk storage and automatic
file backups, with the advantages of the Macintosh, such as advanced graphics, mouse
control and sound synthesis. We also describe several projects that used the distributed
development system. We conclude with a description of our future plans for this
environment.
KD Tree Simulator
Dartmouth Technical Report DCS-TR86-115
Ed Grosz
Date: January 1986
Abstract:
This MFS formatted disk contains an animation program for illustrating KD-Trees (D-dimensional
search trees) (actually, only 2 dimensions are shown). Both the runnable application
and the Lisa Object Pascal/MacApp sources are included.
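The data structure being animated can be sketched briefly in Python. This is a generic illustration of a KD-tree with the splitting axis cycling by depth, not a transcription of the Lisa Object Pascal sources; the dictionary layout and the names `kd_insert` and `kd_contains` are assumptions made for the example.

```python
def kd_insert(node, point, depth=0):
    """Insert point into the tree rooted at node and return the root.

    The comparison axis cycles with depth: x at depth 0, y at depth 1, and so on.
    """
    if node is None:
        return {"point": point, "left": None, "right": None}
    axis = depth % len(point)
    side = "left" if point[axis] < node["point"][axis] else "right"
    node[side] = kd_insert(node[side], point, depth + 1)
    return node

def kd_contains(node, point, depth=0):
    """Return True if point was previously inserted."""
    if node is None:
        return False
    if node["point"] == point:
        return True
    axis = depth % len(point)
    side = "left" if point[axis] < node["point"][axis] else "right"
    return kd_contains(node[side], point, depth + 1)

root = None
for p in [(7, 2), (5, 4), (9, 6), (2, 3), (4, 7), (8, 1)]:
    root = kd_insert(root, p)
print(kd_contains(root, (4, 7)), kd_contains(root, (4, 8)))
```

Search follows the same left/right rule as insertion, so a stored point is always found along a single root-to-leaf path.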
Music Editor
Dartmouth Technical Report DCS-TR86-114
Ed Grosz
Date: January 1986
Abstract:
This disk contains a simple music editor that creates resources that can be used
directly by the Macintosh Sound Driver. The MFS formatted disk contains both the
runnable program and the Lisa Object Pascal/MacApp Sources.
A Collection of MDS Example Programs
Dartmouth Technical Report DCS-TR86-113
Joe Bergin
Robert L. Scot Drysdale
Jerry Keough
Larry Gallagher
Vivian Sewelson
Mark Sherman
Date: January 1986
Abstract:
This MFS formatted disk contains a collection of example programs written for teaching
machine architecture concepts using the Macintosh Development System. Sample programs
illustrate how instructions can be treated as data, computer arithmetic, looping
features, interrupt handling, coroutines, recursive subprograms and stack frames,
traps and direct manipulation of the screen. In addition, a supervisor program is
provided that implements basic, line-oriented input and output facilities for the
68000 assembly language programs on the Macintosh.
QDT (Quickdraw Terminal) Libraries
Dartmouth Technical Report TCS-TR86-112
Mark Sherman
Ann Marks
Date: January 1986
Abstract:
This Unix tar-tape contains the C libraries used on Unix and the PL/1 Libraries used
on DCTS for communicating with QDT. (No longer available.)
QDT (Quickdraw Terminal) Disks
Dartmouth Technical Report DCS-TR86-111
Mark Sherman
Date: January 1986
Abstract:
This disk set consists of one disk containing a running version of QDT over a serial
line, one disk containing a running version of QDT over AppleTalk (using KSP),
and one disk containing the Lisa Pascal sources for creating QDT (either version).
Stack Package Implementing Tasks and Coroutines for TML Pascal on the Macintosh
Dartmouth Technical Report DCS-TR86-110
Mark Sherman
Date: January 1986
Abstract:
This disk contains the necessary TML Pascal and MDS Assembler files for implementing
tasking and coroutines in TML Pascal Programs. Two example programs and brief documentation
are provided. MDS format disk.
MacIP Sources and Applications
Dartmouth Technical Report DCS-TR86-109
Mark Sherman
Tim Maroney
Date: January 1986
Abstract:
The MacIP set consists of three disks, all single-sided, MFS formatted. Two disks
contain the Lisa Pascal/Lisa Assembler sources for the network libraries and application
programs; the third disk contains the runnable applications along with an AppleTalk
configured system. The current version by Tim Maroney fixes all reported bugs mentioned
in the MacIP Technical Report and includes the TCP library and Telnet program.
Dartmouth-Smalltalk: An Exercise in Implementation
Dartmouth Technical Report PCS-TR86-108
Joon Sup Lee
Date: January 1986
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/8/TR86-108.pdf (4006KB)
Abstract:
This report discusses a preliminary implementation of the Smalltalk-80 virtual machine
for Vax/Unix.
What Should a Discrete Mathematics Course Be?
Dartmouth Technical Report PCS-TR86-107
Kenneth P. Bogart
Date: January 1986
Abstract:
This paper presents the results of a survey of mathematics and computer science departments
on their needs and intentions for survey courses in discrete mathematics. The paper
then presents an idealized syllabus for such a course, recognizing that many topics
in the syllabus should be regarded as optional. The paper concludes with a brief
summary of the experience at Dartmouth in teaching such a course.
Algorithms for Iterative Array Multiplication
Dartmouth Technical Report PCS-TR86-106
Shinji Nakamura
Date: January 1986
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/6/TR86-106.pdf (1577KB)
Abstract:
Algorithms for the parallel multiplication of two n bit binary numbers by an iterative
array of logic cells are discussed. The regular interconnection structures of the
multiplier array cell elements, which are ideal for VLSI implementation, are described.
The speed and hardware complexity of two new iterative array algorithms, both of
which require n cell delays for one n by n bit multiplication, are compared with
a straightforward iterative array algorithm having a 2n cell delay and its higher
radix version having an n cell delay.
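As background, a straightforward iterative array multiplier can be modeled in software as an n by n grid of cells, each forming one partial-product bit and adding it with a full adder. The Python sketch below is such a functional model only, written under the assumption of a simple row-by-row ripple organization; it does not model cell delays and is not either of the report's new algorithms. The names `full_adder` and `array_multiply` are illustrative.

```python
def full_adder(x, y, carry_in):
    """One-bit full adder: returns (sum bit, carry out)."""
    s = x ^ y ^ carry_in
    carry_out = (x & y) | (x & carry_in) | (y & carry_in)
    return s, carry_out

def array_multiply(a, b, n):
    """Multiply two n-bit unsigned integers with an n-by-n grid of adder cells."""
    a_bits = [(a >> i) & 1 for i in range(n)]
    b_bits = [(b >> j) & 1 for j in range(n)]
    acc = [0] * (2 * n)                  # bits of the running product, LSB first
    for j in range(n):                   # one row of cells per multiplier bit
        carry = 0
        for i in range(n):               # one cell per multiplicand bit
            partial = a_bits[i] & b_bits[j]
            acc[i + j], carry = full_adder(acc[i + j], partial, carry)
        acc[j + n] = carry               # the row's final carry lands in an unused bit
    return sum(bit << k for k, bit in enumerate(acc))

print(array_multiply(13, 11, 4))
```

After row j the accumulator holds a times the low j+1 bits of b, which is less than 2^(n+j+1), so the carry written to position j+n never collides with an existing bit.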
Parallel Accessible Memory
Dartmouth Technical Report PCS-TR86-105
Shinji Nakamura
Date: January 1986
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/5/TR86-105.pdf (725KB)
Abstract:
A new design of a memory device which allows simultaneous access to more than one
location is proposed. The unique feature of this multiple accessibility of the memory
is realized by applying a binomial concentrator, a type of sparse crossbar interconnection
network, to content-addressable memory. The organization of the memory system and
the concentration network structure as well as the network characteristics are described
along with a distributive control algorithm. Applications of the memory system to
parallel processing environments are also included.
Uses of Generics in Ada
Dartmouth Technical Report PCS-TR86-104
Mark Sherman
Date: January 1986
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/4/TR86-104.pdf (746KB)
Abstract:
This paper discusses how Ada generic procedures and packages can be used in novel
ways to provide general initialization of records, expressions using discriminants,
record field hiding and explicit control of parameter binding.
Digital Logic Simulator
Dartmouth Technical Report DCS-TR86-103
John W. Scott
Date: January 1986
Abstract:
Logic is a Macintosh program that simulates discrete, digital logic circuits. The
two disks contain an executable version of the program, MacApp sources and brief
documentation (which is also available separately in hardcopy).
Instructions for Using Logic
Dartmouth Technical Report PCS-TR86-102
John W. Scott
Date: January 1986
URL (application/pdf) http://www.cs.dartmouth.edu/cms_file/SYS_techReport/2/TR86-102.pdf (608KB)
Abstract:
This document provides instructions on the use of "Logic", a digital logic simulator
for the Macintosh written for Dartmouth students in the Machine Architecture and
Digital Electronics courses.
Maze Game for the Macintosh
Dartmouth Technical Report DCS-TR86-101
Mark Sherman
Date: January 1986
Abstract:
Maze is a multiperson game using a broadcast protocol on the Macintosh. This disk
contains the sources (in TML Pascal) for a Maze game.