close
Home People Research Programs Admissions

2011–2012  Dartmouth Computer Science Colloquium Schedule

All colloquia take place on Wednesday at 4:15 in 006 Steele unless otherwise noted.


February 23 (Thursday, in Filene Aud, Moore):
Matthew Lee, Carnegie Mellon University

Sensing and Sense-making of Everyday Actions

Our interactions with people, objects, and technology in our everyday lives are an immense source of information for understanding our habits, motives, and intentions in how we manage our energy use, finances, social lives, and health. My research focuses on designing, building, and evaluating sensing systems that capture the rich behavioral patterns found in everyday life and reveal them to help people achieve their goals. Knowing a priori what context and features to sense in our everyday activities remains a challenge. Furthermore, identifying meaning within large amounts of personal sensor data remains a difficult task for computers. To address this, I bring the human into the loop and investigate their sensemaking process when making sense of personal sensor data to find opportunities for computational support. In this talk, I will discuss how I leverage the human sensemaking process through the design of two sensing systems that support healthy aging. I will discuss MemExerciser, a mobile lifelogging system that automatically records pictures and sounds and selects meaningful content to help people with Alzheimer's disease to remember their experiences better. I will also talk about dwellSense, a system of sensors embedded in the home that can automatically capture and analyze over time how older adults peform activities important for independence. I will conclude with new opportunities for computational support for sensing and acting on our everyday actions.

Bio:
Matthew Lee is a PhD Candidate in the Human-Computer Interaction Institute at Carnegie Mellon University. He is advised by Anind Dey. His research interests lie at the intersection of human-computer interaction, ubiquitous computing, human behavior, informatics, and applications in real-world problems such as health and sustainability. He is interested in bringing about the future of mobile and ubiquitous computing by designing, building, and evaluating sensing systems that capture the rich behavioral patterns found in everyday life and reveal them to help people achieve their goals. His work has been published in premier ACM academic conferences such as UbiComp, Assets, and CHI where he has been recognized with a best paper nomination. His doctoral work focuses sensing everyday activities in the home to support the self-awareness of changes in cognitive and functional abilities for older adults. He is the lead researcher for dwellSense, a Robert Wood Johnson Foundation Project HealthDesign project. He received his masters degree in Human-Computer Interaction from Carnegie Mellon University and his undergraduate degree in Computer Science and Cognitive Science with honors from the University of California at Berkeley.


February 15: Chris Kanich, University of California, San Diego

Understanding Spam Economics

Over the past two decades, the Internet has become an essential tool in the lives of millions of people. Unfortunately, this success has also attracted cybercriminals who exploit the Internet as a platform for illicit gain.  Perhaps the most familiar scam is sending unsolicited advertisements (spam), clogging inboxes and putting people's computers at risk of dangerous malware infections.

Understanding the mechanisms and effectiveness of these scams is essential to building effective countermeasures to cybercrime. In this talk, I'll explain the modern spamming landscape and present research that help us better understand how spammers make their money online.

One effort uses the technique of botnet infiltration to examine a spam campaign from the point of view of the spammers. Botnet infiltration allows us to measure their operation including the advertisements'

effectiveness and the worldwide use of spam filtering techniques. The second effort exploits key information leaks to answer key questions about the modern affiliate marketing-based spam ecosystem, from estimating their worldwide gross revenue, to understanding customer demographics and their most popular products. I'll end by discussing future work in this space and outline research directions that exploit criminal's online architecture and motivations to develop effective defenses.


February 9 (Thursday): Zheng "Eddy" Zhang, College of William and Mary

Dynamic Optimization for Irregular Applications on Many-Core Architectures

Enhancing the match between software executions and hardware features is key to computing efficiency in terms of both performance and energy consumption. The match is constantly complicated by emerging architecture features in computing systems and has become a continuously evolving problem. In this talk, I will present some recent findings in the implications of three prominent features of modern systems: the heterogeneity, the rapid growth of processor-level parallelism and the increasingly complex interplay among computing units. In particular, I will focus on how to streamline computations containing dynamic irregularities for General Purpose Graphic Processing Units (GPGPUs), a broadly adopted many-core architecture. The talk will begin with the theoretical foundations of GPGPU program-level transformation techniques, and further describe a runtime optimization system, named G-Streamline, as a unified software solution to irregularities in both memory references and control flows. The system enables on-the-fly elimination of irregularities through adaptive CPU-GPU pipelining and kernel splitting schemes. Working in a holistic fashion, it maximizes whole-program performance by resolving conflicts among optimizations. In the end, I will briefly describe my other work which includes a study of the influence of shared cache on multicore and a new paradigm, named shared-cache-aware optimizations, for parallel software locality enhancement.

Bio:
Zheng (Eddy) Zhang is a Ph.D. candidate at the Computer Science Department of the College of William & Mary. She received her M.S. in Computer Science at William & Mary with a Computational Operations Research(COR) specialization. Her research generally lies in the area of compilers and programming systems, with a focus on revealing and exploiting the implications of emerging hardware features on the development, compilation, and execution of software. She is the lead author of a paper that won the Best Paper Award at PPoPP'10, and a recipient of the Google Anita Borg Memorial Scholarship.


February 1: David Kempe, University of Southern California

Subset Selection for Linear Regression

One of the central problems in many data-driven sciences is the right selection of attributes to observe in order to accurately predict a variable of interest. Applications abound in areas as diverse as medical sciences (predicting medical conditions based on observable attributes), social sciences (predicting future behaviors or outcomes) and sensor networks, among others.

In many of these cases, time or cost constraints prohibit sampling more than a few attributes. Also, many application domains use linear regression as a method of prediction, and evaluate the quality of attributes in terms of the R^2 fit with the quantity to be predicted. This motivates the following formal problem definition: "Given the covariances between observable variables X_i and a target variable Z, select k of the variables X_i such that the selected set has the best possible R^2 fit with Z."

The main result presented in this talk is that so long as the covariance matrix between the X_i variables is far from singular, greedy algorithms frequently used in practice are provably constant-factor approximations. The proof is based on extending the widely used concept of submodularity to a notion of approximate submodularity, and relating it to the spectrum of the covariance matrix. Furthermore, we will investigate various graph-theoretical properties of covariance matrices which allow for efficient exact or approximate algorithms.

We conclude with several exciting open questions.

This talk is based on joint work with Abhimanyu Das, appearing in ICML 2011 and STOC 2008.


January 18: Kimo Johnson, Co-founder of GelSight, Inc.

Portable, High-Resolution 3D Surface Metrology

I will present a novel system for measuring 3D surface geometry using a single camera, some LEDs, and a block of clear rubber. While seemingly simple, this system is capable of measuring geometry down to single-digit microns. The key component of the system is the block of rubber, which we cover with a reflective skin on one side. When the rubber is pressed against a surface, the reflective skin conforms to the surface topography, effectively painting it with a uniform reflectance. By controlling the optical properties of the surface, our approach allows for the precise measurement of any material. We have called the system GelSight due to our unique elastomeric sensor. In this talk, I will present the core GelSight technology, describe several systems we have built, and show a working prototype. I will also show measurements from a variety of materials, representing applications from diverse industries.

Bio:
Kimo Johnson is a co-founder of GelSight, Inc., a company formed to commercialize the GelSight surface scanning technology. Before leaving academia to lead GelSight, Inc., Kimo was a Research Scientist at MIT. During his time at MIT, he worked on numerous projects in computer vision and graphics, including the project that became GelSight. Most recently, Kimo's research interests have focused on image-based analysis of shape, illumination, color, and texture. In addition, he enjoys collaborating on interdisciplinary problems, such as image enhancement, forensics, audio analysis, and music theory. He holds undergraduate degrees in Mathematics and Music Theory from the University of New Hampshire, a master's degree in Electro-Acoustic Music from Dartmouth College, and a Ph.D. in Computer Science from Dartmouth College, where he worked with Prof. Hany Farid on image forensics.


January 11: Brendan Lucier, Microsoft Research NE

Dueling Algorithms

In this talk I will revisit classic algorithmic search and optimization problems from the perspective of competition. Rather than a single optimizer minimizing expected cost, we consider a game in which a search problem is presented to two players, whose only goal is to outperform the opponent. Classic algorithms often perform quite badly in this competitive setting, which can be thought of as an exponentially large zero-sum game. I will describe general techniques by which optimal (or approximately optimal) strategies can be found, leveraging the structure inherent in an algorithmic duel. I will consider examples of ranking, hiring, compression, binary search, and others, and give bounds on how often one can beat the classic optimization algorithms in such duels.

Based on joint work with Nicole Immorlica, Adam Tauman Kalai, Ankur Moitra, Andrew Postlewaite, and Moshe Tennenholtz.


January 4: Katrina Ligett, California Institute of Technology

A Formal Approach to Data Privacy

Organizations commonly hold databases of sensitive information that they would like to use or publish in some sanitized manner. How can they do so? At what cost to the people whose information is contained in the database? At what loss of usefulness? What are the tradeoffs involved? How can one reason formally about private information?

In this talk we provide an introduction to the notion of differential privacy and a taste of recent results in this area. As time allows, we'll discuss some very recent work using privacy techniques to conduct unbiased surveys.

Based on joint work with Avrim Blum, Anupam Gupta, Moritz Hardt, Frank McSherry, Aaron Roth, and Kunal Talwar.


November 30: Alessandro Senes, University of Wisconsin at Madison

Efficient Strategies for Conformational Sampling for the Optimization of Side Chains in Proteins

A key problem in biology is predicting the structure of a protein directly from its amino acid sequence. The sequence corresponds to a well-defined three-dimensional structure, and the structure defines the biological function of the protein. The protein structural prediction problem can be subdivided into two parts: the prediction of the conformation of the backbone and the correct placement of the flexible side chains of the amino acids that branch out of the backbone, a step referred to as "side chain optimization." Side chain optimization is combinatorially complex and is generally approached by using a library of representative conformations. The library transforms a continuum search space into a discretized problem for which a number of powerful deterministic or stochastic algorithms are available. The precise nature of the library is important because it determines the quality of the final outcome of a side chain optimization procedure. In this talk I will address the question of what would be the best possible selection of discretized elements for the library to maximize performance in side chain optimization, in terms of both time and accuracy. We are also creating methods for predicting the precise sampling needs of each individual position in a protein, with the idea that transferring sampling from positions that are easy to satisfy to those that require many conformations will result in improved performance.


November 15: Jeffrey P. Bigham, University of Rochester

Real-Time Crowd Support for People with Disabilities

The past few decades have seen the development of wonderful new intelligent technology that serves as sensors and agents onto an inaccessible world for people with disabilities, but it remains both too prone to errors and too limited in the scope to reliably address many problems faced by people with disabilities in their everyday lives. We have been developing approaches to crowdsourcing that work in real-time to overcome these problems. In this talk, I'll discuss the following recent projects that use real-time crowdsourcing: (i) VizWiz, an accessible iPhone application that blind people use to take a picture, speak a question, and receive answers from the crowd in under a minute. More than 20,000 questions have been asked so far, giving us insight into the types of questions blind people want answered. (ii) Legion, a system that lets dynamic groups collaboratively control existing user interfaces using a VNC-like setup. These applications collectively inform a new model of human-computer interaction in which a dynamic group of unreliable individuals act as a single reliable user.

Bio:
Jeffrey P. Bigham is an Assistant Professor in the Department of Computer Science at the University of Rochester where he directs ROC HCI. His works spans Access Technology, Human Computation, and Intelligent User Interfaces. He is specifically interested in technology that engages the crowd to assist people with disabilities in their everyday lives. Professor Bigham received his Ph.D. in 2009 in Computer Science and Engineering from the University of Washington working with Dr. Richard Ladner, and his B.S.E. from Princeton in 2003. Jeffrey has received a number of awards for his work, including the Andrew W. Mellon Foundation Award for Technology Collaboration, the MIT Technology Review Top 35 Innovators Under 35 Award, two ASSETS Best Student Paper Awards, and the UIST 2010 Best Paper Award.


November 9: Dmitry Malioutov, DRW Algorithmic Trading

Smooth Isotonic Covariances for Interest Rate Risk Modeling

In this talk we consider the problem of estimating the covariance matrix of a high-dimensional random vector in the scarce data setting, where the number of samples is less than or comparable to the dimension. The sample covariance matrix is a poor choice in this setting, and a variety of structural assumptions or priors have been considered in the literature: covariance selection models with sparse precision matrices, low-rank models (PCA and factor analysis), sparse plus low-rank, covariance shrinkage, and others.

We suggest to use another type of structure, which plays an important role in applications such as interest rate modeling in computational finance: we assume that the random vectors can be indexed over a low-dimensional manifold, and the covariance matrix has smoothness and monotonicity properties over the manifold. We describe how these assumptions can be enforced in a convex optimization framework using semidefinite programming (SDP) via interior point methods and first order proximal gradient methods. Furthermore we also describe how this framework could be applied to problems with missing data, and with asynchronous measurements.

Bio:
Dmitry Malioutov completed his PhD at MIT in the area of statistical signal processing, machine learning and convex optimization. After a postdoctoral position in Microsoft Research he is now a researcher in the Algorithmic trading group in DRW holdings. His research interests include sparse signal representation and compressed sensing, graphical models and message passing algorithms, convex optimization, and relaxations of combinatorial optimization problems. He is also interested in statistical risk management in high-dimensional settings, and market micro-structure analysis in computational finance. He coauthored more than 25 academic papers, and received the 2011 IEEE Signal Processing Society best paper award, the 2006 IEEE ICASSP best student paper award, and the MIT presidential fellowship.


November 3 (Thursday): Jesse Walker, Intel Security Lab

Conceptual Foundations of the Ivy Bridge Random Number Generator

This talk will describe the ideas behind the Ivy Bridge Random Number Generator. The talk begins with a discussion of how cryptography uses randomness, followed by a discussion of some important concepts associated with it. The presentation motivates the new design by examining the shortcomings of Intel’s 1999 RNG. With this background, the talk describes Intel’s new RNG architecture and its underlying theory.

Bio:
Jesse Walker is a researcher in Intel’s Security Research Lab, based in Hillsboro, OR. He was the architect for Intel’s new random number generator, which is scheduled to first ship in the Ivy Bridge processor. Dr. Walker is also a co-author of Skein, a finalist cryptographic hash algorithm in NIST’s SHA-3 competition. Other notable accomplishments include being the first to identify the flaws in WEP, Wi-Fi’s original security protocol, and serving as the architect for 802.11i, WEP’s replacement. Dr. Walker was honored by the Wi-Fi Alliance and Intel in 2005 for opening China to Wi-Fi, where sales had been blocked due to China’s encryption regulations. He received a Ph.D. in mathematics from the University of Texas in 1980.


October 26: Mikael Vejdemo Johansson, University of St. Andrews (Scotland)

The Topology of Politics

Data analysis techniques swept through the field of political science in the 1990s, providing completely new ways to look at political entities. By viewing members of parliaments as points in a vector space spanned by the votes they had to cast, questions like party classification and comparison reduce to pure data analysis.

In a completely different context, the last decade has seen the rise of algebraic topology as a source of methods for application areas far outside of mathematics. Data analysis tools with roots in algebraic topology have proven themselves to be useful to computational geometry, cancer research, materials sciences, and numerous other scientific endeavors. Throughout the rise of topological data analysis, a leading observation has been that through the use of topology, the analyses performed are qualitative rather than quantitative: topological approaches do not rely on the particular metrics used in any given dataset help them become stable and resilient to noise. In work jointly with Gunnar Carlsson (Stanford) and Anders Sandberg (Oxford), we work on using the developments in topological data analysis to improve on the analysis of political data. I shall give examples and demonstrate the relevant techniques, and show how we can improve on classical analyses with tools from topology.

Bio:
Mikael Vejdemo-Johansson finished his doctorate in computational homological algebra at the Friedrich-Schiller University in Jena, Germany, in 2008, and has since done research in algebraic topology by way of explicit computations. After a 3-year postdoctoral study at Stanford University, he is now a Scientific Officer at the University of St. Andrews in Scotland, working on parallelization, cluster computing and HPC-adaptation of computational algebra. He retains his interest in computational and applied algebraic topology, and is currently co-authoring the first textbook in Topological Data Analysis.


October 19: Aleksander Madry, Microsoft Research

Online Algorithms and the k-Server Conjecture

Traditionally, in the problems considered in optimization, one needs to produce the solution only after the whole input is made available. However, in many real-world scenarios the input is revealed gradually, and one needs to make irrevocable decisions along the way while having only partial information on the whole input. This motivates us to develop models that allow us to address such scenarios. In this talk, I will consider one of the most popular approaches to dealing with uncertainty in optimization: the online model and competitive analysis. I will focus on a central problem in this area: the k-server problem. This problem captures many online scenarios - in particular, the widely studied caching problem - and is considered by many to be the "holy grail" problem of the field. I will present a new randomized algorithm for the k-server problem that is the first online algorithm for this problem that achieves polylogarithmic competitiveness.

Based on joint work with Nikhil Bansal, Niv Buchbinder, and Joseph (Seffi) Naor.


October 5: Tony Wirth, University of Melbourne

Set Cover Algorithms For Very Large Datasets

The problem of Set Cover—to find the smallest subcollection of sets that covers some universe—is at the heart of many data and analysis tasks. It arises in a wide range of settings, including operations research, machine learning, planning, data quality and data mining. Although finding an optimal solution is NP-hard, the greedy algorithm is widely used, and typically finds solutions that are close to optimal. However, a direct implementation of the greedy approach, which picks the set with the largest number of uncovered items at each step, does not behave well when the input is very large and disk resident. The greedy algorithm must make many random accesses to disk, which are unpredictable and costly in comparison to linear scans. In order to scale Set Cover to large datasets, we provide a new algorithm which finds a solution that is provably close to that of greedy, but which is much more efficient to implement using modern disk technology. We also apply careful indexing techniques to implement the standard greedy algorithm. Our experiments show a ten-fold improvement in speed on moderately-sized datasets over a naive greedy implementation, and an even greater improvement on larger datasets.

This paper is joint work with Graham Cormode and Howard Karloff, both at AT&T Labs-Research, and Andrew Turpin, at The University of Melbourne. A preliminary version of this paper appeared in CIKM 2010.

Bio:

Tony Wirth is a senior lecturer in the Department of Computer Science and Software Engineering at The University of Melbourne: he joined the department in 2005. Tony received the BSc (Hons) and MSc degrees from The University of Melbourne, and the MA and PhD degrees at Princeton University. His research interests are based around theoretical computer science, including graph algorithms, approximation algorithms, algorithms on external memory, communication complexity, adaptive sampling, and sequence analysis problems in bioinformatics.


September 28: Gayle Laakmann McDowell, CareerCup.com

How to Crack the Coding Interview: Skills and Strategies for Software
Engineers

CS interviews are a different breed from other interviews and, as such, require specialized skills and techniques. This talk will teach you how to prepare for coding interviews, what top companies like Google, Amazon, and Microsoft really look for, and how to tackle the toughest programming and algorithm problems.  This talk will be a deeply technical and will include stories from the speaker's extensive interviewing experience as well as a live "demo" of how to tackle a technical problem.

Signed copies of McDowell's Cracking the Coding Interview and The Google Résumé will be on sale after the talk for $20. Cracking the Coding Interview is the #1 interview prep book for software engineers and the 5th edition, just released in Aug 2011, is nearly double the size of the previous edition. The Google Résumé is a broader book to teach people what they need to do to position themselves for a tech job, starting from early in college up through the offer and job performance itself. The books are rated as 5 and 4.5 stars respectively on Amazon.

Bio:
Gayle Laakmann McDowell is the founder and CEO of CareerCup.com and the author of Cracking the Coding Interview and The Google Resume. CareerCup is the leading source of technical interview preparation and provides a free forum with 3000+ technical interview questions, a book, a video, and mock interviews.

Gayle has worked as a Software Engineer for Google, Microsoft and Apple and has extensive interviewing experience on both sides of the table. She has interviewed and received offers from Google, Microsoft, Apple, Amazon, IBM, Goldman Sachs and a variety of other firms, and she has interviewed over 120 candidates at Google and served on its hiring committee.

Gayle holds a BSE and MSE from UPenn in Computer Science, and an MBA from the Wharton School.

About
People
Research
Programs
Admissions
© Copyright 2010, Dartmouth College Computer Science Department, All Rights Reserved