2012–2013 Dartmouth Computer Science Colloquium Schedule
All colloquia take place on Wednesday at 4:15 in 006 Steele unless otherwise noted.
Amarendra K. Das, Dartmouth College, Geisel School of Medicine
Incorporating Semantic Similarity into Machine Learning Approaches for Biomedical Data
Biomedical researchers are inundated with vast amounts of digital information, ranging from electronic medical records to genomic experiments. There is a need for machine learning approaches that can assist in discovering and making sense of patterns within such complex data sets. Semantic similarity measures, based on ontologies or concept hierarchies, quantify the relatedness of concepts within complex data sets, and can be used computationally in cluster analysis and information retrieval.
In this talk, I discuss the use of semantic similarity in handling two informatics challenges: (1) finding distinct patterns of care in a longitudinal clinical database of treatment events, and (2) finding published articles, or snippets within the publications, that match encoded definitions of disease categories. Our research indicates that incorporating background domain knowledge into machine learning approaches improves the relevance of the results.
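The abstract above mentions similarity measures defined over ontologies or concept hierarchies. As an illustration only (not the speaker's actual method), the sketch below computes a Wu-Palmer-style similarity over a tiny, hypothetical disease hierarchy: two concepts are more similar the deeper their lowest common ancestor sits in the hierarchy.

```python
# Wu-Palmer-style semantic similarity over a toy concept hierarchy.
# The mini disease ontology below is hypothetical, for illustration only.
parent = {
    "asthma": "lung_disease",
    "copd": "lung_disease",
    "lung_disease": "disease",
    "diabetes": "disease",
    "disease": None,  # root
}

def ancestors(concept):
    """Chain of concepts from `concept` up to the root (inclusive)."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = parent[concept]
    return chain

def depth(concept):
    """Depth of a concept, counting the root as depth 1."""
    return len(ancestors(concept))

def wu_palmer(a, b):
    """Similarity = 2 * depth(LCS) / (depth(a) + depth(b))."""
    anc_b = set(ancestors(b))
    lcs = next(c for c in ancestors(a) if c in anc_b)  # lowest common subsumer
    return 2 * depth(lcs) / (depth(a) + depth(b))
```

Under this toy hierarchy, `wu_palmer("asthma", "copd")` is higher than `wu_palmer("asthma", "diabetes")`, since the two lung diseases share a deeper common ancestor.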
Olga Zhaxybayeva, Dartmouth College, Department of Biological Sciences
Genomic Insights into Microbial Evolution
Over the past two decades, genomic data has revolutionized biology, and microbiology in particular. It has revealed how little we know about the enormous diversity of microbes and about their incredible ability to adapt to various environments. Genomes are historical records of the past evolutionary events that led to such adaptations. In my presentation, I will introduce the field of comparative genomics, discuss some of its recent advances, and highlight the challenges of biological research in the post-genomic era.
Olga Zhaxybayeva received a B.S. in Applied Mathematics from Kazakh State University (Almaty, Kazakhstan) and a Ph.D. in Genetics from the University of Connecticut (Storrs, CT). After postdoctoral training at Dalhousie University (Halifax, NS), she joined Dartmouth as a faculty member in July 2012. Her research interests include the evolution of microbial populations, the origin and early evolution of life, and the development of algorithms for the analysis of genomic and metagenomic data. For more information, visit her laboratory web page at http://www.dartmouth.edu/~ecglab/.
Daniel Scharstein, Middlebury College
Benchmarking Stereo Vision and Optical Flow Algorithms
Stereo vision and optical flow methods attempt to measure scene depth and motion by matching and tracking pixels across images. To evaluate the performance of such methods, we need "ground truth" -- the true depth or true object motion. In this talk I will describe different techniques for creating image datasets with ground truth, including structured lighting, laser and CT scanners, and hidden fluorescent texture. The Middlebury datasets, created in collaboration with undergraduates, are now well-established benchmarks in computer vision, and I will discuss both benefits and potential pitfalls of such benchmarks. I will also briefly touch on how data with ground truth can aid in developing new algorithms.
Daniel Scharstein, Professor of Computer Science at Middlebury College in Vermont, studied Computer Science at the Universitaet Karlsruhe, Germany, and received his PhD from Cornell University in 1997. His research interests include computer vision, image-based rendering, and robotics. He maintains several online computer vision benchmarks at http://vision.middlebury.edu.
Nick Gillian, MIT Media Lab
Rapid Learning: Novel Machine Learning Algorithms and Tools for Rapid Gestural-Interaction Design, Prototyping and User Customization
The growth of increasingly accessible sensing devices, combined with affordable computational processing power, suggests exciting new opportunities for developers, professional designers and performance artists alike to integrate novel interactive systems into their software applications, prototypes and performances. Although these new sensing devices greatly simplify the task of capturing a user’s movements, recognizing a specific sensor signal as a user’s explicit control gesture is still far from trivial. In this talk I will describe how a number of novel machine learning algorithms can be used to rapidly learn and recognize a user’s gestures from a small number of user-supplied training examples. I will show how designers can use these algorithms to rapidly prototype gestural interactions, and how users can customize their own gestural interactions. Furthermore, I will present several new software tools being developed at the Media Lab to allow a more diverse group of users to integrate gesture recognition and machine learning into their own software, installations, and musical instruments.
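One common way to recognize gestures from only a handful of training examples is nearest-neighbor matching under dynamic time warping (DTW). The sketch below is a generic illustration of that idea, not necessarily the speaker's algorithms; the gesture templates are hypothetical 1-D signals standing in for real sensor streams.

```python
# Few-shot gesture recognition via 1-nearest-neighbor dynamic time warping.
# Illustrative sketch only; real systems use multi-dimensional sensor data.

def dtw_distance(a, b):
    """DTW distance between two 1-D signals (sequences of floats)."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three possible alignments
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]

def classify(sample, templates):
    """templates: {label: [example_signal, ...]}; returns the nearest label."""
    return min(
        (dtw_distance(sample, ex), label)
        for label, examples in templates.items()
        for ex in examples
    )[1]

# One training example per gesture is enough for this scheme to work.
templates = {
    "swipe": [[0, 1, 2, 3, 4]],
    "circle": [[0, 2, 0, -2, 0]],
}
print(classify([0, 1, 2, 2, 3, 4], templates))  # prints "swipe"
```

Because DTW warps the time axis, a gesture performed slower or faster than its template still matches, which is why a single example per class can suffice.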
Nick is currently a postdoctoral affiliate in the Responsive Environments research group at the MIT Media Lab. Nick’s research focuses on the design and development of machine-learning algorithms and software tools that can be used for real-time gesture recognition. Nick is particularly interested in designing new machine-learning algorithms that can be rapidly trained with just a few training examples of the gestures a user wants the system to recognize. Nick is also developing new software and toolkits that enable a more diverse group of people to integrate gesture recognition and machine learning into their own interfaces, art installations, and musical instruments. More information about Nick can be found at his website: www.nickgillian.com
Krister Swenson, Université de Montréal
Computational Approaches for Inferring Homology Relationships for Whole Genomes from Many Species
The first step in most multi-gene studies is to infer the evolutionary relationships between the genes in question. The fundamental question is whether two genes are evolutionarily related through a speciation or a duplication event. With the high cost of lab experiments and a general uncertainty about the relationship between function and evolutionary descent, these relationships are usually inferred on a large scale through computational means using gene sequence information. In this talk we prove that, under certain reasonable conditions, the classical method of reconciling a gene tree with a species tree (introduced in 1979) can never infer the true duplication and loss history for a set of homologous genes, and we offer foundational work towards a remedy. We then present a new method, and a related approximation algorithm, for inferring recently duplicated genes, using the whole genomes of a set of related species. We strive to make this talk accessible and stimulating for computer scientists, biologists, and mathematicians alike.
A native of the Dartmouth area, Krister owes his interest in computer science to the course CS 5 (now CS 1), which he attended as a high school student at Lebanon High. He is currently a postdoctoral fellow at the Université de Montréal and McGill. He was named computer science graduate student of the year at the University of New Mexico, and in 2010 his dissertation was selected as one of the top seven across all disciplines at the École Polytechnique Fédérale de Lausanne. Recently, his paper received the best paper award at the highly competitive RECOMB 2012 in Barcelona.
Patricia Hannaway, Dartmouth College
Bridging Technology and Art in the Entertainment, Feature Film Industry
The Pipeline and Technical Aspects of Bringing a Computer Feature to the Screen and Facial System Models for Expression
Computer graphics filmmaking is a collaborative effort combining creativity, artistic expression and programming technology. Today, films are created from scratch with computer technology, stored on servers, output to digital formats, animated with digital programming and software, and merged with “live action” film plates to create the stunning visual effects we take for granted. Who are the folks who make this magic happen? What kind of pipeline is involved in producing such a product? What set of skills is required? How are proprietary programming and R&D a part of this process? And why is it essential that artists know some programming and that engineers study design and aesthetics?
A native of Massachusetts, Patricia grew up in Marblehead and is currently a Visiting Professor in Computer Science at Dartmouth College in Digital Arts. A twenty-year veteran of the CG feature film industry, she holds a BA from Smith College, an MFA from the School of Visual Arts in Computer Graphics and an MFA from the New York Academy of Art. Upon graduating, she was hired by Disney Feature Animation and applied her artistic skills to computer technology, working on such films as “Toy Story”, “Pocahontas”, “Mulan” and “Hercules”. Additionally, she worked at Dreamworks and ILM on such films as “Shrek”, “Antz” and “Star Wars”. She was the Senior Animator at WETA Digital for the character of Gollum in “The Lord of the Rings: The Two Towers” and was an instrumental contributor to the development of the character’s facial system. She was a Visiting Professor at Stanford University and is currently doing film development with Aardman Animations in Bristol, UK. Also a painter, she recently completed a commissioned piece for the City of Palo Alto dealing with “Green Technology” and has a studio in Palo Alto, CA.
Jeff Dagle, Pacific Northwest National Lab
Maintaining Grid Resilience with the Adoption of Smart Grid Technologies
The interconnected electric power grid was recently recognized by the National Academy of Engineering as the greatest engineering achievement of the twentieth century. It has achieved high levels of reliability through the rigorous application of robust engineering principles and resilient design practices. For example, wide-area communications are often limited to supervisory control functions that supplement localized closed-loop control systems organized in a hierarchical scheme. This enables some global optimization by changing local set-points without introducing too much dependency on wide-area communications. Industry trends now underway envision increased deployment of smart grid technologies, including wide-area monitoring, protection and control. While these technologies are intended to provide additional functionality and reliability, there nevertheless remains a concern that the unintended consequences of unenvisioned event scenarios could result in a more brittle infrastructure. Particularly with respect to cyber security, new single points of failure could be introduced that might have far-reaching consequences. The solution is careful adherence to engineering design principles that build resilience into the architecture of the system at all levels. This presentation will provide perspectives on resiliency as it relates to electric power systems. The concept of resilience in the context of electric power systems will be defined and discussed, the impact of new technologies will be considered, different failure scenarios will be analyzed, and mitigation strategies will be recommended.
Jeffrey E. Dagle has worked at the Pacific Northwest National Laboratory, operated by Battelle for the U.S. Department of Energy (DOE), since 1989 and currently manages several projects in the areas of transmission reliability and security, including the North American SynchroPhasor Initiative (NASPI) and cyber security reviews for the DOE Smart Grid Investment Grants and Smart Grid Demonstration Projects associated with the American Recovery and Reinvestment Act of 2009. He is a Senior Member of the Institute of Electrical and Electronics Engineers (IEEE), a member of the International Society of Automation (ISA) and the National Society of Professional Engineers (NSPE), and is a licensed Professional Engineer in the State of Washington. He received the 2001 Tri-City Engineer of the Year award from the Washington Society of Professional Engineers, led the data requests and management task for the U.S.-Canada Power System Outage Task Force investigation of the August 14, 2003 blackout, supported the DOE Infrastructure Security and Energy Restoration Division with on-site assessments in New Orleans following Hurricane Katrina in fall 2005, and is the recipient of two patents, a Federal Laboratory Consortium (FLC) Award in 2007, and an R&D 100 Award in 2008 for the Grid Friendly™ Appliance Controller technology. Mr. Dagle was a member of a National Infrastructure Advisory Council (NIAC) study group formed in 2010 to establish critical infrastructure resilience goals. He received B.S. and M.S. degrees in Electrical Engineering from Washington State University in 1989 and 1994, respectively.
Victor Lempitsky, Skolkovo Institute of Science and Technology (Skoltech)
Inverted Multi-Index for Efficient Similarity Search in Billion-Scale Datasets
I will present a new data structure for efficient similarity search in very large datasets of high-dimensional vectors. This structure, called the "inverted multi-index", generalizes the inverted index idea via the use of product quantization. As a result of this generalization, inverted multi-indices achieve a much denser subdivision of the search space than inverted indices, while retaining the same retrieval complexity, preprocessing time and memory efficiency. Our experiments with large datasets of visual descriptors demonstrate that, because of the denser subdivision, inverted multi-indices are able to return much shorter candidate lists with higher recall. Augmented with a suitable reranking procedure, multi-indices were able to improve the speed of approximate nearest neighbor search on a dataset of 1 billion SIFT descriptors by an order of magnitude compared to the best previously published systems, while achieving the same or better recall and incurring only a few percent of memory overhead. This is joint work with Artem Babenko.
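To make the core idea concrete: a standard inverted index assigns each vector to the nearest of K centroids, whereas the inverted multi-index splits each vector into halves and quantizes each half separately, so K codewords per half yield K² cells while storing only 2K centroids. The toy sketch below, with hypothetical 2-codeword codebooks over 4-D vectors, illustrates only the cell assignment and candidate lookup; it omits the multi-sequence traversal of neighboring cells used in the actual work.

```python
# Toy sketch of inverted multi-index cell assignment via product quantization.
# Codebooks and data are hypothetical; real systems learn codebooks by k-means.

codebook1 = [(0.0, 0.0), (1.0, 1.0)]  # quantizes the first half of a vector
codebook2 = [(0.0, 1.0), (1.0, 0.0)]  # quantizes the second half

def nearest(point, codebook):
    """Index of the codeword closest (squared Euclidean) to `point`."""
    return min(range(len(codebook)),
               key=lambda i: sum((p - c) ** 2
                                 for p, c in zip(point, codebook[i])))

def cell(vec):
    """Map a 4-D vector to its (i, j) cell in the K x K product grid."""
    return (nearest(vec[:2], codebook1), nearest(vec[2:], codebook2))

# Build the index: each cell holds the ids of the vectors assigned to it.
data = [(0.1, 0.0, 0.9, 0.1), (1.0, 0.9, 0.1, 1.0)]
index = {}
for vid, vec in enumerate(data):
    index.setdefault(cell(vec), []).append(vid)

# Query time: look up the query's cell to get a short candidate list,
# which would then be reranked by exact distance.
query = (0.0, 0.1, 1.0, 0.0)
candidates = index.get(cell(query), [])
```

With K codewords per half, the same 2K stored centroids subdivide the space into K² cells instead of K, which is the source of the denser subdivision described in the abstract.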
Victor Lempitsky is an assistant professor at the Skolkovo Institute of Science and Technology (Skoltech), a new research university in Moscow. Prior to that, he was a postdoctoral researcher with the Visual Geometry Group at Oxford University and with the Computer Vision group at Microsoft Research Cambridge. He was also a researcher at Yandex, the main Russian Internet search company. Victor holds a PhD ("kandidat nauk") from Moscow State University (2007). His interests span various aspects of computer vision (visual recognition, image understanding, fine-grained classification, visual search) and biomedical image analysis.
Mala Radhakrishnan, Wellesley College
Andrés Molina-Markham, Dartmouth College