Dartmouth logo Dartmouth College Computer Science
Technical Report series
CS home
TR home
TR search TR listserv
By author: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
By number: 2017, 2016, 2015, 2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986

Synchronizing Keyframe Facial Animation to Multiple Text-to-Speech Engines and Natural Voice with Fast Response Time
William Pechter
Dartmouth TR2004-501

Abstract: This thesis aims to create an automated lip-synchronization system for real-time applications. Specifically, the system is required to be fast, consist of a limited number of keyframes with small memory requirements, and create fluid and believable animations that synchronize with text-to-speech engines as well as raw voice data.

The algorithms utilize traditional keyframe animation and a novel method of keyframe selection. Additionally, phoneme-to-keyframe mapping, synchronization, and simple blending rules are employed. The algorithms provide blending between keyframe images, borrow information from neighboring phonemes, accentuate phonemes b, p and m, differentiate between keyframes for phonemes with allophonic variations, and provide prosodromic variation by including emotion while speaking. The lip-sync animation synchronizes with multiple synthesized voices and human speech. A fast and versatile online real-time java chat interface is created to exhibit vivid facial animation.

Results show that the animation algorithms are fast and show accurate lip-synchronization. Additionally, surveys showed that the animations are visually pleasing and improve speech understandability 96% of the time. Applications for this project include internet chat capabilities, interactive teaching of foreign languages, animated news broadcasting, enhanced game technology, and cell phone messaging.

Note: Senior Honors Thesis. Advisors: Lorie Loeb, Hany Farid, and Stephen Linder

PDF PDF (3339KB)

Bibliographic citation for this report: [plain text] [BIB] [BibTeX] [Refer]

Or copy and paste:
   William Pechter, "Synchronizing Keyframe Facial Animation to Multiple Text-to-Speech Engines and Natural Voice with Fast Response Time." Dartmouth Computer Science Technical Report TR2004-501, June 2004.

Notify me about new tech reports.

Search the technical reports.

To receive paper copy of a report, by mail, send your address and the TR number to reports AT cs.dartmouth.edu

Copyright notice: The documents contained in this server are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Technical reports collection maintained by David Kotz.