Intelligent Content Production for a Virtual Speaker

Title	Intelligent Content Production for a Virtual Speaker
Publication Type	Book Chapter
Year of Publication	2005
Authors	Smid K, Pandzic IS, Radman V
Editor	Bolc L, Michalewicz Z, Nishida T
Book Title	Intelligent Media Technology for Communicative Intelligence
Series Title	Lecture Notes in Computer Science
Volume	3490
Pagination	163-174
Publisher	Springer
City	Berlin / Heidelberg
ISBN	978-3-540-29035-3
Abstract	We present a graphically embodied animated agent (a virtual speaker) capable of reading plain English text and rendering it in the form of speech accompanied by appropriate facial gestures. Our system uses lexical analysis of the English text and statistical models of facial gestures to automatically generate the gestures related to the spoken text. It is intended for the automatic creation of realistically animated virtual speakers, such as newscasters and storytellers, and incorporates the characteristics of such speakers captured from training video clips. Our system is based on a visual text-to-speech system which generates lip movement synchronised with the generated speech. This is extended to include eye blinks, head and eyebrow motion, and a simple gaze-following behaviour. The result is a full face animation produced automatically from plain English text.
DOI	10.1007/11558637_17
