Intelligent Content Production for a Virtual Speaker

Title: Intelligent Content Production for a Virtual Speaker
Publication Type: Book Chapter
Year of Publication: 2005
Authors: Smid K, Pandzic IS, Radman V
Editor: Bolc L, Michalewicz Z, Nishida T
Book Title: Intelligent Media Technology for Communicative Intelligence
Series Title: Lecture Notes in Computer Science
Volume: 3490
Pagination: 163-174
Publisher: Springer
City: Berlin / Heidelberg
ISBN: 978-3-540-29035-3
Abstract

We present a graphically embodied animated agent (a virtual speaker) capable of reading plain English text and rendering it as speech accompanied by appropriate facial gestures. Our system uses lexical analysis of the English text and statistical models of facial gestures to automatically generate the gestures related to the spoken text. It is intended for the automatic creation of realistically animated virtual speakers, such as newscasters and storytellers, and incorporates the characteristics of such speakers captured from training video clips. Our system is based on a visual text-to-speech system that generates lip movement synchronised with the generated speech. This is extended to include eye blinks, head and eyebrow motion, and a simple gaze-following behaviour. The result is a full face animation produced automatically from plain English text.

DOI: 10.1007/11558637_17