This paper presents recent advances in speech technology made by the Speech Technology Group (GTH) at the Universidad Politécnica de Madrid (UPM) for developing enhanced interfaces for the home. These interfaces improve interaction for people with disabilities. The speech recognizer includes a speaker identification feature, which enables acoustic adaptation to improve recognition performance, and an emotion classifier, which detects the user's emotion. The understanding module uses a bottom-up strategy that makes it more flexible in the face of recognition errors. The dialog manager has been improved with a new dialog control based on Bayesian Networks and a new platform for developing multimodal, multilingual, and user-dependent dialog services from scratch. Finally, the speech synthesis module includes new advances that increase voice naturalness and incorporate emotions. These advances have been integrated into a new interface for controlling a Hi-Fi audio system, significantly improving its ergonomics.