ABSTRACTS
9:30h “When the crowd watches the crowd: understanding impressions in online conversational video”. Dr. Daniel Gatica-Perez. IDIAP Research Institute (Martigny, Switzerland)
ABSTRACT
Online conversational video is creating new possibilities for communication and interaction in social media. I will present an overview of a framework developed in my research group to understand social impressions in online conversational video, applied specifically to vlogging. I will examine the role that video crowdsourcing techniques can play in interpersonal perception research, describe how we use them to collect online impressions about vloggers, and summarize some of the associated challenges. I will then show how these crowdsourced resources can be used to study connections between nonverbal and verbal cues measured from vlog posts and a number of social constructs, including personality traits and mood (joint work with Joan-Isaac Biel).
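For a concrete feel of this kind of analysis, here is a minimal sketch that relates crowdsourced impression scores to nonverbal cue statistics with a cross-validated linear model. The features, scores, and model choice are assumptions made for illustration only, not the actual framework presented in the talk.

```python
# Illustrative sketch only: predicting crowd-rated impressions of vloggers
# from per-video nonverbal cue statistics. Data and features are synthetic.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical per-vlog cue statistics (e.g., speaking-time ratio,
# mean pitch, gaze-at-camera ratio, body-motion energy).
X = rng.normal(size=(200, 4))
# Hypothetical crowdsourced impression scores (e.g., rated extraversion, 1-7).
y = 4.0 + 0.8 * X[:, 0] - 0.3 * X[:, 2] + rng.normal(scale=0.5, size=200)

# How well do the cues predict the crowd's impressions?
model = Ridge(alpha=1.0)
r2 = cross_val_score(model, X, y, cv=5, scoring="r2")
print(f"mean cross-validated R^2: {r2.mean():.2f}")
```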
10:45h “Attention recognition: from contextual analysis of head poses to 3D gaze tracking using remote RGB-D sensors”. Dr. Jean-Marc Odobez. IDIAP Research Institute (Martigny, Switzerland)
ABSTRACT
Gaze (and its discrete counterpart, the visual focus of attention, VFOA) is acknowledged as one of the most important non-verbal cues in human communication. However, its automatic estimation is a highly challenging problem, in particular when large user mobility is expected and minimal intrusion is required. In this talk, I will discuss the main challenges associated with this task and how we have addressed them, focusing on recent techniques investigated to perform 3D gaze tracking from RGB-D (color and depth) cameras such as the Kinect, which can represent an alternative to the highly costly and/or intrusive systems currently available. The method will be illustrated with several examples from human-robot and human-human interaction analysis, such as the automatic gaze coding of natural dyadic interactions.
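To make the gaze-to-VFOA step tangible, the sketch below assigns a discrete focus-of-attention target from an already-estimated 3D gaze ray by angular proximity. The gaze estimate itself (from RGB-D head pose and eye appearance) is assumed given; the scene layout and the angular threshold are invented for illustration.

```python
# Illustrative sketch only: mapping a 3D gaze ray to a discrete VFOA target.
import numpy as np

def vfoa(eye_pos, gaze_dir, targets, max_angle_deg=15.0):
    """Return the name of the target the gaze ray points at, or None."""
    gaze_dir = gaze_dir / np.linalg.norm(gaze_dir)
    best_name, best_angle = None, max_angle_deg
    for name, pos in targets.items():
        to_target = pos - eye_pos
        to_target /= np.linalg.norm(to_target)
        angle = np.degrees(np.arccos(np.clip(gaze_dir @ to_target, -1.0, 1.0)))
        if angle < best_angle:
            best_name, best_angle = name, angle
    return best_name

# Hypothetical scene in the sensor frame (meters): two interaction targets.
targets = {"person_B": np.array([0.6, 0.0, 1.2]),
           "screen":   np.array([-0.4, 0.2, 1.0])}
print(vfoa(np.zeros(3), np.array([0.45, 0.0, 1.0]), targets))  # -> person_B
```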
12:00h “Some challenges in ADAS based on computer vision”. Dr. Luis-Miguel Bergasa, UAH (Madrid, Spain)
ABSTRACT
In the last decade, research has moved towards more intelligent on-board systems for vehicles that aim to anticipate and avoid traffic accidents, or to mitigate their severity. These systems are referred to as Advanced Driver Assistance Systems (ADAS) in the sense that they assist the driver in making decisions, issue warnings in potentially dangerous driving situations, and perform counteractive measures in cases of unavoidable accidents. A major characteristic of ADAS is the requirement of observing and understanding key aspects of the vehicle's environment in real time.
For this purpose, the use of Computer Vision is becoming a de facto standard that most car manufacturers are incorporating into their models. The RobeSafe Group conducts exhaustive research in the field of safety and ADAS with the goal of transferring its technology to the automotive and infrastructure industries. This talk reviews some of the ADAS developed in the group and presents some of the papers accepted at the 2014 IEEE Intelligent Vehicles Symposium, to be held June 8-11 in Dearborn, Michigan, USA.
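As one concrete example of a vision-based ADAS building block, the sketch below runs a classic HOG + linear-SVM pedestrian detector on a dashcam frame. This is a generic OpenCV example, not the RobeSafe group's actual system, and the file names and threshold are assumptions.

```python
# Illustrative sketch only: pedestrian detection, a typical ADAS perception task.
import cv2

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

def detect_pedestrians(frame):
    """Return bounding boxes (x, y, w, h) of likely pedestrians."""
    boxes, weights = hog.detectMultiScale(frame, winStride=(8, 8), scale=1.05)
    return [box for box, w in zip(boxes, weights) if float(w) > 0.5]

# Hypothetical usage on a frame loaded from disk.
frame = cv2.imread("dashcam_frame.jpg")
if frame is not None:
    for (x, y, w, h) in detect_pedestrians(frame):
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imwrite("detections.jpg", frame)
```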
LUNCH BREAK
15:00h “User-centric need-driven affect modeling for spoken conversational agents: Design and evaluation”. Dr. Juan-Manuel Montero, UPM (Madrid, Spain)
ABSTRACT
One barrier to the creation of Spoken Conversational Agents (SCAs) has been the lack of methods for detecting and modelling emotions in a task-independent way. This seminar focuses on the design and evaluation of affective SCAs, from rule-based speech understanding, the automatic detection of the user's state, and the computation of the agent's internal emotional variables at each dialogue turn, to the expression of affect using emotional Text-To-Speech synthesis in Spanish.
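To illustrate what "internal emotional variables computed at each dialogue turn" might look like, here is a minimal sketch of a per-turn affect update. The state variables, update rule, and coefficients are assumptions for illustration, not the design presented in the talk.

```python
# Illustrative sketch only: an agent's internal affect, updated once per turn.
from dataclasses import dataclass

@dataclass
class AgentAffect:
    valence: float = 0.0   # displeasure (-1) .. pleasure (+1)
    arousal: float = 0.0   # calm (-1) .. excited (+1)

    def update(self, user_valence, user_arousal, task_success, decay=0.7):
        """Blend the decayed previous state with the detected user state
        and a need-driven term (did this turn advance the task?)."""
        need = 0.3 if task_success else -0.3
        self.valence = decay * self.valence + (1 - decay) * user_valence + need
        self.arousal = decay * self.arousal + (1 - decay) * abs(user_arousal)
        self.valence = max(-1.0, min(1.0, self.valence))
        self.arousal = max(-1.0, min(1.0, self.arousal))

agent = AgentAffect()
agent.update(user_valence=-0.6, user_arousal=0.8, task_success=False)
print(agent)  # an emotional TTS front end would map this state to a style
```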
ABSTRACT
The incorporation of expressive capabilities into speech technology greatly expands its potential fields of use: beginning with lie and emotion detectors, continuing with realistic expressive dialogue systems that adapt to the user's emotional state or that use emotions to better convey their own state, and finishing with more commercial applications such as automatic voice casting, synthesis for movies or video games, and the automatic synthesis of audiobooks with multiple characters.
In the speech signal, paralinguistic attributes such as intention, emotion, or speaking style are entangled with information about the speaker's identity, gender, and language. This introduces difficulties both in applications where we want to detect those paralinguistic features automatically in a speaker-independent way, and in applications where we want to express the emotional state of an artificial agent independently of the identity of the synthetic voice used.
This seminar will focus on the analysis of how to extract and model paralinguistic information (such as emotion or speaking style) from the acoustic signal independently of the speaker, and how to transplant those paralinguistic features to new target voices. Results obtained by perceptual evaluation show that the emotion or target speaking style is transplanted successfully and with appropriate strength, while keeping the identity of the target speaker.
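As a very simplified illustration of the transplantation idea, the sketch below maps a source utterance's F0 contour onto a target speaker's pitch range by matching log-F0 statistics, keeping the expressive contour shape. Real transplantation systems are far more elaborate; the statistics and values here are hypothetical.

```python
# Illustrative sketch only: moving an expressive F0 contour to a target
# speaker's range via log-F0 mean/variance matching.
import numpy as np

def transplant_f0(src_f0, src_stats, tgt_stats):
    """Keep the contour shape, re-express it in the target speaker's range."""
    src_mean, src_std = src_stats
    tgt_mean, tgt_std = tgt_stats
    z = (np.log(src_f0) - src_mean) / src_std   # speaker-normalized contour
    return np.exp(tgt_mean + tgt_std * z)       # denormalized to the target

# Hypothetical log-F0 statistics: expressive male source, neutral female target.
src_f0 = np.array([110.0, 150.0, 180.0, 130.0, 95.0])   # Hz, one utterance
out = transplant_f0(src_f0, src_stats=(np.log(120.0), 0.25),
                    tgt_stats=(np.log(210.0), 0.20))
print(np.round(out, 1))
```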
17:30h “Emotional speech analysis for stress detection”. Dr. Roberto Gil, UAH (Madrid, Spain)
ABSTRACT
Stress is a subject's reaction or response to daily mental, emotional, or physical challenges. Continuous monitoring of a subject's stress levels is key to understanding and controlling personal stress. Stress is expressed through physiological changes, emotional reactions, and behavioral changes. Among the physiological changes are increased adrenaline production to heighten concentration, a rise in heart rate, and the acceleration of reflexes. Emotional reactions, in turn, can be expressed through changes in the prosody of speech. In this talk we study the design of classification systems for stress levels using emotional speech analysis, paying special attention to the particularities of emotional speech databases and to feature extraction for speech analysis. For this purpose, genetic algorithms combined with bootstrapping techniques are useful tools for determining the feature combination that produces the lowest error.
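To show how the two tools mentioned above can fit together, here is a minimal sketch of genetic-algorithm feature selection whose fitness is a bootstrap (out-of-bag) estimate of classification error. The data, classifier, and GA settings are assumptions for illustration, not the system presented in the talk.

```python
# Illustrative sketch only: GA feature selection with bootstrap error fitness.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)

# Hypothetical prosodic features per utterance (columns) and stress labels.
X = rng.normal(size=(150, 12))
y = (X[:, 0] + 0.8 * X[:, 3] + rng.normal(scale=0.8, size=150) > 0).astype(int)

def bootstrap_error(mask, n_boot=20):
    """Out-of-bag error of a k-NN classifier on the selected features."""
    cols = np.flatnonzero(mask)
    if cols.size == 0:
        return 1.0
    errs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), len(y))          # bootstrap sample
        oob = np.setdiff1d(np.arange(len(y)), idx)     # out-of-bag rows
        clf = KNeighborsClassifier(5).fit(X[idx][:, cols], y[idx])
        errs.append(1 - clf.score(X[oob][:, cols], y[oob]))
    return float(np.mean(errs))

# Tiny GA: tournament selection, uniform crossover, bit-flip mutation.
pop = rng.integers(0, 2, size=(20, X.shape[1]))
for _ in range(15):
    fit = np.array([bootstrap_error(m) for m in pop])
    parents = pop[[min(rng.integers(0, 20, 2), key=lambda i: fit[i])
                   for _ in range(20)]]
    cross = rng.integers(0, 2, pop.shape).astype(bool)
    pop = np.where(cross, parents, np.roll(parents, 1, axis=0))
    pop ^= (rng.random(pop.shape) < 0.02).astype(pop.dtype)  # mutation

best = pop[np.argmin([bootstrap_error(m) for m in pop])]
print("selected features:", np.flatnonzero(best))
```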
LANGUAGE: English
VENUE: Sala de Grados, Escuela Politécnica Superior, UAH. Ctra. Madrid-Barcelona, km 33,600, 28805 Alcalá de Henares, Madrid, Spain
ORGANIZERS: Dr. Marta Marrón Romera, Dr. Javier Macías Guarasa, Dra. Cristina Losada, Dr. Manuel Mazo Quintas, Dr. Carlos Luna, Dra. Sira Palazuelos