It is well known that a speaker's emotional state usually alters the way she or he speaks. Although every component of the voice can be affected by emotion in some statistically significant way, not all of these deviations from a neutral voice are identified by human listeners as conveying emotional information. In this paper we report several perceptual and objective experiments that show the relevance of prosody and segmental spectrum in the characterization and identification of four emotions in Spanish. A Bayes classifier was used in the objective emotion identification task. Emotion models were generated as the contribution of each emotion to the build-up of a universal background emotion codebook. According to our experiments, surprise is primarily identified by humans through its prosodic rubric (in spite of some automatically identifiable segmental characteristics), whereas for anger the situation is just the opposite. Sadness and happiness need a combination of prosodic and segmental rubrics to be reliably identified.