
See below for system requirements and explanations.



System requirements

Sun Java 1.5 browser plugin

Explanations

EmoSpeak is an interface for testing and demonstrating emotional speech synthesis, created in the NECA project by Marc Schröder. In contrast to traditional emotional speech synthesis systems, the emotional state expressed through the voice is represented not in terms of discrete categories (joy, anger, sadness, ...), but by means of the continuous emotion dimensions activation, evaluation and power. The activation-evaluation space is represented as a circle, as in the Feeltrace tool.

The centre of the circle corresponds to a neutral state. Emotional intensity increases with the distance from the centre, i.e. states at the periphery are the most intense states.
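The relation between position and intensity can be sketched in a few lines of Python. This is only an illustration: the coordinate scale (both axes running from -1 to 1, with the rim at radius 1), the use of Euclidean distance, and the function name `emotional_intensity` are assumptions made for the example, not details taken from EmoSpeak itself.

```python
import math

def emotional_intensity(activation: float, evaluation: float) -> float:
    """Distance of a point from the centre of the circle, capped at the rim.

    Assumes both coordinates are scaled so that the rim lies at radius 1.0.
    """
    return min(1.0, math.hypot(activation, evaluation))

# The centre is the neutral state; points nearer the rim are more intense.
for point in [(0.0, 0.0), (0.3, 0.4), (0.6, -0.8)]:
    print(point, round(emotional_intensity(*point), 3))
```

Under these assumptions, the centre yields an intensity of zero and any point on or beyond the rim yields the maximum.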

The green dot marks the emotional state to be expressed. It can be moved with the mouse. When it moves, the acoustic parameters guiding the speech synthesis are updated according to a set of emotional prosody rules, and the resulting MaryXML document is displayed at the bottom of the applet window.
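To make the idea of a rule-based mapping concrete, here is a minimal sketch in Python. It assumes that each acoustic parameter is a linear function of the three emotion dimensions. The coefficients in `RULES`, the function names, and the simplified markup are all placeholders invented for the example; they are neither the actual NECA prosody rules nor a complete, valid MaryXML document.

```python
import xml.etree.ElementTree as ET

# Hypothetical coefficients: percent change of a parameter per unit of each
# emotion dimension. Placeholder values, not the rules used by EmoSpeak.
RULES = {
    "pitch": {"activation": 30.0, "evaluation": 10.0, "power": -10.0},
    "rate": {"activation": 40.0, "evaluation": 5.0, "power": 0.0},
}

def prosody_settings(activation: float, evaluation: float, power: float) -> dict:
    """Apply the linear rules to one emotional state."""
    dims = {"activation": activation, "evaluation": evaluation, "power": power}
    return {
        param: sum(coef[name] * dims[name] for name in dims)
        for param, coef in RULES.items()
    }

def to_markup(text: str, settings: dict) -> str:
    """Wrap the text in an illustrative, simplified MaryXML-like fragment."""
    root = ET.Element("maryxml")
    attrs = {param: f"{value:+.0f}%" for param, value in settings.items()}
    prosody = ET.SubElement(root, "prosody", attrs)
    prosody.text = text
    return ET.tostring(root, encoding="unicode")

settings = prosody_settings(activation=1.0, evaluation=0.0, power=0.0)
print(to_markup("Hallo Welt", settings))
```

Each time the dot moves, a function like `prosody_settings` would be re-evaluated and the displayed document regenerated.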

When the user pushes the Play button, the MaryXML document is sent to the Mary server at DFKI, and the resulting audio is played on the user's machine.

One important difference between this system and other diphone-based systems expressing emotions is that we can model diphone voice quality (in terms of vocal effort). The voices de6 and de7 (which you can download from the MBROLA homepage) have been recorded with three levels of vocal effort (soft, modal, loud) especially for the NECA project. Listen to the effect by comparing de6 and de7 to the other voices de1-de5 in which voice quality cannot be modelled!
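How an emotional state might select one of the three recorded vocal effort levels can be sketched as follows. The page does not state how the level is chosen, so this example simply assumes that activation (scaled from -1 to 1) drives the choice; the threshold value and the function name `vocal_effort` are hypothetical.

```python
def vocal_effort(activation: float, threshold: float = 0.33) -> str:
    """Pick one of the three recorded effort levels from the activation value.

    Assumes activation lies in [-1, 1]; the threshold is a placeholder.
    """
    if activation > threshold:
        return "loud"
    if activation < -threshold:
        return "soft"
    return "modal"

for a in (-0.8, 0.0, 0.8):
    print(a, vocal_effort(a))
```

For voices recorded at a single effort level, such as de1-de5, a selection step of this kind would have no effect.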

Note: If you want to further modify the MaryXML document or save the audio as a file, simply copy the MaryXML document from the applet and feed it into the Inside TTS interface.