Interspeech 2009, pp. 2431-2434.
Department of Computer Science, University of Texas at El Paso
Abstract: Spoken dialog systems today do not vary the prosody of their utterances, although prosody is known to have many useful expressive functions. In a corpus of memory quizzes, we identified eleven dimensions of prosodic variation, each with its own expressive function. We also identified the situations in which each was used, and determined how to recognize these situations from the dialog context and the prosody of the interlocutor's previous utterance. We implemented the resulting response rules and had 21 users interact with two versions of the system. Overall they preferred the version in which the prosodic forms of the acknowledgments were chosen to be suitable for each specific context. This suggests that simple adjustments to system prosody based on local context can have value to users.
Full Paper (pdf)
Seven prosodic variants of "good job" (au files):
neutral, creaky, upturn, enthusiastic, vibrato, elongated, creaky-elongated
Experimental procedure (video, 15MB, 3 minutes):
Nigel Ward's Publications