Introduction to Prosody: A Mini-Tutorial and a Short Course

Introduction to Prosody: Tutorials and a Short Course

Version 4: Prosody Tutorial Video Series

In 29 Lectures, with Gina Levow. Playlist at YouTube

Version 3: Prosody: Models, Methods and Applications

Tutorial at ACL 2021, with Gina Levow

Extended Abstract

Starter Bibliography

Slides, as of July 27 (45MB zip)

Version 2: Prosody Research and Applications: The State of the Art

A Mini-Tutorial at Interspeech 2019

Prosody is essential in human interaction, enabling people to show interest, establish rapport, efficiently convey nuances of attitude or intent, and so on. Prosody is relevant to every area of speech science, but our current understanding of prosody is fragmentary. In contrast to some areas of speech technology, where superhuman performance has been demonstrated on core tasks, models and techniques for handling prosody have lagged. This survey will give non-specialists the knowledge needed to decide whether and how to integrate prosodic information into their models and systems. It will give an overview of the different ways in which prosody serves paralinguistic, phonological and pragmatic functions; discuss the roles of prosody in applications including speech recognition, speech synthesis, dialog systems, information retrieval and the inference of speaker states and traits; and present current trends, including modeling prosody beyond just intonation, representing prosodic knowledge with constructions of multiple prosodic features in specific temporal configurations, modeling observed prosody as the result of the superposition of patterns representing independent intents, modeling multispeaker phenomena, and the use of sequence-to-sequence methods and unsupervised methods. Finally, we will consider remaining challenges in research and applications.

Slides

Video of Talk

Short Bibliography (now replaced by a new version)

Version 1: Introduction to Prosody

A Short Course at the 2019 Linguistics Institute

Instructors: Nigel G. Ward, University of Texas at El Paso; Francisco Torreira, McGill University

Course Description

Prosody, broadly defined as the aspects of spoken utterances that are not governed by segmental contrasts, is challenging to analyze because it operates close to the limits of conscious introspection, and because most spoken utterances involve multiple prosodic dimensions simultaneously conveying multiple meanings or serving multiple communicative functions. This course will help participants learn to identify, discover, and describe meaningful prosodic properties and patterns in spoken utterances.

The approach will be theory-neutral and descriptively eclectic. The focus will be on primary observation and preliminary analysis and ideation rather than hypothesis testing based on pre-existing theories. The course will include lectures, ear and production training exercises, discussions of readings, qualitative and quantitative analysis with Praat and other tools, hands-on analysis of provided and contributed data, and the development and presentation of student research proposals. The course is designed to be broadly accessible, with knowledge of phonetics not required. Case studies will, depending on student interests, include sociolinguistic differences in the production and perception of prosodic forms, the mapping between prosody and other layers of linguistic and communicative organization (e.g. syntax, discourse, conversational turn-taking), cross-language comparisons, cross-cultural issues, and the prosody of non-native speakers.

Syllabus (tentative)

Motivation for the Course

Prosody is of wide cross-cutting interest, and this course will highlight its relevance to topics beyond phonetics: grammar, discourse analysis, pragmatics, nonverbal communication, and other areas. The course relates to the theme of Language in the Digital Era by introducing participants to software tools and computational methods for analyzing prosody. There will also be digressions on ways to model prosody for applications including emotion detection, personality inference, detection of medical conditions, speech synthesis, information retrieval, engendering rapport in virtual agents, and so on.

Logistics

This course was held in June and July 2019 in Davis, California, as part of the LSA 2019 Linguistics Institute.

Target Learning Outcomes

Upon successful completion of this course, students will be able to:

Perceive subtle prosodic properties of utterances in any language
Produce exemplars of a wide range of meaningful prosodic features and patterns
Describe prosodic observations using standard terminology and notations
Explain the articulatory, acoustic, and perceptual bases of prosodic features
Appreciate the diversity of prosody across individuals, dialects, and languages
Relate new observations to common patterns of prosodic form and common ways in which forms convey meaning, across diverse languages
Understand and critically evaluate new research
Contribute to human knowledge of prosody, by choosing and applying research methods appropriate for the goal, including at the stages of discovery and ideation, hypothesis formation, research design, prosodic feature selection, and tool use
Tell their own stories of discovery
Explain the differences between and connections among the traditional schools of thought in prosody, including their methods, terminology, and formalisms
Describe the limits of our current knowledge and the research challenges involved in extending it
Discuss how to exploit prosody for various engineering and other practical applications

Audio Examples for Exercises

Exercise I1: chodai-f dakko-f damedame-f nene-f achichi-m baibai-m
Exercise I2: good job: A B C D E F G (from Ward and Escalante-Ruiz, 2009)
Exercise I3: -creaky +creaky -nasal +nasal -nasal +nasal (from Ward 2019)
Assignment A1: aa bb cc dd ee ff gg hh ii jj kk
Assignment A2:
- The Speech Accent Archive
- es1 L2-es1 es2 L2-es2 es3 (left track) L2-es3 es4 (right track) L2-es4
Exercise I4: chotto matte, gohan desu yo, ii naa, meh, awww, I'm good
Other: this class, from the Santa Barbara Corpus