Mid-Level Prosodic Feature Toolkit

This open-source toolkit contains Matlab functions for computing prosodic features. It is intended for research and application development using statistical and machine-learning approaches.

It includes the Principal Components Analysis workflow that we are using to study action-coordinating prosody (paper) and language learners' prosody. Earlier versions were used for gaze-behavior modeling, prosodic-construction discovery, language modeling, information retrieval, and other purposes (publications).

The toolkit is written in Matlab and is self-contained except for pitch extraction, which is handled through a simple interface to the fxrapt function of VoiceBox. VoiceBox is also free and easy to download.

In contrast to the popular openSMILE feature set, these features are designed to capture the dialog-relevant aspects of prosody. The code works directly on audio data, without any need for segmentation or other preprocessing, and is designed for robustness, simplicity, and flexibility.

This code is public-domain freeware. You are welcome to do with it anything you like, without restriction. Although there are no known bugs, this code is provided as-is.

Download from GitHub, or get a significantly older local copy here: documentation, code (8.0M compressed tarfile).

Nigel Ward, 2015-2021, University of Texas at El Paso and Kyoto University