Phonetic Labelers Manual for the Conversational Grunt Project, Version 2.1

Nigel Ward, April 18, 2000

Introduction

The purpose of labeling is to provide support or disconfirmation for various hypotheses regarding grunts in conversation. This evidence will be of two types: observations about distribution, and observations about specific tokens. To support the distributional arguments, the data must be labeled consistently. The detailed arguments about specific tokens, however, will be done based on careful listening --- the labels will support this only by helping us decide which tokens to listen to.

The things to label are as follows:

  1. All grunts. Grunts are as defined in "Sound Symbolism in Communicative Grunts in Japanese and English (draft)". Include clicks and salient exhalations and inhalations. Do not include glottalizations or stutterings.
  2. All laughs, except for cases of laughing-while-speaking, where the laughter occurs while the speaker is saying some words.
  3. All words whose pronunciation is distorted to be grunt-like, for example "mright".
  4. All occurrences of English "yes", "no", and "so", and of Japanese "hai" and "so", as special cases.
  5. Interesting tokens (words) which appear in positions where grunts are common, such as back-channels and fillers. For example, "right", "and", "but", "you", and "I" often occur in these positions. Word fragments are generally interesting.

Each label consists of four fields:

  1. position/function code
  2. type code
  3. word or sound
  4. remarks (optional)
Fields are separated by colons. Spaces are not permitted anywhere within a label.
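
The field layout above can be sketched in code. The following is only an illustrative sketch, not a tool used by the project: the names `Label` and `parse_label` are hypothetical, and the choice to keep any further colons inside the remarks field is an assumption, since the manual does not say how such cases are handled.

```python
from typing import NamedTuple, Optional


class Label(NamedTuple):
    position: str            # position/function code, e.g. "a"
    type_code: str           # type code, e.g. "p"
    word: str                # word or sound, e.g. "yeah"
    remarks: Optional[str]   # optional free-form remarks


def parse_label(text: str) -> Label:
    """Split one label into its fields (hypothetical helper)."""
    # The manual forbids spaces in a label.
    if any(ch.isspace() for ch in text):
        raise ValueError(f"spaces are not permitted in a label: {text!r}")
    # Assumption: split at most three times, so that any additional
    # colons remain part of the remarks field.
    parts = text.split(":", 3)
    if len(parts) < 3:
        raise ValueError(f"expected at least three fields: {text!r}")
    position, type_code, word = parts[:3]
    remarks = parts[3] if len(parts) == 4 else None
    return Label(position, type_code, word, remarks)
```

For instance, `parse_label("a:b:really:quietly")` yields the word "really" with the remark "quietly", while `parse_label("a:p:yeah")` leaves the remarks field empty.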

Position/Function Codes

(Things occurring by themselves)

r
response: a short utterance by itself, produced as a response to a question etc. by the other speaker. A short utterance which is immediately followed by a longer utterance should be labeled a filler (f) or a lexical occurrence (l).
q
query: a short query. In English, typically "huh?".
a
aizuchi (back-channel): a sound occurring in isolation, during another person's turn or shortly after he finishes. Not required by the other person (unlike r). Not requiring a response from the other person (unlike q). Produced in response to a substantive turn by the other (unlike c and i). Further defined in "Prosodic Features which Cue Back-channel Responses in English and Japanese".
c
confirmation: a sound produced in response to a back-channel.
i
isolate: produced when neither person has the turn. Includes "post-completion" grunts, laughter in response to one's own joke, and self-directed speech including sighs. Typically comes after a second or more of silence on the left; if not, it's probably a "final particle" (j). Typically comes before a second or more of silence on the right; otherwise it's probably a filler (f).

(Things occurring with longer utterances)

f
filler or "filled pause". Turn-initial. Includes bids for the floor, whether or not they actually lead into an utterance. A single turn may start with a sequence of several fillers. Content words are not counted as fillers. Utterance-initial words like "but", "well", and "yeah" generally are.
j
final particle (especially common in Japanese): a particle occurring at the end of an utterance or of a clause. Typically serves to indicate something about the speaker's attitude to what he just said, or how he expects the listener to take it. Elicitations of acknowledgement ("right?" etc.), in particular, can be labeled j even when they are not utterance-final. Typical Japanese final particles are yo, ne, ze, zo, na, and jan. English final particles include "you know", "I think", "but", and "right", in certain positions and intonations.
d
disfluencies within a turn. Disfluencies at turn start should be labeled with f (not that it's always possible to distinguish between a new turn start and a continuation after a short pause). Lengthened sounds at turn end should, in most cases, be labeled as final particles (j).
l
lexical: Occurs within the body of a sentence, with no special intonation. A case where a grunt is behaving syntactically and prosodically just like a word. Typically occurring in quotes.

(Other)

o
other: Things that fit none of the categories above. Should be rare.
In cases where a token performs more than one function, choose the label based on the function which seems strongest.
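
The rule of thumb given for isolates (i), based on the silence to the left and right of a sound, can be written out as a small decision function. This is only a sketch of that heuristic: the function name, the use of milliseconds, and the order in which the two checks are made are all assumptions, and the labeler's own judgment of the strongest function still decides the final code.

```python
# "A second or more" of silence; expressing it in milliseconds is an assumption.
SILENCE_THRESHOLD_MS = 1000


def suggest_isolate_code(silence_before_ms: int, silence_after_ms: int) -> str:
    """Suggest a position code for a sound that seems to stand alone.

    Hypothetical helper: it returns only a first guess for the labeler.
    """
    # Assumption: the left-hand silence is checked first.
    if silence_before_ms < SILENCE_THRESHOLD_MS:
        return "j"  # little silence on the left: probably a final particle
    if silence_after_ms < SILENCE_THRESHOLD_MS:
        return "f"  # little silence on the right: probably a filler
    return "i"      # a second or more of silence on both sides: isolate
```

For example, a sigh preceded by 1500 ms and followed by 2000 ms of silence would be suggested as "i", whereas the same sound only 200 ms after the speaker's own utterance would be suggested as "j".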

Type Codes

b
an ordinary boring word, for example "tree" or "cat". This category also includes so, and, right, yes, and no, at least for now. In Japanese, we include ne, sa, and yo, as well as nanka, demo, so, soso, jan, and mon. If produced as a single intonation unit, asoodesuka and asoonanda can be considered words.
w
laughter (warai)
p
pure grunts. "uh-huh" is the classic example. This category also includes okay and yeah, at least for now. It also includes ya, iya, and ma.
c
compounds, such as "and-um". In most cases these should be broken up and labeled individually, as a word followed by a grunt.
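
The position/function codes and type codes listed above can be collected into lookup tables, so that labels using a code not documented here are flagged for review. The sketch below is illustrative only; `POSITION_CODES`, `TYPE_CODES`, and `check_codes` are hypothetical names, and flagging rather than rejecting unknown codes is an assumption.

```python
from typing import List

# Position/function codes as documented in this manual.
POSITION_CODES = {
    "r": "response",
    "q": "query",
    "a": "aizuchi (back-channel)",
    "c": "confirmation",
    "i": "isolate",
    "f": "filler",
    "j": "final particle",
    "d": "disfluency",
    "l": "lexical",
    "o": "other",
}

# Type codes as documented in this manual.
TYPE_CODES = {
    "b": "ordinary word",
    "w": "laughter",
    "p": "pure grunt",
    "c": "compound",
}


def check_codes(position: str, type_code: str) -> List[str]:
    """Return a list of problems; an empty list means both codes are documented."""
    problems = []
    if position not in POSITION_CODES:
        problems.append(f"unknown position/function code: {position!r}")
    if type_code not in TYPE_CODES:
        problems.append(f"unknown type code: {type_code!r}")
    return problems
```

Note that "c" appears in both tables with different meanings (confirmation as a position code, compound as a type code), so the two fields have to be checked against separate tables.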

Word or Sound Labels

This field is intended to allow simple gathering of statistics on types and tokens; it should therefore include the dictionary representation of the word. If the person says "perhap-", write "perhaps". If the person says "woelll", write "well". If the person says "myeah", write "yeah". If the person laughs, write whatever it sounds like, very approximately, or else just write "laugh".
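
To illustrate why the dictionary representation matters for statistics on types and tokens, here is a minimal sketch using Python's standard `collections.Counter`. The function name `type_token_counts` is hypothetical, and the word lists are made up for the illustration.

```python
from collections import Counter


def type_token_counts(words):
    """Count tokens (all occurrences) and types (distinct words)."""
    counts = Counter(words)
    return {
        "tokens": sum(counts.values()),  # total number of labeled items
        "types": len(counts),            # number of distinct word forms
        "by_word": counts,               # occurrences of each word form
    }


# With dictionary forms, two occurrences of "well" count as one type.
normalized = type_token_counts(["well", "well", "yeah"])

# If the heard sound were written instead, the same data would split
# into more types, which distorts the distributional statistics.
unnormalized = type_token_counts(["woelll", "well", "myeah"])
```

Here `normalized` has 3 tokens and 2 types, while `unnormalized` has 3 tokens and 3 types. The exact sound heard belongs in the remarks field instead.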

Remarks

Here you can write whatever you want. One good thing to do is note the exact sound you heard. Two styles are allowed: IPA-like labeling, and comic-strip-style labeling. You may also append comments, such as (breathy) or (soft). Listen to the actual sound --- do not assume that the grunts are respecting the phonotactics of the language.

Examples

d:b:like 0 500
j:b:like 3000 3410
a:p:yeah 4660 5140
a:p:yeah-yeah 10860 11600
l:b:but 14660 14902
a:p:eh 27040 27220
a:b:really:quietly 29160 29680
a:w:hh-hh 31800 32260
a:w:hhhh 35970 37600
f:c:an'-um 39300 39820
a:p:yeah 49810 50060
f:b:like 53290 53600
r:b:no 55770 56220
j:b:like 57100 57460
j:b:but 58350 58560
a:p:aaa 10380 10880
a:p:haa 16120 16800
q:b:ittenaii?_ 20920 21460
a:p:aaa 23720 24250
a:p:~a 25000 25315
a:p:aa 28600 28860
a:p:aa 31440 31690
a:x:po:? 32550 32680
a:p:eee 33560 34100
a:b:sukaa 35240 35960
f:p:(inhalation?) 36800 37500
f:p:oN 37500 37800
d:b:baaskeN 38620 39670
j:p:ja 40270 41270
a:p:aN 41730 42080
(inhalation) 43000 43980
a:w:haha 43980 44490
a:p:mm 46480 46740
x! 47240 47740
a:p:~a 47960 48280
a:p:aa 51540 52020
a:p:aN 54520 54770
a:b:sokka-sooka:sokk(u)sokka 55460 56280
a:p:aaa 59220 59650
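
The example lines above can be read mechanically. The following sketch assumes that the two numbers after each label are the start and end times of the token; the manual does not state this, nor the unit, so treat both as assumptions. The helper names are hypothetical, and lines without colon-separated codes, such as "x! 47240 47740", are grouped under "(uncoded)".

```python
from collections import defaultdict


def parse_line(line: str):
    """Split an example line into (label, start, end)."""
    # Assumption: the last two whitespace-separated items are start and end times.
    label, start, end = line.rsplit(None, 2)
    return label, int(start), int(end)


def durations_by_position(lines):
    """Group token durations (end minus start) by position/function code."""
    result = defaultdict(list)
    for line in lines:
        label, start, end = parse_line(line)
        # Lines lacking the colon-separated fields carry no position code.
        position = label.split(":", 1)[0] if ":" in label else "(uncoded)"
        result[position].append(end - start)
    return dict(result)


sample = [
    "a:p:yeah 4660 5140",
    "a:p:eh 27040 27220",
    "r:b:no 55770 56220",
    "x! 47240 47740",
]
durations = durations_by_position(sample)
```

With these four lines, `durations` maps "a" to [480, 180], "r" to [450], and "(uncoded)" to [500], in whatever unit the times are recorded.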