Symbolic Prosody

Prosodic Transcription Systems

ToBI, introduced above, can be used as a notation for transcribing prosodic training data and as a high-level specification for the symbolic phase of prosodic generation. Alternatives to ToBI also exist for these purposes, and some of them are amenable to automated prosody annotation of corpora. Some examples of this type of system are discussed in this section.

PROSPA was developed specifically to meet the needs of discourse and conversation analysis, and it has also influenced the Prosody Group in the European ESPRIT 2589 SAM (Multilingual Speech Input/Output Assessment, Methodology and Standardization) project [50]. The system has annotations for general or global trends over long spans, shown in Table 15.5; short, accent-lending pitch movements on particular vowels are transcribed as shown in Table 15.6; and the pitch shape after the last accent in a () sequence, or tail, is indicated as shown in Table 15.7.

Table 15.5 Annotations for general or global trends over long spans.

Symbol   Meaning
()       extent of a sequence of cohesive accents
F        globally falling intonation
R        globally rising intonation
H        level intonation on high tone level
M        level intonation on middle tone level
L        level intonation on low tone level
H/F      falling intonation on a globally high tone level
         sequence of weakly accented or unaccented syllables

Table 15.6 Annotations for accent-lending pitch movements on particular vowels.

Symbol   Meaning
+        upward pitch movement
         downward pitch movement
=        level pitch accent

Table 15.7 Annotations for pitch shape after the last accent in a () sequence, or tail.

Symbol   Meaning
         falling tails
/        rising tails
         level tails
/\       combinations of tails (rising-falling here)

INTSINT is a coding system of intonation described in [22]. It provides a formal encoding of the symbolic or phonologically significant events on a pitch curve. Each such target point of the stylized curve is coded by a symbol, either as an absolute tone, scaled globally with respect to the speaker's pitch range, or as a relative tone, defined locally in conjunction with the neighboring target points. Absolute tones in INTSINT are defined according to the speaker's pitch range, as shown in Table 15.8. Relative tones are notated in INTSINT with respect to the height of the preceding and following target points, as shown in Table 15.9.

Table 15.8 The definition of absolute tones in INTSINT.

Symbol   Meaning
T        top of the speaker's pitch range
M        initial, mid value
B        bottom of the speaker's pitch range

In a transcription, numerical values are retained for all F0 target points.

TILT [60] is one of the most interesting models of prosodic annotation. It can represent a curve in both its qualitative (ToBI-like) and quantitative (parametrized) aspects.

Generally, any interesting movement (potential pitch accent or boundary tone) in a syllable can be described in terms of TILT events, and this allows annotation to be done quickly by humans or machines without specific attention to linguistic/functional considerations, which are paramount for ToBI labeling. The linguistic/functional correlations of TILT events can be linked by subsequent analysis of the pragmatic, semantic, and syntactic properties of utterances.

Table 15.9 The definition of relative tones in INTSINT.

Symbol   Meaning
H        target higher than both immediate neighbours
L        target lower than both immediate neighbours
S        target not different from preceding target
U        target in a rising sequence
D        target in a falling sequence
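As an illustration of how the relative tones of Table 15.9 might be assigned, the following is a minimal sketch and not the INTSINT reference procedure. The function name `relative_tones`, the tolerance `tol_hz` used to decide that a target is "not different" from the preceding one, and the choice to leave the first and last targets unlabeled are all assumptions made for this example.

```python
from typing import List, Optional


def relative_tones(targets_hz: List[float], tol_hz: float = 2.0) -> List[Optional[str]]:
    """Label F0 target points with the relative tones H, L, S, U, D.

    Assumptions (not from the text): `tol_hz` is the difference below which
    two targets count as "not different"; the first and last targets have
    only one neighbour and are left as None.
    """
    labels: List[Optional[str]] = []
    last = len(targets_hz) - 1
    for i, f0 in enumerate(targets_hz):
        if i == 0 or i == last:
            labels.append(None)          # only one neighbour: left unlabeled
            continue
        prev, nxt = targets_hz[i - 1], targets_hz[i + 1]
        if abs(f0 - prev) <= tol_hz:
            labels.append("S")           # not different from preceding target
        elif f0 > prev and f0 > nxt:
            labels.append("H")           # higher than both neighbours
        elif f0 < prev and f0 < nxt:
            labels.append("L")           # lower than both neighbours
        elif f0 > prev:
            labels.append("U")           # rising sequence: prev < f0 <= next
        else:
            labels.append("D")           # falling sequence: next <= f0 < prev
    return labels


if __name__ == "__main__":
    print(relative_tones([100, 150, 120, 90, 110, 140, 140, 100]))
```

The numerical F0 values are kept alongside the labels, in line with the remark that a transcription retains them for all target points.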

The automatic parametrization of a pitch event on a syllable is in terms of:

  starting F0 value (Hz)
  duration
  amplitude of rise (A_rise, in Hz)
  amplitude of fall (A_fall, in Hz)
  starting point, time aligned with the signal and with the vowel onset

The tone shape, mathematically represented by its tilt, is a value computed directly from the F0 curve by the following formula:

$$\mathrm{tilt} = \frac{|A_{rise}| - |A_{fall}|}{|A_{rise}| + |A_{fall}|} \qquad (15.1)$$

Table 15.10 Label scheme for syllables.

Label    Meaning
sil      Silence
c        Connection
a        Major pitch accent
fb       Falling boundary
rb       Rising boundary
afb      Accent+falling boundary
arb      Accent+rising boundary
m        Minor accent
mfb      Minor accent+falling boundary
mrb      Minor accent+rising boundary
l        Level accent
lrb      Level accent+rising boundary
lfb      Level accent+falling boundary
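The tilt value of Equation 15.1 can be sketched in a few lines. This is an illustrative sketch that assumes tilt is the difference of the rise and fall amplitude magnitudes divided by their sum; the function name `tilt` and the decision to raise an error when there is no pitch movement at all are choices made for this example.

```python
def tilt(a_rise_hz: float, a_fall_hz: float) -> float:
    """Return the tilt of a pitch event from its rise and fall amplitudes (Hz).

    Magnitudes are used, so a fall may be passed as a positive or a
    negative number. Raising on zero total movement is an assumption of
    this sketch; the text does not say how such events are handled.
    """
    rise, fall = abs(a_rise_hz), abs(a_fall_hz)
    if rise + fall == 0:
        raise ValueError("no pitch movement: tilt is undefined")
    return (rise - fall) / (rise + fall)


if __name__ == "__main__":
    print(tilt(30.0, 0.0))    # pure rise
    print(tilt(0.0, 30.0))    # pure fall
    print(tilt(20.0, 20.0))   # rise followed by an equal fall
    print(tilt(30.0, 10.0))   # mostly rising
```

Under this definition the value ranges from -1 (pure fall) through 0 (equal rise and fall) to +1 (pure rise), which is what lets a single number stand in for the qualitative shape of the event.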

A likely syllable for tilt analysis in the contour can be automatically detected based on high energy and relatively extreme F0 values or movements. Human annotators can select syllables for attention and label their qualities according to Table 15.10.
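One way the automatic detection described above could be approximated is shown below. This is a rough sketch under stated assumptions, not the procedure used in [60]: the `Syllable` fields, the z-score thresholds `energy_z` and `f0_z`, and the minimum within-syllable movement `min_move_hz` are all hypothetical and would need tuning on real data.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import List


@dataclass
class Syllable:
    energy: float     # average energy over the syllable (hypothetical unit)
    f0_mean: float    # mean F0 over the syllable, in Hz
    f0_range: float   # max minus min F0 within the syllable, in Hz


def _zscores(values: List[float]) -> List[float]:
    """Standardize values over the utterance; all zeros if there is no spread."""
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mu) / sd for v in values]


def candidate_syllables(
    sylls: List[Syllable],
    energy_z: float = 0.5,      # assumed threshold for "high energy"
    f0_z: float = 1.0,          # assumed threshold for "relatively extreme F0"
    min_move_hz: float = 20.0,  # assumed threshold for an F0 "movement"
) -> List[int]:
    """Return indices of syllables that look like candidates for tilt analysis."""
    if not sylls:
        return []
    ez = _zscores([s.energy for s in sylls])
    fz = _zscores([s.f0_mean for s in sylls])
    return [
        i
        for i, s in enumerate(sylls)
        if ez[i] >= energy_z and (abs(fz[i]) >= f0_z or s.f0_range >= min_move_hz)
    ]


if __name__ == "__main__":
    utterance = [
        Syllable(1.0, 100.0, 5.0),
        Syllable(1.0, 100.0, 5.0),
        Syllable(5.0, 160.0, 40.0),
        Syllable(1.0, 100.0, 5.0),
        Syllable(1.0, 100.0, 5.0),
    ]
    print(candidate_syllables(utterance))
```

The flagged syllables would then be parametrized as tilt events or passed to a human annotator for labeling.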

