FIN-ToBI Tones and Break Indices for Finnish

Jarmo Välikangas
Department of Phonetics, University of Helsinki
3..2002

Abstract

ToBI (Tones and Break Indices) is a general notation system for spoken language. Its basic idea is to describe prosodic features such as intonational patterns and to standardise the description of prosody. At an adequate level of detail, the ToBI system can be adapted to describe Finnish boundary tones, phrase accents and breaks.

1 Introduction

The ToBI (Tones and Break Indices) system was originally developed [3] for describing the intonational features of English [10], and it has been successfully localised for many languages, including Dutch, Greek, Spanish, Italian, Japanese, Cantonese, Pan-Mandarin, French, Serbo-Croatian and Swedish. ToBI associates the speech signal (the fundamental frequency contour) with textual information. The basic system uses four text tiers.

1. An orthographic tier. As the name says, this tier represents the orthographic form of the words. In English, the orthographic form is 'a straightforward transcription of all words in the utterance' [2].

2. A break-index tier. The break index is a 5-step scale which describes how the breaks are interpreted. See Table 1.

Type  Description
0     cliticisation
1     phrase-medial word boundaries
2     a strong disjuncture marked by a pause
3     intermediate intonation phrase boundary
4     full intonation phrase boundary

Table 1: ToBI break indices

3. A tone tier. Pitch events in the fundamental frequency contour are marked as tones, either at intonational boundaries (phrasal tones) or on accented syllables (pitch accents). Phrasal tones are marked with the relative labels L (low) and H (high) and are assigned to intonation phrases or intermediate phrases. Intonational boundaries are marked with the boundary tone symbol '%'; for example, a low final intonational phrase boundary tone is marked 'L%'. Intermediate phrase boundaries (accents) are marked with '-' when they are connected with an intonational boundary and with '+' when they are not. Tone target syllables are marked with '*'. Intermediate phrase boundary markings are usually combined, as in 'L+H*' (rising peak accent). Unaccented syllables are left unmarked.

4. A miscellaneous tier. Miscellaneous tiers can be used for any other text, such as comments or extra-linguistic information.

2 Finnish prosodic and intonational features

2.1 Prosodic features

Prosodic features of speech may be described in terms of energetic, tonal and temporal phenomena. These are perceived as loudness (volume), melody (pitch) and duration (including pauses).

2.2 Intonational Features

2.2.1 Stress

Finnish stress as an intonational feature is twofold, comprising lexical stress and sentence accent. As Karlsson has described, Finnish lexical stress has a fixed place on the first syllable of the word, indicating that "there is a word junction before this syllable" [8]. Tuomainen et al. made the same observation when they concluded that word stress is the primary cue to word boundaries in Finnish. They also noted that a change in the fundamental frequency contour is the primary cue for the perception of word boundaries [12].

Within words, lexical stress describes the relative prominence of syllables, of which there are two kinds, strong and weak (see Table 2). On the segmental level, strong syllables tend to have more energy, longer sound segments and a more distinct vowel quality. Although word stress is a built-in feature of Finnish speech, different linguistic functions cause sentence-level features to dominate, and stress should be judged in the context of the whole utterance (Table 2). Iivonen et al. have categorised sentence accent into three types: (1) main stress (neutral, contrastive or emphatic), (2) secondary stress and (3) nonstressed [4]. Stress as an intonational feature can manifest itself through all three prosodic features, and it describes relative prominence in an utterance.

syllables: me nem me ju nal la muu laan
word stress: x x x
sentence accent: x x

Table 2: <We travel to Muula by train.> (<Menemme junalla Muulaan.>) Word stress and sentence accent; strong syllables are marked 'x'.

2.2.2 Phrasing

The most natural basis for phrasing is the breath group. Articulation and phonation follow breathing, which gives a phrase its natural start and end. However, spontaneous speech can have different units when the speaker uses the prosodic repertoire for communicative aims, for example by using pauses to divide the flow of speech into smaller phrases. See Figure 2.

2.2.3 Tune

Subglottal pressure decreases towards the end of an utterance. This is perceived as declination, i.e. a falling melody line. In terms of prosody, a falling melody line is typically associated with the standard Finnish declarative sentence. The end of a Finnish clause is assumed to be the lowest point of the fundamental frequency contour, and after the last word stress the voice quality is often laryngealised (creaky) [5, 4]. No specific final fundamental frequency contour for questions has been identified in spoken, spontaneous Finnish, but speakers can use fundamental frequency for contrast and emphasis, which may sometimes be understood as a question [, 4]. A rising fundamental frequency contour in sentence-final position, "dot-intonation", is sometimes used to express continuation, too [4].

Hirvonen and Iivonen have claimed that the interrogative function in Finnish is realised as an initial, globally high fundamental frequency [5, 6]. In the context of style research, Iivonen also connects the initial high fundamental frequency with information focusing. In news-style read speech, new topics (new news items) are highlighted by a higher initial fundamental frequency [7].
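The declination described in section 2.2.3 can be made concrete by fitting a least-squares line through the fundamental frequency samples of an utterance. The following is a minimal sketch and not part of the original analysis: the function name `declination_slope` and the sample values are hypothetical, chosen only to illustrate a falling contour.

```python
from typing import Sequence


def declination_slope(times_s: Sequence[float], f0_hz: Sequence[float]) -> float:
    """Return the least-squares slope (Hz per second) of an F0 contour.

    A negative slope corresponds to a falling melody line (declination).
    """
    if len(times_s) != len(f0_hz) or len(times_s) < 2:
        raise ValueError("need at least two paired (time, F0) samples")
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_f = sum(f0_hz) / n
    # Ordinary least squares: slope = cov(t, f0) / var(t)
    cov = sum((t - mean_t) * (f - mean_f) for t, f in zip(times_s, f0_hz))
    var = sum((t - mean_t) ** 2 for t in times_s)
    if var == 0:
        raise ValueError("time stamps must not all be identical")
    return cov / var


# Hypothetical voiced-frame samples (illustrative values, not measured data).
times = [0.0, 1.0, 2.0, 3.0, 4.0]
f0 = [120.0, 115.0, 110.0, 105.0, 100.0]
slope = declination_slope(times, f0)
print(f"{slope:.1f} Hz/s")  # prints -5.0 Hz/s
```

A single global slope is a deliberate simplification: it ignores local accent peaks and any initial high fundamental frequency, so it is only a rough summary of the overall trend.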

3 ToBI EXAMPLES

3.1 Tones

Interphrasal accents

Välimaa-Blum has introduced Pierrehumbert-style pitch accent analysis to Finnish, and her results are a relevant starting point for ToBI notation. Välimaa-Blum found two complex accents, 'L+H*' and 'L*+H', and two boundary tones, 'H%' and 'L%' [11]. The complex accent 'L+H*' and the boundary tones are illustrated in Figure 1 below.

L*+H  A prominent low syllable followed by an ascending fundamental frequency contour denotes late accentuation, i.e. displacement of the normal accent from the first syllable. No example here.

L+H*  Välimaa-Blum describes this kind of complex accent as a 'neutral contour [where] the accents form a gradually declining pattern of L+H* accents'. See Figure 1.

[Figure 1: pitch contour, Pitch (Hz, 0-200) against Time (s, 0-4.73687), with segmental labels and five L+H* accents on the tone tier.]

Figure 1: <The general grade for the food good, points for environment zero.> (<Yleisarvosana ruoasta hyvä, ympäristö pisteitä nolla>) Complex accents 'L+H*' and the fundamental frequency contour of a Finnish declarative sentence.

Intonational boundary tones

%H  As noted before, Iivonen et al. have claimed that an initial high fundamental frequency contour can be interpreted as a question or as an introduction to new thematic material [7].

%L  A low initial boundary tone seems to be strongly connected with coordinating and subordinating conjunctions like <mutta> (but) and <sillä> (because). No examples here.

H%  A rising boundary tone can be used in interphrasal position (see Figure 2), but it is very seldom used in sentence-final position. No examples here; see section 2.2.3 for references.

L%  The lowest point of the fundamental frequency contour is considered the typical end of a Finnish declarative sentence. See Figures 2 and 3 and section 2.2.3.

[Figure 2: pitch contour, Pitch (Hz, 0-200) against Time (s, 0-5.0828), with boundary tones H%, H% and L% on the tone tier.]

Figure 2: <Eila went to pick berries, Noora went to the mill, Leevi went fishing and the others went swimming.> (<Eila meni marjaan, Noora meni myllylle, Leevi meni ongelle ja muut menivät uimaan.>)

Figure 2 shows how the last words of intermediate phrases carry the boundary tone H% and the final intonational boundary carries the tone L%. Underlined words are the words with boundary tones.

Ladd has suggested that in some cases a complex phrase accent seems to result from natural declination [9]. This implies a different interpretation of accents in Finnish, too: there is not always an 'L' accent, but a normal falling declination followed by the simple accent 'H*'. This would reduce complex accents to simple ones and reserve possible 'L' or 'L*' occurrences for marked cases with accent. As noted before, Välimaa-Blum interpreted 'L*+H' as a late accent in which primary stress is displaced from the first syllable of the word [11]. Since there are no such phonologically distinctive features in Finnish, the simple 'H*' can also cover the complex 'L+H*' cases. Despite the absence of the accent 'L', phonetically prominent features can be communicated and mapped with other tiers.

3.2 Break Indices

0  The weakest form of break can be interpreted as a cliticised merging of two syllables. For example, <onko se...> (<is it...>) can be reduced to <onkse...>, omitting /o/ at the word boundary. Cliticisation does not account for differences between standard written forms and the estimated transcription form, nor for abbreviations, for example. A typical instance is free variation of allophones, as in Figure 1, where the written form <ruoan> is changed to the phonetic form [ru:an].

1  This break index describes normal word boundaries. In Finnish, juncture may need a more detailed description, as in <tämä nero> - <tämän ero> (<this genius> - <difference of this>), as may sandhi and initial doubling.

2  'A strong disjuncture marked by a pause or virtual pause', as the definition puts it, is open to various interpretations. In spontaneous speech the distinction between strong and weak disjunctures, whether rhythmic or tonal, is unclear.

3  Within a full intonational phrase, the intermediate intonation phrase boundary is the single phrase tone from the last pitch accent to the boundary. See Figure 3, where the boundary tone H% precedes break 3.

4  A full break is strongly connected with tonal boundaries and is rather unambiguous as such.

The combination of boundary tones, tone accents and breaks is illustrated in Figure 3. The first tier is the tone tier, where tonal events ('H*', 'L%') are marked at the prominent places of the fundamental frequency contour. The next tier lists the break indices (1, 3, 4) and connects them with the word boundaries and the orthographic tier.

[Figure 3: fundamental frequency (Hz) against time (s) with tone, break-index and orthographic tiers; orthographic tier: yleisarvosana ruoasta hyvä ympäristö pisteitä nolla.]

Figure 3: <The general grade for the food good, points for environment zero.> (<Yleisarvosana ruoasta hyvä, ympäristö pisteitä nolla>). An example of ToBI annotation with tonal, break and orthographic tiers.
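The three tiers described in section 3 can be held in a small data structure, one record per word, which also makes simple consistency checks possible. The sketch below is illustrative only: the class `AnnotatedWord`, the function `check_annotation` and the sample annotation are hypothetical and do not reproduce the labels of Figure 3. The tone inventory assumes the labels discussed in section 3.1, with the peak accent written 'H*' as in standard ToBI. The rule that break index 4 should carry a final boundary tone is an assumption drawn from the remark that a full break is strongly connected with tonal boundaries.

```python
from dataclasses import dataclass, field
from typing import List

# Break indices as listed in Table 1.
BREAK_INDICES = {
    0: "cliticisation",
    1: "phrase-medial word boundary",
    2: "strong disjuncture marked by a pause",
    3: "intermediate intonation phrase boundary",
    4: "full intonation phrase boundary",
}

# Tone labels discussed in section 3.1 (assumed inventory).
PITCH_ACCENTS = {"H*", "L+H*", "L*+H"}
INITIAL_BOUNDARY_TONES = {"%H", "%L"}
FINAL_BOUNDARY_TONES = {"H%", "L%"}
ALL_TONES = PITCH_ACCENTS | INITIAL_BOUNDARY_TONES | FINAL_BOUNDARY_TONES


@dataclass
class AnnotatedWord:
    """One word of the orthographic tier with its tone and break labels."""
    text: str          # orthographic tier
    break_index: int   # break-index tier: the break after this word
    tones: List[str] = field(default_factory=list)  # tone tier


def check_annotation(words: List[AnnotatedWord]) -> List[str]:
    """Return a list of human-readable problems; empty if none are found."""
    problems = []
    for w in words:
        if w.break_index not in BREAK_INDICES:
            problems.append(f"{w.text}: unknown break index {w.break_index}")
        for tone in w.tones:
            if tone not in ALL_TONES:
                problems.append(f"{w.text}: unknown tone label {tone!r}")
        # Assumed rule: a full break (4) carries a final boundary tone.
        if w.break_index == 4 and not FINAL_BOUNDARY_TONES & set(w.tones):
            problems.append(f"{w.text}: break 4 without H% or L%")
    return problems


# Hypothetical annotation for illustration (not the labels of Figure 3).
sample = [
    AnnotatedWord("menemme", 1, ["H*"]),
    AnnotatedWord("junalla", 1, ["H*"]),
    AnnotatedWord("Muulaan", 4, ["H*", "L%"]),
]
print(check_annotation(sample))
```

Keeping the tiers together per word is one possible design; a time-aligned representation with separate tiers, as in the annotation tool output of Figure 3, would be closer to the original labelling but needs time stamps that are not given in the text.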

4 Discussion

A suggestion for the basic tonal paradigm of Finnish intonation is

( %L | %H ) (H*)^n ( H% | L% ), n >= 1

where an optional initial boundary tone '%H' or '%L' is followed by one or more intermediate phrase accents 'H*' and a final intonational boundary tone 'H%' or 'L%'. The least ambiguous break indices are 1 and 4, which describe interphrasal word boundaries and intonational phrase breaks. More complicated break indices, such as 2, are unclear, and they need more research in the context of filled pauses and discontinuations.

From the phonetic point of view, the basic problem of ToBI is the relativity of its notation. The mark 'H*', which stands for phrase accents, does not tell much about the nature of the phonetic substance. There may be a turn in the fundamental frequency contour, to some extent, but 'H*' does not show where the slope starts and where it ends. In the more global perspective of the fundamental frequency contour, an 'H*' accent looks different in the initial position of the utterance than in the final position. In addition, from a phonetic point of view, accents may involve prosodic means other than the fundamental frequency contour: prominence may also be realised through energy and temporal arrangements.

ToBI brings out many points of interest. A strictly phonetic view calls for more detailed inspection, but on the other hand discourse analysis would happily connect the final boundary tone with a relevant place for turn-taking, for example. Furthermore, tones, accents and breaks are an appealing gateway to other linguistic functions such as information focus or speech acts.

References

[1] A.-L. Lehessaari. Alkoholin vaikutus puheen prosodiikkaan. PhD thesis, University of Helsinki, 1996.

[2] S. Baumann et al. ToBI - a phonological system for the transcription of German intonation. In Proc. Prosody 2000: Speech Recognition and Synthesis Workshop, 2000.

[3] M. E. Beckman et al. Guidelines for ToBI labelling. Technical report, The Ohio State University Research Foundation, 1993, version 3, March 1997.

[4] A. Iivonen et al. Puheen intonaatio. Helsinki: Gaudeamus, 1987.

[5] P. Hirvonen. Finnish and English Communicative Intonation. PhD thesis, University of Turku, 1970.

[6] A. Iivonen. Kommunikatiivisten aktien ja puhetyylien prosodiasta. In J. Tuomainen and S. Ojala, editors, 21. Fonetiikan päivät, pages 8, 2001.

[7] A. Iivonen et al. Prosodic characteristics in English, Finnish and German radio and TV newscasts. In A. Klippi and A. Iivonen, editors, Studies in Logopedics and Phonetics, number 6 in Publications of the Department of Phonetics, pages 47 68. Department of Phonetics, University of Helsinki, 1996.

[8] F. Karlsson. Yleinen kielitiede. Helsinki: Yliopistopaino, 1994.

[9] R. Ladd. Prosody: theory and experiment: studies presented to Gösta Bruce, volume 4 of Text, speech and information technology, chapter 2, pages 37 5. Kluwer Academic Publishers, 2000.

[10] J. Pierrehumbert et al. Plans and Intentions in Communication and Discourse. SDF Benchmark Series in Computational Linguistics. MIT Press, 1990.

[11] R. Välimaa-Blum. A pitch accent analysis of intonation in Finnish. Ural-Altaische Jahrbücher N.F., 2:82 9, 1993.

[12] J. Tuomainen et al. Fundamental frequency is an important acoustic cue to word boundaries in spoken Finnish. In J. J. Ohala et al., editors, Proceedings of the 14th International Congress of Phonetic Sciences, pages 92 923, August 1-7, 1999.