ANALYSIS AND SYNTHESIS OF F0 CONTOURS OF DECLARATIVE, INTERROGATIVE, AND IMPERATIVE UTTERANCES OF BANGLA


Anal Haque Warsi 1, Tulika Basu 1, Keikichi Hirose 2, Hiroya Fujisaki 3
1 Centre for Development of Advanced Computing (C-DAC), Kolkata
2 Department of Information and Communication Engineering, The University of Tokyo
3 Professor Emeritus, The University of Tokyo
anal.warsi@cdackolkata.in, tulika.basu@cdackolkata.in, hirose@gavo.t.u-tokyo.ac.jp, fujisaki@alum.mit.edu

ABSTRACT

This study first examines the differences in the gross features of the fundamental frequency contour (the F0 contour) that discriminate utterances of three sentence types in Bangla, namely declarative, imperative and interrogative. In order to realize these differences in speech synthesis, they are then interpreted as differences in the parameters of the command-response model for F0 contour generation. Finally, the results of the model-based analysis were used to generate synthetic speech stimuli for a perceptual experiment that verifies the results of the analysis. The experiment indicates that the synthesized F0 contours are quite satisfactory for the perception of the three sentence types in Bangla, and thus can be used in the concatenative Text-to-Speech System of Standard Colloquial Bangla (SCB) developed by C-DAC, Kolkata.

Index Terms: sentence type, F0 contour, analysis, synthesis, Bangla

1. INTRODUCTION

Speech is the most efficient means of communication among human beings. One feature that makes speech an effective medium is that it conveys linguistic, paralinguistic and nonlinguistic information simultaneously. Segmental features alone do not serve all these purposes. Communication becomes meaningful only when suprasegmental features are imposed on the segmental ones, and these features matter in speech production as well as in perception.
The rising and falling tones of a language, together with segmental duration and amplitude, separate ideas, distinguish questions from statements and mark special emphasis. Hence imposing knowledge of the prosody of a language is extremely important for machine-generated speech in spoken language technologies. These prosodic cues, however, cannot be quantified in an absolute manner: they are highly relative to individual speaking style, gender, dialect and other phonological factors [1].

The difficulty of characterizing suprasegmental features has resulted in various schemes for labeling or modeling the intonation of languages across the world. The ToBI labeling system and the command-response model are the most widely known among them. ToBI [2] is basically meant to capture the phonological aspect of speech. It is designed for transcribing the prosodic events of utterances in terms of a set of labels. Since prosodic organization differs from language to language, there are many different ToBI systems, each specific to one language [3]. However, because ToBI does not provide a quantitative description, its labels cannot be used for high-quality speech synthesis. On the other hand, the command-response model of fundamental frequency contour (henceforth F0 contour) generation by Fujisaki and his coworkers [4] is based on a quantitative formulation of the physiological and physical mechanisms of speech production, and thus can provide highly accurate reproduction and prediction of the tonal features of speech regardless of language [5].

Production and detection of prosodic features are vital in all applications of spoken language technology, including text-to-speech (TTS) synthesis [6], speech recognition [7], speech understanding [8], and speech-to-speech translation [9].
The preliminary research work described in this paper is motivated by the desire to enhance the intonation model within a text-to-speech synthesis framework for Bangla. C-DAC, Kolkata, jointly with the University of Tokyo, previously analyzed the F0 contours of Bangla declarative sentences using the command-response model. The results were used to identify the prosodic words and prosodic phrases from the text and to assign appropriate parameter values to their respective commands for use in the TTS system [10]. In the present study, an attempt is made to identify the differences in intonation among utterances of three types of Bangla sentences, i.e., declarative, imperative and interrogative, so that the findings can be used to improve the naturalness of the synthesized output of the Bangla Text-to-Speech system developed by C-DAC, Kolkata [11].

The rest of the paper is organized as follows. Section 2 gives a brief description of the command-response model. Section 3 describes the design of the speech corpus. Section 4 presents the experimental procedures along with their results. Finally, discussion and conclusion are presented in Section 5.

2. THE COMMAND-RESPONSE MODEL

The command-response model describes an F0 contour in the logarithmic scale as the superposition of a baseline value, phrase components, and accent components, as given by Equation (1) [4].

\[
\ln F_0(t) = \ln F_b + \sum_{i=1}^{I} A_{pi}\, G_p(t - T_{0i}) + \sum_{j=1}^{J} A_{aj} \left\{ G_a(t - T_{1j}) - G_a(t - T_{2j}) \right\} \tag{1}
\]

\[
G_p(t) =
\begin{cases}
\alpha^2 t \exp(-\alpha t), & t \ge 0 \\
0, & t < 0
\end{cases} \tag{2}
\]

\[
G_a(t) =
\begin{cases}
\min\left[ 1 - (1 + \beta t)\exp(-\beta t),\ \gamma \right], & t \ge 0 \\
0, & t < 0
\end{cases} \tag{3}
\]

Equation (2) represents the response of a second-order critically-damped linear filter (the phrase control mechanism) to an impulse called the phrase command, and Equation (3) represents the response of another second-order critically-damped linear filter (the accent control mechanism) to a step function called the accent command. The parameter α is the natural angular frequency of the phrase control mechanism, which essentially characterizes the rate of declination of the phrase component; β is the natural angular frequency of the accent control mechanism, which characterizes the rate of local rise and fall of the F0 contour; and γ is the ceiling parameter.

In Equation (1), Fb is the baseline frequency. Its value depends on the size, mass, and stiffness of the vocal folds, and therefore varies from individual to individual. Among utterances of a single speaker it does not vary greatly as long as the speaking style remains the same, but it can vary appreciably if the speaking style changes. In the present analysis, Fb was allowed to vary from utterance to utterance, since the aim was to produce the closest approximation to the measured F0 contour of each utterance.
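As a numerical illustration, Equations (1) to (3) can be sketched in a few lines of Python with NumPy. This is a minimal sketch rather than the analysis software used in the study: the function names are hypothetical, and the default values of α, β and γ (3.0/s, 20.0/s and 0.9) are common choices in the command-response literature that are assumed here, since the paper does not state the values it used.

```python
import numpy as np

def phrase_response(t, alpha=3.0):
    """Gp(t) of Equation (2): response of the phrase control mechanism to an impulse."""
    t = np.asarray(t, dtype=float)
    tp = np.maximum(t, 0.0)  # clamp so the exponential is never evaluated for t < 0
    return np.where(t >= 0.0, alpha ** 2 * tp * np.exp(-alpha * tp), 0.0)

def accent_response(t, beta=20.0, gamma=0.9):
    """Ga(t) of Equation (3): response of the accent control mechanism to a step."""
    t = np.asarray(t, dtype=float)
    tp = np.maximum(t, 0.0)
    rise = 1.0 - (1.0 + beta * tp) * np.exp(-beta * tp)
    return np.where(t >= 0.0, np.minimum(rise, gamma), 0.0)

def log_f0(t, fb, phrase_cmds, accent_cmds, alpha=3.0, beta=20.0, gamma=0.9):
    """ln F0(t) of Equation (1).

    phrase_cmds: iterable of (Ap, T0) pairs.
    accent_cmds: iterable of (Aa, T1, T2) triples.
    alpha, beta, gamma: assumed defaults, not values reported in the paper.
    """
    t = np.asarray(t, dtype=float)
    y = np.full_like(t, np.log(fb))
    for ap, t0 in phrase_cmds:
        y += ap * phrase_response(t - t0, alpha)
    for aa, t1, t2 in accent_cmds:
        y += aa * (accent_response(t - t1, beta, gamma)
                   - accent_response(t - t2, beta, gamma))
    return y
```

Taking `np.exp(log_f0(...))` gives the contour in Hz. Negative values of Ap or Aa are passed in unchanged, so the negative phrase and accent commands reported in Section 4 fit the same interface.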
I and J are the numbers of phrase and accent commands, respectively. Api is the magnitude of the i-th phrase command and Aaj is the amplitude of the j-th accent command. T0i is the time of occurrence of the i-th phrase command, while T1j and T2j are the onset and offset times of the j-th accent command.

3. DESIGN OF BANGLA SPEECH CORPUS

The Bangla speech corpus used in this study was designed and created by the authors at C-DAC, Kolkata.

3.1. Speech Material

Fifty sentences each of declarative, imperative and interrogative (yes-no question) types were designed for this study in such a way that, except for the punctuation mark, there was no difference in the textual content. Hence, except for prosody, there was no linguistic difference among the utterances of the three sentence types.

Table 1. Example of two sentences recorded in three utterance types, i.e., declarative, imperative and interrogative.

Sentence 1
  Bangla: / /
  IPA: /ebar gǥm lagaben/
  English translation:
    Declarative: (You will) Sow wheat seeds this season.
    Imperative: (You must) Sow wheat seeds this season.
    Interrogative: Will you sow wheat seeds this season?

Sentence 2
  Bangla: / /
  IPA: /tumi baȴar ȴao/
  English translation:
    Declarative: (Generally) You go to the market.
    Imperative: Go to the market.
    Interrogative: Do you go to the market?

3.2. Informant Selection and Recording

The speech was initially recorded from six female speakers. All of them are native speakers of Standard Colloquial Bangla (SCB), the official dialect of West Bengal, and are between 20 and 40 years of age. The metadata of the speakers is given in Table 2.

Table 2. Metadata of informants.

Serial Number   Age Group       Sex
1.              Between Years   F
2.              Between Years   F
3.              Between Years   F
4.              Between Years   F
5.              Between Years   F
6.              Between Years   F

The speech data of all six speakers was recorded in a studio environment and digitized at a sampling rate of 22,050 Hz with an accuracy of 16 bits/sample. The data was recorded with a Shure SM58 vocal microphone using Cool Edit Pro software. During recording, a constant distance was maintained between the microphone element and the speaker's mouth, and an identical environment was maintained for every speaker.

For each of the six speakers, recording was done in two sessions. In the first session the speaker was asked to give three repetitions such that each declarative utterance was followed by the interrogative and then by the imperative utterance. In the second session the speakers were asked to give another set of three repetitions, but this time they were asked to speak first all the declarative utterances, then all the interrogative utterances and finally all the imperative utterances.

In designing the speech material, the following points were kept in mind: (1) the sentences should be natural in their meaning; (2) unvoiced stops and fricatives were avoided in order to minimize segmental effects on the F0 contours. All the sentences used in this study were 5 to 8 syllables long and were uttered without an intra-sentential pause.

After recording, a listening test was conducted in order to select the data for analysis. For this purpose the utterances were randomly mixed and presented to 10 native listeners, who judged the utterance type. Only those utterances that all the listeners perceived as the intended type were selected for analysis.

4. EXPERIMENTAL PROCEDURE

The experiment was conducted in the following three stages: (i) analysis of gross features of F0 contours, (ii) model-based analysis of F0 contours and extraction of model parameters, and (iii) synthesis of speech based on the analysis results and perceptual verification of the validity of the synthesis.

4.1. Analysis of Gross Features of F0 Contours

First, the gross features of the F0 contours of declarative, imperative and interrogative utterances (henceforth denoted by Declaratives, Imperatives and Interrogatives, respectively) were analyzed for all six speakers.

4.1.1. Method of analysis

In order to capture the gross features of the F0 contours, F0 values were extracted every 20 ms. After obtaining frame-wise F0 values for the entire utterance, five values were measured as gross features representing the entire profile of the utterance: the maximum, the minimum, the mean, the standard deviation, and the ratio of maximum to minimum within the entire utterance. The computed features were then analyzed to identify the differences in F0, if any, among the three types of utterances.

4.1.2. Results

F0 analysis of the recorded data shows that Declaratives generally have a falling intonation and a terminal fall. Interrogatives have a gradually rising intonation and a terminal rise. Compared with Declaratives and Interrogatives, Imperatives tend to have a rather flat intonation, i.e., neither gradually falling (typical for Declaratives) nor gradually rising (typical for Interrogatives).

It is evident from the data in Table 3 that the maximum F0 (Max. F0) value plays a vital role in distinguishing Interrogatives from the other two types of utterances in Bangla: Interrogatives have much higher values of Max. F0 than their Declarative and Imperative counterparts. However, there is no significant difference in the minimum F0 (Min. F0) values among the three types of utterances.

Table 3. Comparison of maximum and minimum F0 values.
Max. F0 µ: σ: Min. F0 µ: σ: 6.72 µ: σ: µ: σ: 7.55 µ: σ: µ: σ:

Analysis of prosodic words in the utterances of all six speakers reveals that, compared to Declaratives and Imperatives, Interrogatives tend to have a much larger dynamic range of F0 in the last prosodic word of the utterance. Figure 1 represents the dynamic range of F0 expressed as the ratio of Max. F0 to Min. F0 of the final prosodic word of each utterance. In case of Interrogatives, F0 starts at Hertz and ends at Hertz. On the other hand, in Declaratives and Imperatives F0 starts at Hertz and ends at a value which is Hertz lower.

Figure 1: Dynamic range of F0 in the final prosodic word of Declaratives, Imperatives and Interrogatives of six speakers.
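The five gross features listed in the method of analysis of Section 4.1 can be computed from a frame-wise F0 track as in the following sketch. It assumes that unvoiced frames are marked by 0 or NaN and that the population standard deviation is intended; neither convention is stated in the paper, and the function name is hypothetical.

```python
import numpy as np

def gross_f0_features(f0_hz):
    """Gross features of one utterance from frame-wise F0 values (one per 20 ms frame).

    Assumption: frames with F0 <= 0 or NaN are unvoiced and are excluded.
    """
    f0 = np.asarray(f0_hz, dtype=float)
    f0 = f0[np.isfinite(f0)]
    voiced = f0[f0 > 0.0]
    if voiced.size == 0:
        raise ValueError("no voiced frames in the F0 track")
    return {
        "max": float(voiced.max()),
        "min": float(voiced.min()),
        "mean": float(voiced.mean()),
        "std": float(voiced.std()),  # population standard deviation (assumed)
        "max_min_ratio": float(voiced.max() / voiced.min()),
    }
```

Applying the same function to the frames of the final prosodic word alone yields the Max. F0 to Min. F0 ratio that is used as the dynamic range in Figure 1.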

4.2. Analysis of F0 Contours Using the Command-Response Model

Since the differences in the gross features cannot be directly utilized for speech synthesis, model-based analysis of the F0 contours is necessary. For this purpose, the command-response model parameters were extracted automatically and then fine-tuned manually. The extracted command parameters were then analyzed to find out how the observed differences in the gross features of the F0 contours of the three types of utterances can be explained in terms of the command-response model.

4.2.1. Method of analysis

To analyze the command parameters of an utterance, the extracted F0 contour (in the logarithmic scale) was decomposed into its constituents, i.e., the baseline frequency, the phrase components and the accent components. The magnitude and timing of the underlying phrase and accent commands were estimated using the Analysis-by-Synthesis method. The extracted parameters were then compared across the three utterance types to identify the differences among them. Figure 2 shows how the command parameters were estimated.

4.2.2. Results

From Tables 4 and 5 it is clear that Interrogatives have a much lower baseline frequency (Fb) and a higher value of utterance-initial Ap. Imperatives, on the other hand, have a higher Fb and a lower utterance-initial Ap. In addition to the initial positive phrase command, Declaratives and Imperatives have a negative phrase command towards the end of the utterance. The lead times of the phrase commands are presented in Table 6. From the data presented in Table 6 it is clear that in case of Imperatives the onset of the utterance-initial phrase command is delayed by milliseconds as compared with Declaratives and Interrogatives. In Declaratives, the negative phrase commands start approximately 80 milliseconds before the segmental onset of the first syllable of the final prosodic word.
In Imperatives, the negative phrase commands start approximately 33 milliseconds before the segmental onset of the last syllable of the final prosodic word of the utterance. Thus the falling F0 slope lasts longer in Declaratives than in Imperatives. This negative slope causes the terminal fall characterizing Declaratives.

Table 4. Base frequency in hertz (Fb)
µ: µ: µ: σ: σ: 4.54 σ: 8.15

Table 5. Phrase command magnitude (Ap)
Initial Final µ: µ: µ: σ: σ: σ: µ: µ: σ: σ:

Table 6. Phrase command lead-time [in s]
Initial Final µ: µ: µ: σ: σ: σ: µ: µ: σ: σ:

Figure 2: Examples of modeled F0 contour for three utterance types. The upper panel shows the speech waveform, the original (pink-crossed) and matched contour (blue-solid). The lower panel consists of two parts: the phrase command onset and magnitude, and accent command onset, offset and amplitude.

The study of accent commands shows that the intonation of most prosodic words can be modeled using a single negative accent command. The exception is the final prosodic word of Interrogatives, which requires an additional positive accent command (µ: and σ: 0.078). It is also observed that in Declaratives the negative accent command amplitude (Aa) increases from the phrase-initial position to phrase-medial positions. On the other hand, the amplitude of the negative accent command in Interrogatives decreases from the phrase-initial position to subsequent positions in the phrase. In Imperatives, the negative accent command amplitude remains more or less the same within a prosodic phrase (Figure 3). As for the onset and offset times of the accent commands, there is no significant difference across the three types of utterances in Bangla, and the onset and offset times are almost identical to those already reported for declaratives in [10].

Figure 3: Negative accent command amplitude (Aa) of Declaratives, Imperatives and Interrogatives within a prosodic phrase.

4.3. Speech Synthesis and Verification of Results

The synthesized F0 contour was evaluated using two methods: (a) an objective method and (b) a subjective method. A set of 25 Bangla sentences of each type (not included in the analysis data) was used for the evaluation.

4.3.1. Method of analysis

In the objective evaluation, the original F0 contour is compared with the synthesized F0 contour, and the root mean squared value of the difference is calculated in log F0.

For the subjective evaluation, five subjects, two males (L1, L2) and three females (L3, L4, and L5), were selected. All subjects are native speakers of Standard Colloquial Bangla and are not speech experts. In order to verify the results of the analysis, 25 synthetic stimuli of each type were generated using the ESNOLA-based Bangla Text-to-Speech system developed by C-DAC, Kolkata. The stimuli were presented in random order to the subjects, who categorized each one as Declarative, Imperative or Interrogative. The subjects were also asked to indicate their degree of confidence in each judgment on a 5-point scale (1: least confident, 5: most confident).
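The objective measure, the root mean squared difference between the original and synthesized contours in log F0, can be sketched as follows. The sketch assumes that both contours are sampled on the same frames and that only frames voiced in both contours (F0 > 0) are compared; the paper does not say how unvoiced frames were handled, and the function name is hypothetical.

```python
import numpy as np

def rms_log_f0_difference(f0_original_hz, f0_synth_hz):
    """Root mean squared difference of two F0 contours in the natural-log domain.

    Assumption: the contours are frame-aligned, and frames where either
    value is not positive are treated as unvoiced and skipped.
    """
    a = np.asarray(f0_original_hz, dtype=float)
    b = np.asarray(f0_synth_hz, dtype=float)
    if a.shape != b.shape:
        raise ValueError("contours must have the same number of frames")
    voiced = (a > 0.0) & (b > 0.0)
    if not voiced.any():
        raise ValueError("no frames voiced in both contours")
    diff = np.log(a[voiced]) - np.log(b[voiced])
    return float(np.sqrt(np.mean(diff ** 2)))
```

Because the difference is taken in log F0, the measure reflects relative rather than absolute deviations, so the same ratio between original and synthesized F0 contributes equally at low and high pitch.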
The steps required for the generation of the F0 contour are described below:

Step 1: Determine the sentence type, i.e., whether it is Declarative, Imperative or Interrogative, on the basis of the part-of-speech tags and the punctuation mark.
Step 2: Select the value of the baseline frequency based on the sentence type (Table 4).
Step 3: Predict the position of the phrase command.
Step 4: Select the values of the magnitude (Table 5) and lead time (Table 6) of the phrase command on the basis of the sentence type and the position of the phrase command.
Step 5: Predict the position and type of the accent command by identifying the prosodic words [10].
Step 6: Select an appropriate value for the amplitude of the accent command based on the sentence type and its position within a phrase.
Step 7: Select the T1 and T2 values based on the position and length of the prosodic word [10].
Step 8: Using the command-response model, synthesize the F0 contour based on the values predicted by the above steps.

4.3.2. Results

The objective evaluation of the synthesized F0 contours reveals that the root mean squared difference per sample between the synthesized and original contours for the above 75 sentences is quite small (0.051).

The result of the subjective evaluation is presented in Table 7. It is evident from the table that the majority of the identified sentence types match the intended type. Declarative stimuli are identified as intended 92.6% of the time, while 7.4% of the time they are identified as either Interrogatives or Imperatives. The mean certainty rating for declarative stimuli is 4.32 and it is highly significant for Declaratives (p<0.0001, F=18.12). Imperatives have an identification rate of 84.5%. They are mostly confused with Declaratives (12.3%) and sometimes with Interrogatives (3.2%). The mean certainty rating for imperative stimuli is 3.70 and it is significant for Imperatives (p<0.0001, F=34.8).
Interrogative stimuli are identified as intended 90.2% of the time, while 9.8% of the time they are categorized as either Declaratives or Imperatives. The mean certainty rating for interrogative stimuli is 4.52 and it is also highly significant for Interrogatives (p<0.0001, F=24.37).

Table 7. Identification result in percentage.
Declaratives Imperatives Interrogatives
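Steps 1 to 4 of the F0 generation procedure in Section 4.3 can be sketched as a table lookup keyed by sentence type. Everything numeric in `PLACEHOLDER_PARAMS` is a hypothetical placeholder chosen only to mirror the qualitative pattern reported in Section 4.2 (lower Fb and larger initial Ap for Interrogatives, a final negative phrase command for Declaratives and Imperatives); these are not the measured values of Tables 4 to 6. Only the final lead times of about 80 ms and 33 ms are taken from the text. Step 1 is reduced to a punctuation check, with "!" assumed as the imperative marker, and the part-of-speech tags are omitted.

```python
# HYPOTHETICAL placeholder values, not the measurements of Tables 4-6.
PLACEHOLDER_PARAMS = {
    "declarative":   {"fb_hz": 180.0, "ap_initial": 0.40, "lead_initial_s": 0.20, "ap_final": -0.20},
    "imperative":    {"fb_hz": 190.0, "ap_initial": 0.30, "lead_initial_s": 0.15, "ap_final": -0.20},
    "interrogative": {"fb_hz": 170.0, "ap_initial": 0.50, "lead_initial_s": 0.20},
}

# Approximate lead times of the final negative phrase command (from Section 4.2).
FINAL_LEAD_S = {"declarative": 0.080, "imperative": 0.033}

def sentence_type(text):
    """Step 1, simplified: classify by the final punctuation mark only (assumed markers)."""
    stripped = text.strip()
    if stripped.endswith("?"):
        return "interrogative"
    if stripped.endswith("!"):
        return "imperative"
    return "declarative"

def plan_phrase_commands(text, utterance_onset_s, final_anchor_s):
    """Steps 1-4: return (sentence type, Fb in Hz, list of (Ap, T0) phrase commands).

    final_anchor_s is the segmental onset of the first syllable of the final
    prosodic word for Declaratives, or of its last syllable for Imperatives.
    """
    kind = sentence_type(text)
    params = PLACEHOLDER_PARAMS[kind]
    commands = [(params["ap_initial"], utterance_onset_s - params["lead_initial_s"])]
    if kind in FINAL_LEAD_S:
        commands.append((params["ap_final"], final_anchor_s - FINAL_LEAD_S[kind]))
    return kind, params["fb_hz"], commands
```

Steps 5 to 7 depend on the prosodic word rules of [10] and are not sketched here. Once the accent commands are available, Step 8 amounts to evaluating Equation (1) with the selected Fb and the phrase and accent commands.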

5. DISCUSSION AND CONCLUSION

The present study first showed the prosodic differences that exist among the three major utterance types of Bangla in terms of gross features of F0 contours. The differences were then analyzed and quantified more precisely using the command-response model, and expressed as differences in its model parameters. The results were further used for generating F0 contours of synthetic speech and were perceptually verified, which supports the usefulness of the command-response model in analysis as well as in high-quality speech synthesis of Bangla. This encourages us to explore further in this direction and to analyze F0 contours of other types of Bangla utterances for the purpose of high-quality text-to-speech synthesis.

6. ACKNOWLEDGEMENTS

The authors are very grateful to all the informants and evaluators for their unconditional participation in the experiments.

7. REFERENCES

[1] Rangarajan, Vivek., Narayanan, S., Bangalore, S., Acoustic-Syntactic Maximum Entropy Model for Automatic Prosody Labeling, Proc. of the IEEE/ACL Workshop on Spoken Language Technology.
[2] Silverman, K., Beckman, M., Pitrelli, J., Ostendorf, M., Wightman, W. C., Price, P., Pierrehumbert, J., Hirschberg, J., ToBI: A Standard Scheme for Labeling Prosody, Proc. of the Second International Conference on Spoken Language Processing, Banff, Canada.
[3]
[4] Fujisaki, H., Hirose, K., Analysis of Voice Fundamental Frequency Contours for Declarative Sentences of Japanese, J. Acoust. Soc. Japan (E), vol. 5, no. 4.
[5] Fujisaki, H., Information, Prosody, and Modeling, Proc. of Speech Prosody 2004, Nara, Japan, 1-10.
[6] Bulyko, I., and Ostendorf, M., Joint Prosody Prediction and Unit Selection for Concatenative Speech Synthesis, Proc. of ICASSP.
[7] Hasegawa-Johnson, M., Chen, K., Cole, J., Borys, S., Kim, S.-S., Cohen, A., Zhang, T., Choi, Y., Kim, H., Yoon, T.-J., Chavara, S., Simultaneous Recognition of Words and Prosody in the Boston University Radio Speech Corpus, Speech Commun., vol. 46.
[8] Nöth, E., Batliner, A., Kießling, A., Kompe, R., and Niemann, H., VERBMOBIL: The Use of Prosody in the Linguistic Components of a Speech Understanding System, IEEE Trans. Speech Audio Process., vol. 8, no. 5.
[9] Aguero, P.D., Adell, J., Bonafonte, A., Prosody Generation for Speech-to-Speech Translation, Proc. of ICASSP.
[10] Das Mandal, S. K., Warsi, A. H., Basu, T., Hirose, K., Fujisaki, H., Analysis and Synthesis of F0 Contours for Bangla Readout Speech, Proc. of Oriental COCOSDA 2010, Kathmandu, Nepal.
[11] Das Mandal, S. K. and Datta, A. K., Epoch Synchronous Non-OverLapping Add (ESNOLA) Method Based Concatenative Synthesis System for Bangla, Proc. of 6th ISCA Workshop on Speech Synthesis, University of Bonn, Germany, 2007.


More information

Emotion Detection from Speech

Emotion Detection from Speech Emotion Detection from Speech 1. Introduction Although emotion detection from speech is a relatively new field of research, it has many potential applications. In human-computer or human-human interaction

More information

DETECTION OF QUESTIONS IN CHINESE CONVERSATIONAL SPEECH. Jiahong Yuan & Dan Jurafsky. Stanford University {jy55, jurafsky}@stanford.

DETECTION OF QUESTIONS IN CHINESE CONVERSATIONAL SPEECH. Jiahong Yuan & Dan Jurafsky. Stanford University {jy55, jurafsky}@stanford. DETECTION OF QUESTIONS IN CHINESE CONVERSATIONAL SPEECH Jiahong Yuan & Dan Jurafsky Stanford University {jy55, jurafsky}@stanford.edu ABSTRACT What features are helpful for Chinese question detection?

More information

Use of Human Big Data to Help Improve Productivity in Service Businesses

Use of Human Big Data to Help Improve Productivity in Service Businesses Hitachi Review Vol. 6 (216), No. 2 847 Featured Articles Use of Human Big Data to Help Improve Productivity in Service Businesses Satomi Tsuji Hisanaga Omori Kenji Samejima Kazuo Yano, Dr. Eng. OVERVIEW:

More information

Speech Signal Processing: An Overview

Speech Signal Processing: An Overview Speech Signal Processing: An Overview S. R. M. Prasanna Department of Electronics and Electrical Engineering Indian Institute of Technology Guwahati December, 2012 Prasanna (EMST Lab, EEE, IITG) Speech

More information

Automatic Detection of Emergency Vehicles for Hearing Impaired Drivers

Automatic Detection of Emergency Vehicles for Hearing Impaired Drivers Automatic Detection of Emergency Vehicles for Hearing Impaired Drivers Sung-won ark and Jose Trevino Texas A&M University-Kingsville, EE/CS Department, MSC 92, Kingsville, TX 78363 TEL (36) 593-2638, FAX

More information

Ohio Early Learning and Development Standards Domain: Language and Literacy Development

Ohio Early Learning and Development Standards Domain: Language and Literacy Development Ohio Early Learning and Development Standards Domain: Language and Literacy Development Strand: Listening and Speaking Topic: Receptive Language and Comprehension Infants Young Toddlers (Birth - 8 months)

More information

The effect of mismatched recording conditions on human and automatic speaker recognition in forensic applications

The effect of mismatched recording conditions on human and automatic speaker recognition in forensic applications Forensic Science International 146S (2004) S95 S99 www.elsevier.com/locate/forsciint The effect of mismatched recording conditions on human and automatic speaker recognition in forensic applications A.

More information

ARTICLE. Sound in surveillance Adding audio to your IP video solution

ARTICLE. Sound in surveillance Adding audio to your IP video solution ARTICLE Sound in surveillance Adding audio to your IP video solution Table of contents 1. First things first 4 2. Sound advice 4 3. Get closer 5 4. Back and forth 6 5. Get to it 7 Introduction Using audio

More information

From Concept to Production in Secure Voice Communications

From Concept to Production in Secure Voice Communications From Concept to Production in Secure Voice Communications Earl E. Swartzlander, Jr. Electrical and Computer Engineering Department University of Texas at Austin Austin, TX 78712 Abstract In the 1970s secure

More information
