Telecom RealSpeak / Host Software Development Kit



Similar documents
RealSpeak Telecom Software Development Kit

How To Write A Spainese Text File In English (Spain) On A Microsoft Powerbook (Spanish)

Exemplar for Internal Achievement Standard. Spanish Level 1

demonstrates competence in

AP SPANISH LANGUAGE 2013 PRESENTATIONAL WRITING SCORING GUIDELINES

A mysterious meeting. (Un encuentro misterioso) Spanish. List of characters. (Personajes) Khalid, the birthday boy (Khalid, el cumpleañero)

Telecom RealSpeak / Host Software Development Kit

The New Forest Small School

SOLARWINDS, INC. ipmonitor 8.0 MANAGER END USER LICENSE AGREEMENT REDISTRIBUTION NOT PERMITTED

Microsoft Dynamics GP. SmartList Builder User s Guide With Excel Report Builder

Contents. Introduction Chapter 1 Articles Chapter 2 Nouns Chapter 3 Adjectives Chapter 4 Prepositions and Conjunctions...

RealSpeak Telecom Software Development Kit

90 HOURS PROGRAMME LEVEL A1

AP SPANISH LANGUAGE 2011 PRESENTATIONAL WRITING SCORING GUIDELINES

Copyright TeachMe.com 242ea 1

Information Technology Fundamentals

Teacher: Course Name: Spanish I Year. World Language Department Saugus High School Saugus Public Schools

Nuance Mobile Developer Program. HTTP Services for Nuance Mobile Developer Program Clients

Considerations for developing VoiceXML in Canadian French

Lesson B: Ordering food and drinks

Control of a variety of structures and idioms; occasional errors may occur, but

Text-To-Speech Technologies for Mobile Telephony Services

Prepare to speak Spanish Out There

SCORE DESCRIPTION TASK COMPLETION / TOPIC DEVELOPMENT LANGUAGE USE 5 Demonstrates

How To Speak Spain

CITRIX SYSTEMS, INC. SOFTWARE LICENSE AGREEMENT

DTD Tutorial. About the tutorial. Tutorial

Microsoft Dynamics GP. Bank Reconciliation

Microsoft Small Business Financials. Small Business Center Integration

Resources to Help Assess & Plan for Language Access

Knowledge. Subject Knowledge Audit - Spanish Meta-linguistic challenges full some none

THE FOUR AMBASSADORS ASSOCIATION, INC.

Sir John Cass Red Coat School Programme of Study Key Stage 4 Subject: Spanish

Microsoft Dynamics GP. Working with Crystal Reports

to Voice. Tips for to Broadcast

Microsoft Dynamics GP. Project Accounting Billing Guide

Canon USA, Inc. WEBVIEW LIVESCOPE SOFTWARE DEVELOPMENT KIT DEVELOPER LICENSE AGREEMENT

Easy Manage Helpdesk Guide version 5.4

Develop Software that Speaks and Listens

Spanish I: Youtube End of Year Review

WORKFLOW INTEGRATOR INSTALLATION GUIDE

HSS Specific Terms HSS SOFTWARE LICENSE AGREEMENT

S P A N I S H G R A M M A R T I P S

OFFICE OF COMMON INTEREST COMMUNITY OMBUDSMAN CIC#: DEPARTMENT OF JUSTICE

Website Hosting Agreement

Jet Data Manager 2012 User Guide

Plan for: Preliminary Lessons Spanish Señora Franco

SPANISH 3HY. Course Description. Course Goals and Learning Outcomes. Required Materials

8.4 Comparisons and superlatives

Spanish Survival Phrases

Table of Contents. Teacher Created Materials #13108 (i4695) Families 3

RockWare Click-Wrap Software License Agreement ( License )

Microsoft Dynamics GP. Project Accounting Cost Management Guide

Oracle Communications Network Charging and Control. Session Initiation Protocol (SIP) Protocol Implementation Conformance Statement Release 5.0.

Points of Interference in Learning English as a Second Language

FLoader User's Manual

CORE TECHNOLOGIES CONSULTING, LLC SOFTWARE UNLIMITED ENTERPRISE LICENSE AGREEMENT

APS ELEMENTARY SCHOOL PLANNING SURVEY

Removing Language Barriers: Reaching Your Spanish Speaking Audience

Scanner Wedge for Windows Software User Guide

Microsoft Dynamics GP. Payroll Connect

Intel Device View. User Guide

ALM Works End-User License Agreement for Structure Plugin

MDM Zinc 3.0 End User License Agreement (EULA)

Shortcut to Informal Spanish Conversations Level 2 Lesson 1

S.2.2 CHARACTER SETS AND SERVICE STRING ADVICE: THE UNA SEGMENT

Xcode User Default Reference. (Legacy)

PARITY SOFTWARE S SAGE ERP X3 CASHBOOK USER MANUAL

ALPHA TEST LICENSE AGREEMENT

Copy Tool For Dynamics CRM 2013

XML: extensible Markup Language. Anabel Fraga

Memorial Health Care System Catholic Health Initiatives Financial Assistance Application Form

Symbols in subject lines. An in-depth look at symbols

Mobile Banking and Mobile Deposit Terms & Conditions

In-Situ Inc. Warranty / Terms & Conditions / Software License. This warranty policy applies to items that shipped prior to January 1, 2012.

Microsoft Dynamics GP. Bill of Materials

How To Speak Spain

Your summer goal: To practice what you have been learning in Spanish and learn more about the Spanish language and Spanish-speaking cultures.

C-DAC Medical Informatics Software Development Kit End User License Agreement

BALANCE DUE 10/25/2007 $ STATEMENT DATE BALANCE DUE $ PLEASE DETACH AND RETURN TOP PORTION WITH YOUR PAYMENT

END USER LICENSE AGREEMENT

Policy Based Encryption Z. Administrator Guide

Copyright TeachMe.com 4ea67 1

Spanish (TR 9:30 10:50) Course Calendar Spring 2015

Sophos for Microsoft SharePoint Help. Product version: 2.0

Software License Agreement

Request For Quote. Reference Manual. Integrates with Microsoft Dynamics GP v10

SPANISH MOOD SELECTION: Probablemente Subjunctive, Posiblemente Indicative

Field Properties Quick Reference

CURRICULUM MAPPING. Content/Essential Questions for all Units

RealSpeak Telecom Software Development Kit

Bachmann PrintBoy SDK HTML Edition API Reference for CodeWarrior C/C++ Developers version 7.0

SUBCHAPTER A. AUTOMOBILE INSURANCE DIVISION 3. MISCELLANEOUS INTERPRETATIONS 28 TAC 5.204

Common definitions and specifications for OMA REST interfaces

Rethinking Schools Limited Institutional Site License

Transcription:

Telecom RealSpeak / Host Software Development Kit User s Guide for Mexican Spanish V4.0

EV AL U AT I O N AG R E E M E N T READ THE FOLLOWING CAREFULLY BEFORE USING THE SOFTWARE. PRIOR TO RECEIVING THE SOFTWARE, YOU HAVE SIGNED EITHER A LICENSE AGREEMENT OR A NON-DISCLOSURE AGREEMENT WITH SCANSOFT, CONCERNING THE REALSPEAK SOLO DEVELOPMENT SYSTEM OR SOME OF ITS COMPONENTS. THEREFORE, REFERENCE IS MADE TO THESE AND AS SUCH THE CONTRACTUAL TERMS AND CONDITIONS SHALL APPLY TO THE SOFTWARE. IF YOU HAVE NOT SIGNED SUCH AN AGREEMENT AND IF YOU USE THE SOFTWARE, SCANSOFT WILL ASSUME THAT YOU AGREED TO BE BOUND BY THE EVALUATION AGREEMENT SPECIFIED HEREUNDER. IF YOU DO NOT ACCEPT THE TERMS OF THIS EVALUATION AGREEMENT, YOU MUST RETURN THE PACKAGE UNUSED TO SCANSOFT WITHIN SEVEN (7) DAYS AFTER RECEIPT. 1. Grant of Rights In consideration of a possible commercial relationship, ScanSoft hereby grants to you, the LICENSEE, who accepts, a non-exclusive right to internally evaluate and test the software program ( the Software ). 2. Ownership of Software ScanSoft retains title, interests and ownership of the Software recorded on the original disk(s) and all subsequent copies of the Software and Documentation, regardless of the form or media in or on which the original and other copies may exist. ScanSoft reserves all rights not expressly granted to LICENSEE. 3. Copy Restrictions This Software and the accompanying documentation are copyrighted. Unauthorized copying of the Software, including Software that has been merged or included with other software, or of the documentation is expressly forbidden. LICENSEE may be held legally responsible for any intellectual property infringement that is caused or encouraged by his failure to abide by the terms of this agreement. LICENSEE is allowed to make two (2) copies of the Software solely for backup purposes, provided that the copyright notice is included on the backup copy. 4. Use Restrictions LICENSEE agrees not to use the Software for any other purpose than internally evaluating the Software. LICENSEE may physically transfer the Software from one computer to another, provided that the Software is used on only one computer at a time. LICENSEE may not modify, adapt, translate, reverse engineer, decompile, disassemble or create derivative works based on the Software.LICENSEE may not modify, adapt, translate or create derivative works based on the documentation provided by ScanSoft. The Software may not be transferred to anyone without the prior written consent of ScanSoft. In no event may LICENSEE transfer, assign, lease, sell or otherwise dispose of the Software and Documentation on a temporary or permanent basis except as expressly provided herein. 5. Warranty THE SOFTWARE IS PROVIDED AS IS WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. ScanSoft shall have no liability to LICENSEE or any third party for any claim, loss or damage of any kind, including but not limited to lost profits, punitive, incidental, consequential or special damages, arising out of or in connection with the use or performance of the Software and accompanying documentation. 6. Termination This agreement is effective until terminated. ScanSoft reserves the right to terminate this agreement automatically if any provision of this agreement is violated. LICENSEE may terminate this agreement by returning the Software and the accompanying documentation to ScanSoft, along with a written warranty stating that all copies have been returned.

Copyright (C) ScanSoft, Inc Telecom RealSpeak / Host Software Development Kit User s Guide for Mexican Spanish V4.0 - November 2004 No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording or any information retrieval system, without the written permission of ScanSoft. Trademarks MS-DOS, WINDOWS, MICROSOFT VISUAL C++, BORLAND C++ and Sound Blaster are registered trademarks of their respective owners. ScanSoft is a registered trademark. All rights reserved.

Document History Date Type of change SoftwareVersion August 2004 Reformatting of document V4.0 Update for version 4.0 November 2004 Update for version 4.0 V4.0

Table of Contents MEXICAN SPANISH TEXT-TO-SPEECH SYSTEM... 8 Introduction...8 Preparing a text for Text-To-Speech...8 Using Control Sequences...8 Quick Reference of the Control Sequences...9 Speech Rates in Words per Minute for Mexican Spanish Voices...13 Entering phonetic input...14 How to proceed...14 Lexical stress and sentence accents in phonetic input...15 The Mexican Spanish L&H+ and UNIPA Phonetic Alphabets...17 Using a User Dictionary...19 Notes on the Mexican Spanish Text-To-Speech System...20 Cardinal Numbers...20 Decimal Numbers...20 Ordinal Numbers...21 Roman Numbers...21 Bank Account Numbers...22 Telephone Numbers...22 Dates...22 Time Indications...23 Currencies...23 Abbreviations...24 Acronyms...24 E-MAIL PREPROCESSOR... 26 Introduction...26 E-Mail Header Processing...27 Header Field Extraction...27 Header Field Reading...29 From Field...29 Date Field...30 Subject Field...31 E-Mail body processing...31 Message Extraction...31 Text Normalization...33 Language Specific Text Normalization...35 Language Specific Text Normalization...35 English words...35 Customizing the E-Mail Preprocessor...36 Customizing the E-Mail Header...36 From Field...36 Subject Field...39 Customizing the E-Mail Body...40 SSML PREPROCESSOR... 42

Introduction...42 4SML extensions...42 Overview of the supported 4SML tags for Mexican Spanish...43 XML encoding types for Mexican Spanish...43 Elements...44 4SML Tags List for Mexican Spanish...45 The Volume Scale...54 CUSTOM G2P DICTIONARIES... 57 Introduction...57 APPENDIX... 59 Appendix: Mexican Spanish voice names...59

Telecom RealSpeak / Host SDK Chapter I Mexican Spanish Text-To-Speech System User s Guidefor Mexican Spanish V4.0

Chapter I Mexican Spanish Text-To-Speech system Mexican Spanish Text-To-Speech System Introduction This section provides operational instructions for the Text-To-Speech system for Mexican Spanish. It reviews the functionality of the system, and describes how the user can customize the pronunciation of input texts. This part also describes issues that are particular to the Mexican Spanish Text-To-Speech system. It introduces the Mexican Spanish phonetic alphabet and discusses some language-specific features of the Mexican Spanish Text-To-Speech system. Preparing a text for Text-To-Speech Using Control Sequences In general, there are three ways to intervene in the pronunciation of text: By using control sequences By entering phonetic input By using a user dictionary The following tables provide a reference for all supported control sequences in Mexican Spanish. Remark: <ESC> represents the escape character \x1b (decimal 27) that generates the ASCII character 27 (Hex 1B). Chapter I/8

Chapter I Mexican Spanish Text-To-Speech system Quick Reference of the Control Sequences Sequence Description Range Default Delimiter <ESC> \vol=x\ <ESC> \rate=x\ <ESC> \rate_wpm= xxx\ <ESC>Mx <ESC>Wx Volume (x : 0.. 100) 0 = silence 10 = low 100 = high 80 No <ESC>\vol=10\ Puedo hablar con una voz baja, <ESC>\vol=90\ pero también con una voz alta. Speech Rate (x : 1.. 100) 10 = slow 100 = fast 50 No Puedo hablar <ESC>\rate=70\ con una velocidad más rápida <ESC>\rate=20\ o una velocidad más lenta. Words per minute (xxx: 1..1000) Voice-specific (see subsequent table) Voicespecific Puedo hablar <ESC>\rate_wpm=350\ con una velocidad más rápida <ESC>\rate_wpm=110\ o una velocidad más lenta. Read Mode 0 = character-bycharacter 1 = word-by-word 2 = sentence-bysentence 3 = line-by-line <ESC>M0 Demostrador (The word "Demostrador" will be spelled.) <ESC>M1 Esto es el Demostrador español. (This sentence will be read word by word.) <ESC>M2 Esto es el Demostrador español. (This sentence will be read as one sentence.) Wait Period 0 = no wait period 1 = 200 milliseconds wait period 9 = 1800 milliseconds No 2 Yes 2 Yes <ESC>M2 <ESC>W2 Esta frase está seguida de una pausa corta. <ESC>W9 Esta frase está seguida de una pausa larga. Se ha fijado en la diferencia? Chapter I/9

Chapter I Mexican Spanish Text-To-Speech system <ESC> \Pause=xxx \ Long Pause 1.. 2^32-1 msec no Puede insertar una pausa entre los mensajes especificando la duración <ESC>\Pause=1500\ de la pausa. <ESC>" Put sentence accent <ESC>"Cristina viene mañana. Cristina viene <ESC>"mañana. (i.e. no Carmen) (i.e. no hoy) no <ESC>C <ESC>E Note: Manually inserted sentence accents may have no effect in RealSpeak. The RealSpeak synthesis module may indeed have reasons to override the requested sentence accent, and thus not realize it. Continuation Yo como manzanas. Y peras también. Yo como manzanas. <ESC>C Y peras también. In the first of the examples above, the text-to-speech system will detect an end-of-sentence after manzanas and hence pause before pronouncing the second part of the input. In order to make the system pronounce the entire input as one sentence, a continuation sequence should be inserted. End-of-Message yes Esta es la primera parte de la frase <ESC>E y ésta es la segunda no <ESC>/+ <ESC>%x In the above example, the sequence <ESC>E forces the system to read the entire input as two separate sentences. Phonetic Input (L&H+ no phonetic alphabet) <ESC>/+la_'t&Si.ka<ESC>/+ Preprocessing Mode text = standard text yes mode email = e-mail mode <ESC>%text Tiene usted un mensaje enviado por Luís Morales. <ESC>%email From: "Luís Morales" <lmorales@arrakis.es> Chapter I/10

Mexican Spanish Text-To-Speech system Chapter I Sequence Description Range Default Delimiter <ESC> \tn=s\ <ESC> \tn=currency_ dollar\ Guide text normalization Interpretation of a currency expression s=string: address, spell, normal <ESC>\tn=currency_dollar\$34 is pronounced treinta y cuatro dólares <ESC> \tn=currency_ peso\ Interpretation of a currency expression <ESC>\tn=currency_peso\$34 is pronounced treinta y cuatro pesos <ESC>F Reset to Default Yes <ESC>\vol=90\ Es el volumen más alto. <ESC>F Es el volumen por defecto. <ESC>\rate=10\ Es la velocidad más baja. <ESC>F Es la velocidad por defecto. Chapter I/11

Chapter I Mexican Spanish Text-To-Speech system Sequence Description Range Default Break <ESC>@c Declare the part-ofspeech c = character No (not supported in Spanish) <ESC>\domain=s\ Enable the extension s = string: the Yes (only if a custom g2p has been loaded) name of the extension <ESC>\domain\ Disable the last Yes <ESC>\voice=s\ <ESC>\mrk=n\ <ESC>\p\ extension Set the voice (if there is more than 1 voice is available) Insert a bookmark Hola, soy <ESC>\mrk=2000\ Paulina Insert a paragraph boundary Miguel Hidalgo <ESC>\p\ Calle Lerdo 51 <ESC>\p\ Puebla s = string: the name of the voice n = 0..4294967295 Yes No - Yes <ESC>\audio= s \ Insert an audio file s = string: the filename URI No Chapter I/12

Mexican Spanish Text-To-Speech system Chapter I Speech Rates in Words per Minute for Mexican Spanish Voices Words per minute Voice Range Default Paulina Min = 80 Max = 370 Javier Min = 80 Max = 370 114 119 Chapter I/13

Mexican Spanish Text-To-Speech system Chapter I Entering phonetic input How to proceed To switch from orthographic to phonetic mode, insert <ESC>/+ to use the L&H+ phonetic alphabet. The phonetic input mode remains active until the command is explicitly reset by entering <ESC>/+ again. The phonetic input string is composed of symbols of the L&H+ phonetic alphabet (see phonetic table below). Examples are given in the phonetic table below. In addition to the phonetic symbols, it is advised to use the following characters in the phonetic input string: L&H + Symbol ' (ASCII 39, Hex 27) " (ASCII 34, Hex 22) Meaning primary word stress sentence accent Special characters As in: <ESC>/+ tr6a.mi.te <ESC>/+ (noun 'trámite') vs. <ESC>/+ tr6a. mi.te <ESC>/+ (verb form from 'tramitar') <ESC>/+?aj_ Dos_a. sen.tos_en_la_ fr6a.s e <ESC>/+ (Hay dos acentos en la frase.). syllable boundary <ESC>/+ 'si.la.ba <ESC>/+ (sílaba) # silence (pause) <ESC>/+ men. sa.xe#_no.lo. a.gas <ESC>/+ (Mensaje: no lo hagas.). syllable boundary <ESC>/+ 'si.la.ba <ESC>/+ (sílaba) # silence (pause) <ESC>/+ men. sa.xe#_no.lo. a.gas <ESC>/+ (Mensaje: no lo hagas.) Chapter I/14

Mexican Spanish Text-To-Speech system Chapter I Note that the use of punctuation marks remains useful within phonetic input to assure a correct intonation. Each punctuation mark needs to be preceded by an asterisk. <ESC>/+?a. Jer6*, kwan.do.ter6.mi. na.r6on*, se. fwe.r6on.a. ka.sa*. <ESC>/+ L&H+ Symbol _ Punctuation Marks Meaning Word delimiter *. End of declarative *, Comma *! End of exclamation *? End of question *; Semicolon *: Colon Lexical stress and sentence accents in phonetic input In phonetic input strings, lexical stress and sentence accents can be manualy indicated by the user, by using a single quote ( ) or double quote ( ) respectively. Note that manually inserted lexical stress or sentence accents may have no effect in RealSpeak. The RealSpeak synthesis module may indeed have reasons to override the requested sentence accent, and thus not realize it. The Text-To-Speech system will automatically convert all lexical stress marks into sentence accents in case no manually added sentence accents are found in the phonetic input string. Example: <ESC>/+mja. mi.go_se_ Ja.ma_ xwan*.<esc>/+. is the same as <ESC>/+mja. mi.go_se_ Ja.ma_ xwan*.<esc>/+ (Mi amigo se llama Juan.) Chapter I/15

Chapter I Mexican Spanish Text-To-Speech system If phonetic input contains at least one manually added sentence accent, no additional sentence accents are assigned by the textto-speech system. Therefore, only those words marked with " will get a sentence accent. As a consequence, a message containing only one manual sentence accent will have an almost flat intonation on all the other words. Example: <ESC>/+mja. mi.go_se_ Ja.ma_ xwan*.<esc>/+ (Only one sentence accent will be realized.) Phonetic input can also be combined with orthographic input. If no sentence accents are found in the input text (indicated by <ESC>" in orthographic input, or by " in phonetic input), the Text-To-Speech system will automatically assign sentence accents. In the orthographic part of the input, the Text-To-Speech system will realize these sentence accents on the basis of part-of-speech and syntactic information. In the phonetic part of the input, all lexical stress marks (if any) will be converted into sentence accents. If there are no lexical stress marks, no sentence accent will be realized for the phonetic part of the input (see point 1 above). If the user has manually specified one or more sentence accents, no additional sentence accents will be realized (see point 2 above). Chapter I/16

Mexican Spanish Text-To-Speech system Chapter I Si hace buen tiempo mañana, nos iremos a <ESC>/+man. xa.tan<esc>/+. (No sentence accents are found; The Text-To-Speech system will automatically assign sentence accents.) Si hace buen tiempo mañana, nos iremos a <ESC>/+man. xa.tan<esc>/+. (A sentence accent is specified in the phonetic part of the input text. No additional sentence accents will be realized.) Si hace buen tiempo <ESC> mañana, nos iremos a <ESC>/+man. xa.tan<esc>/+. (A sentence accent is specified in the orthographic part of the input text. No additional sentence accents will be realized. Hence, the lexical stress which is specified in the phonetic part will NOT be converted into a sentence accent.) Si hace buen tiempo <ESC> mañana, nos iremos a <ESC>/+man. xa.tan<esc>/+. (Two sentence accents are specified; no additional sentence accents will be realized.) The Mexican Spanish L&H+ and UNIPA Phonetic Alphabets L&H+ Symbol L&H+ Transcription Vowels UNIPA Symbol UNIPA Transcription As in: i 'bi.bo i 'bi.bo vivo e 'pe.so e 'pe.so peso a 'ka.sa a 'ka.sa casa o 'bo.bo o 'bo.bo bobo u ku.'n~a.do u ku.'n~a.do cuñado Chapter I/17

Mexican Spanish Text-To-Speech system Chapter I L&H+ Symbol L&H+ Transcription Consonants UNIPA Symbol UNIPA Transcription As in: p 'pe.pa p 'pe.pa pepa b 'kam.bjo b 'kam.bjo cambio t 'ti.a t 'ti.a tía d 'de.do d 'de.do dedo k 'ki.lo k 'ki.lo kilo g an.'gar6 g an.'gar6 hangar m 'a.mo m 'a.mo amo n 'ne.ne n 'ne.ne nene n~ 'ni.n o n~ 'ni.n o niño nk sink.ko nk sink.ko cinco r 'ka.ro r 'ka.ro carro r6 'ka.r6o r6 'ka.r6o caro B 'bo.bo B 'bo.bo bobo f 'fin f 'fin fin D 'de.do D 'de.do dedo Chapter I/18

Mexican Spanish Text-To-Speech system Chapter I Consonants s 'se.so s 'se.so seso z dez.de z dez.de desde J 'ma.jo J 'ma.jo mayo x 'a.xo x 'a.xo ajo G 'a.gwa G 'a.gwa agua j 'sje.go j 'Tje.Go ciego w 'kwen.ta w 'kwen.ta cuenta l 'la.ta l 'la.ta lata t&s 't&si.ko t+s 't+si.ko chico d&z 'd&zo d+z 'd+zo yo S wa.sin.ton S wa.sin.ton Washington NOTE Note that the L&H+ alphabet is not SSML compliant. For SSML, use the UNIPA alphabet. Using a User Dictionary For information on how to create and use user dictionaries, please refer to Appendix C in the Telecom RealSpeak/Host Programmer s Guide. Chapter I/19

Chapter I Mexican Spanish Text-To-Speech system Notes on the Mexican Spanish Text-To-Speech System The Mexican Spanish Text-To-Speech system has been designed to allow a correct pronunciation of any input written according to the rules of Mexican Spanish orthography. The following cases, however, require special attention. Cardinal Numbers When no periods are used to separate groups of digits, cardinal numbers up to 8 digits are pronounced as full numbers. Digit strings of 9 or more digits are read one by one. 12345678 123456789 When periods are used to separate groups of digits, cardinal numbers up to 15 digits are pronounced correctly as full numbers. 12.345.678 120.450.500.000 Decimal Numbers When no periods are used to separate groups of digits, decimal numbers up to 8 digits before and 8 digits after the comma are pronounced correctly as full numbers. 12345678,12345678 123.456.789,123 When periods are used to separate groups of digits, decimal numbers up to 15 digits before and 8 digits after the comma are pronounced as full numbers. Chapter I/20

Chapter I Mexican Spanish Text-To-Speech system 12.345.678,30 120.450.500.000,562347 Ordinal Numbers Cardinal numbers from 1 to 10 followed by the ordinal distinctives and ª will be pronounced as ordinal numbers. The numbers 1 and 3 may also be folowed by er, 1 by ra or ro. 1ro 2o 4ª 3er Roman Numbers Roman numbers in isolation are read as cardinal numbers. CXXII : ciento veintidós Roman numbers from 1 to 7 following or preceding a word beginning with an uppercase letter are read as ordinal numbers. Otherwise, they are read as cardinals. Carlos I Isabel II Enrique VIII Alfonso XII I Congreso Carlos primero Isabel segunda Enrique octavo Alfonso doce primer congreso Chapter I/21

Chapter I Mexican Spanish Text-To-Speech system Bank Account Numbers To have bank account numbers correctly pronounced, use hyphens between groups of digits. Groups of digits up to 8 digits will be pronounced as full numbers. Larger groups will be read digit by digit. 390-556846570-89 26-847364859-10 Telephone Numbers In order to ensure a correct pronunciation of telephone numbers, it is recommended to use spaces or hyphens to separate groups of digits. (91) 234 23 23 (93) 232-323 Dates The following formats are supported: Day/Month/Year Day-Month-Year Day can have one or two digits between 1 and 31, Month can have one or two digits between 1 and 12, and Year can have two or four digits without periods. 12/12/93 12-12-93 12/12/1993 12-12-1993 01-1-00 1-1-03 01/01/2000 Chapter I/22

Chapter I Mexican Spanish Text-To-Speech system Time Indications The following formats are supported (12 hours and 24 hours clock): (1 or 2 digits):(2 digits) (1 or 2 digits).(2 digits) The digits may be followed by usual abbreviations ("h.", "H."). 14:30 14.30 h. 14:30 H. Currencies The Mexican Spanish Text-To-Speech system supports the currency symbols $ (dólar), (libra), ƒ (florín) and (euro). 1 $ un dólar 100 $ cien dólares 1 una libra 100 cien libras 1 ƒ un florín 100 ƒ cien florines 1 un euro 100 cien euros Chapter I/23

Chapter I Mexican Spanish Text-To-Speech system Abbreviations The Mexican Spanish Text-To-Speech system contains a dictionary to handle the most frequent abbreviations, such as: etc. etcétera D. don dcha. derecha Dr. doctor ptas. pesetas ms minuto, minutos s. siglo Acronyms Forms such as EPTA, UNICEF, NIF, IPC (or N.I.F., I.P.C., etc.) are pronounced using a combination of rules and dictionaries. PRI IPC NIF is pronounced as a single word is spelled is expanded (as Número de Identificación Fiscal) Chapter I/24

Telecom RealSpeak / Host SDK Chapter II E-Mail Preprocessor User s Guidefor Mexican Spanish V4.0

Chapter II E-Mail Preprocessor E-Mail Preprocessor Introduction The ScanSoft e-mail preprocessor (EMPP) has been developed to analyze a specific type of text: e-mail messages. E-mail messages differ from any average type of text in both structure and contents. An e-mail message consists of two clearly distinguished parts: the header and the body. A substantial part of the header contains routing and administrative information, which is irrelevant to the user. Both the header and the body contain all kinds of e-mail specific text features, e.g. e-mail addresses, emoticons such as smileys, etc. Furthermore, informal writing is often combined with a lack of grammatical conventions. Spelling rules are frequently violated, punctuation is often omitted, etc. Although the standard ScanSoft Text-To-Speech system can handle special text items (abbreviations, numbers, dates, etc.), it is not capable of correctly handling all e-mail specific text features. These text features are therefore dealt with by the e-mail preprocessor. The EMPP transforms e-mail specific information into a format that complies with the rules of the standard ScanSoft Text-To-Speech system. The EMPP is a plug-in preprocessing module of the ScanSoft Text-To-Speech system. It replaces the preprocessor of the standard Text-To-Speech system. In the following sections you will find a description of the functioning of the ScanSoft e-mail preprocessor as well as an overview of its features. The e-mail preprocessor has two main tasks: processing of the e-mail header and processing of the body of the e-mail message. The input to the EMPP consists of one or more e-mail messages. In order to process the e-mail header, the EMPP extracts relevant header fields and then provides an intelligent header field reading. Chapter II/26

Chapter II E-Mail Preprocessor During the processing of the e-mail body, the text is divided into smaller text units, called text-to-speech messages, which are synthesized by the Text-To-Speech system. Text normalization is applied to e-mail specific text features such as e-mail addresses, proper names, emoticons, URLs (Universal Resource Locators), etc. For the text normalization of an e-mail message, the ScanSoft EMPP applies linguistic rules and performs dictionary look-up, in order to yield an adequate phonetic transcription. The EMPP also supports the ScanSoft user dictionary mechanism, which allows the user to customize the output of the e-mail processing. E-Mail Header Processing Header Field Extraction An e-mail message consists of two clearly distinguished parts: the header and the body. The EMPP detects the header and extracts the relevant header fields. Information that is of no interest to the user (such as routing information) is not retained. The EMPP extracts the following header fields: From Field Date Field Subject Field Contains the sender s name and/or addres Contains the date and time of sending Optionally contains the subject of the e-mail The extraction of the header fields is based on the detection of specific keywords in the e-mail header. The supported keywords for the extraction of the header fields are listed below: From Field Date Field Subject Field: From: Author: Sender: De: Von: Date: Enviado: Gesendet: Subject: Subj: Asunto: Betreff: Chapter II/27

Chapter II E-Mail Preprocessor The following is an example of header field extraction. The original header holds information that is irrelevant to the user. After extraction of date, sender and subject, the processed header merely mentions the Date field, the From field and the Subject field: Original header: From lmarina@mundivia.mx Mon Aug 04 11:37:09 1997 Path: xenon.inbe.net!inbe.net!krypton.inbe.net!inbe.net!news.nl.innet.net!i Nnl.net!hunter.premier.net From: lmarina@mundivia.mx Newsgroups: soc.culture.spain Subject: Re: El voseo en Extremadura Date: Thu, 31 Jul 1997 07:56:16 0400 Organization: Netcom Lines: 30 Message-ID: <33E07D60.7162@ix.netcom.com> References: <33DAA4B8.272040C@unmc.edu> <5re6qk$bk6@drn.zippo.com> <33DBDAB5.3F18@ix.netcom.com> <5rgtv4$psg@drn.zippo.com> <33DCA349.5FC0@ix.netcom.com> <5rkga8$210_004@cfn.ist.utl.pt> Reply-To: cadmart@ix.netcom.com NNTP-Posting-Host: mia-fl3-07.ix.netcom.com Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-NETCOM-Date: Thu Jul 31 6:54:48 AM CDT 1997 X-Mailer: Mozilla 3.0C-NC320 (Win95; I) Xref: xenon.inbe.net soc.culture.spain:102110 Status: N Extracted header fields: From: lmarina@mundivia.mx Subject: Re: El voseo en Extremadura Date: Thu, 31 Jul 1997 07:56:16-0400 Chapter II/28

Chapter II E-Mail Preprocessor Header Field Reading After the header fields have been extracted, they are processed by the EMPP. The header field keywords (see above) are replaced by an introductory message. The remainder of the header fields is processed by the EMPP in order to allow the Text-To-Speech system to intelligently read the fields. From Field The From field keyword is replaced by the introductory message Mensaje de:. From: Pablo Aguirre is pronounced: Mensaje de: Pablo Aguirre The remainder of the From field is further processed by the EMPP. The EMPP supports From fields that either consist of a) a proper name b) a proper name and an address c) an address a) - b) In case the From field contains a proper name, this name and only this name is sent to the Text-To-Speech system. This means that if both a name and an address are found in the From field, the address will not be read by the Text-To-Speech system. From: Mauricio de Pablo From: "Paco Manzano" <manzano@alpha.ee.upenn.edu> From: Marta Arribas at IepXchgPO are pronounced: Mensaje de: Mauricio de Pablo Mensaje de: Paco Manzano Mensaje de: Marta Arribas Chapter II/29

Chapter II E-Mail Preprocessor c) In case the From field contains only an address, the EMPP extracts the name out of the address and reads the domain name literally. From: alex@scansoft.com From: unizar.es!pablo From: t1443@bpe.mx From: rci.es!pozas at Internet are pronounced: Mensaje de: Alex arroba scansoft punto com Mensaje de: Pablo arroba unizar punto e s Mensaje de: t 1 4 4 3 arroba b p e punto m x Mensaje de: pozas arroba r c i punto e s Date Field The Date field keyword is replaced by the introductory message Fecha:. The Date field contains the date and time of sending. The EMPP supports multiple date and time formats, which are transformed to a uniform format that complies with the rules for date and time indications of the ScanSoft Text-To-Speech system. The EMPP only pronounces the date. The EMPP supports dates in the following formats: Date: 01/18/96 13:45:46 EDT Date: Thu, 7 Dec 1995 11:45 AM Date: 13 Mar 1996 19:55:27 GMT are pronounced: Fecha: 18 de Enero del 96 Fecha: Jueves, 7 de Diciembre de 1995 Fecha: 13 de Marzo de 1996 Chapter II/30

Chapter II E-Mail Preprocessor Subject Field The Subject field keyword is replaced by the introductory message Asunto:. The Subject field can contain all kinds of data, but may also be empty. The EMPP searches for keywords that are typical of the subject field (e.g. RE, FYI, FW). Subject: RE: Regalo para Elena Subject: FYI: documento en Word 6 Subject: FWD: virus! are pronounced: Asunto: Respuesta a: Regalo para Elena Asunto: Para su información Asunto: (reenvío) virus! E-Mail body processing Message Extraction The e-mail preprocessor splits the body of the e-mail message into text-to-speech messages. This is done on the basis of a number of criteria, such as punctuation, capitalization, layout, intelligent abbreviation handling, etc. Chapter II/31

Chapter II E-Mail Preprocessor The following examples illustrate some criteria for splitting the e-mail text into text-to-speech messages: Using sentence final punctuation and capital letters Si tienes alguna duda sobre lo que hemos hablado, ponte en contacto conmigo. Mi dirección de e-mail es: pedrof@scansoft.com. Gracias. Using layout Estructura del tema: 1) Introducción 2) Desarrollo de los componentes 3) Sumario Using intelligent abbreviation handling Para dudas o preguntas, consúltese el manual del Dr. López Ibor Chapter II/32

Chapter II E-Mail Preprocessor Text Normalization An e-mail message typically contains e-mail specific text features, such as e-mail addresses, URLs, file names, emoticons, etc. The EMPP transforms these e-mail specific features into a format that complies with the rules of the standard text normalization of the ScanSoft Text-To-Speech system. The following are examples of e-mail specific text normalization: Support for multiple e-mail address formats Gianluca Leoni/SSFT/IEP/BE plknapp@ucunix.san.uc.edu Sara Ghelli at IepXchgPO excalib!flagg!ucsd.edu!rambo Support for URLs (Universal Resource Locators) http://www.scansoft.com http://wpnet.com/timl/mrc/mrc.html gopher://tecno.com/11 Support for file names ldb001.tse sysinfo.exe lipedu.xls Processing of emoticons :-* is pronounced huy :( is pronounced bubu Processing of overuse of punctuation Atención!!!!!!!: VIRUS DE INTERNET!!!!!!!! Tengo un problema! #%&#@$<! Quién me puede ayudar? becomes: ATENCIÓN!: VIRUS DE INTERNET! Tengo un problema! Quién me puede ayudar? Chapter II/33

Chapter II E-Mail Preprocessor Processing of inserted mail Marisa> Oye Emilio, cuánto hace que no llamas Marisa> a tu hermano para preguntarle como Marisa> está? Marisa> Si no te lo recuerdo yo, no lo haces. Emilio> Es verdad. Es que he tenido bastante Emilio> trabajo últimamente. Emilio> Ahora tiene una página en Internet. Emilio> Su dirección es http://www.net/tmp becomes: Marisa: Oye Emilio, Cuánto hace que no llamas a tu hermano para preguntarle como está? Si no te lo recuerdo yo no lo haces.. Emilio: Es verdad. Es que he tenido bastante trabajo últimamente. Ahora tiene una página en Internet. Su dirección es http://www.net/tmp Processing of Question/Answer (FAQ) P. Estimado Sr. González. Tengo en mi poder unos artículos sobre arquitectura de posguerra que quisiera que examinase. Podría enviarselos?. laut@unizar.tm.mx R. Querido amigo. Como usted probablemente sabe, estoy preparando un coloquio sobre este tema. No sólo me encantará leer sus artículos, sino que le invito muy cordialmente, si usted esta interesado, a tomar parte en este coloquio. becomes: Pregunta: Estimado Sr. González. Tengo en mi poder unos artículos sobre arquitectura de posguerra que quisiera que examinase. Podría enviarselos?. laut@unizar.tm.mx Respuesta: Querido amigo. Como usted probablemente sabe, estoy preparando un coloquio sobre este tema. No sólo me encantará leer sus artículos, sino que le invito muy cordialmente, si usted esta interesado, a tomar parte en este coloquio. Chapter II/34

Chapter II E-Mail Preprocessor Language Specific Text Normalization The E-mail Preprocessor for Mexican Spanish is able to convert accented characters. In e-mail messages, accents and n-tilde can be transcribed in several ways: camio'n camion espanol espan~ol espanyol Accents or tildes may also be dropped: camion espanol The E-mail Preprocessor for Mexican Spanish correctly converts all possible ways of writing accented words. English words Since e-mail is an international medium, Mexican Spanish e-mail messages will inevitably contain many English words that might refer to Internet, electronic mail or soft- and hardware. The typical e-mail jargon is handled by the exceptions dictionary of the e-mail preprocessor. This dictionary is a lexicon for e-mail terminology and provides the TTS system with an adequate Mexican Spanish transcription for a number of English words. Rlogin techweb uunet filemanager projtech techtalk abstract access shareware /+ '?er6.'lo.gin /+ 'tex.'web u.u. /+ 'net /+ 'fajl.'ma.na.jer6 /+ 'pr6oj.'tek /+ 'tex.'tok /+ '?abs.tr6akt /+ '?ak.ses /+ 'ser6.wer6 Chapter II/35

Chapter II E-Mail Preprocessor Customizing the E-Mail Preprocessor The e-mail preprocessor supports the standard ScanSoft Text- To-Speech SDK user dictionary mechanism, which allows the user to customize the output of the e-mail preprocessor. The user dictionary is consulted both during the header processing and the body processing. For more information on how to build and use user dictionaries, see Appendix C: User Dictionaries of theprogrammer s Guide. Customizing the E-Mail Header The user dictionary is consulted during the header processing while reading the From field and the Subject field. From Field The From field either consists of a) a proper name b) a proper name and an address c) an address a) In case the From field contains a proper name only, the name is passed on to the user dictionary. If the lookup is successful, the proper name is substituted by the replacement string. If not, the name is further processed by the header reading module. Chapter II/36

Chapter II E-Mail Preprocessor If the user dictionary contains the following line: John /+ 'd&zon the following From field: From: John Ferrari becomes: Mensaje de: <esc>/+ d&zon <esc>/+ Ferrari b) In case the From field contains a proper name and an address, the EMPP first passes the address to the user dictionary. If the lookup is successful, both the proper name and the address are substituted by the replacement string. If not, the EMPP passes the proper name to the user dictionary. If this lookup is successful, the name and the address are substituted by the replacement string. If not, the name is further processed by the header reading module. The address will not be read by the Text-To-Speech system. If the user dictionary contains the following lines: psmith@scansoft.com Andrés Schrijver Pablo, mi tío de Indonesia un colega de mi hermana /+?es.'xr6ej.ber6 the following From fields: From: psmith@scansoft.com (P. Smith) Author: andres@elis.rug.ac.be (andrés) From: "Alejandro De Schrijver" <DeSchrijver@ALPHA.ee.upenn.edu> Chapter II/37

Chapter II E-Mail Preprocessor become: Mensaje de: Pablo, mi tío de Indonesia Mensaje de: un colega de mi hermana Mensaje de: Alejandro De <esc>/+?es.'xr6ej.ber6 <esc>/ c) In case the From field contains an address only, the complete address is looked up in the user dictionary. If the lookup is successful, a proper name is added to the From field. If not, only the domain part is sent to the user dictionary. The EMPP first calls the dictionary for the complete domain part. If the lookup is successful, the complete domain part is substituted by the replacement string. Otherwise, the EMPP cuts off the left most sublevel domain and repeats the lookup and matching procedure for the remainder of the domain part. If the lookup is successful, the remainder of the domain part is substituted by the replacement string. This procedure is called repeatedly until the top level domain is encountered. If none of the lookups is successful, the address is further processed by the header reading module. If the user dictionary contains the following lines: alberto.luna@arrakis.mx telemir.mx mi mejor amigo Veracruz (México) the following From fields: Sender: alberto.luna@arrakis.mx From: fernando.fuentes@telemir.mx become: Mensaje de: mi mejor amigo Mensaje de: fernando heras en Veracruz (México) Chapter II/38

Chapter II E-Mail Preprocessor NOTE To allow a correct processing of the From field, the replacement string in the user dictionary should not contain an address. Subject Field Each word in the Subject field is sent to the user dictionary. If the lookup is successful, the replacement string is sent directly to the Text-To-Speech system. If not, the Subject field is further processed by the header reading module. If the user dictionary contains the following lines: EMI IDT /+ '?e.'e.me.'i I D T the following Subject fields: Subject: Jueves: Conferencia con el EMI Subject: sobre las prestaciones de IDT are pronounced: Asunto: Jueves: Conferencia con el <esc>/+'?e.'e.me.'i <esc>/+ Asunto: sobre las prestaciones de I D T Chapter II/39

Chapter II E-Mail Preprocessor Customizing the E-Mail Body When a user dictionary has been loaded, the EMPP will call the dictionary for every word of the e-mail body. If the word is found in the user dictionary, it is substituted by the replacement string. If not, the body is further processed by the e-mail body processing module. If the user dictionary contains the following line: msg mensaje the word "msg" in the following sentence: Tu msg me ha llegado hoy. Tengo que hacerte algunos comentarios. is replaced by the corresponding string found in the user dictionary: Tu mensaje me ha llegado hoy. Tengo que hacerte algunos comentarios Chapter II/40

Telecom RealSpeak / Host SDK Chapter III SSML Preprocessor User s Guide for Mexican Spanish V4.0

Chapter III SSML Preprocessor SSML Preprocessor Introduction 4SML extensions SSML (Speech Synthesizer Markup Language ) is part of a set of markup specifications for voice browsers. SSML was designed to provide a rich, XML-based markup language for assisting the generation of synthetic speech in web and other applications. The essential role of the markup language is to provide a standard way to control aspects of speech such as pronunciation, volume, pitch, rate. The Telecom RealSpeak/Host SDK provides a preprocessor that supports a subset of the Speech Synthesizer Markup Language (SSML) specification. That subset is caled ScanSoft SSML (4SML). More information on SSML can be obtained from the Speech Synthesis Markup Language specification W3C Working Draft 5 April 2002 available at the W3C web site. Go to the SSML home page at: http://www.w3.org/tr/speech-synthesis/ and look for the link to the April 5 Working Draft. Note that the 4SML subset is based on this version of the specification. As mentioned before, 4SML is a subset of SSML. However, 4SML can also support its own set of elements/attributes to enhance the performance of ScanSoft s TTS engines. These are the so-called 4SML extensions. They are typicaly expresed by the prefix sft- before the name of the element/attribute. At this moment the supported extensions are: ssft-dtype attribute of speak, paragraph and sentence with values EMail or Text. With this atribute, the user can toggle the TTS output behavior between normal text mode or e-mail specific mode. ssft-domaintype attribute of speak, paragraph and sentence (value e.g. propernames"). With this atribute Scansoft vendor specific domain types can be deployed. Chapter III/42

Chapter III SSML Preprocessor To enable SSML support, set the mark up type via TtsSetParam: TtsSetParam(hTtsInst, TTS_MARKUP_TYPE_PARAM, MARKUP_4SML); By default, the mark up type is set to none. TtsSetParam(hTtsInst, TTS_MARKUP_TYPE_PARAM, MARKUP_NONE); The section below describes SSML support included in the Telecom RealSpeak host V4.0 Mexican Spanish language version. Overview of the supported 4SML tags for Mexican Spanish XML encoding types for Mexican Spanish The encoding is specified in the XML text declaration ("<?xml?>") by the encoding declaration which is of the form encoding="<encodingname>". E.g. <?xml version="1.0" encoding="utf-8"?> Telecom RealSpeak Host V4.0 Mexican Spanish supports: Windows-1252 and its subsets ISO-8859-1 to ISO- 8859-9 The Unicode encoding UTF-8, UTF-16 and UCS-4 (Note that the alias "ISO-10646-UCS-4" is not supported) Chapter III/43

Chapter III SSML Preprocessor Elements paragraph or p: support for atributes sft-domaintype and sft-dtype (see 4SML Extensions ) sentence or s: support for atributes sft-domaintype and sft-dtype (see 4SML Extensions ) say-as, with support for the following types: spell-out phoneme, with support for the unipa alphabet sub voice: support for atributes xml:lang, name, gender and variant. emphasis break prosody: support for rate and volume mark: only support for empty elements where the name attribute value is a positive number. Chapter III/44

Chapter III SSML Preprocessor 4SML Tags List for Mexican Spanish For reasons of compatibility with the standard Mexican Spanish system, the parallel text control sequence (<esc> sequence) is listed where applicable. As such, a similar TTS behavior can be created or Combined - with non-ssml driven text input. <speak> xml:lang 4SML Tags Comment Corresponding control sequence ssftdtype= <doctype> ssft-domaintype="xxx" High-level and document structure tags es-mx for Mexican Spanish. Attribute of speak, paragraph, sentence and voice. The value specifies the language in the xml:lang attribute (XML v1) format (<lang>-<country>). Attribute of speak, paragraph and sentence. values are EMail and Text (these strings are case-sensitive). Not supported. To use this tag, a specific custom G2p needs to be installed. See the next section of this manual for further explanation. Attribute of speak, paragraph and sentence. value e.g. "propernames or Finance". <esc>%x Chapter III/45

Chapter III SSML Preprocessor 4SML Tags Comment Corresponding control sequence <paragraph> <p> <sentence> <s> High-level and document structure tags <paragraph> </paragraph> Same as <paragraph> <sentence> </sentence> Same as <sentence> <esc>e <esc>e <esc>e <esc>e 4SML Tags Comment Corresponding control sequence Text normalization tags <say-as interpretas= number format= cardinal > <say-as interpretas= number format= digits > <say-as interpretas= number format= decimal > <say-as interpretas= number > <say-as interpretas= number format= ordinal > <say-as interpretas= number format= telephone > <esc>\tn=number_cardinal\ <esc>\tn=number_digits\ <esc>\tn=number_decimal\ <esc>\tn=number\ <esc>\tn=number_ordinal\ <esc>\tn=number_telephone\ Chapter III/46

Chapter III SSML Preprocessor 4SML Tags Comment Corresponding control sequence Text normalization tags <say-as interpret-as= number format= telephone detail= punctuation > <say-as interpretas= ordinal> <say-as interpretas= acronym > <say-as interpretas= acronym detail= strict > <say-as interpretas= measure > <say-as interpret-as= letters > <say-as interpret-as= letters detail= strict > <say-as interpret-as= words > <say-as interpret-as= date > <say-as interpret-as= date format= mdy > <say-as interpret-as= date format = dmy > <say-as interpret-as= date format= ymd > <say-as interpret-as= date format= ym > <say-as interpret-as date format= my > <say-as interpret-as= date format= dm > <esc>\tn=number_telephone_punctuation <esc>\tn=ordinal\ <esc>\tn=acronym\ <esc>\tn=acronym_strict\ <esc>\tn=measure\ <esc>\tn=letters\ <esc>\tn=letters_strict\ <esc>\tn=words\ <esc>\tn=date\ <esc>\tn=date_mdy\ <esc>\tn=date_dmy\ <esc>\tn=date_ymd\ <esc>\tn=date_ym\ <esc>\tn=date_my\ <esc>\tn=date_dm\ Chapter III/47

Chapter III SSML Preprocessor 4SML Tags Comment Corresponding control sequence Text normalization tags <say-as interpret-as= date format= md > <say-as interpret-as= date format= y > <say-as interpret-as= date format= m > <say-as interpret-as= date format= d > <say-as interpret-as= time > <say-as interpret-as= time format= h > <say-as interpret-as= time format= hm > <say-as interpret as= time format= hms > <say-as interpret-as= duration format= hms > <say-as interpret-as= duration format= hm > <say-as interpret-as= duration format= ms > <say-as interpret-as= duration format= h > <say-as interpret-as= duration format= m > <say-as interpret-as= duration format= s > <say-as interpretas= duration > <say-as interpretas= curency > <esc>\tn=date_md\ <esc>\tn=date_y\ <esc>\tn=date_m\ <esc>\tn=date_d\ <esc>\tn=time\ <esc>\tn=time_h\ <esc>\tn=time_hm\ <esc>\tn=time_hms\ <esc>\tn=duration_hms\ <esc>\tn=duration_hm\ <esc>\tn=duration_ms\ <esc>\tn=duration_h\ <esc>\tn=duration_m\ <esc>\tn=duration_s\ <esc>\tn=duration\ <esc>\tn=currency\ Chapter III/48

Chapter III SSML Preprocessor <say-as interpretas= telephone > 4SML Tags Comment Corresponding control sequence <say-as interpretas= telephone detail= punctuation > <say-as interpretas= address > Text normalization tags <say-as interpret-as= spell > <say-as interpret-as= name > <say-as interpret-as= net format= email > <say-as interpret-as= net format= uri > <esc>\tn=telephone\ <esc>\tn=telephone_punctuation\ <esc>\tn=address\ <esc>\tn=spell\ <esc>\tn=name\ <esc>\tn=net_email\ <esc>\tn=net_uri\ <say-as interpret-as= net > <esc>\tn=net\ Chapter III/49

Chapter III SSML Preprocessor 4SML Tags Comment Corresponding control sequence <phoneme alphabet= ipa > <phoneme alphabet= unipa > Pronunciation tags Not supported <sub alias= xxxxx > See section the Mexican Spanish L&H+ and UNIPA phonetic alphabets for an overview of the alphabet. <sub alias= xxx >YYY</sub> - xxx 4SML Tags Comment Corresponding control sequence Prosody and style tags <voice name= xxx > Not supported <voice gender= xxx > Not supported <voice variant= x > Not supported <voice age= xx > Not supported <voice xml:lang="xxx"> <emphasis level= default > <emphasis level= strong > <emphasis level= moderate > <emphasis level= reduced > Es-MX supported (See xml:lang attribute described earlier) Same as level= "moderate". Chapter III/50

Chapter III SSML Preprocessor 4SML Tags Comment Corresponding control sequence <emphasis level= none > <break size= default > Prosody and style tags (but no effect) Language and voice dependent Same effect as <break size= medium > <break size= none > (but no effect) <break size= smal > <break size= medium > Language and voice dependent <break size= large > <break time= xxxms > (milliseconds) <break time= xs > (seconds) <prosody volume= xx > <prosody volume= +xx% > <prosody volume= - xx% > <prosody volume= +xx > <prosody volume= - xx > Language and voice dependent Language and voice dependent See section The Volume Scale for more details. See section The Volume Scale for more details. <esc>\tn=pause=xxx\ <esc>\tn=pause=0\ <esc>\tn=pause=xxx\ <esc>\tn=pause=xxx\ <esc>\tn=pause=xxx\ <esc>\tn=pause=yyy\ (yyy is pause duration in milliseconds) <esc>\tn=vol=xx\ with xx in [0-100]. <esc>\tn=vol=xx\ with xx in [0-100]. <esc>\tn=vol=x\ Chapter III/51

Chapter III SSML Preprocessor 4SML Tags Comment Corresponding control sequence Prosody and style tags <prosody volume= default > <prosody volume= silent > <prosody volume= soft > <prosody volume= medium > <prosody volume= loud > <prosody pitch= default > <prosody pitch= high > <prosody pitch= medium > <prosody pitch= low > <prosody pitch= xxx% > <prosody pitch= xxx > <prosody contour= xxxx > <prosody range= xxx > <prosody range= xxx% > <prosody range= default > <prosody range= low > Not supported Not supported Not supported Not supported Not supported Not supported Not supported Not supported Not supported Not supported Not supported <esc>\tn=vol=80\ <esc>\tn=vol=0\ <esc>\tn=vol=40\ <esc>\tn=vol=70\ <esc>\tn=vol=95\ Chapter III/52

Chapter III SSML Preprocessor 4SML Tags Comment Corresponding control sequence <prosody range= medium > <prosody range= high > <prosody rate= xxx > <prosody rate= +xx% > <prosody rate= - xx% > <prosody rate= +xx > <prosody rate= - xx > <prosody rate= default > <prosody rate= slow > <prosody rate= medium > <prosody rate= fast > <prosody duration= xxx > Prosody and style tags Not supported Not supported <esc>\tn=rate_wpm=xxx\ (rate in words per minute) <esc>\tn=rate=yy\ with yy in [1-100] or <esc>\tn=rate_wpm=yy\ <esc>\tn=rate=yy\ with yy in [1-100] or <esc>\tn=rate_wpm=yy\ Not supported <audio> <audio> <esc>\tn=rate=50\ <esc>\tn=rate=20\ <esc>\tn=rate=50\ <esc>\tn=rate=70\ 4SML Tags Comment Corresponding control sequence Other tags <mark name="xxx"> Not Chapter III/53

Chapter III SSML Preprocessor The Volume Scale The volume can be specified as a floating-point number in the range 0.0 to 100.00 using <prosody volume="xx">. The two other representations are: a descriptive term like medium and relative changes (e.g. <volume ="+15%">). The SSML spec of W3C does not fully specify how values have to be interpreted: it merely says that higher values are louder and that the value zero is equivalent to specifying silent. We use a logarithmic scale (like the db scale). 4SML value Amplification factor Loudness in db 100 2.00 +6dB 90 1.41 +3dB 80 1.00 +0dB 70 0.71-3dB 60 0.50-6dB 50 0.35-9dB 40 0.25-12dB 30 0.18-15dB 20 0.125-18dB 10 0.088-21dB 0 0.00 - db The formula for the db value is: x (in db) = 20 * log10(amplification) The formula for converting the 4SML value y to the db value x: x (db) = (y - 80) * 0.3 db And the formula to derive the 4SML value from the db value x: y (4SML) = (x / 0.3) + 80 The following table specifies the conversion table for the descriptive volume levels. This specification is subject to change. Chapter III/54

Chapter III SSML Preprocessor 4SML descriptive volume level 4SML numerical volume value Amplification factor Loudness in db loud 95 1.68 +4.5dB default 80 1.00 +0dB medium 70 0.71-3dB soft 40 0.20-12dB silent 0 0.00 - db Chapter III/55

Telecom RealSpeak / Host SDK Chapter IV Custom G2P Dictionaries User s Guide for Mexican Spanish V4.0

Chapter IV Custom G2P Dictionaries Custom G2P Dictionaries Introduction ScanSoft's RealSpeak system now offers support for custom G2P dictionaries. A custom G2P dictionary module is an add-on module specifically designed to improve the quality of pronunciation for specific kinds of words. The Mexican Spanish system is currently not designed to support the use of a custom G2p dictionary module. Chapter IV/57

Telecom RealSpeak / Host SDK Chapter V Appendix User s Guide for Mexican Spanish V4.0

Appendix Chapter V Appendix Appendix: Mexican Spanish voice names The Telecom RealSpeak /Host Text-To-Speech system now supports selecting the voice and language via a string as well as a define (please see the definition for the function TtsInitializeEx() in the Programmers Guide and also the Backwards Compatibility Guide for details). The name strings for the currently supported Mexican Spanish voices are listed in the table below. Mexican Spanish Voice Name Strings Voice Paulina Javier Name String Paulina Javier The string to use to set the language to Mexican Spanish is Mexican Spanish. Chapter V/59