RealSpeak Telecom Software Development Kit User's Guide for French V4.0
EVALUATION AGREEMENT

READ THE FOLLOWING CAREFULLY BEFORE USING THE SOFTWARE. PRIOR TO RECEIVING THE SOFTWARE, YOU HAVE SIGNED EITHER A LICENSE AGREEMENT OR A NON-DISCLOSURE AGREEMENT WITH SCANSOFT, CONCERNING THE REALSPEAK TELECOM DEVELOPMENT SYSTEM OR SOME OF ITS COMPONENTS. THEREFORE, REFERENCE IS MADE TO THESE AND AS SUCH THE CONTRACTUAL TERMS AND CONDITIONS SHALL APPLY TO THE SOFTWARE. IF YOU HAVE NOT SIGNED SUCH AN AGREEMENT AND IF YOU USE THE SOFTWARE, SCANSOFT WILL ASSUME THAT YOU AGREED TO BE BOUND BY THE EVALUATION AGREEMENT SPECIFIED HEREUNDER. IF YOU DO NOT ACCEPT THE TERMS OF THIS EVALUATION AGREEMENT, YOU MUST RETURN THE PACKAGE UNUSED TO SCANSOFT WITHIN SEVEN (7) DAYS AFTER RECEIPT.

1. Grant of Rights
In consideration of a possible commercial relationship, ScanSoft hereby grants to you, the LICENSEE, who accepts, a non-exclusive right to internally evaluate and test the software program (the "Software").

2. Ownership of Software
ScanSoft retains title, interests and ownership of the Software recorded on the original disk(s) and all subsequent copies of the Software and Documentation, regardless of the form or media in or on which the original and other copies may exist. ScanSoft reserves all rights not expressly granted to LICENSEE.

3. Copy Restrictions
This Software and the accompanying documentation are copyrighted. Unauthorized copying of the Software, including Software that has been merged or included with other software, or of the documentation is expressly forbidden. LICENSEE may be held legally responsible for any intellectual property infringement that is caused or encouraged by his failure to abide by the terms of this agreement. LICENSEE is allowed to make two (2) copies of the Software solely for backup purposes, provided that the copyright notice is included on the backup copy.

4. Use Restrictions
LICENSEE agrees not to use the Software for any other purpose than internally evaluating the Software.
LICENSEE may physically transfer the Software from one computer to another, provided that the Software is used on only one computer at a time. LICENSEE may not modify, adapt, translate, reverse engineer, decompile, disassemble or create derivative works based on the Software. LICENSEE may not modify, adapt, translate or create derivative works based on the documentation provided by ScanSoft. The Software may not be transferred to anyone without the prior written consent of ScanSoft. In no event may LICENSEE transfer, assign, lease, sell or otherwise dispose of the Software and Documentation on a temporary or permanent basis except as expressly provided herein.

5. Warranty
THE SOFTWARE IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. ScanSoft shall have no liability to LICENSEE or any third party for any claim, loss or damage of any kind, including but not limited to lost profits, punitive, incidental, consequential or special damages, arising out of or in connection with the use or performance of the Software and accompanying documentation.

6. Termination
This agreement is effective until terminated. ScanSoft reserves the right to terminate this agreement automatically if any provision of this agreement is violated. LICENSEE may terminate this agreement by returning the Software and the accompanying documentation to ScanSoft, along with a written warranty stating that all copies have been returned.
Copyright (C) ScanSoft, Inc.
RealSpeak Telecom Software Development Kit
User's Guide for French V4.0
October 2004 - December 2005

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording or any information retrieval system, without the written permission of ScanSoft.

Trademarks
MS-DOS, WINDOWS, MICROSOFT VISUAL C++, BORLAND C++ and Sound Blaster are registered trademarks of their respective owners. ScanSoft is a registered trademark. All rights reserved.
Document History

Date             Software Version   Type of change
October 2004     V4.0               Document creation
December 2005    V4.0               Update SSML Preprocessor chapter and Using Control Sequences section of Chapter I
Table of Contents

FRENCH TEXT-TO-SPEECH SYSTEM
    Introduction
    Preparing a text for Text-To-Speech
    Native Character Set
    Using Control Sequences
    Quick Reference of the RealSpeak native Control Sequences for French
    Entering phonetic input
    How to proceed
    Lexical stress and sentence accents in phonetic input
    The French L&H+ and UNIPA Phonetic Alphabets
    Using a User Dictionary
    Using the Microsoft SAPI5 Lexicon
    User Lexicons
    Application Lexicons
    The French SAPI5 Phoneme List
    Notes on the French Text-To-Speech System
    Cardinal Numbers
    Numbers
    Ordinal Numbers
    Roman Numbers
    Fractions
    Telephone Numbers
    Bank Account Numbers & Visa Numbers
    Dates
    Time Indications
    Currencies
    Alphanumeric Strings
    Mathematical formulas
    Abbreviations
    Acronyms and Initialisms

E-MAIL PREPROCESSOR
    Introduction
    E-Mail Header Processing
    Header Field Extraction
    Header Field Reading
    From Field
    Date Field
    Subject Field
    E-Mail body processing
    Message Extraction
    Text Normalization
    Language specific accents
    English words
    Customizing the E-Mail Preprocessor
    Customizing the E-Mail Header
    From Field
    Subject Field
    Customizing the E-Mail Body

SSML PREPROCESSOR
    Introduction
    French specific SSML markup
    XML encoding types for French
    4SML Specifics for French

CUSTOM G2P DICTIONARIES
    Introduction

APPENDICES
    Appendix A: French voice and language strings
RealSpeak Telecom SDK
Chapter I: French Text-To-Speech System
User's Guide for French V4.0
French Text-To-Speech System

Introduction

This section provides operational instructions for the RealSpeak Telecom Text-To-Speech system for French. It reviews the functionality of the system, and describes the way in which the user can customize the pronunciation of input texts. This part also describes issues that are particular to the French Text-To-Speech system. It introduces the French phonetic alphabet and it discusses some language-specific features of the French Text-To-Speech system.

Preparing a text for Text-To-Speech

In general, there are four ways to intervene in the pronunciation of text:
- By using control sequences
- By entering phonetic input
- By using a user dictionary or a user ruleset
- By using one of the supported APIs

These mechanisms are described in the Programmer's Guide. In this part, however, the specifications for French are fully described.

Native Character Set

The native character set of the French TTS system is Windows-1252; it has the printable characters in the ASCII range 1-127 as a subset. Note that TTS input encoded in another supported character set is converted to the native character set for that language before it is processed internally. Consequently, input must be representable in the native character set even if it is encoded in another character set supported by the API.

Using Control Sequences

For a description of the various supported markup languages (independent from the language), refer to the Programmer's Guide.
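The native character set requirement described under "Native Character Set" can be checked before text is sent to the engine. The following is a minimal Python sketch, not part of the RealSpeak API: the function name fits_native_charset is hypothetical, and it assumes that Python's "cp1252" codec is an adequate stand-in for the engine's Windows-1252 conversion.

```python
def fits_native_charset(text, encoding="cp1252"):
    """Return True if every character of `text` can be represented in
    Windows-1252 (assumed here to match the engine's native character set)."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# French text with accented letters is representable in Windows-1252.
print(fits_native_charset("Ceci est un test, été, pâte, fût."))
# Cyrillic text is not representable and would need to be rejected or replaced.
print(fits_native_charset("Привет"))
```

Input that fails this check cannot be converted to the native character set, whatever encoding it is delivered in.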
Remark: <ESC> represents the escape character, ASCII 27 (hex 1B, written \x1b in many programming languages).

The quick reference table below lists the RealSpeak native control sequences supported for French. The language-specific support for the SSML markup language is described in the SSML Preprocessor chapter.
Quick Reference of the RealSpeak native Control Sequences for French

Sequence:    <ESC>\vol=x\
Description: Volume
Range:       x = 0..100 (0 = silence, 10 = low, 100 = high)
Default:     80
Delimiter:   No
Example:     Modifier le volume ne pose pas de problèmes. <ESC>\vol=90\ Le plus haut volume <ESC>\vol=10\ s'oppose au volume le plus bas.

Sequence:    <ESC>\rate=x\
Description: Speech rate
Range:       x = 1..100 (10 = slow, 100 = fast)
Default:     50
Delimiter:   No
Example:     Toujours selon votre choix, la vitesse d'élocution <ESC>\rate=10\ sera plus lente <ESC>\rate=90\ ou plus rapide.

Sequence:    <ESC>\rate_wpm=xxx\
Description: Words per minute
Range:       xxx = 1..1000; voice-specific (see subsequent table)
Default:     Voice-specific
Delimiter:   No
Example:     Toujours selon votre choix, la vitesse d'élocution <ESC>\rate_wpm=110\ sera plus lente <ESC>\rate_wpm=350\ ou plus rapide.

Sequence:    <ESC>Mx
Description: Read mode; some read modes are not supported in e-mail mode
Range:       x = 0..3
             0 = character-by-character
             1 = word-by-word (not supported in e-mail mode)
             2 = sentence-by-sentence
             3 = line-by-line (not supported in e-mail mode)
Default:     2
Delimiter:   Yes
Examples:    <ESC>M0 Test (The word "Test" will be spelled.)
             <ESC>M1 Ceci est un test. (This sentence will be read word by word.)
             <ESC>M2 Ceci est un test. (This sentence will be read as one sentence.)

Sequence:    <ESC>Wx
Description: Wait period
Range:       0 = no wait period
             1 = 200 milliseconds wait period
             9 = 1800 milliseconds
Default:     2
Delimiter:   Yes
Example (<ESC>Wx): <ESC>M2 <ESC>W2 Cet énoncé est suivi d'une pause courte... <ESC>W9 Et maintenant d'une pause longue. Est-ce que vous entendez la différence?

Sequence:    <ESC>\pause=xxx\
Description: Long pause
Range:       xxx = 1..65535 msec
Delimiter:   No
Example:     On peut déterminer la longueur <ESC>\pause=5000\ des pauses.

Sequence:    <ESC>"
Description: Sentence accent
Delimiter:   No
Examples:    Mets le livre <ESC>"sur la table.
             Mets le <ESC>"livre sur la table.
Note: Manually inserted sentence accents may have no effect in RealSpeak. The RealSpeak synthesis module may indeed have reasons to override the requested sentence accent, and thus not realize it.

Sequence:    <ESC>C
Description: Continuation
Delimiter:   No
Examples:    Jean pose une question à Marie. Mais Marie ne lui répond pas.
             Jean pose une question à Marie. <ESC>C Mais Marie ne lui répond pas.
In the first of the above examples, the text-to-speech system will detect an end-of-sentence after Marie and will read the input as two separate sentences. In the second example, a continuation sequence is inserted in order to make the system pronounce the entire input as one sentence.

Sequence:    <ESC>E
Description: End-of-Message
Delimiter:   Yes
Example:     Vous entendez d'abord la première ligne <ESC>E et puis la deuxième.
In the above example, the sequence <ESC>E forces the system to pronounce the two halves of the input separately.

Sequence:    <ESC>/+
Description: Phonetic input (L&H+ phonetic alphabet)
Delimiter:   No
Example:     <ESC>/+'sE+R<ESC>/+

Sequence:    <ESC>%x
Description: Preprocessing mode
Range:       text = standard text mode
             email = e-mail mode
Delimiter:   Yes
Examples (<ESC>%x): <ESC>%text L'expéditeur du présent message est Robert Guinot.
                    <ESC>%email From: Robert Guinot <rguinot@rbs.fr>

Sequence:    <ESC>\tn=x\
Description: Guide text normalization; limited support in e-mail mode
Range:       address = address mode (not supported in e-mail mode)
             normal = standard mode
             spell = spell mode
Default:     normal
Delimiter:   No
The text normalization types corresponding with the SSML <say-as> types are also supported in standard text mode (not in e-mail mode); see the SSML Preprocessor chapter for more details.
Examples:    <ESC>\tn=address\ MM. Dufour et Ganty, 45 psg. Bottin, 75005 Paris <ESC>\tn=normal\
             <ESC>\tn=address\ Rés. Les boutons verts, 14, r. st.-Jean, 54000 Nancy <ESC>\tn=normal\
             <ESC>\tn=address\ 125-127 pl. Voltaire<ESC>\tn=normal\
             <ESC>\tn=address\ RN 25, Lille <ESC>\tn=normal\

Sequence:    <ESC>F
Description: Reset to default
Delimiter:   Yes
Examples:    <ESC>\vol=10\ Maintenant le volume est très bas, <ESC>F et remis à la valeur normale.
             <ESC>\rate=10\ Ici, la vitesse est réduite au niveau minimal, <ESC>F pour ensuite retrouver son rythme normal.

Sequence:    <ESC>@c
Description: Declare the part-of-speech
Range:       c is a character with the following possible values:
             N = noun
             J = adjective
             A = adverb
             V = verb
             R = past participle
Delimiter:   No
Sequence:    <ESC>\domain=s\
Description: Enable the extension (only if a custom g2p has been loaded)
Range:       s = string: the name of the extension
Delimiter:   Yes

Sequence:    <ESC>\domain\
Description: Disable the last extension
Delimiter:   Yes

Sequence:    <ESC>\voice=s\
Description: Set the voice (if more than one voice is available)
Range:       s = string: the name of the voice
Delimiter:   Yes

Sequence:    <ESC>\mrk=n\
Description: Insert a bookmark
Range:       n = 0..2147483647
Delimiter:   No

Sequence:    <ESC>\p\
Description: Insert a paragraph boundary
Range:       -
Delimiter:   Yes

Sequence:    <ESC>\audio="s"\
Description: Insert an audio file; not supported in e-mail mode
Range:       s = string: the URI of a document with an appropriate MIME type
Delimiter:   Yes
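When input text is assembled programmatically, the control sequences in the quick reference can be generated by small helper functions that also enforce the documented ranges. The sketch below is illustrative only: the helper names (vol, rate, pause, RESET) are hypothetical and not part of the RealSpeak API, and it assumes the engine accepts the sequences exactly as written in the examples, with no space after the escape character.

```python
ESC = "\x1b"  # the escape character, ASCII 27 (hex 1B)


def _check(name, value, low, high):
    """Raise ValueError when `value` falls outside the documented range."""
    if not low <= value <= high:
        raise ValueError(f"{name} must be in {low}..{high}, got {value}")


def vol(x):
    """Volume sequence <ESC>\\vol=x\\, documented range 0..100."""
    _check("vol", x, 0, 100)
    return f"{ESC}\\vol={x}\\"


def rate(x):
    """Speech rate sequence <ESC>\\rate=x\\, documented range 1..100."""
    _check("rate", x, 1, 100)
    return f"{ESC}\\rate={x}\\"


def pause(msec):
    """Long pause sequence <ESC>\\pause=xxx\\, documented range 1..65535 msec."""
    _check("pause", msec, 1, 65535)
    return f"{ESC}\\pause={msec}\\"


RESET = ESC + "F"  # reset to default

# Rebuild the volume example from the quick reference, then reset.
text = (
    "Modifier le volume ne pose pas de problèmes. "
    + vol(90) + " Le plus haut volume "
    + vol(10) + " s'oppose au volume le plus bas. "
    + RESET
)
print(repr(text))
```

Validating the range on the caller's side is a design choice of this sketch; how the engine itself reacts to out-of-range values is not described here.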
Speech Rates in Words per Minute for French Voices

Voice       Range (words per minute)   Default (words per minute)
Virginie    Min = 83, Max = 416        166

Entering phonetic input

How to proceed

To switch from orthographic to phonetic mode, insert <ESC>/+ to use the L&H+ phonetic alphabet. The phonetic input mode remains active until the command is explicitly reset by entering <ESC>/+ again.

The phonetic input string is composed of symbols of the L&H+ phonetic alphabet (see phonetic table below). Examples are given in the phonetic table below. In addition to the phonetic symbols, it is advisable to use the following characters in the phonetic input string:

Special characters

L&H+ Symbol: ' (ASCII 39, Hex 27)
Meaning:     primary word stress
As in:       <ESC>/+ pre.zi.'da%~ <ESC>/+ (noun 'président') vs. <ESC>/+ pre.'zi.d$ <ESC>/+ (verb form 'président')

L&H+ Symbol: " (ASCII 34, Hex 22)
Meaning:     sentence accent
As in:       <ESC>/+ set_'fra.z$_ko%~.tje%~_"de+_zak."sa%~*.<ESC>/+ (Cette phrase contient deux accents.)

L&H+ Symbol: .
Meaning:     syllable boundary
As in:       <ESC>/+ si.'la.b$ <ESC>/+ (syllabe)
             <ESC>/+ 'si.l$.b$l <ESC>/+ (syllable)

L&H+ Symbol: #
Meaning:     silence (pause)
As in:       <ESC>/+ Ze_"di_#_mE_"nO%~ <ESC>/+ (J'ai dit: Mais non.)
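The switching mechanism and the special characters above can be illustrated with a short Python sketch. The helper names phonetic and stressed_syllable are hypothetical; the sketch only handles a single word and assumes that syllables are separated by periods and that the stress mark, when present, is the first character of its syllable, as in the examples above.

```python
ESC = "\x1b"
PHONETIC_TOGGLE = ESC + "/+"  # switches phonetic (L&H+) mode on, and off again


def phonetic(lh_plus):
    """Wrap an L&H+ transcription so that phonetic mode is entered before it
    and explicitly reset after it."""
    return f"{PHONETIC_TOGGLE}{lh_plus}{PHONETIC_TOGGLE}"


def stressed_syllable(lh_plus):
    """Return the zero-based index of the syllable carrying the primary word
    stress mark ('), or None if the word has no stress mark."""
    for index, syllable in enumerate(lh_plus.split(".")):
        if syllable.startswith("'"):
            return index
    return None


# The noun and the verb form written 'président' differ only in stress position.
print(stressed_syllable("pre.zi.'da%~"))  # noun: stress on the last syllable
print(stressed_syllable("pre.'zi.d$"))    # verb form: stress on the second syllable
print(repr(phonetic("pre.zi.'da%~")))
```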
Note that the use of punctuation marks remains useful within phonetic input to assure a correct intonation. Each punctuation mark needs to be preceded by an asterisk.

<ESC>/+ bje%~"syr*,_ze.te_"la*.<ESC>/+ (Bien sûr, j'étais là.)
<ESC>/+ ty_e_"for*.<ESC>/+ (Tu es fort.)

Punctuation Marks

L&H+ Symbol   Meaning
_             Word delimiter
*.            End of declarative
*,            Comma
*!            End of exclamation
*?            End of question
*;            Semicolon
*:            Colon

Lexical stress and sentence accents in phonetic input

In phonetic input strings, lexical stress and sentence accents can be manually indicated by the user, by using a single quote (') or double quote (") respectively. Note that manually inserted lexical stress or sentence accents may have no effect in RealSpeak. The RealSpeak synthesis module may indeed have reasons to override the requested stress/accent.

The Text-To-Speech system will automatically convert all lexical stress marks into sentence accents in case no manually added sentence accents are found in the phonetic input string.

Example:
<ESC>/+sO%~_'pER_sa.'pEl_gi.'jom*. <ESC>/+
is the same as:
<ESC>/+sO%~_"pER_sa."pEl_gi."jom*.<ESC>/+
(Son père s'appelle Guillaume.)

If phonetic input contains at least one manually added sentence accent, no additional sentence accents are assigned by the text-to-speech system. Therefore, only those words marked with " will get a sentence accent. As a consequence, a message containing only one manual sentence accent will have an almost flat intonation on all the other words.

Example:
<ESC>/+sO%~_'pER_sa.'pEl_gi."jom*.<ESC>/+
(Only one sentence accent will be realized.)

Phonetic input can also be combined with orthographic input. If no sentence accents are found in the input text (indicated by <ESC>" in orthographic input, or by " in phonetic input), the Text-To-Speech system will automatically assign sentence accents. In the orthographic part of the input, the Text-To-Speech system will realize these sentence accents on the basis of part-of-speech and syntactic information. In the phonetic part of the input, all lexical stress marks (if any) will be converted into sentence accents. If there are no lexical stress marks, no sentence accent will be realized for the phonetic part of the input (see the automatic conversion rule above). If the user has manually specified one or more sentence accents, no additional sentence accents will be realized (see the rule on manually added sentence accents above).

S'il pleut demain, nous partirons pour <ESC>/+pa.'Ri<ESC>/+.
(No sentence accents are found; the Text-To-Speech system will automatically assign sentence accents.)

S'il pleut demain, nous partirons pour <ESC>/+pa."Ri<ESC>/+.
(A sentence accent is specified in the phonetic part of the input text. No additional sentence accents will be realized.)

Si elles s'en <ESC>"vont, ils <ESC>/+'pARt<ESC>/+ pour Paris.
(A sentence accent is specified in the orthographic part of the input text. No additional sentence accents will be realized.)

Si elles s'en <ESC>"vont, ils <ESC>/+"pARt<ESC>/+ pour Paris.
(Two sentence accents were specified; no additional sentence accents will be realized.)

The French L&H+ and UNIPA Phonetic Alphabets

Vowels

L&H+ Symbol   L&H+ Transcription   UNIPA Symbol   UNIPA Transcription   As in:
i             mi.'nyt              i              mi.'nyt               minute
e             e.'te                e              e.'te                 été
E             'tre                 E              'tre                  très
a             'ba                  a              'ba                   bas
A             'pat                 A              'pat                  pâte
O             'mort                O              'mort                 morte
o             'bo                  o              'bo                   beau
u             'nu                  u              'nu                   nous
y             'fy                  y              'fy                   fût
e+            'de+                 e=             'de=                  deux
E+            'se+r                E=             'se=r                 soeur
$             'l$                  $              'l$                   le
E%~           'be%~                E%~            'be%~                 bain
A%~           'bla%~               A%~            'bla%~                blanc
O%~           'bo%~                O%~            'bo%~                 bon
E+%~          'E+%~                E=%~           'E=%~                 un
Consonants

L&H+ Symbol   L&H+ Transcription   UNIPA Symbol   UNIPA Transcription   As in:
p             'pa                  p              'pa                   pas
b             'ba                  b              'ba                   bas
t             'ta                  t              'ta                   tas
d             'do                  d              'do                   do
k             'ki                  k              'ki                   qui
g             'gom                 g              'gom                  gomme
?             A%~.ti.?aR.'pO%~     ?              A%~.ti.?aR.'pO%~      antiharpon
f             'fe%~                f              'fe%~                 faim
v             'vol                 v              'vol                  vol
s             'sak                 s              'sak                  sac
z             ze.'ro               z              ze.'ro                zéro
S             'SaR.m$              S              'SaR.m$               charme
Z             ZaR.'dE%~            Z              ZaR.'dE%~             jardin
m             'mo                  m              'mo                   mot
n             'nu                  n              'nu                   nous
n~            a.'n~o               n~             a.'n~o                agneau
nk            smo.'kinkg           nk             smo.'kinkg            smoking
l             'la                  l              'la                   la
R             'RO%~                R              'RO%~                 rond
j             bri.'je              j              bri.'je               briller
w             'wi                  w              'wi                   oui
h\            'lh\i                h\             'lh\i                 lui

NOTE: The L&H+ alphabet is not SSML compliant. For SSML, use the UNIPA alphabet.
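The rule described under "Lexical stress and sentence accents in phonetic input" (lexical stress marks are promoted to sentence accents only when the phonetic string contains no manually added sentence accent) can be sketched in a few lines of Python. This is an illustration of the documented rule, not the engine's implementation; the function name effective_accents is hypothetical, and the sketch assumes that ' and " occur in the string only as stress and accent marks.

```python
def effective_accents(phonetic_input):
    """Return the phonetic string with the accent marks the system would act on.

    If at least one sentence accent (") is present, the string is returned
    unchanged: no additional sentence accents are assigned. Otherwise every
    lexical stress mark (') is converted into a sentence accent.
    """
    if '"' in phonetic_input:
        return phonetic_input
    return phonetic_input.replace("'", '"')


# No manual sentence accent: all three stress marks become sentence accents.
print(effective_accents("sO%~_'pER_sa.'pEl_gi.'jom*."))
# One manual sentence accent: the string is left as it is.
print(effective_accents("sO%~_'pER_sa.'pEl_gi.\"jom*."))
```

As noted in the text, the synthesis module may still override a requested stress or accent, so the result describes what is requested rather than what is guaranteed to be realized.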
Using a User Dictionary

For information on how to create and use user dictionaries, please refer to the User Configuration chapter in the RealSpeak Telecom Programmer's Guide.

Using the Microsoft SAPI5 Lexicon

Microsoft SAPI5 provides lexicons so that users and applications can specify pronunciation and part-of-speech information for particular words. As such, all SAPI compliant Text-To-Speech engines should use these lexicons to guarantee uniformity of pronunciation and part-of-speech information.

There are two types of lexicons in SAPI: user lexicons and application lexicons.
User Lexicons

Each user who logs in to a computer will have a User Lexicon. Initially, this lexicon is empty; words can be added either programmatically, or by using an engine's add/remove words UI component (for example, the sample application Dictation Pad provides an Add/Remove Words dialog).

Application Lexicons

Applications can create and ship their own lexicons of specialized words. These lexicons are fixed and cannot be edited.

Detailed information on how to use the MS SAPI5 lexicons can be found in the manual Microsoft Speech SDK V5.1, chapter ISpLexicon Interface.
The French SAPI5 Phoneme List

To add entries to the lexicon, the user should use a set of language specific phonemes. The language specific phoneme list for French is given below.

SAPI5 Symbol   SAPI Phone ID   Example      SAPI5 Transcription
A              11              patte        P A T
AA             10              pâte         P AA T
AX             13              justement    ZH UY S T AX M A ~
EH             16              seize        S EH Z
EU             30              deux         D EU
EY             17              ses          S EY
IY             22              si           S IY
OE             29              neuf         N OE F
OH             12              comme        K OH M
OW             31              gros         G R OW
UY             21              du           D UY
UW             37              doux         D UW
P              32              pont         P OW ~
B              14              bon          B OW ~
M              25              mont         M OW ~
F              18              femme        F A M
V              38              vent         V A ~
T              36              temps        T A ~
D              15              dans         D A ~
N              26              nom          N OW ~
S              34              sans         S A ~
Z              41              zone         Z OW N
L              24              long         L OW ~
SH             35              champ        SH A ~
ZH             42              gens         ZH A ~
SAPI5 Symbol   SAPI Phone ID   Example        SAPI5 Transcription
NJ             28              oignon         OW NJ OW ~
NG             27              camping        K A M P IY NG
Y              40              ion, pierre    Y OW ~, P Y EH R
W              39              coin           K W EY ~
K              23              quand          K A ~
G              19              grand, gant    G R A ~, G A ~
R              33              rond           R OW ~
HY             20              juin           ZH HY EY ~
A ~            11 9            vent           V A ~
EY ~           17 9            vin            V EY ~
OE ~           29 9            brun           B R OE ~
OW ~           31 9            bon            B OW ~

SAPI5 Symbols

SAPI5 Symbol           Meaning               SAPI Phone ID   As in:
- (hyphen)             syllable boundary     1               zh a rx - 1 d e~
! (exclamation mark)   sentence terminator   2               1 l ax & 1 v e~ & 1 eh & 1 t rx eh & 1 b o~!
&                      word boundary         3               1 l ax & 1 v e~ & 1 eh & 1 t rx eh & 1 b o~
, (comma)              sentence terminator   4               1 l ax & 1 v e~, 1 eh & 1 b o~.
. (period)             sentence terminator   5               1 l ax & 1 v e~ & 1 eh & 1 t rx eh & 1 b o~.
? (question mark)      sentence terminator   6               1 l ax & 1 v e~ & 1 eh & 1 t rx eh & 1 b o~?
_ (underscore)         silence               7               1 l ax & 1 v e~ & _ 1 eh & 1 t rx eh & 3 b o~
1                      primary stress        8               zh a rx - 1 d e~
~                      nasalization          9
Notes on the French Text-To-Speech System

The French Text-To-Speech system has been designed to allow a correct pronunciation of any input written according to the rules of French orthography. The following cases, however, require special attention.

Cardinal Numbers

Cardinal numbers up to 15 digits are pronounced as full numbers. Commas may be used to separate groups of digits. Digit strings consisting of more than 15 digits are pronounced digit by digit. A number starting with a zero is automatically spelled.

20598500
610.456.789
235 566 887 123
256.789.411.789.215

NOTE: Numerals that are normally pronounced as full numbers can also be read digit by digit by using the control sequence <ESC>\spell=on\ in front of the numeral to set the spell mode.
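The cardinal number rules above amount to a small decision procedure, sketched below in Python. The function name cardinal_reading_mode and the returned labels are hypothetical and only restate the rules of this section. Treating periods and spaces as group separators is an assumption taken from the examples shown; the sketch does not model the engine's actual tokenization.

```python
import re


def cardinal_reading_mode(token):
    """Classify how a cardinal number would be read according to the rules
    in this section. Periods and spaces are treated as group separators
    (an assumption based on the examples)."""
    digits = re.sub(r"[ .]", "", token)
    if not digits.isdigit():
        raise ValueError(f"not a cardinal number: {token!r}")
    if digits.startswith("0"):
        return "spelled"            # a number starting with a zero is spelled
    if len(digits) > 15:
        return "digit by digit"     # more than 15 digits
    return "full number"            # up to 15 digits


for example in ("20598500", "256.789.411.789.215", "1234567890123456", "0123"):
    print(example, "->", cardinal_reading_mode(example))
```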
Numbers

Decimal numbers may consist of up to 15 digits before or after the decimal point. Commas may be used to separate groups of digits in the digit string before the decimal point. The decimals following the comma are pronounced as full numbers.

55,35
55,255

Ordinal Numbers

Cardinal numbers followed by the correct ordinal suffix will be pronounced as ordinals:

1er
2e
10e
la 61e fois

Roman Numbers

The French Text-To-Speech system supports the use of Roman numbers up to 39, when consisting of combinations of X, V and I. The Roman numbers may either be used separately or in combination with proper names. Roman numbers up to 30, followed by e, E, e. or E. are read as ordinal numbers.

Louis XIV
XIX
Ve
Ie

Fractions

Digit strings consisting of maximally 15 digits, followed by a slash, followed by a maximum of 15 digits and an ordinal suffix, are pronounced as fractions.
Digit strings 1, 2, 3, 4 followed by a slash, followed by 1, 2, 3, 4 are pronounced as fractions.

1/2         un demi
2/3         deux tiers
125/425e    cent vingt-cinq quatre cent vingt-cinquièmes

Telephone Numbers

In order to ensure a correct pronunciation of telephone numbers, it is recommended to use slashes or parentheses to separate the area code from the remainder of the telephone number. Also, use periods, hyphens or a space to separate groups of digits. Telephone numbers written in this format will be pronounced in groups of two digits, with a pause at the place of the period, hyphen or space.

04.42.27.86.53
83-50-33-33
83 54 21 73
03/88.41.73.00
(05)42 21 89 53
33 03-22-70-59-99
33.(0)3.22.71.49.90

If telephone numbers are preceded by an abbreviation, the number doesn't need to be separated by spaces, periods or hyphens:

Tél. (03)22718919
tél. 22713990

The following formats for Belgian telephone numbers are supported:

Tél. (03)8200222
Tél. (057) 82.05.22
Tél. 02/460.33.97
04 1234567
02-652.89.75

Telephone numbers written in this way will be pronounced in groups of two or three digits.
The following formats for Canadian telephone numbers are supported:

(514) 895-7868
418.644.5950
1 123.123.1234
789 0456

Telephone numbers written in this way will be spelled.

Bank Account Numbers & Visa Numbers

In order to have bank account numbers correctly pronounced, use hyphens between groups of digits. The number will be pronounced in groups of two or three digits.

810-1254887-87

To have a bank account number pronounced digit by digit, switch to spell mode (<ESC>\spell=on\).

For a correct pronunciation of visa numbers, use hyphens between each group of digits. Each group of 4 digits is spelled.

1234-5678-9112-3456

Dates

The French Text-To-Speech system reads dates written as structured groups of digits in the following numeric formats:
- with slashes: Day (1 or 2 digits)/month (1 or 2 digits)/year (2 or 4 digits)
- with hyphens: Day (1 or 2 digits)-month (1 or 2 digits)-year (2 or 4 digits)
- with periods: Day (1 or 2 digits).month (1 or 2 digits).year (2 or 4 digits)
11/12/98
11-12-1998
2.6.92
11/05/00

You can also use the written format:

le 1er déc. 97
le 31 janv. 96

Time Indications

Time indications will be correctly pronounced when written in one of the following formats:

22h30, 22 H 30, 22.30 h, 22.30H, 22:30h, 22:30 H    vingt-deux heures trente
22h30, 22 H 30                                      vingt-deux heures trente minutes
12h00                                               midi
00h00                                               minuit
2 h                                                 deux heures
14 H                                                quatorze heures
22                                                  vingt-deux minutes
6240                                                six mille deux cent quarante minutes

Currencies

The French Text-To-Speech system correctly handles the currency symbols FF, FRS, FR, FB and $, among others, provided that they follow the numeral.

15 FF
14 FB
50 $
16
16 EUR
EUR 16

Currencies up to 15 digits (with or without periods) will be correctly pronounced.

20.579.500 FF

Decimal digits in combination with currency indications are also supported. Decimal currency amounts up to 15 digits before and 15 digits after the comma will be correctly pronounced.

1.999,50 FB
1.999,50 $

The most common currency abbreviations from around the world are also supported. These abbreviations can follow or precede the amount and are expanded.

3 USD: trois dollars américains

Other currencies are written in full words and have to follow the numeral.

1500 lires

Alphanumeric Strings

Alphanumeric strings consist of a combination of letters and digits. The alphabetic part of the alphanumeric string will always be spelled; the numeric part will be read as a full number.

AB1956 is pronounced as A B 1956
125DC20 is pronounced as 125 D C 20
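The alphanumeric rule above (letters are spelled one by one, each run of digits is read as a full number) can be sketched with a single regular expression. This is an illustration in Python under the stated rule only; the function name split_alphanumeric is hypothetical and the engine's real tokenizer is not documented here.

```python
import re


def split_alphanumeric(token):
    """Split an alphanumeric string into the units that are read out:
    each letter on its own (spelled), each run of digits as one number."""
    # \d+ matches a whole run of digits; [^\W\d_] matches a single letter,
    # including accented letters.
    units = re.findall(r"\d+|[^\W\d_]", token)
    return " ".join(units)


print(split_alphanumeric("AB1956"))
print(split_alphanumeric("125DC20"))
```

Applied to the two examples in this section, the sketch yields the same groupings as the pronunciations listed above.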
Mathematical formulas

The French Text-To-Speech system supports a range of mathematical formulas. Numbers, cardinals or decimals can be negative (i.e. preceded by the minus symbol).

Operators:
+        plus
-        moins
* x x    fois
/        divisé par
%        modulo
^        puissance (but: ^2: au carré, ^3: au cube)

Separators:
( [      parenthèse, crochet ouvert
) ]      parenthèse, crochet fermé

4 + 2 = 6
    quatre plus deux égale six
4-2*5=-6
    quatre moins deux fois cinq égale moins six
(4 + 2) + 2 = 8
    parenthèse ouverte quatre plus deux parenthèse fermée plus deux égale huit
[4-2]+3-5=0
    crochet ouvert quatre moins deux crochet fermé plus trois moins cinq égale zéro
15/(10-5)+1=4
    quinze divisé par parenthèse ouverte dix moins cinq parenthèse fermée plus un égale quatre
(4 + 2) + (4-2) = 8
    parenthèse ouverte quatre plus deux parenthèse fermée plus parenthèse ouverte quatre moins deux parenthèse fermée égale huit
[4+2][4-2]=12
    crochet ouvert quatre plus deux crochet fermé fois crochet ouvert quatre moins deux crochet fermé égale douze
5*(2+3)-2+[1/1]-20=4
    cinq fois parenthèse ouverte deux plus trois parenthèse fermée moins deux plus crochet ouvert un divisé par un crochet fermé moins vingt égale quatre
Abbreviations

The French RealSpeak system contains a dictionary with the most common abbreviations, such as:

e.a.    entre autres
svp.    s'il vous plaît
Mme     Madame
O.K.    oké
ex.     exemple

Words consisting only of consonants are spelled, e.g. PDG, HLM.

Some abbreviations are ambiguous, however, and are pronounced depending on the context in which they appear. For example, the abbreviation "MM" is pronounced "millimètres" when preceded by a digit, but "messieurs" in other cases.

3 MM                   3 millimètres
MM Deprez et Dupont    Messieurs Deprez et Dupont

Acronyms and Initialisms

The French Text-To-Speech system contains a standard dictionary with acronyms and initialisms such as RAM, NASA, AMEX, etc.

Acronyms are abbreviations formed by combining the first letter(s) of a group of words. They are pronounced as words.

NATO, UNESCO

Initialisms are abbreviations formed by combining the first letter of each part of a group of words. Initialisms are spelled.

API, FBI
RealSpeak Telecom SDK
Chapter II: E-Mail Preprocessor
User's Guide for French V4.0
E-Mail Preprocessor

Introduction

The ScanSoft e-mail preprocessor (EMPP) has been developed to analyze a specific type of text: e-mail messages. E-mail messages differ from any average type of text in both structure and contents. An e-mail message consists of two clearly distinguished parts: the header and the body. A substantial part of the header contains routing and administrative information, which is irrelevant to the user. Both the header and the body contain all kinds of e-mail specific text features, e.g. e-mail addresses, emoticons such as smileys, etc. Furthermore, informal writing is often combined with a lack of grammatical conventions. Spelling rules are frequently violated, punctuation is often omitted, etc.

Although the standard ScanSoft Text-To-Speech system can handle special text items (abbreviations, numbers, dates, etc.), it is not capable of correctly handling all e-mail specific text features. These text features are therefore dealt with by the e-mail preprocessor. The EMPP transforms e-mail specific information into a format that complies with the rules of the standard ScanSoft Text-To-Speech system.

The EMPP is a plug-in preprocessing module of the ScanSoft Text-To-Speech system. It replaces the preprocessor of the standard Text-To-Speech system. In the following sections you will find a description of the functioning of the ScanSoft e-mail preprocessor as well as an overview of its features.

The e-mail preprocessor has two main tasks: processing of the e-mail header and processing of the body of the e-mail message. The input to the EMPP consists of one or more e-mail messages. In order to process the e-mail header, the EMPP extracts relevant header fields and then provides an intelligent header field reading.
During the processing of the e-mail body, the text is divided into smaller text units, called text-to-speech messages, which are synthesized by the Text-To-Speech system. Text normalization is applied to e-mail specific text features such as e-mail addresses, proper names, emoticons, URLs (Uniform Resource Locators), etc. For the text normalization of an e-mail message, the ScanSoft EMPP applies linguistic rules and performs dictionary look-up, in order to yield an adequate phonetic transcription. The EMPP also supports the ScanSoft user dictionary mechanism, which allows the user to customize the output of the e-mail processing.

E-Mail Header Processing

Header Field Extraction

An e-mail message consists of two clearly distinguished parts: the header and the body. The EMPP detects the header and extracts the relevant header fields. Information that is of no interest to the user (such as routing information) is not retained. The EMPP extracts the following header fields:

From Field       Contains the sender's name and/or address
Date Field       Contains the date and time of sending
Subject Field    Optionally contains the subject of the e-mail

The extraction of the header fields is based on the detection of specific keywords in the e-mail header. The supported keywords for the extraction of the header fields are listed below:

From Field:      From:  Author:  Sender:  De:  Von:
Date Field:      Date:  Enviado:  Gesendet:
Subject Field:   Subject:  Subj:  Asunto:  Betreff:
The following is an example of header field extraction. The original header holds information that is irrelevant to the user. After extraction of date, sender and subject, the processed header contains only the Date field, the From field and the Subject field.

Original header (English version):

  From jbruneau@ushdv.fr Tue Oct 22 15:52:02 1996
  Path: chaos.kulnet.kuleuven.ac.be!belgium.eu.net!eu.net!www.nntp.primenet.com!nntp.primenet.com!nntp.uio.no!newsfeed.easynet.co.uk!easynet-uk!news.easynet.fr!easynet-fr!rain.fr!francenet.fr!usenet
  From: Jean-Marc Bruneau <jbruneau@ushdv.fr>
  Newsgroups: fr.comp.lang.java
  Subject: Un jeu en JAVA sur cd-rom
  Date: Thu, 17 Oct 1996 14:36:38 +0200
  Organization: Acapella
  Lines: 5
  Message-ID: <32662856.6039@ushdv.fr>
  Reply-To: aca-prs@dialup.francenet.fr
  NNTP-Posting-Host: pppa233.francenet.fr
  Mime-Version: 1.0
  Content-Type: text/plain; charset=iso-8859-1
  Content-Transfer-Encoding: 8bit
  X-Mailer: Mozilla 2.01I [fr] (Macintosh; I; PPC

Extracted header fields:

  From: Jean-Marc Bruneau <jbruneau@ushdv.fr>
  Subject: Un jeu en JAVA sur cd-rom
  Date: Thu, 17 Oct 1996 14:36:38 +0200
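The keyword-based extraction illustrated above can be sketched in a few lines of Python. This is only an illustration of the idea, not the EMPP implementation: the function name extract_header_fields is hypothetical, and the sketch assumes that a keyword must appear at the very start of a header line, with exact case, and that only the first matching line per field is kept.

```python
# Keywords listed in this guide for each header field.
FIELD_KEYWORDS = {
    "From": ("From:", "Author:", "Sender:", "De:", "Von:"),
    "Date": ("Date:", "Enviado:", "Gesendet:"),
    "Subject": ("Subject:", "Subj:", "Asunto:", "Betreff:"),
}


def extract_header_fields(header_lines):
    """Return {field name: value} for the first matching line of each field.

    Hypothetical helper: lines that do not start with one of the listed
    keywords (routing information and so on) are simply ignored.
    """
    fields = {}
    for line in header_lines:
        for field, keywords in FIELD_KEYWORDS.items():
            if field in fields:
                continue  # assumption: keep only the first occurrence
            for keyword in keywords:
                if line.startswith(keyword):
                    fields[field] = line[len(keyword):].strip()
                    break
    return fields
```

Note that the first line of the example header ("From jbruneau@ushdv.fr Tue Oct 22 ...") has no colon after "From", so under these assumptions it is not mistaken for a From field.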
Header Field Reading

After the header fields have been extracted, they are processed by the EMPP. The header field keywords (see above) are replaced by an introductory message. The remainder of each header field is processed by the EMPP so that the Text-To-Speech system can read the field intelligently.

From Field

The From field keyword is replaced by the introductory message "Message envoyé par:".

  Author: Sandrine Rosier

is pronounced:

  Message envoyé par: Sandrine Rosier

The remainder of the From field is further processed by the EMPP. The EMPP supports From fields that consist of either

  a) a proper name
  b) a proper name and an address
  c) an address

a) - b) If the From field contains a proper name, this name and only this name is sent to the Text-To-Speech system. This means that if both a name and an address are found in the From field, the address will not be read by the Text-To-Speech system.

  From: Patrick Parigeaud
  From: "Robert Griffon" <rgriffon@rbs.ca>
  Author: Bernard Garnier at IepXchgPO
  Author: Hélène Dubois/LHS/IEP/CA
  From: Francois.Simonnet@lhs.ca (Francois Simonnet)
are pronounced:

  Message envoyé par: Patrick Parigeaud
  Message envoyé par: Robert Griffon
  Message envoyé par: Bernard Garnier
  Message envoyé par: Hélène Dubois
  Message envoyé par: François Simonnet

c) If the From field contains only an address, the EMPP extracts the name from the address and expands the domain contained in the address. In other words, the e-mail address is not read literally.

  From: Stephane.Chabert@ac-toulouse.fr
  Author: videotron.ca!labelle
  Author: c1689@t7j.com
  From: cdnsport.ca!philippe_lacour at Internet

are pronounced:

  Message envoyé par: Stéphane Chabert at L'Académie de Toulouse
  Message envoyé par: labelle at Vidéotron Canada
  Message envoyé par: c 16 89 at Télé 7 Jours
  Message envoyé par: Philippe Lacour at Sport Canadien

Date Field

The Date field keyword is replaced by the introductory message "Date:".

The Date field contains the date and time of sending. The EMPP supports multiple date and time formats, which are transformed into a uniform format that complies with the rules for date and time indications of the ScanSoft Text-To-Speech system. The EMPP only pronounces the date; the time is not read.

The EMPP supports dates in the following formats:
  Date: Thu, 7 Dec 1995 13:45:46 EDT
  Date: 13 Mar 2003 11:45 AM

are pronounced:

  Date: jeudi, 7 décembre 1995
  Date: 13 mars 2003

Subject Field

The Subject field keyword is replaced by the introductory message "Sujet:".

The Subject field can contain all kinds of data, but may also be empty. The EMPP searches for keywords that are typical for the subject field (e.g. RE, FYI, FW).

  Subject: Re: Perdu des disquettes!
  Subject: FYI: service rapide
  Subj: reunion annulee (fwd)

are pronounced:

  Sujet: réponse: Perdu des disquettes!
  Sujet: à but informatif: service rapide
  Sujet: réunion annulée (message redirigé)
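The Subject field reading shown above can be approximated with a small lookup table. The sketch below is a hedged illustration only: the function name read_subject and the two tables are hypothetical, they contain just the expansions visible in the examples (the spoken form for FW is not shown in this guide, so it is left out), and the restoration of missing accents ("reunion annulee") is not modelled.

```python
import re

# Expansions taken from the examples in this guide; the real EMPP keyword
# list is presumably longer.
PREFIX_EXPANSIONS = {"re": "réponse", "fyi": "à but informatif"}
SUFFIX_EXPANSIONS = {"(fwd)": "(message redirigé)"}
SUBJECT_KEYWORDS = ("Subject:", "Subj:", "Asunto:", "Betreff:")


def read_subject(line):
    """Replace the field keyword by 'Sujet:' and expand typical keywords."""
    for keyword in SUBJECT_KEYWORDS:
        if line.startswith(keyword):
            text = line[len(keyword):].strip()
            break
    else:
        raise ValueError("not a Subject field")

    # Leading keyword such as "Re:" or "FYI:" (matched case-insensitively).
    match = re.match(r"(\w+):\s*", text)
    if match and match.group(1).lower() in PREFIX_EXPANSIONS:
        spoken = PREFIX_EXPANSIONS[match.group(1).lower()]
        text = spoken + ": " + text[match.end():]

    # Trailing marker such as "(fwd)".
    for marker, spoken in SUFFIX_EXPANSIONS.items():
        if text.lower().endswith(marker):
            text = text[:-len(marker)] + spoken

    return "Sujet: " + text
```

With these assumptions, read_subject("Subject: Re: Perdu des disquettes!") yields "Sujet: réponse: Perdu des disquettes!", matching the first example above.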
E-Mail Body Processing

Message Extraction

The e-mail preprocessor splits the body of the e-mail message into text-to-speech messages. This is done on the basis of a number of criteria, such as punctuation, capitalization, layout, intelligent abbreviation handling, etc.

The following examples illustrate some criteria for splitting the e-mail text into text-to-speech messages:

Using sentence-final punctuation and capital letters

  Veuillez renvoyer cette information le plus vite possible.
  Mon adresse est nathalief@rft.com.
  Merci bien!

Using layout

  Cette semaine:
  1) Gagnez un voyage a Moscou
  2) La nouvelle creme anti-rides
  3) Les starlettes de Cannes.

Using intelligent abbreviation handling

  Je vous prie de contacter prof F. Dumas.
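Two of the criteria above (sentence-final punctuation followed by a capital letter, and abbreviation handling) can be sketched as follows. This is a rough approximation under stated assumptions, not the EMPP algorithm: the function name split_messages and the abbreviation list are hypothetical, a single capital letter followed by a period is treated as an initial, and the layout criterion is not covered.

```python
import re

# Hypothetical abbreviation list; the list used by the EMPP is not documented here.
ABBREVIATIONS = {"prof", "dr", "m", "mme", "mlle"}

# Sentence-final punctuation, white space, then a capital letter (lookahead).
BOUNDARY = re.compile(r"[.?!]+\s+(?=[A-ZÀ-ÖØ-Ý])")


def split_messages(text):
    """Split body text into text-to-speech messages."""
    messages = []
    start = 0
    for match in BOUNDARY.finditer(text):
        words = text[start:match.start()].split()
        last_word = words[-1] if words else ""
        is_single_period = match.group().strip() == "."
        is_abbreviation = last_word.lower() in ABBREVIATIONS
        is_initial = len(last_word) == 1 and last_word.isupper()
        if is_single_period and (is_abbreviation or is_initial):
            continue  # the period belongs to an abbreviation, not a sentence end
        messages.append(text[start:match.end()].strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        messages.append(tail)
    return messages
```

Because a boundary requires white space after the punctuation mark, the periods inside an address such as nathalief@rft.com do not split the message.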
Text Normalization

An e-mail message typically contains e-mail specific text features, such as e-mail addresses, URLs, file names, emoticons, etc. The EMPP transforms these e-mail specific features into a format that complies with the rules of the standard text normalization of the ScanSoft Text-To-Speech system.

The following are examples of e-mail specific text normalization:

Support for multiple e-mail address formats

  Lu Chong/LHS/IEP/BE
  dumont@immo.paris.fr
  Sally Smith at IepXchgPO

Support for URLs (Uniform Resource Locators)

  http://offre.qc.ca
  http://abg.grenet.fr/abg/jobs.html
  gopher://gopher.upenn.edu/11/lists

Support for file names

  ldb001.tse
  sysinfo.exe
  lipedu.xls

Processing of emoticons

  :-x    bisou
  :-O    aïe

Processing of overuse of punctuation

  Fais attention!!!!!!!!: INTERNET VIRUS!!!!!!!!
  Jamais de compliments! #%&#@$

becomes:

  Fais attention!!!. INTERNET VIRUS!!!
  Jamais de compliments.?
Normalization of lay-out lines (e.g. part of an e-mail signature); not active in spell mode. The following sequences of identical characters are not pronounced:

  - 10 or more identical digits
  - a word consisting of 5 or more identical US-ASCII encoded letters of the modern Latin alphabet
  - a sequence of 3 or more identical US-ASCII characters that are not letters, digits, sentence-final punctuation marks (.?!) or white space; e.g. '&', '#', '%', '*', '-'

  oooooooooooooooooooooooooooooo
  ---------------------------------------------------

will be removed.

Processing of Question/Answer (FAQ)

  Q. I have an e-mail address change. How can I ensure that I will continue to receive my EcoLink e-mail newsletter?
  A. Easy. Just send an e-mail to EcoLink@peach.ease.msoft.com. Be sure to include both your old address and your new address.

becomes:

  Question: I have an e-mail address change. How can I ensure that I will continue to receive my EcoLink e-mail newsletter?
  Answer: Easy. Just send an e-mail to EcoLink@peach.ease.msoft.com. Be sure to include both your old address and your new address.

Processing of inserted mail

  Roland> Si vous jouez avec des émissions
  Roland> diffusées, vous perdez une
  Roland> partie de l'image, ou vous la
  Roland> déformez en écrasant l'image.
  Roland> Aucune solution n'existera jamais à ce
  Roland> problème.
  Cecile> Pourtant ce n'était pas mon expérience.

becomes:

  Roland: Si vous jouez avec des émissions diffusées, vous perdez une partie de l'image, ou vous la déformez en écrasant l'image. Aucune solution n'existera jamais à ce problème.
  Cécile: Pourtant ce n'était pas mon expérience.

Language specific accents

The ScanSoft E-mail Preprocessor for French has a special function for restoring the specific French characters é, è, ê and the cedilla (ç) when they have been omitted. The following text without accents (é, è and ê) and cedilla:

  Il n'avait pas remarque que ce probleme se presentait un peu partout. Ca montre qu'il avait la tete ailleurs la semaine passee.

becomes:

  Il n'avait pas remarqué que ce problème se présentait un peu partout. ça montre qu'il avait la tête ailleurs la semaine passée.

English words

Since e-mail is an international medium, French e-mail messages will inevitably contain many English words, which may refer to the Internet, electronic mail, or software and hardware. This typical e-mail jargon is handled by the exceptions dictionary of the e-mail preprocessor. This dictionary is a lexicon of e-mail terminology and provides the Text-To-Speech system with an adequate French transcription or translation for a number of English words.

  Provider     /+pro.vaj. de+r
  hypertext    /+?i.per. tekst
  CD-ROM       /+ se.de. ROM
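The three lay-out line rules listed earlier in this section (10 or more identical digits, a word of 5 or more identical letters, 3 or more identical non-alphanumeric symbols) map naturally onto regular expressions. The following is a minimal sketch under stated assumptions, not the EMPP code: the name remove_layout_lines is hypothetical, "US-ASCII characters" in the third rule is taken to mean the ASCII punctuation and symbol characters, and spell mode is not modelled.

```python
import re
import string

# ASCII punctuation and symbols, minus the sentence-final marks . ? !
_SYMBOLS = "".join(c for c in string.punctuation if c not in ".?!")

LAYOUT_PATTERNS = [
    re.compile(r"([0-9])\1{9,}"),                       # 10 or more identical digits
    re.compile(r"\b([A-Za-z])\1{4,}\b"),                # word of 5 or more identical letters
    re.compile("([" + re.escape(_SYMBOLS) + r"])\1{2,}"),  # 3 or more identical symbols
]


def remove_layout_lines(text):
    """Remove character runs that, per the rules above, are not pronounced."""
    for pattern in LAYOUT_PATTERNS:
        text = pattern.sub("", text)
    return text
```

Runs of sentence-final punctuation such as "!!!" are deliberately left alone here, since the guide describes them under a separate rule (overuse of punctuation).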
Customizing the E-Mail Preprocessor

The e-mail preprocessor supports the standard ScanSoft Text-To-Speech SDK user dictionary mechanism, which allows the user to customize the output of the e-mail preprocessor. The user dictionary is consulted during both header processing and body processing. For more information on how to build and use user dictionaries, see the User Configuration chapter of the Programmer's Guide.

Customizing the E-Mail Header

The user dictionary is consulted during header processing while reading the From field and the Subject field.

From Field

The From field consists of either

  a) a proper name
  b) a proper name and an address
  c) an address

a) If the From field contains a proper name only, the name is passed on to the user dictionary. If the lookup is successful, the proper name is substituted by the replacement string. If not, the name is further processed by the header reading module.
If the user dictionary contains the following line:

  John    /+ dzon

the following From field:

  From: John Leblanc

becomes:

  Message envoyé par: */+ dzon*/+ Leblanc

b) If the From field contains a proper name and an address, the EMPP first passes the address to the user dictionary. If the lookup is successful, both the proper name and the address are substituted by the replacement string. If not, the EMPP passes the proper name to the user dictionary. If this lookup is successful, the name and the address are substituted by the replacement string. If not, the name is further processed by the header reading module. The address will not be read by the Text-To-Speech system.

If the user dictionary contains the following lines:

  pdupont@scansoft.com    Pierre, mon ami sportif
  Berthelots              le chef
  Mortier                 /+ mor.'tir

the following From fields:

  From: pdupont@lhs.be (P. Dupont)
  From: berthelot@imag.fr (Berthelot)
  From: "Alex Mortier" <Mortier@ALPHA.ee.upenn.edu>

become:

  Message envoyé par: Pierre, mon ami sportif
  Message envoyé par: le chef
  Message envoyé par: Alex */+ mor.'tir*/+
c) If the From field contains only an address, the complete address is looked up in the user dictionary. If the lookup is successful, a proper name is added to the From field. If not, only the domain part is sent to the user dictionary. The EMPP first calls the dictionary for the complete domain part. If the lookup is successful, the complete domain part is substituted by the replacement string. Otherwise, the EMPP cuts off the leftmost sublevel domain and repeats the lookup and matching procedure for the remainder of the domain part. If the lookup is successful, the remainder of the domain part is substituted by the replacement string. This procedure is repeated until the top level domain is encountered. If none of the lookups is successful, the address is further processed by the header reading module.

If the e-mail user dictionary contains the following lines:

  jtrenet@duripex.com    petit Jacques
  duripex.com            Duripex

the following From fields:

  From: jtrenet@duripex.com
  From: dlecompte@duripex.com

become:

  Message envoyé par: petit Jacques
  Message envoyé par: d lecompte at Duripex

NOTE
To allow correct processing of the From field, the replacement string in the e-mail user dictionary should not contain an address or a domain.

Subject Field

Each word in the Subject field is sent to the user dictionary. If the lookup is successful, the replacement string is sent directly to the Text-To-Speech system. If not, the Subject field is further processed by the header reading module.
If the user dictionary contains the following lines:

  DITA             d.i.t.a.
  Massachusetts    /+ ma.sa.tsu.'sets

the following Subject fields:

  Subject: rapport du projet DITA
  Subject: fait pas trop mauvais dans le Massachusetts

are pronounced:

  Sujet: rapport du projet d.i.t.a.
  Sujet: fait pas trop mauvais dans le */+ ma.sa.tsu.'sets */+

Customizing the E-Mail Body

When a user dictionary has been loaded, the EMPP calls the dictionary for every word of the e-mail body. If the word is found in the user dictionary, it is substituted by the replacement string. If not, the body is further processed by the e-mail body processing module.

If the user dictionary contains the following line:

  strcpy    /+ strink.ko.'pi

the word "strcpy" in the following sentence:

  Ajoute un strcpy à ton programme, ca pourrait aider.

is replaced by the corresponding string found in the e-mail user dictionary:

  Ajoute un */+ strink.ko.'pi */+ à ton programme, ça pourrait aider.
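The lookup order described earlier for a From field that contains only an address (complete address first, then the complete domain, then the domain with its leftmost sublevel removed, and so on) can be sketched as below. This is an illustrative sketch, not the EMPP implementation. The function name lookup_from_address is hypothetical, the user dictionary is modelled as a plain Python dict with exact-match keys, only addresses of the form name@domain are handled, and the sketch assumes that the bare top level domain is never looked up on its own, which the guide does not state explicitly.

```python
def lookup_from_address(address, user_dictionary):
    """Return (kind, replacement) for the first successful lookup, else None.

    kind is "address" when the complete address matched and "domain" when a
    (possibly shortened) domain part matched.
    """
    if address in user_dictionary:
        return ("address", user_dictionary[address])

    _, _, domain = address.partition("@")
    labels = domain.split(".")
    # Try the complete domain, then drop the leftmost sublevel each time.
    # Assumption: stop before the top level domain would be tried alone.
    for start in range(len(labels) - 1):
        candidate = ".".join(labels[start:])
        if candidate in user_dictionary:
            return ("domain", user_dictionary[candidate])

    return None  # left to the header reading module
```

With the Duripex dictionary from the example, the address dlecompte@duripex.com falls through to the domain entry, while jtrenet@duripex.com is matched as a complete address.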
RealSpeak Telecom SDK

Chapter III SSML Preprocessor

User's Guide for French V4.0
SSML Preprocessor

Introduction

SSML (Speech Synthesis Markup Language) is part of a set of markup specifications by the W3C for voice browsers. General information regarding the RealSpeak SSML processor can be found in the SSML Support chapter of the Programmer's Guide.

The RealSpeak Telecom SDK provides a built-in preprocessor that supports a large portion of the SSML 1.0 September 2004 Recommendation (REC). Moreover, RealSpeak extends SSML with a number of ScanSoft specific elements and attributes. The set supported by ScanSoft is called ScanSoft SSML (4SML).

The section below describes the language-specific SSML support included in the RealSpeak Telecom V4.0 French language version.

French specific SSML markup

XML encoding types for French

The encoding is specified in the XML text declaration ("<?xml?>") by the encoding declaration, which is of the form encoding="<encodingname>". E.g.

  <?xml version="1.0" encoding="utf-8"?>

RealSpeak Telecom V4.0 French supports:

  - Windows-1252 and ISO-8859-1 (ISO Latin1)
  - The Unicode encodings UTF-8, UTF-16 and UCS-4 (note that the alias "ISO-10646-UCS-4" is not supported)
  - Any coded character set supported by the ICU component, as long as the input text only contains characters that can be transcoded to the native coded character set, which is Windows-1252. For more information about the character sets supported by ICU, see the ICU website http://www-306.ibm.com/software/globalization/icu and http://www.iana.org/assignments/character-sets.
NOTE
Encoding names are parsed case-insensitively; hyphens and underscores are ignored.
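The note on encoding names amounts to a simple normalization step before comparison. The sketch below illustrates it in Python; it is not the RealSpeak parser. The names normalize_encoding_name and is_listed_encoding are hypothetical, and the set only contains the encodings named explicitly in this section (encodings available through ICU are not modelled).

```python
def normalize_encoding_name(name):
    """Lower-case the name and drop hyphens and underscores."""
    return name.lower().replace("-", "").replace("_", "")


# Encodings named explicitly in this guide; ICU-provided ones are omitted.
LISTED_ENCODINGS = {
    normalize_encoding_name(name)
    for name in ("Windows-1252", "ISO-8859-1", "UTF-8", "UTF-16", "UCS-4")
}


def is_listed_encoding(name):
    """True if the name matches one of the explicitly listed encodings."""
    return normalize_encoding_name(name) in LISTED_ENCODINGS
```

Under this normalization "utf-8", "UTF_8" and "Utf8" all compare equal, while the unsupported alias "ISO-10646-UCS-4" does not match "UCS-4".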
4SML Specifics for French

For reasons of compatibility with the standard French system, the parallel text control sequence (<esc> sequence) is listed where applicable. This way, similar TTS behavior can be obtained from, or combined with, text input that is not SSML-driven.

The tables below list each 4SML tag with its comment and, where applicable, the corresponding control sequence.

High-level and document structure tags

  xml:lang
      fr-fr for French. Attribute of speak, paragraph, sentence and voice.

Text normalization tags

  <say-as interpret-as="xxx">
      Limited support in e-mail mode. In e-mail mode the only supported interpret-as value is spell.

  4SML tag                                                                  Corresponding control sequence
  <say-as interpret-as="number" format="cardinal">                          <esc>\tn=number_cardinal\
  <say-as interpret-as="number" format="digits">                            <esc>\tn=number_digits\
  <say-as interpret-as="number" format="decimal">                           <esc>\tn=number_decimal\
  <say-as interpret-as="number">                                            <esc>\tn=number\
  <say-as interpret-as="number" format="ordinal">                           <esc>\tn=number_ordinal\
  <say-as interpret-as="number" format="telephone">                         <esc>\tn=number_telephone\
  <say-as interpret-as="number" format="telephone" detail="punctuation">    <esc>\tn=number_telephone_punctuation\
  <say-as interpret-as="ordinal">                     <esc>\tn=ordinal\
  <say-as interpret-as="acronym">                     <esc>\tn=acronym\
  <say-as interpret-as="acronym" detail="strict">     <esc>\tn=acronym_strict\
  <say-as interpret-as="measure">                     <esc>\tn=measure\
  <say-as interpret-as="letters">                     <esc>\tn=letters\
  <say-as interpret-as="letters" detail="strict">     <esc>\tn=letters_strict\
  <say-as interpret-as="words">                       <esc>\tn=words\
  <say-as interpret-as="date">                        <esc>\tn=date\
  <say-as interpret-as="date" format="mdy">           <esc>\tn=date_mdy\
  <say-as interpret-as="date" format="dmy">           <esc>\tn=date_dmy\
  <say-as interpret-as="date" format="ymd">           <esc>\tn=date_ymd\
  <say-as interpret-as="date" format="ym">            <esc>\tn=date_ym\
  <say-as interpret-as="date" format="my">            <esc>\tn=date_my\
  <say-as interpret-as="date" format="dm">            <esc>\tn=date_dm\
  <say-as interpret-as="date" format="md">            <esc>\tn=date_md\
  <say-as interpret-as="date" format="y">             <esc>\tn=date_y\
  <say-as interpret-as="date" format="m">             <esc>\tn=date_m\
  <say-as interpret-as="date" format="d">             <esc>\tn=date_d\
  <say-as interpret-as="time">                              <esc>\tn=time\
  <say-as interpret-as="time" format="h">                   <esc>\tn=time_h\
  <say-as interpret-as="time" format="hm">                  <esc>\tn=time_hm\
  <say-as interpret-as="time" format="hms">                 <esc>\tn=time_hms\
  <say-as interpret-as="duration" format="hms">             <esc>\tn=duration_hms\
  <say-as interpret-as="duration" format="hm">              <esc>\tn=duration_hm\
  <say-as interpret-as="duration" format="ms">              <esc>\tn=duration_ms\
  <say-as interpret-as="duration" format="h">               <esc>\tn=duration_h\
  <say-as interpret-as="duration" format="m">               <esc>\tn=duration_m\
  <say-as interpret-as="duration" format="s">               <esc>\tn=duration_s\
  <say-as interpret-as="duration">                          <esc>\tn=duration\
  <say-as interpret-as="currency">                          <esc>\tn=currency\
  <say-as interpret-as="telephone">                         <esc>\tn=telephone\
  <say-as interpret-as="telephone" detail="punctuation">    <esc>\tn=telephone_punctuation\
  <say-as interpret-as="address">                           <esc>\tn=address\
  <say-as interpret-as="spell">                             <esc>\tn=spell\
  <say-as interpret-as="name">                              <esc>\tn=name\
  <say-as interpret-as="net" format="email">    <esc>\tn=net_email\
  <say-as interpret-as="net" format="uri">      <esc>\tn=net_uri\
  <say-as interpret-as="net">                   <esc>\tn=net\

Pronunciation tags

  <phoneme alphabet="unipa">
      See the section "The French L&H+ and UNIPA phonetic alphabets" for an overview of the alphabet.
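In the text normalization table, every control sequence name is the interpret-as value followed, where present, by the format and detail values, joined with underscores. The Python sketch below only illustrates that naming pattern. The function name control_sequence is hypothetical, "<esc>" is kept as the literal placeholder used in this guide for the escape character, and the function does not check whether a given combination actually appears in the table.

```python
def control_sequence(interpret_as, fmt=None, detail=None):
    """Build the <esc>\\tn=...\\ control sequence name for a say-as tag.

    Hypothetical helper: it reproduces the naming pattern of the table
    and performs no validation of the attribute values.
    """
    parts = [interpret_as]
    if fmt:
        parts.append(fmt)
    if detail:
        parts.append(detail)
    return "<esc>\\tn=" + "_".join(parts) + "\\"
```

For instance, control_sequence("number", "telephone", "punctuation") produces the same text as the table entry for <say-as interpret-as="number" format="telephone" detail="punctuation">.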
RealSpeak Telecom SDK

Chapter IV Custom G2P Dictionaries

User's Guide for French V4.0
Custom G2P Dictionaries

Introduction

ScanSoft's RealSpeak system now offers support for custom G2P dictionaries. A custom G2P dictionary module is an add-on module specifically designed to improve the quality of pronunciation for specific kinds of words.

The French system is currently not designed to support the use of a custom G2P dictionary module.
RealSpeak Telecom SDK

Appendices

User's Guide for French V4.0
Appendix A: French voice and language strings

The RealSpeak Telecom Text-To-Speech system now supports selecting the voice and language via a string as well as via a define (see the definition of the function TtsInitialize(Ex)() in the Programmer's Guide and also the Backwards Compatibility Guide for details). The name strings for the currently supported French voices are listed in the table below.

French Voice Name Strings

  Voice       Name String
  Virginie    Virginie

The string to use to set the language to French is "French".