VoiceXML Programmer's Guide



Table of Contents

Preface
Audience, Conventions, How to Use This Guide, References

1. Getting Started
VoiceXML, Environment, Tags and Elements, Simple Example, Documents, Applications, Dialogs, Properties, Grammars, Events, Links, Universal Commands and Grammars, Procedural Logic, User Interaction, Flow of Execution, Explicit Transition, Recognition-Triggered Transition, Subdialogs, Collecting Input and Playing Prompts

2. Forms
Form Items, Form-Item Variables, Execution of a Form, User Interaction

3. Event Handling
Predefined Events, Default Event Handlers, Application-Defined Event Handlers, Events in Subdialogs, Throwing Events, Application-Defined Events

4. Fetching and Caching Resources
How Fetching and Caching Work, Requests and Responses, Using Multiple Caches, Fundamentals of Controlling Caching, What You Can Control from VoiceXML, Prefetching Resources, Prefetch Cache Details, Fetch Hints, Restrictions on Prefetching, Handling Fetching Delays, Timeouts, Background Audio, Queued Prompts when Fetching, Controlling the Use of Cached Resources, Maximum Age, Maximum Stale Time, Caching, Mimicking Response Headers, Submitting Complex JavaScript Objects

5. Using Multiple-Recognition
Multiple Recognition Features, N-Best Recognition, Multiple Interpretations, Combining the Features, Working with Multiple Recognition, Using N-Best Recognition, Enabling N-Best Recognition, Checking for Multiple Utterances,

Selecting an Utterance, Example, Using Multiple Interpretations, Enabling Multiple Interpretations, Checking for Multiple Interpretations, Selecting an Interpretation, Example, Using Both Features Together, Enabling Both Features, Checking for Multiple Results, Selecting a Result, Simple Example, Generating a Subdialog

6. Controlling Outbound Calls
Call Status, Limitations on Outbound Calls, Interactions Without an Outbound Call, Interactions During a Transfer, Without a Transfer Grammar, With Transfer Grammars, Interactions During a Dialed Call, Putting a Dialed Call on Hold, Listening to a Dialed Call, With Listen Grammars, Without a Listen Grammar, Interrupting a Dialed Call, VoIP and Outbound Calls

7. Go-Back Facility
Retracting User Responses, Go-Back Stack, Stack Entries, Go-Back Destinations, Menus, Mixed-Initiative Forms, Input Items, Enabling the Go-Back Facility,

Activating the Universal Grammar, Setting the Minimum Stack Size, Controlling Go-Back Behavior, Suppressing Retraction, Customizing Go-Back, Using the Go-Back Facility, Selecting the Minimum Stack Size, Using Blocks, Using Subdialogs, Using Variables

8. TTS and Recorded Voice Selection
Specifying TTS Voices, Supported TTS Voices, Property syntax, Property, Property Example, Errors, Voice tag Syntax, Voice tag description, Specifying Recorded Voices, Supported Recorded Voices, Example, Lists of Fallback Voices, Rationale, Algorithm for Selecting the Voice, Selecting Voice Example, Best Practices for Voice Selection, Overriding Recorded Voices

9. Dynamic SSML
Introduction, Using Dynamic SSML, Examples and Notes, Errors, SSML Document, Extensions to the SSML spec

10. SOAP Client Facility
Locating and Identifying SOAP Services, Calling SOAP Methods, Type conversion, SOAP Headers, SOAP Methods are JavaScript objects, Error Handling For WSDL-Based Services, Error Handling for Non-WSDL-Based Services

11. Tags
Tag Summary, Tag Index, Tag s
<assign> <audio> <bevocal:connect> <bevocal:dial> <bevocal:disconnect> <bevocal:enroll> <bevocal:foreach> <bevocal:hold> <bevocal:listen> <bevocal:register> <bevocal:verify> <bevocal:whisper> <block> <break> <catch> <choice> <clear> <data> <disconnect> <div> <dtmf> <else> <elseif> <emp> <emphasis>

<enumerate> <error> <example> <exit> <field> <filled> <foreach> <form> <goto> <grammar> <help> <if> <initial> <item> <lexicon> <link> <log> <mark> <menu> <meta> <metadata> <noinput> <nomatch> <object> <one-of> <option> <p> <paragraph> <param> <phoneme> <prompt> <property> <pros> <prosody> <record> <reprompt> <rethrow>

<return> <rule> <ruleref> <s> <say-as> <sayas> <script> <send> <sentence> <speak> <sub> <subdialog> <submit> <tag> <throw> <token> <transfer> <value> <var> <voice> <vxml>

12. Properties
Property Summary, Property Index, Property s
audiofetchhint audiomaxage audiomaxstale bargein bargeintype bevocal.audio.capture bevocal.audio.outputvolume bevocal.dtmf.flushbuffer bevocal.fetchaudio.allfetches bevocal.fetchaudio.extend bevocal.fetchaudio.flushqueue bevocal.fetchaudio.sounds

bevocal.finaltimeout bevocal.goback bevocal.grammar.interpretationtype bevocal.grammar.phoneticpruning bevocal.grammar.weightfactor bevocal.grammar.wordtransitionpenalty bevocal.hotwordmax bevocal.hotwordmin bevocal.incrementerroronnsp bevocal.locale bevocal.logging bevocal.maxdialogerrors bevocal.maxerrors bevocal.maxinterpretations bevocal.mingoback bevocal.securelogging.enabled bevocal.securelogging.key bevocal.security.key bevocal.sounds.listening bevocal.sounds.maskrecognitionlatency bevocal.sounds.recognition bevocal.transfer.terminatetones bevocal.utterance.prefix bevocal.voice.name bevocal.vxml.maxrecognitionlatency caching completetimeout confidencelevel datafetchhint datamaxage datamaxstale documentfetchhint documentmaxage documentmaxstale fetchaudio fetchaudiodelay fetchaudiominimum

fetchtimeout grammarfetchhint grammarmaxage grammarmaxstale incompletetimeout inputmodes interdigittimeout maxnbest maxspeechtimeout recordutterance recordutterancetype scriptfetchhint scriptmaxage scriptmaxstale ssmlfetchhint ssmlmaxage ssmlmaxstale sensitivity speedvsaccuracy termchar termtimeout timeout universals

13. Variables
Variable Summary, Variable Index, Variable s
_event _message application.lastaudio$ application.lastresult$ session.bevocal.timeincall session.bevocal.version session.iidigits session.telephone.ani session.telephone.dnis

14. JavaScript Functions and Objects
JavaScript Constants
bevocal.outboundrequestid bevocal.sessionid _addheader bevocal.cookies.addclientcookie bevocal.cookies.deleteclientcookie bevocal.cookies.getclientcookie bevocal.cookies.getclientcookies bevocal.enroll.removeenrolledphrase bevocal.getproperty bevocal.getversion bevocal.log bevocal.soap.servicefromwsdl bevocal.soap.servicefromendpoint bevocal.soap.locateservice bevocal.soap.soapexception bevocal.soap.soapfault bevocal.soap.faultdetails

Preface

VoiceXML is a markup language for writing telephone-based speech applications. This document describes BeVocal VoiceXML, which is compliant with the W3C VoiceXML Version 2.0 Specification.

Audience

This document is for software developers using the BeVocal Café development environment. It assumes you are familiar with the basic concepts of HTML.

Conventions

Italic font is used for:
- Introducing terms that will be used throughout the document
- Emphasis

Bold font is used for headings.

Fixed width font is used for:
- Code examples
- Tags and attributes
- Values or text that must be typed as shown
- Filenames and pathnames

Italic fixed width font is used for:
- Variables
- Prototypes or templates; what you actually type will be similar in format, but not the exact same characters as shown

How to Use This Guide

Part I of this guide explains how to use VoiceXML features. A new application developer typically reads these chapters completely and in order.
- Chapter 1, Getting Started, introduces VoiceXML and its major features.
- Chapter 2, Forms, describes VoiceXML forms.
- Chapter 3, Event Handling, describes events that can be thrown during the execution of a VoiceXML application and how events are handled.
- Chapter 4, Fetching and Caching Resources, explains how an application can control the way VoiceXML documents and other resources are fetched and cached.

- Chapter 5, Using Multiple-Recognition, describes how the BeVocal VoiceXML interpreter can provide multiple recognition results.

Part II of this guide explains how to use Extended VoiceXML features. A new application developer typically reads only the chapters that are relevant to the application being built.
- Chapter 6, Controlling Outbound Calls, describes the BeVocal VoiceXML call-control features, an extension to VoiceXML.
- Chapter 7, Go-Back Facility, describes the BeVocal VoiceXML go-back facility, an experimental extension to VoiceXML.
- Chapter 8, TTS and Recorded Voice Selection, describes the BeVocal VoiceXML TTS and Recorded Voice Selection facility, an experimental extension to VoiceXML.
- Chapter 9, Dynamic SSML, describes the BeVocal VoiceXML Dynamic SSML facility, an experimental extension to VoiceXML.
- Chapter 10, SOAP Client Facility, describes the BeVocal VoiceXML SOAP Client facility, an experimental extension to VoiceXML.

Part III of this guide provides reference descriptions of the various components of the VoiceXML language. Application developers typically do not read these chapters from start to finish, but instead use them to look up information about the various tags, properties, and so on.
- Chapter 11, Tags, describes the tags that make up VoiceXML.
- Chapter 12, Properties, describes the properties that can be set to control the behavior of a VoiceXML application.
- Chapter 13, Variables, describes predefined variables that are available in VoiceXML applications.
- Chapter 14, JavaScript Functions and Objects, describes predefined JavaScript functions that are available in VoiceXML applications.

References

For additional or related information, you can refer to:
- VoiceXML Version 2.0 Specification. VoiceXML Forum.
- VoiceXML Tag Summary. BeVocal.
- Grammar Reference. BeVocal.
- JavaScript Quick Reference. BeVocal.

PART 1: Using VoiceXML

This part explains how to use VoiceXML features:
- Chapter 1, Getting Started
- Chapter 2, Forms
- Chapter 3, Event Handling
- Chapter 4, Fetching and Caching Resources
- Chapter 5, Using Multiple-Recognition


1. Getting Started

VoiceXML is a markup language derived from XML for writing telephone-based speech applications. Users call applications by telephone. They listen to spoken instructions and questions instead of viewing a screen display; they provide input using the spoken word and the touchtone keypad instead of entering information with a keyboard or mouse.

This chapter describes:
- VoiceXML
- User Interaction
- Flow of Execution
- Collecting Input and Playing Prompts

VoiceXML

Just as a web browser renders HTML documents visually, a VoiceXML interpreter renders VoiceXML documents audibly. You can think of the VoiceXML interpreter as a telephone-based voice browser. As with HTML documents, VoiceXML documents have web URIs and can be located on any web server. However, a standard web browser runs locally on your machine, whereas the VoiceXML interpreter runs remotely, for example at the VoiceXML hosting site, and you use your telephone to access it.

Environment

To support a telephone interface, the VoiceXML interpreter runs within an execution environment that includes a telephony component, a text-to-speech (TTS) speech-synthesis component, and a speech-recognition component. The VoiceXML interpreter transparently interacts with these infrastructure components as needed. For example:
- Text strings in output elements are rendered using TTS.
- Connection issues (picking up the incoming call, detecting a hang-up, transferring a call) are handled by the telephony component.
- Listening to spoken input from the user and identifying its meaning is handled by the speech-recognition component.

Tags and Elements

VoiceXML uses markup tags and plain text. A tag is a keyword enclosed by the angle bracket characters (< and >). A tag may have attributes inside the angle brackets. Each attribute consists of a name and a value separated by an equal sign (=), and the value must be enclosed in quotes.

Tags occur in pairs; corresponding to the start tag <keyword> is the end tag </keyword>. Between the start and end tag, other tags and text may appear. Everything from the start tag to the end tag is called an element. For example, the following three lines constitute a prompt element:

    <prompt>
      What is your telephone number?
    </prompt>

If there are no other tags or text between the start and end tag, a syntactic shorthand is permitted. You can precede the closing angle bracket (>) of the start tag with a slash (/) and omit the end tag. For example, instead of writing a value element as:

    <value expr="result"></value>

you can use the shorthand notation:

    <value expr="result"/>

Because the syntax specifies the end of each element, the VoiceXML interpreter can check that the entire document has been received.

If one element contains another, the containing element is called the parent element of the contained element. The contained element is called a child element of its containing element. The parent element may also be called a container.

Although both HTML and VoiceXML use markup tags, the two languages use tags differently. Whereas the markup tags in HTML describe how to render the data, the markup tags in XML (and consequently in VoiceXML) describe the data itself. This allows an XML interpreter or browser to display the data in whatever way is appropriate.

BeVocal VoiceXML generally complies with the VoiceXML 2.0 Specification. It also includes several extensions that you can use if you choose. VoiceXML Tag Summary lists any differences between BeVocal VoiceXML and the standard.

Tip: VoiceXML conforms to XML standards; the formats for VoiceXML tags are more strictly defined than are the formats in HTML. If you are used to HTML and not XML, remember that all container elements require end tags and all attribute values must be in quotes.
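As a small illustration of the terminology above, the following fragment annotates each construct. It is only a sketch: the field name phone and the prompt wording are made up for illustration and do not come from this guide.

```xml
<!-- <field> is the parent (container) of the <prompt> and <filled> elements below -->
<field name="phone" type="phone">   <!-- start tag with two attributes; each value is quoted -->
  <prompt>
    What is your telephone number?
  </prompt>                         <!-- end tag that matches the <prompt> start tag -->
  <filled>
    <!-- <value .../> uses the empty-element shorthand: no separate end tag -->
    <prompt>You said <value expr="phone"/></prompt>
  </filled>
</field>                            <!-- end tag that closes the parent element -->
```

Here <prompt> and <filled> are child elements of <field>, and <value> is a child of the inner <prompt>.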
Simple Example

In VoiceXML, the <form> element is analogous to an HTML form that contains items for the user to enter. In VoiceXML forms, each logical piece of information to be collected from the user is identified with a <field> tag. The form in the following example collects one piece of information from the user. Once this information is obtained, execution proceeds to the field's <filled> element.

Other tags used in the example include the following:
- The <script> tag specifies a block of client-side JavaScript code.
- The <var> tag declares a variable to be used within the form.
- The <prompt> tag produces audio output for the user.
- The <assign> tag assigns a value to a variable.
- The <value> tag evaluates an expression and produces spoken output of the result.

This example requests a number from the caller, computes the factorial of that number, and repeats the answer to the caller.

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "...">
    <vxml version="2.0" xmlns="...">
      <script> <![CDATA[
        function factorial(n) { return (n <= 1) ? 1 : n * factorial(n - 1); }
      ]]> </script>
      <form id="computefactorial">
        <var name="result"/>
        <field name="num" type="number">
          <prompt>Please say a number</prompt>
          <filled>
            <assign name="result" expr="factorial(num)"/>
            <prompt>The factorial of <value expr="num"/> is <value expr="result"/></prompt>
          </filled>
        </field>
      </form>
    </vxml>

The VoiceXML document contains no explicit instructions about how to present the prompt "Please say a number" or how to present the results. In theory, these could be presented textually on a different kind of browser. In practice, the example document is run as a telephone application and results in conversations such as the following.

    Application: Please say a number.
    User: 4
    Application: The factorial of 4 is 24.

Documents

An executable VoiceXML file is called a document. The VoiceXML interpreter loads a document file to execute it. Every VoiceXML document must start with header information that conforms to the XML standard:

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//W3C//DTD VOICEXML 2.0//EN" "...">
    <vxml version="2.0" xmlns="...">

These headers describe the language in which the document is written:

- The first tag indicates that the document is an XML document. This tag is required. Always use this tag exactly as specified; it must be the very first characters in the document. To be a legal XML document, the first 4 characters of any XML file (including a VoiceXML document) must be:

    <?xm

  No characters, not even whitespace characters such as space or newline, can come before these 4 characters in a VoiceXML document.

- The second tag identifies the Document Type Definition (DTD), which is used to validate that the contents represent valid VoiceXML. This tag is optional. A DTD describes the format of the data that might appear in an XML document. That is, the DTD defines the valid tags by specifying what attributes each tag can have and what child tags or other content each tag can contain.

  If your document contains only standard VoiceXML elements, you can use the DTD shown above. If you use any of the BeVocal VoiceXML extensions to VoiceXML, you'll need to use the BeVocal DTD. In this case, you replace the DOCTYPE element with the following:

    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "...">

  You should include a DOCTYPE declaration during development, as it allows better error checking by the interpreter. You may remove it during deployment for performance.

- The third tag identifies the version of VoiceXML used in this document and the designated namespace for VoiceXML. This tag is required. For VoiceXML 2.0, this tag should always include these 2 attributes. It can also include optional attributes described in the section on the <vxml> tag.
Apart from headers and possibly comments, all the content in a VoiceXML document is contained within a <vxml> element, that is, between the <vxml> start tag and the </vxml> end tag.

Applications

A VoiceXML application consists of one or more documents. Any multidocument application has a single application root document. Each document in an application identifies the application root document with the application attribute of the <vxml> tag:

    <vxml version="2.0" xmlns="..." application="myapproot.vxml">

Whenever the interpreter executes a document, it loads that document. If the document specifies an application root document, that document is also loaded.
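A minimal sketch of a two-document application may help here. The file name main.vxml, the variable greeting, and the form id are hypothetical. The namespace URI is the one defined by the W3C VoiceXML 2.0 Recommendation rather than a value taken from this guide, and the optional DOCTYPE declaration is omitted. The root document does not itself carry an application attribute:

```xml
<?xml version="1.0"?>
<!-- File: myapproot.vxml (application root document) -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Declared in the root document, so it is visible to every document in the application -->
  <var name="greeting" expr="'Welcome back'"/>
</vxml>
```

A second document names the root in its <vxml> tag and can then read the variable:

```xml
<?xml version="1.0"?>
<!-- File: main.vxml (hypothetical document that belongs to the application) -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
      application="myapproot.vxml">
  <form id="hello">
    <block>
      <!-- application.greeting refers to the variable declared in myapproot.vxml -->
      <prompt><value expr="application.greeting"/></prompt>
    </block>
  </form>
</vxml>
```

When the interpreter executes main.vxml, it would load myapproot.vxml as well, which is what makes the greeting variable available.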

You can use an application root document for global items or interactions that you want to be active throughout the application. For example, suppose the application root document myapproot.vxml declares a variable named company that has an initial value of BeVocal:

    <vxml version="2.0" xmlns="...">
      <var name="company" expr="'BeVocal'"/>
      ...

This variable has application scope. That is, any document in the application can use the variable.

Dialogs

Within a document, a user interacts with dialogs, in which the application produces auditory output, typically asking for information. The user provides input by speaking or pressing keys on the telephone. User speech must be recognized and its meaning interpreted. The telephone key input is interpreted as a sequence of tones in the Dual Tone Multifrequency (DTMF) signalling system.

VoiceXML has two kinds of dialogs: forms and menus.
- A form interacts with the user to fill in a number of fields. Every field has an associated variable, called its input-item variable, or just input variable. Initially, the variable has a value of undefined. It is filled in when the speech-recognition engine recognizes a valid response in a user utterance.
  Note: In VoiceXML 1.0, an input-item variable was known as a field-item variable.
- A menu presents the user with a number of choices; it transitions to a different dialog based on the user's selection.

Forms

The VoiceXML <form> tag defines a form and the <field> tag defines a field in a form. You specify the name of the input variable with the name attribute of the <field> tag. You can use the input variable's name in expressions to refer to the stored value. In the example in the Simple Example section, the input variable is named num:

    <field name="num" type="number">

When the user says the number, the number is stored in the num variable. Then the interpreter proceeds to execute the field's <filled> element.
Here, the num variable in the <assign> element is evaluated before being passed as the parameter to the factorial function.

    <assign name="result" expr="factorial(num)"/>

Menus

The <menu> tag defines a menu; each choice consists of a <choice> element. The next attribute of a <choice> element specifies the destination dialog to which the interpreter should transition when the user selects that choice. If a <form> or <menu> element is to be the destination of a transition, the id attribute for the destination dialog should specify a unique identifier. For example, the following menu consists of three choices.

    <menu>
      <prompt> Please choose one of <enumerate/> </prompt>
      <choice next="#MovieForm"> local movies </choice>
      <choice next="localbroadcast.vxml#RadioForm"> local radio stations </choice>
      <choice next="..."> national TV listings </choice>
    </menu>

The prompt in this menu includes an <enumerate> tag. This tag lets you set up a template for an automatically generated description of the choices. By default, the <enumerate> template simply lists all the choices. In the above example, the prompt is "Please choose one of local movies, local radio stations, national TV listings."

The destination dialog specified by the next attribute can be in the current document or in a different document:
- If the user says "local movies", the interpreter transitions to the dialog named MovieForm in the same document.
- If the user says "local radio stations", the interpreter transitions to the dialog named RadioForm in the document localbroadcast.vxml.
- If the user says "national TV listings", the interpreter transitions to the first dialog in the document tv.vxml on the national TV web site.

Properties

You can set properties to customize the behavior of the interpreter. The <property> tag specifies the property to set and the value for that property. Various properties control how the interpreter behaves when prompting the user for input, recognizing speech or DTMF input, and fetching documents and other resources. For additional information, see Chapter 12, Properties.

Grammars

The speech-recognition engine uses grammars to interpret user input. See the Grammar Reference for details on creating and using grammars; this section covers only part of that information.
- Each field in a form can have a grammar that specifies the valid user responses for that field.
- An entire form can have a grammar that specifies how to fill multiple input variables from a single user utterance.
- Each choice in a menu has a grammar that specifies the user input that can select the choice.

A VoiceXML application can use built-in grammars and application-defined grammars.
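To make the Properties and Grammars descriptions above more concrete, here is a hedged sketch of a document that sets the standard timeout property at two levels and collects a yes/no answer in a field. The form id, field name, prompts, and time values are illustrative assumptions; the namespace URI is the one from the W3C VoiceXML 2.0 Recommendation.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Document-level setting: wait up to 5 seconds for the caller to begin responding -->
  <property name="timeout" value="5s"/>

  <form id="confirm">
    <!-- type="boolean" gives this field a grammar for positive and negative responses -->
    <field name="answer" type="boolean">
      <!-- A property set inside the field overrides the document-level value for this field only -->
      <property name="timeout" value="10s"/>
      <prompt>Do you want to hear the weather?</prompt>
      <filled>
        <if cond="answer">
          <prompt>Here is the weather.</prompt>
        <else/>
          <prompt>Goodbye.</prompt>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```

Setting a property at the narrowest level that needs it keeps one slow question from lengthening the timeout for every other field in the document.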

Built-in Grammars

The following basic grammars are built into all standard VoiceXML interpreters:

    Grammar Type    Description
    boolean         Recognizes a positive or negative response.
    currency        Recognizes an amount of money, in dollars.
    date            Recognizes a calendar date.
    digits          Recognizes a sequence of digits.
    number          Recognizes a number.
    phone           Recognizes a telephone number adhering to the North American Dialing Plan (with no extension).
    time            Recognizes a clock time.

BeVocal VoiceXML contains additional built-in grammars as an extension to standard VoiceXML:

    Grammar Type    Description
    airport         Recognizes an airport name or code, such as DFW or Dallas-Fort Worth.
    airline         Recognizes an airline name or code, such as AA or American Airlines.
    equity          Recognizes a company symbol or full name, such as IBM or Cisco Systems.
    citystate       Recognizes US city and state names, for example, Sunnyvale, California.
    stockindex      Recognizes the names of the major US stock indexes, such as Nasdaq.
    street          Recognizes a street name (with or without street number).
    streetaddress   Recognizes a street name and a street number.

You can reference a built-in grammar in either of two ways:

- You can use a standard built-in grammar as the type attribute of a <field> element. The example in the Simple Example section uses the built-in number grammar:

    <field name="num" type="number">

  This means that the speech-recognition engine tries to interpret what the user says as a number.

- You can use any built-in grammar (standard or BeVocal VoiceXML extension) in a <grammar> element by specifying the src attribute with a URI of the form:

    builtin:grammar/typename

  For example:

    <grammar src="builtin:grammar/boolean"/>

Application-Defined Grammars

Although the built-in grammars can be useful, you typically need to define your own grammars.
An application-defined grammar can be specified in the following forms:
- Augmented BNF (ABNF) form of the W3C Speech Recognition Grammar Format
- XML form of the W3C Speech Recognition Grammar Format
- Nuance Grammar Specification Language (GSL)
- Java Speech Grammar Format (JSGF)

A simple grammar can be defined in the document. An inline grammar is defined within the <grammar> element itself. For example, the following inline ABNF grammar matches the words add and subtract.

    <field name="operator">
      <grammar>
        #ABNF 1.0;
        root $op;
        $op = add | subtract;
      </grammar>
      ...

With this grammar, if the user says "add", the input variable operator is set to add.

More complex grammars can be written externally. An external grammar is defined in a file separate from the VoiceXML document file and is referenced by the src attribute of the <grammar> element. For example, the following field uses a grammar rule named Colors in an external XML grammar defined in the file partgrammar.grxml.

    <field name="part">
      <grammar src="..."/>

The named rule (Colors in the preceding example) is the one the interpreter will use to start recognition. The specified file may include other grammar rules, which may be used as subrules of this rule.

The grammar for a menu choice can be specified explicitly with a <grammar> child of the <choice> element. Alternatively, a grammar can be generated automatically from the choice text. If the accept attribute of the <menu> tag is set to approximate, the user can say a subset of the words in the choice text to select that choice. Adding this attribute to the preceding menu example allows the user to say "TV listings" or just "TV" to select the third choice:

    <menu accept="approximate">
      ...
      <choice ...> national TV listings </choice>
    </menu>

Note that the words must be spoken in the correct order; "listings, TV" would not be recognized. If you want some choices to be matched exactly and others to allow a subset of the words, you can specify the accept attribute on individual <choice> elements.

Active Grammars

The speech-recognition engine uses active grammars to interpret user input.
- A field grammar is active whenever the interpreter is executing that field.
- A menu-choice grammar is active whenever the interpreter is executing the containing menu.
- A form grammar is active whenever the interpreter is executing the containing form.

A form grammar or the collection of choice grammars in a menu can optionally be made active at higher scopes:
- A grammar with document scope is active whenever the interpreter is executing any dialog in the document.
- A grammar with application scope is active whenever the interpreter is executing any document in the application.
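The higher scopes described above can be sketched with the scope attribute that VoiceXML 2.0 defines on <menu>. In this illustrative document (the dialog ids, choices, and prompts are assumptions), the menu's choice grammars are given document scope, so they remain active while the caller is inside either form:

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- scope="document": the choice grammars stay active in every dialog of this document -->
  <menu id="main" scope="document">
    <prompt>Please choose one of <enumerate/></prompt>
    <choice next="#weather">weather</choice>
    <choice next="#sports">sports</choice>
  </menu>

  <form id="weather">
    <field name="again" type="boolean">
      <!-- Active here: this field's boolean grammar plus the two menu-choice grammars -->
      <prompt>It is sunny today. Would you like to hear that again?</prompt>
      <filled>
        <prompt>Okay.</prompt>
      </filled>
    </field>
  </form>

  <form id="sports">
    <block>
      <prompt>Here are the sports headlines.</prompt>
    </block>
  </form>
</vxml>
```

With the default dialog scope, saying "sports" while the weather form is asking its question would simply fail to match; with document scope, the menu choice is recognized and selected.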

If the interpreter is executing one dialog and the user's input matches an active grammar for a different dialog, control transfers to the latter dialog. If the grammar is in application scope, control might transfer to a dialog in a different document.

Note that within a field, you can temporarily turn off grammars from higher scopes by setting the field's modal attribute to true.

Events

The VoiceXML interpreter can throw a number of predefined events based on errors, telephone disconnects, or user input. For example:
- A no-input event is thrown if the user does not respond to a question.
- A no-match event is thrown when the user does not respond intelligibly; that is, when the user's utterance does not match any active grammar.
- A help event is thrown when the user requests help.
- An error event is thrown when any kind of error occurs.

An application can define additional events and can use a <throw> element to throw an event of a specified kind.

An application can catch an event and respond appropriately in an event handler. A <catch> element is a general-purpose event handler; its event attribute specifies the kinds of event that it handles. Additional event-handling tags are syntactic shorthand: <noinput>, <nomatch>, <help>, and <error>. Each of these shorthand tags catches one type of event, indicated by its name. For example, a <nomatch> element catches no-match events.

When an event is thrown, the associated event handler, if it exists, is invoked. If the handler does not cause the application to terminate, execution resumes in the element that was being executed when the event was thrown.

For more information, see Chapter 3, Event Handling.

Links

A link specifies a grammar that is independent of any particular dialog. A <link> element defines a link. Each <link> element contains a <grammar> element. A link's grammar is active in the scope of the element that contains the link.
For example, if the link is in a form, its grammar is active when the interpreter is executing that form. If a link is under a <vxml> element, its grammar has document scope; if the link is in the application root document, its grammar has application scope. Links in a <vxml> element can implement global behaviors.

A link can specify one of two possible actions to take if the speech-recognition engine detects a match for its grammar:
- The link can cause a transition to a different location; in that case, its next attribute specifies the destination of the transition. Links, like menu choices, can cause transitions to other dialogs or documents.
- The link can throw an event; in that case, its event attribute specifies the event to throw. After the event is handled, execution resumes with the element that was being executed when the link grammar was matched.

For example, the following link is defined at document level; its grammar is active whenever the interpreter is executing any dialog in the document. If the user says "operator", the link transfers control to a different document.

    <vxml version="2.0" xmlns="...">
      <link next="operator_xfer.vxml">
        <grammar type="application/x-nuance-gsl">
          operator
        </grammar>
      </link>
      ...

Universal Commands and Grammars

A universal command is always available; the user can give the command at any point in an interaction. A universal grammar specifies user utterances that can be recognized as a universal command.

Predefined Universal Grammars

The following predefined universal grammars are available to all applications:

    Grammar    Description
    help       The user asked for help.
    exit       The user asked to exit.
    cancel     The user asked to cancel the prompt that is playing.
    goback     The user wants to retract the last response and go back to an earlier part of the interaction.

If one of these predefined universal grammars is activated and a user utterance matches the grammar, an event of the same name is thrown. For example, a help event is thrown when the user says "help".

Application-Defined Universal Grammars

An application creates its own universal command by defining and enabling a new universal grammar and implementing its response to the command.

To define a universal grammar, set the universal attribute in the <grammar> tag that defines the grammar for the command. The attribute value is a name that uniquely identifies the grammar among all universal grammars in the application. In the following example, the new universal grammar is named joke; the user utterance "Tell me a joke" will be a universal command when this universal grammar is activated.

    <grammar universal="joke" type="application/x-nuance-gsl">
      (tell me a joke)
    </grammar>

Activating Universal Grammars

An application can activate any of the universal grammars to enable the corresponding universal commands.
When a universal grammar is activated, a user utterance that matches the grammar is treated as a universal command. All universal grammars are deactivated by default. The application can activate some or all universal grammars by setting the universals property. This property specifies which of the universal grammars should be active; all other universal grammars are deactivated. 14 VOICEXML PROGRAMMER S GUIDE

- Set the universals property to all to activate all universal grammars (both predefined and application-defined):

      <property name="universals" value="all" />

- Set the universals property to a space-separated list of grammars to activate those universal grammars and deactivate the others:

      <property name="universals" value="help goback joke" />

- Set the universals property to none to deactivate all previously activated universal grammars in the current scope:

      <property name="universals" value="none" />

Note: (VoiceXML 1.0 only) If the <vxml> tag's version attribute is 1.0, all universal grammars are activated by default.

Responding to Application-Specific Universal Grammars

A <link> element containing a universal grammar implements the application's response to the corresponding universal command. Your application can respond to the command in whatever manner is appropriate. Typically, the response is to throw an event or to transition to a different form. If you throw an application-specific event, you must provide an event handler to take the appropriate action. For example:

    <link event="joke">
      <grammar universal="joke" type="application/x-nuance-gsl"> (tell me a joke) </grammar>
    </link>
    <catch event="joke">
      <subdialog name="joker" src="telljoke.vxml"/>
    </catch>

Procedural Logic

You can use procedural logic, called executable content, within a few basic elements: <block>, <filled>, and event handlers. Within executable content, you can declare and assign values to variables, use simple conditional logic, perform iteration (a BeVocal VoiceXML extension), output speech or audio to the user, or run a JavaScript script.

Variables

Variables are declared by the <var> tag. Declarations can appear in a document, a form, or executable content. The <var> tag can optionally specify the variable's initial value; if it doesn't, the variable is initialized to undefined.

A variable has the scope of the element that contains its declaration:

- A variable has document scope if it is declared in a <vxml> element, or in a <block> or event handler that is a child of the <vxml> element. If the document is the application root document, then the variable has application scope. You can refer to a variable x with document scope either as x or document.x (for clarity or to resolve ambiguity). If the variable is in the application root document, then you can refer to it in other documents as application.x.
- A variable has dialog scope if it is declared in a <form> element, or in a <block> or <filled> element that is a child of a <form> element, or an event handler that is a child of a <form> or <menu> element. You can refer to a variable x with dialog scope either as x or dialog.x.
- A variable has an anonymous scope, local to a field, if it is declared in an event handler or <filled> element that is a child of a <field> element.

If a <var> element specifies a variable that is already in scope, it does not declare a new variable with the same name, but simply assigns a value to the existing variable. If the <var> element has an expr attribute, the variable is assigned the specified value; otherwise, the variable is assigned the value undefined.

You can set a variable's value with the <assign> tag.

VoiceXML variables are in all respects equivalent to JavaScript variables; they are part of the same namespace. For additional information, see "Scripts".

Conditional Logic

You can use an <if> element to execute a block of code if a condition is satisfied. Within that element, you can use a sequence of <elseif> elements to execute alternative blocks of code if all previous conditions failed and the condition of the <elseif> element is satisfied. You can use an <else> element to execute an alternative block of code if all previous conditions failed.

The conditions in <if> and <elseif> elements are expressed as Boolean-valued JavaScript expressions.

Tip: If your JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;.

Iteration

You can use the BeVocal VoiceXML extension <bevocal:foreach> to execute the contained elements once for each element of a specified array.

Audio Output

A <prompt> or <reprompt> element generates speech output; an <audio> element plays a prerecorded audio clip. The <value> tag evaluates an expression and produces spoken output of the result.

Prompts can appear in executable content as well as in elements for collecting user input. Anywhere a <prompt> is valid, text is interpreted as a prompt even if the enclosing <prompt> and </prompt> tags are omitted.

Each input item, and the <initial> item of a mixed-initiative form, has a prompt counter that lets you play different prompts if the user revisits the item several times. For example, you may want to play shorter descriptions after the first or second time the user is prompted for the same information. The prompt counters are reset on each form invocation.
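The following is a minimal sketch, not taken from the original guide, that combines the variable declarations, conditional logic, escaping rule, and prompt counters described above. The form name, variable names, prices, and prompt wording are all illustrative assumptions, and the field assumes that the platform supports the built-in number grammar.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Document scope: can be referenced as maxTickets or document.maxTickets (hypothetical name) -->
  <var name="maxTickets" expr="10"/>

  <form id="order">
    <!-- Dialog scope: declared without expr, so its initial value is undefined -->
    <var name="total"/>

    <field name="quantity" type="number">
      <!-- Played the first time the field prompts the caller -->
      <prompt count="1">How many tickets would you like to buy?</prompt>
      <!-- Played on later prompts for this field, for example after a reprompt -->
      <prompt count="2">How many tickets?</prompt>

      <filled>
        <!-- The comparison operators are written with escape sequences -->
        <if cond="quantity &lt; 1">
          <prompt>You must buy at least one ticket.</prompt>
          <clear namelist="quantity"/>
        <elseif cond="quantity &gt; maxTickets"/>
          <prompt>You can buy at most <value expr="maxTickets"/> tickets.</prompt>
          <clear namelist="quantity"/>
        <else/>
          <!-- Assumes an illustrative price of 12 dollars per ticket -->
          <assign name="total" expr="quantity * 12"/>
          <prompt>That comes to <value expr="total"/> dollars.</prompt>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```

Clearing quantity in the two error branches makes the field eligible to be visited again, so the caller is asked for a new value.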

Scripts

A <script> element executes a JavaScript script, which is run in the scope of the parent element. A <script> element can also define functions that can be called by JavaScript expressions in the same scope.

VoiceXML variables are equivalent to JavaScript variables and are part of the same namespace. VoiceXML variables can be used in a script just as variables defined in a <script> element can be used in VoiceXML. Declaring a variable using a <var> element is equivalent to using a var statement in a <script> element.

If your JavaScript expression contains any of the characters <, >, or &, that character must be escaped. Inside a <script> element, you can do so in one of two ways. You can replace the individual characters with the corresponding escape sequence &lt;, &gt;, or &amp;. This may result in code that is difficult to read. Alternatively, you can place the entire script inside a CDATA section. For example, either of the following is correct:

    <script>
      function factorial(n) { return (n &lt;= 1)? 1 : n * factorial(n-1); }
    </script>

or

    <script>
      <![CDATA[
        function factorial(n) { return (n <= 1)? 1 : n * factorial(n-1); }
      ]]>
    </script>

You might argue that the second is a little easier to read.

User Interaction

VoiceXML supports both application-directed and mixed-initiative interactions with a user.

In an application-directed (or simply directed) interaction, the application prompts for the information it needs and the user supplies the requested information by answering the prompts. The application controls the interaction; the user cannot volunteer information. To be more accurate, the application does not understand volunteered information:

- If the application is executing a form, the only active grammar is the one for the current field of the form. The only valid user input is one that provides a value for the current field's variable.
- If the application is executing a menu, the only active grammars are the grammars of the menu's choices. The only valid user input is one that selects a choice for the current menu.

In a mixed-initiative interaction, the user and the application both participate in determining what the application does next. A single utterance from the user may provide input for multiple input variables in a form. In response to a prompt in one dialog, the user may provide input that matches a grammar defined in a different form. When this happens, the interpreter transitions to that dialog and fills its input variables from the user input. Similarly, the user may provide input that selects a choice from a different menu or that matches a link grammar, causing a transition to the destination specified by that choice or link.

If an application does not use links or grammars with document or application scope, it may still include mixed-initiative forms. A mixed-initiative form includes a form grammar. It can include an <initial> element to control the initial interaction in the form. This element can request user input or perform other non-interactive initialization tasks. In response to a prompt from the <initial> element, the user could provide input that fills in multiple input variables. If the form prompts for individual fields, any user input that matches the form grammar is valid even if that input does not fill in the field for which the user was just prompted.

Note: Fewer speech-recognition errors occur in directed interactions than in mixed-initiative interactions.

Flow of Execution

Execution within a VoiceXML document flows in document order until a dialog (form or menu) is entered. Execution flows from the current dialog to a different dialog or document, based on either:

- An explicit transition statement in the current dialog.
- Speech recognition in the current dialog that causes a transition to a different dialog.

In addition, execution can temporarily leave the current dialog to execute a subdialog, returning to the current dialog when execution of the subdialog is complete.

If the current dialog completes execution without transitioning to a different location, the application exits. In addition, you can use an <exit> element to end the application explicitly.

Explicit Transition

You can set up explicit transitions to other dialogs or documents in your application using <goto> or <submit> tags. These transition elements can be placed inside <block> or <filled> elements or event handlers.

The <goto> element lets you transition to another input item in the current form, to another dialog in the current document, or to another document. When you make the transition to the new location, the local variables from the old form or document are lost. This happens even if you transition to the same form you were in before. However, the values of local variables are not affected when you use <goto> to transition between items within a form.

The <submit> tag lets you pass values to another document using an HTTP GET or POST request. Since you use a URI to specify the next document, it need not be a VoiceXML document; for example, it could be a CGI script.
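As an illustration of the explicit transitions just described, here is a small sketch showing the three uses of <goto> and <submit> in one document. The form names, field names, and the example.com URI are hypothetical, and the fields assume that the platform supports the built-in digits and boolean grammars.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="start">
    <field name="account" type="digits">
      <prompt>Please say your account number.</prompt>
      <filled>
        <!-- Transition to another item in the same form; form variables keep their values -->
        <goto nextitem="confirm"/>
      </filled>
    </field>

    <field name="confirm" type="boolean">
      <prompt>Is <value expr="account"/> correct?</prompt>
      <filled>
        <if cond="confirm">
          <!-- Pass the value to a server-side resource with an HTTP POST request (hypothetical URI) -->
          <submit next="http://www.example.com/lookup" method="post" namelist="account"/>
        <else/>
          <!-- Transition to another dialog in this document; the variables of this form are lost -->
          <goto next="#goodbye"/>
        </if>
      </filled>
    </field>
  </form>

  <form id="goodbye">
    <block>
      <prompt>Goodbye.</prompt>
      <exit/>
    </block>
  </form>
</vxml>
```

The resource named in the <submit> element is expected to return the next document to execute; it does not have to be a static VoiceXML file.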
Recognition-Triggered Transition

User input to a dialog may cause a transition to a different location:

- If the speech-recognition engine matches the grammar of a menu's <choice> element that has a next or expr attribute, the interpreter transitions to the destination specified by that attribute.
- If the speech-recognition engine matches the grammar of a <link> element that has a next or expr attribute, the interpreter transitions to the destination specified by that attribute.
- If the speech-recognition engine matches a grammar with document or application scope that is defined in a different dialog, the interpreter transitions to that dialog.

Subdialogs

A subdialog is a reusable VoiceXML dialog that you can pass data to and get return values from:

- The current dialog passes control to a subdialog with a <subdialog> element. It can pass data to the subdialog with <param> elements inside the <subdialog> element.
- A subdialog returns control to the calling dialog with the <return> element. It can pass values back using the namelist attribute of the <return> element.
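The subdialog mechanism described above can be sketched as follows. This example is an assumption-laden illustration rather than code from the original guide: the form names, the region parameter, and the prompts are hypothetical, and the subdialog is placed in the same document only to keep the example self-contained.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="main">
    <!-- Pass control to the subdialog and hand it one value -->
    <subdialog name="result" src="#getzip">
      <param name="region" expr="'California'"/>
      <filled>
        <!-- Returned values are properties of the subdialog's form-item variable -->
        <prompt>You entered <value expr="result.zip"/>.</prompt>
      </filled>
    </subdialog>
  </form>

  <form id="getzip">
    <!-- The subdialog declares a variable for each parameter it receives -->
    <var name="region"/>
    <field name="zip" type="digits">
      <prompt>Please say your zip code in <value expr="region"/>.</prompt>
      <filled>
        <!-- Return control, and the value of zip, to the calling dialog -->
        <return namelist="zip"/>
      </filled>
    </field>
  </form>
</vxml>
```

In practice the subdialog would usually live in its own document so that several applications can reuse it.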

Collecting Input and Playing Prompts

At any moment, the VoiceXML interpreter is either waiting for input in an input item, such as a field, or transitioning between input items in response to some input. In this sense, input can be a spoken user utterance, a series of DTMF key presses, or an input-related event such as invalid input. What happens in the waiting and transitioning states is rather intertwined.

While waiting for input (also referred to as being in a recognition state), the interpreter is listening for and attempting to match spoken utterances or DTMF key presses against the currently active grammars. When the interpreter listens for speech input, it constantly compares the incoming audio stream to all active grammars, looking for a match. At some point after the user stops talking, the interpreter decides whether the input is valid. The timing for this is controlled by several properties; the properties are different for spoken grammars and for DTMF grammars. For details on how these properties interact, see Chapter 12, Properties.

While transitioning between input items, the interpreter completely ignores spoken utterances. If the property bevocal.dtmf.flushbuffer is set to false, then it does listen for DTMF key presses. It queues (or buffers) any key presses for the next recognition state and it keeps track of timing information for the key presses. The interpreter also queues asynchronously generated events that are not related directly to execution of the transition (such as the user hanging up).

During this transitioning state, prompts and audio are queued to be played and a program's executable content is run. Prompts get played either at the start of the next waiting state or sometimes when the interpreter goes off to fetch a resource, such as another document. For details on fetching resources, see Chapter 4, Fetching and Caching Resources.

At the beginning of a waiting state, there may be DTMF key presses queued during the previous transitioning state. By default, those key presses are not available for the waiting state to use for recognition. If you do want to use those keys, set the bevocal.dtmf.flushbuffer property to false.
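As a hedged illustration of the last point, the following sketch sets bevocal.dtmf.flushbuffer to false for a whole document so that keys pressed while the welcome block is executing remain available to the field that follows. Setting the property at document level, the form and field names, and the use of the built-in digits grammar are assumptions made for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Keep DTMF key presses that arrive between recognition states -->
  <property name="bevocal.dtmf.flushbuffer" value="false"/>

  <form id="login">
    <!-- Transitioning state: this prompt is queued while the block executes -->
    <block>
      <prompt>Welcome to the account line.</prompt>
    </block>

    <!-- Waiting state: keys buffered during the block can be used for recognition here -->
    <field name="account" type="digits">
      <prompt>Please enter your account number.</prompt>
    </field>
  </form>
</vxml>
```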


2. Forms

The main elements of a document (within the <vxml> element) are forms. VoiceXML forms are analogous to web forms; you use them to collect (voice) input from the user. This chapter describes:

- Form Items
- Form-Item Variables
- Execution of a Form
- User Interaction

Form Items

So far, the only form item we've discussed is the <field> element. However, forms can contain either input items or control items:

- Input items are elements for collecting user input or results. An input item is any one of the following:
  - A field, defined with the <field> tag, asks the user for a piece of information.
  - A record item, defined with the <record> tag, records what the user says (perhaps for a voice mail message).
  - A subdialog, defined with the <subdialog> tag, invokes a reusable dialog.
  - A transfer item, defined with the <transfer> tag, transfers the user to another telephone number.
- Control items are tags that can contain procedural items for audio output or computation. A control item is either of the following:
  - A block, defined with the <block> tag, is a container for procedural elements.
  - An initial item, defined with the <initial> tag, controls the initial interaction of a mixed-initiative form.

Form-Item Variables

Each form item has an associated form-item variable. When a form is entered, all form-item variables are initially undefined. When a form item is visited, its variable is set to the result of interpreting that form item. For example, visiting a <block> element sets its form-item variable to true.

The form-item variable for an input item is also called an input-item variable (or simply input variable); after an input item is visited, its input-item variable is set to the value collected from the user. For more details on setting form-item and input-item variables, see the Grammar Reference.
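To make the distinction between control items and input items concrete, here is a minimal sketch of a form that uses two blocks, a field, and a record item. It is an illustration only: the form name, variable names, prompts, and the 30-second recording limit are assumptions, and the field relies on the built-in boolean grammar.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="feedback">
    <!-- Control item: an unnamed block; its form-item variable is set to true once visited -->
    <block>
      <prompt>Thank you for calling the feedback line.</prompt>
    </block>

    <!-- Input item: the input-item variable satisfied is set to true or false -->
    <field name="satisfied" type="boolean">
      <prompt>Were you satisfied with our service?</prompt>
    </field>

    <!-- Input item: the input-item variable comment is set to the recording -->
    <record name="comment" beep="true" maxtime="30s">
      <prompt>Please leave a comment after the beep.</prompt>
    </record>

    <!-- Control item: visited after the items above have been filled -->
    <block>
      <if cond="satisfied">
        <prompt>We are glad to hear it. Goodbye.</prompt>
      <else/>
        <prompt>We are sorry to hear that. Goodbye.</prompt>
      </if>
    </block>
  </form>
</vxml>
```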

Execution of a Form

Within a form, the flow of execution is governed by the Form Interpretation Algorithm (FIA), a looping algorithm. On each iteration, the FIA selects the form item to visit next. A form item's guard conditions determine whether it can be selected on a given iteration:

- The value of the form-item variable must be undefined.
- The value of any cond expression contained in the form item must evaluate to true.

Both guard conditions must be met in order for a form item to be selected. The FIA examines the form items in document order, selecting the first one whose guard conditions are met. If the guard conditions for all form items fail, the form (and the application) exits.

By default, every form-item variable has an initial value of undefined, so every form item that does not specify a cond expression is eligible for selection. After the form item is visited, its variable is set to a value, which prevents the same form item from being selected again on the next iteration.

You can explicitly control the execution of any form item if you give its variable a name and an initial value other than undefined. Doing so prevents the form item from being eligible for selection until you explicitly use the <clear> tag to reset its variable. Typically, input-item variables are given names but control-item variables are not.

User Interaction

User interaction with a form can be directed or mixed initiative.

A directed form has no form grammar, only grammars for its individual fields. A directed form gives the user explicit directions about what to say and when. For example, a directed form might result in the following dialog:

    Application: Would you like to buy, sell, or receive a stock quote?
    User:        Get a quote.
    Application: What stock or stocks would you like a quote for?
    User:        Intel.

A form that includes its own grammar is a mixed-initiative form. The form grammar allows several input variables to be filled in as a result of a single user utterance.
A mixed-initiative form allows the user to speak more naturally. For example, a mixed-initiative form might result in the following dialog:

    Application: Stock assistant here. How can I help you?
    User:        I'd like to get a quote for Intel.

One disadvantage of mixed-initiative forms is that form grammars are more complicated and can result in more recognition errors.

The grammar for a field sets a value for the field's variable. For example, the grammar in the following field, specified in ABNF, assigns the value june to the variable month if the user says "June".

    <field name="month">
      <grammar>
        #ABNF 1.0;
        root $month;
        $month = june | july | august;
      </grammar>
    </field>

The grammar for a form must specify both the input variable to be set by a grammar rule and the value for that variable. For example, the ABNF grammar in the following file, foo.gram, sets values for two variables, quantity and fruit:

    #ABNF 1.0;
    root $main;
    $main = [$amount] $fruit
          | $amount [$fruit]
          | $amount $fruit
          ;
    $amount = one { quantity=1 }
            | two { quantity=2 }
            | three { quantity=3 }
            ;
    $fruit = (apple | apples) { fruit=apples }
           | (orange | oranges) { fruit=oranges }
           ;

This grammar is used by the following mixed-initiative form:

    <form id="foo">
      <grammar src="foo.gram#main"/>
      <initial>
        <prompt>How many apples or oranges do you want?</prompt>
      </initial>
      <field name="fruit">
        <grammar src="foo.gram#fruit"/>
        <prompt>Do you want apples or oranges?</prompt>
      </field>
      <field name="quantity">
        <grammar src="foo.gram#amount"/>
        <prompt>How many <value expr="fruit"/> do you want?</prompt>
      </field>
      <filled>
        <prompt> Ok, you want <value expr="quantity"/> <value expr="fruit"/> </prompt>
      </filled>
    </form>
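Returning to the guard conditions described under Execution of a Form, the following sketch shows both of them at work: a cond expression that skips a field, and a named block whose initial value keeps it from being selected until <clear> resets it. The form, the item names, and the prompts are hypothetical, and the fields assume the built-in boolean grammar.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="ship">
    <!-- Named block with an initial value: not eligible until its variable is cleared -->
    <block name="restart" expr="true">
      <prompt>Let's start over.</prompt>
    </block>

    <field name="gift" type="boolean">
      <prompt>Is this order a gift?</prompt>
    </field>

    <!-- Guarded by cond: selected only if the caller answered yes to the previous question -->
    <field name="wrap" type="boolean" cond="gift">
      <prompt>Would you like gift wrapping?</prompt>
    </field>

    <field name="ok" type="boolean">
      <prompt>Shall I place the order?</prompt>
      <filled>
        <if cond="!ok">
          <!-- Reset the variables so that the FIA can select these items again -->
          <clear namelist="restart gift wrap ok"/>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```

If the caller declines the order, every listed variable becomes undefined; the FIA then selects the restart block first because it comes first in document order, and works through the fields again.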


3. Event Handling

The VoiceXML interpreter can throw a number of predefined events based on errors, telephone disconnects, or user requests. You can also throw events you define that are specific to your application. When an event is thrown, the associated event handler, if it exists, is invoked. Then execution resumes in the element that was being executed when the event was thrown.

This chapter describes:

- Predefined Events
- Default Event Handlers
- Application-Defined Event Handlers
- Events in Subdialogs
- Throwing Events
- Application-Defined Events

Predefined Events

The following standard events are predefined:

    Event                            Description
    exit                             The user asked to exit.
    help                             The user asked for help.
    noinput                          The user did not provide timely input.
    nomatch                          The user did not provide meaningful input.
    cancel                           The user asked to cancel the prompt that is being played.
    connection.disconnect.hangup     The user hung up. New in VoiceXML 2.0.
    connection.disconnect.transfer   The user's call was transferred. New in VoiceXML 2.0.

The following additional events are defined as BeVocal VoiceXML extensions:

    Event                                   Description
    goback                                  The user wants to retract the last response and go back to an earlier part of the interaction. See Chapter 7, Go-Back Facility.
    connection.far_end.busy                 The number for an outbound telephone call was busy.
    connection.far_end.disconnect           An outbound telephone call was disconnected because the called third party hung up. Outbound telephone calls are described in Chapter 6, Controlling Outbound Calls.
    connection.far_end.disconnect.timeout   An outbound telephone call exceeded its maximum allowed duration.
    connection.far_end.noanswer             An outbound telephone call was not answered within the time allowed for making the connection.

The following standard errors are predefined:

    Event                                Description
    error.badfetch                       An error occurred while the interpreter was fetching a document or resource.
    error.noauthorization                The user is not authorized to perform the requested action.
    error.semantic                       A runtime error occurred in the VoiceXML code.
    error.connection.baddestination      The destination URI for an outbound telephone call was invalid.
    error.connection.noauthorization     An attempt was made to place an unauthorized outbound telephone call, for example, one that exceeds the maximum allowed duration.
    error.connection.noresource          An audio input or output resource is unavailable.
    error.noresource                     An audio input or output resource is unavailable.
    error.unsupported.format             The requested resource format is not supported.
    error.unsupported.element            The requested element is not supported (for example, error.unsupported.subdialog).

The following additional errors are defined as BeVocal VoiceXML extensions:

    Event                                    Description
    error.internal                           A serious internal error occurred in the interpreter.
    error.bevocal.maxdialogerrors_exceeded   The maximum number of speech errors was exceeded in a particular execution of a particular form.
    error.bevocal.maxerrors_exceeded         The maximum number of speech errors was exceeded during the call.

Note: In a VoiceXML 2.0 document (when the value of the version attribute of the <vxml> tag is 2.0), the telephone.disconnect.* and error.telephone.* events have been changed to connection.disconnect.* and error.connection.*. See above.

Backward Compatibility with VoiceXML 1.0. The following predefined events are still supported in VoiceXML 1.0:

    Event                           Description
    telephone.disconnect.hangup     The user hung up.
    telephone.disconnect.transfer   The user's call was transferred.

The following predefined errors are still supported in VoiceXML 1.0:

    Event                              Description
    error.telephone.baddestination     The destination URI for an outbound telephone call was invalid.
    error.telephone.noauthorization    An attempt was made to place an unauthorized outbound telephone call, for example, one that exceeds the maximum allowed duration.
    error.telephone.noresource         A telephone resource is unavailable, for example because the application tried to make an outbound telephone call while another outbound call was active.

Default Event Handlers

The BeVocal interpreter provides the following default event handlers for the predefined events and errors:

    Event                          Handler
    exit                           Exit the interpreter.
    help                           Play a default audio help message and reprompt. The help message says: "No help available right now."
    noinput                        Play a default audio message and reprompt. The message says: "I'm sorry, I didn't hear you."
    nomatch                        Play a default audio message and reprompt. The message says: "I'm sorry, I didn't understand you."
    cancel                         Stop playing audio.
    error                          Exit the interpreter.
    connection.disconnect.hangup   Exit the interpreter. New in VoiceXML 2.0.
    goback                         Undo whatever actions resulted from the last response, then prompt the user for a new response.
    All others                     Play a default audio error message and exit the interpreter.

Backward Compatibility with VoiceXML 1.0: The following predefined event handler is still supported for VoiceXML 1.0.

    Event                         Handler
    telephone.disconnect.hangup   Exit the interpreter.

Application-Defined Event Handlers

Although the system provides default handlers for the predefined events, you can override these handlers by providing your own event handlers in any element that can throw an event. The <catch>, <error>, <help>, <noinput>, and <nomatch> elements are event handlers.

An element in which an event may be thrown also inherits event handlers defined in its ancestor elements. For example, an event thrown within a field element may be caught by a handler in that element, or in its form, or in its document, or in its application. This inheritance of event handlers allows you to provide consistency in event handling by defining handlers at a higher level.

The method by which event handlers are inherited from ancestor elements is called "as if by copy" semantics in the VoiceXML 2.0 specification. It helps to think of the appropriate event handler literally being copied into the scope where the event was thrown. Variable references are resolved relative to the scope of the element where the event was thrown, and URL references are resolved relative to the document from which the event was thrown. For example, if you have a <catch> handler in an application root document, which is in a different directory from the main document that threw the event, URLs in the handler will be resolved relative to the directory of the main document. Resolving URLs relative to the originating document is considered 2.0 behavior and applies only when the <vxml> tag's version attribute is set to "2.0" or greater.

Form items contain event counters that let you respond differently if the same event is thrown multiple times. For example, you may want to provide more details each time the user asks for help. The event counters are reset on each form invocation.

When an event occurs, its counter is used to select applicable event handlers:

1. All handlers in the scope in which the event occurred and its containing scopes are considered.
2. A handler for the event is eligible if its count attribute is less than or equal to the event's counter.
3. Those eligible handlers with the highest count are selected as applicable (more than one handler may have the same highest count).
4. The applicable handlers are ordered by scope, with the innermost event handlers first; within a given scope, the applicable handlers are examined in the order in which they occur in the VoiceXML document.

The first applicable handler in this ordering is selected to handle the event.

You can set up event handlers that catch all events with a given prefix (for example, error.unsupported). Note, however, that the interpreter selects a handler based on count, scope, and document order only. A more specific handler does not take precedence. For example, if an error.unsupported.format event is thrown and the first applicable handler is for all events beginning with the prefix error.unsupported, that handler will be invoked even if the next applicable handler is for the specific event error.unsupported.format.

Within an event handler, the _event variable contains the name of the event currently being handled; the _message variable contains the message string that provides additional information about the event. If no message was supplied when the event was thrown, the _message variable is undefined.
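The selection rules above can be illustrated with a short sketch. It is an example written for this discussion, not code from the original guide: the form, field, and prompt text are hypothetical, and the field assumes the built-in boolean grammar. It shows two <help> handlers distinguished by count, a document-level handler for a whole event prefix, and the _event and _message variables.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Document scope: handles every event whose name begins with error.unsupported -->
  <catch event="error.unsupported">
    <log>Caught <value expr="_event"/>: <value expr="_message"/></log>
    <prompt>Sorry, that feature is not available.</prompt>
    <exit/>
  </catch>

  <form id="weather">
    <field name="wanted" type="boolean">
      <prompt>Would you like to hear the weather report?</prompt>

      <!-- Selected while the help counter for this field is 1 -->
      <help count="1">
        Please say yes or no.
        <reprompt/>
      </help>

      <!-- Selected once the help counter reaches 2, because it then has the highest eligible count -->
      <help count="2">
        Say yes to hear today's weather report, or say no to skip it.
        <reprompt/>
      </help>
    </field>
  </form>
</vxml>
```

On the third and later requests for help, the count="2" handler is still selected, because no handler with a higher count is eligible.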

Tips:

- Always set up default <help>, <nomatch>, and <noinput> messages of your own, at top-level scope. For example:

      <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
        <help> I'm sorry. There's no help available here. <reprompt/> </help>
        <noinput> I'm sorry. I didn't hear anything. <reprompt/> </noinput>
        <nomatch> I didn't get that. <reprompt/> </nomatch>
        ...

- If you want to execute both an event handler in an inner scope and a handler for the same event in an outer scope, the inner handler can use a <rethrow> element to rethrow the event.

Events in Subdialogs

A subdialog must catch any event that is thrown while the subdialog is being executed. If no handler for the event is found in the subdialog's execution context, a fatal error occurs, causing the interpreter to exit.

(VoiceXML 1.0 only) In VoiceXML 1.0, when a subdialog throws an event, the result depends on whether the subdialog is modal. Subdialogs are modal by default; a subdialog can be made non-modal by setting the modal attribute to false.

- If an event is thrown within a modal subdialog and no handler for the event is found in the subdialog's execution context, a fatal error occurs, causing the interpreter to exit.
- If an event is thrown within a non-modal subdialog and no handler for the event is found in the subdialog's execution context, the interpreter causes the subdialog's context to return and rethrows the event in the calling context, restarting the search for the event handler in that context.

Note: In VoiceXML 2.0, all subdialogs are modal.

Throwing Events

An application can throw events as follows:

- A <throw> element throws an event; it can occur within executable content, that is, in a <block> or <filled> element, or an event handler.
- A <link> element can specify an event to be thrown when the link's grammar is matched.
- A <choice> element in a menu can specify an event to be thrown when the choice's grammar is matched.
- A <return> element in a subdialog can specify an event to be thrown after control returns to the calling dialog.

Application-Defined Events

An application can define additional events implicitly. If a tag that throws an event specifies an event other than one of the predefined events, it implicitly defines the specified event. For example, the following tag implicitly defines an event named myevent and throws that event.

    <throw event="myevent"/>

An application can use a <catch> element to catch and handle an application-defined event. For example:

    <catch event="myevent">
      ...
    </catch>
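Putting the two fragments above together, the following sketch throws an application-defined event with a message and handles it in the enclosing form. The event name limit.exceeded, the form and field names, and the 500-dollar limit are assumptions chosen for illustration; the field relies on the built-in number grammar.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="withdraw">
    <!-- Form-level handler for the application-defined event; the field below inherits it -->
    <catch event="limit.exceeded">
      <prompt>Sorry. <value expr="_message"/></prompt>
      <!-- Reset the input variable so that the field is visited again -->
      <clear namelist="amount"/>
    </catch>

    <field name="amount" type="number">
      <prompt>How many dollars would you like to withdraw?</prompt>
      <filled>
        <if cond="amount &gt; 500">
          <!-- Implicitly defines the event limit.exceeded and throws it with a message -->
          <throw event="limit.exceeded" message="The limit is five hundred dollars."/>
        <else/>
          <prompt>Withdrawing <value expr="amount"/> dollars.</prompt>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```

Because no message is required, a handler that reads _message should be prepared for it to be undefined when the event is thrown without one.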

43 4 Fetching and Caching Resources You can think of the VoiceXML interpreter as a telephone-based web browser. As with HTML documents, VoiceXML documents have Web URIs and can be located on any Web server. In addition to VoiceXML documents, a VoiceXML application can use several different types of files, including recorded audio data, speech and DTMF grammars, and XML data (this last is a BeVocal extension). Some data may be obtained from streaming sources, servers responding to form requests, CGI script output, and so on. We follow the World Wide Web Consortium s convention of using the term resources to refer collectively to files, streams, and other data sources. All of these resources can be accessed with standard Web URIs and can be located on any Web server. One significant difference between an HTML application intended for a standard Web browser and a VoiceXML application intended for your telephone is that a standard Web browser runs locally on your machine, whereas the VoiceXML interpreter does not run on your telephone but runs remotely, for example at the VoiceXML hosting site. On every call to a VoiceXML application, all of the resources needed by that application may need to be retrieved (or fetched) from a location other than where the VoiceXML interpreter runs. Those resources may then be stored locally at the hosting site (that is, cached) by the VoiceXML interpreter for later use by the same or a different application on another call. This chapter discusses concepts related to retrieval and caching of your application s resources: How Fetching and Caching Work What You Can Control from VoiceXML Prefetching Resources Handling Fetching Delays Controlling the Use of Cached Resources Submitting Complex JavaScript Objects This chapter describes the mechanics of this process. For information on how take advantage of these features when designing your application for best performance, see the VoiceXML Performance Guide. 
How Fetching and Caching Work

The BeVocal VoiceXML interpreter follows the VoiceXML 2.0 and Hypertext Transfer Protocol (HTTP) 1.1 standards for fetching and caching resources. The HTTP standard in particular provides a lot of flexibility in how fetching and caching can be implemented. This discussion does not describe all of the details, not even all of the details that apply to VoiceXML. It merely gives an overview of the basic model and the most common ways to use this model in VoiceXML.

Note: Fetches made using secure HTTPS are not cached in the proxy server.

Requests and Responses

Basically, any application that follows the HTTP 1.1 standard for fetching and caching sends a request to a server for a particular resource and then gets a response back from that server. The request consists of a request type (typically a GET request for VoiceXML requests), a URI that specifies what resource is wanted, and various headers that specify details about what will be acceptable in the returned resource.

The response consists of a response code containing information about the type of response, various headers with details about what can be done with the resource, and possibly a body containing the actual resource.

As anyone who has waited for the download of a graphic-intensive Web page knows, fetching resources over the Web can be a time-consuming process. Consequently, it's a very good idea to avoid, as much as possible, going out over the Web to get new copies of things that have not changed since the last time you got them. This is where caching comes in; the fundamental idea of caching is to avoid sending a body in the response unless absolutely necessary.

Because of the importance of caching to reduce the time involved in fetching resources, many of the headers (both request and response headers) contain information about how old the resource is, how old the resource can be and still be fresh or unexpired, and what restrictions, if any, there are on caching the resource.

For standard HTML applications, you can directly set request and response headers. For VoiceXML applications, on the other hand, you still directly set response headers, but you use VoiceXML attributes and properties to set request headers. The VoiceXML interpreter translates the VoiceXML attributes and properties into the appropriate HTTP request headers.

Using Multiple Caches

Your standard Web browser typically sets aside some amount of disk space on your machine for caching Web pages. This means that as you explore back and forth amongst the multiple pages of a Web site, the browser does not have to download every page every time you visit it. This corresponds to a single phone call to a VoiceXML application. In theory, the VoiceXML interpreter could cache resources for a single phone call; however, single phone calls don't tend to be long enough for this to be worth the overhead. What is most assuredly worth the overhead is...
If you open multiple Web browsers on your machine at the same time, you can visit the same or different Web pages in each Web browser. These multiple browser instances can share the same cache for Web pages. That is, pages that are cached by one instance are available for use by the other instances. Correspondingly, at a VoiceXML hosting site, a single VoiceXML Media Gateway may service multiple phone calls at the same time. All of those phone calls share the same VoiceXML interpreter cache. So, resources downloaded for an application on one phone call may be available to the next phone call to that or to a different application on the same Media Gateway.

Many of us use the Web through our company's connection to it. Many companies are set up with proxy servers that sit between the company's machines and the outside world of the World Wide Web. A proxy server may store Web pages and then have those Web pages available to any machine that goes through that server out to the Web. Thus, even if you haven't visited a site from your machine, if someone else in your company has visited that site, its pages may be stored in the proxy server cache and be available to you from that cache instead of going out over the Web. While not as fast as if it were stored on your machine, the proxy cache is frequently still faster than downloading the page from the Web.

A BeVocal hosting site typically contains multiple VoiceXML Media Gateways, to facilitate handling more calls simultaneously. Just as your company uses a proxy cache for Web pages, a hosting site uses a proxy cache for VoiceXML resources. So, if anybody has used the Horoscopes application, its pages may be available in the persistent site-wide proxy cache. When a new call comes in for that application, that call may be able to use the pages from this cache, rather than going out and downloading the information again.
The following diagram shows how requests flow from your application through the various levels of caching and finally out to other servers on the Web.

[Figure: request/response flow. Calls to VoiceXML applications (App 1, App 2, App 3) arrive at VoiceXML Media Gateways inside the BeVocal hosting site (many machines). Each Media Gateway (1 machine) has its own VoiceXML Interpreter Cache. The gateways exchange requests and responses with the site-wide proxy cache, which in turn exchanges requests and responses with your server and other servers on the World Wide Web.]

When a user interacts with your application, she typically calls a phone number that is associated with a particular BeVocal hosting site (such as the one that hosts BeVocal Café applications). That site starts a local copy of the VoiceXML interpreter for your application and runs your application on one of its Media Gateways, retrieving resources as needed from wherever they live. As noted earlier, the resources for your application may live on multiple servers throughout the Web. Even if your resources are hosted at the hosting site, however, they must be fetched by the interpreter when the user calls.

Note: There is a naming ambiguity you should be aware of. Every phone call to a BeVocal hosting site gets its own local copy of the VoiceXML interpreter; it does not share that interpreter with other phone calls. However, the VoiceXML Interpreter Cache is a cache for an entire VoiceXML Media Gateway, not for an individual phone call into that Media Gateway. An individual instance of a VoiceXML interpreter (that is, a single phone call) does not have a completely separate cache.

At any one time, a lot of different users may be using a lot of different applications all at the same BeVocal hosting site. Each call will use a different set of resources; sometimes these resources overlap and sometimes they do not. For example, if 10 users simultaneously call the same Horoscopes application, they all need access to common documents and audio files for that application.
If at the same time another 5 people call a Sports application, that second group will need a similar set of resources for the Sports application, but these resources probably won't overlap much with the resources needed by the group calling the Horoscopes application.

At its simplest, if all resources are fresh forever, the sequence would be as follows:

- The first time a call accesses a particular resource, the interpreter generates the request, looks in the local VoiceXML interpreter cache, then in the site-wide proxy cache, and finally goes out and retrieves the resource from the appropriate server on the Web.

- When a successful response comes back from that server, the resource is first stored in the site-wide proxy cache, then passed to the local interpreter cache where it is also stored, before finally being used by the running application.
- Later in the same call, if the same resource is requested again, the interpreter gets it directly from the local interpreter cache, without having to download it again from the proxy cache, let alone from the original server.
- When that call ends, the copy remains both in the local VoiceXML interpreter cache for the VoiceXML Media Gateway and in the site-wide proxy cache.
- If another call comes in a few minutes later to the same Media Gateway and asks for the same resource, that call can use the copy in the local cache.
- If another call comes in to a different Media Gateway and asks for the same resource, that call does not have the resource in its local cache, but can use the resource copy in the proxy cache.

Sounds relatively simple, doesn't it? Things get complicated for a variety of reasons:

- Sometimes the server providing the resource wants to control for how long or even whether the information is cached. For example, for a resource that has security implications, the server may instruct the BeVocal platform not to cache the resource.
- Most resources are time sensitive, for some time period. For example, if the resource is the current stock price for some company, that resource may only be valid at the time requested. Or, if the resource is an audio file containing a daily horoscope, it may be valid for a single 24-hour period. On the other hand, if the resource is an audio file corresponding to the application's main menu, it may be theoretically valid for weeks or even months.
- Conversely, sometimes the VoiceXML application may want to control when to use cached information.
For example, you may not have control over the server that stores your audio files and so cannot say when those audio files are no longer useful. In that case, you'd want the VoiceXML application to provide this information.

Typically, you do not need to be concerned with the differences between the VoiceXML interpreter's local cache and the proxy cache for the entire site. The facilities available for controlling caching within a VoiceXML application do not distinguish which cache you are controlling; they talk about whether a resource can be cached at all and for how long it can be cached. Those settings apply to all relevant caches within the BeVocal platform and to any relevant caches out on the Web, for example, at the application's Web server.

Also note that this discussion only addresses the caches that are present on the VoiceXML side. The remote server may have its own caches and proxies for storing resources before sending them to your application.

For simplicity, most of the rest of this document just talks about the cache, without distinguishing between the local interpreter cache and the proxy cache.

Fundamentals of Controlling Caching

As we've said, you control caching with the HTTP response and request headers; that is, either on the responding server or from the requesting application. You can specify control information in both of these places. Typically, however, unless you're a caching expert, for a single type of resource (VoiceXML document, grammar file, audio file, SSML file, or XML file), you should decide whether you want to control that type of resource from the server or from the application. Things can get complicated if you try to control a single resource from both ends at once.

The primary concepts for controlling caching are the freshness of a resource (whether or not it has expired) and the time interval during which you can use a resource, based on its freshness or on when it was originally fetched.

Application Control

If you control a resource completely from your VoiceXML application, the normal HTTP fetch sequence for a GET request is basically as follows:

1. The first time the resource is requested, the request goes to the origin server; that is, it goes to the server on which the resource resides.
2. The origin server returns a response.
3. The date and time of receipt (the fetch date) are recorded by the requestor, the resource is stored in the cache, and the resource is returned to the requesting application. The response always includes the Date header, indicating when the response was generated. It might include a Last-modified header, indicating when the resource was last changed on the server. The response may include an Etag header, which is a way of uniquely identifying the actual content of the resource.
4. Right now, we're assuming that the server has not specified any expiration information, so the resource expires immediately.
5. By default, the very next time the resource is requested, the interpreter must make a new request for the resource.
6. The new request can include one or both of the If-Modified-Since header, set to the fetch time of the cached resource, and the If-None-Match header, set to the cached Etag. If both are sent, the If-None-Match header takes precedence; if neither is present, the request must be for a completely new copy of the resource. Etag and If-None-Match are both HTTP 1.1 features. HTTP 1.1 servers give them precedence over the If-Modified-Since header. However, HTTP 1.0 servers use If-Modified-Since.
7. The server uses these headers to determine whether a new copy of the resource needs to be sent in the body of the response or whether it should simply indicate that the requesting application can use the copy in its cache.
8. If the response includes a new copy of the resource, the new copy and its headers replace the old copy in the cache.
9. Even if the response does not include a new copy of the resource, the cached header information can change. For example, the server can, and often does, send a new expiration date for the resource. Even if the server does not update the expiration date, the VoiceXML cache updates the default expiration time to the new fetch time.

That sequence of events also applies to a POST request, but only if the response includes an Expires or Cache-Control header.

When controlling caching from your application, you change step 5 above by specifying how stale the resource can be. That is, you can use the Cache-Control: max-stale request header to indicate a number of seconds after the expiration of a resource during which your application can use the expired resource. (Remember that you use VoiceXML attributes and properties to set this header information; see Maximum Stale Time on page 44.)

Here it helps to remember the difference between the local VoiceXML interpreter cache and the site-wide proxy cache. If a particular call needs a resource and that resource is already in its local cache, fetched within max-stale seconds, the interpreter doesn't need to send a request anywhere. However, if the resource is either not in the local cache at all or the local copy is too old, a request is sent to the proxy cache, to see if it contains a copy fetched within max-stale seconds. If the proxy cache does not contain an appropriately dated resource, the proxy cache sends a request to the server for a new copy.
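As a rough illustration of the application-side control described above, the sketch below sets a maximum stale time in two ways: with a property that covers every audio fetch made from the document, and with an attribute on a single tag. The audiomaxstale property and maxstale attribute names are those defined by the VoiceXML 2.0 specification; the URIs, prompt text, and numeric values are placeholders chosen only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">

  <!-- Accept cached audio that expired up to 600 seconds ago (illustrative value) -->
  <property name="audiomaxstale" value="600"/>

  <form id="menu">
    <block>
      <prompt>
        <!-- Uses the document-wide audiomaxstale setting above -->
        <audio src="http://www.example.com/audio/mainmenu.wav">Main menu.</audio>

        <!-- Attribute overrides the property for this one fetch:
             an expired copy is never acceptable here -->
        <audio src="http://www.example.com/audio/horoscope.wav"
               maxstale="0">Today's horoscope.</audio>
      </prompt>
    </block>
  </form>

</vxml>
```

A long-lived prompt such as a main menu is a natural candidate for a generous stale time, while time-sensitive audio is better left at zero so the interpreter revalidates it once it expires.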

Server Control

If you control a resource primarily from the server, the sequence is basically as follows:

1. The first time the resource is requested, the request goes to the origin server (as a GET request).
2. The origin server returns a response.
3. Unless the response header includes a Cache-control: no-cache or Cache-control: no-store header, the response headers are cached for later use by the requestor. The fetch date is recorded, the resource is stored, and the resource is returned to the requesting application.

   Note: In HTTP 1.1, the relevant header is Cache-control, not pragma. HTTP 1.0 used pragma, but this directive is no longer understood by many servers.

   The response still includes the Date header and might include Last-modified or Etag headers. With server control, the response typically includes either an Expires header or a Cache-control: max-age header. If it contains both, the max-age header takes precedence over the Expires header. Expires indicates an exact date and time at which the resource expires. Cache-control: max-age indicates a number of seconds after the Date at which the resource expires. For example, these 2 sets of headers are equivalent:

   Date: 12 December :34:00 GMT
   Expires: 12 December :36:00 GMT

   or

   Date: 12 December :34:00 GMT
   Cache-Control: max-age=

4. The next time the resource is requested, if the resource was cached, the interpreter uses the appropriate combination of Expires, max-age, and Date headers to determine whether or not the resource has expired.
5. If the resource has not expired, the interpreter returns the cached copy. In our example, if the next request is at 12 December :34:30 GMT, the cached copy is returned.
6. If the resource has expired, the interpreter makes a new request for the resource. In our example, if the next request is at 12 December :36:30 GMT, a new request is sent to the origin server.
7. If available, the new request includes both the If-Modified-Since header, set to the Last-Modified header of the original response, and the If-None-Match header, set to the Etag of the original response. If both are sent, the If-None-Match header takes precedence; if neither is present, the request must be for a completely new copy of the resource. Etag and If-None-Match are both HTTP 1.1 features. HTTP 1.1 servers give them precedence over the If-Modified-Since header. However, HTTP 1.0 servers use If-Modified-Since.
8. The server uses these headers to determine whether a new copy of the resource needs to be sent in the body of the response or whether it should simply indicate that the requesting application can use the copy in its cache.
9. If the response includes a new copy of the resource, the new copy and its headers replace the old copy in the cache.
10. Even if the response does not include a new copy of the resource, the cached header information can change. For example, the server can, and often does, send a new expiration date for the resource. Even if the server does not update the expiration date, the VoiceXML cache updates the default expiration time to the new fetch time.

You may have no control over the expiration times sent by your server. As an example, you may even know that some resources are modified more frequently than the server indicates with its expiration times. In this situation, you have some options on how to change the caching behavior.

You might choose not to request a new resource based on when the resource in your cache expires, but rather on how long ago you fetched the resource in your cache. You use the Cache-Control: max-age request header for this purpose. (Remember that you use VoiceXML attributes and properties to set this header information; see Maximum Age on page 44.)

Note: In a response header, max-age indicates when the resource will expire, regardless of when it was fetched. In a request header, it indicates the opposite; max-age indicates how long ago the resource could have been fetched, regardless of when it will expire.

You can specify both max-age and max-stale headers in the same request. If you do so, the max-age header takes precedence over the max-stale header.

For VoiceXML document and grammar document resources, you have another alternative. You can use the http-equiv attribute of the <meta> tag to act as if you had set response headers for these resource types. When the VoiceXML interpreter parses the VoiceXML document or grammar, if it encounters a <meta> tag that specifies caching information, the interpreter must then go and modify the cache to include this information.

Note: This method is not recommended because it can only affect the local VoiceXML cache. The overhead of having the intermediate site-wide proxy caches interpret every file would be prohibitive. The proxy caches do not interpret the contents of files; they only look at the headers. Consequently, proxy caches do not understand or implement the caching behavior specified in <meta> tags.
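The two alternatives just described might look like the following sketch. The maxage attribute is defined by the VoiceXML 2.0 specification and corresponds to the Cache-Control: max-age request header. The <meta> line shows the http-equiv form; that a platform honors this particular header name and value is an assumption made for illustration, as are the URI and the numbers.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">

  <!-- Not-recommended alternative: behaves like a response header for this
       document, but only the local interpreter cache can see it
       (header value is an illustrative assumption) -->
  <meta http-equiv="Cache-Control" content="max-age=3600"/>

  <form id="quotes">
    <block>
      <!-- Request side: refetch the target document if the cached copy was
           fetched more than 60 seconds ago, even if the server's expiration
           time says it is still fresh -->
      <goto next="http://www.example.com/vxml/quotes.vxml" maxage="60"/>
    </block>
  </form>

</vxml>
```

Setting maxage on the tag that performs the fetch keeps the policy visible to every cache along the path, which is the reason the request-side form is generally the safer choice.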
Summary of HTTP Headers

Remember that for your VoiceXML application, you do not create HTTP request headers manually. Rather, you use the appropriate attributes or properties that are described later in this chapter. The VoiceXML interpreter translates these attributes and properties in a fairly obvious way to the HTTP request headers described here.

The HTTP request headers most commonly relevant for caching with the BeVocal platform and VoiceXML interpreter are:

Request Header | Description
User-Agent: BeVocal/ivers VoiceXML/vvers BVPlatform/pvers | Every request contains a User-Agent header of this form. ivers is the version of the BeVocal VoiceXML interpreter, vvers is the version of VoiceXML used in the requesting document, and pvers is the BeVocal platform version. For example: User-Agent: BeVocal/2.4 VoiceXML/2.0 BVPlatform/ If you need to detect the User-Agent for a request, you should probably ignore the last 2 digits in the platform version, as these are likely to change occasionally.
If-Modified-Since: date | If the modification date on the requested resource is after this time, send a new copy.
If-None-Match: tag | tag is a unique identifier for a resource. If the resource that would be provided for the request has the same identifier as tag, don't send a new copy.

Request Header | Description
Cache-Control: max-age=n | A number of seconds after which the resource must be fetched again from the origin server, regardless of whether or not it has expired. Note that max-age in a response header is quite different from max-age in a request header. In a response header, it lets the server specify a number of seconds during which the resource is still fresh. In a request header, it lets the application specify a number of seconds after which a resource must be refetched, even if the server says the resource is still fresh.
Cache-Control: max-stale=n | A number of seconds after the expiration time during which the application is still willing to use an expired resource.

If you have control over the response sent by the server, you can directly set response headers (or even configure the server itself). The HTTP response headers most commonly relevant for caching with the BeVocal platform and VoiceXML interpreter are:

Response Header | Description
Etag: tag | A unique identifier for this response. The details of how the server creates the identifier are server-specific; what is important is that the identifier indicates a particular version of the resource so the server can determine if it has been modified.
Date: date | The date and time the response was generated. This header is required for all HTTP 1.1 responses.
Expires: date | The date and time on which this response expires. This is an HTTP 1.0 header, but is still widely used with HTTP 1.1. If Expires is set to 0, the interpreter does not cache the resource.
Cache-Control: max-age=n | The resource expires N seconds after generation. That is, to determine the expiration date, add N seconds to the date specified with the Date header. Note that max-age in a response header is quite different from max-age in a request header. In a response header, it lets the server specify a number of seconds during which the resource is still fresh. In a request header, it lets the application specify a number of seconds after which a resource must be refetched, even if the server says the resource is still fresh.
Cache-Control: s-maxage=n | For a shared cache (but not for a private cache), the maximum age specified by this directive overrides the maximum age specified by either the max-age directive or the Expires header.
Cache-Control: no-cache | Do not store this resource in the cache.
Cache-Control: no-store | For VoiceXML applications, has the same effect as the no-cache directive.
Cache-Control: must-revalidate | The cache must not use the resource after its expiration time, even if the request says that stale information is acceptable.

Response Header | Description
Cache-Control: private | All or part of the response is intended for a single user. The response must not be cached by a shared cache; it can be cached by a private (non-shared) cache.
Cache-Control: public | The response is cachable by any cache, even if it would normally be non-cachable.
Cache-Control: proxy-revalidate | For VoiceXML applications, this is equivalent to the must-revalidate directive.

What You Can Control from VoiceXML

From your VoiceXML application, you can control various things about fetching and caching of resources. The various attributes and properties that provide this control are collectively referred to as the application's fetch policies. The fetch policies govern the following aspects of fetching:

- Prefetching Resources: The VoiceXML interpreter can try to start fetching resources before they are actually needed (prefetch them), in an attempt to have them already available when actually required.
- Handling Fetching Delays: No matter what else you do, there inevitably will be noticeable delays between when a resource is requested and when it is available.
- Controlling the Use of Cached Resources: These are the policies that control request and response headers. See How Fetching and Caching Work on page 31 for how request headers affect caching.

Some fetch policies are set by a single property for all types of resource. Other fetch policies can be set separately for different types of resource. For these policies, there is usually a corresponding set of properties, one for each of these resource types:

- VoiceXML documents
- Recorded audio data
- Grammar files
- JavaScript source files
- SSML files (Extension)
- XML data files (Extension)

In addition, for all of the fetch policies, the appropriate VoiceXML tags support a corresponding attribute. For example, to optimize fetch operations, you can use the audiofetchhint, documentfetchhint, grammarfetchhint, scriptfetchhint, and ssmlfetchhint properties.
In addition, the <audio>, <choice>, <data>, <dtmf>, <goto>, <grammar>, <link>, <script>, and <subdialog> tags all support the fetchhint attribute.

All policies have default settings. An application can change any default setting with a <property> element that sets a property corresponding to the policy to be changed. Any tag that requests a fetch operation includes attributes that can be set to override the current policy settings during that one fetch operation:

- A property set in the <vxml> element of a single-document application or the application root document of a multidocument application sets the policy for fetching resources from that document and the application, overriding the default setting.
- A property set in the <vxml> element of a non-root document of a multidocument application sets the policy for fetching resources from that document, overriding the setting for the application.
- A property set in a <form> or <menu> element sets the policy for fetching resources from that dialog, overriding the setting for the containing document.

- A property set in a form item sets the policy for fetching resources from that form item, overriding the setting for the containing form.
- An attribute set in an element that fetches a resource sets the policy for that fetch, overriding any other setting of the policy.

There are a couple of subtleties you need to be clear about:

- When you set fetch properties within the document foo.vxml, you are setting properties for resources that foo.vxml fetches; you are not setting properties that affect fetching foo.vxml itself.
- Fetch properties are always set by some VoiceXML document. For any given phone call to an application, the initial document of the application is not fetched from another VoiceXML document. Consequently, there is no way to set fetching policies for the initial document.
- The application root (for example, root.vxml) for document foo.vxml is set in the <vxml> tag of foo.vxml. The application root document is not directly fetched by another document and it inherits the fetch policies of the document of which it is the root. For example, if the document bar.vxml fetches foo.vxml, and bar.vxml sets maxage=120 for foo.vxml, then the VoiceXML interpreter also uses maxage=120 for root.vxml.

The following sections describe the policies and their default settings, and also list the properties that can be used to set each policy. For a detailed description of the various properties, see Chapter 12, Properties.

Prefetching Resources

The interpreter can attempt to optimize dialog interpretation by prefetching files that might be needed. The interpreter prefetches resources used by a document by starting to fetch them as soon as a document is loaded, rather than waiting until execution of the VoiceXML tags that reference those resources. Prefetching can improve an application's performance by allowing it to fetch resources during free time while the user is speaking or listening to a dialog.
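The scoping rules listed earlier (document, dialog, form item, then the attribute on the fetching tag) can be sketched with the prefetch policy itself. The example below uses the audiofetchhint property and fetchhint attribute named in this chapter; the form name, URIs, and prompt text are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">

  <!-- Document level: audio is fetched only when needed unless overridden -->
  <property name="audiofetchhint" value="safe"/>

  <form id="greeting">
    <!-- Dialog level: overrides the document setting within this form -->
    <property name="audiofetchhint" value="prefetch"/>

    <block>
      <prompt>
        <!-- No attribute: uses the form's setting, so this may be prefetched -->
        <audio src="http://www.example.com/audio/welcome.wav">Welcome.</audio>

        <!-- Attribute overrides every property for this single fetch -->
        <audio src="http://www.example.com/audio/daily.wav"
               fetchhint="safe">Here is today's message.</audio>
      </prompt>
    </block>
  </form>

</vxml>
```

Keeping the broad policy in a property and reserving attributes for exceptions tends to make the effective setting for any one fetch easier to trace.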
The interpreter prefetches resources in the order in which they appear in the document. Consequently, those near the top of the document are retrieved first, unless there are delays at the server, heavy Internet traffic, and so on. The interpreter prefetches several resources at once; a delay retrieving one resource does not affect others.

Note: Prefetching resources can generate many simultaneous requests on your server.

Prefetch Cache Details

The VoiceXML interpreter prefetches resources separately for each phone call and for each document executed within that phone call. While the interpreter executes a single VoiceXML document on a single phone call, it has a queue of the resources that it can prefetch for that document. During execution of that document on that call, the interpreter prefetches as many resources as it can and puts them in the prefetch cache.

During the execution of a document, the VoiceXML interpreter always checks the prefetch cache for a resource before initiating a new fetch operation. If the resource is in the prefetch cache, the interpreter uses it, even if the resource expires between when it is prefetched and when it is needed. When execution leaves the document (either by the call ending or by transitioning to another document in the same call), the interpreter flushes the prefetch queue and cache and starts over for the next document.

Note that if there are multiple simultaneous calls to the same application, they may be executing the same document at the same time. However, the resources for each phone call will be in separate prefetch

caches. This means that in some cases, the interpreter will fetch a new copy of a resource for one phone call even though another phone call is using an unexpired copy.

To illustrate all this, assume that an application has documents d1.vxml and d2.vxml, both of which refer to the same audio file, foo.wav. If there are 2 simultaneous calls to this application, then at the same time foo.wav might be stored in the prefetch cache of d1.vxml for the first call and the separate prefetch cache of d1.vxml for the second call. Or, it might be in the prefetch cache of d1.vxml for the first call and the prefetch cache of d2.vxml for the second call. foo.wav cannot, however, be in the prefetch caches of both d1.vxml and d2.vxml for the first call, because 2 different documents on the same phone call cannot be executing at the same time and so cannot have active prefetch caches at the same time.

Fetch Hints

Prefetching is controlled by instructions called hints, which are specified by the fetchhint attribute and also by the typefetchhint properties, where type is a placeholder for the type of resource to be fetched. For example, the audiofetchhint property controls prefetching of audio files.

The fetch hint policy can be set to one of the following values:

- prefetch: Fetch the resource when the page is loaded.
- safe: Fetch the resource only when it is needed.

Any tag that can fetch a resource has a fetchhint attribute that specifies how to fetch the resource. If this attribute is not set, the interpreter uses the current value of the appropriate typefetchhint property, as shown in the table below.
Resource type                Tags that support the fetchhint attribute    Property             Default value (for property)
Recorded audio data          <audio>                                      audiofetchhint       prefetch
VoiceXML documents           <choice> <goto> <link> <subdialog>           documentfetchhint    safe
SSML document                <audio>                                      ssmlfetchhint        safe
Grammar files                <grammar>, <dtmf> (VoiceXML 1.0 only)        grammarfetchhint     prefetch
JavaScript source files      <script>                                     scriptfetchhint      prefetch
SSML files (Extension)       <audio>                                      ssmlfetchhint        prefetch
XML data files (Extension)   <data>                                       datafetchhint        safe

Restrictions on Prefetching

Prefetching is disabled when the URI or other attributes of a tag are computed at runtime. In these cases, even if the fetching hints specify prefetch, the interpreter cannot fetch the resource until the tag is executed and the exact values of the attributes are determined.

For example, programmers sometimes simplify their job by writing VoiceXML such as:

    <audio expr="audiouri('hello')">

where audiouri() is a JavaScript function that adds a URI prefix and an ending such as .wav to the parameter, producing a complete URI. This technique saves some typing and simplifies program maintenance. However, the interpreter cannot prefetch the audio file in this case, because the exact URI is not known until the tag is executed.

Handling Fetching Delays

Regardless of how well you set the various caching and prefetching policies, fetching resources from remote servers will inevitably cause delays at times. Various fetch policies control how the interpreter handles these delays.

Timeouts

By default, the interpreter waits up to one minute for a resource or document to be fetched. The application can control this behavior with the fetchtimeout attribute of a tag that fetches a resource. That attribute specifies how long the interpreter waits for a resource to arrive. If the resource does not arrive within the specified time, the interpreter throws an error.badfetch event. The value is a number representing the time in milliseconds.

This attribute is available for all tags that fetch resources, specifically:

- <audio>
- <choice>
- <data> (Extension)
- <goto>
- <grammar>
- <link>
- <script>
- <subdialog>
- <submit>
- <dtmf> (VoiceXML 1.0 only)
- <send> (VoiceXML 1.0 only; Extension)

When this attribute is not specified, the interpreter uses the current value of the fetchtimeout property, whose default value is 60 seconds.

Background Audio

By default, the user does not hear any audio output while the interpreter is fetching a resource of any kind. The application can change this behavior with the fetchaudio policy, which specifies the URI of a background audio file to be played while the interpreter fetches a VoiceXML document or XML data file.

Background audio can be helpful if the fetch operation may cause a noticeable delay in processing, such as when an on-line purchase is being verified and processed by a transaction server. The audio file can contain music, a "please wait" message, and so on.

Background audio is never played while the interpreter fetches grammar, audio, or script files.
It is only played when fetching VoiceXML documents or XML data files.

The fetchaudio policy is controlled with the fetchaudio property and the fetchaudio attribute of the following tags:

- All tags that fetch VoiceXML documents: <choice>, <goto>, <link>, <subdialog>, and <submit>.
- Extension. The <data> tag, which fetches XML data files; this attribute is relevant only if the bevocal.fetchaudio.allfetches property is true.
- Extension; VoiceXML 1.0 only. The <send> tag, which submits values to a Web server.

If the fetchaudio attribute is not specified, the interpreter uses the current value of the fetchaudio property. This property does not have a default value; that is, by default no background audio is played.
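The same lookup order applies to fetchhint, fetchtimeout, and fetchaudio: the tag's attribute wins, then the current property value, then the property's default. The sketch below models that order in JavaScript. The names resolveSetting and PROPERTY_DEFAULTS are hypothetical and are not part of the interpreter; the defaults are the ones listed in this chapter (ssmlfetchhint is left out because the table lists it twice).

```javascript
// Hypothetical model of the attribute -> property -> default lookup.
// Not an interpreter API; defaults are copied from this chapter.
const PROPERTY_DEFAULTS = {
  audiofetchhint: "prefetch",
  documentfetchhint: "safe",
  grammarfetchhint: "prefetch",
  scriptfetchhint: "prefetch",
  datafetchhint: "safe",
  fetchtimeout: 60000,   // 60 seconds, written here in milliseconds
  fetchaudio: undefined  // no default: no background audio is played
};

// attributeValue: value written on the tag, or undefined if absent.
// currentProperties: properties set by <property> elements in scope.
function resolveSetting(attributeValue, propertyName, currentProperties) {
  if (attributeValue !== undefined) {
    return attributeValue;                    // the tag attribute wins
  }
  if (propertyName in currentProperties) {
    return currentProperties[propertyName];   // then the current property
  }
  return PROPERTY_DEFAULTS[propertyName];     // finally the default
}
```

For instance, an <audio> tag without a fetchhint attribute, in a scope that sets no audiofetchhint property, resolves to prefetch under this model.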

When a background audio file is specified for a fetch operation, the fetching of the background audio file itself is governed by the audiofetchhint, audiomaxage, audiomaxstale, and fetchtimeout properties that are in effect at the time of the fetch. (In VoiceXML 1.0 applications, the caching property is used in place of audiomaxage and audiomaxstale.)

Note: The interpreter plays the background audio file only once during a given fetch operation; it does not loop (repeat).

Two properties govern the playing of the background audio clip:

- The interpreter does not start to play the audio file unless the time to fetch the resource exceeds a limit set by the fetchaudiodelay property. This can prevent the user from hearing very short audio clips when there are very slight delays in fetching resources. The default value of this property is 0.
- The value of the fetchaudiominimum property is the minimum time interval to play the fetchaudio source, once started, even if the fetch operation completes during play. The default value of this property is 0; with this default, the interpreter interrupts the audio playback as soon as the resource is fetched, and resumes normal processing. If set to a larger value, it prevents the user from hearing a short clip of background audio that is immediately cut off.

Queued Prompts when Fetching

By default in VoiceXML 2.0, queued prompts are not played in the background during the execution of a fetch. However, for those tags that fetch data and for which background audio can be played during the fetch (<choice>, <goto>, <link>, <subdialog>, <submit>, and <data>), if background audio will be played (see Background Audio), then queued prompts are played during the fetch and before the background audio is played. If background audio will not be played, but the bevocal.fetchaudio.flushqueue property is set to true, then queued prompts will still be played during the fetch.
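The combined effect of fetchaudiodelay, fetchaudiominimum, and the play-once rule described earlier in this section can be sketched as a small JavaScript calculation. This is a model under stated assumptions, not interpreter code: the function name is hypothetical, all times are taken to be in milliseconds, and the clip is assumed to stop at its own end even if fetchaudiominimum is longer.

```javascript
// Hypothetical model: how long the caller hears the background clip
// during one fetch. All arguments are in milliseconds (an assumption).
function backgroundAudioPlayTime(fetchMs, clipMs, delayMs = 0, minimumMs = 0) {
  // The clip does not start unless the fetch outlasts fetchaudiodelay.
  if (fetchMs <= delayMs) {
    return 0;
  }
  // Once started, it plays until the fetch completes or until
  // fetchaudiominimum has elapsed, whichever comes later.
  const wanted = Math.max(fetchMs - delayMs, minimumMs);
  // The clip is played only once; it never loops.
  return Math.min(wanted, clipMs);
}
```

With a delay of 1000 and a minimum of 2000, a fetch that takes 1200 ms would still produce 2000 ms of audio under this model, which is the "no abruptly cut-off clip" behavior the fetchaudiominimum property is meant to provide.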
Controlling the Use of Cached Resources

After a resource expires, it remains in the cache although it is stale. If the same file is needed in the future, the interpreter takes one of the following actions:

- Uses the stale cached file as is.
- Revalidates the stale cached file with a Get-If-Modified request to the resource's server. If the server replies that the resource has not been modified, the interpreter uses the cached copy.
- Refetches the resource unconditionally.

If the interpreter needs a resource and the cache contains a copy of that resource, there are 3 primary policies governing whether the interpreter uses the cached copy:

- The Maximum Age for the cached file
- The Maximum Stale Time for the cached file
- (VoiceXML 1.0 only) The Caching policy for the file

These policies all affect the headers sent when a resource is requested. In addition, you can set caching information that would normally be in the response header for a VoiceXML document or a grammar document. (See Mimicking Response Headers on page 46.)

See Prefetch Cache Details on page 40 for how the prefetch cache interacts with these policies.

Maximum Age

An application can specify that it will use a cached resource only if its time in the cache does not exceed a maximum age:

- If the cached copy is older than the maximum, it will be refetched with a Get-If-Modified header.
- If the cached copy is within the maximum age and has not expired, it will be used.
- If the cached copy is within the maximum age but has expired, the relevant maximum-stale-time policy determines whether the interpreter uses the expired cached file. See Maximum Stale Time on page 44.

Any tag that can fetch a resource has a maxage attribute that specifies the maximum age in seconds of a cached resource. If this attribute is not set, the interpreter uses the current value of the appropriate typemaxage property, where type is a placeholder for the type of resource to be fetched. For example, the audiomaxage property specifies the maximum age for audio files. In VoiceXML 1.0 applications, when no value is set for maxage, the caching attribute controls whether an unexpired cached file is used.

Resource type                Tags that support the maxage attribute          Property
Recorded audio data          <audio>                                         audiomaxage
VoiceXML documents           <choice> <goto> <link> <subdialog> <submit>     documentmaxage
SSML documents               <audio>                                         ssmlmaxage
Grammar files                <grammar>                                       grammarmaxage
JavaScript source files      <script>                                        scriptmaxage
XML data files (Extension)   <data>                                          datamaxage
SSML data (Extension)        <audio>                                         ssmlmaxage

No default is set for these properties, which means that any unexpired cached file will be used.

If you set a maximum-age property to a non-zero value, you ensure that:

- The interpreter uses an unexpired resource whose age is less than or equal to the maximum age without doing a Get-If-Modified request to verify that the cached file is up to date.
- The interpreter fetches a fresh copy of a resource whose age is more than the maximum age even if the cached file has not yet expired.
For example, suppose you fetch a VoiceXML document file that expires in 60 seconds, and after 40 seconds you need the same file. If documentmaxage is set to 30, the application will refetch the document file; if documentmaxage is set to 60, it will use the cached file.

You can set a maximum-age property to 0 to ensure that a fresh copy is fetched if the resource has been modified since it was last fetched.

Maximum Stale Time

An application can specify that an expired file that is not too stale can still be used. The maximum stale time for a file is the time by which its expiration time can be exceeded. Within this allowable stale time, an expired cached file will be used without being refetched. If an expired cached file is needed after its maximum stale time has been exceeded, the file will be refetched.
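The maximum-age and maximum-stale-time rules can be read as one decision procedure. The JavaScript sketch below is a model only, assuming times in seconds: the function name, the field names, and the result labels are hypothetical, and "revalidate" stands for the Get-If-Modified request the guide describes. A missing maxstale is treated as 0 here, which is an assumption of the sketch rather than the interpreter's per-type default.

```javascript
// Hypothetical model of the maxage / maxstale decision.
// entry:  { ageSec, expiresAfterSec }  time in cache, lifetime from headers
// policy: { maxage, maxstale }         undefined maxage means "not set"
function cacheDecision(entry, policy) {
  const { ageSec, expiresAfterSec } = entry;
  const { maxage, maxstale = 0 } = policy;

  // Older than the maximum age: revalidate, even if not yet expired.
  if (maxage !== undefined && ageSec > maxage) {
    return "revalidate";
  }
  // Within the maximum age (or none set) and not expired: use it.
  if (ageSec <= expiresAfterSec) {
    return "use-cached";
  }
  // Expired: usable only while within the maximum stale time.
  const staleSec = ageSec - expiresAfterSec;
  return staleSec <= maxstale ? "use-cached" : "revalidate";
}
```

Applied to the guide's example (a document that expires in 60 seconds and is needed again after 40), a maxage of 30 yields "revalidate" and a maxage of 60 yields "use-cached".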

Any tag that can fetch a resource has a maxstale attribute that specifies the maximum time in seconds during which a stale (expired) cached resource may be used. If this attribute is not set, the interpreter uses the current value of the appropriate typemaxstale property, where type is a placeholder for the type of resource to be fetched. For example, the audiomaxstale property specifies the maximum stale time for audio files.

The following table specifies the appropriate typemaxstale property for each resource type.

Resource type                Tags that support the maxstale attribute        Property            Default value (for property)
Recorded audio data          <audio>                                         audiomaxstale       300s
VoiceXML documents           <choice> <goto> <link> <subdialog> <submit>     documentmaxstale    0s
SSML documents               <audio>                                         ssmlmaxstale        0s
Grammar files                <grammar>                                       grammarmaxstale     0s
JavaScript source files      <script>                                        scriptmaxstale      0s
XML data files (Extension)   <data>                                          datamaxstale        0s
SSML data (Extension)        <audio>                                         ssmlmaxstale        300s

The maximum stale time is relevant either when the expired file is within the maximum age or when no maximum age is set for the file.

- If the number of seconds since the cached file expired is less than or equal to the maximum stale time, the cached file is used.
- If the file has been expired for longer than the maximum stale time, the interpreter does a Get-If-Modified request to update the cached file, if necessary.

Caching

VoiceXML 1.0 only. In a VoiceXML 1.0 application, when the relevant maximum-age property is not set, the caching policy determines whether the interpreter uses an unexpired cached copy of a file:

- If the caching policy is fast (the default), the cached copy is used if it has not expired.
- If the caching policy is safe, the interpreter sends a Get-If-Modified request to the server, even if the resource is still in the cache. This ensures that the most recent copy of the resource is always used; however, it does introduce some extra delays, because the interpreter must contact the resource's server. The safe setting is intended mainly for use during development and debugging, when documents and other files may be updated frequently; it is equivalent to maxage="0".

In VoiceXML 1.0, any tag that can fetch a resource has a caching attribute that specifies the caching policy for the resource:

- <audio>
- <choice>
- <data> (Extension)
- <dtmf>
- <goto>
- <grammar>
- <link>
- <script>
- <subdialog>
- <submit>

If this attribute is not set, the interpreter uses the current value of the caching property. Note that the default value for the property is fast. That is, fast is the normal condition for any tag that does not explicitly specify caching="safe".

If the cached file has expired, the relevant maximum-stale-time policy determines whether the interpreter uses the expired cached file.

Note: This attribute is used only when all the following conditions are met:

- The version attribute of the <vxml> tag is 1.0.
- The maxage attribute does not have a value.
- The cache contains an unexpired copy of the resource.

Mimicking Response Headers

You may not have direct control over the response headers sent by your server. If you do not, then for VoiceXML documents and XML grammar files, you can use the <meta> tag with its http-equiv attribute to mimic the use of HTTP response headers. For example:

    <meta http-equiv="cache-control" content="max-age=10"/>
    <meta http-equiv="expires" content="02 Feb :59:59 GMT"/>

When the VoiceXML interpreter parses a VoiceXML document or a grammar file in the XML format, it interprets these instances of the <meta> tag as though the HTTP response had sent these response headers. The interpreter goes back to its cache and changes the associated information.

Note: This method is not recommended because it can only affect the local VoiceXML cache. The overhead of having the intermediate site-wide proxy caches interpret every file would be prohibitive. The proxy caches do not interpret the contents of files; they only look at the headers. Consequently, proxy caches do not understand or implement the caching behavior specified in <meta> tags.

Submitting Complex JavaScript Objects

The following tags submit variables to a server:

- <subdialog>
- <submit>
- <data> (Extension)
- <send> (Extension; VoiceXML 1.0 only)

The tag's namelist attribute specifies the variables to be submitted.
If one of the specified variables is set to a complex JavaScript object, all component values in the object are submitted as separate variables. For example, the following <submit> tag submits the object foo:

    <script>
      var foo = new Object;
      foo[0] = 1;
      foo[1] = 7;
      foo[2] = "hello";
    </script>
    <submit src="bar.jsp" namelist="foo"/>

The interpreter submits three individual variables to the server with a URI of the form:

    bar.jsp?foo[0]=1&foo[1]=7&foo[2]=hello

The URI includes the necessary encoding escapes around the bracket characters [ and ].
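The flattening step can be modeled in a few lines of JavaScript. This is a sketch of the naming scheme shown in the guide's URIs, not the interpreter's implementation: flattenForSubmit and toQueryString are hypothetical names, and the rule that numeric keys use brackets while named properties use a dot is inferred from the guide's examples.

```javascript
// Hypothetical sketch: expand one namelist variable into name/value pairs.
function flattenForSubmit(name, value, pairs = []) {
  if (value !== null && typeof value === "object") {
    for (const key of Object.keys(value)) {
      // Numeric keys become name[0]; named properties become name.key.
      const child = /^\d+$/.test(key) ? `${name}[${key}]` : `${name}.${key}`;
      flattenForSubmit(child, value[key], pairs);
    }
  } else {
    pairs.push([name, String(value)]);   // leaf value: submit as text
  }
  return pairs;
}

// Join the pairs into a query string, escaping [ and ] among others.
function toQueryString(pairs) {
  return pairs
    .map(([n, v]) => `${encodeURIComponent(n)}=${encodeURIComponent(v)}`)
    .join("&");
}

const foo = { 0: 1, 1: 7, 2: "hello" };
const query = toQueryString(flattenForSubmit("foo", foo));
```

Here query holds the escaped form of foo[0]=1&foo[1]=7&foo[2]=hello, with each bracket replaced by its percent-escape.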

An arbitrarily complex object can be submitted in this way. The individual values at each level of the structure are submitted individually. For example, the following <submit> tag submits the object top:

    <script>
      var subobj = new Object;
      subobj.a = 2;
      subobj.b = 4;
      var top = new Object;
      top.size = 2;
      top.name = "Test";
      top.part = subobj;
    </script>
    <submit src="bar.jsp" namelist="top"/>

The interpreter submits four individual variables to the server with a URI of the form:

    bar.jsp?top.size=2&top.name=Test&top.part.a=2&top.part.b=4

A server-side package that receives individual component variables in this form can put them back together into the appropriate objects.
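As an illustration of that server-side reassembly, the JavaScript sketch below rebuilds nested objects from already-decoded parameter names such as top.part.a or foo[0]. It is a minimal sketch, not a real server package: rebuildObjects is a hypothetical name, every container is rebuilt as a plain object, and values stay strings because that is how a server receives them.

```javascript
// Hypothetical sketch: rebuild nested objects from flattened parameters.
// params maps decoded names ("top.part.a", "foo[0]") to string values.
function rebuildObjects(params) {
  const root = {};
  for (const [name, value] of Object.entries(params)) {
    // Split "top.part.a" or "foo[0]" into its path segments.
    const path = name.match(/[^.\[\]]+/g);
    if (!path) continue;                       // skip empty names
    let node = root;
    for (let i = 0; i < path.length - 1; i++) {
      const key = path[i];
      if (typeof node[key] !== "object" || node[key] === null) {
        node[key] = {};                        // create missing level
      }
      node = node[key];
    }
    node[path[path.length - 1]] = value;       // store the leaf value
  }
  return root;
}

const rebuilt = rebuildObjects({
  "top.size": "2",
  "top.name": "Test",
  "top.part.a": "2",
  "top.part.b": "4"
});
```

After this call, rebuilt.top.part.b is the string "4"; a real server would also convert values back to numbers where the application expects them.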


5. Using Multiple-Recognition

This chapter describes how the speech-recognition engine used by the BeVocal VoiceXML interpreter can provide multiple recognition results. This capability consists of two features: N-best recognition and multiple interpretations of spoken input.

This chapter describes:

- Multiple Recognition Features
- Using N-Best Recognition
- Using Multiple Interpretations
- Using Both Features Together

Multiple Recognition Features

A VoiceXML application typically obtains a single value as the result of each speech-recognition event. If the user's response was not clear, the result is the one utterance that the speech-recognition engine judges to be the most likely. This utterance may provide values for several slots in a grammar, but it is still a single recognized utterance. If the application uses an ambiguous grammar and the utterance matches more than one rule, slots are filled according to an arbitrary one of those rules.

Two features improve recognition by providing multiple recognition results:

- N-best recognition: Instead of returning the single most likely utterance, the speech-recognition engine can return a list of the most likely utterances.
- Multiple interpretations: If any given utterance matches multiple grammar rules, the speech-recognition engine returns those alternative interpretations of the utterance.

Both these features are disabled by default. They can be enabled separately or jointly in applications that want to accept multiple recognition results.

N-Best Recognition

In some advanced voice applications, a single result may not be sufficient. For example, an airline reservation application might ask the user for destination and departure cities. If a speaker mumbles, the speech-recognition engine might not be able to distinguish between two possible utterances, "Austin" and "Boston". Ideally, the application would obtain both these possible results so that it could prompt for clarification, "Did you mean Austin, Texas; or Boston, Massachusetts?"

Using N-best recognition, the speech-recognition engine returns a list of different possible utterances whose confidence levels are high enough for consideration. See Using N-Best Recognition on page 51.

Multiple Interpretations

In some applications, a single recognized utterance may have multiple interpretations, indicating that the utterance is ambiguous. For example, an application might include a GSL grammar with two rules that match the utterance "Portland".

    Cities [
      ...
      (portland ?maine) {<city Portland> <state ME>}
      (portland ?oregon) {<city Portland> <state OR>}
      ...
    ]

If the user clearly says "Portland", this utterance does not allow the speech-recognition engine to choose between the two possible interpretations. Ideally, the application would obtain both interpretations so it could prompt for more information, "Do you mean Portland, Maine; or Portland, Oregon?"

Multiple interpretations lets an application access the different interpretations for a given recognized utterance. If multiple grammar rules match the recognized utterance, all resulting interpretations are returned. See Using Multiple Interpretations on page 55.

Combining the Features

The two multiple-recognition features can be used together. If both features are enabled, each possible utterance may have multiple interpretations. For example, an airline reservation application might enable both features for a field whose ambiguous GSL grammar includes two rules that match the utterance "Austin".

    Cities [
      ...
      (austin ?texas) {<city Austin> <state TX>}
      (austin ?california) {<city Austin> <state CA>}
      (boston ?massachusetts) {<city Boston> <state MA>}
      ...
    ]

If the user mutters something that sounds like either "Austin" or "Boston", the speech-recognition engine would find two possible results, "Austin" and "Boston". The first of these results would have two possible interpretations: Austin, Texas and Austin, California. A sophisticated application could prompt the user, "Did you mean Austin, Texas; Austin, California; or Boston, Massachusetts?"

Combining both features provides the maximum flexibility. See Using Both Features Together on page 59.
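A clarification prompt of this kind can be assembled from the list of results in a few lines of JavaScript. The sketch below assumes each result carries an interpretation object with the city and state slots from the grammar above, in the same shape the guide describes for application.lastresult$; the function name clarificationPrompt is hypothetical, and the prompt uses the state codes from the slots rather than full state names.

```javascript
// Hypothetical sketch: build "Did you mean A; B; or C?" from results.
// Each result is assumed to look like
//   { utterance: "austin", interpretation: { city: "Austin", state: "TX" } }
function clarificationPrompt(results) {
  const options = results.map(
    (r) => `${r.interpretation.city}, ${r.interpretation.state}`
  );
  if (options.length === 0) {
    return "";                                  // nothing to clarify
  }
  if (options.length === 1) {
    return `Did you mean ${options[0]}?`;
  }
  // Separate the alternatives with semicolons, with "or" before the last.
  const head = options.slice(0, -1).join("; ");
  return `Did you mean ${head}; or ${options[options.length - 1]}?`;
}

const sampleResults = [
  { utterance: "austin", interpretation: { city: "Austin", state: "TX" } },
  { utterance: "austin", interpretation: { city: "Austin", state: "CA" } },
  { utterance: "boston", interpretation: { city: "Boston", state: "MA" } }
];
const promptText = clarificationPrompt(sampleResults);
```

In a VoiceXML document, a function like this would live in a <script> element and be called from a <value expr="..."/> inside the prompt.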
Working with Multiple Recognition

An application can selectively enable the two multiple-recognition features, specifying the maximum number of results to be returned. The following VoiceXML language features support recognition of multiple results:

- The property maxnbest controls whether N-best recognition is enabled.
- The property bevocal.maxinterpretations controls whether multiple interpretations is enabled.
- The read-only variable application.lastresult$ is set by the speech-recognition engine. It contains information about the result of the most recent speech-recognition event.

If multiple recognition was enabled, this variable may contain more than one result.

- If only N-best recognition is enabled, the results represent different recognized utterances, each with a single interpretation.
- If only multiple interpretations is enabled, the results represent a single recognized utterance with a number of different interpretations.
- If both features are enabled, the results can represent a number of different recognized utterances, some or all of which can have multiple interpretations.

The two properties maxnbest and bevocal.maxinterpretations control how many results are returned from speech recognition.

- If only N-best recognition is enabled, maxnbest is the maximum number of results to be returned.
- If only multiple interpretations is enabled, bevocal.maxinterpretations is the maximum number of results to be returned.
- When both features are enabled, the two properties are used together. You can set these properties to limit the total number of results without distinguishing whether a particular result is a different utterance or a different interpretation of a given utterance. Alternatively, you can specify the maximum number of distinct utterances and the maximum number of distinct interpretations for any given utterance.

Using N-Best Recognition

N-best recognition can be invoked whenever the user's spoken input is matched against a grammar. By default, N-best recognition is disabled. If you want to use this feature, you must explicitly enable it. After recognition in which this feature is enabled, you check to see whether more than one result was recognized. If so, you can prompt the user to select among the possible results.

Enabling N-Best Recognition

You enable N-best recognition by setting the maxnbest property to a value greater than one. If only N-best recognition is enabled, the value is the maximum number of distinct utterances that the speech-recognition engine should return. If multiple interpretations is also enabled, the interpretation of the value of maxnbest depends on the value of the bevocal.maxinterpretations property. This section describes using N-best recognition alone. Using Both Features Together on page 59 describes how to combine N-best recognition with multiple interpretations.
By default, when you set maxnbest to a number greater than one, you enable both N-best recognition and multiple interpretations. To disable multiple interpretations, set the bevocal.maxinterpretations property to 1.

When N-best recognition is enabled, the speech-recognition engine may find multiple utterances. The most common use of N-best recognition is in recognizing input in <field> and <initial> elements. It also can be used in recognizing input that matches a <link> or <choice> grammar.

N-best recognition slows down the recognition process; you should enable this feature only when you need it. For example, you might enable it for a particular field or form in which you anticipate that user inputs might sound similar to more than one expected response. You should set maxnbest to a fairly small number and your application should be able to handle the specified number of results.

Checking for Multiple Utterances

After speech recognition occurs while N-best recognition is enabled, you should check whether multiple likely utterances were found.

- If recognition occurs in a <field> element, you check the results in the <filled> element of that field.
- If recognition occurs in an <initial> element, you check the results in the <filled> element of the containing form.
- If recognition occurs in a <link> or <choice> element, you check the results in a <block> at the top of the dialog or document to which the <link> or <choice> element sent you.

In the most common case, you check for multiple utterances to decide how to set input variables following speech recognition in a <field> or <initial> element. Whether or not N-best recognition is enabled, the most likely recognized utterance is used to set relevant input variables. If the most likely utterance matches more than one grammar rule, the relevant input variables are set according to an arbitrary one of those rules.

To check whether more than one result was found, you examine the application.lastresult$ array, which may contain up to maxnbest elements; in most cases, fewer results are returned. The application.lastresult$ array contains at least one element, namely, application.lastresult$[0]. You can check application.lastresult$.length to see how many elements are in the array. For a given index i, application.lastresult$[i] is undefined if the array contains no object at that index.

If you find that only one result was returned, you do not need to take special actions; you can use the results in the input variables just as if N-best recognition were disabled. Otherwise, you can ask the user to select among the various results.

Selecting an Utterance

Once you have determined that the application.lastresult$ array contains more than one result, your application can interact with the user to determine which result was intended. Each object in the array corresponds to one likely result; its utterance property is the recognized utterance and its interpretation property is the interpretation of that utterance.

You can ask the user to select among the possible results. Each result corresponds to a different possible utterance; the utterances are ordered by speech-recognition engine confidence level. After the user selects a result, you can set input variables accordingly.
For example, if the user selects the third recognition result, the interpretation is in:

    application.lastresult$[2].interpretation

The interpretation has a property for each slot that is filled in by the matching grammar rule. You can access these properties to get the values for input variables. Typically, a slot name is identical to the name of an input variable. For example, the value for the city field of the interpretation is in:

    application.lastresult$[2].interpretation.city

If speech recognition occurs in a <link> or <choice> element, you typically don't use the selected result to set input variables. Instead, you use it to decide which dialog or document to visit.

Example

This application allows a user to schedule a visit with one of the company's offices, identified by the city where the office is located. The grammar includes three cities whose names have somewhat similar sounds: Austin, Boston, and Houston. To allow for the situation in which the speech-recognition engine cannot distinguish among those names, the maxnbest property for the office field is set to 3. Note that N-best recognition is enabled only during interpretation of the office field.

If the application receives more than one recognition result for the office field, it prompts the user to select a number corresponding to one of the possible utterances. It also lets the user start over (in case none of the possible utterances is correct).

The application keeps track of the number of recognized utterances. If the user gives an inappropriate number when asked for clarification, the application prompts again.
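The indexing shown above (a user's choice of 3 maps to application.lastresult$[2]) is easy to get wrong by one. The JavaScript sketch below wraps it in a helper that also rejects out-of-range choices; slotForChoice is a hypothetical name, and the results argument stands in for a saved copy of application.lastresult$.

```javascript
// Hypothetical helper: map a 1-based user choice to a slot value.
// results: a copy of application.lastresult$ (array of result objects)
// choice:  the number the user answered, as a number or digit string
function slotForChoice(results, choice, slotName) {
  const index = Number(choice) - 1;            // choice 3 -> index 2
  if (!Number.isInteger(index) || index < 0 || index >= results.length) {
    return undefined;                          // invalid or out of range
  }
  return results[index].interpretation[slotName];
}

const nbest = [
  { utterance: "austin", interpretation: { city: "Austin" } },
  { utterance: "boston", interpretation: { city: "Boston" } },
  { utterance: "houston", interpretation: { city: "Houston" } }
];
const chosenCity = slotForChoice(nbest, "3", "city");
```

Returning undefined for an invalid choice lets the calling dialog decide whether to reprompt or start over.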

Sample Interactions

In this interaction, the user's answer is clear.

Application: Which office would you like to visit?
User: Denver.
Application: Scheduling a visit to the Denver office.

In this interaction, the application cannot distinguish among possible responses.

Application: Which office would you like to visit?
User: (Garbled) estin.
Application: Please answer 1 if you said Austin; 2 if you said Boston; 3 if you said Houston; if you want to start over, answer 0.
User: Two.
Application: Scheduling a visit to the Boston office.

In this interaction, the user wants to give the city again instead of selecting one of the options.

Application: Which office would you like to visit?
User: (Garbled) ahstin.
Application: Please answer 1 if you said Boston; 2 if you said Austin; if you want to start over, answer 0.
User: Zero.
Application: Which office would you like to visit?
User: Houston.
Application: Scheduling a visit to the Houston office.

In this interaction, the user enters an invalid selection when asked for clarification.

Application: Which office would you like to visit?
User: (Garbled) ahstin.
Application: Please answer 1 if you said Boston; 2 if you said Austin; if you want to start over, answer 0.
User: Three.
Application: Unrecognized option. Please answer 1 if you said Boston; 2 if you said Austin; if you want to start over, answer 0.
User: Two.
Application: Scheduling a visit to the Austin office.

66 USING MULTIPLE-RECOGNITION Application Code <?xml version="1.0"?> <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" " <vxml version="2.0" xmlns=" <form> <var name="myresults"/>  <var name="choiceprompt"/>    <script> <![CDATA[ function listresults(allresults) { var promptmsg = "Please answer "; var promptindex = 1; for (var i = 0; i < allresults.length; i++) { promptmsg = promptmsg + promptindex + " if you said " + allresults[i].utterance + "; "; ++promptindex; } promptmsg = promptmsg + "If you want to start over, answer 0." return promptmsg; } function findresult(allresults, strindex) { return allresults[strindex - 1].utterance; } ]]> </script> <field name="office" > <property name="maxnbest" value="3"/> <property name="bevocal.maxinterpretations" value="1"/> <grammar type="application/x-nuance-gsl"> <![CDATA[([ ( austin ) { <office austin> } ( boston ) { <office boston> } ( chicago ) { <office chicago> } ( denver ) { <office denver> } ( houston ) { <office houston> } ])]]> </grammar> <prompt>which office would you like to visit?</prompt> <filled> <if cond="application.lastresult$.length > 1">  <assign name="myresults" expr="application.lastresult$"/>  <assign name="choiceprompt" expr="listresults(myresults)"/> <else/>  <assign name="choice" expr="0"/> </if> </filled> </field> <field name="choice" type="digits"> <prompt> <value expr="choiceprompt"/> </prompt> 54 VOICEXML PROGRAMMER S GUIDE

67 Using Multiple Interpretations <filled> <if cond="choice == 0">  <clear/> <elseif cond="choice > myresults.length"/>  <prompt>unrecognized option.</prompt> <clear namelist="choice"/> <else/> <assign name="office" expr="findresult(myresults, choice)"/> </if> </filled> </field> <block> <prompt>scheduling a visit with the <value expr="office"/> office</prompt> </block> </form> </vxml> Using Multiple Interpretations If the grammar that is used for a particular field or form is ambiguous, you can enable multiple interpretations when the user s spoken input is matched against that grammar. By default, multiple interpretations is disabled. If you want to use this feature, you must explicitly enable it. After recognition in which this feature is enabled, you check to see whether more than one interpretation was found. If so, you can prompt the user to select among the possible interpretations. Enabling Multiple Interpretations You enable multiple interpretations by setting the bevocal.maxinterpretations property to a value other than one. If only multiple interpretations is enabled, the value is the maximum number of distinct interpretations that the speech-recognition engine should return. If N-best recognition is also is enabled, the value for bevocal.maxinterpretations is used in conjunction with the value of the maxnbest property to determine how many results to return. This chapter describes using multiple interpretations recognition alone. Using Both Features Together on page 59 describes how to combine N-best recognition with multiple interpretations. When multiple interpretations is enabled, if the user s utterance match more than one rule in an ambiguous grammar, all corresponding interpretations are included in the recognition results. The most common use of ambiguous grammars is in recognizing input in <field> and <initial> elements. You typically should avoid using an ambiguous <link> or <choice> grammar. 
The remainder of this document assumes that multiple interpretations are found for spoken input in a <field> or <initial> element.

You should enable multiple interpretations only when you need it, namely in a particular field or form in which you use an ambiguous grammar. Your application should be able to handle the specified number of interpretations.

Checking for Multiple Interpretations

You should check for multiple interpretations in the <filled> element of a field or mixed-initiative form that has an ambiguous grammar.

If the user's response matched more than one grammar rule, the relevant input variables are set according to an arbitrary one of those rules. If multiple interpretations is enabled, you should check whether additional results were returned. The number of results returned from the speech-recognition engine does not necessarily equal bevocal.maxinterpretations; in most cases, fewer results are returned.

To check whether more than one result was found, examine the application.lastresult$ array, which may contain up to bevocal.maxinterpretations elements. The array contains at least one element, namely, application.lastresult$[0]. You can check application.lastresult$.length to see how many elements are in the array. For a given index i, application.lastresult$[i] is undefined if the array contains no object at that index.

If you find that only one result was returned, you do not need to take special actions; you can use the values of the input variables just as if multiple interpretations were disabled. Otherwise, you can ask the user to select among the various interpretations.

Selecting an Interpretation

Once you have determined that the application.lastresult$ array contains more than one result, your application can interact with the user to determine which result was intended. Each object in the array corresponds to one likely result; its utterance property is the recognized utterance and its interpretation property is the interpretation of that utterance. You can ask the user to select among the possible interpretations. Each result corresponds to a different interpretation of the most likely utterance; the different interpretations are in an undefined order.
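The check and the lookup described in this section can be sketched in plain JavaScript. This is a minimal sketch, not platform code: the results array only imitates the shape of application.lastresult$ (objects with utterance and interpretation properties), and the function names needsDisambiguation and slotForChoice are hypothetical names chosen for illustration.

```javascript
// Hypothetical stand-in for application.lastresult$ after an ambiguous answer.
var results = [
  { utterance: "robert", interpretation: { employee: "robert smith" } },
  { utterance: "robert", interpretation: { employee: "robert black" } }
];

// True when more than one recognition result was returned.
function needsDisambiguation(allresults) {
  return allresults.length > 1;
}

// Return the value of one slot for the user's 1-based choice,
// or undefined when the choice does not match any result.
function slotForChoice(allresults, choice, slot) {
  var result = allresults[choice - 1];
  return result === undefined ? undefined : result.interpretation[slot];
}
```

Returning undefined for an out-of-range choice lets the calling dialog decide how to reprompt, instead of failing inside the script.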
After the user selects a result, you can set input variables accordingly. For example, if the user selects the third recognition result, the interpretation is in:

application.lastresult$[2].interpretation

The interpretation has a property for each slot that is filled in by the matching grammar rule. You can access these properties to get the values for input variables. Typically, a slot name is identical to the name of an input variable. For example, the value for the city field of the interpretation is in:

application.lastresult$[2].interpretation.city

Example

This application prompts the user for an employee. The grammar allows the user to identify an employee by first name only, by first name and last name, by nickname, or by nickname and last name. The first name Robert is ambiguous: it could mean either Bob Smith or Rob Black. To allow for an ambiguous answer, the bevocal.maxinterpretations property for the employee field is set to 2. Multiple interpretations is enabled only during interpretation of the employee field.

Sample Interactions

In this interaction, the user's answer is unambiguous.

Application: Which employee do you want to call?
User: Alice.
Application: Placing call to Alice Brown.

In this interaction, the user gives an ambiguous name.

Application: Which employee do you want to call?
User: Robert.
Application: Please say 1 if you mean Robert Smith; 2 if you mean Robert Black.
User: One.
Application: Placing call to Robert Smith.

In this interaction, the user enters an invalid selection when asked for clarification.

Application: Which employee do you want to call?
User: Robert.
Application: Please say 1 if you mean Robert Smith; 2 if you mean Robert Black.
User: Three.
Application: Unrecognized option. Please say 1 if you mean Robert Smith; 2 if you mean Robert Black.
User: Two.
Application: Placing call to Robert Black.

Application Code

<?xml version="1.0"?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
<vxml version="2.0" xmlns="
<form>
  <var name="myresults"/>
  <var name="ninterps"/>
  <var name="choiceprompt"/>
  <script>
  <![CDATA[
    // Create a prompt for clarification, asking the user to choose
    // an interpretation of the most likely utterance
    function listinterps(allinterps) {
      var promptmsg = "Please say ";
      var promptindex = 0;
      for (var i = 0; i < allinterps.length; i++) {
        if (allinterps[i].utterance == allinterps.utterance) {
          promptmsg = promptmsg + ++promptindex + " if you mean " +
                      allinterps[i].interpretation.employee + "; ";
        }
      }
      ninterps = promptindex;
      return promptmsg;
    }

    // Return the interpretation chosen by the user
    function findinterp(allinterps, strindex) {
      return allinterps[strindex - 1].interpretation.employee;
    }
    // Count the number of recognition results whose utterance matches
    // the most likely utterance
    function countinterps(allresults) {
      var i, c = 0;
      for (i = 0; i < allresults.length; i++) {
        if (allresults[i].utterance == allresults.utterance) {
          c++;
        }
      }
      return c;
    }
  ]]>
  </script>
  <field name="employee">
    <property name="bevocal.maxinterpretations" value="2"/>
    <grammar type="application/x-nuance-gsl">
    <![CDATA[([
      ( alice ?brown )  {<employee "alice brown"> }
      ( robert ?smith ) {<employee "robert smith"> }
      ( bob ?smith )    {<employee "robert smith"> }
      ( robert ?black ) {<employee "robert black"> }
      ( rob ?black )    {<employee "robert black"> }
      ( joe ?jones )    {<employee "joseph jones"> }
      ( joseph ?jones ) {<employee "joseph jones"> }
    ])]]>
    </grammar>
    <prompt>which employee do you want to call?</prompt>
    <filled>
      <var name="count" expr="countinterps(application.lastresult$)"/>
      <if cond="count > 1">
        <assign name="myresults" expr="application.lastresult$"/>
        <assign name="choiceprompt" expr="listinterps(myresults)"/>
      <else/>
        <assign name="choice" expr="0"/>
      </if>
    </filled>
  </field>
  <field name="choice" type="digits">
    <prompt> <value expr="choiceprompt"/> </prompt>
    <filled>
      <if cond="choice == 0 || choice > ninterps">

        <prompt>unrecognized option.</prompt>
        <clear namelist="choice"/>
      <else/>
        <assign name="employee" expr="findinterp(myresults, choice)"/>
      </if>
    </filled>
  </field>
  <block>
    <prompt>placing call to <value expr="employee"/></prompt>
  </block>
</form>
</vxml>

Using Both Features Together

You can enable both multiple-recognition features for a particular field or form that has an ambiguous grammar and in which you anticipate that user inputs might sound similar to more than one expected response.

Enabling Both Features

You enable N-best recognition by setting the maxnbest property to a value greater than one. You enable multiple interpretations by setting the bevocal.maxinterpretations property to a value other than one.

Speech-recognition results are returned in the application.lastresult$ array. Each element corresponds to one interpretation of one likely utterance. The same utterance may have different interpretations, and two or more different utterances may have a common interpretation.

The values of the two properties are used together to limit the number of results that are returned by the speech-recognition engine. If bevocal.maxinterpretations is undefined or less than one, up to maxnbest results are returned. If bevocal.maxinterpretations is greater than one, up to maxnbest distinct utterances are returned, each of which can have up to bevocal.maxinterpretations distinct interpretations; in that case, the maximum number of results is the product of maxnbest and bevocal.maxinterpretations. For example, if maxnbest is 3 and bevocal.maxinterpretations is 0, a maximum of 3 results can be returned; if maxnbest is 3 and bevocal.maxinterpretations is 2, a maximum of 6 results can be returned.

Checking for Multiple Results

You should check for multiple results in the <filled> element of a field or mixed-initiative form for which both multiple-recognition features are enabled.
As always, the most likely recognized utterance is used to set relevant input variables. If the most likely utterance matches more than one grammar rule, the relevant input variables are set according to an arbitrary one of those rules.

To check whether more than one result was found, you examine the application.lastresult$ array. The application.lastresult$ array contains at least one element, namely, application.lastresult$[0]. You can check application.lastresult$.length to see how many elements are in the array. For a given index i, application.lastresult$[i] is undefined if the array contains no object at that index.
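The limit on the array length follows from the rules given under "Enabling Both Features". As a rough sketch of that arithmetic only (the platform applies the limit itself; maxResults is a hypothetical helper, not a BeVocal function):

```javascript
// Upper bound on the number of recognition results, following the rules
// stated under "Enabling Both Features".
// maxinterpretations may be undefined when the property is not set.
function maxResults(maxnbest, maxinterpretations) {
  if (maxinterpretations === undefined || maxinterpretations < 1) {
    // Up to maxnbest results are returned in total.
    return maxnbest;
  }
  // Up to maxnbest utterances, each with up to maxinterpretations interpretations.
  return maxnbest * maxinterpretations;
}
```

With the values used in the text, maxResults(3, 0) gives 3 and maxResults(3, 2) gives 6. An application can use such a bound to decide how long a clarification prompt it is prepared to read out.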

If you find that only one result was returned, you do not need to take special actions; you can use the results in the input variables just as if multiple recognition were disabled. Otherwise, you can ask the user to select among the various results.

Selecting a Result

Once you have determined that the application.lastresult$ array contains more than one result, your application can interact with the user to determine which result was intended. Each object in the array corresponds to one likely result; its utterance property is the recognized utterance and its interpretation property is the interpretation of that utterance. You can ask the user to select among the possible results. You should assume that the different recognition results may correspond to different possible utterances as well as different interpretations of some utterances. Elements for different possible utterances are ordered by speech-recognition engine confidence level; elements for the different interpretations of a given utterance are in an undefined order.

After the user selects a result, you can set input variables accordingly. For example, if the user selects the third recognition result, the interpretation is in:

application.lastresult$[2].interpretation

The interpretation has a property for each slot that is filled in by the matching grammar rule. You can access these properties to get the values for input variables. Typically, a slot name is identical to the name of an input variable. For example, the value for the city field of the interpretation is in:

application.lastresult$[2].interpretation.city

Simple Example

This application allows a user to schedule a visit with one of the company's offices, identified by the city where the office is located. The grammar includes three cities whose names have somewhat similar sounds: Austin, Boston, and Houston. The grammar allows the user to identify an office by city only or by city and state.
The city Austin is ambiguous; the company has offices in both Austin, Texas and Austin, California. To allow for the situation in which the speech-recognition engine cannot distinguish among similar-sounding names, recognizes an ambiguous answer, or both, the maxnbest property for the office field is set to 4. The bevocal.maxinterpretations property is not set; it is undefined by default, so multiple interpretations is also enabled and the maximum number of results is 4. Multiple recognition is enabled only during interpretation of the office field.

If the application receives more than one recognition result for the office field, it prompts the user to select a number corresponding to one of the possible results. Any result whose confidence level is within 0.3 of the highest confidence level is considered. If the grammar included rules in which different similar-sounding utterances could produce the same interpretation, the application could ensure that it only asks the user about unique interpretations.
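The two filtering ideas in the preceding paragraph (keep results within 0.3 of the best confidence, and ask only about unique interpretations) can be sketched as follows. This is an illustrative sketch under stated assumptions: the sample array imitates application.lastresult$ with made-up confidence values, and closeResults and uniqueBySlot are hypothetical helper names, not part of the guide's application code.

```javascript
// Made-up results shaped like application.lastresult$ entries.
var sample = [
  { utterance: "boston",  confidence: 0.62, interpretation: { office: "boston massachusetts" } },
  { utterance: "austin",  confidence: 0.60, interpretation: { office: "austin texas" } },
  { utterance: "austin",  confidence: 0.60, interpretation: { office: "austin california" } },
  { utterance: "houston", confidence: 0.20, interpretation: { office: "houston texas" } }
];

// Keep the results whose confidence is within `margin` of the first
// (most likely) result.
function closeResults(allresults, margin) {
  var best = allresults[0].confidence;
  return allresults.filter(function (r) {
    return best - r.confidence < margin;
  });
}

// Keep only the first result for each distinct value of `slot`.
function uniqueBySlot(allresults, slot) {
  var seen = {};
  return allresults.filter(function (r) {
    var key = r.interpretation[slot];
    if (seen[key]) {
      return false;
    }
    seen[key] = true;
    return true;
  });
}
```

For the sample data, closeResults(sample, 0.3) keeps the first three entries and drops the Houston result. Applying uniqueBySlot afterwards would matter only for a grammar in which different utterances map to the same office value.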

Sample Interactions

In this interaction, the application cannot distinguish between two possible responses, one of which is ambiguous.

Application: Which office would you like to visit?
User: (Garbled) ahstin.
Application: Please say 1 if you mean Boston Massachusetts; 2 if you mean Austin Texas; 3 if you mean Austin California. If you want to start over, answer 0.
User: Two.
Application: Scheduling a visit with the Austin Texas office.

Application Code

<?xml version="1.0"?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
<vxml version="2.0" xmlns="
<form>
  <var name="myresults"/>
  <var name="ninterps"/>
  <var name="choiceprompt"/>
  <script>
  <![CDATA[
    // List every result whose confidence is within 0.3 of the best one
    function listinterps(allinterps) {
      var promptmsg = "Please say ";
      var index = 0;
      var promptindex = 1;
      var maxconfidence = allinterps[0].confidence;
      while (allinterps[index] != undefined &&
             maxconfidence - allinterps[index].confidence < 0.3) {
        promptmsg = promptmsg + promptindex + " if you mean " +
                    allinterps[index].interpretation.office + "; ";
        ++index;
        ++promptindex;
      }
      ninterps = index;
      promptmsg = promptmsg + "if you want to start over, say 0.";
      return promptmsg;
    }
    // Return the office for the user's 1-based choice
    function findinterp(allinterps, strindex) {
      return allinterps[strindex - 1].interpretation.office;
    }
  ]]>
  </script>
  <field name="office">
    <property name="maxnbest" value="4"/>

    <grammar type="application/x-nuance-gsl">
    <![CDATA[([
      ( austin ?texas )          {<office "austin texas"> }
      ( austin ?california )     {<office "austin california"> }
      ( boston ?massachusetts )  {<office "boston massachusetts"> }
      ( chicago ?illinois )      {<office "chicago illinois"> }
      ( denver ?colorado )       {<office "denver colorado"> }
      ( houston ?texas )         {<office "houston texas"> }
    ])]]>
    </grammar>
    <prompt>which office would you like to visit?</prompt>
    <filled>
      <if cond="application.lastresult$.length > 1 &amp;&amp; application.lastresult$[0].confidence - application.lastresult$[1].confidence &lt; 0.3">
        <assign name="myresults" expr="application.lastresult$"/>
        <assign name="choiceprompt" expr="listinterps(myresults)"/>
      <else/>
        <assign name="choice" expr="0"/>
      </if>
    </filled>
  </field>
  <field name="choice" type="digits">
    <prompt> <value expr="choiceprompt"/> </prompt>
    <filled>
      <if cond="choice == 0">
        <clear/>
      <elseif cond="choice > ninterps"/>
        <prompt>unrecognized option.</prompt>
        <clear namelist="choice"/>
      <else/>
        <assign name="office" expr="findinterp(myresults, choice)"/>
      </if>
    </filled>
  </field>
  <block>
    <prompt>scheduling a visit with the <value expr="office"/> office</prompt>
  </block>
</form>
</vxml>

Generating a Subdialog

The preceding examples asked the user to enter a number corresponding to the intended response. You can produce a more sophisticated interaction by generating a subdialog from the value of application.lastresult$ and using the subdialog to request disambiguation.

This application consists of a mixed-initiative form that prompts the user for a city and state. As in the preceding example, the grammar is ambiguous and some possible city names sound similar. After the user has filled in the city and state, the application checks application.lastresult$ to see whether multiple recognition results were found and, if so, whether the confidence levels of the first two results are within 0.3. If so, the application calls a subdialog, which is generated from the value of application.lastresult$ by a Perl script. The Perl script receives the array of recognition results as a POST parameter named results.

Sample Interaction

In this interaction, the application cannot distinguish between two possible responses, both of which are ambiguous.

Application: Please name a city and state.
User: (Garbled) ahstin.
Application: I didn't quite get that. Please say "that one" when you hear the city you want. Boston, Massachusetts. (Pause) Boston, Maine. (Pause) Austin, Texas.
User: Yes.
Application: You chose Austin Texas.

Application Code

<?xml version="1.0"?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
<vxml version="2.0" xmlns="
<form>
  <property name="maxnbest" value="5"/>
  <var name="results"/>
  <grammar mode="voice" type="application/x-nuance-gsl">
  <![CDATA[([
    ( austin ?california )    { <city austin> <state california> }
    ( austin ?texas )         { <city austin> <state texas> }
    ( boston ?maine )         { <city boston> <state maine> }
    ( boston ?massachusetts ) { <city boston> <state massachusetts> }
    ( chicago ?illinois )     { <city chicago> <state illinois> }
    ( denver ?colorado )      { <city denver> <state colorado> }
    ( houston ?texas )        { <city houston> <state texas> }
  ])]]>
  </grammar>
  <initial>
    <prompt>please name a city and state</prompt>
  </initial>
  <field name="city">
    Choose a city
    <grammar type="application/x-nuance-gsl">
      [ austin boston chicago denver houston ]
    </grammar>
  </field>

  <field name="state">
    Which state?
    <grammar type="application/x-nuance-gsl">
      [ california colorado illinois maine massachusetts texas ]
    </grammar>
  </field>
  <filled namelist="city state" mode="all">
    <log>last result is <value expr="application.lastresult$"/> </log>
    <if cond="application.lastresult$.length > 1 &amp;&amp; application.lastresult$[0].confidence - application.lastresult$[1].confidence &lt; 0.3">
      <assign name="results" expr="application.lastresult$"/>
      <goto nextitem="disambig"/>
    </if>
  </filled>
  <subdialog name="disambig" cond="false" src="disambig.pl" namelist="results" method="post">
    <filled>
      <assign name="city" expr="disambig.city"/>
      <assign name="state" expr="disambig.state"/>
    </filled>
  </subdialog>
  <block>
    <prompt>you chose <value expr="city"/>, <value expr="state"/>.</prompt>
  </block>
</form>
</vxml>

Perl Script

#!/usr/local/bin/perl5
# This sample Perl-based CGI demonstrates a server-side technique
# for disambiguating the multiple utterances or multiple interpretations
# from speech recognition. This script assumes the recognition was
# from a grammar that filled two slots: "city" and "state".
#
# The expected HTTP request parameters are:
#   results.length - The number of results from the recognition.
#
# For i from 0 to results.length - 1:
#   results[i].confidence - The confidence of this result, 0 to 1
#   results[i].interpretation.city - The city recognized for this result
#   results[i].interpretation.state - The state recognized for this result

use CGI;

# Print out the XML and VoiceXML headers
#
print "\n";
print "<?xml version=\"1.0\"?>\n\n";
print "<!DOCTYPE vxml PUBLIC \"-//BeVocal Inc//VoiceXML 2.0//EN\"\n";
print "\"
print "<vxml version=\"2.0\">\n";

# Print the beginning of the disambiguation dialog
#
print " <form>\n";
print " <block>\n";
print " I didn't quite get that.\n";
print " Please say 'that one' when you hear the city you want.\n";
print " </block>\n";

# Create two variables to store the disambiguation results
#
print " <var name=\"city\"/>\n";
print " <var name=\"state\"/>\n";

# Figure out how many results we have to deal with
$length = CGI::param("results.length");

# Save the maximum confidence of any result (which we know is always the first one)
#
$maxconfidence = ( CGI::param("results[0].confidence") );

for ($i = 0; $i < $length; $i++) {
  my ($fname) = ( "f$i" ); # field name
  my ($city) = ( CGI::param("results[$i].interpretation.city") );
  my ($state) = ( CGI::param("results[$i].interpretation.state") );
  my ($level) = ( CGI::param("results[$i].confidence") );

  # If the confidence of this result is close enough to the maximum

  # one, then use it in the disambiguation
  #
  if (($maxconfidence - $level) < 0.3) {
    # Generate a field that prompts this city/state name
    # and then pauses briefly to let the user say "that one".
    # If the user remains silent, our custom <noinput> handler
    # simply goes to the next field.
    #
    print " <field name=\"$fname\">\n";
    print " <grammar>[ yes ( that one ) ]</grammar>\n";
    print " <prompt timeout=\"0.75s\">\n";
    print " $city, $state\n";
    print " </prompt>\n";

    # If the user said "that one" (or "yes"), fill in the result variables.
    # The form-level <filled> block below will then return them.
    #
    print " <filled>\n";
    print " <assign name=\"city\" expr=\"'$city'\"/>\n";
    print " <assign name=\"state\" expr=\"'$state'\"/>\n";
    print " </filled>\n";

    # The user didn't say anything. Tell the interpreter to go to the
    # next field by setting this field's form item variable to true.
    #
    print " <noinput>\n";
    print " <reprompt/>\n";
    print " <assign name=\"$fname\" expr=\"true\"/>\n";
    print " </noinput>\n";
    print " </field>\n";
  } # End if
}; # End for

# If we get here, it means the user didn't say "that one" on any of the fields.
# Clear them all out and try again.
#
print " <block>\n";
print " <clear/>\n";
print " </block>\n";

# This form-level filled block returns the city and state variables
# as soon as any of the fields are filled by the user saying "that one"
#
print " <filled mode=\"any\">\n";
print " <return namelist=\"city state\"/>\n";
print " </filled>\n";
print " </form>\n";
print "</vxml>\n";
print "\n";
exit 0;
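The header comment of the Perl script lists the request parameter names it expects (results.length, results[i].confidence, and so on). The sketch below shows how an array shaped like application.lastresult$ maps onto those names. It is only an illustration for exercising such a script locally: the flattening of the submitted object is performed by the platform, and flattenResults is a hypothetical helper that covers just the parameters named in the script's comments.

```javascript
// Map an array of recognition results onto flat request parameter names
// of the form used by the Perl script: name.length, name[i].confidence,
// and name[i].interpretation.<slot>.
function flattenResults(name, allresults) {
  var params = {};
  params[name + ".length"] = String(allresults.length);
  allresults.forEach(function (r, i) {
    var prefix = name + "[" + i + "]";
    params[prefix + ".confidence"] = String(r.confidence);
    Object.keys(r.interpretation).forEach(function (slot) {
      params[prefix + ".interpretation." + slot] = r.interpretation[slot];
    });
  });
  return params;
}

// Made-up sample data for illustration.
var flat = flattenResults("results", [
  { confidence: 0.62, interpretation: { city: "boston", state: "massachusetts" } },
  { confidence: 0.6,  interpretation: { city: "austin", state: "texas" } }
]);
```

Here flat holds one entry per parameter, for example the key "results[1].interpretation.state" with the value "texas".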

PART 2 VoiceXML Extended

This part explains how to use BeVocal VoiceXML extensions:

- Chapter 6, Controlling Outbound Calls
- Chapter 7, Go-Back Facility
- Chapter 8, TTS and Recorded Voice Selection
- Chapter 9, Dynamic SSML
- Chapter 10, SOAP Client Facility

6 Controlling Outbound Calls

A session begins when a call is either received or placed, and then associated with an application via the BeVocal VXML Interpreter. The application can then initiate an outbound call to a third party, allowing the user to talk to the called third party:

- The standard VoiceXML tag <transfer> transfers the user's call to the third party, either terminating execution of the application or temporarily suspending execution of the application (except for recognition of speech that matches the transfer grammar).
- The BeVocal VoiceXML extension <bevocal:dial> initiates an outbound call to a third party and continues executing the application. Additional extension tags can modify the outbound call in any of the following ways:
  - The outbound call can be put on hold.
  - The application can listen to the user, waiting for a recognized utterance.
  - The application can put the user on hold and play audio output to the called third party.

This chapter describes interactions among the user, the application, and the called third party during outbound calls:

- Call Status
- Limitations on Outbound Calls
- Interactions Without an Outbound Call
- Interactions During a Transfer
- Interactions During a Dialed Call
- Putting a Dialed Call on Hold
- Listening to a Dialed Call
- Interrupting a Dialed Call
- VoIP and Outbound Calls

Note: Committees are currently working to standardize call-control features for VoiceXML, and their current approach is different from the BeVocal VoiceXML implementation. Because the approval of any call-control standards will be quite some time in coming, however, BeVocal VoiceXML contains this extension to allow developers to start taking advantage of call-control features. BeVocal will continue to monitor the development within these committees. If the call-control features become a part of VoiceXML or a separate call-control standard, BeVocal VoiceXML will implement that standard.
We will then deprecate the current extension and provide developers with information on how to convert their applications to the new standard.

Call Status

A call is in progress from the time a successful connection is made to the time the call is terminated. A call is active from the time the call is connected until the Interpreter detects a hangup.

Limitations on Outbound Calls

Only one outbound call can be in progress at a given time:

- During an outbound call placed by a <transfer> tag, execution of the application is suspended. Consequently, it is not possible to try to place a second outbound call at the same time.
- After an outbound call is placed by a <bevocal:dial> tag, execution continues; that call must be terminated before another <bevocal:dial> tag or a <transfer> tag is executed.

Note: Putting the outbound call on hold is not sufficient; it must be terminated before another outbound call can be placed successfully.

Additional restrictions apply to outbound calls placed by a Café customer's VoiceXML application:

- The application may place a local or long-distance outbound call, but not an international call.
- The maximum duration of the outbound call is 60 seconds.
- Inbound calls using VoIP cannot currently make outbound calls.

These restrictions do not apply to a hosting customer's VoiceXML applications.

Interactions Without an Outbound Call

When no outbound call is in progress, the user speaks and sends DTMF signals to the application; the application listens to the user and plays audio output to the user.

[Figure: inbound call only. The user sends speech and DTMF to the VoiceXML platform and receives audio output; the VoiceXML application is listening and executing.]

Interactions During a Transfer

A <transfer> element places an outbound call to a third party. During the outbound call, the user and the called third party talk to each other; either the application is quiet or it is completely terminated.

BeVocal VoiceXML currently supports three transfer methods: bridge, blind and supervised.

During a bridging transfer, execution of the application is suspended. When the outbound call terminates, execution resumes. At that time, child elements of the <transfer> element (for example, <filled>) are executed, if appropriate. Then, the interpreter proceeds as usual, looking for the next form item to execute.

The application may participate in the outbound call, depending on whether the <transfer> element includes child grammars.

BeVocal VoiceXML also supports blind and supervised transfers. In a blind transfer, as soon as the session starts the outbound call, the VoiceXML session ends. Regardless of the success or failure of the outbound call, control does not return to the VoiceXML application. In a supervised transfer, once the outbound call is successfully connected, the VoiceXML session ends and control cannot return to the VoiceXML application; however, if the outbound call is not successfully connected (for example, due to no answer or line busy), then control returns to the VoiceXML application.

Note: To use blind or supervised transfers, contact BeVocal Customer Support.

Without a Transfer Grammar

If the <transfer> element has no child grammar, the VoiceXML application simply waits for the outbound call to terminate. The application does not listen to the user; execution is suspended, so the application cannot play audio output to the user. Because the inbound communication channel is not used in this situation, the inbound call is inactive.

[Figure: transfer without a grammar. The user and the called third party exchange speech (and possibly DTMF) over the outbound call; the VoiceXML application is waiting and the inbound call is inactive.]

With Transfer Grammars

If the <transfer> element includes any child grammars, the application listens to the user. If a user utterance matches a child grammar, the outbound call is terminated.

[Figure: transfer with grammars. The user and the called third party exchange speech (and possibly DTMF) over the outbound call; the VoiceXML application is listening and waiting on the inbound call.]

Interactions During a Dialed Call

A <bevocal:dial> tag places an outbound call to a third party. After the call is placed, the user and the called third party talk to each other. Execution of the application continues as soon as the call is placed.

The inbound communication channel remains open, and the application continues to listen to the user. If a user utterance matches an active grammar, the application responds to the recognized utterance as usual.

[Figure: dialed call. The user and the called third party exchange speech (and possibly DTMF) over the outbound call; the VoiceXML application is listening and executing, and plays audio output to the user over the inbound call.]

The name attribute of the <bevocal:dial> tag is required; its value is the name of a variable. If the specified name does not match an existing variable name, a new variable is created. When the call is placed successfully, the specified variable is set to a JavaScript object referring to the call. That variable can be used in the call attribute of other tags to terminate the call or to modify interactions during the call.

The outbound call continues until the called third party hangs up, the application executes a <bevocal:disconnect> tag, or the call exceeds its maximum allowed duration. If the inbound call terminates (either because the user hangs up or because the application executes a <disconnect> tag), the outbound call is also terminated and the session ends.

The interactions during a dialed call can be modified in any of the following ways:

- If the destination phone number specifies an extension as post-dial digits, those digits are sent to the called third party as DTMF signals after the call is answered.
- If the onhold attribute of the <bevocal:dial> tag is true, the outbound call is put on hold as soon as the connection is made. In addition, an active call is put on hold when a <bevocal:hold> tag is executed. See "Putting a Dialed Call on Hold".
- A <bevocal:listen> element allows the application to suspend execution temporarily during the outbound call. See "Listening to a Dialed Call".
- A <bevocal:whisper> element puts the user on hold, allowing the application to play audio output to the called third party. See "Interrupting a Dialed Call".

Putting a Dialed Call on Hold

A dialed outbound call can be put on hold in either of two ways:

- A new call is put on hold as soon as the connection is made if the onhold attribute of the <bevocal:dial> tag is true; otherwise, the call remains active.
- An active outbound call is put on hold if the interpreter executes a <bevocal:hold> tag.

While the outbound call is on hold, communications are like those when no outbound call is in progress. Namely, the user speaks and sends DTMF signals to the application; the application listens to the user and plays audio output to the user. For SIP calls, you can use the transferaudio attribute to play audio to the called third party. Otherwise, the called third party hears silence. Neither the user nor the application hears the third party.

[Figure: dialed call on hold. The outbound call to the called third party is inactive; the user sends speech and DTMF to the VoiceXML platform and receives audio output; the VoiceXML application is listening and executing.]

Note: You cannot place a new outbound call while a prior call is on hold.

Placing a call on hold does not extend its maximum allowed duration. The maxtime attribute of <bevocal:dial> limits the total time that the call can be in progress (not the time the call can be active).

An outbound call remains on hold until the interpreter executes a <bevocal:connect> tag to reconnect it; at that time, the dialed call becomes active once again. Alternatively, a call that is on hold can be terminated without becoming active.
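The hold and reconnect rules above, together with the one-call-at-a-time limitation, amount to a small state model. The following JavaScript sketch is a conceptual model only, not BeVocal code: the Session and OutboundCall names and the state labels are hypothetical, and throwing an Error when a second call is dialed is an assumption made for the sketch (the text says only that such a call cannot be placed successfully).

```javascript
// Conceptual model of a dialed call: "active", "onhold", or "terminated".
function OutboundCall(onhold) {
  this.state = onhold ? "onhold" : "active";
}
// A call is in progress until it is terminated, whether or not it is on hold.
OutboundCall.prototype.inProgress = function () {
  return this.state !== "terminated";
};
// Models <bevocal:hold>: only an active call can be put on hold.
OutboundCall.prototype.hold = function () {
  if (this.state === "active") { this.state = "onhold"; }
};
// Models <bevocal:connect>: a held call becomes active again.
OutboundCall.prototype.connect = function () {
  if (this.state === "onhold") { this.state = "active"; }
};
// Models <bevocal:disconnect>: allowed from either state.
OutboundCall.prototype.disconnect = function () {
  this.state = "terminated";
};

function Session() {
  this.call = null;
}
// Models <bevocal:dial>: refuse a new call while another is in progress.
Session.prototype.dial = function (onhold) {
  if (this.call !== null && this.call.inProgress()) {
    throw new Error("an outbound call is already in progress");
  }
  this.call = new OutboundCall(onhold);
  return this.call;
};
```

In this model, a call dialed with onhold set starts in the "onhold" state, and a second dial attempt fails until the first call is disconnected, even if that call is on hold.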

Listening to a Dialed Call

While a dialed call is active, a <bevocal:listen> tag can temporarily suspend execution of the application. The application may listen to the user while execution is suspended, depending on whether the <bevocal:listen> element includes child grammars.

A <bevocal:listen> element is an input item; if a user utterance matches a child grammar, its input variable is set to the recognition result. After successful recognition, execution of the application resumes and the outbound call continues. At that time, child elements of the <bevocal:listen> element are executed, if appropriate. Then, the interpreter proceeds as usual, looking for the next form item to execute. If you want the outbound call to end when recognition occurs, you must terminate the call explicitly in a <filled> element.

If the outbound call terminates with no recognition, an event is thrown and execution continues. In this case, the input variable is not set.

Note: An outbound call must be active when the <bevocal:listen> element is executed. If you want to prevent the <bevocal:listen> element from being selected for execution after the outbound call terminates, you can set the input variable explicitly in an event handler.

With Listen Grammars

If a <bevocal:listen> element includes any child grammars, the application listens to the user, suspending execution until recognition occurs or the outbound call terminates.

[Figure: listening with grammars. The user and the called third party exchange speech (and possibly DTMF) over the outbound call; the VoiceXML application is listening and waiting on the inbound call.]

Without a Listen Grammar

If a <bevocal:listen> element has no child grammars, the VoiceXML application suspends execution until an event is thrown indicating that the outbound call has terminated. The application does not listen to the user; execution is suspended, so the application cannot play audio output to the user. Because the inbound communication channel is not used in this situation, the inbound call is inactive.

[Figure: listening without a grammar. The user and the called third party exchange speech (and possibly DTMF) over the outbound call, while the VoiceXML application waits; the inbound call is inactive.]

Interrupting a Dialed Call

While a dialed call is active, a <bevocal:whisper> tag can interrupt the call. The user is put on hold and the application is connected to the called third party. The application can then play audio output to the third party. While the call is interrupted, the third party can hear the application, but cannot hear the user. The application ignores all speech and DTMF signals, whether from the user or from the called third party. For SIP calls, you can use the transferaudio attribute to play audio to the user. Otherwise, the user hears silence.

[Figure: interrupted call. The VoiceXML application (executing) plays audio output to the called third party over the outbound call; the inbound call to the user is inactive.]

The user is reconnected to the third party when execution of the <bevocal:whisper> element terminates, whether execution terminates successfully or because an event is thrown.

VoIP and Outbound Calls

The BeVocal Platform is capable of placing outbound calls to both PSTN and SIP destinations. However, calls to SIP destinations must originate on a VoIP gateway. Outbound calls are not currently supported in the BeVocal Café development environment. For more information, see the BeVocal Voice Over IP (VoIP) Support Quick Reference.


7 Go-Back Facility

The BeVocal VoiceXML go-back facility allows the user to retract the last response or to transition back to the last location in an application. This chapter describes:

- Retracting User Responses
- Go-Back Stack
- Go-Back Destinations
- Enabling the Go-Back Facility
- Controlling Go-Back Behavior
- Using the Go-Back Facility

Note: The go-back facility is an experimental extension to VoiceXML; its implementation and behavior are subject to change. The current BeVocal VoiceXML implementation contains the feature before it has been standardized so that developers may provide feedback. If this capability becomes a standard part of a future version of VoiceXML, the BeVocal VoiceXML implementation will change as necessary to match the VoiceXML standard.

Retracting User Responses

If the go-back facility is enabled, the user can retract the last response to a VoiceXML application by saying "go back." After the interpreter removes the user's response, it prompts for the information again. For example, the following form asks for the user's home and work phone numbers:

    <form>
      <field name="home" type="phone">
        <prompt> What is your home phone number? </prompt>
      </field>
      <field name="work" type="phone">
        <prompt> What is your work phone number? </prompt>
      </field>
    </form>

Suppose a user inadvertently gives the work number when asked for the home number. The go-back facility would allow the user to correct this mistake.

    Application: What is your home phone number?
    User:        [work phone number]
    Application: What is your work phone number?
    User:        Go back.
    Application: What is your home phone number?
    User:        [home phone number]
    Application: What is your work phone number?
    User:        [work phone number]

The go-back facility also allows users to change their minds after requesting one of several alternatives. For example, it would permit the following interaction.

    Application: Would you like News, Weather, or Traffic?
    User:        Weather.
    Application: What city?
    User:        Go back.
    Application: Would you like News, Weather, or Traffic?
    User:        Traffic.

Go-Back Stack

When the user says "go back," the interpreter undoes whatever actions resulted from the last response, then it prompts the user for a new response. The user can retract a sequence of responses by saying "go back" repeatedly.

Each request for user input is called a go-back destination. When the user provides the requested input, the interpreter saves information about the go-back destination as an entry on its go-back stack. If the user says "go back," the interpreter uses the saved information for the most recent go-back destination on the stack to undo the actions that resulted from the user's response. It then goes back to that go-back destination, popping the corresponding entry off the stack.

Stack Entries

Each entry on the go-back stack saves information about one step the interpreter performed during the execution of the application.

Go-Back Entries

The entries corresponding to go-back destinations are called go-back entries; they correspond to the user-visible steps in the interaction. As the user retraces these steps, the interpreter goes back to the appropriate elements within the VoiceXML application, transparently moving between dialogs and documents as necessary. For example:

- After the user fills the last field in a form, the form may transition to a different form in a different document. If the user says "go back" to the first question on the new form, the interpreter returns to the first form in the original document.
  It clears the last field in that form, but restores the values of all other form-item variables in the form.
- A user's response may match a link grammar that transitions to a different form or that throws an event that causes a transition. Or a user's response may match a document-scoped grammar in a different form, causing a transition to that form. If the user says "go back" to the first question in the new location, the interpreter returns to the form and field that was being visited at the time of the user's last response.

Internal Entries

In addition to the go-back entries, the go-back stack saves internal entries, which correspond to non-user-visible steps, such as transitions between forms. When the interpreter goes back to the most recent go-back destination, it also undoes each non-user-visible step that occurred after the last go-back destination and pops the corresponding internal entry off the stack.

A <block> form item does not request user input and so is not a possible go-back destination. However, any block items that are executed between one input request and the next are saved as internal stack entries that can be undone when the interpreter goes back to the preceding input request.

Go-Back Destinations

A VoiceXML application can request user input in a menu, in the initial item of a mixed-initiative form, and in an input item. These elements, therefore, can be go-back destinations.

Menus

A <menu> element asks the user to select a choice. The <menu> element is the go-back destination for the user's response. If the user says "go back" after selecting a menu choice, the menu is executed again.

Mixed-Initiative Forms

The <initial> element of a mixed-initiative form asks the user for initial input to the form. This element is the go-back destination for the user's response. If the user says "go back" after providing initial input, the initial element is executed again.

The user's answer to the initial prompt may provide values for several of the form's input variables. When the interpreter undoes an initial element, it clears not only the initial form-item variable, but also any input variables that were set by the user's response.

Input Items

For the purposes of the go-back facility, input items can be classified as follows:

- The <field> and <record> items accept a single user input.
  These input items appear to the user as a single request for information.
- The <transfer> item may involve a long interaction between the user and a third party. It provides the application with a single piece of information, namely the result of the transfer. During a transfer, however, the user may provide various pieces of information to the third party and may later want to retract some or all of that information.
- The <subdialog> item may accept multiple user inputs and so may appear to the user as multiple requests for information.

Single-Input Input Items

A <field> item asks the user for the value of its input variable. A <record> item asks the user for input to be recorded. These input items are go-back destinations. If the interpreter goes back to one of these items, it clears the corresponding input variable and executes the item again. Going back to a <field> item allows the user to give a different answer; going back to a <record> item allows the user to provide different input to be recorded.

Transfer Items

A <transfer> item transfers the user to another destination, allowing the user to carry on a conversation with a third party. At the end of a bridging transfer, the interpreter resumes execution of the form containing the transfer item. The user might then say "go back" to the next request for input. In a blind or supervised transfer, the current session terminates when the transfer is made; the user has no opportunity to invoke the go-back facility at the end of the call.

Note: Currently, the BeVocal interpreter supports bridging transfers only.

A transfer item is a go-back destination. If the interpreter goes back to a transfer item, the transfer call is repeated. Any change in the information exchanged during the original transfer and during the repeated transfer is determined by the user's conversation with the third party and does not affect the VoiceXML application. For example, the original transfer might place a call in which the user orders a pizza. After that call, the user might say "go back," and add a salad to the original order.

Subdialogs

A <subdialog> item invokes another dialog as a subdialog of the current one. Each request for input made by the subdialog is a go-back destination. The <subdialog> element itself is also a go-back destination.

If the user says "go back" to a request for input inside the subdialog, the go-back behavior is the same as in any other form. Within the subdialog's execution context, the go-back stack is initially identical to the go-back stack in the calling dialog's execution context. As each new input is requested, another go-back destination is pushed onto the stack:

- If the user says "go back" to the first input request in the subdialog, the interpreter returns to the last go-back destination in the calling dialog.
- If the user says "go back" to a subsequent input request in the subdialog, the interpreter returns to the preceding go-back destination in the subdialog.

Once the subdialog returns to the calling dialog, however, the subdialog's execution context terminates. The go-back stack in the calling dialog's execution context does not contain any go-back destinations for the input requests made by the subdialog. A new go-back destination is added for the subdialog itself.

If the subdialog requests a single user input, the go-back behavior is the same as for any other single-input input item. If a subdialog requests more than one user input, however, the go-back behavior may not be what the user expects. For example, suppose a document contains the following forms:

    <form id="main">
      <field name="A">...</field>
      <subdialog name="B" src="#sub">...</subdialog>
      <field name="F">...</field>
    </form>
    <form id="sub">
      <field name="C">...</field>
      <field name="D">...</field>
      <field name="E">...</field>
      <filled>
        <return namelist="C D E"/>
      </filled>
    </form>

A user who says "go back" when prompted for field F might expect to provide a different answer for field E in the subdialog. However, the interpreter goes back to subdialog B. It executes the subdialog from the beginning, prompting again for fields C, D, and E.

Note: In a future release, you may be able to specify whether "go back" will go back into the subdialog (to ask for field E in the preceding example) or to the beginning of the subdialog (as currently happens).

Enabling the Go-Back Facility

To enable the go-back facility, you must set two properties to:

- Activate the goback universal grammar
- Specify the minimum size of the go-back stack

You use the <property> tag to set both properties.

Activating the Universal Grammar

The goback universal grammar recognizes the spoken "go back" request. Like all universal grammars, it is deactivated by default. See Universal Commands and Grammars on page 14. When this grammar is deactivated, the speech-recognition engine does not recognize the input "go back." If the user says "go back" to the prompt for a field, a no-match event is thrown.

You set the universals property to activate or deactivate the goback grammar, either in the entire application, or in particular documents, forms, or fields. The go-back facility is activated in the following document, but deactivated during the execution of the first form.

    <vxml version="2.0" xmlns="...">
      <property name="universals" value="goback help exit"/>
      ...
      <form>
        <property name="universals" value="help exit"/>
        ...
      </form>
      ...
    </vxml>

Setting the Minimum Stack Size

The bevocal.mingoback property specifies the minimum size of the go-back stack. The interpreter keeps at least this many entries on the stack, except at the beginning of the call when fewer steps have been executed, and after the user has said "go back" so many consecutive times that the stack has been depleted.
By default, this property is set to 0, which means that the go-back stack is always empty and the go-back facility is effectively disabled. If the user says "go back," the go-back facility responds, "I'm sorry, I don't know where to go back to." If you want to allow the user to go back, you must set the bevocal.mingoback property to 1 or more. For example, the following application sets the minimum stack size to 20 entries.

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "...">
    <vxml version="2.0" xmlns="...">
      <property name="universals" value="goback help"/>
      <property name="bevocal.mingoback" value="20"/>
      <form>
        <field name="home" type="phone">
          <prompt> What is your home phone number? </prompt>
        </field>
        <field name="work" type="phone">
          <prompt> What is your work phone number? </prompt>
        </field>
      </form>
    </vxml>

Controlling Go-Back Behavior

You can control the application's use of the go-back facility in the following ways:

- You can prevent the user from retracting certain inputs.
- You can customize the application's response to a go-back request from the user.
- You can deactivate the go-back facility in the entire application, in a particular document, in a particular dialog, or in a particular go-back destination.

Suppressing Retraction

You can prevent the user from retracting certain inputs by setting the bevocal.goback property. This property controls whether requests for user input are legal go-back destinations. By default, the property is set to true and each request for input is a legal go-back destination. When the user provides the requested input, the interpreter pushes a go-back entry for the request onto its go-back stack.

If the bevocal.goback property is false, however, a request for input is not a legal go-back destination. When the user provides the requested input, the interpreter pushes an internal entry for the request onto its go-back stack. The internal stack entry enables the interpreter to undo the information request if the user returns to an earlier go-back destination; however, it prevents the user from going back to the request itself.

A user's response is called retractable if a corresponding go-back entry is added to the stack; if, instead, an internal entry is added to the stack, the response cannot be retracted. If you set the bevocal.goback property to false in a field, the user's input for the field is not retractable.
The user cannot go back to that field, but may skip back to retract the preceding retractable input. If you set this property to false in a form, you prevent the user from retracting any input to that form.
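As a hedged illustration of the form-level setting, the fragment below makes every input to one form non-retractable. The form name, field names, and prompts are hypothetical, and the sketch assumes that the universals and bevocal.mingoback properties have already enabled the go-back facility at a higher scope.

```xml
<!-- Fragment only: assumes the go-back facility is enabled in an enclosing
     scope through the universals and bevocal.mingoback properties. -->
<form id="payment">
  <!-- Applies to every request for input in this form: each response is
       saved as an internal stack entry rather than a go-back entry. -->
  <property name="bevocal.goback" value="false"/>

  <field name="cardnumber" type="digits">
    <prompt>What is your card number?</prompt>
  </field>

  <field name="confirm" type="boolean">
    <prompt>Shall I charge this card?</prompt>
  </field>
</form>
```

With this setting, saying "go back" at the second prompt does not return to the first one. The interpreter instead skips back to the most recent retractable input given before this form, undoing the internal entries for this form along the way.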

When several fields are treated as a single conceptual unit, you may want to suppress retraction of all but the first field. For example, the go-back facility treats the city and state fields as a unit in the following form:

    <form>
      <field name="city">
        <prompt>Choose a city</prompt>
        <grammar>...</grammar>
      </field>
      <field name="state">
        <property name="bevocal.goback" value="false"/>
        <prompt>What state?</prompt>
        <grammar>...</grammar>
      </field>
      <field name="first" type="boolean">
        <prompt> Do you want to fly first class? </prompt>
      </field>
    </form>

The user cannot retract an answer to the question about state, but can skip past it to retract the city, as illustrated in the following interaction.

    Application: Choose a city.
    User:        Albany
    Application: What state?
    User:        Georgia
    Application: Do you want to fly first class?
    User:        Go back.
    Application: Choose a city.

Customizing Go-Back

When the speech-recognition engine matches the goback grammar, a goback event is thrown. The default handler undoes entries on the go-back stack until it reaches the most recent go-back entry, corresponding to the user's last retractable response. If the go-back stack is empty, the default handler plays an audio message that says "I'm sorry, I don't know where to go back to."

If you want the application to take different actions, you can add your own event handler for go-back events. For example, an application might keep information about each user's default location. If the user requests a traffic report from the main menu, the traffic form might start to fetch the report for the user's default location without requesting the user's city. The application could use the go-back facility to allow the user to provide a different location.

    <form id="traffic">
      <catch event="goback">
        <clear/>
      </catch>
      <field name="city" expr="document.defaultcity">
        <prompt>What city?</prompt>
        <grammar>...</grammar>
      </field>
      <block>

        <prompt>
          Retrieving traffic data for <value expr="city"/>.
          Say Go Back to choose another city.
        </prompt>
      </block>
    </form>

An interaction with the application might proceed as follows.

    Application: Would you like news, weather, or traffic?
    User:        Traffic
    Application: Retrieving traffic data for San Francisco. Say Go Back to choose another city.
    User:        Go back.
    Application: What city?
    User:        San Jose.
    Application: Retrieving traffic data for San Jose. Say Go Back to choose another city.

In this case, saying "go back" takes the user to a question that has never been asked before.

If the application's go-back handler needs to take some actions and then proceed as normal to undo the user's response, it can perform the appropriate actions and then rethrow the event to the default handler:

    <catch event="goback">
      ...
      <rethrow/>
    </catch>

Using the Go-Back Facility

This section contains guidelines for using the go-back facility.

Selecting the Minimum Stack Size

You need to ensure that the go-back stack can grow large enough to enable a user to retrace as many steps as you think are likely; see Setting the Minimum Stack Size on page 83. The size of the stack limits the number of consecutive times the user can say "go back."

Remember that the stack must be large enough to accommodate internal entries as well as go-back entries. When you set the stack size, you should allow for a few internal entries for each go-back entry.

Using Blocks

You can safely put blocks between go-back destinations in a form. For example, in the following form, if the user goes back to the home field, the interpreter undoes the subsequent block, clearing its item variable and allowing the block to be visited again after the user provides a new answer for the home field:

    <form>
      <field name="home" type="phone">
        <prompt> What is your home phone number? </prompt>

      </field>
      <block>
        Your home number is <value expr="home"/>
      </block>
      <field name="work" type="phone">
        <prompt> What is your work phone number? </prompt>
      </field>
    </form>

The interaction with the user might proceed as follows.

    Application: What is your home phone number?
    User:        [phone number]
    Application: Your home number is [phone number]. What is your work phone number?
    User:        Go back.
    Application: What is your home phone number?
    User:        [phone number]
    Application: Your home number is [phone number]. What is your work phone number?
    User:        [phone number]

A <block> in a form is saved as an internal stack entry only if it occurs after the first go-back destination in the form. If the form's first item is a block containing a welcoming prompt, no internal stack entry is saved for the block, so it will not be revisited if the go-back facility returns to the first input request in the form.

Tip: In a mixed-initiative form, put any welcoming prompt in the <initial> element, not in a separate <block> element.

The internal stack entry for a block is undone and redone only if the interpreter returns to a go-back destination before the block. As a consequence, a block that is used to prompt for information in the subsequent field is not redone if the interpreter goes back to the field. In the following form, if the user says "go back" when asked for a work phone number, the request for the home phone number would not be replayed.

    <form>
      <block>What is your home phone number?</block>
      <field name="home" type="phone"></field>
      <block>Your home number is <value expr="home"/></block>
      <field name="work" type="phone">
        <prompt> What is your work phone number? </prompt>
      </field>
    </form>

Tip: Be sure to put the prompt for a field value inside the <field> element and not in a separate <block> element.

Using Subdialogs

To avoid any confusion that can occur if the user says "go back" after returning from a subdialog, try to limit your use of subdialogs to requests for confirmation or disambiguation. In addition, you should prevent the subdialog itself from being a legal go-back destination by setting the bevocal.goback property to false inside the <subdialog> element. If the user says "go back" after the subdialog returns, the interpreter will go back to the question preceding the subdialog, presumably the question whose answer required confirmation or clarification.

Using Variables

When the interpreter returns to a particular go-back destination in a form, it clears form-item variables for every block, initial item, and input item that needs to be undone. However, it does not change the values of any other variables declared in dialog, document, or application scope. If the interpreter undoes a transition, going back to a different form, it does not restore the variables declared in the form to the values they had when that transition left the form.

If you use the go-back facility, you should avoid saving state information in variables that cannot be reset by a go-back operation. In limited circumstances, you may be able to reset variables in an event handler for go-back events. In general, however, the event handler will not have enough context to know what variables need to be reset, because the event is thrown at the location where the user says "go back," not at the go-back destination.
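One of the limited circumstances in which a handler can reset a variable is sketched below. The form, variable, and field names and the grammar file items.grxml are hypothetical; the sketch assumes a form simple enough that resetting the counter to its initial value is always the right correction.

```xml
<!-- Fragment only: assumes the go-back facility is enabled in an enclosing
     scope. -->
<form id="order">
  <!-- Dialog-scoped state. A go-back operation does not reset it. -->
  <var name="itemcount" expr="0"/>

  <catch event="goback">
    <!-- Reset the state by hand, then pass the event to the default
         handler so that it undoes the user's last response. -->
    <assign name="itemcount" expr="0"/>
    <rethrow/>
  </catch>

  <field name="item">
    <prompt>What would you like to order?</prompt>
    <!-- Hypothetical grammar file. -->
    <grammar src="items.grxml" type="application/srgs+xml"/>
    <filled>
      <assign name="itemcount" expr="itemcount + 1"/>
    </filled>
  </field>

  <field name="more" type="boolean">
    <prompt>Anything else?</prompt>
  </field>
</form>
```

This works only because the form has a single place where the counter changes. In a longer form the handler could not tell which responses were being undone, which is why state of this kind is better kept in form-item variables.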

8 TTS and Recorded Voice Selection

The TTS and Recorded Voice Selection facility of BeVocal VoiceXML allows the voice developer to specify TTS and recorded voices within their VoiceXML programs. This chapter describes:

- Specifying TTS Voices
- Specifying Recorded Voices
- Lists of Fallback Voices
- Overriding Recorded Voices

Note: The TTS and Recorded Voice Selection facility is an experimental extension to VoiceXML; its implementation and behavior are subject to change. The current BeVocal VoiceXML implementation contains the feature before it has been standardized so that developers may provide feedback. If this capability becomes a standard part of a future version of VoiceXML, the BeVocal VoiceXML implementation will change as necessary to match the VoiceXML standard.

Specifying TTS Voices

Text To Speech (TTS) output can occur in many different places in a VoiceXML application. For example, TTS can be played within <block>, <prompt>, <audio>, and <say-as> tags, among others.

Supported TTS Voices

The following are the currently supported TTS voices and their characteristics:

    Voice name   Characteristics
    jennifer     Female, American English
    julie        Female, Vocalizer French Canadian
    katarina     Female, RealSpeak German
    laurie       Female, Vocalizer English
    maria        Female, Vocalizer Spanish
    mark         Male, American English
    reed         Male, Vocalizer American English

Note: If not specified, the default TTS voice is jennifer.

TTS voices can be specified in two ways: through a VoiceXML property named bevocal.voice.name, and through the name attribute of the <voice> tag. The same syntax is used for both.

Property syntax

    <property name="bevocal.voice.name" value="tts_voice_name"/>

Property

BeVocal VoiceXML has introduced a property, bevocal.voice.name, which allows you to define a voice name. A TTS voice defines the characteristics of the TTS played by a TTS engine. A TTS voice may correspond to a single TTS engine with certain parameters set. Because the BeVocal interpreter can support multiple TTS engines, two TTS voices may correspond to different TTS engines.

As with all properties, the bevocal.voice.name property is taken from the innermost property scope that applies to the TTS in question. When no property is specified, the default TTS voice is used.

Property Example

The following statement would specify the mark TTS voice for the prompts within the field, including the prompts "Please say a phone number" and "You said ":

    <field name="myphone">
      <property name="bevocal.voice.name" value="mark"/>
      <grammar src="builtin:grammar/phone"/>
      <prompt>Please say a phone number</prompt>
      <filled>
        <prompt>
          You said
          <say-as type="telephone"><value expr="myphone"/></say-as>
        </prompt>
      </filled>
    </field>

See the <say-as> tag for more details. When specifying TTS voices, it is important to know that some <say-as> types do extra processing on the TTS content of the <say-as> tag before giving it to the TTS engine, or set parameters on the TTS engine to output the result more naturally. In the case of the telephone type used above, inserting appropriate pauses in 10-digit US numbers helps the TTS output sound more natural. It is usually a good idea to use <say-as> when outputting a type that <say-as> supports. In very special cases, you can instead parse the result yourself, for example in ECMAScript, and insert the appropriate SSML or other TTS markup.

Errors

An error.noresource event is thrown when an invalid TTS voice is specified.

Voice tag Syntax

The TTS voice can alternatively be specified using the <voice> tag's name attribute:

    <voice name="mark">
      Say Hello using Mark's voice!
    </voice>

Voice tag description

The voice tag syntax allows you to specify TTS voices at a finer level of granularity, in SSML, wherever <voice> tags are allowed. With just two TTS voices, one female and one male, the use cases for switching voices within SSML are few. In general it is probably not a good practice to do so. However, as an illustration, maybe you are reading out a plot from a gripping novel:

    <voice name="jennifer">
      Would you like an apple or a banana?
    </voice>
    <voice name="mark">
      Didn't you say you had oranges?

    </voice>

A more typical use case is using a more natural-sounding voice for most TTS, but using a more intelligible TTS voice to read out some email. The BeVocal interpreter can support multiple TTS engines and would surface these as TTS voices.

Specifying Recorded Voices

A recorded voice is defined as a set of interpreter prompts recorded by a human voice talent. A recorded voice usually has a TTS voice as a fallback; all of the currently supported recorded voices do. A recorded voice is output when using the <say-as> tag and the attribute bevocal:mode="recorded".

Supported Recorded Voices

    Voice name         Characteristics            Types               TTS fallback
    bv_ann_en_us       Female, American English   all                 jennifer
    bv_adam_en_us      Male, American English     equity              mark
    bv_ben_en_us       Male, American English     citystate           mark
    bv_cecelia_es_us   Female, American Spanish   all (except time)   maria

You specify the recorded voice name in exactly the same way as the TTS voice name, using the bevocal.voice.name property. The currently available recorded voices are itemized in the table.

Note: Currently the recorded voice cannot be set using the SSML <voice> tag.

Example

The following example uses the bv_adam_en_us recorded voice within a field that prompts for a stock name and outputs the result using a recorded prompt:

    <field name="mystock">
      <property name="bevocal.voice.name" value="bv_adam_en_us"/>
      <grammar src="builtin:grammar/equity"/>
      <prompt>Please say a stock name</prompt>
      <filled>
        <prompt>
          You said
          <say-as type="equity" bevocal:mode="recorded"><value expr="mystock"/></say-as>
        </prompt>
      </filled>
    </field>

In the above example, the equity will be read out in the bv_adam_en_us recorded voice unless it is not available; in that case, it will fall back to TTS in the mark voice. This could happen, for example, if that equity has not been recorded in this voice yet. Note that if the recorded voice is not available for a particular <say-as> type, then the TTS fallback is used.
As you see in the table, only the equity type is supported for bv_adam_en_us, so if any other type is specified the output would fall back to the TTS mark voice.

Important: Again, see the <say-as> tag for more details. When specifying particular <say-as> types and recorded voices, the exact format of the content of the <say-as> tag is important.
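The effect of an unsupported type can be seen by extending the stock example with a second <say-as> element. The following fragment is a sketch only: the form name and prompt wording are hypothetical, and the comments state what the table and the fallback rule above imply rather than observed output.

```xml
<!-- Fragment only: assumes an enclosing <vxml> document that declares the
     bevocal namespace prefix. -->
<form id="alerts">
  <property name="bevocal.voice.name" value="bv_adam_en_us"/>

  <field name="mystock">
    <grammar src="builtin:grammar/equity"/>
    <prompt>Please say a stock name</prompt>
  </field>

  <field name="myphone">
    <grammar src="builtin:grammar/phone"/>
    <prompt>Please say a phone number</prompt>
  </field>

  <block>
    <prompt>
      Alerts for
      <!-- equity is a supported type for bv_adam_en_us, so the recorded
           voice is used when this stock name has been recorded. -->
      <say-as type="equity" bevocal:mode="recorded"><value expr="mystock"/></say-as>
      will be sent to
      <!-- telephone is not listed for bv_adam_en_us, so this output falls
           back to the TTS voice mark. -->
      <say-as type="telephone" bevocal:mode="recorded"><value expr="myphone"/></say-as>
    </prompt>
  </block>
</form>
```

Requesting recorded mode for a type the voice does not cover is harmless in this sketch, but the caller hears a switch from the recorded voice to TTS in the middle of the sentence.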

Lists of Fallback Voices

Providing lists of voices enables extensibility and flexibility as the BeVocal interpreter supports more recorded and TTS voices. It also provides a mechanism for BeVocal partners to create applications that override recorded voices for any type.

Rationale

There are some cases in which a recorded output may not be available for a recorded voice:

- Recorded voices may be available for only a subset of the possible <say-as> types. For example, the bv_adam_en_us voice is currently only available for the equity type.
- Within a given <say-as> type there may be some values of content which are not available in a specific recorded voice. For example, there may be some street names which are not available when specifying the type street and the bv_ann_en_us recorded voice.

Also note that the interpreter may in the future provide support for:

- additional types for existing recorded voices.
- more coverage within a type for existing recorded voices.
- new recorded voices supporting a subset of available types.
- new recorded voices with no TTS voice.

For these reasons and more, the BeVocal interpreter supports a space-delimited list of recorded and TTS voices within the bevocal.voice.name property. Remember that a voice may be:

- a recorded voice with a TTS fallback voice defined.
- a recorded voice only.
- a TTS voice only.

Algorithm for Selecting the Voice

For any space-delimited list of voices, the following algorithm is used.

1. Recorded: Use the first (leftmost) recorded voice in which the content is available.
2. TTS: Use the first (leftmost) TTS voice specified, even if it is a fallback of a recorded voice, irrespective of the recorded voice actually used in step 1.

Selecting Voice Example

For example, let's assume there is a newly available voice called bv_joe_en_us which supports all types, but the bv_adam_en_us voice has more up-to-date equity coverage.
The following property declaration:

<property name="bevocal.voice.name" value="bv_joe_en_us bv_adam_en_us"/>

and the subsequent <say-as> statement:

<say-as type="equity" bevocal:mode="recorded">
  <value expr="mystock"/>
</say-as>

would output the equity value in the variable mystock in the bv_joe_en_us voice if possible, but would fall back to the bv_adam_en_us voice if mystock was not available in the first. This assumes that you prefer Joe for consistency with the other types in your application but want to fall back to Adam to ensure a recorded male voice for equity.
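The two selection rules can be sketched in JavaScript as follows. This is only an illustration of the rules as stated, not the interpreter's implementation: the catalog object, its field names (recorded, tts, ttsFallback, hasContent), the fallback names joe_tts and adam_tts, and the sample content values are all hypothetical stand-ins for information the interpreter keeps internally.

```javascript
// Hypothetical voice catalog (assumption): the interpreter does not expose
// voice metadata in this form.
var catalog = {
  bv_superman_tts: { tts: true },
  bv_joe_en_us: {
    recorded: true,
    ttsFallback: "joe_tts", // hypothetical fallback name
    // Pretend Joe lacks a recording for one particular stock.
    hasContent: function (type, content) { return content !== "NEWCO"; }
  },
  bv_adam_en_us: {
    recorded: true,
    ttsFallback: "adam_tts", // hypothetical fallback name
    // Adam is recorded for the equity type only.
    hasContent: function (type, content) { return type === "equity"; }
  }
};

// Apply both rules to a space-delimited voice list.
function selectVoices(voiceList, type, content) {
  var names = voiceList.trim().split(/\s+/);
  var recorded = null;
  var tts = null;
  for (var i = 0; i < names.length; i++) {
    var voice = catalog[names[i]];
    if (!voice) continue; // unknown names are skipped in this sketch
    // Rule 1: leftmost recorded voice in which the content is available.
    if (recorded === null && voice.recorded && voice.hasContent(type, content)) {
      recorded = names[i];
    }
    // Rule 2: leftmost TTS voice, counting a recorded voice's TTS fallback.
    if (tts === null) {
      if (voice.tts) tts = names[i];
      else if (voice.ttsFallback) tts = voice.ttsFallback;
    }
  }
  return { recorded: recorded, tts: tts };
}
```

With this catalog, the list "bv_joe_en_us bv_adam_en_us" selects Joe's TTS fallback regardless of which recorded voice is chosen, while putting bv_superman_tts first in the list overrides it.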

What about TTS fallback in this case? If the bv_joe_en_us voice has a TTS fallback voice, it will be used, since it is the first. This is true even if bv_adam_en_us was selected for the recorded voice.

What if you didn't like the TTS fallback for the Joe voice and wanted to override it? Then you could use a statement like the following:

<property name="bevocal.voice.name" value="bv_superman_tts bv_joe_en_us bv_adam_en_us"/>

Alternatively, if the Joe voice didn't have a TTS fallback but you didn't want Adam's TTS fallback used, you could use the same statement.

Best Practices for Voice Selection

In most cases you should observe the following principles in your voice selection:

- Specify one recorded voice and one TTS voice per application for consistency in the interface.
- If you specify multiple voices, they should have similar characteristics.

For BeVocal VoiceXML recorded voices:

- All recorded voices will have TTS fallbacks.
- Similar voices should have the same TTS fallback.

Overriding Recorded Voices

BeVocal provides a mechanism to allow special BeVocal partners, carriers, and enterprise customers to override interpreter voices for certain types or all types. This can be combined with BeVocal Services for recording voice talent. If you are interested in these services and capabilities, send email to [email protected].


9 Dynamic SSML

The BeVocal VoiceXML Dynamic SSML facility enables the loading of Speech Synthesis Markup Language (SSML) documents from URIs. This allows VoiceXML developers to generate SSML documents based on parameter values and thus generate complex prompts based on a variety of input criteria.

This chapter describes:

- Introduction
- Using Dynamic SSML
- Examples and Notes
- Errors
- SSML Document
- Extensions to the SSML spec

Note: The Dynamic SSML facility is an experimental extension to VoiceXML; its implementation and behavior are subject to change. The current implementation of BeVocal VoiceXML contains the feature before it has been standardized so that developers may provide feedback. If this capability becomes a standard part of a future version of VoiceXML, the BeVocal VoiceXML implementation will change as necessary to match the VoiceXML standard.

For the latest W3C working draft of the Speech Synthesis Markup Language spec, see the W3C web site.

Introduction

Purpose and Scope

This document describes the support in the BeVocal VoiceXML interpreter for dynamic SSML. The <audio> tag has been extended with attributes that reference the URIs of SSML documents. The SSML document can be generated based on parameters of the URI used to reference it.

Goals

The goals of this design are:

- To provide an extension to the <audio> element so that developers have a flexible mechanism for dynamically generating simple as well as complex, layered prompts.
- To provide support for interpreting and executing an SSML document (compliant with the SSML spec) from within a VoiceXML context.

Using Dynamic SSML

To have dynamic SSML generated and executed from within a VoiceXML document, the <audio> tag has been extended with two new attributes:

bevocal:ssml - The URI which generates the dynamic SSML.
bevocal:ssmlexpr - A JavaScript expression which resolves to the value of bevocal:ssml.

Only one of bevocal:ssml or bevocal:ssmlexpr should be used. The resource that resides at the URI should be an SSML document, compliant with the SSML spec.

The interpreter downloads the resource identified by the URI, parses it as an SSML document, and then places the contents of the SSML document's <speak> tag inline to replace the <audio> tag which references the SSML document.

If there is a problem fetching the resource from the URI, or if there is a parse error in the downloaded SSML document, an error.badfetch is thrown. The alternate text for <audio> is ignored in this case; it is as if the <audio> tag is replaced by the SSML. This differs from the normal <audio> behavior, in which a fetch failure causes the alternate text, if specified, to be played.

Note: If an <audio> element inside an SSML document contains either the bevocal:ssml or bevocal:ssmlexpr attribute, an error.badfetch is thrown at SSML document parse time.

Caching

The SSML document is cached according to the caching properties in effect for the <audio> element that caused the SSML download.

Examples and Notes

<audio> with bevocal:ssml attribute

The following VoiceXML source shows how to use the bevocal:ssml attribute with <audio>:

<?xml version="1.0"?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
<vxml version="2.0" xmlns="
  <form>
    <block>
      <prompt>
        <audio bevocal:ssml=" />
      </prompt>
    </block>
  </form>
</vxml>

The contents of foo.ssml are:

<?xml version="1.0" encoding="iso "?>
<!DOCTYPE speak PUBLIC "-//W3C//DTD SYNTHESIS 1.0//EN" "
<speak version="1.0" xmlns=" >
  <voice name="mark">
    How are you? Please say one of
    <audio src=" >add</audio> or
    <audio src="missing.wav">multiply</audio> or
    <audio src=" av">divide</audio>
  </voice>
</speak>

When the above VoiceXML code is executed, foo.ssml is downloaded and parsed. The resulting SSML elements are then interpreted and added to the prompt queue in place of the containing <audio> tag, as if the original <audio> element in the VoiceXML source were something like:

...
<prompt>
  <voice name="mark">
    How are you? Please say one of
    <audio src=" >add</audio> or
    <audio src="missing.wav">multiply</audio> or
    <audio src=" av">divide</audio>
  </voice>
</prompt>
...
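Because bevocal:ssmlexpr is a JavaScript expression that resolves to a URI, one way to pass parameters to a server-side SSML generator is to assemble the URI with a small helper. The following is a minimal sketch under stated assumptions: the generator address http://example.com/prompt.ssml and the parameter names user and balance are hypothetical and do not come from the BeVocal documentation.

```javascript
// Build a URI with a query string from a plain object of parameters.
// The base URI and parameter names used below are hypothetical.
function buildSsmlUri(base, params) {
  var pairs = [];
  for (var key in params) {
    if (Object.prototype.hasOwnProperty.call(params, key)) {
      // Encode both name and value so spaces and symbols survive the fetch.
      pairs.push(encodeURIComponent(key) + "=" + encodeURIComponent(String(params[key])));
    }
  }
  return pairs.length > 0 ? base + "?" + pairs.join("&") : base;
}

var uri = buildSsmlUri("http://example.com/prompt.ssml", { user: "Ann Lee", balance: 42.5 });
```

A helper of this kind could presumably be defined in a <script> element and called from the bevocal:ssmlexpr attribute, so that each request fetches an SSML document tailored to the current parameter values.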

Errors

Syntax or fetch errors downloading from the SSML URI

If there are errors downloading from the URI specified by the bevocal:ssml attribute of <audio>, an error.badfetch is thrown, which can be caught with a <catch> handler so that appropriate action can be taken:

<?xml version="1.0"?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
<vxml version="2.0" xmlns="
  <catch event="error.badfetch">
    SSML download failed
  </catch>
  <form>
    <block>
      <prompt>
        <audio bevocal:ssml=" />
      </prompt>
    </block>
  </form>
</vxml>

No recursive use of bevocal:ssml or bevocal:ssmlexpr

The <audio> elements in an SSML document cannot contain bevocal:ssml or bevocal:ssmlexpr attributes. The following SSML document would result in an error.badfetch at document parse time:

<?xml version="1.0" encoding="iso "?>
<!DOCTYPE speak PUBLIC "-//W3C//DTD SYNTHESIS 1.0//EN" "
<speak version="1.0" xmlns=" >
  <audio bevocal:ssml="error.ssml"/>
</speak>

SSML Document

The SSML document that results from the execution of the <audio> tag with the bevocal:ssml or bevocal:ssmlexpr attribute should be an SSML document compliant with the SSML spec. The root element should be <speak>. A sample SSML document looks like this:

<?xml version="1.0" encoding="iso "?>
<!DOCTYPE speak PUBLIC "-//W3C//DTD SYNTHESIS 1.0//EN" "
<speak version="1.0" xmlns=" >
  <voice name="mark">
    How are you?
    <prosody rate="slow"> I am fine </prosody>
  </voice>
</speak>

Developers can refer to the SSML DTD via the PUBLIC id "-//W3C//DTD SYNTHESIS 1.0//EN", as shown in the above SSML document.

The location of the SSML document serves as the base URI for resolving any relative URI references inside the SSML document. The xml:base attribute on the <speak> element can also modify the base URI, as in VoiceXML 2.0.

Extensions to the SSML spec

BeVocal VoiceXML adds extended attributes to two SSML elements, in addition to what is specified in the SSML spec. The two elements and their new attributes are:

VoiceXML 2.0 caching attributes on <audio>
  The <audio> elements inside the downloaded SSML document can specify all the VoiceXML 2.0 caching attributes, such as fetchhint, maxage, and maxstale, to control how the interpreter caches the audio resource. Note that specifying VoiceXML 1.0 caching attributes such as caching results in a parse error, because only VoiceXML 2.0 style caching attributes are supported.

bevocal:mode on <say-as>
  The <say-as> element can specify the BeVocal VoiceXML extended attribute bevocal:mode to indicate whether the <say-as> should render its output in a TTS voice (the default) or in a recorded voice.
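On the server side, the resource behind the SSML URI only has to return a well-formed SSML document whose root element is <speak>. A minimal sketch of such a generator follows. The function names, parameters, and sample text are assumptions made for illustration; the namespace shown is the W3C SSML 1.0 namespace, and the xml:lang value is an arbitrary choice.

```javascript
// Escape text so that user-supplied values cannot break the XML.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render a small SSML document with one <voice> element.
// voiceName and text are hypothetical inputs, e.g. taken from URI parameters.
function renderSsml(voiceName, text) {
  return [
    '<?xml version="1.0"?>',
    '<speak version="1.0" xml:lang="en-US" xmlns="http://www.w3.org/2001/10/synthesis">',
    '  <voice name="' + escapeXml(voiceName) + '">',
    '    ' + escapeXml(text),
    '  </voice>',
    '</speak>'
  ].join("\n");
}

var doc = renderSsml("mark", "Fish & chips cost < 5 dollars");
```

Escaping matters here because a parse error in the generated document surfaces in the calling VoiceXML application as an error.badfetch.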


10 SOAP Client Facility

This chapter describes the mechanism for calling SOAP services from the BeVocal VoiceXML interpreter. It describes the binding between the JavaScript interpreter in VoiceXML and the SOAP services. It also documents the method by which the interpreter locates SOAP services.

The interpreter's SOAP API takes advantage of the dynamic nature of JavaScript. There is no need to create separate proxy classes for each SOAP service you wish to call, because the interpreter can perform the conversion between JavaScript and SOAP types and methods on the fly. This makes SOAP much easier to deal with than in most other languages.

This chapter has the following sections:

- Locating and Identifying SOAP Services
- Calling SOAP Methods

Note: The SOAP Client facility currently supports SOAP services in which messages are RPC-oriented. The facility does not currently support SOAP services in which messages are document-oriented; this will be supported in a future release.

Note: The SOAP Client facility is an experimental extension to VoiceXML; its implementation and behavior are subject to change. The current BeVocal VoiceXML implementation contains the feature before it has been standardized so that developers may provide feedback. If this capability becomes a standard part of a future version of VoiceXML, the BeVocal VoiceXML implementation will change as necessary to match the VoiceXML standard.

Locating and Identifying SOAP Services

There is a small API for creating JavaScript objects that act as proxies for SOAP services. It is exposed through the JavaScript object bevocal.soap, which resides in the session scope and has three methods. For details on these methods, see Chapter 14, JavaScript Functions and Objects.

The first two methods are most commonly used with SOAP services accessible from the general internet or the intranet where the BeVocal platform is installed. The last method is used only for BeVocal SOAP services that are exposed via a service registry and service ID. Briefly, the methods are as follows:

bevocal.soap.servicefromwsdl
  Given the URL for a Web Services Description Language (WSDL) file, create an object to act as a proxy for one of the services described in the file. (This is the preferred way to create a service proxy object, because it provides better type checking.)

bevocal.soap.servicefromendpoint
  Given the name and endpoint URL of a SOAP service, create an object to act as a proxy for the service. (This method should be used only when the WSDL for a service is not available.)

bevocal.soap.locateservice
  Given the service identifier and possibly a version, use the BeVocal SOAP service registry to locate either the standard BeVocal SOAP service or a particular version of that service.

  Note: Currently, no BeVocal SOAP services are exposed in the BeVocal SOAP service registry.

There are a number of errors that can happen while locating SOAP services. If an error occurs while creating a SOAP proxy object, an exception of type bevocal.soap.soapexception is thrown.

For example, assume you want to use a temperature service whose WSDL looks like this:

<?xml version="1.0"?>
<definitions name="temperatureservice"
    targetnamespace="
    xmlns:tns="
    xmlns:xsd="
    xmlns:soap="
    xmlns="
  <message name="gettemprequest">
    <part name="zipcode" type="xsd:string"/>
  </message>
  <message name="gettempresponse">
    <part name="return" type="xsd:float"/>
  </message>
  <porttype name="temperatureporttype">
    <operation name="gettemp">
      <input message="tns:gettemprequest"/>
      <output message="tns:gettempresponse"/>
    </operation>
  </porttype>
  <binding name="temperaturebinding" type="tns:temperatureporttype">
    <soap:binding style="rpc" transport="
    <operation name="gettemp">
      <soap:operation soapaction=""/>
      <input>
        <soap:body use="encoded" namespace="urn:xmethods-temperature" encodingstyle="
      </input>
      <output>
        <soap:body use="encoded" namespace="urn:xmethods-temperature" encodingstyle="
      </output>
    </operation>
  </binding>
  <service name="temperatureservice">
    <documentation>returns current temperature in a given U.S. zipcode </documentation>
    <port name="temperatureport" binding="tns:temperaturebinding">
      <soap:address location="
    </port>
  </service>
</definitions>

The WSDL definition provides the rest of the information you need to create a proxy object for this service in your VoiceXML application:

<script>
<![CDATA[
  var service = bevocal.soap.servicefromwsdl(
    // WSDL URL
    "
    // name attribute of port
    "TemperaturePort",
    // targetnamespace attribute of definitions
    "
    // name attribute of service
    "TemperatureService",
    // location attribute of soap:address child of port element
    "
    // );
]]>
</script>

Calling SOAP Methods

Once you have a JavaScript proxy object for a SOAP service, calling the service's methods is easy. For example, if you have a service that performs temperature conversions, the code might look like this:

<script>
  var converter = bevocal.soap.locateservice("temp_converter", 1.0);
  var freezing = converter.fahrenheittocelsius(32.0);
</script>

The proxy object automatically converts the JavaScript fahrenheittocelsius function into a call to the SOAP method with the same name. It converts the sole parameter, 32.0, to the SOAP encoding and passes it as a float. If the converter proxy had access to the WSDL for the service when it was created, it also knows the proper name for the argument and encodes that in the SOAP message as well.

Type conversion

When calling a SOAP method, there are currently some limitations regarding streaming out the arguments. A fixed mapping of Java classes to XML element types is used during serialization. Currently, a method's WSDL definition is not used when streaming out the method's arguments; the WSDL is used only when pre-flighting the method call. This means that the interpreter cannot always stream out complex types with the correct XML type. Most SOAP servers can deal with this, but a few cannot and will return errors.

The biggest problem is with arrays of complex types. In a SOAP array, an attribute of the Array element must contain the type of the contained object. For example:

<SOAP-ENC:Array SOAP-ENC:arrayType="xyz:Order[1]">
  <Order>
    <Product>Apple</Product>
    <Price>1.56</Price>
  </Order>
</SOAP-ENC:Array>

In the current implementation, the interpreter is unable to determine the type contained within the array, so it cannot decide what to use as the value of the SOAP-ENC:arrayType attribute. As a default, the interpreter uses xsd:anyType, which causes an error on some servers.

The type mappings currently performed are as follows:

JavaScript: String
SOAP: xsd:string
  This mapping automatically works in both directions. If a string is an argument to a SOAP call, the interpreter serializes it as an xsd:string, and if an xsd:string is in a response, the interpreter converts it into a String in the JavaScript result object.

JavaScript: Number
SOAP: xsd:int, xsd:negativeInteger, xsd:float
  Depending on their values, JavaScript Numbers or numeric literals in arguments to SOAP methods are converted into one of these three numeric types from the Schema specification. Any of these three types is converted to Number when received in a SOAP response.

JavaScript: Object with properties
SOAP: <multiref>
  When the interpreter sees a complex JavaScript object, that is, an object with properties, it serializes it in the SOAP message as a reference to another object. For example, an <item href="..."/> becomes a <multiref>. Each property of the JavaScript object is encoded as a separate element under the <multiref> element.
  When deserializing SOAP replies, the interpreter does this in reverse. A compound object in a SOAP message is converted to a JavaScript object. Each sub-element in the SOAP value is added as a property of the JavaScript object. The local portion of the SOAP element's name (without the namespace qualifier) is used as the JavaScript property name.

JavaScript: Array
SOAP: SOAP-ENC:Array
  JavaScript arrays are serialized as compound objects. The root element is of type SOAP-ENC:Array and SOAP-ENC:arrayType="xsd:anyType[...]". Each element in the compound object must be the same type. On input, a SOAP-ENC:Array is converted into a JavaScript Array.

Note: xsd is the XML Schema namespace. The interpreter uses the namespace specified in the WSDL. For services without WSDL information, it uses a default XML Schema namespace on output and accepts any of several XML Schema namespaces on input.
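The mapping table can be summarized as a pair of lookup functions. This sketch only restates the table and is not the interpreter's serializer. For numbers it returns all three candidate types, because the text says only that the choice depends on the value and does not give the rule. The function names and the returned labels are hypothetical.

```javascript
// Candidate SOAP types for JavaScript numbers, as listed in the table.
var NUMERIC_TYPES = ["xsd:int", "xsd:negativeInteger", "xsd:float"];

// Return the SOAP type(s) a JavaScript value may be serialized as.
function soapTypesFor(value) {
  if (typeof value === "string") return ["xsd:string"];
  if (typeof value === "number") return NUMERIC_TYPES.slice();
  if (Array.isArray(value)) return ["SOAP-ENC:Array"]; // check before plain objects
  if (value !== null && typeof value === "object") return ["multiref"];
  return []; // no mapping listed for other values
}

// Return the JavaScript type a SOAP value is converted to on input.
function jsTypeFor(soapType) {
  if (soapType === "xsd:string") return "String";
  if (NUMERIC_TYPES.indexOf(soapType) !== -1) return "Number";
  if (soapType === "SOAP-ENC:Array") return "Array";
  return "Object"; // compound values become objects with properties
}
```

Testing for arrays before plain objects is deliberate: in JavaScript an array is also an object, so the order of the checks decides which row of the table applies.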

SOAP Headers

Occasionally, you may need to add information to the header sent with your SOAP requests. For example, if you use any of the BeVocal platform services in your VoiceXML application, you must add security information to the header.

To add headers, each JavaScript proxy object supports the _addheader method. This method allows you to set an additional SOAP header for all SOAP requests. Once you use the _addheader method on a proxy object, all future requests sent using that object have the added header. If later in the same application you need to send a request to the same service without that header, you must create a different proxy object for that request.

SOAP Methods are JavaScript objects

In JavaScript, a method of an object is really just an object property that points to a function. This is true for the interpreter's SOAP proxy objects as well, allowing you to use them as "functors". For example:

<script>
  var converter = bevocal.soap.locateservice("temp_converter", 1.0);
  var conversion = converter.fahrenheittocelsius;
  ...
  // Someone else told me which conversion to use
  var temperature = conversion(123);
</script>

Error Handling For WSDL-Based Services

When attempting to call a function on a SOAP proxy which has WSDL information available, the interpreter can detect many errors before making the actual SOAP call. This leads to earlier and better error detection and the ability to distinguish different types of errors more easily.

Call to Missing Method

Remember that in this JavaScript binding, SOAP functions map to JavaScript objects. Consider this statement:

result = service.method(arg1, arg2);

It first retrieves the method property of the object service, then calls it as a function with the arguments arg1 and arg2, assigning the result to result.

When you refer to a nonexistent method of a service proxy that is backed by WSDL, such as:

var service = bevocal.soap.servicefromwsdl(wsdlurl,...);
var result = service.missingmethod(1, 2);

the service proxy returns the JavaScript constant undefined when asked for its missingmethod property, and prints a warning to the log. The JavaScript interpreter then tries to treat the undefined value as a function, immediately causing a runtime error. The log will look like this:

JavaScript warning on line 29 of 'soap.vxml: Cannot find operation: missingmethod - none defined
ERROR error.semantic: JavaScript error on line 29: missingmethod is not a function.

The implementation returns undefined rather than throwing an exception immediately. This is an advantage in certain cases, because it allows you to write code like the following:

// Get the highest 1.x version of "myservice"
var service = bevocal.soap.locateservice("myservice", "1");
if (service.method2 != undefined) {
  // Call the new and improved method
  service.method2(arg1, arg2);
} else {
  // Call the old method
  service.method1(arg1);
}

If the interpreter immediately threw a runtime error when you tried to reference a missing method, you could not make this check for an undefined method.

Missing or Extra Parameters

If client code attempts to call a SOAP method using the wrong number of parameters, the interpreter's SOAP client throws a bevocal.soap.soapexception exception with the cause property set to CALL_ERROR. The message property contains an error message explaining that there was an argument mismatch. For example:

try {
  myservice.callmethod("too", "many", "parameters");
} catch (error if error.type == "bevocal.soap.soapexception") {
  if (error.cause == bevocal.soap.soapexception.call_error) {
    // Probably passed the wrong number of arguments
  }
}

Invalid Parameter Types

The current implementation does not validate parameter types against a method's WSDL definition. It simply serializes each parameter using the XML type that seems to be the best match. If the serialized type does not match the type that was specified for that parameter in the WSDL, the server for the request will signal an error and return it as described next.

It is also possible that the type-conversion engine will report an error if it is unable to serialize a complex JavaScript type. This is described in Type conversion.

Server-side Errors

Errors reported by a server are returned in a SOAP message with a <Fault> element. When the interpreter detects a fault message, it throws a bevocal.soap.soapfault.

Error Handling for Non-WSDL-Based Services

When the interpreter does not have the WSDL available for a service proxy, it can't do much pre-flighting of calls. This means that the interpreter itself can report only the following types of errors:

- Type conversion error marshalling or unmarshalling an object (see Type conversion).
- Network communication error.

All other errors, including invalid method names, bad parameter lists, and so on, are reported by the SOAP server or service and thrown as a bevocal.soap.soapfault, as described above.
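The error cases in this chapter can be gathered into one helper that classifies whatever a SOAP call throws. The sketch below is written in standard JavaScript so that it can run outside the interpreter. It assumes, based only on the examples in this chapter, that thrown errors carry a type property and that a bevocal.soap.soapexception also carries a cause property. The type string tested for faults, the stand-in CALL_ERROR value, and the returned labels are assumptions for illustration.

```javascript
// Stand-in for bevocal.soap.soapexception.call_error (assumption: the real
// constant is only available inside the interpreter).
var CALL_ERROR = "CALL_ERROR";

// Map a thrown error to a coarse category.
function classifySoapError(error) {
  if (error && error.type === "bevocal.soap.soapexception") {
    // Raised by the client before or while making the call.
    return error.cause === CALL_ERROR ? "argument-mismatch" : "client-error";
  }
  if (error && error.type === "bevocal.soap.soapfault") {
    // Reported by the SOAP server in a <Fault> element.
    return "server-fault";
  }
  return "other";
}

// Run a SOAP call and report either its value or the error category.
function callSafely(fn) {
  try {
    return { ok: true, value: fn() };
  } catch (error) {
    return { ok: false, kind: classifySoapError(error) };
  }
}
```

A caller would wrap the proxy invocation in a function, for example callSafely(function () { return converter.fahrenheittocelsius(32.0); }), and branch on the ok and kind fields of the result.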

PART 3 VoiceXML Reference

This part contains reference descriptions of the components of VoiceXML:

- Chapter 11, Tags
- Chapter 12, Properties
- Chapter 13, Variables
- Chapter 14, JavaScript Functions and Objects

11 Tags

This chapter provides detailed information about each VoiceXML tag. See:

- Tag Summary for an overview of the tags, grouped by function
- Tag Index for an alphabetical list of all tags

Each tag description includes the following information:

Syntax      Summary of how the tag is used. Description of attributes or other details.
Usage       Table of parent and children tags. Parent tags can contain this tag and children tags can be used within this tag.
See Also    Links to related information.
Examples    Short examples you can run as simple, standalone applications.

In the cases where the BeVocal VoiceXML interpreter deviates from the VoiceXML 2.0 Specification, the difference is clearly marked in the following ways:

Not implemented           Functionality not currently available.
Extension                 Added functionality.
Experimental Extension    Added functionality that may be included in a later specification for VoiceXML. If the extension is standardized, the BeVocal VoiceXML implementation will change as necessary to match the VoiceXML standard.
Deprecated                Non-standard or superseded feature that was supported by an earlier version but has been replaced by a new feature.
VoiceXML 1.0 only         Tag is part of the VoiceXML 1.0 standard, but has been removed from VoiceXML 2.0.

Tags and attributes that were added to the specification in VoiceXML 2.0 are marked New in VoiceXML 2.0. VoiceXML Tag Summary lists any differences between the BeVocal VoiceXML implementation and the VoiceXML 2.0 standard.

Tag Summary

The following table classifies VoiceXML tags according to their purpose.

Purpose
    Tags

Defining the Application
    <vxml> <meta> <metadata>
Dialogs
    <form> <menu>
Input Items of a Form
    <field> <bevocal:enroll> Extension <bevocal:listen> Extension <record> <bevocal:register> Extension <subdialog> <transfer> <bevocal:verify> Extension
Control Items of a Form
    <initial> <block>
Menus
    <menu> <choice> <grammar> <enumerate>
Fields
    <field> <enumerate> <grammar> <option> <filled> <help> <noinput> <nomatch>
Subdialogs
    <subdialog> <param> <return>
Controlling Dialog Transitions
    <goto> <submit> <link> <choice>
Handling Events
    <catch> <error> <help> <noinput> <nomatch>

123 Tag Summary Purpose Controlling Synthesized Speech Specifying Grammars Speaker Verification (Extension) Controlling the Interpreter Containers for Executable Content Declaring and Setting Variables (executable content) Procedural Logic (executable content) Scripting logic (executable content) Tags <break> <voice> New in VoiceXML 2.0 <emphasis> New in VoiceXML 2.0 <prosody> New in VoiceXML 2.0 <phoneme> New in VoiceXML 2.0 <say-as> New in VoiceXML 2.0 <sub> New in VoiceXML 2.0 <mark> New in VoiceXML 2.0 <s> New in VoiceXML 2.0 <sentence> New in VoiceXML 2.0 <p> New in VoiceXML 2.0 <paragraph> New in VoiceXML 2.0 <grammar> <rule> New in VoiceXML 2.0 <ruleref> New in VoiceXML 2.0 <token> New in VoiceXML 2.0 <one-of> New in VoiceXML 2.0 <item> New in VoiceXML 2.0 <tag> New in VoiceXML 2.0 <example> New in VoiceXML 2.0 <lexicon> New in VoiceXML 2.0 <div> VoiceXML 1.0 only <dtmf> VoiceXML 1.0 only <emp> VoiceXML 1.0 only <pros> VoiceXML 1.0 only <sayas> VoiceXML 1.0 only <bevocal:register> Extension <bevocal:verify> Extension <property> <block> <filled> <if> <bevocal:foreach> Extension <catch> <error> <help> <noinput> <nomatch> <object> <var> <assign> <clear> <if> <else> <elseif> <bevocal:foreach> Extension <script> VOICEXML PROGRAMMER S GUIDE 111

Purpose
    Tags

Producing Audio Output (executable content)
    <audio> <prompt> <reprompt> <value> <enumerate>
Throwing Events (executable content)
    <throw> <rethrow> Extension
Terminating the Session (executable content)
    <disconnect> <exit>
Placing and Controlling Outbound Calls (executable content)
    <bevocal:dial> Extension <bevocal:hold> Extension <bevocal:connect> Extension <bevocal:whisper> Extension <bevocal:disconnect> Extension
Sending and Fetching Data (executable content)
    <submit> <send> VoiceXML 1.0 only, Extension <data> Extension
Debugging
    <log> New in VoiceXML 2.0

Tag Index

The following table lists the available tags, including tags for:

- VoiceXML elements
- Elements in the XML form of the W3C Speech Recognition Grammar Format, which are used to define grammars in that format
- Elements in the W3C Speech Synthesis Markup Language
- VoiceXML 1.0 only. Elements in the Java Speech Markup Language

Tag                   Description
<assign>              Assigns a value to a variable.
<audio>               Plays an audio clip to the user.
<bevocal:connect>     Extension. Reconnects the user with an outbound call that was placed on hold.
<bevocal:dial>        Extension. Initiates an outbound call, allowing the user to talk to a third party at another destination.
<bevocal:disconnect>  Extension. Disconnects an outbound call or the inbound call.
<bevocal:enroll>      Extension. Creates and modifies enrolled grammars.
<bevocal:foreach>     Extension. Iterates over the elements of an array.
<bevocal:hold>        Extension. Places an outbound call on hold.
<bevocal:listen>      Extension. Allows the application to suspend execution and listen to the user during an outbound call.
<bevocal:whisper>     Extension. Interrupts an outbound call, allowing the application to play audio output to the called third party while the user is on hold.

Tag                   Description
<block>               Contains (non-interactive) executable code.
<break>               Speech Synthesis Markup Language element that inserts a pause in audio output.
<catch>               Catches an event.
<choice>              Defines a menu item.
<clear>               Clears one or more form-item variables.
<data>                Experimental Extension. Fetches arbitrary XML data from an HTTP server, or submits values to a server.
<disconnect>          Disconnects a telephone session.
<div>                 VoiceXML 1.0 only. Java Speech Markup Language element that classifies a region of text as a particular type.
<dtmf>                VoiceXML 1.0 only. Specifies a touch-tone key grammar.
<else>                Marks the beginning of an else clause within an <if> element.
<elseif>              Marks the beginning of an else-if clause within an <if> element.
<emp>                 VoiceXML 1.0 only. Java Speech Markup Language element that changes the emphasis of speech output.
<emphasis>            New in VoiceXML 2.0. Speech Synthesis Markup Language element that changes the emphasis of speech output.
<enumerate>           Generates audio output that enumerates the options in a field or the choices in a menu.
<error>               Catches an error event.
<example>             New in VoiceXML 2.0. XML grammar element with an example phrase that matches the containing grammar rule.
<exit>                Exits a session.
<field>               Declares an input field in a form.
<filled>              Contains actions to be executed when fields are filled.
<form>                Presents information and collects data.
<goto>                Goes to another location in the same or different document.
<grammar>             Specifies a speech-recognition grammar.
<help>                Catches a help event.
<if>                  Executes actions conditionally.
<initial>             Declares initial logic upon entry into a (mixed-initiative) form.
<item>                New in VoiceXML 2.0. XML grammar input element that indicates optional or repeated user input.
<lexicon>             New in VoiceXML 2.0. XML grammar input element that indicates a source of pronunciation information.
<link>                Specifies a transition common to all dialogs in the link's scope.
<log>                 New in VoiceXML 2.0. Writes debugging information to a BeVocal Café call log, which you can view on the Café web site.
<mark>                New in VoiceXML 2.0. Speech Synthesis Markup Language element that places a marker into the output stream for asynchronous notification.

Tag                   Description
<menu>                Allows user to choose among alternative destinations.
<meta>                Defines a meta-data item as a name/value pair.
<metadata>            Currently this tag has no effect.
<noinput>             Catches a no-input event.
<nomatch>             Catches a no-match event.
<object>              Always throws an unsupported object exception.
<one-of>              New in VoiceXML 2.0. XML grammar input element that indicates alternative user inputs.
<option>              Specifies an option in a <field>.
<p>                   New in VoiceXML 2.0. Speech Synthesis Markup Language element that classifies a region of text as a paragraph.
<paragraph>           New in VoiceXML 2.0. Speech Synthesis Markup Language element that classifies a region of text as a paragraph.
<param>               Specifies a parameter in a <subdialog> element.
<phoneme>             New in VoiceXML 2.0. Speech Synthesis Markup Language element that provides a phonetic pronunciation for the contained text.
<prompt>              Queues TTS and audio output to the user.
<property>            Controls settings specific to the BeVocal VoiceXML implementation platform.
<pros>                VoiceXML 1.0 only. Java Speech Markup Language element that changes the prosody of speech output.
<prosody>             New in VoiceXML 2.0. Speech Synthesis Markup Language element that changes the prosody of speech output.
<record>              Records an audio sample.
<bevocal:register>    Extension. Registers a voice print that can be used to verify caller identity.
<reprompt>            Plays a field prompt when a field is re-visited after an event.
<rethrow>             Extension. Causes the event currently being handled to be rethrown.
<return>              Returns from a subdialog.
<rule>                New in VoiceXML 2.0. XML grammar element that defines a grammar rule.
<ruleref>             New in VoiceXML 2.0. XML grammar input element that references another rule.
<s>                   New in VoiceXML 2.0. Speech Synthesis Markup Language element that classifies a region of text as a sentence.
<say-as>              New in VoiceXML 2.0. Speech Synthesis Markup Language element that modifies how the enclosed word or phrase is spoken.
<sayas>               VoiceXML 1.0 only. Java Speech Markup Language element that modifies how a word or phrase is spoken.
<script>              Specifies a block of client-side scripting logic in JavaScript.
<send>                VoiceXML 1.0 only; Experimental Extension. Submits values to a web server without transitioning to a new VoiceXML document.

Tag  Description
<sentence>  New in VoiceXML 2.0. Speech Synthesis Markup Language element that classifies a region of text as a sentence.
<speak>  New in VoiceXML 2.0. The <speak> element is the root element of a standalone SSML document which contains all other SSML elements.
<sub>  New in VoiceXML 2.0. Attribute provides substitute text to be spoken instead of the contained text.
<subdialog>  Invokes another dialog as a subdialog of the current one.
<submit>  Submits values to a document server.
<tag>  New in VoiceXML 2.0. XML grammar element that specifies how to interpret the user input.
<throw>  Throws an event.
<token>  New in VoiceXML 2.0. XML grammar input element that specifies words to be spoken by the user.
<transfer>  Transfers the user's call to a third party at another destination.
<value>  Inserts the value of an expression into audio output.
<var>  Declares a variable.
<bevocal:verify>  Extension. Verify that the speaker's voice matches a stored voice print.
<voice>  New in VoiceXML 2.0. Speech Synthesis Markup Language element that requests a change in speaking voice.
<vxml>  Contains the VoiceXML code of a document.

Tag Descriptions

The remainder of this chapter contains tag descriptions in alphabetical order.

<assign>

Assigns a value to a variable.

Syntax

<assign
    name="string"
    expr="js_expression" />

Attribute  Description
name  Name of variable. This variable must have already been declared.
expr  JavaScript expression that evaluates to the value assigned to this variable.

Tips:
In JavaScript, + means both string concatenation and addition. By default, values are treated as strings. If you want to add two numbers represented by string variables a and b in an <assign>, use:
<assign name="x" expr="Number(a) + Number(b)"/>
Multiplication, division, and subtraction are not ambiguous in this way.
If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

Parents
<bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>
Children
None

Exception  Description
error.badfetch  If the name or expr attribute is missing.

See Also
VoiceXML 2.0 Specification: <assign>
JavaScript Quick Reference
Related tag: <var>

<assign> Examples

<?xml version="1.0"?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
<vxml version="2.0" xmlns="
  <form id="foo">
    <var name="a"/>
    <var name="b"/>
    <var name="result"/>
    <block>
      <assign name="a" expr="'Pine'"/>
      <assign name="b" expr="'Apple'"/>
      <assign name="result" expr="a + b"/>
    </block>
    <block>
      <prompt>
        This is the test for the assign tag. If you put
        <value expr="a"/> and <value expr="b"/> together,
        it would make <value expr="result"/>
      </prompt>
    </block>
  </form>
</vxml>
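The two <assign> tips (the dual meaning of + and the need to escape <, >, and & inside an attribute) can be checked in plain JavaScript outside the interpreter. The following is a minimal sketch, not BeVocal code: the helper escapeForAttribute is a hypothetical name introduced here for illustration, and it covers only the three characters named in the tip.

```javascript
// Values held as strings are concatenated by "+", not added.
var a = "2";
var b = "3";
var concatenated = a + b;            // string concatenation
var added = Number(a) + Number(b);   // numeric addition after explicit conversion
var multiplied = a * b;              // "*" is unambiguous: operands become numbers

// Hypothetical helper (an assumption, not part of any VoiceXML platform):
// escape a JavaScript expression before pasting it into an expr="..." attribute.
// The ampersand is replaced first so the other escapes are not double-escaped.
function escapeForAttribute(expr) {
  return expr
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

var escaped = escapeForAttribute("a < b && b > 0");
```

The escaped string is what would be written in the document, for example as the value of a cond or expr attribute; the interpreter sees the original expression after XML parsing.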

<audio>

Plays an audio clip to the user.

Syntax

<audio
    src="uri"
    expr="js_expression"
    fetchhint="prefetch" | "safe"
    fetchtimeout="time_interval"
    maxage="time_interval"
    maxstale="time_interval"
    bevocal:ssml="uri"
    bevocal:ssmlexpr="js_expression" >
  Optional Content
</audio>

The audio clip is played in its entirety unless interrupted. The Audio Library contains stored audio files with commonly used spoken prompts and other sounds that you can use in your applications.

If the bargein property is true, a user utterance can interrupt playing of the audio clip. If the bargeintype property is recognition, only an utterance that matches an active grammar can interrupt the audio clip. In the latter case, the bevocal.hotwordmin and bevocal.hotwordmax properties specify the minimum and maximum time duration, respectively, of the interrupting utterance.

Attribute  Description
src  The URI of the audio file. Optional (as alternative to expr, bevocal:ssml, and bevocal:ssmlexpr; you must specify one of these 4 attributes). If not specified or invalid (that is, the interpreter was unable to perform the fetch from the specified URI), any content of the <audio> element will be played instead. The content can include text or valid child tags.
expr  New in VoiceXML 2.0. JavaScript expression that evaluates to either a string or an array of strings, or can be the recorded audio from the input variable of a <record> item. If it evaluates to a string, the string is interpreted as a URI and the audio file at the location is fetched and played. Optional (as alternative to src, bevocal:ssml, and bevocal:ssmlexpr; you must specify one of these 4 attributes). If the expression evaluates to JavaScript undefined, then the element including its alternate content is ignored. For compatibility with previous releases the content is also ignored if the expression evaluates to null. Extension. If it is an array, each element is treated as an audio file URI, each of which is fetched and played, in turn.
fetchhint  Specifies whether the interpreter can attempt to optimize dialog interpretation by prefetching the resource. See Prefetching Resources on page 40. Optional. Note: The interpreter can prefetch an audio file specified by the src attribute, but not by the expr attribute.

Attribute  Description
fetchtimeout  Specifies the interval to wait for the resource to be returned before throwing an error.badfetch event. See Handling Fetching Delays on page 42. Optional.
maxage  New in VoiceXML 2.0. Specifies the maximum acceptable age, in seconds, of the cached resource. See Maximum Age on page 44. Optional.
maxstale  New in VoiceXML 2.0. Specifies the maximum acceptable time, in seconds, during which an expired cached resource can still be used. See Maximum Stale Time on page 44. Optional.
bevocal:ssml  Extension. A URI which refers to an SSML document. This SSML document should comply with the W3C SSML specification but may have certain extensions. See Chapter 9, Dynamic SSML for details. Optional (as alternative to src, expr, and bevocal:ssmlexpr; you must specify one of these 4 attributes).
bevocal:ssmlexpr  Extension. A JavaScript expression which resolves to the URI expected by the bevocal:ssml attribute. Optional (as alternative to src, expr, and bevocal:ssml; you must specify one of these 4 attributes).

(VoiceXML 1.0 only) The following attributes can be used in applications in which the version attribute of the <vxml> tag is set to 1.0.

Attribute  Description
caching  VoiceXML 1.0 only. Specifies the caching policy for the resource being fetched. See Caching on page 45. Optional. Used in place of the VoiceXML 2.0 attributes maxage and maxstale.

For playing prompts, the BeVocal interpreter supports popular formats including Wave (.wav), Sun audio (.au) and MP3. Because we use JMF technology, you can refer to the following reference for a complete list of audio formats supported:

Note: If the specified audio file is an unsupported type, any alternative audio content of the <audio> element (text, prompts, and so on) is played instead.

Tips:
In production applications, you should consider using the Wave format for your audio files. The reasons for this recommendation are discussed in the Frequently Asked Questions, specifically in "How can I improve the sound quality in my audio files?" and "Echo cancellation is not working, and my prompts barge in on themselves!".
You can use <audio> within a <prompt>. If you do, it will inherit the attributes of the <prompt> element, such as bargein.
If you name your audio files consistently, you can use the expr attribute to simplify the way you construct audio file names in your VoiceXML. For example:
<prompt> <audio expr="'resources/prompts/hello' + sign +'.wav'"/> </prompt>
<prompt> <audio expr="'resources/prompts/' + sign +'.wav'"/> </prompt>
If you use the expr attribute in place of the src attribute, the interpreter cannot prefetch the audio file, which may cause a minor performance degradation. Once a given file is in the interpreter's cache, however, this difference is typically not noticeable.

Using JavaScript functions makes it easy to update the location of your audio files. For an expanded example, see Factorial on the VoiceXML Samples page of the BeVocal Café; as a simple example:
function female1(a) { return("audio/female1/en_us/" + a); }
function common(b) { return(female1("common/" + b + ".wav")); }
function number(b) { return(female1("number/" + b + ".wav")); }
When calling JavaScript functions, use apostrophes in place of double quotes:
<audio expr="common('bevocal_chimes')"/>
If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

Parents
<audio> <bevocal:foreach> <bevocal:listen> <bevocal:register> <bevocal:verify> <bevocal:whisper> <block> <catch> <choice> <emphasis> <enumerate> <error> <field> <filled> <help> <if> <initial> <menu> <noinput> <nomatch> <prompt> <prosody> <record> <s> <sentence> <subdialog> <transfer> <voice>
Children
<audio> <break> <emphasis> <enumerate> <mark> <p> <paragraph> <phoneme> <prosody> <s> <say-as> <sentence> <value> <voice>

See Also
VoiceXML 2.0 Specification: <audio>

Examples

Example 1 using src:

<?xml version="1.0"?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "

<vxml version="2.0" xmlns="
  <form id="foo">
    <block>
      <audio>
        Welcome to BeVocal Cafe, the number One place to build
        and deploy your voice applications.
      </audio>
      <audio maxage="0" src="bevocal_cafe.wav"/> BeVocal Cafe.
    </block>
  </form>
</vxml>

Example 2 using expr:

<?xml version="1.0"?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
<vxml version="2.0" xmlns="
  <script>
    <![CDATA[
      var base = "
      function one() {
        return base + "6000-e.wav";
      }
      function many() {
        var result = new Array(4);
        result[0] = base + "6000-b.wav";
        result[1] = base + "300.wav";
        result[2] = base + "37_and.wav";
        result[3] = base + "1-32.wav";
        return result;
      }
    ]]>
  </script>
  <form>
    <block>
      <prompt>
        Playing result of one <audio expr="one()">one</audio>
        <break/>
        Playing result of many <audio expr="many()">many</audio>
      </prompt>
    </block>
  </form>
</vxml>
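The rules for the value of the expr attribute (a string is one URI, an array is a sequence of URIs, undefined or null means the element is ignored) can be summarized in a few lines of JavaScript. This is only an illustrative sketch of those documented rules, not the interpreter's implementation: the function audioUris and the base path are assumptions made for the example, and recorded audio from a <record> item is not modeled.

```javascript
// Hypothetical helper (not a BeVocal API): list the audio URIs that an
// <audio expr="..."> value would cause to be fetched and played, in order.
function audioUris(exprValue) {
  // undefined (and, for compatibility, null): the element is ignored.
  if (exprValue === undefined || exprValue === null) {
    return [];
  }
  // Array: each element is treated as an audio file URI, played in turn.
  if (Array.isArray(exprValue)) {
    return exprValue.slice();
  }
  // String: interpreted as a single URI.
  return [String(exprValue)];
}

// Assumed base path, for illustration only.
var base = "audio/female1/en_us/number/";
var single = audioUris(base + "300.wav");
var several = audioUris([base + "300.wav", base + "37_and.wav"]);
var ignored = audioUris(undefined);
```

Keeping the path construction in one function, as in Example 2, means a change of audio location touches only the base value.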

<bevocal:connect>

Extension. Reconnects the user with an outbound call that was placed on hold.

Syntax

<bevocal:connect
    call="js_expression" />

To use this tag, the containing <vxml> tag must declare the XML namespace bevocal by including the following attribute:
xmlns:bevocal="

A <bevocal:connect> tag reconnects an outbound call that was put on hold by a <bevocal:hold> tag or by the onhold attribute of the <bevocal:dial> tag that initiated the call. This tag does nothing if the specified call is not on hold.

After the call is reconnected, the user and called third party talk to each other. The application continues to listen to the user; if a user utterance matches an active grammar, the application responds to the recognized utterance as usual. The application ignores all speech and DTMF signals from the called third party.

Note: This and other call-control tags constitute a BeVocal VoiceXML extension. Committees are currently working to standardize call-control features for VoiceXML, and their current approach is different from the BeVocal VoiceXML implementation. Because the approval of any call-control standards will be quite some time in coming, however, BeVocal VoiceXML contains this extension to allow developers to start taking advantage of call-control features. BeVocal will continue to monitor the development within these committees. If the features in the BeVocal VoiceXML extension become a part of VoiceXML or a separate call-control standard, BeVocal VoiceXML will implement that standard. We will then deprecate the current extension and provide developers with information on how to convert their applications to the new standard.

Attribute  Description
call  JavaScript expression whose value is a JavaScript object that was initialized by <bevocal:dial>; specifies the call to be reconnected.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

Parents
<bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>
Children
None

The following events may be thrown by the execution of a <bevocal:connect> tag:

Event  Description
connection.far_end.disconnect  The specified outbound call has already been disconnected.
error.semantic  The call attribute is not a valid JavaScript expression or the expression does not evaluate to an object representing an outbound call.

See Also
Chapter 6, Controlling Outbound Calls
Related tags: <bevocal:dial>, <bevocal:disconnect>, <bevocal:hold>

<bevocal:dial>

Extension. Initiates an outbound call, allowing the user to talk to a third party at another destination.

Syntax

<bevocal:dial
    name="string"
    dest="uri"
    destexpr="js_expression"
    silent="true" | "false"
    ani="digit_string"
    aniexpr="js_expression"
    connecttimeout="time_interval"
    maxtime="time_interval"
    bevocal:maxtimeexpr="js_expression"
    bevocal:type="blind" | "bridge" | "supervised"
    type="blind" | "bridge" | "consultation"
    onhold="true" | "false"
    transferaudio="uri" />

To use this tag, the containing <vxml> tag must declare the XML namespace bevocal by including the following attribute:
xmlns:bevocal="

The <bevocal:dial> tag may initiate a call in one of three ways:

Transfer Method  Description
Bridge transfer  The current session with the interpreter resumes after the call with the third party completes.
Blind transfer  The current session terminates as soon as it starts the transfer, regardless of the success of the transfer.
Supervised or consultation transfer  The current session terminates as soon as the transfer successfully connects the outbound call. If the transfer is unsuccessful, control returns to the application.

Note: To use blind or supervised transfers, contact BeVocal Customer Support.

Only one outbound call can be in progress at a given time. The call placed by one <transfer> or <bevocal:dial> tag must be terminated before another <transfer> or <bevocal:dial> tag can be executed.

Café customers can use a <transfer> tag to place local and long-distance calls only; international calls are not allowed. (Hosting customers are allowed to make international calls.)

During execution of the <transfer> element, the user and the called third party talk to each other. The application is quiet. No universal grammars or grammars in higher scopes are active. During a bridge transfer, if the <transfer> element includes any child grammars, the application listens to the user.
If a user utterance matches a child grammar, the transfer is terminated and the input variable is set to the status near_end_disconnect. The bevocal.hotwordmin and bevocal.hotwordmax properties specify the minimum and maximum time duration, respectively, for an utterance that can terminate the call.

Note: This and other call-control tags constitute a BeVocal VoiceXML extension. Committees are currently working to standardize call-control features for VoiceXML, and their current approach is different from the BeVocal VoiceXML implementation. Because the approval of any call-control standards will be quite some time in coming, however, BeVocal VoiceXML contains this extension to allow developers to start taking advantage of call-control features. BeVocal will continue to monitor the development within these committees. If the features in the BeVocal VoiceXML extension become a part of VoiceXML or a separate call-control standard, BeVocal VoiceXML will implement that standard. We will then deprecate the current extension and provide developers with information on how to convert their applications to the new standard.

Attribute  Description
name  Name of a variable to be set to a JavaScript object referring to the call. Optional. If no variable exists with the specified name, a new variable is defined in the current scope; for example, in the anonymous scope of the containing <block>.
dest  URI of the destination (for example, phone, IP telephony address). Optional (as alternative to destexpr). You can specify the URI using any of the following formats:
  phone://
  phone://
  tel:
  tel: ;postd=1234
  This format allows you to specify an extension as post-dial digits (postd). After the call is answered, the specified digits (1234 in this example) are sent to the called third party as DTMF.
  sip: @fwd.pulver.com
  SIP URIs are specified as: sip:<destination number>@<domain value>:<port> where the default port is
  SIP URIs are valid only for VoIP calls.
  Note: A leading 1 on the phone number is optional and will be ignored.
  Note: You must specify one of dest or destexpr, but not both.
destexpr  JavaScript expression that evaluates to the URI of the destination. Optional (as alternative to dest).
  Note: You must specify one of dest or destexpr, but not both.
silent  Specifies whether the call-progress sounds are muted. Optional (default is false).
  true  The call-progress sounds and all other sounds from the outbound call are muted until the post-dial digits have finished playing.
  false  The user hears the call-progress sounds and a small snippet of the call after it is connected and before the post-dial digits are played.

Attribute  Description
ani  A sequence of digits to be passed as the ANI or Caller ID for the outbound call. Optional (default is the value of the session.telephone.ani variable). The ANI may contain up to 32 digits in either of the following forms:
  A formatted phone number, with or without area code, for example, (201)
  A string of ASCII digits, for example,
  ANI spoofing is disabled by default on Café for outbound calls made through call-control tags; a warning is written to the VXML log. It is enabled by default on hosting servers. Contact BeVocal to enable ANI spoofing on Café.
  Note: You may specify either this attribute or aniexpr, but not both.
aniexpr  JavaScript expression that evaluates to the sequence of digits to be passed as the ANI or Caller ID for the outbound call. Optional (default is the session.telephone.ani variable).
  Note: You may specify either this attribute or ani, but not both.
connecttimeout  Time to wait for the outbound call to connect before throwing a connection.far_end.noanswer event. Optional (default is 20 seconds). Express the time interval as an unsigned number followed by either s for time in seconds or ms for time in milliseconds (the default).
maxtime  How long the call is allowed to be in progress. Optional. The default for Café customers is 60 seconds. This is an absolute maximum: if the time in this attribute is longer than 60 seconds, the interpreter uses 60 seconds as the limit. The default for hosting customers is 0, signifying no limit. Express the time interval as an unsigned number followed by either s for time in seconds or ms for time in milliseconds (the default). If the call exceeds the maximum duration, the outbound call is disconnected and a connection.far_end.disconnect.timeout event is thrown.
bevocal:maxtimeexpr  Extension. A JavaScript expression which evaluates to the maxtime value. Optional. Again, the default for Café customers is 60 seconds. This is an absolute maximum: if the time in this attribute is longer than 60 seconds, the interpreter uses 60 seconds as the limit. The default for hosting customers is 0, signifying no limit.

Attribute  Description
bevocal:type  Extension. Determines how much control the platform retains over a transferred call. Optional. The default is bridge.
  If the value is blind, as soon as the transfer starts, the current VoiceXML session ends and relinquishes control to the outbound call. Even if the transfer does not connect the call, the session is over.
  If the value is bridge, then the current VoiceXML session remains active. At the end of the outbound call, the session returns control to the application to resume processing.
  If the value is supervised, an intermediate path occurs. The current VoiceXML session monitors the progress of the outbound call until it is connected. If the call cannot be connected for some reason such as no answer or line busy, the session remains active and returns control to the application. If the call is connected, then the session ends, just as for a blind transfer.
  You can specify either bevocal:type or type, but not both. If you specify both, a parse error is thrown.
  Note: To use blind or supervised transfers, please contact BeVocal Customer Support.
type  Determines how much control the platform retains over a transferred call. Optional. The default is bridge.
  If the value is blind, as soon as the transfer starts, the current VoiceXML session ends and relinquishes control to the outbound call. Even if the transfer does not connect the call, the session is over.
  If the value is bridge, then the current VoiceXML session remains active. At the end of the outbound call, the session returns control to the application to resume processing.
  If the value is consultation, an intermediate path occurs. The current VoiceXML session monitors the progress of the outbound call until it is connected. If the call cannot be connected for some reason such as no answer or line busy, the session remains active and returns control to the application. If the call is connected, then the session ends, just as for a blind transfer.
  You can specify either bevocal:type or type, but not both. If you specify both, a parse error is thrown.
  Note: To use blind or consultation transfers, please contact BeVocal Customer Support.
onhold  Specifies whether the outbound call is put on hold immediately after it is answered and after any post-dial digits have been sent. Optional (default is false).
  true  Put the called third party on hold until reconnected by a <bevocal:connect> tag.
  false  Leave the call active with the user connected to the called third party.
  Setting this attribute to true is equivalent to executing a <bevocal:hold> tag immediately after executing the <bevocal:dial> tag.
transferaudio  Audio (specified by the URI) to play while the call connection attempt is in progress.
  Note: This attribute is valid only for VoIP calls.
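The time-interval notation and the Café limit on maxtime described in the attribute table above can be expressed as a small calculation. The following JavaScript is a sketch under stated assumptions, not platform code: the function names parseIntervalMs and effectiveMaxtimeMs and the isCafe flag are hypothetical, and the behavior of an explicit 0 on Café is not described in this guide, so it is not specially handled here.

```javascript
// Hypothetical helper: convert a VoiceXML time interval to milliseconds.
// An unsigned number followed by "s" (seconds) or "ms" (milliseconds);
// with no suffix, milliseconds are assumed (the documented default).
function parseIntervalMs(text) {
  var match = /^(\d+(?:\.\d+)?)(ms|s)?$/.exec(String(text).trim());
  if (!match) {
    throw new Error("Invalid time interval: " + text);
  }
  var value = Number(match[1]);
  return match[2] === "s" ? value * 1000 : value;
}

var CAFE_LIMIT_MS = 60000; // 60 seconds, the documented Café maximum

// Hypothetical helper: the limit that would apply to an outbound call.
// A result of 0 means "no limit" (the documented hosting default).
function effectiveMaxtimeMs(maxtime, isCafe) {
  if (isCafe) {
    if (maxtime === undefined) {
      return CAFE_LIMIT_MS;
    }
    // Longer values are cut back to the Café maximum.
    return Math.min(parseIntervalMs(maxtime), CAFE_LIMIT_MS);
  }
  return maxtime === undefined ? 0 : parseIntervalMs(maxtime);
}

var cafeLimit = effectiveMaxtimeMs("120s", true);
var hostingLimit = effectiveMaxtimeMs("120s", false);
```

The same parsing applies to connecttimeout, whose documented default is 20 seconds.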

The following events may be thrown by the execution of a <bevocal:dial> tag:

Event  Description
connection.far_end.busy  The call was refused by the endpoint (the number was busy).
connection.far_end.noanswer  There was no answer within the time allowed for making the connection.
connection.far_end.network_busy  The call was refused by an intermediate network.
connection.far_end.network_disconnect  The call completed and was terminated by the network.
error.connection.baddestination  The destination URI specified by dest or destexpr is not valid. New in VoiceXML 2.0.
error.telephone.baddestination  The destination URI specified by dest or destexpr is not valid. VoiceXML 1.0 only.
error.connection.noauthorization  A Café customer attempted to make an international call. (The outbound call is not placed.) New in VoiceXML 2.0.
error.telephone.noauthorization  A Café customer attempted to make an international call. (The outbound call is not placed.) VoiceXML 1.0 only.
error.connection.noresource  Another outbound call is already in progress. (A new outbound call is not placed.) New in VoiceXML 2.0.
error.telephone.noresource  Another outbound call is already in progress. (A new outbound call is not placed.) VoiceXML 1.0 only.
telephone.disconnect.hangup  The user hung up while the connection was being made; for example, before the outbound call was answered. VoiceXML 1.0 only.
connection.disconnect.hangup  The user hung up while the connection was being made; for example, before the outbound call was answered. New in VoiceXML 2.0.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

Parents
<bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>
Children
None

See Also
Chapter 6, Controlling Outbound Calls
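Several of the events listed for <bevocal:dial> come in pairs, one name for VoiceXML 2.0 documents and one for VoiceXML 1.0 documents. The lookup below restates those pairs as data; it is a hypothetical convenience for application code that generates handlers for both versions, and the keys (badDestination and so on) and the function eventName are names invented for this sketch.

```javascript
// Event names taken from the event table; the short keys are assumptions.
var DIAL_EVENTS = {
  badDestination: {
    "2.0": "error.connection.baddestination",
    "1.0": "error.telephone.baddestination"
  },
  noAuthorization: {
    "2.0": "error.connection.noauthorization",
    "1.0": "error.telephone.noauthorization"
  },
  noResource: {
    "2.0": "error.connection.noresource",
    "1.0": "error.telephone.noresource"
  },
  hangup: {
    "2.0": "connection.disconnect.hangup",
    "1.0": "telephone.disconnect.hangup"
  }
};

// Return the event name to catch for a given condition and document version.
function eventName(kind, version) {
  var names = DIAL_EVENTS[kind];
  if (!names || !names[version]) {
    throw new Error("Unknown event kind or version: " + kind + ", " + version);
  }
  return names[version];
}

var hangupFor20 = eventName("hangup", "2.0");
var hangupFor10 = eventName("hangup", "1.0");
```

The connection.far_end.* events have a single name in the table, so they are not included in the lookup.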

Related tags: <bevocal:connect>, <bevocal:disconnect>, <bevocal:hold>, <bevocal:listen>, <bevocal:whisper>, <transfer>

Examples

<?xml version="1.0"?>
<!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
<vxml version="2.0" xmlns="
  <form>
    <var name="mycall"/>

    <block>
      Placing an outbound call...
      <bevocal:dial name="mycall" dest="tel: "/>
      <bevocal:whisper call="mycall">
        Please hold for a call from the President
      </bevocal:whisper>
    </block>

    <bevocal:listen name="hotword">
      <grammar type="application/x-nuance-gsl">
        <![CDATA[
          [
            (good bye) { return ("exit") }
            (one) { return ("1") }
            (two) { return ("2") }
            (three) { return ("3") }
          ]
        ]]>
      </grammar>
      <filled>
        <if cond="hotword == 'exit'">
          Good bye
          <bevocal:disconnect call="mycall"/>
        <else/>
          <bevocal:whisper call="mycall">
            <audio expr="'dtmf_' + hotword + '_100.wav'">
              <value expr="hotword"/></audio>
          </bevocal:whisper>
          <clear namelist="hotword"/>
        </if>
      </filled>
      <catch event="connection.far_end.disconnect.timeout">
        I'm sorry. The outbound call has exceeded the maximum allowed time.
        <exit/>
      </catch>
    </bevocal:listen>

    <catch event="connection.far_end.disconnect">
    </catch>
    <catch event="connection.disconnect.hangup">
      <bevocal:disconnect call="mycall"/>
    </catch>
  </form>
</vxml>
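When the destination is computed at run time and passed through destexpr, the tel: and sip: formats described for the dest attribute can be assembled by small helper functions. The sketch below is an assumption-laden illustration, not BeVocal code: telUri and sipUri are hypothetical names, the phone number and domain are made-up placeholders, and the port is appended only when the caller supplies one.

```javascript
// Hypothetical helper: build a tel: URI, optionally with post-dial digits
// (postd) that are sent as DTMF after the call is answered.
function telUri(number, postDialDigits) {
  var uri = "tel:" + number;
  if (postDialDigits) {
    uri += ";postd=" + postDialDigits;
  }
  return uri;
}

// Hypothetical helper: build a SIP URI of the documented form
// sip:<destination number>@<domain value>:<port>. SIP is valid only for VoIP calls.
function sipUri(number, domain, port) {
  var uri = "sip:" + number + "@" + domain;
  if (port !== undefined) {
    uri += ":" + port;
  }
  return uri;
}

// Placeholder values, for illustration only.
var withExtension = telUri("4085550100", "1234");
var voip = sipUri("4085550100", "example.com");
```

In a document, such a function would be declared in a <script> block and called from the attribute, for example destexpr="telUri(number, extension)", where number and extension are application variables.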

<bevocal:disconnect>

Extension. Disconnects an outbound call or the inbound call.

Syntax

<bevocal:disconnect
    call="js_expression" />

To use this tag, the containing <vxml> tag must declare the XML namespace bevocal by including the following attribute:
xmlns:bevocal="

This extended version of <disconnect> can disconnect either an outbound call initiated by a <bevocal:dial> tag, or the user's inbound call that started the current session:

If a call attribute is included, it specifies the outbound call to be disconnected. The inbound call remains connected, allowing the user and the application to continue the session.

If no call attribute is included, this tag is equivalent to <disconnect>; it performs the following steps:
  Disconnects the inbound call.
  If an outbound call is active or on hold, disconnects that call also.
  Throws a hang-up event (connection.disconnect.hangup in VoiceXML 2.0; telephone.disconnect.hangup in VoiceXML 1.0).
If an event handler catches the event, it can perform one last <submit> to notify the server that the call has ended. Because the call is no longer connected, any VoiceXML document returned from the server is ignored. The interpreter exits following execution of any event handler (or immediately if no handler catches the hang-up event).

If the call is no longer active, <bevocal:disconnect> does nothing.

Note: This and other call-control tags constitute a BeVocal VoiceXML extension. Committees are currently working to standardize call-control features for VoiceXML, and their current approach is different from the BeVocal VoiceXML implementation. Because the approval of any call-control standards will be quite some time in coming, however, BeVocal VoiceXML contains this extension to allow developers to start taking advantage of call-control features. BeVocal will continue to monitor the development within these committees. If the features in the BeVocal VoiceXML extension become a part of VoiceXML or a separate call-control standard, BeVocal VoiceXML will implement that standard. We will then deprecate the current extension and provide developers with information on how to convert their applications to the new standard.

Attribute  Description
call  JavaScript expression whose value is a JavaScript object that was initialized by <bevocal:dial>; specifies the call to be disconnected. Optional; if omitted, the inbound call is disconnected.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

Parents
<bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>
Children
None

The execution of a <bevocal:disconnect> tag throws an error.semantic event if the call attribute is not a valid JavaScript expression or the expression does not evaluate to an object representing an outbound call.

See Also
Chapter 6, Controlling Outbound Calls
Related tags: <bevocal:dial>, <bevocal:connect>, <disconnect>

<bevocal:enroll>

Extension. Create and modify enrolled grammars.

Syntax

<bevocal:enroll
    name="string"
    expr="js_expression"
    cond="js_expression"
    grammarname="string"
    speakeridexpr="js_expression"
    phraseidexpr="js_expression"
    minconsistencies="integer"
    maxtries="integer"
    type="mime_type" />

To use this tag, the containing <vxml> tag must declare the XML namespace bevocal by including the following attribute:
xmlns:bevocal="

Attribute  Description
name  Name of the input variable that will hold the enrollment result. The variable name may not be a JavaScript reserved keyword. The input variable has dialog (form) scope; its name must be unique among all VoiceXML and JavaScript variables within the form's scope.
expr  JavaScript expression that assigns the initial value of the input variable for this enrollment. Optional (default is undefined). If you set the input variable to a value other than undefined, you'll need to clear it before the enrollment can execute.
cond  JavaScript boolean expression that also must evaluate to true for the enrollment to execute. Optional (default is true). If not specified, the value of the input variable alone determines whether or not the field can execute.
grammarname  The name of the enrollment grammar to which the new phrase will be added. If the grammar does not exist, it will be created.
speakerid  Deprecated; use speakeridexpr instead.
speakeridexpr  JavaScript expression that evaluates to the ID of the current speaker. Because enrolled grammars are speaker-dependent, recognition against the grammar only works with the same speaker (and same value for speakeridexpr).
phraseidexpr  A unique ID for the phrase being enrolled in the grammar. When the grammar is matched during speech recognition, the phrase ID for the enrolled phrase that was recognized is returned. For example, in an address book application, each name should be enrolled with a different phrase ID. The associated telephone numbers should be stored in a database that maps from phrase ID to telephone number. When a name from the grammar is recognized, the phrase ID can be used to look up the telephone number.
minconsistencies  The minimum number of consistent utterances that must be provided by the user for the enrollment to succeed. minconsistencies must be less than or equal to maxtries or a parse-time error.badfetch will be thrown. The default and minimum value for minconsistencies is 2.
maxtries  The maximum number of enrollment attempts the interpreter will make. If it has not collected minconsistencies consistent utterances in maxtries attempts, an error.enrollment.max_tries event will be thrown. The developer should catch the event and may decide to clear the enroll item and force the FIA to visit it again. maxtries must be greater than or equal to minconsistencies or a parse-time error.badfetch will be thrown. The default value for maxtries is

147 <bevocal:enroll> Attribute type keyexpr The MIME type for the recorded utterance. Allowed types are: audio/wav (RIFF format) audio/basic (m-law encoded) The default type is audio/wav. Deprecated. Use the bevocal.security.key property instead. Properties of the Input Variable Once enrollment succeeds, the input variable is filled with the audio from one of the user's consistent utterances. Applications can send the audio to their back-end server using <submit> or <data>. Properties of the Shadow Variable Corresponding to the <bevocal:enroll> input variable name is a shadow variable name$. After the input variable is filled, some additional information is available in the following properties of this shadow variable: Property clash clashedphraseids enrollaudio numconsistentutterances The number of clashes that occurred during enrollment The existing phrase IDs that clashed with the enroll utterance. The audio data from one of the consistent utterances. This is the same audio data that is stored in the input variable. The number of consistent utterances collected for this enrollment For a field whose name is name, you access the property propname of the shadow variable with the syntax: name$.propname For example, you access the clash property for a field named en1 field as: en1$.clash Usage Model The <bevocal:enroll> tag must collect several utterances until it has a consistent statistical model of the phrase the user is trying to enroll. The first time a <bevocal:enroll> item is visited, it will collect one utterance and then return control to the FIA. Since the minimum value of minconsistencies is 2, the FIA's next iteration will select the same <bevocal:enroll> item, which will then collect a second utterance. This behavior will continue until a consistent enrollment is achieved, maxtries is reached, or an error occurs. When an enroll utterance clashes with an existing phrase in the enrolled grammar, an error.enrollment.clash event is thrown. 
You can use the shadow variable properties name$.clash and name$.clashedphraseids to find out how many clashes occurred and which phrase IDs the enroll utterance clashed with.

Security Considerations

The bevocal.security.key property controls access to enrollment grammars. In this situation, a key can be thought of as a namespace that qualifies the grammarname attribute of <bevocal:enroll>. Applications using one security key are totally prevented from accessing enrollment grammars created by an application using a second key, because their grammars live in separate namespaces.

When you develop applications for one of BeVocal's commercial hosting services, such as Enterprise Hosting, you will need a security key in order to use enrollment. When you develop on Café, you can use enrollment without a key; however, there are limitations. First, there will be an implied key derived from your Café account number. This means that even if you use the same enrollment grammar name from two different Café accounts, you will not be able to access the same enrolled phrases. Second, attempting to enroll more than ten phrases will cause an error.noauthorization event to be thrown.

Usage

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Parents
    <form>

Children
    <audio> <catch> <enumerate> <error> <filled> <grammar> <help> <link> <noinput> <nomatch> <option> <prompt> <property> <value>

Exceptions

Exception   Description

error.enrollment.retries
    The system collected maxtries utterances and still could not achieve a consistent model of the phrase being enrolled.

error.enrollment.clash
    The utterances that were collected clash with those for another phrase already enrolled in this model.

error.noauthorization
    The maximum number of phrases has been enrolled into the grammar. For a Café developer, the enrolled grammar is limited to 10 phrases.

noinput
    No speech was heard before the time specified by the timeout attribute on the last queued prompt (or by the timeout property) elapsed.

See Also

    Chapter 9, Voice Enrollment Grammars in the Grammar Reference, especially for the description of how to refer to an enrolled grammar
    The JavaScript function bevocal.enroll.removeenrolledphrase

Examples

A form that adds items to an enrolled grammar:

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="enroll_names">
        <block>
          Welcome to the address book demo.
          Let's add some names to the address book.
        </block>

        <catch event="error.enrollment.clash">
          Oops! There was a clash for the enrollment sample.
          <exit/>
        </catch>

        <catch event="error.enrollment.max_tries">
          Maximum tries reached. Please try again.
          <exit/>
        </catch>

        <catch event="error.noauthorization">
          Maximum phrases enrolled.
          <exit/>
        </catch>

        <catch event="noinput">
          <prompt> In the noinput handler </prompt>
          <reprompt/>
        </catch>

        <bevocal:enroll name="en1" minconsistencies="2" maxtries="4"
                        grammarname="addressbook"
                        speakeridexpr="'speaker10'"
                        phraseidexpr="'tom'"
                        type="audio/wav">
          <prompt count="1"> Say a name </prompt>
          <prompt count="2"> Say the name again. </prompt>
          <prompt count="3"> Please say the name again. </prompt>
          <filled>
            The enrolled phrase is <value expr="en1"/>
          </filled>
        </bevocal:enroll>

        <bevocal:enroll name="en2" minconsistencies="2" maxtries="4"
                        grammarname="addressbook"
                        speakeridexpr="'speaker10'"
                        phraseidexpr="'jackson'"
                        type="audio/wav">
          <prompt count="1"> Say a name </prompt>
          <prompt count="2"> Say the name again. </prompt>
          <prompt count="3"> Please say the name again. </prompt>
          <filled>
            The enrolled phrase is <value expr="en2$.enrollaudio"/>
          </filled>
        </bevocal:enroll>
      </form>
    </vxml>

And a form that uses that enrolled grammar:

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns:bevocal="
      <link next="addressbook.vxml">
        <grammar>
          #ABNF 1.0;
          root $abook;
          $abook = address book;
        </grammar>
      </link>

      <form id="recognize_names">
        <field name="f1">
          Let's recognize the enrolled names.
          Say one of the enrolled names
          <grammar>
            <![CDATA[
              #ABNF 1.0;
              root $call;
              $call= [call] $<enrolled:/addressbook?speaker=speaker10> [on at] [his her its else] [home work cell] [phone];
            ]]>
          </grammar>
          <filled>
            Calling <value expr="f1"/>
            Thanks for using address book. Good Bye.
          </filled>
        </field>
      </form>
    </vxml>
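The clash and retry handling described under Usage Model can be sketched as a pair of handlers placed inside the enroll item. This is a minimal sketch under stated assumptions, not BeVocal reference code: it assumes the en1$.clash shadow property is readable inside the clash handler, it reuses the error.enrollment.max_tries event name from the example above (the Exceptions table lists error.enrollment.retries instead), and it omits the enclosing <vxml> element, which must declare the bevocal namespace.

```xml
<form id="enroll_with_retry">
  <bevocal:enroll name="en1" minconsistencies="2" maxtries="4"
                  grammarname="addressbook"
                  speakeridexpr="'speaker10'"
                  phraseidexpr="'tom'">
    <prompt count="1">Say a name.</prompt>
    <prompt count="2">Say the name again.</prompt>

    <!-- The utterance clashed with a phrase that is already enrolled. -->
    <catch event="error.enrollment.clash">
      <prompt>
        That name sounds too similar to one already in your address book.
        Number of clashes: <value expr="en1$.clash"/>.
      </prompt>
      <exit/>
    </catch>

    <!-- Too many inconsistent attempts: clear the item so the FIA
         selects it again on its next iteration. -->
    <catch event="error.enrollment.max_tries">
      <prompt>Let's start over.</prompt>
      <clear namelist="en1"/>
    </catch>
  </bevocal:enroll>
</form>
```

Clearing en1 restarts the enrollment from the first prompt, so a production handler would normally also limit how many times it restarts.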

<bevocal:foreach>

Extension. Iterates over the elements of an array.

Note: The <bevocal:foreach> tag has been replaced by the <foreach> tag and may be deprecated in a future release.

Syntax

    <bevocal:foreach item="string" array="string" >
       Executable Content
    </bevocal:foreach>

To use this tag, the containing <vxml> tag must declare the XML namespace bevocal by including the following attribute:

    xmlns:bevocal="

Description

The interpreter sets the iteration variable to each element of the array, in turn, and executes the contained elements once for each setting of the iteration variable.

Attribute   Description

item
    Name of the iteration variable, which may not be a JavaScript reserved keyword. The scope of the iteration variable is the <bevocal:foreach> element.

array
    The name of a variable whose value is an array.

Usage

Parents
    <bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>

Children
    <assign> <audio> <bevocal:connect> <bevocal:dial> <bevocal:disconnect> <bevocal:foreach> <bevocal:hold> <bevocal:whisper> <clear> <data> <disconnect> <exit> <goto> <if> <log> <prompt> <reprompt> <rethrow> <return> <script> <send> <subdialog> <submit> <throw> <value> <var>

See Also

    None

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="form1">
        <block>
          <var name="servicenames" expr="new Array('news', 'weather', 'traffic')"/>
          <prompt>Here's a list of services </prompt>
          <bevocal:foreach item="service" array="servicenames">
            <prompt><value expr="service"/> <break size="small"/></prompt>
          </bevocal:foreach>
        </block>
      </form>
    </vxml>
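Because the note above says <bevocal:foreach> has been replaced by <foreach>, the same loop can be sketched with the unprefixed tag. This sketch assumes that <foreach> accepts the same item and array attributes (as the VoiceXML 2.1 <foreach> element does); the <foreach> reference entry itself is not part of this section, and the enclosing <vxml> element is omitted.

```xml
<form id="form1">
  <block>
    <var name="servicenames" expr="new Array('news', 'weather', 'traffic')"/>
    <prompt>Here's a list of services.</prompt>
    <!-- Unprefixed tag: no bevocal namespace declaration is involved. -->
    <foreach item="service" array="servicenames">
      <prompt><value expr="service"/> <break size="small"/></prompt>
    </foreach>
  </block>
</form>
```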

<bevocal:hold>

Extension. Places an outbound call on hold.

Syntax

    <bevocal:hold call="js_expression" transferaudio="uri" />

To use this tag, the containing <vxml> tag must declare the XML namespace bevocal by including the following attribute:

    xmlns:bevocal="

Description

A <bevocal:hold> tag affects an outbound call that was initiated by a <bevocal:dial> tag; it disconnects the voice path between the user and the called third party, putting the third party on hold. The call remains on hold until reconnected with a <bevocal:connect> tag.

While the call is on hold:

    - The user can hear and talk to the application.
    - The application listens to the user and plays audio output for the user.
    - The called third party hears silence.
    - Neither the user nor the application can hear the called third party.

Note: This and other call-control tags constitute a BeVocal VoiceXML extension. Committees are currently working to standardize call-control features for VoiceXML, and their current approach is different from the BeVocal VoiceXML implementation. Because the approval of any call-control standards will be quite some time in coming, however, BeVocal VoiceXML contains this extension to allow developers to start taking advantage of call-control features. BeVocal will continue to monitor the development within these committees. If the features in the BeVocal VoiceXML extension become a part of VoiceXML or a separate call-control standard, BeVocal VoiceXML will implement that standard. We will then deprecate the current extension and provide developers with information on how to convert their applications to the new standard.

Attribute   Description

call
    JavaScript expression whose value is a JavaScript object that was initialized by <bevocal:dial>; specifies the call to be put on hold.

transferaudio
    Audio (specified by the URI) to play to the third party while on hold.
    Note: This attribute is valid only for VoIP calls.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

Parents
    <bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>

Children
    None

The following events may be thrown by the execution of a <bevocal:hold> tag:

Event   Description

connection.far_end.disconnect
    The specified outbound call has already been disconnected.

error.semantic
    The call attribute is not a valid JavaScript expression or the expression does not evaluate to an object representing an outbound call.

See Also

    Chapter 6, Controlling Outbound Calls
    Related tags: <bevocal:dial>, <bevocal:connect>
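As a rough illustration of the attributes and events listed above, the following sketch places a call on hold and handles both documented events. The variable myCall is hypothetical; it is assumed to hold the call object that an earlier <bevocal:dial> initialized, and the syntax of <bevocal:dial> and <bevocal:connect> is deliberately not shown because those tags are documented elsewhere. The enclosing <vxml> element, with its bevocal namespace declaration, is omitted.

```xml
<form id="hold_demo">
  <block name="put_on_hold">
    <!-- myCall: hypothetical variable set by an earlier <bevocal:dial>. -->
    <bevocal:hold call="myCall"/>
    <prompt>Your party is now on hold.</prompt>
  </block>

  <!-- The third party hung up before the hold took effect. -->
  <catch event="connection.far_end.disconnect">
    <prompt>The other party has already hung up.</prompt>
  </catch>

  <!-- myCall did not evaluate to an outbound-call object. -->
  <catch event="error.semantic">
    <log>bevocal:hold was given an invalid call object.</log>
    <exit/>
  </catch>
</form>
```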

<bevocal:listen>

Extension. Allows the application to suspend execution and listen to the user during an outbound call.

Syntax

    <bevocal:listen name="string" expr="js_expression" cond="js_expression" >
       Optional Content
    </bevocal:listen>

To use this tag, the containing <vxml> tag must declare the XML namespace bevocal by including the following attribute:

    xmlns:bevocal="

Description

A <bevocal:listen> element lets the application listen to the user during an active outbound call that was initiated by a <bevocal:dial> tag.

During execution of the <bevocal:listen> element, the user and the called third party talk to each other. The application is quiet. No universal grammars or grammars in higher scopes are active. The application ignores all speech and DTMF signals from the called third party.

If the <bevocal:listen> element includes any child grammars, the application listens to the user. If a user utterance matches a child grammar, the <bevocal:listen> tag's input variable is set to the recognition result. The bevocal.hotwordmin and bevocal.hotwordmax properties specify the minimum and maximum time duration, respectively, for an utterance that can match a <bevocal:listen> grammar.

If a user utterance matches a child grammar, any child <filled> elements are executed in the order in which they occur. Recognition does not cause the call to terminate. If desired, a <filled> element can disconnect the call explicitly:

    - Use the <disconnect> tag to end the inbound call; doing so disconnects any outbound call and terminates the session.
    - Use the <bevocal:disconnect> tag to terminate the outbound call and continue with the session.

Following execution of the <bevocal:listen> element, the interpreter proceeds as usual, searching for the next form item to execute.

Note: This and other call-control tags constitute a BeVocal VoiceXML extension. Committees are currently working to standardize call-control features for VoiceXML; however, the approval of call-control standards will take some time. BeVocal VoiceXML contains this extension to allow developers to take advantage of call-control features before standards are available, even though the eventual standards are likely to be nothing like the current extension. When the features in the BeVocal VoiceXML extension become a part of VoiceXML or a separate call-control standard, BeVocal VoiceXML will implement the standard. We will then deprecate the current extension and provide developers with information on how to upgrade their applications to the new standard.

Attribute   Description

name
    Name of the input variable that will hold the recognition result. The variable name may not be a JavaScript reserved keyword. The input variable has dialog (form) scope; its name must be unique among all VoiceXML and JavaScript variables within the form's scope.

expr
    JavaScript expression that assigns the initial value of the input variable for this element. Optional (default is undefined). If you set the input variable to a value other than undefined, you'll need to clear it before this element can execute.

cond
    JavaScript boolean expression that also must evaluate to true for this element to execute. Optional (default is true). If not specified, the value of the input variable alone determines whether or not this element can execute.

The following events may be thrown by the execution of a <bevocal:listen> element:

Event   Description

connection.disconnect.hangup
    The user hung up. New in VoiceXML 2.0.

telephone.disconnect.hangup
    The user hung up. VoiceXML 1.0 only.

connection.far_end.disconnect
    The called third party hung up.

connection.far_end.disconnect.timeout
    The outbound call was terminated due to timeout.

error.semantic
    No outbound call is active.

Properties of the Shadow Variable

Corresponding to the input variable name is a shadow variable called name$. After the input variable is filled, some additional information is available in the following properties of this shadow variable:

Property   Description

confidence
    The recognition confidence level (with 0.0 representing the lowest confidence and 1.0 representing the highest).

utterance
    A string representation of the words actually spoken by the user.

inputmode
    The mode in which input was provided, one of voice or dtmf.

duration
    After the transfer is complete, the floating point duration of the call in milliseconds.

For an element whose name is name, you access the property propname of the shadow variable with the syntax:

    name$.propname

For example, you access the duration property for an element named services as:

    services$.duration

Usage

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Parents
    <form>

Children
    <audio> <catch> <error> <filled> <grammar> <help> <noinput> <nomatch> <prompt> <property> <value>

See Also

    Chapter 6, Controlling Outbound Calls
    Related tags: <bevocal:dial>, <bevocal:disconnect>, <bevocal:whisper>
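To make the grammar, <filled>, and shadow-variable behavior above concrete, here is a minimal sketch of a listen item that waits for the user to say a single keyword during the call. It assumes an outbound call started by <bevocal:dial> is already active; the item name cmd and the one-word grammar are illustrative only, and the enclosing <vxml> element (with the bevocal namespace declaration) is omitted.

```xml
<form id="monitor_call">
  <bevocal:listen name="cmd">
    <!-- Only this child grammar is active while the parties talk. -->
    <grammar>
      #ABNF 1.0;
      root $cmd;
      $cmd = operator;
    </grammar>

    <!-- Runs when the user's utterance matches; the call stays up. -->
    <filled>
      <log>
        Heard <value expr="cmd$.utterance"/>
        with confidence <value expr="cmd$.confidence"/>
      </log>
    </filled>

    <!-- The called third party hung up while the application was listening. -->
    <catch event="connection.far_end.disconnect">
      <prompt>The other party hung up.</prompt>
      <exit/>
    </catch>
  </bevocal:listen>
</form>
```

The handler exits rather than returning to the form, because revisiting the listen item with no active outbound call would throw error.semantic.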

<bevocal:register>

Extension. Register a voice print that can be used to verify caller identity.

Syntax

    <bevocal:register
       name="string"
       type="boolean" | "date" | "digits" | "currency" | "number" | "phone" | "time"
       keyexpr="js_expression"
       mode="adapt" | "delete" | "skip">
       Child Elements
    </bevocal:register>

Description

Input item that prompts for a value matching a particular grammar and then registers the user's response as a voice print for the specified key.

Any utterances that match a universal grammar or a grammar in document or application scope are recognized as they would be in a <field> element. However, only those utterances that match the grammar of the <bevocal:register> element itself are used to create or adapt a voice print.

Note: Unlike other input items, a <bevocal:register> element is not a possible go-back destination for the go-back facility. That is, the user cannot say "go back" to return to a <bevocal:register> element, retracting earlier input to that element. See Chapter 7, Go-Back Facility for a description of the go-back facility, a BeVocal VoiceXML extension.

Any DTMF input is rejected and results in a nomatch event.

Attribute   Description

name
    Name of the input variable that will hold the recognition result. The variable name may not be a JavaScript reserved keyword. The input variable has dialog (form) scope.

type
    Specifies an internal grammar. Optional (as alternative to a child <grammar> element).
      boolean - Grammar for recognizing positive and negative responses. Returns true for yes and false for no.
      date - Grammar for recognizing dates. Returns a string with format yyyymmdd; ???? is used for an unknown year and ?? is used for an unknown month or day.
      digits - Limited grammar for recognizing a sequence of digits. Returns a string of digits.
      currency - Grammar for recognizing amounts of money, in dollars (Not Implemented: International currency designation). Returns a string with format mm.nn.
      number - More general grammar for recognizing numbers. Returns a string that could include digits, a decimal point, or a positive or negative sign.
      phone - Grammar for recognizing a telephone number adhering to the North American Dialing Plan (with no extension). Returns a sequence of digits.
      time - Grammar for recognizing a time. Returns a string with format hhmmx, where x is one of: a for AM, p for PM, h for 24-hour notation, or ? for an ambiguous time (could be AM or PM).

keyexpr
    JavaScript expression whose value is the key identifying this user's voice print.

mode
    Action to take if a voice print exists for the key specified by the keyexpr attribute. Optional (default is adapt).
      delete - Delete the existing voice print and start creating a new voice print.
      adapt - Refine the existing voice print with the user's response.
      skip - Skip execution of the <bevocal:register> tag.

Usage

Parents
    <form>

Children
    <audio> <catch> <error> <filled> <grammar> <help> <noinput> <nomatch> <prompt> <property> <value>

See Also

    Related tag: <bevocal:verify>
    Speaker Verification

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="register_user">
        <field name="account" type="digits">
          <prompt>What is your account number?</prompt>
          <filled>
            You will now be registered as the only authorized caller for this account.
          </filled>
        </field>

        <bevocal:register name="phone1" type="phone" keyexpr="account" mode="delete">
          <prompt>Please say your telephone number.</prompt>
        </bevocal:register>

        <bevocal:register name="phone2" type="phone" keyexpr="account">
          <prompt>Please repeat your telephone number.</prompt>
          <filled>
            <if cond="phone1 != phone2">
              <prompt>
                You did not say the same phone number both times.
                Let's try again.
              </prompt>
              <clear namelist="phone1 phone2"/>
            </if>
          </filled>
        </bevocal:register>

        <bevocal:register name="numbers" type="digits" keyexpr="account">
          <prompt>
            Please say one two three four five, one two three four five
          </prompt>
        </bevocal:register>

        <bevocal:register name="color" keyexpr="account">
          <grammar> [red blue yellow green] </grammar>
          <prompt>
            Please say one of the colors red, blue, yellow, or green.
          </prompt>
        </bevocal:register>
      </form>
    </vxml>

<bevocal:verify>

Extension. Verify that the speaker's voice matches a stored voice print.

Syntax

    <bevocal:verify
       name="string"
       type="boolean" | "date" | "digits" | "currency" | "number" | "phone" | "time"
       keyexpr="js_expression">
       Child Elements
    </bevocal:verify>

Description

Input item that prompts for a value matching a particular grammar and compares the user's voice against the stored voice print for the specified key. Throws an error.verify.keynotfound event if no voice print has been registered for that key.

Any utterances that match a universal grammar or a grammar in document or application scope are recognized as they would be in a <field> element. However, only those utterances that match the grammar of the <bevocal:verify> element itself are used to verify the speaker's identity.

Any DTMF input is rejected and results in a nomatch event.

Note: Unlike other input items, a <bevocal:verify> element is not a possible go-back destination for the go-back facility. That is, the user cannot say "go back" to return to a <bevocal:verify> element, retracting earlier input to that element. See Chapter 7, Go-Back Facility for a description of the go-back facility, a BeVocal VoiceXML extension.

Attribute   Description

name
    Name of the input variable that will hold the recognition result. The variable name may not be a JavaScript reserved keyword. The input variable has dialog (form) scope.

type
    Specifies an internal grammar. Optional (as alternative to a child <grammar> element).
      boolean - Grammar for recognizing positive and negative responses. Returns true for yes and false for no.
      date - Grammar for recognizing dates. Returns a string with format yyyymmdd; ???? is used for an unknown year and ?? is used for an unknown month or day.
      digits - Limited grammar for recognizing a sequence of digits. Returns a string of digits.
      currency - Grammar for recognizing amounts of money, in dollars (Not Implemented: International currency designation). Returns a string with format mm.nn.
      number - More general grammar for recognizing numbers. Returns a string that could include digits, a decimal point, or a positive or negative sign.
      phone - Grammar for recognizing a telephone number adhering to the North American Dialing Plan (with no extension). Returns a sequence of digits.
      time - Grammar for recognizing a time. Returns a string with format hhmmx, where x is one of: a for AM, p for PM, h for 24-hour notation, or ? for an ambiguous time (could be AM or PM).

keyexpr
    JavaScript expression whose value is the key identifying this user's voice print.

Properties of the Shadow Variable

Corresponding to the input variable name is a shadow variable whose name is name$. The verifier returns the result of the verification in the decision property of this shadow variable. The possible results are:

Value   Description

accepted
    The speaker's voice matches the voice print.

rejected
    The speaker's voice does not match the voice print.

unsure
    The verifier could not determine whether the speaker's voice matches the voice print.

For a <bevocal:verify> element whose name is name, you access the decision property with the syntax:

    name$.decision

Your application can use the value of the decision property to decide how to proceed. For example, if the value is unsure, it could repeat the verification step.

Usage

Parents
    <form>

Children
    <audio> <catch> <error> <filled> <grammar> <help> <noinput> <nomatch> <prompt> <property> <value>

See Also

    Related tag: <bevocal:register>
    Speaker Verification

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="verify_user">
        <field name="account" type="digits">
          <prompt>What is your account number?</prompt>
        </field>

        <bevocal:verify name="check1" type="phone" keyexpr="account">
          <prompt>Please say your telephone number.</prompt>
          <filled>
            <if cond="check1$.decision == 'accepted'">
              <prompt>
                You have been verified as the authorized caller for
                account number <value expr="account"/>.
              </prompt>
            <elseif cond="check1$.decision == 'rejected'" />
              <prompt>
                Sorry. You are not the authorized caller for
                account number <value expr="account"/>.
              </prompt>
              <exit/>
            <elseif cond="check1$.decision == 'unsure'" />
              <prompt>
                Unable to verify that you are the authorized caller for
                account number <value expr="account"/>.
                Let's try again.
              </prompt>
              <clear namelist="check1"/>
            </if>
          </filled>
        </bevocal:verify>
      </form>
    </vxml>
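The example above repeats the verification step for as long as the decision is unsure. One way to bound that loop, and to handle the error.verify.keynotfound event mentioned in the description, is sketched below. The attempts counter and the limit of three tries are illustrative assumptions, not part of the tag; the enclosing <vxml> element with its bevocal namespace declaration is omitted.

```xml
<form id="verify_with_limit">
  <!-- attempts: hypothetical counter kept by the application. -->
  <var name="attempts" expr="0"/>

  <field name="account" type="digits">
    <prompt>What is your account number?</prompt>
  </field>

  <bevocal:verify name="check1" type="phone" keyexpr="account">
    <prompt>Please say your telephone number.</prompt>

    <!-- No voice print was ever registered for this key. -->
    <catch event="error.verify.keynotfound">
      <prompt>No voice print is registered for that account.</prompt>
      <exit/>
    </catch>

    <filled>
      <assign name="attempts" expr="attempts + 1"/>
      <if cond="check1$.decision == 'accepted'">
        <prompt>You have been verified.</prompt>
      <elseif cond="check1$.decision == 'unsure' &amp;&amp; attempts &lt; 3"/>
        <!-- Clearing the item makes the FIA visit it again. -->
        <prompt>Let's try again.</prompt>
        <clear namelist="check1"/>
      <else/>
        <prompt>Sorry, we could not verify you.</prompt>
        <exit/>
      </if>
    </filled>
  </bevocal:verify>
</form>
```

Note that the && and < operators in the cond expression are written as escape sequences, as XML attribute values require.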

<bevocal:whisper>

Extension. Interrupts an outbound call, allowing the application to play audio output to the called third party while the user is on hold.

Syntax

    <bevocal:whisper call="js_expression" transferaudio="uri" >
       Optional Content
    </bevocal:whisper>

To use this tag, the containing <vxml> tag must declare the XML namespace bevocal by including the following attribute:

    xmlns:bevocal="

Description

A <bevocal:whisper> element lets the application interrupt an active outbound call that was initiated by a <bevocal:dial> tag. During execution of the <bevocal:whisper> element:

    - The application plays audio output for the called third party.
    - The application ignores all speech and DTMF signals, whether from the user or from the called third party.
    - The called third party can hear the application, but cannot hear the user.
    - The user hears silence.

Note: This and other call-control tags constitute a BeVocal VoiceXML extension. Committees are currently working to standardize call-control features for VoiceXML, and their current approach is different from the BeVocal VoiceXML implementation. Because the approval of any call-control standards will be quite some time in coming, however, BeVocal VoiceXML contains this extension to allow developers to start taking advantage of call-control features. BeVocal will continue to monitor the development within these committees. If the features in the BeVocal VoiceXML extension become a part of VoiceXML or a separate call-control standard, BeVocal VoiceXML will implement that standard. We will then deprecate the current extension and provide developers with information on how to convert their applications to the new standard.

Attribute   Description

call
    JavaScript expression whose value is a JavaScript object that was initialized by <bevocal:dial>; specifies the call to be interrupted.

transferaudio
    Audio (specified by the URI) to play to the user while the called third party hears the application.
    Note: This attribute is valid only for VoIP calls.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

Parents
    <bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>

Children
    <audio> <break> <emphasis> <enumerate> <mark> <p> <paragraph> <phoneme> <prosody> <s> <say-as> <sentence> <value> <voice>

The following events may be thrown by the execution of a <bevocal:whisper> element:

Event   Description

connection.far_end.disconnect
    The specified outbound call has already been disconnected.

error.semantic
    The call attribute is not a valid JavaScript expression or the expression does not evaluate to an object representing an active outbound call.

See Also

    Chapter 6, Controlling Outbound Calls
    Related tags: <bevocal:dial>, <bevocal:listen>
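A whisper is typically a short announcement to the called party. The sketch below shows one possible shape, using only child elements from the list above. Both myCall (assumed to be the object initialized by an earlier <bevocal:dial>) and callerName are hypothetical variables, and it is assumed that plain text is accepted as content in the same way as inside <prompt>.

```xml
<block name="announce">
  <!-- myCall and callerName are hypothetical application variables. -->
  <bevocal:whisper call="myCall">
    You have a call from <value expr="callerName"/>.
    <break size="small"/>
    Connecting you now.
  </bevocal:whisper>
</block>
```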

<block>

Contains (non-interactive) executable code.

Syntax

    <block name="string" expr="js_expression" cond="js_expression" >
       Executable Content
    </block>

Description

Control item container of executable code. As with all form items, a block's form-item variable must have a value of undefined before the block can execute. Just before the block is entered, the interpreter sets its form-item variable to true, so a block is typically executed only once per form invocation.

Attribute   Description

name
    Name of the form-item variable, which may not be a JavaScript reserved keyword. Optional (default is an unusable internal name). The form-item variable has dialog (form) scope; its name must be unique among all VoiceXML and JavaScript variables within the form's scope. Generally, you use this attribute only if you want to control block execution explicitly.

expr
    JavaScript expression that assigns the initial value of the form-item variable. Optional (default is undefined). If you set the form-item variable to a value other than undefined, then you'll need to clear it before the block can execute. Note that you need to give the block a name if you want to clear it separately from other form-item variables.

cond
    JavaScript boolean expression that must also evaluate to true for the block to execute. Optional (default is true). If not specified, the value of the form-item variable alone determines whether or not the block can execute.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

Parents
    <form>

Children
    <audio> <assign> <bevocal:connect> <bevocal:dial> <bevocal:disconnect> <bevocal:foreach> <bevocal:hold> <bevocal:whisper> <clear> <data> <disconnect> <enumerate> <exit> <goto> <if> <log> <prompt> <reprompt> <rethrow> <return> <script> <send> <subdialog> <submit> <throw> <var> <value>

See Also

    VoiceXML 2.0 Specification: <block>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="foo">
        <block name="hello">
          <prompt>
            Welcome to BeVocal Cafe. It is the best known place
            for Voice X M L Development.
            <audio src="bevocal_chimes.wav" />
          </prompt>
        </block>
      </form>
    </vxml>
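The name and cond attributes matter mainly when a block has to run more than once. The following sketch, with hypothetical names (greeting, again, plays), offers the caller a single replay: clearing the named block's form-item variable lets the FIA select it again, and the cond expressions stop the cycle after the second play. The enclosing <vxml> element is omitted.

```xml
<form id="welcome">
  <!-- plays: hypothetical counter of how often the greeting has run. -->
  <var name="plays" expr="0"/>

  <!-- Named so it can be cleared on its own; &lt; is the escape for "<". -->
  <block name="greeting" cond="plays &lt; 2">
    <prompt>Welcome to the address book.</prompt>
    <assign name="plays" expr="plays + 1"/>
  </block>

  <!-- Asked only while a replay is still available. -->
  <field name="again" type="boolean" cond="plays &lt; 2">
    <prompt>Would you like to hear that again?</prompt>
    <filled>
      <if cond="again">
        <!-- Reset both form-item variables so the block runs again. -->
        <clear namelist="greeting again"/>
      </if>
    </filled>
  </field>
</form>
```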

<break>

Speech Synthesis Markup Language element that inserts a pause in audio output.

Syntax

    <break time="time_interval" size="none" | "small" | "medium" | "large" />

Attribute   Description

time
    New in VoiceXML 2.0. Amount of time to pause, in milliseconds. Optional (default is 1 second). Express the time interval as an unsigned number followed by s for time in seconds or ms for time in milliseconds (the default).

size
    How long to pause, specified qualitatively. Optional (as alternative to msecs).
      none - No pause.
      small - Short pause (500 milliseconds).
      medium - Longer pause (1 second).
      large - Long pause (2 seconds).

(VoiceXML 1.0 only) The following attributes can be used in applications in which the version attribute of the <vxml> tag is set to 1.0.

Attribute   Description

msecs
    VoiceXML 1.0 only. Amount of time to pause, in milliseconds. Optional (default is 1 second). Express the time interval as an unsigned number followed by s for time in seconds or ms for time in milliseconds (the default). Used in place of the VoiceXML 2.0 time attribute.

Usage

Parents
    <audio> <bevocal:whisper> <choice> <emphasis> <enumerate> <p> <paragraph> <prompt> <prosody> <s> <sentence> <voice>

Children
    None.

See Also

    VoiceXML 2.0 Specification: <break>
    Related tags: <emphasis>, <mark>, <paragraph>, <phoneme>, <prosody>, <say-as>, <sentence>, <voice>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="demo-break">
        <block>
          <prompt>
            Voice X M L allows the programming of silence
            <break size="medium"/> with the break tag.
          </prompt>
        </block>
      </form>
    </vxml>
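The example above uses only the size attribute. A short sketch contrasting the two ways of specifying a pause, based on the attribute descriptions above (the prompt wording is illustrative):

```xml
<prompt>
  Your confirmation code is one two three.
  <!-- Explicit duration, using the VoiceXML 2.0 time attribute. -->
  <break time="500ms"/>
  I repeat:
  <!-- Qualitative pause; large is described above as 2 seconds. -->
  <break size="large"/>
  one two three.
</prompt>
```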

<catch>

Catches an event.

Syntax

    <catch
        event="event1..."
        count="integer"
        cond="js_expression" >
      Executable Content
    </catch>

Container for event handling code. Like <block>, you can put non-interactive executable code (procedural logic) in a <catch> element to handle an event.

Attributes

event
    Name of the event(s) to catch. Optional. New in VoiceXML 2.0: you can catch all events by omitting this attribute.

count
    Minimum number of times the event must have occurred during a form or menu invocation. Lets you handle different occurrences of the same event differently. Optional (default is 1).

cond
    JavaScript boolean expression that must also evaluate to true for an event to be caught. Optional (default is true).

Although you can define your own events, there is a set of predefined events. The VoiceXML interpreter provides a standard set of default event handlers for the predefined events.

If multiple handlers for a given event are defined in, or inherited by, the element in which the event occurs, one handler is chosen based on count, scope, and document order. See Chapter 3, Event Handling.

Tips:

Because you can throw events from within a <catch>, be sure to avoid infinite loops. For example, the following handler would result in an infinite loop:

    <catch event="foobar">
      <throw event="foobar"/>
    </catch>

You can use a <submit> within a <catch> for a hang up event to notify the server that the call has ended (connection.disconnect.hangup in VoiceXML 2.0; telephone.disconnect.hangup in VoiceXML 1.0). Because the call is no longer connected, any VoiceXML document returned from the server will be ignored and the interpreter will exit. Similarly, if you use a <goto> within a <catch> for this event, it will be ignored and the interpreter will exit.

Within an event handler, the _event variable contains the name of the event currently being handled; the _message variable contains the message string that provides additional information about the event. If no message was supplied when the event was thrown, the _message variable is undefined.

The VoiceXML 2.0 specification calls the method by which event handlers are inherited from ancestor elements "as if by copy" semantics. It helps to think of the appropriate event handler as literally being copied into the scope where the event was thrown. Variable references are therefore resolved relative to the scope of the element where the event was thrown, and URL references are resolved relative to the document from which the event was thrown. For example, if a <catch> handler in an application root document is in a different directory than the main document that threw the event, URLs in the handler are resolved relative to the directory of the main document. Resolving URLs relative to the originating document is considered 2.0 behavior and applies only when the <vxml> tag's version attribute is set to "2.0" or greater.

In a VoiceXML 2.0 document, <catch> with an empty event attribute (<catch event="">) is no longer allowed. To catch all events, use <catch> or <catch event=".">. If the version is set to 1.0, the empty event attribute can catch all events; however, we suggest that you use the new <catch> syntax because it works in both versions of the interpreter.

If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

Parents
    <bevocal:listen> <bevocal:register> <bevocal:verify> <field> <form> <initial> <menu> <record> <subdialog> <transfer> <vxml>

Children
    <audio> <assign> <bevocal:connect> <bevocal:dial> <bevocal:disconnect> <bevocal:foreach> <bevocal:hold> <bevocal:whisper> <clear> <data> <disconnect> <enumerate> <exit> <goto> <if> <log> <prompt> <reprompt> <rethrow> <return> <script> <send> <subdialog> <submit> <throw> <var> <value>

See Also

VoiceXML 2.0 Specification: <catch>
Related variables: _event, _message
Related tags: <error>, <help>, <noinput>, <nomatch>, <rethrow>, <throw>

Examples

See other catch and throw examples under <throw> and <rethrow>.

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <property name="universals" value="help" />
      <form id="stars">
        <catch event="nomatch">
          <prompt> Sorry, I did not hear a number more than ten. </prompt>
          <reprompt/>
        </catch>
        <block name="numbergame">
          You can create games with numbers by using JavaScript
        </block>
        <field name="mynumber" type="number">
          <prompt>
            Tell me a number and I can repeat it for you.
            You can say help for information.
          </prompt>
          <catch event="help">
            <prompt>
              Please say a number more than 10 and less than infinity.
            </prompt>
          </catch>
          <filled>
            <if cond="mynumber &gt; 10">
              <prompt> The number you said is <value expr="mynumber"/> </prompt>
            <else/>
              <clear namelist="mynumber"/>
              <throw event="nomatch"/>
            </if>
          </filled>
        </field>
      </form>
    </vxml>
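The count attribute and the _event variable described for <catch> can be sketched as follows. This is an illustrative document, not one taken from the guide: the form id, the field name, the city grammar, and the prompt wording are assumptions, the DOCTYPE is omitted, and the namespace is the standard W3C VoiceXML namespace. The inline grammar uses the same Nuance GSL type that the guide's other examples use.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <property name="universals" value="help"/>
  <form id="pick_city"> <!-- hypothetical form -->
    <field name="city">
      <grammar type="application/x-nuance-gsl">
        [boston chicago denver]
      </grammar>
      <prompt> Which city? </prompt>

      <!-- Chosen for the first and second nomatch (count defaults to 1). -->
      <catch event="nomatch">
        <prompt> Sorry, I did not understand. </prompt>
        <reprompt/>
      </catch>

      <!-- Chosen from the third nomatch onward, because its count is
           the highest one that does not exceed the occurrence count. -->
      <catch event="nomatch" count="3">
        <prompt> Please say Boston, Chicago, or Denver. </prompt>
        <reprompt/>
      </catch>

      <!-- One handler for two events; _event tells them apart. -->
      <catch event="noinput help">
        <log> Handling event <value expr="_event"/> </log>
        <prompt> Say the name of a city. </prompt>
      </catch>

      <filled>
        <prompt> You chose <value expr="city"/>. </prompt>
      </filled>
    </field>
  </form>
</vxml>
```

Giving later occurrences their own handler is a common way to escalate from a terse apology to more detailed guidance.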

<choice>

Defines a menu item.

Syntax

    <choice
        accept="exact" | "approximate"
        next="uri"
        event="event"
        expr="js_expression"
        dtmf="dtmf_sequence"
        eventexpr="js_expression"
        message="string"
        messageexpr="js_expression"
        fetchhint="prefetch" | "safe"
        fetchtimeout="time_interval"
        fetchaudio="uri"
        maxage="time_interval"
        maxstale="time_interval" >
      Choice Text
    </choice>

Attributes

accept
    New in VoiceXML 2.0. Specifies whether the default grammar generated for this <choice> element requires all words or accepts a subset of the words; overrides the accept attribute of the parent <menu> element. Optional.
        exact          Requires the user to say the exact phrase that appears in the <choice> element.
        approximate    Allows the user to say a subset of the words in the <choice> element.
    Note: The default is exact if the version attribute of the containing <vxml> element specifies 2.0 or greater. For backward compatibility, the default is approximate if the version attribute is less than 2.0 or unspecified.

next
    URI of the dialog or document to visit when this choice is selected. Optional (as an alternative to event or expr).

event
    An event to throw when this choice is selected. Optional (as an alternative to next or expr).

expr
    JavaScript expression that evaluates to a URI of the dialog or document to visit when this choice is selected. Optional (as an alternative to next or event).

dtmf
    DTMF sequence that can be used to select this choice. Optional (default is no DTMF value).

eventexpr
    New in VoiceXML 2.0. A JavaScript expression evaluating to the name of the event to be thrown.

message
    New in VoiceXML 2.0. A message string providing additional context about the event being thrown. The message is available as the value of a variable within the scope of the <catch> element.

messageexpr
    New in VoiceXML 2.0. A JavaScript expression evaluating to the message string.

fetchhint
    Specifies whether the interpreter can attempt to optimize dialog interpretation by prefetching the resource. See Prefetching Resources on page 40. Optional.

fetchtimeout
    Specifies the interval to wait for the resource to be returned before throwing an error.badfetch event. See Handling Fetching Delays on page 42. Optional.

fetchaudio
    Specifies the URI of background audio to be played during fetching. Note that this attribute and related properties affect whether queued prompts are played first. See Background Audio on page 42 for important details. Optional.

maxage
    New in VoiceXML 2.0. Specifies the maximum acceptable age, in seconds, of the cached resource. See Maximum Age on page 44. Optional.

maxstale
    New in VoiceXML 2.0. Specifies the maximum acceptable time, in seconds, during which an expired cached resource can still be used. See Maximum Stale Time on page 44. Optional.

One and only one of the next, expr, or event attributes must be specified.

(VoiceXML 1.0 only) The following attribute can be used in applications in which the version attribute of the <vxml> tag is set to 1.0.

caching
    VoiceXML 1.0 only. Specifies the caching policy for the resource being fetched. See Caching on page 45. Optional. Used in place of the VoiceXML 2.0 attributes maxage and maxstale.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

Parents
    <menu>

Children
    <grammar>

Exceptions

error.badfetch - If the dtmf attribute of <menu> is true and any of the menu's <choice> tags have specified their own DTMF sequences to be something other than "*", "#" or "0".

error.semantic - If not previously declared using JavaScript var within <script> or VoiceXML <var> or via being named a form item, like a <field name="foo"/>.

See Also

VoiceXML 2.0 Specification: <choice>
Related tags: <enumerate>, <menu>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <menu>
        <prompt>
          Welcome to BeVocal.
          <enumerate>
            For the <value expr="_prompt"/> service, say <value expr="_prompt"/>
          </enumerate>
        </prompt>
        <choice accept="approximate" next="movies.vxml">movie finder</choice>
        <choice next="horoscopes.vxml">horoscopes</choice>
        <choice accept="approximate" next="news.vxml">news desk</choice>
      </menu>
    </vxml>
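The example above selects choices by voice and uses only the next attribute. The following sketch shows the dtmf, event, and message attributes together. The event name operator.requested, the message text, and the prompts are hypothetical; the DOCTYPE is omitted and the namespace is the standard W3C VoiceXML namespace. The menu does not set its own dtmf attribute, so the restriction listed under Exceptions is not assumed to apply.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <menu id="main">
    <prompt>
      For movies, press 1. For news, press 2.
      To reach an operator, press 0.
    </prompt>

    <!-- Each choice can be spoken or selected with its DTMF key. -->
    <choice dtmf="1" next="movies.vxml">movie finder</choice>
    <choice dtmf="2" next="news.vxml">news desk</choice>

    <!-- This choice throws an application-defined event (hypothetical
         name) instead of transitioning to another document. -->
    <choice dtmf="0" event="operator.requested"
            message="caller asked for an operator">operator</choice>

    <catch event="operator.requested">
      <!-- _message holds the string from the message attribute. -->
      <log> <value expr="_message"/> </log>
      <prompt> One moment please. </prompt>
      <exit/>
    </catch>
  </menu>
</vxml>
```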

<clear>

Clears one or more form-item variables.

Syntax

    <clear namelist="variable1..." />

Attributes

namelist
    Space-separated list of variables to reset. Optional (default behavior clears all form-item variables in the current form). This attribute can specify any variable currently in scope, both VoiceXML variables and JavaScript variables, including variables that have not been explicitly declared.

When the interpreter resets a form item, it:

    Sets the form-item variable's value to undefined.
    Resets the form item's prompt and event counters to 0.

Usage

Parents
    <bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>

Children
    None.

See Also

VoiceXML 2.0 Specification: <clear>
Related tags: <field>, <record>, <subdialog>, <transfer>, <block>, <initial>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="form1">
        <field name="ssn" type="digits">
          <prompt> Please say your S S N number. </prompt>
        </field>
        <field name="passcode" type="digits">
          <prompt> Please say your passcode number </prompt>
        </field>
        <field name="choice" type="boolean">
          <prompt>
            Your S S N is
            <say-as type="number:digits"> <value expr="ssn"/> </say-as>
            and your passcode is
            <say-as type="number:digits"> <value expr="passcode"/> </say-as>.
            Are the values right?
          </prompt>
          <filled>
            <if cond="choice">
              <prompt> That is good. </prompt>
            <else/>
              <clear/>
              <prompt> Let's do it again </prompt>
            </if>
          </filled>
        </field>
      </form>
    </vxml>
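The example above calls <clear/> with no namelist, which resets every form item. A minimal sketch of the namelist attribute follows, assuming a form in which only some answers should be collected again. The form id, field names, and prompts are illustrative; the DOCTYPE is omitted and the namespace is the standard W3C VoiceXML namespace.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="login"> <!-- hypothetical form -->
    <field name="account" type="digits">
      <prompt> Please say your account number. </prompt>
    </field>
    <field name="pin" type="digits">
      <prompt> Please say your PIN. </prompt>
    </field>
    <field name="ok" type="boolean">
      <prompt> I heard PIN <value expr="pin"/>. Is that right? </prompt>
      <filled>
        <if cond="!ok">
          <!-- Reset only pin and ok; account keeps its value. Clearing ok
               as well matters: otherwise the confirmation field would
               still be filled and would not be asked again. -->
          <clear namelist="pin ok"/>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```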

<data>

Experimental Extension. Fetches arbitrary XML data from an HTTP server, or submits values to a server.

Syntax

    <data
        src="uri"
        name="string"
        srcexpr="js_expression"
        expr="js_expression"
        method="get" | "post"
        namelist="variable1..."
        enctype="mime_type"
        fetchhint="prefetch" | "safe"
        fetchtimeout="time_interval"
        fetchaudio="uri"
        maxage="time_interval"
        maxstale="time_interval" />

The <data> tag fetches or submits data without transitioning to a new VoiceXML document. The XML data fetched by the <data> element is returned in a read-only JavaScript variable via an object model as specified in the W3C Document Object Model (DOM). Appendix E of the specification describes the JavaScript language binding for the DOM.

Note: The current BeVocal VoiceXML implementation of <data> precedes finalization of the standard to give developers the opportunity to use the tag and provide feedback, which we can pass on to the W3C. If <data> is standardized, the BeVocal VoiceXML implementation will change as necessary to match the VoiceXML standard. If such changes occur, we will attempt to maintain backwards compatibility with the current implementation.

Attributes

src
    URI specifying the location of the XML data. Optional (as alternative to expr).

name
    Variable name, which must be a valid JavaScript identifier and may not be a reserved keyword in either JavaScript or Java. If the name attribute is omitted, the HTTP request is submitted, but the retrieved data is ignored.

srcexpr
    JavaScript expression that evaluates to the URI of the XML data. Optional (as alternative to src). If you specify this attribute, the element cannot have content. You can specify either srcexpr or expr, but not both. If you specify both, a parse error is thrown.

expr
    Extension. JavaScript expression that evaluates to the URI of the XML data. Optional (as alternative to src).
    Note: The expr attribute has been replaced by the srcexpr attribute and may be deprecated in a future release.
    If you specify this attribute, the element cannot have content. You can specify either srcexpr or expr, but not both. If you specify both, a parse error is thrown.

method
    The query request method, either get or post. Optional (default is get).

enctype
    MIME encoding used when submitting data with the post method. Optional (default is application/x-www-form-urlencoded). The supported types are:
        application/x-www-form-urlencoded
        multipart/form-data
    The type multipart/form-data is more efficient when submitting large amounts of binary data.

namelist
    Space-separated list of variables to submit to the server. Optional (default is to submit no variables). This attribute can specify any variable currently in scope, both VoiceXML variables and JavaScript variables, including shadow variables and other variables that have not been explicitly declared. A variable set to a JavaScript object is submitted as the individual component values; see Submitting Complex JavaScript Objects on page 46.

fetchhint
    Specifies whether the interpreter can attempt to optimize dialog interpretation by prefetching the resource. See Prefetching Resources on page 40. Optional.

fetchtimeout
    Specifies the interval to wait for the resource to be returned before throwing an error.badfetch event. See Handling Fetching Delays on page 42. Optional.

fetchaudio
    Specifies the URI of an audio clip to play while a resource is being fetched. See Background Audio on page 42. Optional. The specified audio clip is played only if the bevocal.dtmf.flushbuffer property is set to true.

maxage
    Specifies the maximum acceptable age, in seconds, of the cached resource. See Maximum Age on page 44. Optional.

maxstale
    Specifies the maximum acceptable time, in seconds, during which an expired cached resource can still be used. See Maximum Stale Time on page 44. Optional.

If a <data> element names a variable that is already in scope, it does not declare a new variable with the same name, but simply assigns a value to the existing variable: the variable is assigned a reference to the DOM returned from the server.

(VoiceXML 1.0 only) The following attribute can be used in applications in which the version attribute of the <vxml> tag is set to 1.0.

caching
    VoiceXML 1.0 only. Specifies the caching policy for the resource being fetched. See Caching on page 45. Optional. Used in place of the VoiceXML 2.0 attributes maxage and maxstale.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

Parents
    <bevocal:foreach> <block> <catch> <error> <filled> <foreach> <form> <help> <if> <noinput> <nomatch> <vxml>

Children
    None.

See Also

None

Example

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <var name="quote"/>
      <var name="ticker" expr="'f'"/>
      <form id="get_quote">
        <block>
          <data name="quote" src="quote.xml"/>
          <assign name="document.quote" expr="quote.documentElement"/>
          <goto next="#play_quote"/>
        </block>
      </form>
      <form id="play_quote">
        <script>
          <![CDATA[
            // retrieve the value contained in the node t from the DOM exposed by d
            function GetData(d, t, nodata) {
              try {
                return d.getElementsByTagName(t).item(0).firstChild.data;
              } catch(e) {
                // the value could not be retrieved, so return this instead
                return nodata;
              }
            }
          ]]>
        </script>
        <block>
          <var name="change" expr="GetData(quote, 'change', 0)"/>
          <prompt>
            The stock for <value expr="GetData(quote, 'name', 'unknown')"/> is
          </prompt>
          <if cond="change == 0">
            <prompt> unchanged at </prompt>
          <else/>
            <if cond="change &gt; 0">
              <prompt> up by </prompt>
            <else/>
              <prompt> down by </prompt>
            </if>
            <prompt> <value expr="Math.abs(change)"/> to </prompt>
          </if>
          <prompt> <value expr="GetData(quote, 'last', 0)"/> </prompt>
        </block>
      </form>
    </vxml>

The data file quote.xml follows:

    <?xml version="1.0"?>
    <xml>
      <quote>
        <ticker>f</ticker>
        <name>Ford Motor Company</name>
        <change>1.0</change>
        <last>30.00</last>
      </quote>
    </xml>
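The example above fetches a static file and sends nothing to the server. The sketch below adds the method, namelist, and fetchtimeout attributes from the attribute table, so that the ticker variable travels with the request. It reuses the quote.xml name from the example as a stand-in for a server-side resource that would read the submitted ticker value. The 10s timeout, the form id, and the logged text are assumptions; the DOCTYPE is omitted and the namespace is the standard W3C VoiceXML namespace.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <var name="quote"/>
  <var name="ticker" expr="'f'"/>
  <form id="get_quote_for_ticker"> <!-- hypothetical form id -->
    <block>
      <!-- Submit ticker with a GET request and store the returned DOM
           in the existing variable quote. The fetch fails with
           error.badfetch if no response arrives within 10 seconds. -->
      <data name="quote" src="quote.xml"
            method="get" namelist="ticker" fetchtimeout="10s"/>

      <!-- nodeName is a standard DOM property of the root element. -->
      <log> Root element: <value expr="quote.documentElement.nodeName"/> </log>
    </block>
  </form>
</vxml>
```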

<disconnect>

Disconnects a telephone session.

Syntax

    <disconnect namelist="string" />

Forces the execution environment to disconnect the user's inbound telephone call. If an outbound call is active or on hold, that call is also disconnected.

This element throws a hang up event (connection.disconnect.hangup in VoiceXML 2.0; telephone.disconnect.hangup in VoiceXML 1.0). If an event handler catches the event, it can perform one last <submit> to notify the server that the call has ended. Because the call is no longer connected, any VoiceXML document returned from the server is ignored. The interpreter exits following execution of any event handler (or immediately if no handler catches the hang up event).

Attributes

namelist
    Allows the application to return data to the execution context. This attribute is currently ignored.

Usage

Parents
    <bevocal:foreach> <block> <catch> <error> <filled> <foreach> <help> <if> <noinput> <nomatch>

Children
    None.

See Also

VoiceXML 2.0 Specification: <disconnect>
Related tags: <bevocal:disconnect>, <transfer>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="foo">
        <error>
          <prompt> That code is bad! <value expr="code"/> </prompt>
          <disconnect/>
        </error>
        <field name="code" type="digits">
          <prompt> Say your passcode now. </prompt>
          <filled>
            <if cond="code &lt; 100">
              <throw event="error"/>
            <else/>
              <prompt> Good passcode! </prompt>
            </if>
          </filled>
        </field>
        <block>
          <prompt> That was the last form item. </prompt>
          <disconnect/>
        </block>
      </form>
    </vxml>
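The final <submit> that the <disconnect> description mentions can be sketched as a document-level handler for the VoiceXML 2.0 hang up event. The URL http://example.com/call-ended, the call_result variable, and the prompt are hypothetical; the DOCTYPE is omitted and the namespace is the standard W3C VoiceXML namespace.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <var name="call_result" expr="'completed'"/>

  <!-- Runs when <disconnect/> executes or when the caller hangs up. -->
  <catch event="connection.disconnect.hangup">
    <!-- Last chance to notify the server. Any document the server
         returns is ignored because the call is no longer connected. -->
    <submit next="http://example.com/call-ended"
            namelist="call_result" method="post"/>
  </catch>

  <form id="goodbye">
    <block>
      <prompt> Thank you for calling. Goodbye. </prompt>
      <disconnect/>
    </block>
  </form>
</vxml>
```

Placing the handler directly under <vxml> lets every dialog in the document inherit it.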

<div>

VoiceXML 1.0 only. Java Speech Markup Language element that classifies a region of text as a particular type.

Syntax

    <div type="sentence" | "paragraph" >
      Text
    </div>

Identifies enclosed text as a particular type for interpretive purposes. The contained text is spoken normally; <div> has no effect.

Note: In VoiceXML 2.0, this tag is replaced by the <paragraph> and <sentence> tags.

Usage

Parents
    <audio> <bevocal:whisper> <choice> <div> <emp> <enumerate> <prompt> <pros>

Children
    <audio> <break> <div> <emp> <enumerate> <pros> <sayas> <say-as> <value>

See Also

VoiceXML 1.0 Specification: <div>
Related tags: <break>, <emp>, <pros>, <sayas>

<dtmf>

VoiceXML 1.0 only. Specifies a touch-tone key grammar.

Syntax

    <dtmf
        scope="document" | "dialog"
        src="uri"
        expr="js_expression"
        type="mime_type"
        caching="safe" | "fast"
        fetchhint="prefetch" | "safe"
        fetchtimeout="time_interval"
        universal="string" >
      Optional Inline DTMF Grammar
    </dtmf>

Defines a grammar for telephone key press sequences.

Note: In VoiceXML 2.0, this tag is replaced by the <grammar> tag with the mode attribute set to dtmf.

Attributes

scope
    Sets the scope of the DTMF grammar.
        document    The grammar is active throughout the current document. If the document is the application root document, it is active throughout the application (application scope).
        dialog      The grammar is active throughout the current form.
    Note: A <dtmf> element can include a scope attribute only if its parent is a <form> element. Optional (default is dialog). The scope of any other <dtmf> element is determined by its parent:
        If the parent is an input item, the grammar has field scope.
        If the parent is a link, the scope is the element that contains the link.
        If the parent is a menu choice, the grammar scope is specified by the scope property of the containing <menu> element (or dialog scope by default).

src
    URI of the DTMF grammar specification, when it is contained in an external file. Optional (as an alternative to an inline DTMF grammar).

expr
    Extension. JavaScript expression that evaluates to grammar file URI. Optional (as alternative to src).

type
    MIME type of the DTMF grammar. Optional.
    For external grammars, the default type is taken from the Content-type header of the returned file. If not present, the type is inferred from the URL's extension or from the contents of the grammar (for example, a file beginning with <?xml maps to application/srgs+xml). The recognized extensions are:
        .grxml, .xml       XML Speech Grammar
        .gram              ABNF Speech Grammar
        .gsl, .grammar     Nuance GSL
        .ngo               Nuance Grammar Object
        .jsgf              Java Speech Grammar Format
    For internal grammars, if the grammar definition specifies the grammar type (either directly with a declaration or indirectly by containing XML elements), the interpreter uses that type. If the grammar definition doesn't indicate the type, the interpreter uses the value of the type attribute, if present. Otherwise, the interpreter assumes that the grammar is in GSL format.
    The currently supported types are:
        application/srgs+xml                    XML Speech Grammar
        application/grammar+xml                 XML Speech Grammar (Deprecated; support for this value will be removed from a future release.)
        application/srgs                        ABNF Speech Grammar
        application/grammar                     ABNF Speech Grammar (Deprecated; support for this value will be removed from a future release.)
        application/x-nuance-gsl                Nuance GSL
        application/x-gsl                       Nuance GSL (Deprecated; support for this value will be removed from a future release.)
        application/x-nuance-dynagram-binary    Nuance Grammar Object
    If you specify an unsupported type, an error is thrown. This value is used only if the web server returns an unsupported grammar type.

caching
    Specifies the caching policy for the resource being fetched. See Caching on page 45. Optional.

fetchhint
    Specifies whether the interpreter can attempt to optimize dialog interpretation by prefetching the resource. See Prefetching Resources on page 40. Optional.

fetchtimeout
    Specifies the interval to wait for the resource to be returned before throwing an error.badfetch event. See Handling Fetching Delays on page 42. Optional.

universal
    Extension. Makes this grammar a universal grammar with the specified name so that it can be activated and deactivated using the universals property. This attribute does not affect the scope of the grammar; it simply assigns it to a universal category.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

Parents
    <bevocal:listen> <field> <form> <link> <transfer>

Children
    None.

See Also

VoiceXML 1.0 Specification: <dtmf>
Related tag: <grammar>
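Because <dtmf> is available only in VoiceXML 1.0, a VoiceXML 2.0 document expresses the same idea with <grammar mode="dtmf">, as the note above says. The following is a minimal sketch using an inline XML Speech Grammar. The form id, field name, rule id, and the three accepted keys are illustrative; the DOCTYPE is omitted and the namespace is the standard W3C VoiceXML namespace.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="key_entry"> <!-- hypothetical form -->
    <field name="key">
      <!-- mode="dtmf" makes this a touch-tone grammar; the grammar is
           inside a field, so it is active only while the field is. -->
      <grammar mode="dtmf" version="1.0" root="onekey">
        <rule id="onekey" scope="public">
          <one-of>
            <item>1</item>
            <item>2</item>
            <item>3</item>
          </one-of>
        </rule>
      </grammar>
      <prompt> Press 1, 2, or 3. </prompt>
      <filled>
        <prompt> You pressed <value expr="key"/>. </prompt>
      </filled>
    </field>
  </form>
</vxml>
```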

<else>

Marks the beginning of an else clause within an <if> element.

Syntax

    <else/>

Empty tag that marks an else clause within an <if> element.

Usage

Parents
    <if>

Children
    None.

See Also

VoiceXML 2.0 Specification: <else>
Related tags: <if>, <elseif>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="foo">
        <field name="color">
          <grammar type="application/x-nuance-gsl">
            [black white green blue purple yellow red]
          </grammar>
          <prompt>
            If you say your favorite color, I shall tell you its
            hexa decimal color code. What <emphasis>color? </emphasis>
          </prompt>
          <filled>
            <var name="color_code"/>
            <if cond="color == 'black'">
              <assign name="color_code" expr="'000000'"/>
            <elseif cond="color == 'white'"/>
              <assign name="color_code" expr="'FFFFFF'"/>
            <elseif cond="color == 'green'"/>
              <assign name="color_code" expr="'00FF00'"/>
            <elseif cond="color == 'blue'"/>
              <assign name="color_code" expr="'0000FF'"/>
            <elseif cond="color == 'purple'"/>
              <assign name="color_code" expr="'7D26CD'"/>
            <elseif cond="color == 'yellow'"/>
              <assign name="color_code" expr="'8B8B00'"/>
            <elseif cond="color == 'red'"/>
              <assign name="color_code" expr="'CD0000'"/>
            <else/>
              <assign name="color_code" expr="'?'"/>
            </if>
            <prompt>
              The code for <value expr="color"/> is <value expr="color_code"/>
            </prompt>
            <clear namelist="color color_code"/>
          </filled>
        </field>
      </form>
    </vxml>

<elseif>

Marks the beginning of an else-if clause within an <if> element.

Syntax

    <elseif cond="js_expression" />

Attributes

cond
    JavaScript boolean expression that must evaluate to true for the clause to execute.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

Parents
    <if>

Children
    None.

See Also

VoiceXML 2.0 Specification: <elseif>
Related tags: <if>, <else>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="foo">
        <field name="color">
          <grammar type="application/x-nuance-gsl">
            [black white green blue purple yellow red]
          </grammar>
          <prompt>
            If you say your favorite color, I shall tell you its
            hexa decimal color code. What <emphasis>color? </emphasis>
          </prompt>
          <filled>
            <var name="color_code"/>
            <if cond="color == 'black'">
              <assign name="color_code" expr="'000000'"/>
            <elseif cond="color == 'white'"/>
              <assign name="color_code" expr="'FFFFFF'"/>
            <elseif cond="color == 'green'"/>
              <assign name="color_code" expr="'00FF00'"/>
            <elseif cond="color == 'blue'"/>
              <assign name="color_code" expr="'0000FF'"/>
            <elseif cond="color == 'purple'"/>
              <assign name="color_code" expr="'7D26CD'"/>
            <elseif cond="color == 'yellow'"/>
              <assign name="color_code" expr="'8B8B00'"/>
            <elseif cond="color == 'red'"/>
              <assign name="color_code" expr="'CD0000'"/>
            <else/>
              <assign name="color_code" expr="'?'"/>
            </if>
            <prompt>
              The code for <value expr="color"/> is <value expr="color_code"/>
            </prompt>
            <clear namelist="color color_code"/>
          </filled>
        </field>
      </form>
    </vxml>
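The color example compares strings only, so it never needs the escape sequences that the tip for cond describes. The sketch below shows numeric conditions in which <, >, and & are written as &lt;, &gt;, and &amp; inside the attribute values. The form id, field name, age bands, and prompts are illustrative; the DOCTYPE is omitted and the namespace is the standard W3C VoiceXML namespace.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="fare_band"> <!-- hypothetical form -->
    <field name="age" type="number">
      <prompt> How old is the passenger? </prompt>
      <filled>
        <!-- JavaScript: age < 13 -->
        <if cond="age &lt; 13">
          <prompt> Child fare applies. </prompt>
        <!-- JavaScript: age >= 13 && age < 65 -->
        <elseif cond="age &gt;= 13 &amp;&amp; age &lt; 65"/>
          <prompt> Adult fare applies. </prompt>
        <else/>
          <prompt> Senior fare applies. </prompt>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```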

<emp>

VoiceXML 1.0 only. Java Speech Markup Language element that changes the emphasis of speech output.

Syntax

    <emp level="strong" | "moderate" | "none" | "reduced" >
      Text
    </emp>

Any attribute is ignored; the contained text is spoken normally.

Note: In VoiceXML 2.0, this tag is replaced by the <emphasis> tag.

Attributes

level
    Level of emphasis to use when speaking the enclosed text. Optional (default is moderate). Possible values are:
        strong
        moderate
        none
        reduced

Usage

Parents
    <audio> <bevocal:whisper> <choice> <div> <emp> <enumerate> <prompt> <pros>

Children
    <audio> <enumerate> <value> <break> <div> <emp> <pros> <sayas> <say-as>

See Also

VoiceXML 1.0 Specification: <emp>
Related tags: <break>, <div>, <pros>, <sayas>

<emphasis>

New in VoiceXML 2.0. Speech Synthesis Markup Language element that changes the emphasis of speech output.

Syntax

    <emphasis level="strong" | "moderate" | "none" | "reduced" >
      Text
    </emphasis>

Any attribute is ignored; the contained text is spoken normally.

Attributes

level
    Level of emphasis to use when speaking the enclosed text. Optional (default is moderate). Possible values are:
        strong
        moderate
        none
        reduced

Usage

Parents
    <audio> <bevocal:whisper> <choice> <emphasis> <enumerate> <p> <paragraph> <prompt> <prosody> <s> <sentence> <voice>

Children
    <audio> <break> <emphasis> <enumerate> <mark> <phoneme> <prosody> <say-as> <value> <voice>

See Also

VoiceXML 2.0 Specification: <emphasis>
Related tags: <break>, <mark>, <paragraph>, <phoneme>, <prosody>, <say-as>, <sentence>, <voice>

<enumerate>

Generates audio output that enumerates the options in a field or the choices in a menu.

Syntax

    <enumerate>
      Optional Template Content
    </enumerate>

Automatically generates a description of acceptable input based on the template you provide.

An <enumerate> element may be used in prompts and event handlers within <menu> elements and within <field> elements that contain <option> elements; an error.semantic event is thrown if it is used elsewhere.

When this tag is in a <menu> element or a prompt or event handler that is executed while a menu is active, it enumerates all <choice> elements within the active menu.

When this tag is in a <field> element or a prompt or event handler that is executed while a field is active, it enumerates all <option> elements within the active field.

Two special variables are available for use in template content for auto-generated text.

    Variable    Meaning
    _prompt     Prompt of current choice.
    _dtmf       DTMF sequence assigned to current choice.

If this tag has no content, the generated text simply lists the prompts from the option or choice elements in the field or menu.

Usage

Parents
    <audio> <bevocal:whisper> <block> <catch> <choice> <emphasis> <enumerate> <error> <field> <filled> <help> <if> <initial> <menu> <noinput> <nomatch> <p> <paragraph> <prompt> <prosody> <record> <s> <sentence> <subdialog> <transfer> <voice>

Children
    <audio> <break> <emphasis> <enumerate> <mark> <p> <paragraph> <phoneme> <prosody> <s> <say-as> <sentence> <value> <voice>

See Also

VoiceXML 2.0 Specification: <enumerate>
Related tags: <menu>, <field>

Examples

If you run this example, the menu's prompt will be:

    Welcome to BeVocal. For the movie finder service, say movie finder. For the horoscopes service, say horoscopes. For the news desk service, say news desk.

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <menu>
        <prompt>
          Welcome to BeVocal.
          <enumerate>
            For the <value expr="_prompt"/> service, say <value expr="_prompt"/>
          </enumerate>
        </prompt>
        <choice accept="approximate" next="movies.vxml">movie finder</choice>
        <choice next="horoscopes.vxml">horoscopes</choice>
        <choice accept="approximate" next="news.vxml">news desk</choice>
      </menu>
    </vxml>
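The example above uses only the _prompt variable. As a sketch of the second special variable, the following menu gives each choice a DTMF sequence and reads it back through _dtmf in the template. The key assignments and prompt wording are assumptions; the target documents reuse names from the example above. The DOCTYPE is omitted and the namespace is the standard W3C VoiceXML namespace.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <menu>
    <prompt>
      Main menu.
      <!-- The template is rendered once per <choice>: _prompt is the
           choice text and _dtmf is the choice's dtmf attribute. -->
      <enumerate>
        For <value expr="_prompt"/>, press <value expr="_dtmf"/>.
      </enumerate>
    </prompt>
    <choice dtmf="1" next="movies.vxml">movie finder</choice>
    <choice dtmf="2" next="horoscopes.vxml">horoscopes</choice>
    <choice dtmf="3" next="news.vxml">news desk</choice>
  </menu>
</vxml>
```

Generating the list from the choices keeps the spoken instructions in step with the menu when choices are added or renumbered.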

198 TAGS <error> Syntax Catches an error event. <error count="integer" cond="js_expression" > Executable Content </error> Shorthand for <catch event="error">. Catches error events of all kinds. Attribute count cond Minimum number of times an error must have occurred during a form or menu invocation. Optional (default is 1). JavaScript expression that must also evaluate to true for an event to be caught. Optional (default is true). If multiple error handlers are defined in, or inherited by, the element in which the error occurs, one handler is chosen based on event count, scope, and document order. See Chapter 3, Event Handling. Tips: Within an event handler, the _event variable contains the name of the event currently being handled; the _message variable contains the message string that provides additional information about the event. If no message was supplied when the event was thrown, the _message variable is undefined. If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence <, >, or &. For more information, see JavaScript Quick Reference. 186 VOICEXML PROGRAMMER S GUIDE

Usage

Parents: <bevocal:listen> <bevocal:register> <bevocal:verify> <field> <form> <initial> <menu> <record> <subdialog> <transfer> <vxml>

Children: <audio> <assign> <bevocal:connect> <bevocal:dial> <bevocal:disconnect> <bevocal:foreach> <bevocal:hold> <bevocal:whisper> <clear> <data> <disconnect> <enumerate> <exit> <goto> <if> <log> <prompt> <reprompt> <rethrow> <return> <script> <send> <subdialog> <submit> <throw> <var> <value>

See Also

VoiceXML 2.0 Specification: <error>
Related variables: _event, _message
Related tags: <catch>, <help>, <noinput>, <nomatch>, <rethrow>, <throw>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="foo">
        <error>
          <prompt> That code is bad! <value expr="code"/></prompt>
          <disconnect/>
        </error>
        <field name="code" type="digits">
          <prompt> Say your passcode now. </prompt>
          <filled>
            <if cond="code &lt; 100">
              <throw event="error"/>
            <else/>
              <prompt>Good passcode!</prompt>
            </if>
          </filled>
        </field>
        <block>
          <prompt>That was the last form item.</prompt>
          <disconnect/>
        </block>
      </form>
    </vxml>

<example>

New in VoiceXML 2.0. XML grammar element with an example phrase that matches the containing grammar rule.

Syntax

    <example>
        Example Input
    </example>

This tag is used to define a grammar in the XML form of the W3C Speech Recognition Grammar Format. An <example> element encloses a sequence of tokens corresponding to user input that matches the containing rule. It is illustrative, for the benefit of a developer reading the grammar; the speech-recognition engine ignores the element.

Usage

Parents: <rule>
Children: None.

See Also

Speech Recognition Grammar Specification: <example>
Chapter 4, XML Speech Grammar Format, in the Grammar Reference.

Examples

    <rule id="snapper" scope="public">
      <example>red snapper</example>
      <example>mutton snapper</example>
      <ruleref uri="#snappertype"/>
      <token>snapper</token>
    </rule>

<exit>

Exits a session.

Syntax

    <exit
        expr        Not implemented
        namelist    Not implemented
    />

Unloads all documents and returns control to the interpreter's execution environment.

Attribute   Description
expr        Not implemented. JavaScript expression that evaluates to the value to return to the execution environment. Optional (as alternative to namelist).
namelist    Not implemented. Space-separated list of variable names to return to the interpreter execution environment. Optional (default is to return nothing).

The expr and namelist attributes are not meaningful because the BeVocal interpreter execution context does not accept return values from the execution of a VoiceXML document.

Usage

Parents: <bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>
Children: None.

See Also

VoiceXML 2.0 Specification: <exit>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="foo">
        <catch event="connection">
          <log expr="'user said disconnect'"/>
        </catch>
        <field name="choose">
          <grammar type="application/x-nuance-gsl">
            [ exit disconnect ]
          </grammar>
          <prompt>Please say exit</prompt>
        </field>
        <block>
          <if cond="choose=='exit'">
            <exit/>
            <prompt> you should NOT hear this prompt! </prompt>
          </if>
          <disconnect/>
        </block>
      </form>
    </vxml>

<field>

Declares an input field in a form.

Syntax

    <field
        name="string"
        expr="js_expression"
        cond="js_expression"
        type="boolean" | "currency" | "date" | "digits" | "number" | "phone" | "time"
        slot="string"
        modal="true" | "false"
        bevocal:urlexpr="js_expression" >
        Child Elements
    </field>

Input item that prompts user for a value that matches a particular grammar.

Attribute   Description

name
    Name of the input variable that will hold the recognition result. The variable name may not be a JavaScript reserved keyword. The input variable has dialog (form) scope; its name must be unique among all VoiceXML and JavaScript variables within the form's scope.

expr
    JavaScript expression that assigns the initial value of the input variable for this field. Optional (default is undefined). If you set the input variable to a value other than undefined, you'll need to clear it before the field can execute.

cond
    JavaScript boolean expression that also must evaluate to true for the field to execute. Optional (default is true). If not specified, the value of the input variable alone determines whether or not the field can execute.

type
    Specifies a built-in grammar. Optional (as alternative to <grammar> element).
    boolean     Grammar for recognizing positive and negative responses. Returns true for yes and false for no.
    currency    Grammar for recognizing amounts of money, in dollars (Not implemented: International currency designation). Returns a string with format mm.nn.
    date        Grammar for recognizing dates. Returns a string with format yyyymmdd; ???? is used for an unknown year and ?? is used for an unknown month or day.
    digits      Limited grammar for recognizing a sequence of digits. Returns a string of digits.
    number      More general grammar for recognizing numbers. Returns a string that could include digits, a decimal point, or a positive or negative sign.
    phone       Grammar for recognizing a telephone number adhering to the North American Dialing Plan (with no extension). Returns a sequence of digits.
    time        Grammar for recognizing a time. Returns a string with format hhmmx, where x is one of: a for AM, p for PM, h for 24-hour notation, or ? for an ambiguous time (could be AM or PM).

slot
    If this field is part of a mixed-initiative dialog, the name of the grammar slot that will be used to assign a value to the input variable for this field. Optional (defaults to variable name). This attribute is ignored by a field-level grammar. See the Grammar Reference for more information on grammar slots.

modal
    Boolean value that must be true to temporarily turn off higher-level grammars. Optional (default is false). Lets you alter default behavior so that only this field's grammars are active while the field executes.

bevocal:urlexpr
    JavaScript expression that evaluates to the URI of an audio file or a local audio variable. The local audio variable can be from a <record> tag or can be a shadow variable that contains audio. Optional. Lets you perform recognition against input from the specified audio file or from a local audio variable.

Properties of the Shadow Variable. Corresponding to the input variable name is a shadow variable called name$. After the input variable is filled, some additional information is available in the following properties of this shadow variable:

Property        Description
confidence      The recognition confidence level (with 0.0 representing the lowest confidence and 1.0 representing the highest).
utterance       A string representation of the words actually spoken by the user.
inputmode       The mode in which input was provided, one of voice or dtmf.
audio           If the bevocal.audio.capture or recordutterance property is set to true and the user's speech matched the field grammar, the audio property contains an audio capture of the user's speech. You can send the captured audio to a server using a <data> element. Doing so is useful if you need a record of the user's speech for legal reasons.
recording       If the recordutterance property is set to true and the user's speech matched the field grammar, the recording property contains an audio capture of the user's speech. If no audio is collected, this variable is undefined. You can send the captured audio to a server using a <data> element. Doing so is useful if you need a record of the user's speech for legal reasons.
recordingsize   The size of the recording in bytes, or undefined if no audio is collected.
recordduration  The duration of the recording in milliseconds, or undefined if no audio is collected.

For a field whose name is name, you access the property propname of the shadow variable with the syntax:

    name$.propname

For example, you access the confidence property for the color field as:

    color$.confidence

Built-in Type Parameters. Some built-in field types can be parameterized to affect the built-in type's behavior. The following table shows the types that can be parameterized and lists the input parameters each one accepts.

Field Type  Input Parameters
boolean     y           The DTMF keypress for an affirmative response.
            n           The DTMF keypress for a negative answer.
digits      minlength   The minimum number of digits in a valid response.
            maxlength   The maximum number of digits in a valid response.
            length      The exact number of digits in a valid response.

Note: If you do not specify length or maxlength, the built-in grammar accepts any number of digits. You should use length or maxlength whenever you can; doing so usually results in more accurate speech recognition.

A parameter is specified within the type attribute with syntax of the form:

    typename?parameter=value

For example, the following element specifies a field of type digits that must contain exactly five digits:

    <field name="mydigits" type="digits?length=5">

More than one parameter may be specified; the parameters are separated by semicolons.

Note: The interpreter throws an error.badfetch event if it retrieves a VoiceXML file that contains a field type with inconsistent parameters, such as:

    <field... type="digits?minlength=5;maxlength=1">

Built-in Type Output. In VoiceXML 2.0, the field's type attribute no longer defines an implicit <say-as> type to output the field's value. Instead, the interpreter plays the value in normal TTS. You must now use the type attribute of <say-as> for a type-specific readout of the value.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

Parents: <form>

Children: <audio> <catch> <enumerate> <error> <filled> <grammar> <help> <link> <noinput> <nomatch> <option> <prompt> <property> <value>

Exceptions

error.semantic - Thrown if no grammars are active.

See Also

VoiceXML 2.0 Specification: <field>
Grammar Reference
Related tags: <filled>, <grammar>
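The type parameters and shadow-variable properties described for <field> can be combined as in the following sketch. It is illustrative only: the form and field names, the 0.5 confidence threshold, and the choice of keys 1 and 2 are assumptions, and the BeVocal DOCTYPE is omitted in favor of the standard W3C VoiceXML 2.0 namespace.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="confirm_pin">
    <!-- Two digits parameters, separated by a semicolon. -->
    <field name="pin" type="digits?minlength=4;maxlength=6">
      <prompt>Please say or enter your PIN.</prompt>
      <filled>
        <!-- pin$ is the shadow variable; 0.5 is an arbitrary threshold. -->
        <if cond="pin$.confidence &lt; 0.5">
          <prompt>I am not sure I heard that correctly.</prompt>
          <clear namelist="pin"/>
        </if>
      </filled>
    </field>
    <!-- y and n set the DTMF keys for yes and no. -->
    <field name="ok" type="boolean?y=1;n=2">
      <prompt>
        You entered
        <say-as type="number:digits"><value expr="pin"/></say-as>.
        Is that right? Press 1 for yes or 2 for no.
      </prompt>
      <filled>
        <!-- The boolean type returns true or false. -->
        <if cond="!ok">
          <clear namelist="pin ok"/>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```

Clearing pin sends the form back to the first field, since it is then the first unfilled item in document order.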

Examples

Example 1: no type

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="foo">
        <field name="name">
          <prompt> What is your favorite color? </prompt>
          <grammar type="application/x-nuance-gsl">
            [ red green yellow blue orange ]
          </grammar>
          <filled>
            <prompt> Your favorite color is <value expr="name"/> </prompt>
          </filled>
        </field>
      </form>
    </vxml>

Example 2: boolean type

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="form1">
        <field name="new_file" type="boolean">
          <prompt> Do you want to play the game again? </prompt>
          <filled>
            <prompt> Your answer is <value expr="new_file"/> </prompt>
          </filled>
        </field>
      </form>
    </vxml>

Example 3: date type

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="form1">
        <field name="today" type="date">
          <prompt> What date is today? Please say or enter month day and year. </prompt>
          <filled>
            <prompt>
              Your answer is
              <say-as type="date"><value expr="today"/></say-as>
            </prompt>
          </filled>
        </field>
      </form>
    </vxml>

Example 4: digits type

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="form1">
        <field name="card_num" type="digits">
          <prompt> Please say or enter the last four digits of your credit card. </prompt>
          <filled>
            <if cond="card_num.length != 4">
              <prompt> Sorry, I didn't hear exactly four digits. </prompt>
              <clear/>
              <reprompt/>
            <else/>
              <prompt>
                The number you entered is
                <say-as type="number:digits"> <value expr="card_num"/> </say-as>
              </prompt>
            </if>
          </filled>
        </field>
      </form>
    </vxml>

Example 5: currency type

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="form1">
        <field name="ticket_cost" type="currency">
          <prompt> Please say the cost of your ticket. </prompt>
          <filled>
            <prompt>
              The cost you entered is
              <say-as type="currency"><value expr="ticket_cost"/></say-as>
            </prompt>
          </filled>
        </field>
      </form>
    </vxml>

Example 6: number type

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="form1">
        <field name="num_count" type="number">
          <prompt> Please say the number of computers you want to order. </prompt>
          <filled>
            <prompt>The number you entered is <value expr="num_count"/></prompt>
          </filled>
        </field>
      </form>
    </vxml>

Example 7: phone type

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="form1">
        <field name="phone_num" type="phone">
          <prompt> Please say or enter your phone number. </prompt>
          <filled>
            <prompt>
              The phone number you entered is
              <say-as type="telephone"> <value expr="phone_num"/></say-as>
            </prompt>
          </filled>
        </field>
      </form>
    </vxml>

Example 8: time type

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="form1">
        <field name="meeting_time" type="time">
          <prompt> Please say the meeting time. </prompt>
          <filled>
            <prompt>
              The meeting time is
              <say-as type="time"> <value expr="meeting_time"/> </say-as>
            </prompt>
          </filled>
        </field>
      </form>
    </vxml>

<filled>

Contains actions to be executed when fields are filled.

Syntax

    <filled
        mode="any" | "all"
        namelist="variable1..." >
        Child Elements
    </filled>

A <filled> element can be either the child of an input item or the child of a form. When used as the child of an input item, the <filled> element has no attributes; its action is taken when the last user input fills the input variable of the containing element. When used as the child of a form, the <filled> element may have attributes that specify when its action is taken.

Attribute   Description
mode        A value of any causes execution of this element when the last user input fills any one of the specified input variables. A value of all causes execution of this element when all specified input variables are filled; the last user input must have filled at least one of the fields. Optional (default value is all).
namelist    Space-separated list of names of the input variables whose filling can trigger this element. Note that names of control items are not permitted in this list. Optional (default is all input variables in the form).

It is an error to specify attributes in a <filled> element within an input item.
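A form-level <filled> that uses both attributes might look like the sketch below. The form, its field names, and the comparison are hypothetical; the comparison assumes both dates come back as complete yyyymmdd strings, so that string comparison matches date order, and would need extra checks if the year or day were returned as unknown. The BeVocal DOCTYPE is omitted and the standard W3C VoiceXML 2.0 namespace is used.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="trip">
    <field name="depart" type="date">
      <prompt>What day are you leaving?</prompt>
    </field>
    <field name="back" type="date">
      <prompt>What day are you returning?</prompt>
    </field>
    <!-- Child of the form, so mode and namelist are allowed.
         Runs once both listed input variables are filled. -->
    <filled mode="all" namelist="depart back">
      <!-- Assumes complete yyyymmdd strings with no ? placeholders. -->
      <if cond="back &lt; depart">
        <prompt>The return date is before the departure date.</prompt>
        <clear namelist="depart back"/>
      </if>
    </filled>
  </form>
</vxml>
```

With mode="any" instead, the same element would run as soon as either field was filled, which suits per-field validation better than cross-field checks.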

Usage

Parents: <bevocal:listen> <bevocal:register> <bevocal:verify> <field> <form> <record> <subdialog> <transfer>

Children: <audio> <assign> <bevocal:connect> <bevocal:dial> <bevocal:disconnect> <bevocal:foreach> <bevocal:hold> <bevocal:whisper> <clear> <data> <disconnect> <enumerate> <exit> <goto> <if> <log> <prompt> <reprompt> <rethrow> <return> <script> <send> <subdialog> <submit> <throw> <var> <value>

See Also

VoiceXML 2.0 Specification: <filled>
Related tag: <field>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="foo">
        <field name="name">
          <prompt> What is your favorite color? </prompt>
          <grammar type="application/x-nuance-gsl">
            [ red green yellow blue orange ]
          </grammar>
          <filled>
            <prompt> Your favorite color is <value expr="name"/> </prompt>
          </filled>
        </field>
      </form>
    </vxml>

<foreach>

Iterates over the elements of an array.

Syntax

    <foreach
        item="string"
        array="string" >
        Executable Content
    </foreach>

The interpreter sets the iteration variable to each element of the array, in turn, and executes the contained elements once for each setting of the iteration variable.

Attribute   Description
item        Name of the iteration variable, which may not be a JavaScript reserved keyword. The scope of the iteration variable is the <foreach> element.
array       The name of a variable whose value is an array.

Usage

Parents: <block> <catch> <error> <filled> <foreach> <help> <if> <noinput> <nomatch>

Children: <assign> <audio> <bevocal:connect> <bevocal:dial> <bevocal:disconnect> <bevocal:hold> <bevocal:whisper> <clear> <data> <disconnect> <exit> <foreach> <goto> <if> <log> <prompt> <reprompt> <rethrow> <return> <script> <send> <subdialog> <submit> <throw> <value> <var>

See Also

None

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="form1">
        <block>
          <var name="servicenames" expr="new Array('news', 'weather', 'traffic')"/>
          <prompt>Here's a list of services </prompt>
          <foreach item="service" array="servicenames">
            <prompt><value expr="service"/> <break size="small"/></prompt>
          </foreach>
        </block>
      </form>
    </vxml>

<form>

Presents information and collects data.

Syntax

    <form
        id="string"
        scope="document" | "dialog" >
        Child Elements
    </form>

A dialog for collecting user input.

Note that when you reach the end of a form, execution does not proceed to the next form on the page. If you do not explicitly transition to another dialog using <goto> or <submit>, then the application terminates when the form completes.

When execution leaves a form, all variables whose scope is that form go away. If you want to retain the value of a form-level variable after leaving the form, you must copy the value into a variable with document or application scope.

Attribute   Description
id          The name of the form.
scope       Sets the default scope of the form's grammars. Optional (default is dialog).
            document    The form's grammars are active throughout the current document. If the document is the application root document, then they are active throughout the application (application scope).
            dialog      By default, the form's grammars have dialog scope, which means they will be active only in this form.
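The two points above (an explicit transition is needed to reach another form, and form-level values must be copied out to survive) can be sketched as follows. The variable name saved_color and the form ids are hypothetical, the GSL grammar follows the style of the guide's own examples, and the BeVocal DOCTYPE is omitted in favor of the standard W3C VoiceXML 2.0 namespace.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Declared outside any form, so it has document scope. -->
  <var name="saved_color"/>

  <form id="ask">
    <field name="color">
      <prompt>What is your favorite color?</prompt>
      <grammar type="application/x-nuance-gsl">
        [ red green blue ]
      </grammar>
      <filled>
        <!-- Copy the form-scoped value before leaving the form. -->
        <assign name="saved_color" expr="color"/>
        <!-- Without this transition the application would end here. -->
        <goto next="#confirm"/>
      </filled>
    </field>
  </form>

  <form id="confirm">
    <block>
      <!-- color no longer exists here; saved_color does. -->
      <prompt>You chose <value expr="saved_color"/>.</prompt>
    </block>
  </form>
</vxml>
```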

Usage

Parents: <vxml>

Children: <bevocal:listen> <bevocal:register> <bevocal:verify> <block> <catch> <data> <error> <field> <filled> <grammar> <help> <initial> <link> <noinput> <nomatch> <property> <record> <script> <subdialog> <transfer> <var>

Exceptions

error.semantic - Thrown if no grammars are active when input is expected in a form.

See Also

VoiceXML 2.0 Specification: <form>
Related tags: <field>, <block>, <record>, <transfer>, <subdialog>, <initial>, <grammar>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="welcome">
        <block>You are in Form One. Welcome back home! </block>
        <field name="hello">
          <grammar type="application/x-nuance-gsl">
            [next dtmf-1 start]
          </grammar>
          <prompt>
            Say next or press one to go to Form 2 or start to start over again.
          </prompt>
          <filled>
            <if cond="hello=='next' || hello=='dtmf-1'">
              <goto next="#comeagain"/>
            <else/>
              <goto next="#welcome"/>
            </if>
          </filled>
        </field>
      </form>
      <form id="comeagain">
        <block>You are now in Form 2.</block>
        <field name="goback" type="boolean">
          <grammar type="application/x-nuance-gsl">
            [back dtmf-2 continue]
          </grammar>
          <prompt>
            Say back or press two to go to Form 1 or say continue to enter this form again.
          </prompt>
          <filled>
            <if cond="goback=='back' || goback=='dtmf-2'">
              <prompt>Thanks for stopping by Form 2. Please come again.</prompt>
              <goto next="#welcome"/>
            <else/>
              <goto next="#comeagain"/>
            </if>
          </filled>
        </field>
      </form>
    </vxml>

<goto>

Goes to another location in the same or different document.

Syntax

    <goto
        next="uri"
        expr="js_expression"
        expritem="js_expression"
        nextitem="uri"
        fetchhint="prefetch" | "safe"
        fetchtimeout="time_interval"
        fetchaudio="uri"
        maxage="time_interval"
        maxstale="time_interval"
    />

Transitions to another item in the same form, to another dialog in the same document, or to a different document.

Except when the transition is to another item in the same form, the transition will cause all values stored in the dialog's variables to be lost. This is true even if you transition into the same dialog as you were in before.

The VoiceXML interpreter clears its prompt queue when transitioning to another form item. You will not be able to barge in during prompts played in the execution of a <goto> tag.

Attribute       Description
next            The URI to go to next. Optional (as alternative to expr, expritem, nextitem).
expr            JavaScript expression that evaluates to the URI to go to next. Optional (as alternative to next, expritem, nextitem).
expritem        JavaScript expression that evaluates to the name of the item in the current form to visit next. Optional (as alternative to next, expr, nextitem).
nextitem        Name of the item in the current form to visit next. Optional (as alternative to next, expr, expritem).
fetchhint       Specifies whether the interpreter can attempt to optimize dialog interpretation by prefetching the resource. See Prefetching Resources on page 40. Optional.
fetchtimeout    Specifies the interval to wait for the resource to be returned before throwing an error.badfetch event. See Handling Fetching Delays on page 42. Optional.
fetchaudio      Specifies the URI of background audio to be played during fetching. Note that this attribute and related properties affect whether queued prompts are played first. See Background Audio on page 42 for important details. Optional.
maxage          New in VoiceXML 2.0. Specifies the maximum acceptable age, in seconds, of the cached resource. See Maximum Age on page 44. Optional.
maxstale        New in VoiceXML 2.0. Specifies the maximum acceptable time, in seconds, during which an expired cached resource can still be used. See Maximum Stale Time on page 44. Optional.

Exactly one of the attributes next, expr, nextitem, and expritem must be specified.
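A transition within a single form, which keeps the form's variables, might be sketched as follows with nextitem. The form id, the item names, and the prompts are hypothetical; the BeVocal DOCTYPE is omitted and the standard W3C VoiceXML 2.0 namespace is used.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="order">
    <field name="has_code" type="boolean">
      <prompt>Do you have a discount code?</prompt>
      <filled>
        <if cond="!has_code">
          <!-- Same-form transition: variables such as has_code are kept. -->
          <goto nextitem="done"/>
        </if>
      </filled>
    </field>
    <!-- Skipped when the caller has no code. -->
    <field name="code" type="digits">
      <prompt>Please say or enter your discount code.</prompt>
    </field>
    <block name="done">
      <prompt>Thank you. Your order is complete.</prompt>
    </block>
  </form>
</vxml>
```

Using next="#order" here instead would re-enter the form from the start and discard has_code, as described above for transitions between dialogs.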

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

Parents: <bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>
Children: None.

See Also

VoiceXML 2.0 Specification: <goto>
Related tag: <submit>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <var name="something" expr="500"/>
      <form id="form1">
        <block>
          <var name="something" expr="6000"/>
          <prompt>
            Something is <value expr="something"/>
            Going to the second dialog.
          </prompt>
          <goto next="#form2"/>
          <prompt>Goto failed</prompt>
        </block>
      </form>
      <form id="form2">
        <block>
          <prompt>
            You are now in the second dialog.
            Something is now <value expr="something"/>
            Thank you.
          </prompt>
        </block>
      </form>
    </vxml>

<grammar>

Specifies a speech-recognition grammar.

Syntax

    <grammar
        src="uri"                           Only to point to an external
        srcexpr="js_expression"             or built-in grammar
        expr="js_expression"

        mode="voice" | "dtmf"               Only for an XML grammar;
        root="string"                       allowed inline and in
        tag-format="uri"                    grammar file
        version="version_number"
        xml:base="uri"
        xml:lang="lang"

        xmlns="uri"                         Only in an external XML
        xmlns:xsi="uri"                     grammar file
        xsi:schemaLocation="uri"

        scope="document" | "dialog"         Allowed for all grammars,
        type="mime_type"                    except references to
        universal="string"                  built-in grammars

        fetchhint="prefetch" | "safe"       Allowed for external
        fetchtimeout="time_interval"        grammars
        maxage="time_interval"
        maxstale="time_interval"

        weight="n" >                        Not implemented
        Optional Inline Grammar
    </grammar>

Specifies a grammar to be used within some VoiceXML tag such as <field>, <form>, or <link>. When the speech-recognition engine detects a match with a grammar, it may cause a transition to another dialog (if the grammar is in a menu item or link) or assign a return value to an input variable (if the grammar is in a form or input item).

The <grammar> element serves two primary purposes:

- To define the grammar or point to a predefined grammar. The <grammar> element can either contain the entire grammar definition directly, point to an external grammar file in one of several formats, or point to a built-in grammar. If the grammar definition is inline and is in the XML format, then the <grammar> element allows extra attributes to support the grammar definition; these attributes are ignored for an inline grammar in any other format and they are ignored for all external grammars.

- To specify aspects of how to use the grammar in the particular containing VoiceXML element. For this purpose, the <grammar> element supports a set of attributes you can use with any application grammar. You can use these attributes for any grammar format and for both external and internal grammars. You cannot use these attributes for built-in grammars.

Attribute   Description

src
    URI of the grammar specification, when it is contained in an external file. Optional (as an alternative to an inline grammar or expr). If you specify this attribute, the element cannot have content. You specify a grammar with a URI of one of the following forms:
    - Grammar in any grammar file, specifying a rule to start recognition from: GrammarFileURI#RuleName
    - Grammar in any grammar file, using the grammar's root rule: GrammarFileURI
    - Grammar in a GSL compiled grammar file (the file identifies its root rule): compiled:grammar/key
    - Built-in grammar: builtin:grammar/type
    - Built-in grammar with parameters: builtin:grammar/type?parameters
    An error.badfetch event is thrown if the specified file cannot be found or if an inline grammar is also specified.

srcexpr
    JavaScript expression that evaluates to grammar file URI. Optional (as alternative to src or an inline grammar). If you specify this attribute, the element cannot have content. You can specify either srcexpr or expr, but not both. If you specify both, a parse error is thrown.

expr
    Extension. JavaScript expression that evaluates to grammar file URI. Optional (as alternative to src or an inline grammar).
    Note: The expr attribute has been replaced by the srcexpr attribute and may be deprecated in a future release.
    If you specify this attribute, the element cannot have content. You can specify either srcexpr or expr, but not both. If you specify both, a parse error is thrown.

mode
    New in VoiceXML 2.0. Whether the grammar is for speech or DTMF input. Optional (default is voice). Valid values are:
    voice   Spoken input
    dtmf    DTMF input
    Used to define a grammar in the XML grammar format. For an inline XML grammar, the attribute occurs in the VoiceXML document. For an external XML grammar file, the attribute occurs in the grammar file and is ignored if present in the VoiceXML document. For a grammar of any other format, this attribute is always ignored.

root
    New in VoiceXML 2.0. Specifies the name of the root grammar rule. Required for an inline XML grammar; optional for an external XML grammar.
    Used to define a grammar in the XML grammar format. For an inline XML grammar, the attribute occurs in the VoiceXML document. For an external XML grammar file, the attribute occurs in the grammar file and is ignored if present in the VoiceXML document. For a grammar of any other format, this attribute is always ignored.

Attribute   Description

tag-format
    New in VoiceXML 2.0. Indicates the content type of enclosed <tag> elements. Optional. The only valid value is semantics/1.0. Currently, this attribute is ignored.
    Used to define a grammar in the XML grammar format. For an inline XML grammar, the attribute occurs in the VoiceXML document. For an external XML grammar file, the attribute occurs in the grammar file and is ignored if present in the VoiceXML document. For a grammar of any other format, this attribute is always ignored.

version
    New in VoiceXML 2.0. Version of the grammar format. Required for any XML grammar. The only allowed version is 1.0.
    Used to define a grammar in the XML grammar format. For an inline XML grammar, the attribute occurs in the VoiceXML document. For an external XML grammar file, the attribute occurs in the grammar file and is ignored if present in the VoiceXML document. For a grammar of any other format, this attribute is always ignored.

xml:base
    New in VoiceXML 2.0. Base URI. Optional.
    Used to define a grammar in the XML grammar format. For an inline XML grammar, the attribute occurs in the VoiceXML document. For an external XML grammar file, the attribute occurs in the grammar file and is ignored if present in the VoiceXML document. For a grammar of any other format, this attribute is always ignored.

xml:lang
    New in VoiceXML 2.0. The language and optional country locale identifier for the grammar. Optional (default is en-us). The accepted language identifiers are:
    en      English
    en-us   United States English
    es      Spanish
    es-us   United States Spanish
    fr-ca   French Canadian
    If an unsupported language is specified, an error.unsupported.language event is thrown.
    Used to define a grammar in the XML grammar format. For an inline XML grammar, the attribute occurs in the VoiceXML document. For an external XML grammar file, the attribute occurs in the grammar file and is ignored if present in the VoiceXML document. For a grammar of any other format, this attribute is always ignored.

xmlns
    New in VoiceXML 2.0. Indicates the grammar namespace. Required in an external XML grammar file. Must never occur in a VoiceXML document. The value must be

xmlns:xsi
    New in VoiceXML 2.0. Indicates the location of the grammar schema. Optional in an external XML grammar file. Must never occur in a VoiceXML document. The only legal value is

xsi:schemaLocation
    New in VoiceXML 2.0. Indicates the location of the grammar schema. Optional in an external XML grammar file. Must never occur in a VoiceXML document. The only legal value is "

Attribute   Description

scope
    Sets the scope of the grammar.
    document    The grammar will be active throughout the current document. If the document is the application root document, then it will be active throughout the application (application scope).
    dialog      The grammar is active throughout the current form.
    Note: A <grammar> element can include a scope attribute only if its parent is a <form> element. Optional (default is dialog).
    The scope of any other <grammar> element is determined by its parent:
    - If the parent is an input item, the grammar has field scope.
    - If the parent is a link, the scope is the element that contains the link.
    - If the parent is a menu choice, the grammar scope is specified by the scope property of the containing <menu> element (or dialog scope by default).
    Allowed on a <grammar> element in a VoiceXML document; not allowed in an external XML grammar file.

type
    MIME type of the grammar. Optional. The currently supported types are:
    application/srgs+xml                    XML Speech Grammar
    application/grammar+xml                 XML Speech Grammar (Deprecated; support for this value will be removed from a future release.)
    application/srgs                        ABNF Speech Grammar
    application/grammar                     ABNF Speech Grammar (Deprecated; support for this value will be removed from a future release.)
    application/x-nuance-gsl                Nuance GSL
    application/x-gsl                       Nuance GSL (Deprecated; support for this value will be removed from a future release.)
    application/x-nuance-dynagram-binary    Nuance Grammar Object
    If you specify an unsupported type, an error is thrown.
    For external grammars, the default type is taken from the Content-type header of the returned file. If not present, the type is inferred from the URL's extension or from the contents of the grammar (for example, a file beginning with <?xml maps to application/srgs+xml). The recognized extensions are:
    .grxml, .xml        XML Speech Grammar
    .gram               ABNF Speech Grammar
    .gsl, .grammar      Nuance GSL
    .ngo                Nuance Grammar Object
    .jsgf               Java Speech Grammar Format
    For internal grammars, if the grammar definition specifies the grammar type (either directly with a declaration or indirectly by containing XML elements), the interpreter uses that type. If the grammar definition doesn't indicate the type, the interpreter uses the value of the type attribute, if present. Otherwise, the interpreter assumes that the grammar is in GSL format.
    Allowed on any <grammar> element in a VoiceXML document; not allowed in an external XML grammar file.

universal
    Extension. Makes this grammar a universal grammar with the specified name so that it can be activated and deactivated using the universals property. This attribute does not affect the scope of the grammar; it simply assigns it to a universal category.
    Allowed on any <grammar> element in a VoiceXML document; not allowed in an external XML grammar file.

Attribute

fetchhint
    Specifies whether the interpreter can attempt to optimize dialog interpretation by prefetching the resource. See Prefetching Resources on page 40. Optional. Allowed only on a <grammar> element pointing to an external grammar.

fetchtimeout
    Specifies the interval to wait for the resource to be returned before throwing an error.badfetch event. See Handling Fetching Delays on page 42. Optional. Allowed only on a <grammar> element pointing to an external grammar.

maxage
    New in VoiceXML 2.0. Specifies the maximum acceptable age, in seconds, of the cached resource. See Maximum Age on page 44. Optional. Allowed only on a <grammar> element pointing to an external grammar.

maxstale
    New in VoiceXML 2.0. Specifies the maximum acceptable time, in seconds, during which an expired cached resource can still be used. See Maximum Stale Time on page 44. Optional. Allowed only on a <grammar> element pointing to an external grammar.

weight
    New in VoiceXML 2.0. Not implemented. The weight of the grammar.

(VoiceXML 1.0 only) The following attributes can be used in applications in which the version attribute of the <vxml> tag is set to 1.0.

Attribute

caching
    VoiceXML 1.0 only. Specifies the caching policy for the resource being fetched. See Caching on page 45. Optional. Used in place of the VoiceXML 2.0 attributes maxage and maxstale.

Multiple grammars may be active at the same time. When the speech-recognition engine detects a match against a grammar from a higher level, control is passed to that grammar's parent element.

Grammar Formats. BeVocal VoiceXML supports the W3C specification, Speech Recognition Grammar Format, which defines an XML Grammar and an ABNF Grammar. The VoiceXML 2.0 specification defines the XML grammar as "must support" and the ABNF grammar as "should support". It is expected that many VoiceXML developers will use ABNF because it is more readable and concise. BeVocal VoiceXML additionally supports the Java JSGF format (used in the examples of the VoiceXML 1.0 draft) and the Nuance GSL grammar format. See the following for details:
    - Examples on page 219 has examples of every format.
    - The Grammar Reference details syntax and usage of the various supported grammar formats.
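The fetching and scope attributes described above can be combined on a single <grammar> element. The fragment below is a minimal sketch only, meant to sit inside a <vxml> document: the grammar file names (grammars/order.grxml, grammars/item.gram), the form and field names, and the attribute values are hypothetical and chosen purely for illustration.

```xml
<form id="order">
  <!-- Form-level grammar. scope="document" keeps it active throughout
       the current document. The file name is a hypothetical placeholder. -->
  <grammar src="grammars/order.grxml"
           type="application/srgs+xml"
           scope="document"
           fetchhint="prefetch"
           fetchtimeout="10s"
           maxage="3600"
           maxstale="0"/>

  <field name="item">
    <!-- Field-level grammar. The scope attribute is not allowed here,
         because the parent is not a <form>; the grammar has field scope. -->
    <grammar src="grammars/item.gram" type="application/srgs"/>
    <prompt>What would you like to order?</prompt>
  </field>
</form>
```

The fetch attributes appear only on grammars that point to an external file, matching the restriction stated in the attribute table.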

Compiled Grammars. You may refer to a compiled GSL grammar using the src attribute of the <grammar> tag; for example:

    <grammar src="compiled:/key"/>

However, if you want to combine a compiled grammar with another grammar in a more complex way, you can refer to the compiled grammar from within an ABNF grammar:

    <grammar>
      <![CDATA[
        #ABNF 1.0 en-us;
        root $script;
        $script = prescription $<compiled:/key>;
      ]]>
    </grammar>

Using the Grammar Compiler tool on the BeVocal Café, you can compile a grammar and receive an associated key. For information on compiling grammars, see Chapter 7, Grammar Compiler in Using the BeVocal Café Tools.

Built-in Grammars. The VoiceXML 2.0 specification (Appendix P) includes a set of built-in grammars as a convenience to enable developers to get started writing more complex VoiceXML applications quickly. Some of the basic built-in grammars, such as date and time, are difficult to write and tune by hand but are very useful in many applications. The basic built-in grammars are:

    Grammar Type
    boolean     Recognizes a positive or negative response.
    currency    Recognizes an amount of money, in dollars.
    date        Recognizes a calendar date.
    digits      Recognizes a sequence of digits.
    number      Recognizes a number.
    phone       Recognizes a telephone number adhering to the North American Dialing Plan (with no extension).
    time        Recognizes a clock time.

Note: All standard built-in grammars are supported in the Spanish language. Currently there are no extended built-in grammars for Spanish. Neither standard nor extended built-in grammars are currently supported in French Canadian. All extended built-in grammars are supported only in English. If you specify a language other than English and refer to an unsupported built-in grammar, a parse error error.unsupported.builtin is thrown.

In addition, BeVocal VoiceXML contains a set of extended built-in grammars, so that VoiceXML developers can reference these quite complex grammars, which have been tuned over the years by caller usage. The extended grammars are:

    Grammar Type
    airport          Recognizes an airport name or code, such as DFW or Dallas-Fort Worth.
    airline          Recognizes an airline name or code, such as AA or American Airlines.
    datetime         Recognizes a date and time. Please contact your BeVocal sales representative for further information on pricing and availability.
    equity           Recognizes a company symbol, such as IBM or CSCO.
    citystate        Recognizes US city and state names, for example, Sunnyvale, California.
    stockindex       Recognizes the names of the major US stock indexes, such as Nasdaq.
    street           Recognizes a street name (with or without street number).
    streetaddress    Recognizes a street name and street number.

Note that built-in grammars can be used wherever other grammars are used, most commonly in fields or in forms. A built-in grammar is specified using the syntax:

    <grammar src="builtin:grammar/type"/>

For example, the phone type is specified using:

    <grammar src="builtin:grammar/phone"/>

Most built-in grammars, such as phone, fill a single slot whose name is the same as the grammar name. For example, the date grammar fills a slot named date. However, some of the more complex grammars fill multiple slots; for example, airline fills code and name.

Built-in Grammars Fields and Forms. If the built-in grammar is used as a field grammar, the input variable is set to the slot value. If the built-in grammar fills multiple slots:
    - If the built-in grammar is used as a form grammar, any field whose name or slot attribute matches one of the grammar's slot names has its input variable set to the slot value.
    - If the built-in grammar is used as a field grammar, the slot names and values returned are converted into properties and values on the field's input variable. More technically, the input variable is set to a JavaScript object with properties that can be accessed once the value of the field has been filled.

For more details, see Chapter 1, Using VoiceXML Grammars in the Grammar Reference.
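To make the field-grammar case concrete, the fragment below is a small sketch that uses the airline built-in grammar as a field grammar and then reads the code and name properties of the field's input variable. The form id, field name, and prompt wording are hypothetical; only the grammar URI and the two slot names come from this guide.

```xml
<form id="flight">
  <field name="carrier">
    <!-- airline fills two slots, code and name, which become
         properties of the field's input variable. -->
    <grammar src="builtin:grammar/airline"/>
    <prompt>Which airline are you flying?</prompt>
    <filled>
      <prompt>
        You chose <value expr="carrier.name"/>,
        airline code <value expr="carrier.code"/>.
      </prompt>
    </filled>
  </field>
</form>
```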

Built-in Grammars Multiple Slots. The following table lists the slots for the extended built-in grammars that return multiple slots. When a built-in grammar is used in a <field> element, these are the properties of the field's item variable that are assigned the slot values.

Field Type    Slots (or Properties)

airport
    code          Code of the airport, such as DFW. For a list of known airport codes, see Airport & Airline Codes on the Resources page of the BeVocal Café web site.
    spokencity    The city that was spoken, such as dallas. This slot is only returned if a domestic (US) airport is being recognized.
    airportcity   The airport city for the recognized airport, such as ft. worth.
    state         The two-letter abbreviation for the state, such as tx.
    country       The country where the airport is located, such as united states of america.

airline
    code          The airline's identifying code, such as AS. For a list of known airline codes, see Airport & Airline Codes.
    name          Full name of the airline, such as Alaska Airlines.

datetime
    month               The month for this date/time.
    day                 The day for this date/time.
    time_hours          The hour value of the time.
    time_minutes        The minute value of the time.
    ampm                Whether the date/time is AM or PM.
    timezone            The time zone associated with this date/time.
    duration_hours      The number of hours to the specified time.
    duration_minutes    The number of minutes to the specified time.
    dayofweek           The day of the week associated with this date/time.
    The following examples show utterances and the corresponding slots that are filled:
        "tomorrow"
            day = tomorrow
        "next monday"
            dayofweek = monday
        "six o'clock central time"
            time_hours = 6, time_minutes = 0, timezone = ct
        "four twenty-two p m on november thirtieth"
            ampm = pm, day = 30, month = november, time_hours = 4, time_minutes = 22
        "in one hour and fifteen minutes"
            duration_hours = 1, duration_minutes = 15

equity
    name          The name of the company, such as Cisco Systems.
    symbol        Stock symbol, such as CSCO.

stockindex
    name          The name of the index, such as Nasdaq Composite Index.
    symbol        Symbol for the index, such as COMP.

street
    (none)

Field Type    Slots (or Properties)

streetaddress
    streetname      The name of the street, such as Bordeaux Drive.
    streetnumber    The number of the street.

citystate
    city        The city, such as Sunnyvale.
    county      The county in which the city is located, such as Santa Clara.
    state       The two-letter abbreviation for the state, such as CA.
    datacity    If the city is unlikely to appear in data feeds for traffic, weather, and so on (for example, because it is tiny or unincorporated), the name of an adjacent city that is more likely to work with data feeds. In all other cases, this property is identical to the city property.

A property is accessed with an expression of the form:

    fieldname.propertyname

For example, the city property of a citystate field is accessed as follows:

    <field name="mycity">
      <grammar src="builtin:grammar/citystate"/>
      <filled>
        <prompt>The city is <value expr="mycity.city"/></prompt>
      </filled>
    </field>

Built-in Grammars Parameters. Two standard built-in grammars, digits and boolean, can be parameterized. Specifically, you can set limits on the length of a digit string, and you can set DTMF key presses to mean yes or no. In addition, you must specify the city and state with the street and streetaddress built-in grammars; you can optionally specify the county to disambiguate a case in which the state contains two cities with the same name. The following table shows the built-in grammars that can be parameterized and indicates which parameters are required.

Grammar    Parameters

boolean
    y    The DTMF key press for an affirmative answer.
    n    The DTMF key press for a negative answer.

digits
    minlength    The minimum number of digits in a valid utterance.
    maxlength    The maximum number of digits in a valid utterance.
    length       The exact number of digits in a valid utterance.
    Note: If you do not specify length or maxlength, the built-in grammar accepts any number of digits. You should use length or maxlength whenever you can; doing so usually results in more accurate speech recognition.

equity
    symbol    When the value is true, the grammar expects the input to be the stock symbol spelled out, as in "eye" "bee" "emm" for IBM.

street
    city      The city in which the street is located (required).
    state     The state in which the specified city is located (required).
    county    The county in which the specified city is located. If you omit the county and the city name is ambiguous, an error.badfetch event is thrown with a list of the possible counties in the error message.
    Note: An error.badfetch event is thrown if a required parameter is not specified, if the city is not a recognized city of the specified state, or if there is no street grammar for the specified city.

Grammar    Parameters

streetaddress
    city      The city in which the street is located (required).
    state     The state in which the specified city is located (required).
    county    The county in which the specified city is located. If you omit the county and the city name is ambiguous, an error.badfetch event is thrown with a list of the possible counties in the error message.
    Note: An error.badfetch event is thrown if a required parameter is not specified, if the city is not a recognized city of the specified state, or if there is no street grammar for the specified city.

airport
    domestic    If true, the grammar recognizes only the major domestic (US) airports. If false, it recognizes only the major international airports. If no parameter is specified, this property defaults to true.

You express parameter information using URI-style query syntax of the form:

    builtin:grammar/typename?parameter=value

For example, the following grammar matches a sequence of exactly five digits:

    <grammar src="builtin:grammar/digits?length=5"/>

You can specify more than one parameter, separated by semicolons. For example, the following grammar allows a user to press 7 for an affirmative answer and 9 for a negative answer:

    <grammar src="builtin:grammar/boolean?y=7;n=9"/>

Note: The interpreter throws an error.badfetch event if it loads a VoiceXML file that contains a built-in grammar with an unrecognized parameter or with inconsistent parameters, such as:

    <grammar src="builtin:grammar/digits?minlength=5;maxlength=1"/>

Built-in Extended Grammars Input. The following table shows example user inputs for each extended built-in grammar.

    Grammar Type     Example Inputs
    airport          San Jose; DFW
    airline          American; UA
    equity           Cisco; ORCL
    stockindex       Nasdaq
    street           Bordeaux Drive
    streetaddress    1380 Bordeaux Drive
    citystate        Sunnyvale, California

Built-in Extended Grammars Output. In VoiceXML 2.0, the field's type attribute no longer defines an implicit <say-as> type for outputting the field's value. Instead, the value is played in normal TTS. You must now use the type attribute of <say-as> for a type-specific read-out of the value (in TTS). The bevocal:mode attribute of <say-as> can be used to define recorded output for some types. See the <say-as> tag for details.
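As an illustration of the query syntax for built-in grammar parameters, the following sketch constrains a digit string with minlength and maxlength and restricts the airport grammar to international airports. The form id, field names, prompts, and the particular lengths are hypothetical; the parameter names and the airport slot names are the ones listed in the tables of this section.

```xml
<form id="booking">
  <field name="confirmation">
    <!-- Accept between six and eight digits; bounding the length
         usually improves recognition accuracy. -->
    <grammar src="builtin:grammar/digits?minlength=6;maxlength=8"/>
    <prompt>Please say or enter your confirmation number.</prompt>
  </field>

  <field name="dest">
    <!-- domestic=false limits recognition to the major
         international airports. -->
    <grammar src="builtin:grammar/airport?domestic=false"/>
    <prompt>Which airport are you flying to?</prompt>
    <filled>
      <prompt>
        Airport code <value expr="dest.code"/>,
        located in <value expr="dest.country"/>.
      </prompt>
    </filled>
  </field>
</form>
```

The spokencity slot is not read in this sketch because it is returned only when a domestic airport is recognized.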

    Grammar Type     Say-As Type    Audio Output              String Result
    airport          airport        Airport name              Dallas Fort Worth International Airport (DFW)
    airline          airline        Airline name              American Airlines (AA)
    equity           equity         Company name              ORCL
    stockindex       stockindex     Index name                Nasdaq Composite Index
    street           street         Street name               Bordeaux Drive
    streetaddress    address        Street number and name    1380 Bordeaux Drive
    citystate        citystate      City State                Sunnyvale, CA

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

Parents
    <bevocal:listen> <bevocal:register> <bevocal:verify> <choice> <field> <form> <link> <record> <transfer>

Children
    <lexicon> <metadata> <rule>

See Also
    VoiceXML 2.0 Specification: <grammar>
    Grammar Reference
    <say-as>

Examples

Different grammar formats.

XML Grammar - DTMF Recognition

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form>
        <field name="pin">
          What is your pin?
          <grammar mode="dtmf" root="pin" version="1.0">
            <rule id="digit">
              <one-of>
                <item> 0 </item>
                <item> 1 </item>
                <item> 2 </item>
                <item> 3 </item>
                <item> 4 </item>
                <item> 5 </item>
                <item> 6 </item>
                <item> 7 </item>
                <item> 8 </item>
                <item> 9 </item>
              </one-of>
            </rule>
            <rule id="pin" scope="public">
              <item repeat="4"><ruleref uri="#digit"/></item>
            </rule>
          </grammar>
          <filled>
            You entered <value expr="pin"/>
          </filled>
        </field>
      </form>
    </vxml>

XML Grammar - Voice Recognition

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form>
        <field name="pin">
          What is your pin?
          <grammar mode="voice" root="pin" version="1.0">
            <rule id="digit">
              <one-of>
                <item> 0 </item>
                <item> 1 </item>
                <item> 2 </item>
                <item> 3 </item>
                <item> 4 </item>
                <item> 5 </item>
                <item> 6 </item>
                <item> 7 </item>
                <item> 8 </item>
                <item> 9 </item>
              </one-of>
            </rule>
            <rule id="pin" scope="public">
              <item repeat="4"><ruleref uri="#digit"/></item>
            </rule>
          </grammar>
          <filled>
            You said <value expr="pin"/>
          </filled>
        </field>
      </form>
    </vxml>

ABNF Grammar - DTMF Recognition

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form>
        <field name="pin">
          What is your pin?
          <grammar type="application/grammar">
            <![CDATA[
              #ABNF 1.0;
              mode dtmf;
              root $pin;
              $digit = ( );
              $pin = $digit<4> ;
            ]]>
          </grammar>
          <filled>
            You said <value expr="pin"/>
          </filled>
        </field>
      </form>
    </vxml>

ABNF Grammar - Voice Recognition

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form>
        <field name="pin">
          What is your pin?
          <grammar type="application/grammar">
            <![CDATA[
              #ABNF 1.0;
              root $pin;
              $digit = ( );
              $pin = $digit<4> ;
            ]]>
          </grammar>
          <filled>
            You said <value expr="pin"/>
          </filled>
        </field>
      </form>
    </vxml>

GSL Grammar - DTMF Recognition

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form>
        <field name="pin">
          What is your pin?
          <grammar type="application/x-nuance-gsl">
            FourDigits (Digit Digit Digit Digit)
            Digit [dtmf-1 dtmf-2 dtmf-3 dtmf-4 dtmf-5 dtmf-6 dtmf-7 dtmf-8 dtmf-9 dtmf-0]
          </grammar>
          <filled>
            You said <value expr="pin"/>
          </filled>
        </field>
      </form>
    </vxml>

GSL Grammar - Voice Recognition

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form>
        <field name="pin">
          What is your pin?
          <grammar type="application/x-nuance-gsl">
            FourDigits (Digit Digit Digit Digit)
            Digit [ ]
          </grammar>
          <filled>
            You said <value expr="pin"/>
          </filled>
        </field>
      </form>
    </vxml>

JSGF Grammar - Voice Recognition

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form>
        <field name="pin">
          What is your pin?
          <grammar>
            <![CDATA[
              #JSGF 1.0;
              grammar pin;
              <digit> = ( );
              public <pin> = <digit> <digit> <digit> <digit>;
            ]]>
          </grammar>
          <filled>
            You said <value expr="pin"/>
          </filled>
        </field>
      </form>
    </vxml>

<help>

Catches a help event.

Syntax

    <help count="integer" cond="js_expression">
      Executable Content
    </help>

Attribute

count
    Minimum number of times the event must have occurred during a form or menu invocation. Optional (default is 1).

cond
    JavaScript expression that must also evaluate to true for an event to be caught. Optional (default is true).

If multiple handlers for help events are defined in, or inherited by, the element in which the event occurs, one handler is chosen based on event count, scope, and document order.

A help event is thrown whenever user input matches the predefined help universal grammar.

When the <vxml> tag's version attribute is 2.0 or greater, all universal grammars are deactivated by default. You can activate the help grammar by setting the universals property. The following tag activates all universal grammars, including help:

    <property name="universals" value="all"/>

The following tag activates the help grammar and deactivates the other predefined universal grammars:

    <property name="universals" value="help"/>

When the <vxml> tag's version attribute is 1.0, all universal grammars are active by default. You can deactivate the help grammar by setting the universals property. The following tag deactivates all universal grammars, including help:

    <property name="universals" value="none"/>

The following tag deactivates the help grammar and activates the other predefined universal grammars:

    <property name="universals" value="exit cancel goback"/>

For additional information about universal grammars, see Universal Commands and Grammars on page 14.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

Parents
    <bevocal:listen> <bevocal:register> <bevocal:verify> <field> <form> <initial> <menu> <record> <subdialog> <transfer> <vxml>

Children
    <audio> <assign> <bevocal:connect> <bevocal:dial> <bevocal:disconnect> <bevocal:foreach> <bevocal:hold> <bevocal:whisper> <clear> <data> <disconnect> <enumerate> <exit> <goto> <if> <log> <prompt> <reprompt> <rethrow> <return> <script> <send> <subdialog> <submit> <throw> <var> <value>

See Also
    VoiceXML 2.0 Specification: <help>
    Related tags: <catch>, <error>, <noinput>, <nomatch>, <rethrow>, <throw>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="foo">
        <property name="universals" value="help"/>
        <field name="sports">
          <grammar type="application/x-nuance-gsl">
            [ football basketball tennis skiing ]
          </grammar>
          <help>
            Please say one of football, basketball, tennis or skiing
          </help>
          <prompt> What is your favorite sport? </prompt>
          <filled>
            <prompt>
              Looks like <value expr="sports"/> is your favorite sport.
            </prompt>
          </filled>
        </field>
      </form>
    </vxml>
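The count attribute makes it possible to give progressively more detailed help. The fragment below is a sketch of that idea, assuming handler selection by event count works as described in this section; the form id, field name, and prompt wording are hypothetical.

```xml
<form id="survey">
  <!-- Activate only the help universal grammar (VoiceXML 2.0). -->
  <property name="universals" value="help"/>
  <field name="sport">
    <grammar type="application/x-nuance-gsl">
      [ football basketball tennis skiing ]
    </grammar>
    <prompt>What is your favorite sport?</prompt>

    <!-- Chosen the first time the caller asks for help. -->
    <help count="1">
      Please name a sport.
      <reprompt/>
    </help>

    <!-- Chosen from the second request onward. -->
    <help count="2">
      You can say football, basketball, tennis, or skiing.
      <reprompt/>
    </help>
  </field>
</form>
```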

<if>

Executes actions conditionally.

Syntax

    <if cond="js_expression">
      Executable Content
    </if>

Attribute

cond
    JavaScript expression that evaluates to a boolean value that must be true for the if clause to execute. Optional (default is true).

Usage

Parents
    <bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>

Children
    <audio> <assign> <bevocal:connect> <bevocal:dial> <bevocal:disconnect> <bevocal:foreach> <bevocal:hold> <bevocal:whisper> <clear> <data> <disconnect> <else> <elseif> <enumerate> <exit> <goto> <if> <log> <prompt> <reprompt> <rethrow> <return> <script> <send> <subdialog> <submit> <throw> <var> <value>

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

See Also
    VoiceXML 2.0 Specification: <if>
    Related tags: <else>, <elseif>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="form">
        <field name="hello">
          <grammar type="application/x-nuance-gsl">
            [one dtmf-1 goodbye]
          </grammar>
          <prompt>Say one or press one to continue or say goodbye to exit.</prompt>
          <nomatch>Sorry, I did not get it.<reprompt/></nomatch>
          <filled>
            <if cond="hello=='one' || hello=='dtmf-1'">
              <prompt> Welcome to this part of the world. </prompt>
            <else/>
              <prompt> Sorry you could not have much fun. Goodbye </prompt>
            </if>
          </filled>
        </field>
      </form>
    </vxml>
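The escaping rule in the tip above matters as soon as a condition uses a comparison or a logical AND. The following fragment is a sketch of an <if> with <elseif> and <else> branches in which the < , > and && operators are written with their escape sequences; the field name, the limits, and the prompts are hypothetical.

```xml
<field name="tickets">
  <grammar src="builtin:grammar/number"/>
  <prompt>How many tickets would you like?</prompt>
  <filled>
    <!-- tickets < 1 -->
    <if cond="tickets &lt; 1">
      <prompt>You need to order at least one ticket.</prompt>
      <clear namelist="tickets"/>
    <!-- tickets > 8 && tickets < 100 -->
    <elseif cond="tickets &gt; 8 &amp;&amp; tickets &lt; 100"/>
      <prompt>Group orders are handled by an agent.</prompt>
    <else/>
      <prompt>Booking <value expr="tickets"/> tickets.</prompt>
    </if>
  </filled>
</field>
```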

<initial>

Declares initial logic upon entry into a (mixed-initiative) form.

Syntax

    <initial name="string" expr="js_expression" cond="js_expression">
      Child Elements
    </initial>

Control item that controls the initial interaction in a mixed-initiative form. An <initial> element can request user input or perform other non-interactive initialization tasks at the beginning of a mixed-initiative form. As with all form items, an <initial>'s form-item variable must have a value of undefined in order to be visited. If any of the form's fields are filled as a result of user input, the interpreter sets all <initial> form-item variables to true before performing any <filled> actions. After that, it requests user input in a directed mode based on the prompts associated with the fields that are still unfilled.

Attribute

name
    Name of the form-item variable, which may not be a JavaScript reserved keyword. Optional (default is an unusable internal name). The form-item variable has dialog (form) scope; its name must be unique among all VoiceXML and JavaScript variables within the form's scope. Generally, you use this attribute only if you want to control <initial> execution explicitly.

expr
    JavaScript expression that assigns the initial value of the form-item variable. Optional (default is undefined). If you set the form-item variable to a value other than undefined, you'll need to clear it before the <initial> element can execute. Note that you need to give the form-item variable a name if you want to clear it separately from other form-item variables.

cond
    JavaScript boolean expression that must also evaluate to true for the <initial> element to execute. Optional (default is true). If not specified, the value of the form-item variable alone determines whether or not the <initial> element can execute.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

Parents
    <form>

Children
    <audio> <catch> <enumerate> <error> <help> <link> <noinput> <nomatch> <prompt> <property> <value>

Exceptions
    error.semantic - Thrown if no grammars are active.

See Also
    VoiceXML 2.0 Specification: <initial>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="foo">
        <grammar src="foo.grammar#main"/>
        <nomatch>
          Okay, let me ask you the questions separately.
          <reprompt/>
          <assign name="selection" expr="true"/>
        </nomatch>
        <initial name="selection">
          Please say how many apples or oranges you want.
        </initial>
        <field name="fruit">
          <grammar src="foo.grammar#fruit"/>
          Do you want apples or oranges?
        </field>
        <field name="quantity">
          <grammar src="foo.grammar#quantity"/>
          How many <value expr="fruit"/> do you want?
        </field>
        <block>
          <prompt>
            You said that you wanted <value expr="quantity"/> <value expr="fruit"/>.
          </prompt>
        </block>
      </form>
    </vxml>

<item>

New in VoiceXML 2.0. XML grammar input element that groups tokens or elements in a rule. Identifies one of a set of alternatives and serves as an easy way to associate attributes with a set of rule pieces.

Syntax

    <item repeat="M" | "M-N" | "0-1" | "M-"
          repeat-prob="p"
          weight="n.m"
          xml:lang="lang">
      Content
    </item>

This tag is used to define grammars in the XML form of the W3C Speech Recognition Grammar Format.

An <item> element can contain any number of input elements and <tag> elements. The input elements indicate a sequence that must be matched in order. The repeat attribute applies to the entire sequence, indicating that it is optional or that it may be repeated. Any contained <tag> elements apply to the entire sequence. If the user input matches the input elements and repeat attribute, the interpreter uses the <tag> elements to assign values to input variables.

Attribute

repeat
    Indicates the number of times that the contained expansion may be repeated. Optional (if omitted, the sequence of input elements must occur exactly once).
        M      The contained expansion is repeated exactly M times. M must be 0 or a positive integer.
        M-N    The contained expansion is repeated between M and N times (inclusive). M must be 0 or a positive integer; N must be a positive integer larger than M. For example, "3-5" declares that the contained expansion can occur exactly three, four, or five times. 0-1 is a special case, indicating that the contained expansion is optional.
        M-     The contained expansion is repeated M times or more. M must be 0 or a positive integer. For example, "3-" declares that the contained expansion can occur 3, 4, 5 or more times.

repeat-prob
    The probability of successive repetition of the repeated expression. This attribute is ignored if the repeat attribute is not specified. Optional. Must be in the range between 0.0 and 1.0; note that this is different from a weight attached to the entire item. A simple example is an optional item (zero or one occurrences) with a probability of, for example, 0.6. This indicates that the chance that the item will be matched is 60% and the chance that it will not be present is 40%.

Attribute

weight
    Indicates how likely the item is. Optional. Must be a positive floating point number, such as 2, 2.5, 0.8, or .4. A weight of 1 is equivalent to not specifying a weight. A weight larger than 1 indicates greater likelihood; a weight less than 1 indicates less likelihood.

xml:lang
    The language and optional country locale identifier for the item. Optional (default is the language of the enclosing element). The accepted language identifiers are:
        en       English
        en-us    United States English
        es       Spanish
        es-us    United States Spanish
        fr-ca    French Canadian
    This attribute allows you to mix multiple languages in the same rule. If an unsupported language is specified, an error.unsupported.language event is thrown.

Usage

Parents
    <rule> <item> <one-of>

Children
    <token> <ruleref> <item> <one-of> <tag>

See Also
    Speech Recognition Grammar Specification: <item>
    Chapter 4, XML Speech Grammar Format in the Grammar Reference.

Examples

    <!-- The word "angel" is optional and is not very likely to occur. -->
    <item>
      <item repeat="0-1" repeat-prob="0.25">angel</item>
      <token>fish</token>
    </item>

    <!-- The rule reference to digit must occur between 2 and 4 times -->
    <!-- and it is very likely it will occur 3 or 4 times. -->
    <item repeat="2-4" repeat-prob=".8">
      <ruleref uri="#digit"/>
    </item>
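The examples above cover repeat and repeat-prob; the weight attribute is easiest to see on the alternatives of a <one-of>. The rule below is a sketch under that assumption, for use inside an XML grammar. The rule id, the words, and the weight values are hypothetical and not tuned for any real application.

```xml
<rule id="pet" scope="public">
  <one-of>
    <!-- Weights above 1 mark more likely alternatives,
         weights below 1 mark less likely ones. -->
    <item weight="3">dog</item>
    <item weight="1">cat</item>
    <item weight="0.5">ferret</item>
  </one-of>
  <!-- repeat="0-1" makes the trailing word optional. -->
  <item repeat="0-1">please</item>
</rule>
```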

<lexicon>

New in VoiceXML 2.0. XML grammar element that references an external pronunciation lexicon document. Currently, the BeVocal VoiceXML interpreter ignores the <lexicon> element.

Syntax

    <lexicon uri="uri" type="mediatype"/>

Attribute

uri
    The location of the pronunciation lexicon. Required.

type
    Media type of the lexicon document. Optional.

Usage

All <lexicon> elements must occur before any <rule> elements in the grammar.

Parents
    <grammar>

Children
    None

See Also
    Speech Recognition Grammar Specification: <lexicon>
    Chapter 4, XML Speech Grammar Format in the Grammar Reference.

<link>

Specifies a transition common to all dialogs in the link's scope.

Syntax

    <link next="uri" expr="js_expression" event="event"
          dtmf="dtmf_sequence" eventexpr="js_expression"
          message="string" messageexpr="js_expression"
          fetchhint="prefetch" | "safe"
          fetchtimeout="time_interval" fetchaudio="uri"
          maxage="time_interval" maxstale="time_interval">
      Link Grammar
    </link>

Transitions to another dialog or throws an event when user input matches one of the link grammars or the DTMF sequence specified in the dtmf attribute. If the link transitions to a different dialog, execution jumps to the link's destination; if it throws an event, execution resumes in the current dialog after the event is handled.

The VoiceXML interpreter clears the prompt queue when going to another form. You will not be able to barge in during prompts played in the execution of a <link> tag that goes to another form.

Attribute

next
    The URI to go to when user input matches one of the link grammars. Optional (as an alternative to expr, event).

expr
    JavaScript expression that evaluates to the URI to go to when user input matches one of the link grammars. The JavaScript expression is evaluated in the context of the JavaScript scope containing the link, not in the context of the currently active form item. Optional (as an alternative to next, event).

event
    The event to throw when user input matches one of the link grammars. Optional (as an alternative to next, expr).

dtmf
    New in VoiceXML 2.0. A DTMF sequence to activate this link. The specified sequence is equivalent to a simple DTMF grammar; the link is activated when user input matches this DTMF sequence. This attribute can be used at the same time as child grammars.

eventexpr
    New in VoiceXML 2.0. A JavaScript expression evaluating to the name of the event to be thrown.

message
    New in VoiceXML 2.0. A message string providing additional context about the event being thrown. The message is available as the value of a variable within the scope of the <catch> element.

messageexpr
    New in VoiceXML 2.0. A JavaScript expression evaluating to the message string.

Attributes (continued)

    fetchhint
        Specifies whether the interpreter can attempt to optimize dialog interpretation by prefetching the resource. See Prefetching Resources on page 40. Optional.
    fetchtimeout
        Specifies the interval to wait for the resource to be returned before throwing an error.badfetch event. See Handling Fetching Delays on page 42. Optional.
    fetchaudio
        Specifies the URI of background audio to be played during fetching. Note that this attribute and related properties affect whether queued prompts are played first. See Background Audio on page 42 for important details. Optional.
    maxage
        New in VoiceXML 2.0. Specifies the maximum acceptable age, in seconds, of the cached resource. See Maximum Age on page 44. Optional.
    maxstale
        New in VoiceXML 2.0. Specifies the maximum acceptable time, in seconds, during which an expired cached resource can still be used. See Maximum Stale Time on page 44. Optional.

One and only one of the next, expr, or event attributes must be specified.

(VoiceXML 1.0 only) The following attribute can be used in applications in which the version attribute of the <vxml> tag is set to 1.0.

    caching
        VoiceXML 1.0 only. Specifies the caching policy for the resource being fetched. See Caching on page 45. Optional. Used in place of the VoiceXML 2.0 attributes maxage and maxstale.

Tips:

During application development, put a link like the following in your application root document, so that while you're calling your application you can say "BeVocal reload" or press *** to start the application again.

    <vxml version="2.0" xmlns="
      <link next="<
        <grammar type="application/x-nuance-gsl">
          [ (bevocal reload) (dtmf-star dtmf-star dtmf-star) ]
        </grammar>
      </link>
      ...
    </vxml>

If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

    Parents: <field> <form> <initial> <vxml>
    Children: <grammar>

See Also

    VoiceXML 2.0 Specification: <link>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <link maxage="0" next="target.vxml">
        <grammar type="application/x-nuance-gsl">
          [ (go elsewhere) ]
        </grammar>
      </link>
      <form id="form">
        <field name="welcome">
          This is a test for link tag. You can say go elsewhere
          to load the next page of this application.
        </field>
      </form>
    </vxml>
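The example above shows a link that transitions to another document. As a complement, the following minimal sketch shows a link that throws an event instead. The event name com.example.operator and the message text are hypothetical. The _message variable name comes from the VoiceXML 2.0 specification rather than from this guide, and the sketch assumes the interpreter accepts a link that has a dtmf attribute and no child grammar.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">

  <!-- Document-scope link: pressing 0 throws a hypothetical
       application-defined event and attaches a message to it. -->
  <link dtmf="0" event="com.example.operator" message="caller pressed zero"/>

  <!-- The handler reads the message through _message (name taken
       from the VoiceXML 2.0 specification). -->
  <catch event="com.example.operator">
    <log>Operator requested: <value expr="_message"/></log>
    <prompt>Please hold while I find an operator.</prompt>
    <exit/>
  </catch>

  <form id="main">
    <field name="answer" type="boolean">
      <prompt>
        Do you want to continue? Press zero at any time for an operator.
      </prompt>
      <filled>
        <prompt>Thank you.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```

Because this handler ends with <exit/>, the dialog does not resume. A handler that finishes without a transition would return control to the dialog that was active when the link matched.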

<log>

New in VoiceXML 2.0. Writes debugging information to the BeVocal Café call log, which you can view on the Café web site.

Syntax

    <log
        label="string"
        expr="js_expression" >
        Debugging Text
    </log>

Similar functionality is available within a JavaScript script using the bevocal.log function.

Attributes

    label
        A string that is added as a label to messages produced by this <log> element. Optional (default is no label on the messages).
    expr
        JavaScript expression that evaluates to a string to be added to the call log as a separate message. Optional.

A <log> element may write one or two messages to the call log. It writes one message corresponding to the expr attribute, if any, and one message corresponding to the contained debugging text, if any. If the element has a label attribute, the specified label precedes each message.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

    Parents: <bevocal:foreach> <block> <catch> <error> <filled> <if> <help> <noinput> <nomatch>
    Children: <value>

See Also

    VoiceXML 2.0 Specification: <log>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="foo">
        <block name="dbg">
          <log>
            num: <value expr="num"/>
            fruit: <value expr="fruit"/>
          </log>
        </block>
        <field name="num" type="number">
          <prompt>Say a number.</prompt>
        </field>
        <field name="fruit">
          <grammar type="application/x-nuance-gsl">
            [ apples oranges ]
          </grammar>
          <prompt>Do you want apples or oranges?</prompt>
        </field>
        <filled mode="any" namelist="num fruit">
          <clear namelist="dbg"/>
        </filled>
        <block>
          <log> End of form foo reached </log>
        </block>
      </form>
    </vxml>
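The example above uses only the contained debugging text. The following sketch illustrates the two-message behavior described for the label and expr attributes. The form, field, and label names are hypothetical, and the order in which the two messages appear in the call log is not something this guide states.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="order">
    <field name="quantity" type="number">
      <prompt>How many bags do you want?</prompt>
      <filled>
        <!-- Writes two messages, each preceded by the label "order":
             one for the expr value, one for the contained text. -->
        <log label="order" expr="'quantity=' + quantity">
          Quantity field filled
        </log>
      </filled>
    </field>
  </form>
</vxml>
```

The expr value here contains none of the characters <, >, or &, so no escaping is needed.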

<mark>

New in VoiceXML 2.0. Speech Synthesis Markup Language element that places a marker into the output stream for asynchronous notification.

Syntax

    <mark name="string" />

This element does not affect the speech output process.

Note: Currently, this element is ignored.

Attributes

    name
        The name of the mark. When producing audio output for the containing element, the speech synthesizer throws an event that includes this name.

Usage

    Parents: <audio> <bevocal:whisper> <choice> <emphasis> <enumerate> <p> <paragraph> <prompt> <prosody> <s> <sentence> <voice>
    Children: None

See Also

    VoiceXML 2.0 Specification: <mark>
    Related tags: <break>, <emphasis>, <paragraph>, <phoneme>, <prosody>, <say-as>, <sentence>, <voice>

<menu>

Allows user to choose among alternative destinations.

Syntax

    <menu
        id="string"
        scope="document" | "dialog"
        dtmf="true" | "false"
        accept="exact" | "approximate" >
        Child Elements
    </menu>

Dialog that transitions to <choice> destinations based on user input.

Attributes

    id
        Menu identifier. Optional. Lets you specify this menu as the target for a <goto> or <submit>.
    scope
        Sets the scope of the menu's grammar. Optional (default is dialog).
            document  The menu's grammar is active throughout the current document. If the document is the application root document, then it is active throughout the application (application scope).
            dialog    The menu's grammar is active only in the current menu.
    dtmf
        Enables DTMF selection for all choices in this menu. Optional (default is false).
            true   The interpreter assigns DTMF selectors of 1, 2, and so on to the first 9 or fewer <choice> elements in document order that do not explicitly specify a DTMF sequence using the dtmf attribute.
            false  The interpreter does not make implicit DTMF assignments to menu choices with no DTMF sequences.
        Note that at most 9 choices can be assigned DTMF selectors with this attribute. If the menu contains more than 9 choices, you can explicitly assign selectors to the additional choices using the dtmf attribute of their <choice> tags.
        If you set this attribute to true, any values you specify for dtmf attributes of individual choices must be different from the numbers that will be assigned automatically. For example, if the menu contains 4 choices with no dtmf attribute, those choices will be assigned the selectors 1 through 4. The other choices should have their dtmf attributes set to 0, *, #, or a number greater than 4. If the menu tries to assign the same DTMF selector to more than one choice, an error.badfetch event is thrown when the containing document is fetched.
    accept
        New in VoiceXML 2.0. Specifies whether the default grammars generated for <choice> elements require all words or accept a subset of the words.
            exact        Requires the user to say the exact phrase that appears in the <choice> element.
            approximate  Allows the user to say a subset of the words in the <choice> element.
        Note: The default is exact if the version attribute of the containing <vxml> element is 2.0 or greater. For backward compatibility, the default is approximate if the version attribute is less than 2.0.

Usage

    Parents: <vxml>
    Children: <audio> <catch> <choice> <enumerate> <error> <help> <noinput> <nomatch> <prompt> <property> <script> <value>

Exceptions

    error.semantic - Thrown if no grammars are active.

See Also

    VoiceXML 2.0 Specification: <menu>
    Related tags: <choice>, <enumerate>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <menu id="mainmenu" >
        <prompt>
          This is the main menu. Where do you want to start
          to complete your order? <enumerate/>
        </prompt>
        <choice next="#dateform"> date </choice>
        <choice next="#quantityform"> quantity </choice>
        <choice next="#phoneform"> phone </choice>
      </menu>
      <form id="dateform">
        <field name="date" type="date">
          <prompt> When do you want to pick up your order?</prompt>
          <filled>
            <prompt>
              Just to confirm, I heard you say you will pick it up on
              <say-as type="date"> <value expr="date"/> </say-as>
            </prompt>
          </filled>
        </field>
        <block> <goto next="#quantityform" /> </block>
      </form>
      <form id="quantityform">
        <field name="quantity" type="number">
          <prompt> How many bags of candy do you want?</prompt>
          <filled>
            <prompt>
              Okay, you have ordered <value expr="quantity"/> bags of candy.
            </prompt>
          </filled>
        </field>
        <block> <goto next="#phoneform" /> </block>
      </form>
      <form id="phoneform">
        <field name="phone" type="phone">
          <prompt>
            At what number can I reach you when the order is ready?
          </prompt>
          <filled>
            <prompt>
              I guess
              <say-as type="telephone"> <value expr="phone"/> </say-as>
              is your home number.
            </prompt>
          </filled>
        </field>
      </form>
    </vxml>
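The menu above relies on speech only. The following sketch shows the dtmf and accept attributes described for <menu>, using hypothetical menu, choice, and form names. With dtmf set to true, the three choices that have no dtmf attribute should receive the selectors 1, 2, and 3 in document order, so the explicit selector on the fourth choice is set to 0 to avoid a collision.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <menu id="topics" dtmf="true" accept="approximate">
    <prompt>Choose a topic: <enumerate/></prompt>

    <!-- No dtmf attribute: implicitly assigned 1, 2, 3 in document order. -->
    <choice next="#sports">sports scores</choice>
    <choice next="#weather">weather report</choice>
    <choice next="#news">news headlines</choice>

    <!-- Explicit selector; it must differ from the implicit 1 through 3. -->
    <choice dtmf="0" next="#operator">operator</choice>
  </menu>

  <form id="sports"><block>Here are the sports scores.</block></form>
  <form id="weather"><block>Here is the weather report.</block></form>
  <form id="news"><block>Here are the news headlines.</block></form>
  <form id="operator"><block>Connecting you to an operator.</block></form>
</vxml>
```

Because accept is approximate, a caller can say a subset of a choice's words, such as "weather" instead of "weather report".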

<meta>

Defines a meta-data item as a name/value pair. Can be used either in defining a grammar in the XML grammar or in defining a VoiceXML document.

Syntax

    <meta
        name="string"
        http-equiv="string"
        content="string" />

Attributes

    name
        Name of the meta-data property. Optional (as an alternative to http-equiv). This form lets you provide general document information such as author, copyright, description, keywords, and so on. In particular, you can set the name attribute to maintainer to have the trace log e-mailed to the user specified in the content attribute after the call. For a description of the log file, see Log Access Service.
    http-equiv
        Name of an HTTP response header. Optional (as an alternative to name). This form lets you provide information to use in HTTP response headers. For example:
            <meta http-equiv="cache-control" content="max-age 10"/>
            <meta http-equiv="expires" content="02 Feb :59:59 GMT"/>
        While you are allowed to use this form of the <meta> tag to provide caching information about VoiceXML documents and grammars, we do not recommend you do this if you have a choice. If possible, you should place this information directly in your HTTP response headers.
    content
        Value of the meta-data property.

Usage

    Parents: <vxml> <grammar>
    Children: None.

Allowed as a child of the <grammar> tag only when defining a grammar using the XML grammar format.

Exceptions

    error.badfetch - If both the name and http-equiv attributes are specified.

See Also

    VoiceXML 2.0 Specification: <meta>
    Chapter 4, XML Speech Grammar Format in the Grammar Reference

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <meta name="author" content="bevocal"/>
      <meta name="maintainer" />
      <form>
        <block>
          <prompt>
            Welcome to the BeVocal Cafe. The client log is e-mailed
            to the maintainer. Please check your e-mail.
          </prompt>
        </block>
      </form>
    </vxml>

<metadata>

The <metadata> element is a container in which information about the document can be placed using a metadata schema. Can be used either in defining a grammar in the XML grammar or in defining a VoiceXML document.

This element is intended for including RDF metadata in formats like Dublin Core, but this is not supported by the W3C DTD in the Last Call Working Draft, which only supports an empty <metadata> element. When the BeVocal interpreter supports schemas, metadata schemas will be supported.

Syntax

    <metadata/>

Currently this tag has no effect.

Usage

    Parents: <vxml> <grammar>
    Children: None.

Allowed as a child of the <grammar> tag only when defining a grammar using the XML grammar format.

Exceptions

    error.badfetch - Currently thrown if the <metadata> element has content.

See Also

    VoiceXML 2.0 Specification: <metadata>
    Chapter 4, XML Speech Grammar Format in the Grammar Reference

<noinput>

Catches a no-input event.

Syntax

    <noinput
        count="integer"
        cond="js_expression" >
        Executable Content
    </noinput>

Attributes

    count
        Minimum number of times the event must have occurred during a form or menu invocation. Optional (default is 1). If multiple handlers for no-input events are defined in, or inherited by, the element in which the event occurs, one handler is chosen based on event count, scope, and document order. See Chapter 3, Event Handling.
    cond
        JavaScript expression that must also evaluate to true for an event to be caught. Optional (default is true).

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

    Parents: <bevocal:listen> <bevocal:register> <bevocal:verify> <form> <initial> <menu> <record> <subdialog> <transfer> <vxml>
    Children: <audio> <assign> <bevocal:connect> <bevocal:dial> <bevocal:disconnect> <bevocal:foreach> <bevocal:hold> <bevocal:whisper> <clear> <data> <disconnect> <enumerate> <exit> <goto> <if> <log> <prompt> <reprompt> <rethrow> <return> <script> <send> <subdialog> <submit> <throw> <var> <value>

See Also

    VoiceXML 2.0 Specification: <noinput>
    Related tags: <catch>, <error>, <help>, <nomatch>, <rethrow>, <throw>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <noinput>
        Sorry I did not hear what you said
        <reprompt/>
      </noinput>
      <form id="foo">
        <field name="name">
          <prompt> What is your favorite color? </prompt>
          <grammar type="application/x-nuance-gsl">
            [ red green yellow blue orange ]
          </grammar>
          <filled>
            <prompt>
              Your favorite color is <value expr="name"/>
            </prompt>
          </filled>
        </field>
      </form>
    </vxml>
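The count attribute described for <noinput> can be used to escalate the response when the caller stays silent repeatedly. The following is a minimal sketch with hypothetical form and field names. It places the handlers inside the <field>, as the tapered-prompt example for <prompt> does, and assumes that the handler with the highest count not exceeding the current event count is the one chosen.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="colorform">
    <field name="color">
      <prompt>What is your favorite color?</prompt>
      <grammar type="application/x-nuance-gsl">
        [ red green blue ]
      </grammar>

      <!-- First silence: short apology, then replay the prompt. -->
      <noinput>
        I did not hear you.
        <reprompt/>
      </noinput>

      <!-- Second silence: give more guidance. -->
      <noinput count="2">
        Please say red, green, or blue.
        <reprompt/>
      </noinput>

      <!-- Third silence: give up and end the session. -->
      <noinput count="3">
        <prompt>I still cannot hear you. Goodbye.</prompt>
        <exit/>
      </noinput>

      <filled>
        <prompt>Your favorite color is <value expr="color"/></prompt>
      </filled>
    </field>
  </form>
</vxml>
```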

<nomatch>

Catches a no-match event.

Syntax

    <nomatch
        count="integer"
        cond="js_expression" >
        Executable Content
    </nomatch>

Attributes

    count
        Minimum number of times the event must have occurred during a form or menu invocation. Optional (default is 1). If multiple handlers for no-match events are defined in, or inherited by, the element in which the event occurs, one handler is chosen based on event count, scope, and document order. See Chapter 3, Event Handling.
    cond
        JavaScript expression that must also evaluate to true for an event to be caught. Optional (default is true).

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

    Parents: <bevocal:listen> <field> <form> <initial> <menu> <record> <bevocal:register> <subdialog> <transfer> <bevocal:verify> <vxml>
    Children: <audio> <assign> <bevocal:connect> <bevocal:dial> <bevocal:disconnect> <bevocal:foreach> <bevocal:hold> <bevocal:whisper> <clear> <data> <disconnect> <enumerate> <exit> <goto> <if> <log> <prompt> <reprompt> <rethrow> <return> <script> <send> <subdialog> <submit> <throw> <var> <value>

See Also

    VoiceXML 2.0 Specification: <nomatch>
    Related tags: <catch>, <error>, <help>, <noinput>, <rethrow>, <throw>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <nomatch>
        Please say one of red green yellow blue or orange
      </nomatch>
      <form id="foo">
        <field name="name">
          <prompt> What is your favorite color? </prompt>
          <grammar type="application/x-nuance-gsl">
            [ red green yellow blue orange ]
          </grammar>
          <filled>
            <prompt>
              Your favorite color is <value expr="name"/>
            </prompt>
          </filled>
        </field>
      </form>
    </vxml>
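The cond attribute lets one of several <nomatch> handlers apply depending on application state. The following sketch uses a hypothetical document variable named expert to choose between a terse and a verbose recovery message. Only a handler whose cond evaluates to true is eligible to catch the event.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Hypothetical flag; an application might set it from caller history. -->
  <var name="expert" expr="false"/>

  <form id="colorform">
    <field name="color">
      <prompt>What is your favorite color?</prompt>
      <grammar type="application/x-nuance-gsl">
        [ red green blue ]
      </grammar>

      <!-- Terse recovery for experienced callers. -->
      <nomatch cond="expert">
        Red, green, or blue?
      </nomatch>

      <!-- Verbose recovery for everyone else. -->
      <nomatch cond="!expert">
        Sorry, I did not understand. You can say red, green, or blue.
      </nomatch>

      <filled>
        <prompt>Your favorite color is <value expr="color"/></prompt>
      </filled>
    </field>
  </form>
</vxml>
```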

<object>

Throws an unsupported object exception.

Syntax

    <object>
        Child Elements
    </object>

This tag allows a platform to expose platform-specific functionality. The BeVocal VoiceXML interpreter does not expose such functionality.

VoiceXML 1.0 only. When it encounters the <object> tag, the interpreter always throws an error.unsupported.object exception.

New in VoiceXML 2.0. When it encounters the <object> tag, the interpreter throws an error.unsupported.object.objectname exception, where objectname is the value of the class attribute of the <object> tag.

Usage

    Parents: <form>
    Children: -

<one-of>

New in VoiceXML 2.0. XML grammar input element that indicates alternative user inputs.

Syntax

    <one-of xml:lang="lang" >
        Alternatives
    </one-of>

This tag is used to define grammars in the XML form of the W3C Speech Recognition Grammar Format. The contained items are alternatives; any one of them may be matched by the user input. Each alternative is an <item> element.

Attributes

    xml:lang
        The language and optional country locale identifier for the set of alternatives. Optional (default is the language of the enclosing element). The accepted language identifiers are:
            en     English
            en-us  United States English
            es     Spanish
            es-us  United States Spanish
            fr-ca  French Canadian
        This attribute allows you to mix multiple languages in the same rule. If an unsupported language is specified, an error.unsupported.language event is thrown.

Usage

    Parents: <rule> <item>
    Children: <item>

See Also

    Speech Recognition Grammar Specification: <one-of>
    Chapter 4, XML Speech Grammar Format in the Grammar Reference

Examples

    <grammar...>
      <rule id="coloredobject">
        <ruleref id="color"/>
        <ruleref id="object"/>
      </rule>
      <rule id="color">
        <one-of>
          <item> red <tag> color="red" </tag> </item>
          <item> pink <tag> color="red" </tag> </item>
          <item> yellow <tag> color="yellow" </tag> </item>
          <item> canary <tag> color="yellow" </tag> </item>
          <item> green <tag> color="green" </tag> </item>
          <item> khaki <tag> color="green" </tag> </item>
        </one-of>
      </rule>
      <rule id="object">
        <one-of>
          <item> <tag> object="vehicle" </tag>
            <one-of><item>truck</item><item>car</item></one-of>
          </item>
          <item> <tag> object="toy" </tag>
            <one-of><item>ball</item><item>block</item></one-of>
          </item>
          <item> <tag> object="clothing" </tag>
            <one-of><item>shirt</item><item>blouse</item></one-of>
          </item>
        </one-of>
      </rule>
    </grammar>

<option>

Specifies an option in a <field>.

Syntax

    <option
        accept="exact" | "approximate"
        value="string"
        dtmf="dtmf_sequence" >
        Option Text
    </option>

Provides one of a simple set of alternatives within a field without specifying a grammar. A grammar for the field is generated automatically, based on the option list. You can use <enumerate> to generate prompts automatically based on option lists as well.

Attributes

    accept
        New in VoiceXML 2.0. Specifies whether the default grammar generated for this <option> element requires all words or accepts a subset of the words.
            exact        Requires the user to say the exact phrase that appears in the <option> element.
            approximate  Allows the user to say a subset of the words in the <option> element.
        Note: The default is exact if the version attribute of the containing <vxml> element specifies 2.0 or greater. For backward compatibility, the default is approximate if the version attribute is less than 2.0 or unspecified.
    value
        String to assign to the input variable when this item is selected. Optional (default is the value of the dtmf attribute, if any; otherwise, the option text itself with leading and trailing white space removed).
    dtmf
        DTMF sequence to assign to this option. Optional.

Usage

    Parents: <field>
    Children: None.

See Also

    VoiceXML 2.0 Specification: <option>
    Related tags: <field>, <choice>, <enumerate>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="mainform">
        <block name="welcome">
          This is a test of the option tag.
        </block>
        <catch event="nomatch noinput">
          <prompt> Sorry, I didn't understand. </prompt>
          <reprompt/>
        </catch>
        <field name="mainmenu">
          <prompt>
            Please select an application from the list of <enumerate/>
          </prompt>
          <option value="directions">Driving Directions </option>
          <option value="portfolio">Stock Quotes </option>
          <option value="stores">Business Finder </option>
          <filled>
            <prompt>You chose <value expr="mainmenu"/> </prompt>
          </filled>
        </field>
      </form>
    </vxml>
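The options above can be selected by voice only. The following sketch adds the dtmf and accept attributes described for <option>. The field name, values, and option text are hypothetical. Each option has an explicit value attribute, so the input variable receives that string whether the caller speaks or presses a key.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="sizeform">
    <field name="size">
      <prompt>What size would you like? <enumerate/></prompt>

      <!-- Selectable by saying the option text or pressing the key. -->
      <option dtmf="1" value="S">small</option>
      <option dtmf="2" value="M">medium</option>

      <!-- approximate: a subset of the words, such as "large", also matches. -->
      <option dtmf="3" value="XL" accept="approximate">extra large</option>

      <filled>
        <prompt>You chose size <value expr="size"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```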

<p>

New in VoiceXML 2.0. Speech Synthesis Markup Language element that classifies a region of text as a paragraph.

Syntax

    <p xml:lang="lang" >
        Text
    </p>

Identifies enclosed text as a paragraph for interpretive purposes. This tag is included for brevity; it is the exact equivalent of <paragraph>.

Attributes

    xml:lang
        Not Supported. The language and optional country locale identifier for the paragraph. Optional.

Usage

    Parents: <audio> <bevocal:whisper> <choice> <enumerate> <prompt> <prosody> <voice>
    Children: <audio> <break> <emphasis> <enumerate> <mark> <phoneme> <prosody> <s> <say-as> <sentence> <value> <voice>

See Also

    VoiceXML 2.0 Specification: <paragraph>
    Related tags: <break>, <emphasis>, <mark>, <paragraph>, <phoneme>, <prosody>, <s>, <say-as>, <sentence>, <voice>

<paragraph>

New in VoiceXML 2.0. Speech Synthesis Markup Language element that classifies a region of text as a paragraph.

Syntax

    <paragraph xml:lang="lang" >
        Text
    </paragraph>

Identifies enclosed text as a paragraph for interpretive purposes. For brevity, the tag <p> is an exact equivalent of this tag.

Attributes

    xml:lang
        Not Supported. The language and optional country locale identifier for the paragraph. Optional.

Usage

    Parents: <audio> <bevocal:whisper> <choice> <enumerate> <prompt> <prosody> <voice>
    Children: <audio> <break> <emphasis> <enumerate> <mark> <phoneme> <prosody> <s> <say-as> <sentence> <value> <voice>

See Also

    VoiceXML 2.0 Specification: <paragraph>
    Related tags: <break>, <emphasis>, <mark>, <phoneme>, <prosody>, <say-as>, <sentence>, <voice>

<param>

Specifies a parameter in a <subdialog> element.

Syntax

    <param
        name="string"
        expr="js_expression"
        value="string"
        valuetype
        type="mime_type"
        index="integer" >
        Nested Param Element
    </param>

Passes values to subdialogs.

Not implemented

When a <param> element is used to pass a parameter to a subdialog, the subdialog must contain a <var> declaration for the parameter. If the <var> contains an initializing expr attribute, that initializing value is ignored and the value passed with the <param> element is used instead.

Attributes

    name
        Parameter name.
    expr
        JavaScript expression that evaluates to the value of this parameter. Optional (as alternative to value).
    value
        String to assign as the value of this parameter. Optional (as alternative to expr).
    valuetype
        Not implemented.
    type
        Type of the value attribute (following Nuance precedent). Optional.
    index
        Extension. Index of current <param> element. Optional. Use when parent <param> element is a vector (java.util.Vector) or an array.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

    Parents: <param> <subdialog>
    Children: <param>

See Also

    VoiceXML 2.0 Specification: <param>
    Related tag: <subdialog>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="main">
        <block>We're about to call the sub dialog</block>
        <subdialog name="result" src="#subbie">
          <param name="hello" value="goodbye"/>
          <param name="goodbye" value="goodbye"/>
        </subdialog>
        <block>
          We're back from the sub dialog.
          Result dot hello equals <value expr="result.hello"/>
          Result dot goodbye equals <value expr="result.goodbye"/>
        </block>
      </form>
      <form id="subbie">
        <var name="hello" expr="'hello'"/>
        <var name="goodbye"/>
        <block>
          This is the sub dialog.
          hello equals <value expr="hello"/>
          goodbye equals <value expr="goodbye"/>
          <return namelist="hello goodbye"/>
        </block>
      </form>
    </vxml>
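The example above passes literal strings with the value attribute. The following sketch uses expr instead, so that the parameter carries the current value of a variable. The form, variable, and parameter names are hypothetical. The subdialog declares a <var> for the parameter, as the description of <param> requires.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="main">
    <var name="count" expr="3"/>

    <!-- expr is evaluated in the calling form, so n receives the number 3. -->
    <subdialog name="result" src="#doubler">
      <param name="n" expr="count"/>
      <filled>
        <prompt>
          Twice <value expr="count"/> is <value expr="result.doubled"/>.
        </prompt>
      </filled>
    </subdialog>
  </form>

  <form id="doubler">
    <!-- Declaration required for the passed parameter. -->
    <var name="n"/>
    <var name="doubled"/>
    <block>
      <assign name="doubled" expr="n * 2"/>
      <return namelist="doubled"/>
    </block>
  </form>
</vxml>
```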

<phoneme>

New in VoiceXML 2.0. Speech Synthesis Markup Language element that provides a phonetic pronunciation for the contained text.

Syntax

    <phoneme
        alphabet="ipa" | "worldbet" | "xsampa" | "darpa"
        ph="phoneme_string" >
        Text
    </phoneme>

A phoneme element may be empty. However, your application code will be easier to understand and maintain if the element contains human-readable text that approximates the specified phoneme string.

Note: Any attribute is ignored; the contained text is spoken normally.

Attributes

    alphabet
        The phonetic alphabet used in the phoneme string. Optional (default is darpa). Possible values are:
            darpa     DARPA phonetic alphabet
            ipa       International Phonetic Alphabet (IPA)
            worldbet  Worldbet (Postscript) phonetic alphabet
            xsampa    X-SAMPA phonetic alphabet
        Note: This attribute is passed directly to the TTS engine. Currently, the TTS engine only supports DARPA and ignores the other values.
    ph
        The phoneme string in the specified phonetic alphabet.

Usage

    Parents: <audio> <choice> <emphasis> <enumerate> <p> <paragraph> <prompt> <prosody> <s> <sentence> <voice>
    Children: None

See Also

    VoiceXML 2.0 Specification: <phoneme>
    Related tags: <break>, <emphasis>, <mark>, <paragraph>, <prosody>, <say-as>, <sentence>, <voice>

Example

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns=" xmlns:xsi=" xmlns:bevocal="
      <form>
        <block>
          <prompt>
            <voice name="jennifer">
              in good <phoneme ph="m eh 1 zh er 0"/>
            </voice>
          </prompt>
        </block>
      </form>
    </vxml>
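The example above uses an empty <phoneme> element. Following the recommendation that the element contain human-readable text, the sketch below adds the word that the phoneme string appears to spell. Reading "m eh 1 zh er 0" as "measure" is an assumption, as is the explicit alphabet value (darpa is described as the default).

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <block>
      <prompt>
        <!-- The contained text documents the intended word and is what
             gets spoken if the ph attribute is ignored. -->
        in good
        <phoneme alphabet="darpa" ph="m eh 1 zh er 0">measure</phoneme>
      </prompt>
    </block>
  </form>
</vxml>
```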

<prompt>

Queues TTS and audio output to the user.

Syntax

    <prompt
        bargein="true" | "false"
        bargeintype="hotword" | "speech"
        cond="js_expression"
        count="integer"
        timeout="time_interval"
        xml:lang >
        Prompt Text
    </prompt>

Uses TTS to convey information to the user. If the prompt consists of simple text only, you can omit the prompt tags and the text will be interpreted as if they were present.

The prompt is played in its entirety unless interrupted.

Not implemented

If the bargein property is true, a user utterance can interrupt the prompt. If the bargeintype property is hotword, only an utterance that matches an active grammar can interrupt the prompt. In the latter case, the bevocal.hotwordmin and bevocal.hotwordmax properties specify the minimum and maximum time duration, respectively, of the interrupting utterance.

Attributes

    bargein
        Determines whether user input will be recognized during the prompt. Optional (if not specified, the current value of the bargein property is used).
    bargeintype
        New in VoiceXML 2.0. Determines what kind of input can interrupt the prompt. Optional (if not specified, the current value of the bargeintype property is used). Possible values are:
            hotword  User input that doesn't match the grammar is ignored and only speech that matches a grammar can interrupt the prompt.
            speech   Any user utterance can interrupt the prompt.
        This attribute is relevant only when bargein is true. If the <prompt> occurs inside a bridge transfer, the value of this bargeintype attribute is ignored. For a bridge transfer, the bargein type is always hotword.
    cond
        JavaScript boolean expression that must evaluate to true for the prompt to be spoken.
    count
        Minimum number of times the user must have visited the form item containing the prompt for the prompt to be spoken. Optional (default is 1). Lets you vary prompts if the user is having problems and revisits the same form item several times. Form item prompt counters are reset with each invocation of the form.
    timeout
        Time to wait before throwing a no-input event. Optional (if not specified, the current value of the timeout property is used). Express the time interval as an unsigned number followed by s for time in seconds or ms for time in milliseconds.
    xml:lang
        Not implemented.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

    Parents: <bevocal:foreach> <bevocal:listen> <bevocal:register> <bevocal:verify> <block> <catch> <error> <field> <filled> <help> <if> <initial> <menu> <noinput> <nomatch> <record> <transfer> <subdialog>
    Children: <audio> <break> <emphasis> <enumerate> <mark> <p> <paragraph> <prosody> <s> <say-as> <sentence> <value> <voice>

See Also

    VoiceXML 2.0 Specification: <prompt>
    Related tags: <reprompt>, <audio>, <transfer>

Examples

Example 1. Tapered prompts:

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <property name="universals" value="help" />
      <form id="tapered">
        <block>
          <prompt bargein="false">
            This is question number 1.
          </prompt>
        </block>
        <field name="color">
          <noinput> <reprompt/> </noinput>
          <nomatch> <reprompt/> </nomatch>
          <grammar type="application/x-nuance-gsl">
            [blue red green yellow]
          </grammar>
          <prompt count="1">What is the color of the sky?</prompt>
          <prompt count="2">Choose a color.</prompt>
          <prompt count="3">Choose from red blue green or yellow.</prompt>
          <help>
            The color of the sky is usually blue
            except during sunset and sunrise.
          </help>
          <filled>
            <if cond="color=='blue'">
              <prompt>That's correct. The sky is <value expr="color"/> </prompt>
            <else/>
              <prompt>
                That's not correct. The sky is blue in color.
              </prompt>
            </if>
          </filled>
        </field>
      </form>
    </vxml>

Example 2:

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="mynum">
        <var name="phone" expr=" "/>
        <field name="get_num" type="boolean">
          <prompt>
            My phone number is
            <say-as type="telephone"> <value expr="phone"/> </say-as>
            Do you want to tell me your number?
          </prompt>
          <filled>
            <if cond="get_num">
              <goto nextitem="your_num"/>
            <else/>
              <prompt> Ok, do not tell me. </prompt>
            </if>
          </filled>
        </field>
        <field name="your_num" type="phone" cond="get_num">
          <prompt> What is your phone number </prompt>
          <filled>
            <prompt>
              So your number is
              <say-as type="telephone"> <value expr="your_num"/> </say-as>
            </prompt>
          </filled>
        </field>
      </form>
    </vxml>
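Neither example above uses the cond, bargeintype, or timeout attributes. The following sketch combines them, with hypothetical form, field, and variable names. The greeting plays only while firstvisit is true and cannot be interrupted; the question accepts barge-in from any speech and waits five seconds before a no-input event.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="login">
    <!-- Hypothetical flag that controls the one-time greeting. -->
    <var name="firstvisit" expr="true"/>

    <field name="pin" type="digits">
      <!-- Spoken only while cond is true; barge-in is disabled. -->
      <prompt cond="firstvisit" bargein="false">
        Welcome. Please listen to this message before you answer.
      </prompt>

      <!-- Any utterance may interrupt; no-input is thrown after 5 seconds. -->
      <prompt bargein="true" bargeintype="speech" timeout="5s">
        Please say your PIN.
      </prompt>

      <!-- After one silence, suppress the greeting on the next attempt. -->
      <noinput>
        <assign name="firstvisit" expr="false"/>
        <reprompt/>
      </noinput>

      <filled>
        <prompt>Thank you.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```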

<property>

Controls settings specific to the BeVocal VoiceXML implementation platform.

Syntax

    <property
        name="string"
        value="string"
        bevocal:expr="js_expression" />

Sets values that affect platform behavior and/or represent default attribute values.

Attributes

    name
        Property name. The name can be any of the supported properties described in Chapter 12, Properties.
    value
        Property value. The allowable values depend on the property specified in the name attribute.
    bevocal:expr
        Property value. The JavaScript expression is evaluated whenever the <property>'s scope is reinitialized.

The property settings apply to the parent element and all descendants. However, property values set at lower levels take precedence.

The value of time-related properties must be an unsigned number, optionally followed by s for seconds or ms for milliseconds. If there is no suffix, seconds are assumed.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

    Parents: <bevocal:listen> <bevocal:register> <bevocal:verify> <field> <form> <initial> <menu> <record> <subdialog> <transfer> <vxml>
    Children: None.

See Also

    VoiceXML 2.0 Specification: <property>

Examples

Example 1: bargein property

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <nomatch><reprompt/></nomatch>
      <property name="bargein" value="true"/>
      <form id="form_property">
        <property name="bargein" value="false"/>
        <block>
          <prompt> This is a prompt where you can not barge in </prompt>
          <goto next="#form2"/>
        </block>
      </form>
      <form id="form2">
        <field name="option" type="boolean">
          <prompt>
            Try barging in by saying either yes or no as I speak.
            This is a prompt where you can interrupt me.
          </prompt>
          <filled>
            <if cond="option">
              <prompt>you want to go. Goodbye.</prompt>
              <disconnect/>
            <else/>
              <prompt> Continuing till you say yes. </prompt>
              <clear/>
            </if>
          </filled>
        </field>
      </form>
    </vxml>

Example 2: timing properties

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form>
        <property name="timeout" value="10s"/>
        <noinput>
          <prompt>the caller said nothing.</prompt>
          <reprompt/>
        </noinput>
        <property name="incompletetimeout" value="5s"/>
        <nomatch>
          <prompt>the caller did not say enough to match the grammar.</prompt>
          <reprompt/>
        </nomatch>
        <property name="completetimeout" value="0.1s"/>
        <filled>
          <prompt>the caller said the correct thing.</prompt>
          <exit/>
        </filled>
        <field name="one">
          Say one two three four
          <grammar type="application/x-nuance-gsl"> (one two three four) </grammar>
        </field>
        <block> Finished. This should never play. </block>
      </form>
    </vxml>

Example 3: maximum error properties

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <property name="bevocal.maxerrors" value="9"/>
      <property name="bevocal.maxdialogerrors" value="4" />
      <property name="timeout" value="1.5s" />
      <noinput> i did not hear anything </noinput>
      <form id="firstform">
        <catch event="error.bevocal.maxdialogerrors_exceeded">
          <prompt> max dialog errors event detected. going to next form. </prompt>
          <goto next="#secondform"/>
        </catch>
        <catch event="error.bevocal.maxerrors_exceeded">
          <prompt> total max errors event detected. this should not happen. exiting test. </prompt>
          <exit/>
        </catch>
        <field name="first" type="boolean">
          please do not say anything. you should hear the no input
          message 3 times before moving to the next field.
        </field>
      </form>
      <form id="secondform">
        <property name="bevocal.maxdialogerrors" value="3" />
        <catch event="error.bevocal.maxdialogerrors_exceeded">
          <prompt> max dialog errors event detected. going to next form. </prompt>
          <goto next="#thirdform"/>
        </catch>
        <catch event="error.bevocal.maxerrors_exceeded">
          <prompt> total max errors event detected. this should not happen. exiting test. </prompt>
          <exit/>
        </catch>
        <field name="first" type="boolean">
          please do not say anything. you should hear the no input
          message 2 times before moving to the next field.
        </field>
      </form>
      <form id="thirdform">
        <catch event="error.bevocal.maxdialogerrors_exceeded">
          <prompt> max dialog errors event detected. this should not happen. </prompt>
          <exit/>
        </catch>
        <catch event="error.bevocal.maxerrors_exceeded">
          <prompt> total max errors event detected. the test was successful. now exiting. </prompt>
          <exit/>
        </catch>
        <field name="first" type="boolean">
          please do not say anything. you should hear the no input
          message 1 time before you get a max total errors event.
        </field>
      </form>
    </vxml>

<pros>

VoiceXML 1.0 only. Java Speech Markup Language element that changes the prosody of speech output.

Syntax

    <pros
        rate="string"
        vol="string"
        pitch="string"
        range="string"
    >
        Text
    </pros>

Controls how the enclosed text is spoken. Currently all attributes are ignored and text is spoken normally.

Note: In VoiceXML 2.0, this tag is replaced by the <prosody> tag.

Usage

Parents
    <audio> <bevocal:whisper> <choice> <div> <emp> <enumerate> <prompt> <pros>
Children
    <audio> <break> <div> <emp> <enumerate> <pros> <sayas> <say-as> <value>

See Also

VoiceXML 1.0 Specification: <pros>
Related tags: <break>, <div>, <emp>, <sayas>

<prosody>

New in VoiceXML 2.0. Speech Synthesis Markup Language element that changes the prosody of speech output.

Syntax

    <prosody
        pitch="high" | "medium" | "low" | "default" | "number_of_hertz" | "relativechange"      (Not implemented)
        contour="targets_at_intervals"                                                          (Not implemented)
        range="high" | "medium" | "low" | "default" | "number_of_hertz" | "relativevalue"       (Not implemented)
        rate="fast" | "medium" | "slow" | "default" | "relativechange"
        duration="time_interval"                                                                (Not implemented)
        volume="silent" | "soft" | "medium" | "loud" | "default" | "percent_volume" | "relativechange"
    >
        Text
    </prosody>

Controls how the enclosed text is spoken. Currently only the rate and volume attributes are implemented. For other attributes the text is spoken normally.

Attribute
    pitch
        Not implemented. The baseline pitch for the contained text in Hertz, a relative change, or one of the values high, medium, low, default.
    contour
        Not implemented. Sets the actual pitch contour for the contained text. The pitch contour is defined as a set of targets at specified intervals in the speech output. In each pair of the form (interval,target), the first value is a percentage of the period of the contained text and the second value is the value of the pitch attribute (absolute, relative, relative semitone, or descriptive values are all permitted). Interval values outside 0% to 100% are ignored. If a value is not defined for 0% or 100%, the nearest pitch target is copied.
    range
        Not implemented. The pitch range (variability) for the contained text in Hertz, a relative change, or one of the values high, medium, low, default.
    rate
        The speaking rate for the contained text: a relative change or one of the values fast, medium, slow, default.
    duration
        Not implemented. The desired time to take to read the element contents. Express the time interval as an unsigned number followed by s for time in seconds or ms for time in milliseconds.
    volume
        The volume for the contained text as a percentage (in the range 0.0 to 100.0), a relative change, or one of the values silent, soft, medium, loud, default.

Usage

Parents
    <audio> <bevocal:whisper> <choice> <emphasis> <enumerate> <p> <paragraph> <prompt> <prosody> <s> <sentence> <voice>
Children
    <audio> <break> <emphasis> <enumerate> <mark> <p> <paragraph> <phoneme> <prosody> <s> <say-as> <sentence> <value> <voice>

See Also

VoiceXML 2.0 Specification: <prosody>
Related tags: <break>, <emphasis>, <mark>, <paragraph>, <phoneme>, <say-as>, <sentence>, <voice>
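As a minimal sketch of the two attributes described above as implemented, the following document slows the speaking rate for one sentence and raises the volume for another. The form id and prompt wording are invented for illustration, the DOCTYPE declaration is omitted, and the xmlns value is the standard W3C VoiceXML 2.0 namespace rather than a value taken from this guide.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- "prosody_demo" is a hypothetical form id -->
  <form id="prosody_demo">
    <block>
      <prompt>
        <!-- rate accepts fast, medium, slow, default, or a relative change -->
        <prosody rate="slow"> This sentence is spoken slowly. </prosody>
        <!-- volume accepts silent, soft, medium, loud, default, a percentage, or a relative change -->
        <prosody volume="loud"> This sentence is spoken loudly. </prosody>
        <!-- pitch is listed as not implemented, so this text should be spoken normally -->
        <prosody pitch="high"> This sentence is spoken normally. </prosody>
      </prompt>
    </block>
  </form>
</vxml>
```

Because unimplemented attributes fall back to normal speech, the third sentence is a quick way to check which attributes a given interpreter release honors.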

<record>

Records an audio sample.

Syntax

    <record
        name="string"
        expr="js_expression"
        cond="js_expression"
        maxtime="time_interval"
        finalsilence="time_interval"
        type="mime_type"
        beep="true" | "false"
        modal                          (Not implemented)
        dtmfterm="true" | "false"
    >
        Child Elements
    </record>

Input item that collects a recording from the user and stores the result in the input variable.

Attribute
    name
        Name of the input variable used to store the recording. The variable name may not be a JavaScript reserved keyword. The input variable has dialog (form) scope; its name must be unique among all VoiceXML and JavaScript variables within the form's scope.
    expr
        JavaScript expression that assigns the initial value of the input variable. Optional (default is undefined). If you set the input variable to a value other than undefined, you'll need to clear it before the <record> can execute.
    cond
        JavaScript boolean expression that also must evaluate to true for the <record> to execute. Optional (default is true). If not specified, the value of the input variable alone determines whether or not the <record> can execute.
    maxtime
        Maximum duration of the recording. Optional (default is 10s). Express a time interval as an unsigned number followed by s for time in seconds; ms for time in milliseconds (the default).
    finalsilence
        Duration of silence that will terminate the recording. Optional (default is 1.5s). The minimum allowed value is 0.2 seconds. The maximum allowed value is 10 seconds. Express a time interval as an unsigned number followed by s for time in seconds; ms for time in milliseconds (the default).
        Note: If the value of the finalsilence attribute is 0s, the interpreter does a fixed-length record for maxtime seconds. The recording does not wait for speech input and does not throw a NOINPUT event. The recording can still be terminated using any DTMF key if the dtmfterm attribute is set to true.
    type
        MIME encoding of the submitted document. Optional (default is audio/wav). The supported types and the format of the resulting recording are:
            audio/wav                 WAV (RIFF header) 8 KHz 8-bit mono mu-law [PCM] single channel
            audio/basic               WAV (RIFF header) 8 KHz 8-bit mono mu-law [PCM] single channel
            audio/vnd.wave;codec=1    WAV (RIFF header) 8 KHz 16-bit mono Linear [PCM] single channel
    beep
        Specifies whether to emit a beep just prior to recording. Optional (default is false).
    modal
        Not implemented.
    dtmfterm
        Specifies whether a DTMF keypress terminates the recording. Optional (default is true). When set to true, the DTMF keypress is not part of the recording.

If no audio is collected during the execution of <record>, its input variable is unfilled. This allows the form item to be visited again by the FIA.

Properties of the Shadow Variable. Corresponding to the input variable name is a shadow variable called name$. After the recording is made, additional information is available in the following properties of this shadow variable:

Property
    duration
        The duration of the recording in milliseconds.
    size
        The size of the recording in bytes.
    termchar
        If the dtmfterm attribute is true, and the user terminates the recording by pressing a DTMF key, then this property is the key pressed (for example, #). Otherwise, the property is null.
    maxtime
        true if the recording was terminated because the maximum duration (specified by the maxtime attribute) was reached; false if the recording terminated within the maximum time.
    recording
        If the recordutterance or bevocal.audio.capture property is set to true and the user's speech matched the field grammar, the recording property contains an audio capture of the user's speech. If no audio is collected, this variable is undefined. You can send the captured audio to a server using a <data> element. Doing so is useful if you need a record of the user's speech for legal reasons.
    recordingsize
        The size of the recording in bytes, or undefined if no audio is collected.
    recordduration
        The duration of the recording in milliseconds, or undefined if no audio is collected.

For a <record> element whose name is name, you access the property propname of the shadow variable with the syntax:

    name$.propname

For example, you access the duration property for the <record> named greeting as:

    greeting$.duration

Note: If the user hangs up during the recording, the interpreter throws a hang up event (connection.disconnect.hangup in VoiceXML 2.0; telephone.disconnect.hangup in VoiceXML 1.0). Any audio data recorded before hang up is available in the input variable and properties of the shadow variable are set as described above.

Usage

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Parents
    <form>
Children
    <audio> <catch> <enumerate> <error> <filled> <grammar> <help> <noinput> <nomatch> <prompt> <property> <value>

See Also

VoiceXML 2.0 Specification: <record>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="form-record">
        <record name="greeting" beep="true" maxtime="10s" finalsilence="2000ms">
          <prompt> Please record your greeting after the tone. </prompt>
          <noinput>
            i did not hear anything.
            <reprompt/>
          </noinput>
        </record>
        <field name="confirm" type="boolean">
          <prompt>
            your greeting is <audio expr="greeting"/>
            to keep it say yes, to discard it say no.
          </prompt>
          <filled>
            <if cond="confirm">
              <prompt> ok, i will save your greeting </prompt>
              <submit method="post" namelist="greeting" next="greetingstore.jsp"/>
            <else/>
              <prompt> ok, lets try again </prompt>
              <clear namelist="greeting confirm"/>
            </if>
          </filled>
        </field>
      </form>
    </vxml>
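The shadow-variable properties of a <record> can be read in its <filled> handler. The following is a minimal sketch, assuming the duration, size, maxtime, and termchar properties behave as described in this section; the form id and prompt wording are invented for illustration, the DOCTYPE declaration is omitted, and the xmlns value is the standard W3C VoiceXML 2.0 namespace rather than a value taken from this guide.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- "record_info" is a hypothetical form id -->
  <form id="record_info">
    <record name="greeting" beep="true" maxtime="10s" dtmfterm="true">
      <prompt>
        Please record your greeting after the tone.
        Press any key when you are done.
      </prompt>
      <filled>
        <!-- greeting$ is the shadow variable of the input variable greeting -->
        <prompt>
          Your recording lasted <value expr="greeting$.duration"/> milliseconds
          and takes up <value expr="greeting$.size"/> bytes.
        </prompt>
        <if cond="greeting$.maxtime">
          <!-- maxtime is true when the 10 second limit ended the recording -->
          <prompt> The recording stopped because it reached the maximum length. </prompt>
        <elseif cond="greeting$.termchar != null"/>
          <!-- termchar holds the DTMF key that ended the recording, or null -->
          <prompt> You ended the recording by pressing <value expr="greeting$.termchar"/> </prompt>
        </if>
      </filled>
    </record>
  </form>
</vxml>
```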

<reprompt>

Plays a field prompt when a field is re-visited after an event.

Syntax

    <reprompt/>

Normally the interpreter suppresses prompts when it selects the next form item after executing a <catch> element or other event handler. By placing a <reprompt> element in the event handler, you can cause normal prompting to occur and prompt counters to increment when the next form item is executed.

This tag affects both the explicit fields of a form and the single implicit field of a menu.

Usage

Parents
    <bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>
Children
    None.

See Also

VoiceXML 2.0 Specification: <reprompt>
Related tag: <prompt>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="form-record">
        <field name="mycolor">
          <prompt> What is your favorite color? </prompt>
          <prompt count="2"> Please choose red green yellow blue or orange. </prompt>
          <grammar type="application/x-nuance-gsl">
            [ red green yellow blue orange ]
          </grammar>
          <noinput>
            I did not quite get your message.
            <reprompt/>
          </noinput>
        </field>
        <block>
          <prompt> You chose <value expr="mycolor"/> </prompt>
        </block>
      </form>
    </vxml>

<rethrow>

Extension. Causes the event currently being handled to be rethrown.

Syntax

    <rethrow />

When executed within an event handler, this tag causes the event currently being handled to be rethrown. The execution environment searches for a new handler for the event starting in the scope above the one containing the current handler.

Rethrowing an event allows you to handle the same event at different levels. For example, an event handler in a form could perform a certain amount of cleanup and then rethrow the event so that a document-level event handler could perform further cleanup.

For more information, see Chapter 3, Event Handling.

Usage

Parents
    <bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>
Children
    None.

See Also

Related tags: <catch>, <error>, <help>, <noinput>, <nomatch>, <throw>

Examples

See other catch and throw examples under <throw> and <catch>.

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <catch event="myevent">
        <prompt>catch of my event at document scope.</prompt>
        <goto next="#nonumbers" />
      </catch>
      <form id="numbers">
        <catch event="myevent">
          <prompt> Catch of my event at form scope. This will test the rethrow tag. </prompt>
          <rethrow/>
        </catch>
        <block name="numbergame">
          This is a test of the throw tag.
        </block>
        <field name="mynumber" type="number">
          <prompt> Tell me a number greater than ten. </prompt>
          <filled>
            <prompt> The number you said is <value expr="mynumber"/> </prompt>
            <if cond="mynumber &lt; 10">
              <throw event="myevent"/>
            </if>
          </filled>
        </field>
      </form>
      <form id="nonumbers">
        <block>you will hear no numbers here. Goodbye.</block>
      </form>
    </vxml>

<return>

Returns from a subdialog.

Syntax

    <return
        namelist="variable1..."
        event="event"
        eventexpr="js_expression"
        message="string"
        messageexpr="js_expression"
    />

Causes the subdialog's execution context to terminate. You can use <return> to pass back variable values from the subdialog's execution context and to propagate an event back to the calling dialog (for example, a no-match event).

To propagate an event back to the calling dialog, use the event attribute. For example:

    <return event="nomatch"/>

If <return> is encountered in a dialog that is not executing as a subdialog, an error.semantic event is thrown.

Attribute
    namelist
        Space-separated list of variables to return to the calling dialog. Optional (default is to return no variables). This attribute can specify any variable currently in scope, both VoiceXML variables and JavaScript variables, including shadow variables and other variables that have not been explicitly declared. A variable set to a JavaScript object is returned as the entire object; it is not broken into its individual components as happens with variable submission.

        The values returned with the namelist attribute are available to the calling dialog as properties of the <subdialog> input variable. They can be accessed using the following notation:

            subdialogname.namelistvariable

        For example, if a subdialog is invoked with:

            <subdialog name="sub".../>

        and exited with:

            <return namelist="a"/>

        then the return value can be accessed as:

            "sub.a"
    event
        Return to calling dialog and throw this event.

Usage

Parents
    <bevocal:foreach> <block> <catch> <error> <help> <noinput> <nomatch> <if> <filled>
Children
    None.

Exceptions

error.badfetch - Thrown if both event and namelist attributes are specified.

See Also

VoiceXML 2.0 Specification: <return>
Related tag: <subdialog>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="main">
        <subdialog name="result" src="#subbie">
          <param name="goodbye" value="goodbye"/>
        </subdialog>
        <block>
          We're back from the sub dialog.
          Result dot hello equals <value expr="result.hello"/>
          Result dot goodbye equals <value expr="result.goodbye"/>
        </block>
      </form>
      <form id="subbie">
        <var name="hello" expr="'hello'"/>
        <var name="goodbye"/>
        <block>
          This is the sub dialog.
          <return namelist="hello goodbye"/>
        </block>
      </form>
    </vxml>
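The event attribute can be sketched in the same way. In the following document the subdialog returns a value on success and propagates an event after a second failed recognition; the calling <subdialog> element catches that event. This is a minimal sketch: the form ids, the field name, and the event name com.example.nocolor are hypothetical, the DOCTYPE declaration is omitted, and the xmlns value is the standard W3C VoiceXML 2.0 namespace rather than a value taken from this guide.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="main">
    <subdialog name="result" src="#askcolor">
      <!-- The event named by <return event="..."/> is thrown here, in the calling dialog -->
      <catch event="com.example.nocolor">
        <prompt> The sub dialog gave up without a color. </prompt>
        <exit/>
      </catch>
      <filled>
        <!-- Returned variables are properties of the subdialog input variable -->
        <prompt> The sub dialog returned <value expr="result.color"/> </prompt>
      </filled>
    </subdialog>
  </form>

  <form id="askcolor">
    <field name="color">
      <prompt> Say red or blue. </prompt>
      <grammar type="application/x-nuance-gsl"> [ red blue ] </grammar>
      <!-- On the second no-match, hand the problem back to the caller as an event -->
      <nomatch count="2">
        <return event="com.example.nocolor"/>
      </nomatch>
      <filled>
        <return namelist="color"/>
      </filled>
    </field>
  </form>
</vxml>
```

Each <return> uses either event or namelist, never both, since the combination throws error.badfetch.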

<rule>

New in VoiceXML 2.0. XML grammar element that defines a grammar rule.

Syntax

    <rule
        id="string"
        scope="private" | "public"
    >
        Other Content
    </rule>

This tag is used to define grammars in the XML form of the W3C Speech Recognition Grammar Format.

A <rule> element can contain any number of input elements, examples, and <tag> elements. The input elements indicate a sequence that must be matched in order. Any contained <tag> elements apply to the entire sequence. If the user input matches the input elements, the <tag> elements are interpreted to assign values to input variables. Any <example> elements are ignored by the speech-recognition engine.

Attribute
    id
        The name of the rule; must be unique within the containing grammar. "NULL", "VOID", and "GARBAGE" are reserved for special rules; id must not be one of these values.
    scope
        The scope in which this rule can be used. Optional (default is "private").
            private    The rule can be used only by the containing grammar.
            public     The rule can be referenced by another grammar.
        Note: Do not confuse the scope of a rule with the scope of a containing grammar. The scope of the grammar indicates where in the VoiceXML application the grammar is active.

Usage

Parents
    <grammar>
Children
    <example> <item> <one-of> <ruleref> <tag> <token>

See Also

Speech Recognition Grammar Specification: <rule>
Chapter 4, XML Speech Grammar Format in the Grammar Reference.

Examples

This rule matches one of the 3 phrases "Garibaldi", "Nassau Grouper", or "Trout":

    <rule id="fish">
      <one-of>
        <item>garibaldi</item>
        <item>"nassau Grouper"</item>
        <item>trout</item>
      </one-of>
    </rule>

This rule matches any input that ends with "trigger fish", such as "queen trigger fish", "picasso trigger fish", and "mean old trigger fish", as well as simply "trigger fish":

    <rule id="trigger">
      <ruleref special="garbage"/>
      <token>trigger Fish</token>
    </rule>

The following set of rules creates one public rule named snapper and two private rules named snappertype and fishcolors:

    <rule id="snapper" scope="public">
      <ruleref uri="#snappertype"/>
      <token>snapper</token>
    </rule>
    <rule id="snappertype">
      <token>mutton</token>
      <ruleref uri="#fishcolors"/>
    </rule>
    <rule id="fishcolors" scope="private">
      <one-of>
        <item>black</item>
        <item>gray</item>
        <item>red</item>
      </one-of>
    </rule>

<ruleref>

New in VoiceXML 2.0. XML grammar input element that references another rule.

Syntax

    <ruleref
        uri="uri"
        type="mime_type"
        xml:lang="lang"
        special="NULL" | "VOID" | "GARBAGE"
    >
        Optional Tags
    </ruleref>

This tag is used to define grammars in the XML form of the W3C Speech Recognition Grammar Format.

The referenced rule specifies user input to be matched. If the user input matches the referenced rule, any contained <tag> elements are interpreted to assign values to input variables.

You can refer to a rule from within another rule either in the same or a different grammar definition. In summary, the formats for doing so are:

Reference to a                                                        Format
    Local rule
        <ruleref uri="#rulename"/>
    Named rule of a grammar identified by a URI
        <ruleref uri="grammaruri#rulename"/>
    Root rule of a grammar identified by a URI
        <ruleref uri="grammaruri"/>
    Rule of a grammar identified by a URI and specifying a media type
        <ruleref uri="grammaruri" type="mediatype"/>
    Rule of a grammar identified by a URI and specifying a language
        <ruleref uri="grammaruri" xml:lang="lang"/>
    Special rule definition
        <ruleref special="null"/>
        <ruleref special="void"/>
        <ruleref special="garbage"/>

Attribute
    uri
        The URI of the referenced rule. Optional (must provide uri or special, but not both). May be one of the following:
            #rulename
                References the local rule rulename in the contained grammar.
            grammarfileuri#rulename
                References the public rule rulename in the grammar defined in the grammar file whose URI is grammarfileuri. The grammar file can be in any grammar format, not just XML.
            grammarfileuri
                References the default rule of the grammar defined in the grammar file whose URI is grammarfileuri. The grammar file can be in any grammar format.
    type
        MIME type of the grammar. Optional; only relevant for a reference to a rule in another grammar file. For such a reference, the default type is taken from the Content-type header of the returned file. If not present, the type is inferred from the URL's extension or from the contents of the grammar (for example, a file beginning with <?xml maps to application/srgs+xml).
        The recognized extensions are:
            .grxml, .xml       XML Speech Grammar
            .gram              ABNF Speech Grammar
            .gsl, .grammar     Nuance GSL
            .ngo               Nuance Grammar Object
            .jsgf              Java Speech Grammar Format
        The currently supported types are:
            application/srgs+xml                    XML Speech Grammar
            application/grammar+xml                 XML Speech Grammar (Deprecated; support for this value will be removed from a future release)
            application/srgs                        ABNF Speech Grammar
            application/grammar                     ABNF Speech Grammar (Deprecated; support for this value will be removed from a future release)
            application/x-nuance-gsl                Nuance GSL
            application/x-gsl                       Nuance GSL (Deprecated; support for this value will be removed from a future release)
            application/x-nuance-dynagram-binary    Nuance Grammar Object
        If you specify an unsupported type, an error is thrown. This value is used only if the web server returns an unsupported grammar type.
    xml:lang
        The language and optional country local identifier for the rule reference. Optional (default is the language of the enclosing element). The accepted language identifiers are:
            en       English
            en-us    United States English
            es       Spanish
            es-us    United States Spanish
            fr-ca    French Canadian
        This attribute is only useful if the rule referred to is in the GSL format. In all other cases, the interpreter ignores this attribute. If an unsupported language is specified, an error.unsupported.language event is thrown.
    special
        The referenced special rule. Optional (must provide uri or special, but not both).
            NULL       Rule that is automatically matched, that is, matched without the user speaking.
            VOID       Rule that can never be matched. Inserting VOID into a sequence automatically makes that sequence unspeakable.
            GARBAGE    Rule that matches any speech up until the next rule match, the next token, or the end of spoken input.

Usage

Parents
    <rule> <item>
Children
    None.

See Also

Speech Recognition Grammar Specification: <ruleref>
Chapter 4, XML Speech Grammar Format in the Grammar Reference.

Examples

    // Reference to a specific rule of an external grammar
    <ruleref uri="../fish.gram#butterflies" />

    // Reference to the root rule of an external grammar
    <ruleref uri=" />

    // References with associated media types
    <ruleref uri=" type="application/srgs" />
    <ruleref uri="../fish" type="application/srgs" />

    // Reference with an associated media type and language
    <ruleref uri=" type="application/x-nuance-gsl" xml:lang="en-us" />

<s>

New in VoiceXML 2.0. Speech Synthesis Markup Language element that classifies a region of text as a sentence.

Syntax

    <s
        xml:lang="lang"
    >
        Text
    </s>

Identifies enclosed text as a sentence for interpretive purposes. This tag is included for brevity; it is the exact equivalent of <sentence>.

Attribute
    xml:lang
        Not supported. The language and optional country local identifier for the sentence. Optional.

Usage

Parents
    <audio> <bevocal:whisper> <choice> <enumerate> <p> <paragraph> <prompt> <prosody> <voice>
Children
    <audio> <break> <emphasis> <enumerate> <mark> <phoneme> <prosody> <say-as> <value> <voice>

See Also

VoiceXML 2.0 Specification: <sentence>
Related tags: <break>, <emphasis>, <mark>, <p>, <paragraph>, <phoneme>, <prosody>, <say-as>, <sentence>, <voice>

<say-as>

New in VoiceXML 2.0. Speech Synthesis Markup Language element that modifies how the enclosed word or phrase is spoken.

Syntax

    <say-as
        bevocal:mode="mode"
        type="typename" | "typename:format"
    >
        Text
    </say-as>

Attribute
    bevocal:mode
        Extension. When the value is recorded, the output will be recorded audio. If no recorded audio is available, TTS is used to play out the content. Optional.
        Note: Recorded prompts are not available for Spanish or French Canadian.
    type
        Speak enclosed text in the given style. The type attribute takes quite a few values that are listed below.
    sub
        Deprecated; use the <sub> element instead. This attribute was supported in an earlier release; if you use it, the interpreter throws a parse error.

The following events may be thrown during the execution of a <say-as> element:

    error.badfetch - In VoiceXML 1.0; the <say-as> tag is a version 2.0 element.
    error.badfetch - When this element contains a class attribute. The class attribute is not valid in version 2.0.
    error.semantic - When the type attribute contains a value which is not one of those allowed in the enumerated list of values above. Note that some values have changed between version 1.0 and 2.0. For example, type="phone" (from 1.0) would cause this error to be thrown.

The type attribute takes quite a few values. Some <say-as> types require a format. For others, it is optional. The values break down into 3 categories corresponding to a built-in grammar, other standard SSML <say-as> types, and extended <say-as> types available with the BeVocal interpreter.

The sole type value that corresponds to a built-in grammar is:

Type
    vxml
        Contained text is a field's input variable. The allowed formats correspond exactly to the available built-in grammars: boolean, currency, date, digits, number, phone, and time. The format is required.

The other standard <say-as> types and their formats are:

Type
    acronym
        Contained text is an acronym. For example, <say-as type="acronym">ibm</say-as> is spoken as "eye bee em".
    spell-out
        Characters in the contained text string are spoken as individual characters. For example, <say-as type="spell-out">smith</say-as> is spoken as "ess em eye tea aitch".
    number
        Contained text can be interpreted as a number. The allowed number formats are ordinal, cardinal, and digits. For example, <say-as type="number:ordinal">12</say-as> is spoken as "twelfth", but <say-as type="number:digits">12</say-as> is spoken as "one two".
    date
        Contained text is a date. The allowed date formats indicate the form of the input and are dmy, mdy, ymd, ym, my, md, y, m, and d. For example, <say-as type="date:my">12/2003</say-as> is spoken as "December, two thousand and three".
    time
        Contained text is a time of day. The allowed time formats indicate the form of the input and are hms, hm, and h. For example, <say-as type="time:h">3</say-as> is spoken as "3 o'clock".
    duration
        Contained text is a temporal duration. The allowed duration formats indicate the form of the input and are hms, hm, ms, h, m, and s. For example, <say-as type="duration:h">3</say-as> is spoken as "3 hours".
    currency
        Contained text is a currency amount. For example, <say-as type="currency">$3</say-as> is spoken as "3 dollars".
    measure
        Contained text is a measurement. For example, <say-as type="measure">3ft</say-as> is spoken as "3 feet".
    telephone
        Contained text is a telephone number. For example, <say-as type="telephone"> </say-as> is spoken as "five five five <pause> five five five <pause> five five <pause> five five".
    name
        Contained text is a proper name of a person, company, and so on.
    net
        Contained text is an internet identifier. The allowed net formats indicate the form of the input and are and uri. For example, <say-as type="net: ">[email protected]</say-as> is spoken as "you at company dot com".

Extensions. The extended <say-as> types are:

Type
    airport
        Contained text is an airport code, such as DFW. For a list of the allowed airport codes, see Airport & Airline Codes on the Resources page of the BeVocal Café web site.
    airline
        Contained text is an airline code, such as AA. For a list of the allowed airline codes, see Airport & Airline Codes.
    equity
        Contained text is a company symbol, such as ibm or csco.
    street
        Contained text is a street name (with or without street number), such as bordeaux drive or 1380 bordeaux drive.
    city
        Contained text is a city name, such as sunnyvale.
    state
        Contained text is a state name, such as california or ca.
    citystate
        Contained text is a city name and state name separated by a comma, such as sunnyvale, california.
    address
        Contained text is a street, city, state, and zip code separated by commas, such as 1380 bordeaux drive, sunnyvale, california,

Usage

Parents
    <audio> <bevocal:whisper> <choice> <emphasis> <enumerate> <p> <paragraph> <prompt> <prosody> <s> <sentence> <voice>
Children
    <value>

See Also

VoiceXML 2.0 Specification: <say-as>
Related tags: <break>, <emphasis>, <mark>, <paragraph>, <phoneme>, <prosody>, <sentence>, <sub>, <voice>

Examples

To hear an example of each supported type, see the Say-As Types example in the VoiceXML Samples section of the BeVocal Café.
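A short document can exercise several of the type and format values from the tables above in one prompt. The following is a minimal sketch that reuses the sample values given in those tables; the form id and the lead-in wording of each phrase are invented for illustration, the DOCTYPE declaration is omitted, and the xmlns value is the standard W3C VoiceXML 2.0 namespace rather than a value taken from this guide.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- "sayas_demo" is a hypothetical form id -->
  <form id="sayas_demo">
    <block>
      <prompt>
        <!-- Standard types, written as typename or typename:format -->
        Spelled out: <say-as type="spell-out">smith</say-as>
        As an ordinal: <say-as type="number:ordinal">12</say-as>
        As digits: <say-as type="number:digits">12</say-as>
        A month and year: <say-as type="date:my">12/2003</say-as>
        A duration: <say-as type="duration:h">3</say-as>
        A measurement: <say-as type="measure">3ft</say-as>
        <!-- BeVocal extension types -->
        An airport: <say-as type="airport">DFW</say-as>
        A city and state: <say-as type="citystate">sunnyvale, california</say-as>
      </prompt>
    </block>
  </form>
</vxml>
```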

306 TAGS <sayas> Syntax VoiceXML 1.0 only. Java Speech Markup Language element that modifies how a word or phrase is spoken. <sayas sub="string" class="phone" "date" "time" "digits" "literal" "currency" "number" "airport" "airline" "equity" "street" "city" "state" "citystate" type="phone" "date" "time" "digits" "literal" "currency" "number" "airport" "airline" "equity" "street" "city" "state" "citystate" phon Not implemented > Text </sayas> Note: In VoiceXML 2.0, this tag is replaced by the <say-as> tag. The <say-as> tag is from the Synchronized Speech Markup Language. Attribute sub Substitute text to be spoken instead of enclosed text. Optional. 294 VOICEXML PROGRAMMER S GUIDE

307 <sayas> Usage Attribute type class phon Speak enclosed text in the given style. Optional. Possible values with enclosed text formats are: phone For telephone number adhering to the North American Dialing Plan, such as date For dates, in yyyymmdd or yyyymm format, such as time For times such as 11:45 PM digits For digits, such as literal For literals currency For currency in dollars and cents, such as $ number For numbers, such as 10.5 or 10 Extensions: airport For airport codes, such as DFW. For a list of the allowed airport codes, see Airport & Airline Codes on the Resources page of the BeVocal Café web site. airline For airline codes, such as AA. For a list of the allowed airport codes, see Airport & Airline Codes. equity For company symbol, such as ibm or csco street For street name (with or without street number), such as bordeaux drive or 1380 bordeaux drive city For city name, such as sunnyvale state For state name, such as california or ca citystate For city name and state name separated by a comma, such as sunnyvale, california Note: The type attribute is the VoiceXML 2.0 standard; the class attribute is an extension. These two attributes are identical; only one of them should be used. Not implemented. Representation of the Unicode International Phonetic Alphabet (IPA) characters to be spoken instead of enclosed text. Optional. The following events may be thrown during the execution of a <sayas> element: error.badfetch - Thrown in VoiceXML 2.0. error.semantic - When the type attribute contains a value which is not one of those allowed in the enumerated list of values above. Note that some values have changed between version 1.0 and 2.0. See Also Parents <audio> <bevocal:whisper> <choice> <div> <emp> <enumerate> <prompt> <pros> Children None. VoiceXML 1.0 Specification: <sayas> Related tags: <break>, <div>, <emp>, <pros> VOICEXML PROGRAMMER S GUIDE 295

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 1.0//EN" "
    <vxml version="1.0">
      <form>
        <block>
          <prompt>
            Here is a currency amount: <sayas class="currency"> $ </sayas>
            Here is a company: <sayas class="equity"> csco </sayas>
            Here is a number: <sayas class="number"> </sayas>
            Here is an integer number. You should not hear any decimal content:
            <sayas class="number"> 10 </sayas>
            Here is an airport: <sayas class="airport"> DFW </sayas>
            Here is an airline: <sayas class="airline"> AA </sayas>
            Here is a phone number: <sayas class="phone"> </sayas>
            Here is a date: <sayas class="date"> </sayas>
            Here is a street: <sayas class="street"> heiden lane </sayas>
            Here is a city: <sayas class="city"> austin </sayas>
            Here is a state: <sayas class="state"> texas </sayas>
            Here is a citystate: <sayas class="citystate"> austin, texas </sayas>
            Here is a digit string: <sayas class="digits"> </sayas>
          </prompt>
        </block>
      </form>
    </vxml>

<script>

Specifies a block of client-side scripting logic in JavaScript.

Syntax

    <script
      src="uri"
      charset="string"
      fetchhint="prefetch" | "safe"
      fetchtimeout="time_interval"
      maxage="time_interval"
      maxstale="time_interval"
      srcexpr="js_expression"
    >
      Script Text
    </script>

Scripts do not have their own scope, but are executed in the scope of the containing element.

Attribute

src
    URI of the script. Optional (as alternative to inline).
charset
    Character encoding of the script if src is used. Optional.
fetchhint
    Specifies whether the interpreter can attempt to optimize dialog interpretation by prefetching the resource. See Prefetching Resources on page 40. Optional.
fetchtimeout
    Specifies the interval to wait for the resource to be returned before throwing an error.badfetch event. See Handling Fetching Delays on page 42. Optional.
maxage
    New in VoiceXML 2.0. Specifies the maximum acceptable age, in seconds, of the cached resource. See Maximum Age on page 44. Optional.
maxstale
    New in VoiceXML 2.0. Specifies the maximum acceptable time, in seconds, during which an expired cached resource can still be used. See Maximum Stale Time on page 44. Optional.
srcexpr
    JavaScript expression that evaluates to the script URI. Optional (as alternative to src or an inline script). If you specify this attribute, the element cannot have content.

(VoiceXML 1.0 only) The following attributes can be used in applications in which the version attribute of the <vxml> tag is set to 1.0.

caching
    VoiceXML 1.0 only. Specifies the caching policy for the resource being fetched. See Caching on page 45. Optional. Used in place of the VoiceXML 2.0 attributes maxage and maxstale.

Usage

Tips:

Inside a <script> element, if your JavaScript script contains any of the characters <, >, or &, that character must be escaped. The easiest way to do this is to place the entire script inside a CDATA section, as in:

    <script>
      <![CDATA[
        function factorial(n) {
          return (n <= 1)? 1 : n * factorial(n-1);
        }
      ]]>
    </script>

Remember that any variables or functions declared in a script are valid only within the scope that contains the <script> element. Define the script in document scope (in the <vxml> element) if it defines items that you want to use in several dialogs or blocks. Any items defined by a script inside a block are accessible only within that block.

In the following example, the first block contains a script that defines the function foo. That function can be used in a JavaScript expression later in the same block. However, it is illegal to use the function in a different block. The <value> tag in the second block will fail because the function foo goes out of scope when the interpreter leaves the first block.

    <block>
      <script>
        <![CDATA[
          function foo() { return 1; }
        ]]>
      </script>
      Foo is <value expr="foo()"/>
    </block>
    <block>
      Foo is <value expr="foo()"/>
    </block>

    Parents: <bevocal:foreach> <block> <catch> <error> <filled> <form> <help> <if> <menu> <noinput> <nomatch> <vxml>
    Children: None.

See Also

    VoiceXML 2.0 Specification: <script>
    JavaScript Quick Reference
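The scoping rule above can be sketched with a script placed in document scope. This is a minimal illustration under stated assumptions, not an example from the guide: the function name clamp and the form id demo are hypothetical, and the DOCTYPE declaration is omitted. Because the <script> is a child of <vxml>, both blocks should be able to call the function.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Document scope: items defined here are visible to every dialog. -->
  <script>
    <![CDATA[
      // Hypothetical helper; CDATA lets < and > appear unescaped.
      function clamp(n, low, high) {
        return (n < low) ? low : ((n > high) ? high : n);
      }
    ]]>
  </script>
  <form id="demo">
    <block>
      The clamped low value is <value expr="clamp(-5, 0, 10)"/>
    </block>
    <block>
      <!-- A second block can still see clamp, unlike a block-scoped script. -->
      The clamped high value is <value expr="clamp(50, 0, 10)"/>
    </block>
  </form>
</vxml>
```

Moving the same <script> inside the first <block> would reproduce the failure described above, since the function would go out of scope before the second block runs.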

Examples

Example 1: script to get the current time:

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="form">
        <block name="time">
          <var name="hours"/>
          <var name="minutes"/>
          <var name="seconds"/>
          <script>
            var d = new Date();
            hours = d.getHours();
            minutes = d.getMinutes();
            seconds = d.getSeconds();
          </script>
          <prompt>
            The time is now <value expr="hours"/> hours
            <value expr="minutes"/> minutes, and
            <value expr="seconds"/> seconds.
          </prompt>
        </block>
      </form>
    </vxml>

Example 2: script that defines a factorial function:

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <property name="universals" value="help" />
      <script>
        <![CDATA[
          function factorial(n) {
            return (n <= 1)? 1 : n * factorial(n-1);
          }
        ]]>
      </script>
      <var name="max" expr="10"/>
      <form id="form-factorial">
        <field name="fact" slot="num" type="number">
          <prompt>
            Tell me a number and I'll tell you its factorial.
          </prompt>
          <catch event="help">
            <prompt>
              Please say a number more than zero and less than <value expr="max"/>
            </prompt>
          </catch>

          <filled>
            <if cond="fact &lt; max">
              <prompt>
                <value expr="fact" /> factorial is <value expr="factorial(fact)"/>
              </prompt>
            <else/>
              <prompt>
                Please choose a number less than <value expr="max"/>.
              </prompt>
              <clear namelist="fact" />
            </if>
          </filled>
        </field>
      </form>
    </vxml>

<send>

VoiceXML 1.0 only; Experimental Extension. Submits values to a web server without transitioning to a new VoiceXML document.

Syntax

    <send
      url="uri"
      expr="js_expression"
      method="get" | "post"
      enctype=mime_type
      namelist="variable1..."
      fetchtimeout="time_interval"
      fetchaudio="uri"
    />

Note: VoiceXML 2.0 applications should use the BeVocal VoiceXML extension <data> tag instead of this tag. This tag is obsolete and will be removed in a future release of BeVocal VoiceXML.

This tag submits the specified data to a web server; when the web server returns a successful HTTP result code, execution of the current document continues with the tag following the <send>. This tag is useful when you need to save data (for example, the result of a <record>) in a database but do not need to transfer to a new document.

The servlet or CGI script to which <send> submits data must return a valid HTTP reply, including headers and at least one blank line indicating the end of the headers. If the returned HTTP result code does not indicate success, an error.badfetch is thrown to signal a server error to the VoiceXML application. The application can catch the error and play an error message or take some other action to alert the user.

Attribute

url
    URI to which to submit the values. Optional (as alternative to expr).
expr
    A JavaScript expression that evaluates to the URI to which to submit the values. Optional (as alternative to url).
method
    The query request method. Optional (default is get).
enctype
    MIME encoding of the submitted document. Optional (default is application/x-www-form-urlencoded). The supported types are:
      application/x-www-form-urlencoded
      multipart/form-data
    The type multipart/form-data is more efficient when submitting large amounts of binary data.

namelist
    Space-separated list of variables to submit. Optional (default is to submit all input variables that have been given explicit names with the name attribute of <field>, <record>, <transfer>, or <subdialog>). This attribute can specify any variable currently in scope, both VoiceXML variables and JavaScript variables, including shadow variables and other variables that have not been explicitly declared. A variable set to a JavaScript object is submitted as the individual component values; see Submitting Complex JavaScript Objects on page 46.
fetchtimeout
    Specifies the interval to wait for the resource to be returned before throwing an error.badfetch event. See Handling Fetching Delays on page 42. Optional.
fetchaudio
    Specifies the URI of an audio clip to play while a resource is being fetched. See Background Audio on page 42. Optional.

Usage

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

    Parents: <bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>
    Children: None.

See Also

    Related tags: <submit>

Example

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 1.0//EN" "
    <vxml version="1.0">
      <form id="form-record">
        <field name="name" type="boolean">
          <prompt>
            Say yes to send your data to Snoop Servlet or no to exit
          </prompt>
          <filled>
            <if cond="name">
              <send method="post" url="
              <prompt>data sent successfully!</prompt>
            </if>
            <prompt> Good bye! </prompt>
            <exit/>
          </filled>
        </field>
      </form>
    </vxml>
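The error handling described for <send> can be sketched as follows. This is an illustration only: the URL http://example.com/save, the form id, and the field name confirm are hypothetical, and the DOCTYPE declaration is omitted. The sketch assumes that a failed submission throws error.badfetch, as the text above states, and catches it at field level.

```xml
<?xml version="1.0"?>
<vxml version="1.0">
  <form id="save-choice">
    <field name="confirm" type="boolean">
      <prompt> Shall I save your choice? </prompt>
      <filled>
        <if cond="confirm">
          <!-- Hypothetical endpoint; only the confirm variable is submitted. -->
          <send method="post" url="http://example.com/save"
                namelist="confirm" fetchtimeout="10s"/>
          <!-- Reached only if the server returned a successful result code. -->
          <prompt> Your choice was saved. </prompt>
        </if>
        <exit/>
      </filled>
      <!-- Runs if the server does not return a successful HTTP result code. -->
      <catch event="error.badfetch">
        <prompt> Sorry, I could not save your choice. </prompt>
        <exit/>
      </catch>
    </field>
  </form>
</vxml>
```

Listing the variables explicitly in namelist keeps the request small; leaving it out would submit every named input variable by default.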

<sentence>

New in VoiceXML 2.0. Speech Synthesis Markup Language element that classifies a region of text as a sentence.

Syntax

    <sentence
      xml:lang="lang"
    >
      Text
    </sentence>

Identifies enclosed text as a sentence for interpretive purposes. For brevity, the tag <s> is an exact equivalent of this tag.

Attribute

xml:lang
    Not Supported. The language and optional country locale identifier for the sentence. Optional.

Usage

    Parents: <audio> <bevocal:whisper> <choice> <enumerate> <p> <paragraph> <prompt> <prosody> <voice>
    Children: <audio> <break> <emphasis> <enumerate> <mark> <phoneme> <prosody> <say-as> <value> <voice>

See Also

    VoiceXML 2.0 Specification: <sentence>
    Related tags: <break>, <emphasis>, <mark>, <paragraph>, <phoneme>, <prosody>, <say-as>, <voice>

<speak>

The <speak> element is the root element of a standalone SSML document and contains all other SSML elements. The <speak> element is not a VoiceXML 2.0 element.

Syntax

    <speak
      version="1.0"
      xml:lang="lang"
    >
      Text
    </speak>

Through the BeVocal VoiceXML extension of <audio> with the attributes bevocal:ssml and bevocal:ssmlexpr, you can refer to standalone SSML documents, which must contain the <speak> element as the root element.

Attribute

xml:lang
    Not Supported. The language and optional country locale identifier for the SSML document. Optional.

Usage

    Parents: None (root element of a standalone SSML document).
    Children: All SSML elements.

See Also

    SSML Specification: <speak>
    <audio> tag
    Chapter 9, Dynamic SSML

<sub>

New in VoiceXML 2.0. Speech Synthesis Markup Language element whose alias attribute provides substitute text to be spoken instead of the contained text. This allows the document to contain both a written and a spoken form for a string.

Syntax

    <sub
      alias="substitutetext"
    >
      OriginalText
    </sub>

Attribute

alias
    The string to be substituted for the OriginalText string.

Usage

    Parents: <speak>
    Children: None.

See Also

    VoiceXML 2.0 Specification: <sub>
    Related tags: <say-as>, <speak>
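A standalone SSML document that uses <sub> might look like the sketch below. The wording of the text is illustrative. The xml:lang attribute is left out because the guide marks it as not supported, and <sub> is placed directly under <speak> because that is the only parent the guide lists.

```xml
<?xml version="1.0"?>
<speak version="1.0">
  The grammar format is published by the
  <!-- The written form W3C is kept; the alias text is what gets spoken. -->
  <sub alias="World Wide Web Consortium">W3C</sub>.
</speak>
```

A document like this is the kind of resource that the bevocal:ssml attribute of <audio> is described as referring to.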

<subdialog>

Invokes another dialog as a subdialog of the current one.

Syntax

    <subdialog
      name="string"
      expr="js_expression"
      cond="js_expression"
      src="uri"
      srcexpr="js_expression"
      method="get" | "post"
      enctype=mime_type
      namelist="variable1..."
      fetchhint="prefetch" | "safe"
      fetchtimeout="time_interval"
      fetchaudio="uri"
      maxage="time_interval"
      maxstale="time_interval"
    >
      Child Elements
    </subdialog>

Input item that invokes a subdialog, which is a reusable dialog that is specially coded to pass back values with a <return> element. When control returns from the subdialog, the returned values are available as properties of the <subdialog> input variable.

You pass values into the subdialog by including <param> child elements. The subdialog must contain a <var> declaration for each parameter passed to it. If the <var> contains an initializing expr attribute, that initializing value is ignored and the value passed with the <param> element is used instead.

When a subdialog is invoked, it runs in a new execution context. This means that all variables and state information in the subdialog's document are reinitialized (including the application root document, if one is used). Any changes that the subdialog makes to application-scoped variables apply only in the subdialog context. When the subdialog returns, its context is destroyed and the context of the calling dialog is in the same state it was in before the subdialog call. The only way for the subdialog to return information to its calling dialog is with a <return> element.

If the subdialog invokes other dialogs, those dialogs are also run in the new execution context. The new context terminates and the old context resumes only when the subdialog or another dialog it has invoked calls <return> to pass back the results.

Note: As a BeVocal extension, the BeVocal VXML Interpreter treats <subdialog> as executable content. This allows a subdialog to be invoked from within any tag that allows executable content, such as a <block> form item. If the <subdialog> is inside a <form> element, then upon return, execution proceeds to any applicable <filled> elements in the calling context. If you use <subdialog> in an executable context, such as within an <if> element, the <filled> element does not work. Instead, you should use <assign> directly after the subdialog call.

Attribute

name
    Name of the input variable used to store the results of the subdialog. The input variable has dialog (form) scope; its name must be unique among all VoiceXML and JavaScript variables within the form's scope. You can access the return values as properties of the input variable using the syntax: name.returnvariablename
expr
    JavaScript expression that assigns the initial value of the input variable. Optional (default is undefined). If you set the input variable to a value other than undefined, you'll need to clear it before the <subdialog> can execute.
cond
    JavaScript boolean expression that also must evaluate to true for the <subdialog> to execute. Optional (default is true). If not specified, the value of the input variable alone determines whether or not the <subdialog> can execute.
src
    URI of the subdialog. You can use the #DialogName syntax to refer to another dialog in the current document. Even in this case, the subdialog is run in a new context. Note that the method, enctype, namelist, caching, fetchtimeout, and fetchaudio attributes apply only if src points to a different document (as opposed to starting with # to invoke a subdialog in the current document).
srcexpr
    A JavaScript expression yielding the URI of the subdialog.
method
    The query request method, either get or post. Optional (default is get).
enctype
    MIME encoding of the submitted document. Optional (default is application/x-www-form-urlencoded). The supported types are:
      application/x-www-form-urlencoded
      multipart/form-data
    The type multipart/form-data is more efficient when submitting large amounts of binary data.
namelist
    Space-separated list of variables to submit. Optional (default is to submit nothing). This attribute can specify any variable currently in scope, both VoiceXML variables and JavaScript variables, including shadow variables and other variables that have not been explicitly declared. A variable set to a JavaScript object is submitted as the individual component values; see Submitting Complex JavaScript Objects on page 46.
fetchhint
    Specifies whether the interpreter can attempt to optimize dialog interpretation by prefetching the resource. See Prefetching Resources on page 40. Optional.
fetchtimeout
    Specifies the interval to wait for the resource to be returned before throwing an error.badfetch event. See Handling Fetching Delays on page 42. Optional.
fetchaudio
    Specifies the URI of background audio to be played during fetching. Note that this attribute and related properties affect whether queued prompts are played first. See Background Audio on page 42 for important details. Optional.

maxage
    New in VoiceXML 2.0. Specifies the maximum acceptable age, in seconds, of the cached resource. See Maximum Age on page 44. Optional.
maxstale
    New in VoiceXML 2.0. Specifies the maximum acceptable time, in seconds, during which an expired cached resource can still be used. See Maximum Stale Time on page 44. Optional.

(VoiceXML 1.0 only) The following attributes can be used in applications in which the version attribute of the <vxml> tag is set to 1.0.

caching
    VoiceXML 1.0 only. Specifies the caching policy for the resource being fetched. See Caching on page 45. Optional. Used in place of the VoiceXML 2.0 attributes maxage and maxstale.
modal
    VoiceXML 1.0 only. Boolean value that must be false to enable <link> grammars from the calling context. Optional (default is true). Lets you alter default behavior so that <link> grammars from the calling context can be active while the subdialog executes.
    Note: In VoiceXML 2.0, all subdialogs are modal. To ensure portability, your VoiceXML applications should avoid using this attribute.

By default, no grammars from the calling dialog's context are active, except any default grammars defined by the VoiceXML interpreter. However, if the modal attribute is false, <link> elements in the calling dialog's context are active. When a grammar from the calling context is triggered during execution of a subdialog, the subdialog context terminates and control returns to the calling context.

If an event is thrown during execution of a subdialog and no event handler for the event is found in the subdialog context, the interpreter's response depends on whether the subdialog is modal. If the subdialog is modal, a fatal error occurs, causing the interpreter to exit. If the subdialog is non-modal, the interpreter causes the subdialog's context to return. It then rethrows the event in the calling context and starts its search for the event handler in that context.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

Usage

    Parents: <bevocal:foreach> <block> <catch> <error> <filled> <form> <help> <if> <noinput> <nomatch>
    Children: <audio> <catch> <enumerate> <error> <filled> <help> <noinput> <nomatch> <param> <prompt> <property> <value>

See Also

    VoiceXML 2.0 Specification: <subdialog>
    Related tags: <param>, <return>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="main">
        <block>We're about to call the sub dialog</block>
        <subdialog name="result" src="#subbie">
          <param name="hello" value="goodbye"/>
          <param name="goodbye" value="goodbye"/>
        </subdialog>
        <block>
          We're back from the sub dialog.
          Result dot hello equals <value expr="result.hello"/>
          Result dot goodbye equals <value expr="result.goodbye"/>
        </block>
      </form>
      <form id="subbie">
        <var name="hello" expr="'hello'"/>
        <var name="goodbye"/>
        <block>
          This is the sub dialog.
          hello equals <value expr="hello"/>
          goodbye equals <value expr="goodbye"/>
          <return namelist="hello goodbye"/>
        </block>
      </form>
    </vxml>
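When <subdialog> is used as a form item, the returned values can also be examined in a <filled> child element. The following is a minimal sketch under stated assumptions: the form ids order and confirm, the parameter item, and the returned variable answer are all hypothetical, and the DOCTYPE declaration is omitted.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="order">
    <subdialog name="confirmation" src="#confirm">
      <!-- Passed in; the subdialog declares a matching <var>. -->
      <param name="item" expr="'coffee'"/>
      <filled>
        <!-- Returned values are properties of the input variable. -->
        <if cond="confirmation.answer">
          <prompt> Your order is placed. </prompt>
        <else/>
          <prompt> Your order is cancelled. </prompt>
        </if>
      </filled>
    </subdialog>
  </form>

  <form id="confirm">
    <var name="item"/>
    <field name="answer" type="boolean">
      <prompt> Do you want to order <value expr="item"/>? </prompt>
      <filled>
        <!-- The only way to hand a result back to the caller. -->
        <return namelist="answer"/>
      </filled>
    </field>
  </form>
</vxml>
```

Because the subdialog runs in a new execution context, the caller sees only what <return> lists in its namelist.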

<submit>

Submits values to a document server. The <submit> element is used to submit information to the origin web server and then transition to the document sent back in the response.

Syntax

    <submit
      next="uri"
      expr="js_expression"
      method="get" | "post"
      enctype=mime_type
      namelist="variable1..."
      fetchtimeout="time_interval"
      fetchaudio="uri"
      maxage="time_interval"
      maxstale="time_interval"
    />

Attribute

next
    URI to which to submit the values. Optional (as alternative to expr).
expr
    JavaScript expression that evaluates to the URI to which to submit the values. Optional (as alternative to next).
method
    The query request method. Optional (default is get).
enctype
    MIME encoding of the submitted document. Optional (default is application/x-www-form-urlencoded). The supported types are:
      application/x-www-form-urlencoded
      multipart/form-data
    The type multipart/form-data is more efficient when submitting large amounts of binary data.
namelist
    Space-separated list of variables to submit. Optional (default is to submit all input variables that have been given explicit names with the name attribute of <field>, <record>, <transfer>, or <subdialog>). This attribute can specify any variable currently in scope, both VoiceXML variables and JavaScript variables, including shadow variables and other variables that have not been explicitly declared. A variable set to a JavaScript object is submitted as the individual component values; see Submitting Complex JavaScript Objects on page 46.
fetchtimeout
    Specifies the interval to wait for the resource to be returned before throwing an error.badfetch event. See Handling Fetching Delays on page 42. Optional.
fetchaudio
    Specifies the URI of background audio to be played during fetching. Note that this attribute and related properties affect whether queued prompts are played first. See Background Audio on page 42 for important details. Optional.

maxage
    New in VoiceXML 2.0. Specifies the maximum acceptable age, in seconds, of the cached resource. See Maximum Age on page 44. Optional.
maxstale
    New in VoiceXML 2.0. Specifies the maximum acceptable time, in seconds, during which an expired cached resource can still be used. See Maximum Stale Time on page 44. Optional.

(VoiceXML 1.0 only) The following attributes can be used in applications in which the version attribute of the <vxml> tag is set to 1.0.

caching
    VoiceXML 1.0 only. Specifies the caching policy for the resource being fetched. See Caching on page 45. Optional. Used in place of the VoiceXML 2.0 attributes maxage and maxstale.

Usage

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

    Parents: <bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>
    Children: None.

See Also

    VoiceXML 2.0 Specification: <submit>
    Related tags: <goto>, <send>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="form-record">
        <field name="name" type="boolean">
          <prompt>
            Say yes or no to submit a request to Snoop Servlet
          </prompt>
          <filled>
            <if cond="name">
              <submit maxage="0" method="post" next="
            <else/>
              <prompt> Good bye! </prompt>
              <exit/>
            </if>
          </filled>
        </field>
      </form>
    </vxml>
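The expr and namelist attributes can be combined to build the target URI at run time and to limit what is sent. The sketch below is illustrative only: the server address http://example.com, the form id, and the field names rating and recommend are hypothetical, and the DOCTYPE declaration is omitted.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="survey">
    <!-- Hypothetical server address held in a variable. -->
    <var name="server" expr="'http://example.com'"/>
    <field name="rating" type="digits">
      <prompt> Please rate our service from one to five. </prompt>
    </field>
    <field name="recommend" type="boolean">
      <prompt> Would you recommend us to a friend? </prompt>
    </field>
    <block>
      <!-- expr is used instead of next; only the two fields are submitted. -->
      <submit expr="server + '/survey'" method="post"
              namelist="rating recommend" fetchtimeout="15s"/>
    </block>
  </form>
</vxml>
```

Unlike <send>, control does not come back to this document: the interpreter transitions to whatever document the server returns.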

<tag>

New in VoiceXML 2.0. XML grammar element that specifies how to interpret the user input.

Syntax

    <tag>
      slotname="value"
    </tag>

Usage

This tag is used to define grammars in the XML form of the W3C Speech Recognition Grammar Format.

In general, a tag is an arbitrary string that may be included within any rule expansion. You can include as many tags as you want within a single expansion. Tags do not affect what constitutes a legal utterance for a rule, nor do they affect how the recognition proceeds.

You use tags to return information (a semantic interpretation) about a recognition to the element that invoked the grammar. Upon successful recognition, the BeVocal VoiceXML interpreter creates a JavaScript object whose properties (slotname) and values (value) are determined by the tags occurring in the matched rule. If you want the BeVocal VoiceXML interpreter to make use of your tags for this purpose, they must be of the format specified above. See Chapter 1, Using VoiceXML Grammars in the Grammar Reference for information on how the interpreter uses the semantic interpretation.

    Parents: <rule> <item>
    Children: None.

Example

The following grammar sets two different tags:

    <grammar...>
      <rule id="coloredobject">
        <ruleref id="color"/>
        <ruleref id="object"/>
      </rule>
      <rule id="color">
        <one-of>
          <item> red <tag> color="red" </tag> </item>
          <item> pink <tag> color="red" </tag> </item>
          <item> yellow <tag> color="yellow" </tag> </item>
          <item> canary <tag> color="yellow" </tag> </item>
          <item> green <tag> color="green" </tag> </item>
          <item> khaki <tag> color="green" </tag> </item>
        </one-of>
      </rule>

      <rule id="object">
        <one-of>
          <item>
            <tag> object="vehicle" </tag>
            <one-of>
              <item>truck</item>
              <item>car</item>
            </one-of>
          </item>
          <item>
            <tag> object="toy" </tag>
            <one-of>
              <item>ball</item>
              <item>block</item>
            </one-of>
          </item>
          <item>
            <tag> object="clothing" </tag>
            <one-of>
              <item>shirt</item>
              <item>blouse</item>
            </one-of>
          </item>
        </one-of>
      </rule>
    </grammar>

This grammar recognizes phrases such as "green truck" or "khaki car". For both of those phrases, it will return the same semantic interpretation:

    {
      color: green;
      object: vehicle;
    }

See Also

    Speech Recognition Grammar Specification: <tag>
    Chapter 1, Using VoiceXML Grammars and Chapter 4, XML Speech Grammar Format in the Grammar Reference
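A form could consume the color and object slots along the following lines. This is a hedged sketch, not the guide's own example: it assumes the grammar above is saved in a hypothetical file named coloredobject.grxml, that the form-level grammar fills fields whose names match the slot names (the standard VoiceXML 2.0 mixed-initiative pattern), and it omits the DOCTYPE declaration.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="pick">
    <!-- Hypothetical file holding the colored-object grammar. -->
    <grammar src="coloredobject.grxml" type="application/srgs+xml"/>
    <initial name="start">
      <prompt> Describe an item, for example green truck. </prompt>
    </initial>
    <!-- Field names match the slot names set by the <tag> elements. -->
    <field name="color">
      <prompt> What color? </prompt>
    </field>
    <field name="object">
      <prompt> What kind of object? </prompt>
    </field>
    <filled mode="all" namelist="color object">
      <prompt>
        You chose a <value expr="color"/> <value expr="object"/>.
      </prompt>
    </filled>
  </form>
</vxml>
```

With this arrangement, saying "khaki car" and saying "green truck" would both lead to the same prompt, because the tags normalize the spoken words to the slot values green and vehicle.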

<throw>

Throws an event.

Syntax

    <throw
      event="event"
      eventexpr="js_expression"
      message="string"
      messageexpr="js_expression"
    />

You can throw either a predefined event or an application-specific event. For more information, see Chapter 3, Event Handling.

Attribute

event
    Event to throw. Optional (as alternative to eventexpr).
eventexpr
    New in VoiceXML 2.0. JavaScript expression that evaluates to the event to throw. Optional (as alternative to event).
message
    New in VoiceXML 2.0. Message string providing additional context about the event being thrown. Optional.
messageexpr
    New in VoiceXML 2.0. JavaScript expression that evaluates to the message string. Optional.

Exactly one of the attributes event or eventexpr must be specified. At most one of the attributes message or messageexpr may be specified. Within an event handler that catches the thrown event, the variable _message contains the message string.

Usage

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

    Parents: <bevocal:foreach> <block> <catch> <error> <filled> <help> <if> <noinput> <nomatch>
    Children: None.

See Also

    VoiceXML 2.0 Specification: <throw>
    Related tags: <catch>, <error>, <help>, <noinput>, <nomatch>, <rethrow>

Examples

See other catch and throw examples under <catch> and <rethrow>.

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <catch event="myevent">
        <prompt>
          Catch of my event at document scope.
          But here the value of my number has gone away.
          It sounds like <value expr="mynumber"/>
        </prompt>
        <reprompt/>
      </catch>
      <catch event="nomatch noinput" >
        <prompt> Catch at document scope. </prompt>
        <reprompt/>
      </catch>
      <property name="universals" value="help" />
      <help>
        <prompt> Catch for help at document scope. </prompt>
        <reprompt/>
      </help>
      <form id="numbers">
        <catch event="myevent">
          <prompt>
            Catch of my event at form scope.
            <value expr="_message"/>
            The value of my number has not gone away.
            It sounds like <value expr="mynumber"/>
          </prompt>
          <clear/>
          <rethrow/>
        </catch>
        <block name="numbergame">
          This is a test of the throw tag.
        </block>
        <field name="mynumber" type="number">
          <prompt> Tell me a number greater than ten. </prompt>
          <filled>
            <if cond="mynumber &lt; 10">
              <throw event="myevent" messageexpr="mynumber + ' is less than ten'"/>
            <elseif cond="mynumber &gt; 10" />

              <prompt> You said <value expr="mynumber"/> </prompt>
            <else/>
              <throw event="myevent" message="ten is not greater than ten"/>
            </if>
          </filled>
        </field>
      </form>
    </vxml>
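The eventexpr and messageexpr attributes let the event name and message be computed at run time. The sketch below is illustrative: the event names under com.example.account and the variable problem are hypothetical, the DOCTYPE declaration is omitted, and it assumes the interpreter applies the VoiceXML 2.0 rule that a <catch> matches any event whose name begins with the handler's event name.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Assumed to match com.example.account.locked by prefix. -->
  <catch event="com.example.account">
    <prompt>
      There is a problem with your account.
      <value expr="_message"/>
    </prompt>
    <exit/>
  </catch>
  <form id="check">
    <!-- Hypothetical status value; a real application would compute it. -->
    <var name="problem" expr="'locked'"/>
    <block>
      <throw eventexpr="'com.example.account.' + problem"
             messageexpr="'The account is ' + problem + '.'"/>
    </block>
  </form>
</vxml>
```

A single handler can then cover a whole family of application-defined events, with _message carrying the detail.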

<token>

New in VoiceXML 2.0. XML grammar input element that specifies actual words to be spoken or DTMF keys to be pressed.

Syntax

    <token
      xml:lang="lang"
    >
      Content
    </token>

Usage

This tag is used to define grammars in the XML form of the W3C Speech Recognition Grammar Format.

In a voice grammar, a token consists of CDATA and cannot be enclosed in double quotes.

In a DTMF grammar, a token must be one of the following characters: 0 1 2 3 4 5 6 7 8 9 * # A B C D

Attribute

xml:lang
    The language and optional country locale identifier for the token. Optional (default is the language of the enclosing element). The accepted language identifiers are:
      en     English
      en-us  United States English
      es     Spanish
      es-us  United States Spanish
      fr-ca  French Canadian
    If an unsupported language is specified, an error.unsupported.language event is thrown.

    Parents: <rule> <item>
    Children: None.

See Also

    Speech Recognition Grammar Specification: <token>
    Chapter 4, XML Speech Grammar Format in the Grammar Reference

Example

    <rule id="trigger">
      <ruleref special="garbage"/>
      <token>trigger Fish</token>
    </rule>
    <rule id="affirmative">
      <token>yes</token>
      <token xml:lang="es">si</token>
    </rule>
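For comparison with the voice rules above, a DTMF grammar built from <token> elements might be sketched as follows. The rule id yesno and the slot name answer are hypothetical; the mode, version, root, and xmlns attributes follow the W3C Speech Recognition Grammar Specification, and the <tag> content uses the slotname="value" format described for the <tag> element.

```xml
<?xml version="1.0"?>
<grammar mode="dtmf" version="1.0" root="yesno"
         xmlns="http://www.w3.org/2001/06/grammar">
  <rule id="yesno" scope="public">
    <one-of>
      <!-- Each token is a single DTMF key. -->
      <item> <token>1</token> <tag> answer="yes" </tag> </item>
      <item> <token>2</token> <tag> answer="no" </tag> </item>
    </one-of>
  </rule>
</grammar>
```

Pressing 1 or 2 would then yield an interpretation with the answer slot set to yes or no.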

<transfer>

Transfers the user's call to a third party at another destination.

Syntax

    <transfer
      name="string"
      expr="js_expression"
      cond="js_expression"
      dest="uri"
      destexpr="js_expression"
      connecttimeout="time_interval"
      maxtime="time_interval"
      bevocal:maxtimeexpr="js_expression"
      bridge="boolean"
      bevocal:type="blind" | "bridge" | "supervised"
      type="blind" | "bridge" | "consultation"
      transferaudio="uri"
      aai="string"
      aaiexpr="js_expression"
    >
      Child Elements
    </transfer>

Input item for transferring the user to another phone number. The transfer may be done in one of three ways:

Transfer Method

Bridge transfer
    The current session with the interpreter resumes after the call with the third party completes.
Blind transfer
    The current session terminates as soon as it starts the transfer, regardless of the success of the transfer.
Supervised or consultation transfer
    The current session terminates as soon as the transfer successfully connects the outbound call. If the transfer is unsuccessful, control returns to the application.

Note: To use blind or supervised transfers, contact BeVocal Customer Support.

Only one outbound call can be in progress at a given time. The call placed by one <transfer> or <bevocal:dial> tag must be terminated before another <transfer> or <bevocal:dial> tag can be executed.

Café customers can use a <transfer> tag to place local and long-distance calls only; international calls are not allowed. (Hosting customers are allowed to make international calls.)

During execution of the <transfer> element, the user and the called third party talk to each other. The application is quiet. No universal grammars or grammars in higher scopes are active.

During a bridge transfer, if the <transfer> element includes any child grammars, the application listens to the user. If a user utterance matches a child grammar, the transfer is terminated and the input variable is set to the status near_end_disconnect. The bevocal.hotwordmin and bevocal.hotwordmax properties specify the minimum and maximum time duration, respectively, for an utterance that can terminate the call.

During execution of the <transfer> element, the bargein type is always hotword. That is, the interpreter ignores the value of the bargeintype property and the value of the bargeintype attribute of any enclosed <prompt> element.

The application ignores all speech and DTMF signals from the called third party.

Attribute

name
    Name of the input variable used to store the result of the transfer. The variable name may not be a JavaScript reserved keyword. The input variable has dialog (form) scope; its name must be unique among all VoiceXML and JavaScript variables within the form's scope. Possible values for the variable are:
      busy                 The call was refused by the endpoint (the number was busy).
      noanswer             There was no answer within the time allowed for making the connection.
      network_busy         The call was refused by an intermediate network.
      near_end_disconnect  The call completed because the user hung up or said something that matched a child grammar.
      far_end_disconnect   The call completed because the called third party hung up.
      network_disconnect   The call completed and was terminated by the network.
      maxtime_disconnect   The call was terminated because the maximum time (specified by the maxtime attribute) was exceeded.
expr
    JavaScript expression that assigns the initial value of the input variable. Optional (default is undefined). If you set the input variable to a value other than undefined, you'll need to clear it before the <transfer> can execute.
cond
    JavaScript boolean expression that also must evaluate to true for the <transfer> to execute. Optional (default is true). If not specified, the value of the input variable alone determines whether or not the <transfer> can execute.

dest
    URI of the destination (for example, phone, IP telephony address). Optional (as alternative to destexpr). You can specify the URI using any of the following formats:
        phone://
        phone://
        tel:
        tel: ;postd=1234
    The last format allows you to specify an extension as post-dial digits (postd). After the call is answered, the specified digits (1234 in this example) are sent to the called third party as DTMF.
    SIP URIs are specified as:
        sip:<destination value>:<port>
    where the default port is
    SIP URIs are valid only for VoIP calls.
    Note: A leading 1 on the phone number is optional and will be ignored.
    Note: You must specify one of dest or destexpr, but not both.
destexpr
    JavaScript expression that evaluates to the URI of the destination. Optional (as alternative to dest).
    Note: You must specify one of dest or destexpr, but not both.
connecttimeout
    Time to wait for the outbound call to connect before returning a value of noanswer. Optional (default is 20 seconds). Express the time interval as an unsigned number followed by s for time in seconds; ms for time in milliseconds (the default).
maxtime
    How long the outbound call is allowed to last. Optional.
    The default for Café customers is 60 seconds. This is an absolute maximum. If the time specified with this attribute is longer than 60 seconds, the interpreter sets the limit to 60 seconds.
    The default for hosting customers is 0, signifying no limit.
    Express a time interval as an unsigned number followed by s for time in seconds; ms for time in milliseconds (the default).
bevocal:maxtimeexpr
    Extension. A JavaScript expression which resolves to the maxtime value. Optional.
    Again, the default for Café customers is 60 seconds. This is an absolute maximum. If the time specified with this attribute is longer than 60 seconds, the interpreter sets the limit to 60 seconds.
    The default for hosting customers is 0, signifying no limit.

bridge
    Determines whether the current session will resume after the transferred call completes. For more flexibility, you can instead specify a value for the bevocal:type attribute.
    Optional. The default value is false. However, to use blind transfers, you must make special arrangements with BeVocal Customer Support.
    If you set this attribute to false and arrangements have not been made to support blind transfers, an error.unsupported.transfer.blind event is generated.
    If you set this attribute to false and arrangements have been made to support blind transfers, the current session terminates by throwing a transfer event (connection.disconnect.transfer in VoiceXML 2.0, telephone.disconnect.transfer in VoiceXML 1.0) when the transfer is made.
bevocal:type
    Extension. Determines how much control the platform retains over a transferred call. Optional; if not specified, the platform uses the value of the bridge attribute.
    If the value is blind, as soon as the transfer starts, the current VoiceXML session ends and relinquishes control to the outbound call. Even if the transfer does not connect the call, the session is over.
    If the value is bridge, then the current VoiceXML session remains active. At the end of the outbound call, the session returns control to the application to resume processing.
    If the value is supervised, an intermediate path occurs. The current VoiceXML session monitors the progress of the outbound call until it is connected. If the call cannot be connected for some reason such as no answer or line busy, the session remains active and returns control to the application. If the call is connected, then the session ends, just as for a blind transfer.
    You cannot specify both the bridge and bevocal:type attributes. You can specify either bevocal:type or type, but not both. If you specify both, a parse error is thrown.
    Note: To use blind or supervised transfers, please contact BeVocal Customer Support.

Exactly one of dest or destexpr must be specified.

type
    Determines how much control the platform retains over a transferred call. Optional. The default is bridge.
    If the value is blind, as soon as the transfer starts, the current VoiceXML session ends and relinquishes control to the outbound call. Even if the transfer does not connect the call, the session is over.
    If the value is bridge, then the current VoiceXML session remains active. At the end of the outbound call, the session returns control to the application to resume processing.
    If the value is consultation, an intermediate path occurs. The current VoiceXML session monitors the progress of the outbound call until it is connected. If the call cannot be connected for some reason such as no answer or line busy, the session remains active and returns control to the application. If the call is connected, then the session ends, just as for a blind transfer.
    You cannot specify both the bridge and type attributes. You can specify either bevocal:type or type, but not both. If you specify both, a parse error is thrown.
    Note: To use blind or consultation transfers, please contact BeVocal Customer Support.
transferaudio
    The audio specified by the URI is played while the call connection attempt is in progress.
    Note: This attribute is available only for VoIP calls.
aai
    Application-to-application information. A string containing data sent to an application on the far end, available in the session variable session.connection.aai.
    The transmission of aai data may depend upon signaling network gateways and data translation (for example, ISDN to SIP); the status of data sent to a remote site is not known or reported.
    On the BeVocal platform, on ISDN calls, the value transmitted is UUI (User-to-User Information).
aaiexpr
    A JavaScript expression yielding the AAI data.
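The destination and timing attributes described above might be combined as in the following fragment. This is only a sketch under stated assumptions: the form and item names, the variable supportNumber, and the phone number are invented for illustration, and the fragment is assumed to sit inside a <vxml> document.

```xml
<form id="callsupport">
  <!-- Hypothetical number, used only for this illustration. -->
  <var name="supportNumber" expr="'8005550100'"/>
  <!-- destexpr builds the tel: URI at run time. Only destexpr is given,
       because dest and destexpr cannot both be specified. -->
  <transfer name="support"
            destexpr="'tel:' + supportNumber"
            bridge="true"
            connecttimeout="20s"
            maxtime="60s">
    <prompt>Connecting you to support.</prompt>
  </transfer>
</form>
```

The time intervals carry an explicit s suffix so that they are not read as milliseconds, and bridge is used on its own because it cannot be combined with bevocal:type or type.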

Properties of the Shadow Variable. Corresponding to the input variable name is a shadow variable called name$. After the input variable is filled, some additional information is available in the following properties of this shadow variable:

Property    Description

utterance
    A string representation of the words actually spoken by the user.
inputmode
    The mode in which input was provided, one of voice or dtmf.
duration
    After the transfer is complete, the floating point duration of the call in milliseconds.
recording
    If the recordutterance or bevocal.audio.capture property is set to true and the user's speech matched the field grammar, the recording property contains an audio capture of the user's speech. If no audio is collected, this variable is undefined.
    You can send the captured audio to a server using a <data> element. Doing so is useful if you need a record of the user's speech for legal reasons.
recordingsize
    The size of the recording in bytes, or undefined if no audio is collected.
recordduration
    The duration of the recording in milliseconds, or undefined if no audio is collected.

For a field whose name is name, you access the property propname of the shadow variable with the syntax:

    name$.propname

For example, you access the duration property for a field named services as:

    services$.duration
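As an illustration of reading the shadow variable, the fragment below reports the call length after a bridge transfer returns. It is a sketch, not platform sample code: the form name and the destination number are hypothetical, and the fragment is assumed to sit inside a <vxml> document.

```xml
<form id="reportduration">
  <!-- The destination number is hypothetical. -->
  <transfer name="services" bridge="true" dest="tel:8005550100">
    <filled>
      <!-- services holds the status string; services$ is its shadow variable. -->
      <if cond="services == 'far_end_disconnect'">
        <prompt>
          The other party hung up after
          <value expr="Math.round(services$.duration / 1000)"/> seconds.
        </prompt>
      </if>
    </filled>
  </transfer>
</form>
```

Because duration is reported in milliseconds, the expression divides by 1000 and rounds before the value is spoken.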

Events. The following events may be thrown during the execution of a <transfer> element:

Event    Description

connection.disconnect.hangup
    The user hung up. New in VoiceXML 2.0.
connection.disconnect.transfer
    The user was transferred unconditionally to another line and will not return. New in VoiceXML 2.0.
telephone.disconnect.hangup
    The user hung up. VoiceXML 1.0 only.
telephone.disconnect.transfer
    The user was transferred unconditionally to another line and will not return. VoiceXML 1.0 only.
error.badfetch
    The application specified a value for both bridge and bevocal:type.
error.connection.baddestination
    The destination URI specified by dest or destexpr is not valid. New in VoiceXML 2.0.
error.telephone.baddestination
    The destination URI specified by dest or destexpr is not valid. VoiceXML 1.0 only.
error.connection.noauthorization
    A Café customer attempted to make an international call. (The outbound call is not placed.) New in VoiceXML 2.0.
error.telephone.noauthorization
    A Café customer attempted to make an international call. (The outbound call is not placed.) VoiceXML 1.0 only.
error.connection.noresource
    Another outbound call is already in progress. (A new outbound call is not placed.) New in VoiceXML 2.0.
error.telephone.noresource
    Another outbound call is already in progress. (A new outbound call is not placed.) VoiceXML 1.0 only.

Usage

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

See Also

Parents
    <form>
Children
    <audio> <catch> <enumerate> <error> <filled> <grammar> <help> <noinput> <nomatch> <prompt> <value>

VoiceXML 2.0 Specification: <transfer>

Related tags: <bevocal:dial>, <disconnect>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="foo">
        <block>
          Now taking you to BeVocal Consumer Services.
        </block>
        <transfer name="services" bridge="true"
                  connecttimeout="300" dest="phone:// " />
        <block>
          Welcome back to Cafe!
        </block>
      </form>
    </vxml>
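The status values and events listed for <transfer> can be handled inside the element itself. The fragment below is a sketch of one way to do that, using the VoiceXML 2.0 event names; the form name, the prompts, and the destination number are assumptions made for illustration, and the fragment is assumed to sit inside a <vxml> document.

```xml
<form id="transferwithchecks">
  <!-- The destination number is hypothetical. -->
  <transfer name="services" bridge="true"
            connecttimeout="20s" dest="tel:8005550100">
    <filled>
      <!-- The input variable holds the status of the finished transfer. -->
      <if cond="services == 'busy'">
        <prompt>That line is busy.</prompt>
      <elseif cond="services == 'noanswer'"/>
        <prompt>Nobody answered.</prompt>
      <else/>
        <prompt>Welcome back.</prompt>
      </if>
    </filled>
    <!-- Thrown if another outbound call is still in progress. -->
    <catch event="error.connection.noresource">
      <prompt>Another call is still in progress.</prompt>
    </catch>
    <!-- Thrown if the destination URI is not valid. -->
    <catch event="error.connection.baddestination">
      <prompt>That number is not valid.</prompt>
    </catch>
  </transfer>
</form>
```

A VoiceXML 1.0 application would catch the corresponding error.telephone.* events instead.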

<value>

Inserts the value of an expression into audio output.

Syntax

    <value expr="js_expression" />

Attribute    Description

expr
    JavaScript expression to evaluate and insert in the prompt.
    Note: In VoiceXML 2.0, the recorded audio stored in the input variable of a <record> item can now be played with <audio expr="...">. Using the expr attribute of <value expr="..."> to play recorded audio is deprecated and will no longer work in the next release.

(VoiceXML 1.0 only) The following attributes can be used in applications in which the version attribute of the <vxml> tag is set to 1.0.

Attribute    Description

class, type
    VoiceXML 1.0 only. The <say-as> type of the variable for interpretive purposes. Optional. The type attribute is the VoiceXML 1.0 standard. These two attributes are identical; only one of them should be used.

Note: A VoiceXML 2.0 application should not use either of these attributes, but should instead enclose the <value> tag within a <say-as> element. There is no implicit <say-as> type invoked when you use <value>. See <say-as> for details.

The following event may be thrown during the execution of a <value> element:

error.badfetch
    Thrown in VoiceXML 2.0 when any of the following attributes are present: class, mode, or recsrc. These attributes are not valid for version 2.0.

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.
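The note about enclosing <value> in <say-as> and the tip about escape sequences can be seen together in a small fragment. This is a sketch only: the form name and the variable count are invented, the interpret-as value "number" is an assumption (see <say-as> for the values the platform actually accepts), and the fragment is assumed to sit inside a <vxml> document.

```xml
<form id="sayvalue">
  <var name="count" expr="3"/>
  <block>
    <!-- The less-than sign in the condition is written as an escape
         sequence so that the document stays well-formed XML. -->
    <if cond="count &lt; 10">
      <prompt>
        You have
        <!-- interpret-as="number" is an assumed value for illustration. -->
        <say-as interpret-as="number"><value expr="count"/></say-as>
        new messages.
      </prompt>
    </if>
  </block>
</form>
```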

See Also

Parents
    <audio> <bevocal:foreach> <bevocal:listen> <bevocal:register> <bevocal:verify> <bevocal:whisper> <block> <catch> <choice> <emphasis> <enumerate> <error> <field> <filled> <help> <if> <initial> <log> <menu> <noinput> <nomatch> <p> <paragraph> <prompt> <prosody> <record> <s> <say-as> <sentence> <subdialog> <transfer> <voice>
Children
    None.

VoiceXML 2.0 Specification: <value>

Related tags: <assign>, <var>, <field>, <record>, <subdialog>, <transfer>, <block>, <initial>, <say-as>

Examples

Example 1 with input variable:

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form>
        <field name="state">
          <prompt> Do you want texas or california? </prompt>
          <grammar type="application/x-nuance-gsl">
            [ texas california ]
          </grammar>
        </field>
        <block>
          <prompt> Your answer was <value expr="state"/> </prompt>
        </block>
      </form>
    </vxml>

Example 2 using assignment:

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <property name="universals" value="help" />
      <form id="mycalculator">
        <var name="result"/>
        <!-- Ask which operation to perform. -->
        <field name="op">
          <prompt>
            BeVocal calculator. Choose add, subtract, multiply, or divide.
          </prompt>
          <grammar type="application/x-nuance-gsl">
            [add subtract multiply divide]
          </grammar>
          <help>Say add, subtract, multiply, or divide.</help>
          <filled>
            <prompt>Okay, let's <value expr="op"/> two numbers.</prompt>
          </filled>
        </field>
        <!-- Collect the first operand and echo it back. -->
        <field name="a" type="number">
          <prompt>What's the first number?</prompt>
          <help>
            Please say a number. This number will be used as the first operand.
          </help>
          <filled>
            <prompt><value expr="a"/></prompt>
          </filled>
        </field>
        <!-- Collect the second operand, then compute and speak the result. -->
        <field name="b" type="number">
          <prompt>And the second number?</prompt>
          <help>
            Please say a number. This number will be used as the second operand.
          </help>
          <filled>
            <prompt><value expr="b"/> Okay.</prompt>
            <if cond="op=='add'">
              <!-- Number() forces numeric addition rather than string concatenation. -->
              <assign name="result" expr="Number(a) + Number(b)"/>
              <prompt>
                <value expr="a"/> plus <value expr="b"/> equals <value expr="result"/>
              </prompt>
            <elseif cond="op=='subtract'"/>
              <assign name="result" expr="a - b"/>
              <prompt>
                <value expr="a"/> minus <value expr="b"/> equals <value expr="result"/>
              </prompt>
            <elseif cond="op=='multiply'"/>
              <assign name="result" expr="a * b"/>
              <prompt>
                <value expr="a"/> times <value expr="b"/> equals <value expr="result"/>
              </prompt>
            <elseif cond="op=='divide'"/>
              <assign name="result" expr="a / b"/>
              <prompt>
                <value expr="a"/> divided by <value expr="b"/> equals <value expr="result"/>
              </prompt>
            </if>
            <!-- Reset the form items so the caller can do another calculation. -->
            <clear/>
          </filled>
        </field>
      </form>
    </vxml>

<var>

Declares a variable.

Syntax

    <var name="string" expr="js_expression" />

Description

Declaration of a variable in the scope of the enclosing element.

Attribute    Description

name
    Variable name, which must be a valid JavaScript identifier that does not begin with the underscore character (_) or end with the dollar-sign character ($); it may not be a reserved keyword in either JavaScript or Java.
expr
    JavaScript expression that assigns the initial value of the variable. Optional (default is either undefined or the current value of the variable, if already declared in this scope).

If a <var> element names a variable that is already in scope, it declares a new variable with the same name. If the <var> element has an expr attribute, the variable is assigned the specified value; otherwise, the variable is assigned the value undefined.

Usage

Tip: If a JavaScript expression contains any of the characters <, >, or &, that character must be replaced with the corresponding escape sequence &lt;, &gt;, or &amp;. For more information, see JavaScript Quick Reference.

See Also

Parents
    <bevocal:foreach> <block> <catch> <error> <filled> <form> <help> <if> <noinput> <nomatch> <vxml>
Children
    None.

VoiceXML 2.0 Specification: <var>

Related tags: <assign>, <value>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="foo">
        <var name="a"/>
        <var name="b"/>
        <var name="result"/>
        <block>
          <assign name="a" expr="10"/>
          <assign name="b" expr="35"/>
          <assign name="result" expr="a * b"/>
        </block>
        <block>
          <prompt>
            <value expr="a"/> multiplied by <value expr="b"/> equals <value expr="result"/>
          </prompt>
        </block>
      </form>
    </vxml>
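Because a <var> declares its variable in the scope of the enclosing element, where the tag is placed decides where the variable is visible. The fragment below sketches this with one variable declared in the form and one declared inside a block; the names scopes, total, and step are invented for illustration, and the fragment is assumed to sit inside a <vxml> document.

```xml
<form id="scopes">
  <!-- Declared in the form: visible to every form item in this form. -->
  <var name="total" expr="0"/>
  <block>
    <!-- Declared in the block: exists only while this block executes. -->
    <var name="step" expr="5"/>
    <assign name="total" expr="total + step"/>
  </block>
  <block>
    <!-- step is out of scope here; total keeps the value assigned above. -->
    <prompt>The total is <value expr="total"/>.</prompt>
  </block>
</form>
```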

<voice>

New in VoiceXML 2.0. Speech Synthesis Markup Language element that requests a change in speaking voice.

Syntax

    <voice
        gender="male" | "female" | "neutral"
        age="integer"
        variant="integer"
        name="string"
        xml:lang="string" >
        Text
    </voice>

Description

Attributes describe the preferred characteristics of the voice to speak the contained text.

Attribute    Description

gender
    The preferred gender. Optional. Possible values are:
        male
        female
        neutral
age
    Not Implemented. The preferred age. Optional.
variant
    Not Implemented. A preferred variant of the other voice characteristics (for example, the second or next male child voice). Optional.
name
    A platform-specific voice name to speak the contained text or a space-separated list of names ordered by decreasing preference. Currently limited to one of the following:
        jennifer    female, American English
        laurie      female, Vocalizer English
        katarina    female, RealSpeak German
        maria       female, Vocalizer Spanish
        mark        male, American English
        reed        male, Vocalizer American English
    Optional (default is current value of the bevocal.voice.name property).
xml:lang
    New in VoiceXML 2.0. Language and locale for the content. Optional (default is en-us).

Errors

An error.noresource event is thrown when an invalid TTS voice is specified in the name attribute.

See Also

Parents
    <audio> <bevocal:whisper> <choice> <emphasis> <enumerate> <p> <paragraph> <prompt> <prosody> <s> <sentence> <voice>
Children
    <audio> <break> <emphasis> <enumerate> <mark> <p> <paragraph> <phoneme> <prosody> <s> <say-as> <sentence> <value> <voice>

VoiceXML 2.0 Specification: <voice>

Related tags: <break>, <emphasis>, <mark>, <paragraph>, <phoneme>, <prosody>, <say-as>, <sentence>
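A prompt might switch between two of the voice names listed for the name attribute as sketched below. The form name and the prompt wording are invented for illustration, and the fragment is assumed to sit inside a <vxml> document.

```xml
<form id="voices">
  <block>
    <prompt>
      <voice name="jennifer">This sentence uses the jennifer voice.</voice>
      <voice name="mark">This sentence uses the mark voice.</voice>
    </prompt>
  </block>
  <!-- Thrown when the name attribute specifies an invalid TTS voice. -->
  <catch event="error.noresource">
    <prompt>The requested voice is not available.</prompt>
  </catch>
</form>
```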

<vxml>

Contains the VoiceXML code of a document.

Syntax

    <vxml
        application="uri"
        xml:base="uri"
        xml:lang="lang"
        version="version_number"
        xmlns="namespace"
        xmlns:bevocal="bvnamespace" >
        Child Elements
    </vxml>

Description

Top-level element in each VoiceXML document. Note that in VoiceXML 2.0 the xmlns attribute has been added as a required attribute.

Attribute    Description

application
    URI of this application's root document. Optional (default is not to have an application root document). If the specified document also has an application attribute, an error.semantic event is thrown.
xml:base
    Base URI. Optional.
xml:lang
    New in VoiceXML 2.0. Language and locale for this document. Optional.
    The language identifier for this document. A language identifier labels information content as being of a particular human language variant. A legal language identifier is identified by an RFC 3066 code. BeVocal currently supports English, Spanish, and French Canadian. Valid values are:
        en       English
        en-us    United States English
        es       Spanish
        es-us    United States Spanish
        fr-ca    French Canadian
    The default value is en-us. If an unsupported language is specified, an error.unsupported.language event is thrown.
    If your application consists of multiple documents and is using a language other than the default language, each document must specify the xml:lang attribute.
version
    VoiceXML version used in this document. The version number must be either 1.0 or 2.0.

xmlns
    New in VoiceXML 2.0. The designated namespace for VoiceXML 2.0. The namespace for VoiceXML is defined as this must be the value of this attribute. Required.
xmlns:bevocal
    Extension. Defines the XML namespace bevocal. Optional (default is not to define the namespace). The only valid value for this attribute is
    You must specify this attribute if this document includes any BeVocal VoiceXML extension tags in the bevocal namespace, such as <bevocal:dial>.

(VoiceXML 1.0 only) The following attributes can be used in applications in which the version attribute of the <vxml> tag is set to 1.0.

Attribute    Description

lang
    VoiceXML 1.0 only. Language and locale for this document. Optional. This attribute is ignored. Used in place of the VoiceXML 2.0 xml:lang attribute.
base
    VoiceXML 1.0 only. Base URI. Optional.

See Also

Parents
    None.
Children
    <catch> <data> <error> <form> <help> <link> <menu> <meta> <metadata> <noinput> <nomatch> <property> <script> <var>

VoiceXML 2.0 Specification: <vxml>

Examples

    <?xml version="1.0"?>
    <!DOCTYPE vxml PUBLIC "-//BeVocal Inc//VoiceXML 2.0//EN" "
    <vxml version="2.0" xmlns="
      <form id="foo">
        <block>
          <prompt>
            Hello Developers! You are the VXML Gurus.
            Please keep using our services.
          </prompt>
        </block>
      </form>
    </vxml>

12 Properties

This chapter describes the VoiceXML properties that can be set with the <property> tag. See:

    Property Summary on page 340 for an overview of the properties, grouped by function
    Property Index on page 341 for an alphabetical list of all properties

Each property description explains what the property does, what possible values the property may have, and what default value is used if the application does not set the property.

Where the BeVocal VoiceXML interpreter deviates from the VoiceXML 2.0 Specification, the difference is marked in one of the following ways:

Not implemented
    Functionality not currently available.
Extension
    Added functionality.
Deprecated
    Non-standard or superseded feature that was supported by an earlier version but has been replaced by a new feature.
VoiceXML 1.0 only
    Property is part of the VoiceXML 1.0 standard, but has been removed from VoiceXML 2.0.

Property Summary

Properties control the following facets of the interpreter.

Facet    Properties

User Input
    bargein
    bargeintype (New in VoiceXML 2.0)
    bevocal.hotwordmax (Extension)
    bevocal.hotwordmin (Extension)
    bevocal.incrementerroronnsp (Extension)
    inputmodes
    timeout
Speech Recognition
    bevocal.audio.capture (Extension)
    bevocal.audio.outputvolume (Extension)
    bevocal.finaltimeout (Extension)
    bevocal.grammar.interpretationtype (Extension)
    bevocal.grammar.phoneticpruning (Extension)
    bevocal.grammar.weightfactor (Extension)
    bevocal.grammar.wordtransitionpenalty (Extension)
    bevocal.maxdialogerrors (Extension)
    bevocal.maxerrors (Extension)
    bevocal.maxinterpretations (Extension)
    bevocal.sounds.listening (Extension)
    bevocal.sounds.maskrecognitionlatency (Extension)
    bevocal.sounds.recognition (Extension)
    bevocal.utterance.prefix (Extension)
    bevocal.vxml.maxrecognitionlatency (Extension)
    completetimeout
    confidencelevel
    incompletetimeout
    maxnbest (New in VoiceXML 2.0)
    maxspeechtimeout
    recordutterance
    recordutterancetype
    sensitivity
    speedvsaccuracy
Text-to-Speech Output
    bevocal.voice.name (Extension)
DTMF Recognition
    bevocal.dtmf.flushbuffer (Extension)
    bevocal.transfer.terminatetones (Extension)
    interdigittimeout
    termchar
    termtimeout
Background Audio
    bevocal.fetchaudio.allfetches (Extension)
    bevocal.fetchaudio.extend (Extension)
    bevocal.fetchaudio.flushqueue (Extension)
    bevocal.fetchaudio.sounds (Extension)
    fetchaudio
    fetchaudiodelay (New in VoiceXML 2.0)
    fetchaudiominimum (New in VoiceXML 2.0)

Fetching Resources
    audiofetchhint
    audiomaxage (New in VoiceXML 2.0)
    audiomaxstale (New in VoiceXML 2.0)
    caching (VoiceXML 1.0 only)
    datafetchhint (Extension)
    datamaxage (Extension)
    datamaxstale (Extension)
    documentfetchhint
    documentmaxage (New in VoiceXML 2.0)
    documentmaxstale (New in VoiceXML 2.0)
    fetchtimeout
    grammarfetchhint
    grammarmaxage (New in VoiceXML 2.0)
    grammarmaxstale (New in VoiceXML 2.0)
    scriptfetchhint
    scriptmaxage (New in VoiceXML 2.0)
    scriptmaxstale (New in VoiceXML 2.0)
    ssmlfetchhint (Extension)
    ssmlmaxage (Extension)
    ssmlmaxstale (Extension)
Universal Grammars
    universals (New in VoiceXML 2.0)
Logging Calls
    bevocal.logging (Extension)
    bevocal.securelogging.enabled (Extension)
    bevocal.securelogging.key (Extension)
Go-Back Facility
    bevocal.goback (Extension)
    bevocal.mingoback (Extension)
Access to BeVocal Platform services
    bevocal.security.key (Extension)
Language/Locale
    bevocal.locale

Property Index

The following table lists the properties in alphabetical order.

Property: Controls

audiofetchhint: Fetching
audiomaxage (New in VoiceXML 2.0): Fetching
audiomaxstale (New in VoiceXML 2.0): Fetching
bargein: User Input
bargeintype (New in VoiceXML 2.0): User Input
bevocal.audio.capture (Extension): Speech recognition
bevocal.audio.outputvolume (Extension): Speech recognition
bevocal.dtmf.flushbuffer (Extension): DTMF recognition
bevocal.fetchaudio.allfetches (Extension): Background audio

bevocal.fetchaudio.extend (Extension): Background audio
bevocal.fetchaudio.flushqueue (Extension): Background audio
bevocal.fetchaudio.sounds (Extension): Background audio
bevocal.finaltimeout (Extension): Speech recognition
bevocal.grammar.interpretationtype (Extension): Speech recognition
bevocal.grammar.phoneticpruning (Extension): Speech recognition
bevocal.grammar.weightfactor (Extension): Speech recognition
bevocal.grammar.wordtransitionpenalty (Extension): Speech recognition
bevocal.goback (Extension): Go-back facility
bevocal.hotwordmax (Extension): User Input
bevocal.hotwordmin (Extension): User Input
bevocal.incrementerroronnsp (Extension): User Input
bevocal.locale: Language/Locale
bevocal.logging (Extension): Logging calls
bevocal.maxdialogerrors (Extension): Speech errors
bevocal.maxerrors (Extension): Speech errors
bevocal.maxinterpretations (Extension): Speech errors
bevocal.mingoback (Extension): Go-back facility
bevocal.securelogging.enabled (Extension): Logging calls
bevocal.securelogging.key (Extension): Logging calls
bevocal.security.key (Extension): Access to BeVocal Platform services
bevocal.sounds.listening (Extension): Speech recognition
bevocal.sounds.maskrecognitionlatency (Extension): Speech recognition
bevocal.sounds.recognition (Extension): Speech recognition
bevocal.transfer.terminatetones (Extension): DTMF recognition
bevocal.utterance.prefix (Extension): Speech recognition
bevocal.voice.name (Extension): Text-to-speech output
bevocal.vxml.maxrecognitionlatency (Extension): Speech recognition
caching: Fetching
completetimeout: Speech recognition
confidencelevel: Speech recognition
datafetchhint (Extension): Fetching
datamaxage (Extension): Fetching
datamaxstale (Extension): Fetching
documentfetchhint: Fetching
documentmaxage (New in VoiceXML 2.0): Fetching
documentmaxstale (New in VoiceXML 2.0): Fetching
fetchaudio: Background audio

fetchaudiodelay (New in VoiceXML 2.0): Background audio
fetchaudiominimum (New in VoiceXML 2.0): Background audio
fetchtimeout: Fetching
grammarfetchhint: Fetching
grammarmaxage (New in VoiceXML 2.0): Fetching
grammarmaxstale (New in VoiceXML 2.0): Fetching
incompletetimeout: Speech recognition
inputmodes: User Input
interdigittimeout: DTMF recognition
maxnbest (New in VoiceXML 2.0): Speech recognition
maxspeechtimeout: Speech recognition
recordutterance: Speech recognition
recordutterancetype: Speech recognition
scriptfetchhint: Fetching
scriptmaxage (New in VoiceXML 2.0): Fetching
scriptmaxstale (New in VoiceXML 2.0): Fetching
sensitivity: Speech recognition
speedvsaccuracy: Speech recognition
ssmlfetchhint (Extension): Fetching
ssmlmaxage (Extension): Fetching
ssmlmaxstale (Extension): Fetching
termchar: DTMF recognition
termtimeout: DTMF recognition
timeout: User Input
universals (New in VoiceXML 2.0): Universal grammars

Property Descriptions

This section contains property descriptions in alphabetical order.

audiofetchhint

The audiofetchhint property tells the interpreter whether it can attempt to optimize dialog interpretation by prefetching audio files. The value is one of:

    safe
        Fetch audio files only when they are needed, never before.
    prefetch
        Permit, but do not require, the interpreter to prefetch audio files.

The default value is prefetch.
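A form that plays frequently changing audio might turn prefetching off for its own scope, as in the sketch below. The form name and the file name headlines.wav are hypothetical, and the fragment is assumed to sit inside a <vxml> document.

```xml
<form id="latebreaking">
  <!-- Within this form, fetch audio only when it is about to be played. -->
  <property name="audiofetchhint" value="safe"/>
  <block>
    <prompt>
      <!-- headlines.wav is a hypothetical file; the enclosed text is the
           fallback spoken if the audio cannot be played. -->
      <audio src="headlines.wav">Here are the headlines.</audio>
    </prompt>
  </block>
</form>
```

Setting the property inside the form leaves the default of prefetch in effect for the rest of the document.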

audiomaxage

New in VoiceXML 2.0. The audiomaxage property specifies the maximum acceptable age, in seconds, of cached audio resources. The value is a time interval expressed as an unsigned number followed by s for time in seconds (the default); ms for time in milliseconds. An unexpired cached file that does not exceed the maximum age will be used; a cached file that exceeds the maximum will be fetched again.

When no value is set for this property:

    If the <vxml> tag's version attribute is 2.0 or greater, unexpired cached audio files are used.
    If the <vxml> tag's version attribute is 1.0, the caching property controls whether unexpired cached audio files are used.

By default, no value is set for this property.

audiomaxstale

New in VoiceXML 2.0. The audiomaxstale property specifies the maximum acceptable time, in seconds, during which expired cached audio resources can still be used. The value is a time interval expressed as an unsigned number followed by s for time in seconds (the default); ms for time in milliseconds. A cached file that has been expired for more than the maximum stale time will be refetched; one that has been stale for less than or equal to the maximum stale time will be used.

The default value is 300 seconds (5 minutes). This default results in much faster performance for most applications, since it greatly reduces the number of times the interpreter must send Get-If-Modified requests to the HTTP server. If you want the behavior defined in the VoiceXML specification (always do a Get-If-Modified request if there was no Expires header), set audiomaxstale to 0.

bargein

The bargein property indicates whether the user can barge in during prompts and audio output from the application. Set this property to true to allow barge-in; set it to false to disallow barge-in.

When this property is set to false, any DTMF input buffered in a transition state is deleted from the buffer. It is not saved for use in the next recognition state.

The default value is true.

bargeintype

New in VoiceXML 2.0. The bargeintype property indicates what kind of input can interrupt the prompt or audio output. This property is relevant only when bargein is true. The value is one of:

    hotword
        User input that doesn't match the grammar is ignored and only speech that matches a grammar can interrupt the prompt.
    speech
        Any user utterance can interrupt the prompt.

Inside a bridge transfer, the value of the bargeintype property is ignored. For a bridge transfer, the bargein type is always hotword.

If this property is set to hotword, the bevocal.hotwordmax and bevocal.hotwordmin properties determine how long a hotword utterance can be.

The default value is speech.
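The caching and barge-in properties described above can be set at different scopes. The fragment below is a sketch of one possible arrangement: the form and field names are invented, the built-in boolean field type is used only to keep the example short, and the fragment is assumed to sit inside a <vxml> document.

```xml
<form id="confirm">
  <!-- Revalidate expired audio every time, instead of allowing the
       default 300 seconds of staleness. -->
  <property name="audiomaxstale" value="0s"/>
  <field name="answer" type="boolean">
    <!-- In this field only, a prompt is interrupted only by speech
         that matches the grammar. -->
    <property name="bargein" value="true"/>
    <property name="bargeintype" value="hotword"/>
    <prompt>Say yes to continue or no to stop.</prompt>
  </field>
</form>
```

Keeping bargeintype inside the field limits hotword barge-in to that one question, so other fields in the application keep the default value of speech.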

bevocal.audio.capture

Extension. The value of the bevocal.audio.capture property controls whether the interpreter captures spoken audio for each recognition. If this property is true, the interpreter captures spoken audio; if this property is false, the interpreter does not capture audio.

The captured audio is available to the application in variables:

    When a successful recognition occurs for a field, the audio capture is available in the audio property of the field's shadow variable (fieldname$.audio) and in the application variable application.lastaudio$.
    When a no-match event occurs in a field, the audio capture is available in application.lastaudio$ only; fieldname$.audio is cleared.
    When a no-input event occurs in a field, both application.lastaudio$ and fieldname$.audio are cleared.
    When the user's speech successfully matches a form grammar, a menu choice, or a link grammar, the audio capture is available in application.lastaudio$.

You can send the captured audio to a server using a <data> element. Doing so is useful if you need a record of the user's speech for legal reasons.

You can also access captured utterance files for later analysis, using the Media Access Service. For this purpose, you may want to use the bevocal.utterance.prefix property to modify the filenames of the utterance files.

Note: Capturing audio causes a minor delay after each speech recognition for which bevocal.audio.capture is true. You should set this property to true only locally inside specific fields in which the capability is needed.

The default value is false.

bevocal.audio.outputvolume

The value of the bevocal.audio.outputvolume property controls the output volume from the platform. Higher values play the audio prompts louder. You may need to tune this property upward if you expect your application to be used in noisy environments.

The default value is 0.5. The minimum is 0.0 and the maximum is 1.0.

bevocal.dtmf.flushbuffer

Extension. The bevocal.dtmf.flushbuffer property controls whether or not DTMF key presses that are queued during the transitioning state are available for the next recognition state. If this property is true, the DTMF buffer is cleared upon entering the next recognition state. If false, the keys in the buffer are used for recognition in the next recognition state. For more details, see Collecting Input and Playing Prompts on page 19.

The default value is true.

bevocal.fetchaudio.allfetches

Extension. The bevocal.fetchaudio.allfetches property controls whether background audio is played during the execution of a BeVocal VoiceXML extension tag that performs a fetch. If this property is true, any background audio is played during execution of a BeVocal VoiceXML extension tag. If this property is false, the background audio is not played for extension tags. Currently <data> is the only tag for which this property is relevant.

The default value is false.

bevocal.fetchaudio.extend
Extension. The bevocal.fetchaudio.extend property controls whether background audio is played until the next audio event. If this property is false, any background audio is played until the fetch operation completes. If this property is true, the background audio continues until the next audio event, for example, a prompt played by the interpreter, or user input that triggers speech recognition. Setting bevocal.fetchaudio.extend to true can give your application a more seamless audio interface.
The default value is false.

bevocal.fetchaudio.flushqueue
Extension. The bevocal.fetchaudio.flushqueue property allows queued prompts to be played during fetches even if the fetchaudio property is not set. If the bevocal.fetchaudio.flushqueue property is true, queued prompts are played as if the fetchaudio property were set. That is, queued prompts are played during a <goto>, <submit>, or other operation that fetches VoiceXML data (or, as a special case, <data>). If the property is false, whether queued prompts are played during fetches depends on the fetchaudio property. See Queued Prompts when Fetching on page 43.
The default value is false.

bevocal.fetchaudio.sounds
Extension. The bevocal.fetchaudio.sounds property controls whether the interpreter plays a sound before and after playing background audio. If this property is true, a short sound is played just before and after the background audio. If this property is false, background audio is played without any sounds to delimit its start and end. If you set any bevocal.sounds.* properties to true, you may wish to set this property to true also. See bevocal.sounds.recognition and bevocal.sounds.listening.
The default value is false.

bevocal.finaltimeout
Extension.
The bevocal.finaltimeout property is the amount of time to wait for additional input after the speech-recognition engine has recognized speech that matches one of the input grammars, when that match is not the longest legal match. That is, the user has said something that matches the grammar, but if the user continues speaking, the longer utterance might also match the grammar.
This property interacts with the completetimeout and incompletetimeout properties. Consider the following very simple ABNF grammar:
#ABNF 1.0;
$rule = why hello [world];
Which property is in play depends upon what the user has said up to this point:
If the user has said "why", the incompletetimeout property is in play but neither of the other two properties is in play, because that utterance does not match the grammar.
If the user has said "why hello", the bevocal.finaltimeout property is in play because that utterance may be a complete match to the grammar but may also be a partial match. The incompletetimeout property is not in play because "why hello" does match the grammar, and the completetimeout property is not in play because the user still has a legal path to the longest legal match to the grammar.
If the user has said "why hello world", only the completetimeout property is in play because this is the longest legal match to the grammar.
The default value is 0.75s; that is, three-quarters of a second.

bevocal.goback
Extension. The bevocal.goback property specifies whether the parent element is a legal go-back destination. Set this property to true to make the parent element a legal go-back destination; set it to false to disallow going back to the parent element.
You can set this property to false in a form to disable the go-back facility in that form. You can set it to false in a document to disable the go-back facility in that entire document.
Note: The go-back facility is an experimental extension to BeVocal VoiceXML. See Go-Back Facility on page 79.
The default value is true.

bevocal.grammar.interpretationtype
Extension. The bevocal.grammar.interpretationtype property specifies whether to use the NL engine in standard mode (full) or in robust mode (robust). Setting this property to robust facilitates the NL engine's interpretation of more spontaneous utterances from SLM grammars.
Note: This property is relevant only when recognizing against SLM grammars. It should always be set to its default value when recognizing against conventional grammars. For information on using SLM grammars, see Chapter 8, Nuance SayAnything Grammars in the Grammar Reference.
You should set this property to robust only for recognizing against SLM grammars. You should set it back to full when recognizing against normal grammars.
The default value is full.

bevocal.grammar.phoneticpruning
Extension. The bevocal.grammar.phoneticpruning property specifies whether the recognizer should perform phonetic pruning. For SLM grammars, set this property to true except for grammars with small vocabularies.
Note: This property is relevant only when recognizing against SLM grammars. It should always be set to its default value when recognizing against conventional grammars. For information on using SLM grammars, see Chapter 8, Nuance SayAnything Grammars in the Grammar Reference.
The default value is true.

bevocal.grammar.weightfactor
Extension. The bevocal.grammar.weightfactor property controls the relative weighting of acoustic and linguistic scores during recognition. As this value increases, the recognizer runs faster and hence the value of the speedvsaccuracy property should be increased to get better recognition. The corresponding speech engine property is in the range between 0 and 100. For well-trained SLM grammars, the optimum value is between 0.58 and 0.6, corresponding to the range of 9-10 in the speech engine.
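The SLM-related properties described so far might be scoped to a single field as in the following sketch. The grammar URI and the field name are hypothetical, and the weight factor is simply taken from the optimum range quoted above; treat it as a starting point for tuning, not a recommendation.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="open_request">
    <field name="request">
      <!-- Robust NL interpretation, for SLM grammars only. -->
      <property name="bevocal.grammar.interpretationtype" value="robust"/>
      <!-- Phonetic pruning on (suitable unless the vocabulary is small). -->
      <property name="bevocal.grammar.phoneticpruning" value="true"/>
      <!-- Within the 0.58 to 0.6 range quoted for well-trained SLM grammars. -->
      <property name="bevocal.grammar.weightfactor" value="0.58"/>
      <!-- Hypothetical SLM grammar location. -->
      <grammar src="http://example.com/grammars/request.slm"/>
      <prompt>How can I help you?</prompt>
    </field>
  </form>
</vxml>
```

Because the properties are set inside the field, fields that use conventional grammars elsewhere in the document keep the default values.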

Note: This property is relevant only when recognizing against SLM grammars. It should always be set to its default value when recognizing against conventional grammars. For information on using SLM grammars, see Chapter 8, Nuance SayAnything Grammars in the Grammar Reference.
The default value is 0.5. This maps to a setting of 5 in the speech engine.

bevocal.grammar.wordtransitionpenalty
Extension. The bevocal.grammar.wordtransitionpenalty property controls the word transition weight. This is the trade-off between inserted and deleted words. For SLM-based grammars, the optimal value is in the range 0 to 50.
Note: This property is relevant only when recognizing against SLM grammars. It should always be set to its default value when recognizing against conventional grammars. For information on using SLM grammars, see Chapter 8, Nuance SayAnything Grammars in the Grammar Reference.
The default value is

bevocal.hotwordmax
Extension. The bevocal.hotwordmax property specifies the maximum time duration of an utterance that can be recognized as a request to interrupt an operation. Recognition of a user utterance can interrupt the following operations:
Playing a prompt or audio output when the bargein property is true.
An outbound call performed by a <transfer> element that contains a child grammar.
An outbound call initiated by a <bevocal:dial> element that contains a child grammar.
If you want to allow users to interrupt with multisyllable words or multiword commands, you can adjust bevocal.hotwordmax upward.
The default value is 1.7 seconds.

bevocal.hotwordmin
Extension. The bevocal.hotwordmin property specifies the minimum time duration of an utterance that can be recognized as a request to interrupt an operation. Recognition of a user utterance can interrupt the following operations:
Playing a prompt or audio output when the bargein property is true.
An outbound call performed by a <transfer> element that contains a child grammar.
During the execution of a <bevocal:listen> element that contains a child grammar.
If you want to allow users to interrupt only with very short, single-syllable words, you can adjust bevocal.hotwordmin downward.
The default value is 0.1 seconds.

bevocal.incrementerroronnsp
Extension. The bevocal.incrementerroronnsp property specifies whether to count "no speech" timeouts (NSPs) as errors. If the property is set to false, NSPs are not counted as errors.
The default value is true.

bevocal.locale
Extension. The bevocal.locale property specifies the locale to use for recognition, allowing the application to switch to a different locale within a document. The value must be a legal language identifier as identified by an RFC 3066 code. BeVocal currently supports English and Spanish. Valid values are:

en      English
en-us   United States English
es      Spanish
es-us   United States Spanish
fr-ca   French Canadian
The default value is en-us. If an unsupported language is specified, an error.unsupported.language event is thrown.
Note: The xml:lang attribute of the <vxml> tag overrides the value of this property. Also, this property may be deprecated in a future release. Therefore, as a rule, you should use the xml:lang attribute of the <vxml> tag for specifying a locale.

bevocal.logging
Extension. The bevocal.logging property allows you to control whether all user input is recorded to the trace log, which you see when you use the Log Browser tool. The value is one of:
true    All user input is recorded in the log.
false   User input is not recorded.
If you want to prevent recording user input in a particular field, dialog, or document for security reasons, you can set this property to false.
The default value is true.

bevocal.maxdialogerrors
Extension. The value of the bevocal.maxdialogerrors property is the maximum number of speech errors that can occur within a particular execution of a dialog. A speech error is either a recognition error, which normally results in a no-match event, or a timeout while waiting for user input, which normally results in a no-input event.
If you set the bevocal.maxdialogerrors property to 5, then on the fifth error in a particular form, an error.bevocal.maxdialogerrors_exceeded event is thrown (instead of a no-match or no-input event).
The default value is 50. A value of 0 means that there is no limit on the number of errors.

bevocal.maxerrors
Extension. The value of the bevocal.maxerrors property is the maximum number of speech errors that can occur during the entire call. A speech error is either a recognition error, which normally results in a no-match event, or a timeout while waiting for user input, which normally results in a no-input event.
If you set the bevocal.maxerrors property to 10, then on the tenth error in the call, an error.bevocal.maxerrors_exceeded event is thrown (instead of a no-match or no-input event).
The default value is 100. A value of 0 means that there is no limit on the number of errors.

bevocal.maxinterpretations
Extension. The value of the bevocal.maxinterpretations property is used to enable or disable multiple interpretations:
If the value is undefined or less than 1, both multiple-recognition features are controlled by the maxnbest property. If maxnbest is 1, both features are disabled; if maxnbest is greater than 1, both features are enabled.
If the value is 1, multiple interpretations are disabled (independently of whether N-best recognition is enabled).

If the value is greater than 1, multiple interpretations are enabled (independently of whether N-best recognition is enabled).
For additional information about multiple recognition results, see Chapter 5, Using Multiple-Recognition.
The value of bevocal.maxinterpretations is used in combination with the value of maxnbest to determine the maximum number of results that can be returned by the speech-recognition engine. See Maximum Array Size on page 366.
The default value is undefined.
Tip: For better performance, if you anticipate that the spoken inputs do not sound similar but that a valid spoken input might be ambiguous, leave the value of the maxnbest property as 1 and set the bevocal.maxinterpretations property to a number greater than 1.

bevocal.mingoback
Extension. The value of the bevocal.mingoback property is the minimum size of the go-back stack. The interpreter keeps at least this many entries on the stack, except at the beginning of the call when fewer steps have been executed, and after the user has said "go back" so many consecutive times that the stack has been depleted.
Note: The go-back facility is an experimental extension to BeVocal VoiceXML. See Chapter 7, Go-Back Facility.
The default value is 0, meaning that the go-back stack is always empty and the go-back facility is effectively disabled.

bevocal.securelogging.enabled
Extension. The bevocal.securelogging.enabled property allows you to control whether the log files and utterance files associated with the call are encrypted before being printed to the log. If secure logging is enabled and utterances are captured, then the captured utterances are also encrypted. The value is one of:
true    Log contents are encrypted.
false   Log contents are not encrypted.
The default value is false.
Note: If secure logging is enabled for a particular call, you will not be able to use VocalPlayer to listen to the call.
You must download the desired files using the Media Access and Vendor Log Platform Services and decrypt them locally. For more information on secure logging, see bevocal.securelogging.key on page 350.

bevocal.securelogging.key
Extension. For secure logging, you must provide the public part of a public/private key pair using the bevocal.securelogging.key property. The public part of the key must be specified in Base64-encoded format. For example:
<property name="bevocal.securelogging.key" value="migfma0gcsqgsib3dqebaquaa4gnadcbiqkbgqcdfg7t9r3ygmklw4snlwusyqhddxhwn9solodgkq1jcttqf0uvh8hjjuc8eo+a44kfs+5z7pri9n5mm3d8sb+6jsegzqe6nqwzqexhloehfchmdkls6mr7wlhecaznfejdvlkq404mcptdeynhpvkljklrzcx3o/roh1a0j9/edwidaqab"/>
If you are using an implementation of Java Cryptography Extension (JCE) to generate a public/private key pair, get the encoded bytes of the public key and do a Base64 encode of the bytes.
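The two secure-logging properties are meant to be used together. A minimal sketch, assuming they are set at document level so that they apply to the whole document, is shown below; BASE64_ENCODED_PUBLIC_KEY is a placeholder for your own key, not a working value, and the form content is purely illustrative.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Turn on encryption of log contents and captured utterances. -->
  <property name="bevocal.securelogging.enabled" value="true"/>
  <!-- Placeholder: substitute the Base64-encoded public key of your own key pair. -->
  <property name="bevocal.securelogging.key" value="BASE64_ENCODED_PUBLIC_KEY"/>
  <form id="account">
    <field name="pin" type="digits">
      <prompt>Please say or enter your PIN.</prompt>
    </field>
  </form>
</vxml>
```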

If you are using OpenSSL to create a public/private key pair, generate the key in DER format and encode the binary bytes with Base64 encoding. Alternatively, you can copy the ASCII string from a PEM-formatted version of the key, which is effectively a Base64 representation of the DER format.
The BeVocal VXML Interpreter generates a sessionkey and encrypts it with this public key using the RSA/ECB/OAEPPadding algorithm. This public/private encryption has been tested with Bouncy Castle's implementation of JCE. Using the sessionkey, the relevant parts of the log are encrypted and printed to the log. The encrypted sessionkey is printed to the log in HEX format.
You decrypt the sessionkey using the private key and then decrypt the log contents using the DESEDE/ECB/PKCS5Padding algorithm.

bevocal.security.key
Extension. Access to BeVocal Platform services and to certain features of the interpreter is restricted with a 128-bit security key. To use such features, you must set the value of this property to your key.
The Enrollment Facility uses a security key. In this case, applications using one security key are prevented from accessing enrollment grammars created by an application using a second key. When you develop applications for one of BeVocal's commercial hosting services such as Enterprise Hosting, you will need a security key in order to use enrollment. When you develop on Café, you can use enrollment without a key; however, there are limitations. For details, see Chapter 9, Voice Enrollment Grammars in the Grammar Reference.
By default, no value is set for this property.

bevocal.sounds.listening
Extension. The value of the bevocal.sounds.listening property controls whether the interpreter plays background music while listening for user input. If this property is true, the interpreter starts to play background music after playing a prompt; it stops when the user responds or when the timeout occurs, before throwing a no-input event.
If this property is false, the interpreter does not play music while waiting for user input.
The default value is false.

bevocal.sounds.maskrecognitionlatency
Extension. This property indicates whether, after an utterance, a sound clip is played back until the recognition result is returned. Use this property when you expect latencies from large grammars. If the value is true, then the sound clip is played. If the value is false, then nothing is played back.
The sound clip played is determined by the fetchaudio property. The fetchaudiodelay property determines the minimum time to wait before the sound clip begins. The fetchaudiominimum property determines the minimum time to play the sound clip once it is started.
The default value is false.

bevocal.sounds.recognition
Extension. The value of the bevocal.sounds.recognition property controls whether the user is given audio feedback that speech recognition was successful. Set this property to true to play a short sound after each successful recognition. This sound lets the user know that the input has been heard, which makes the application seem more responsive to typical users. If this property is false, no sound is played after successful recognition.
The default value is false.
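The sound-related properties above can be combined in one field. The following is a sketch only: the audio and grammar URIs are hypothetical, and the delay and minimum values are arbitrary illustrations rather than recommended settings.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="find_city">
    <field name="city">
      <!-- Play a short sound after each successful recognition. -->
      <property name="bevocal.sounds.recognition" value="true"/>
      <!-- Mask recognition latency with the fetchaudio clip. -->
      <property name="bevocal.sounds.maskrecognitionlatency" value="true"/>
      <!-- Clip to play (hypothetical URI). -->
      <property name="fetchaudio" value="http://example.com/audio/thinking.wav"/>
      <!-- Wait 1 second before starting the clip. -->
      <property name="fetchaudiodelay" value="1s"/>
      <!-- Once started, play the clip for at least 2 seconds. -->
      <property name="fetchaudiominimum" value="2s"/>
      <!-- Hypothetical large grammar. -->
      <grammar src="http://example.com/grammars/cities.grxml" type="application/srgs+xml"/>
      <prompt>Which city?</prompt>
    </field>
  </form>
</vxml>
```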

bevocal.transfer.terminatetones
Extension. This property exists because DTMF grammars are not supported during transfer. This property defines the DTMF sequences that can terminate a bridged <transfer> (or <dial>). The value can be a string of DTMF tones to recognize, or a space-delimited list of such strings. The DTMF tones are identified by their numeric or string value. For example:
<property name="bevocal.transfer.terminatetones" value="12345 ***"/>
will terminate the call if either the DTMF sequence "12345" or the sequence "***" is recognized.
The default value is the empty string; that is, no tones.

bevocal.utterance.prefix
Extension. If utterance capture is enabled in the scope of this property, utterances are available both through the variables discussed with the bevocal.audio.capture property and through audio files when using the Media Access Service. The filename of a saved utterance is prefixed with the value of this property inside parentheses. For example, if the value of bevocal.utterance.prefix is restaurant, then the filename will start with (restaurant).
This property can help identify an utterance with a specific application context. For example, in an application state that recognizes against a restaurant grammar, you can set the value to restaurant. When control leaves that application state, you can remove the value (by setting it to the empty string) or set it to a new value for the new state. Later, when analyzing audio files using the Media Access Service, you can safely assume that audio files whose names start with (restaurant) contain a restaurant name spoken by a user.
If the underlying platform does not support filenames containing characters used in this property, then the utterance file is not created at all. Consequently, you should use only alphanumeric characters in your prefix.
This property is relevant only if you have bevocal.audio.capture set to true.
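The restaurant scenario above might look like the following sketch, which enables capture and sets the prefix only inside the field that needs them. The grammar URI and the variable name lastCapture are hypothetical; the shadow-variable expression restaurant$.audio follows the fieldname$.audio pattern given for bevocal.audio.capture.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="choose_restaurant">
    <!-- Hypothetical variable that keeps a reference to the captured audio. -->
    <var name="lastCapture"/>
    <field name="restaurant">
      <!-- Capture audio for recognitions in this field only. -->
      <property name="bevocal.audio.capture" value="true"/>
      <!-- Saved utterance filenames will start with (restaurant). -->
      <property name="bevocal.utterance.prefix" value="restaurant"/>
      <!-- Hypothetical grammar location. -->
      <grammar src="http://example.com/grammars/restaurants.grxml" type="application/srgs+xml"/>
      <prompt>Which restaurant would you like?</prompt>
      <filled>
        <!-- After a successful recognition the capture is in the shadow variable. -->
        <assign name="lastCapture" expr="restaurant$.audio"/>
      </filled>
    </field>
  </form>
</vxml>
```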
The default value is NONE; that is, by default filenames start with (NONE).

bevocal.voice.name
Extension. The name of the voice to be used for speaking prompts. In a previous BeVocal VoiceXML version, this applied only to the TTS voice used. Now the same property can be used to specify Recorded Voices. See Chapter 8, TTS and Recorded Voice Selection for details.
The default value is the empty string, requesting the interpreter to use the most appropriate voice for each type.

bevocal.vxml.maxrecognitionlatency
The bevocal.vxml.maxrecognitionlatency property controls the maximum amount of time that the platform will wait for the recognizer to recognize any one utterance. If this time is exceeded, a NOMATCH is thrown.
The default value is 15 seconds.

caching
VoiceXML 1.0 only. If the <vxml> tag's version attribute is 1.0, the caching property controls the use of unexpired cache files when the relevant maximum-age property is not set. The caching property can be set to either safe, to ignore any cached file when fetching, or fast, to use any cached file instead of fetching.
Note: The caching property is ignored when the <vxml> tag's version attribute is 2.0 or greater. In that case, the application uses an unexpired cached copy of a file if the relevant maximum-age property is not set.
The default value is fast.

completetimeout
The value of the completetimeout property is the amount of time to wait for additional input after the speech-recognition engine has recognized speech matching one of the input grammars.
See bevocal.finaltimeout for a description of how the completetimeout property interacts with the bevocal.finaltimeout and incompletetimeout properties. See Appendix D of the VoiceXML 2.0 specification for a description of how the timeout property interacts with the completetimeout and incompletetimeout properties.
The default value is 0.5 seconds. The minimum possible value is 0.1 seconds and the maximum possible value is 10 seconds.

confidencelevel
The value of the confidencelevel property is the confidence threshold required for the speech-recognition engine to decide whether the input speech matches a grammar.
In practice, most applications use a confidencelevel very near the default value of 0.5. The minimum is 0.0 and the maximum is 1.0. The default value of 0.5 maps to 35% confidence in the Nuance speech-recognition engine. Setting this property higher will result in fewer false recognitions at the cost of many more no-match events. Setting it lower will reduce the number of no-match events at the risk of more false recognitions.

datafetchhint
Extension. The datafetchhint property tells the interpreter whether XML data files may be prefetched. The value is one of:
safe       Fetch XML data files only when they are needed, never before.
prefetch   Permit, but do not require, the interpreter to prefetch XML data files.
The default value is safe.

datamaxage
Extension.
The datamaxage property specifies the maximum acceptable age, in seconds, of cached XML data files. The value is a time interval expressed as an unsigned number followed by s for time in seconds (the default) or ms for time in milliseconds. An unexpired cached file that does not exceed the maximum age will be used; a cached file that exceeds the maximum will be fetched again.
When no value is set for this property:
If the <vxml> tag's version attribute is 2.0 or greater, unexpired cached data files are used.
If the <vxml> tag's version attribute is 1.0, the caching property controls whether unexpired cached data files are used.

By default, no value is set for this property.

datamaxstale
Extension. The datamaxstale property specifies the maximum acceptable time, in seconds, during which expired cached XML data files can still be used. The value is a time interval expressed as an unsigned number followed by s for time in seconds (the default) or ms for time in milliseconds. A cached file that has been expired for more than the maximum stale time will be refetched; one that has been stale for less than or equal to the maximum stale time will be used.
The default value is 0 seconds.

documentfetchhint
The documentfetchhint property tells the interpreter whether VoiceXML documents may be prefetched. The value is one of:
safe       Fetch document files only when they are needed, never before.
prefetch   Permit, but do not require, the interpreter to prefetch document files.
The default value is safe.

documentmaxage
New in VoiceXML 2.0. The documentmaxage property specifies the maximum acceptable age, in seconds, of cached VoiceXML document files. The value is a time interval expressed as an unsigned number followed by s for time in seconds (the default) or ms for time in milliseconds. An unexpired cached file that does not exceed the maximum age will be used; a cached file that exceeds the maximum will be fetched again.
When no value is set for this property:
If the <vxml> tag's version attribute is 2.0 or greater, unexpired cached VoiceXML document files are used.
If the <vxml> tag's version attribute is 1.0, the caching property controls whether unexpired cached document files are used.
By default, no value is set for this property.

documentmaxstale
New in VoiceXML 2.0. The documentmaxstale property specifies the maximum acceptable time, in seconds, during which expired cached VoiceXML document files can still be used. The value is a time interval expressed as an unsigned number followed by s for time in seconds (the default) or ms for time in milliseconds.
A cached file that has been expired for more than the maximum stale time will be refetched; one that has been stale for less than or equal to the maximum stale time will be used.
The default value is 0 seconds.

fetchaudio
The value of the fetchaudio property is the URI of the audio clip to play while waiting for a resource to be fetched. This property affects the fetching of some, but not all, resource types:
This property is relevant when the interpreter fetches VoiceXML documents with the tags: <choice>, <goto>, <link>, <subdialog>, and <submit>.

Extension. If the bevocal.fetchaudio.allfetches property is true, this property affects the fetching of XML data with the <data> tag.
Background audio is never played while the interpreter fetches grammar, audio, or script files.
The default value is not to play any audio.

fetchaudiodelay
New in VoiceXML 2.0. The value of the fetchaudiodelay property is the amount of time to wait after a VoiceXML download is started before its fetchaudio source is played in the background. This property is useful if you want a short period of silence before the audio starts playing. If the download completes during the period of silence, the audio will not be played at all.
The default value is 0 seconds.

fetchaudiominimum
New in VoiceXML 2.0. The value of the fetchaudiominimum property is the minimum time interval to play the fetchaudio source, once started, even if the fetch operation completes during play. If the value is greater than 0, it prevents the user from hearing a short clip of background audio that is immediately cut off.
The default value is 0 seconds. With the default value, the interpreter interrupts the audio playback as soon as the resource is fetched, and resumes normal processing.

fetchtimeout
The value of the fetchtimeout property is the timeout period for fetches. The interpreter will wait this long for the resource to be returned before throwing an error.badfetch event. The value is a time interval expressed as an unsigned number followed by s for time in seconds or ms for time in milliseconds (the default). If you set this property to 0, the interpreter waits indefinitely.
The default value is 1 minute.

grammarfetchhint
The grammarfetchhint property tells the interpreter whether grammars may be prefetched. The value is one of:
safe       Fetch grammar files only when they are needed, never before.
prefetch   Permit, but do not require, the interpreter to prefetch grammar files.
The default value is prefetch.

grammarmaxage
New in VoiceXML 2.0.
The grammarmaxage property specifies the maximum acceptable age, in seconds, of cached grammar resources. The value is a time interval expressed as an unsigned number followed by s for time in seconds (the default) or ms for time in milliseconds. An unexpired cached file that does not exceed the maximum age will be used; a cached file that exceeds the maximum will be fetched again.
When no value is set for this property:
If the <vxml> tag's version attribute is 2.0 or greater, unexpired cached grammar files are used.
If the <vxml> tag's version attribute is 1.0, the caching property controls whether unexpired cached grammar files are used.
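As a sketch of how the maximum-age and maximum-stale properties might be set for a whole document, consider the following. The values are arbitrary illustrations; they are written as bare numbers on the assumption, taken from the descriptions above, that seconds is the default unit. The grammar URI is hypothetical.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Reuse an unexpired cached document only if it is at most 5 minutes old. -->
  <property name="documentmaxage" value="300"/>
  <!-- Never use an expired cached document. -->
  <property name="documentmaxstale" value="0"/>
  <!-- Grammars change rarely here, so allow cached copies up to 1 hour old. -->
  <property name="grammarmaxage" value="3600"/>
  <form id="main">
    <field name="choice">
      <!-- Hypothetical grammar location. -->
      <grammar src="http://example.com/grammars/choices.grxml" type="application/srgs+xml"/>
      <prompt>What would you like to do?</prompt>
    </field>
  </form>
</vxml>
```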

By default, no value is set for this property.

grammarmaxstale
New in VoiceXML 2.0. The grammarmaxstale property specifies the maximum acceptable time, in seconds, during which expired cached grammar resources can still be used. The value is a time interval expressed as an unsigned number followed by s for time in seconds (the default) or ms for time in milliseconds. A cached file that has been expired for more than the maximum stale time will be fetched again; one that has been stale for less than or equal to the maximum stale time will be used.
The default value is 0 seconds.

incompletetimeout
The value of the incompletetimeout property is the amount of time to wait for additional speech input when the user has begun speaking but the input does not yet match a complete grammar.
The default value is 1.5 seconds. The minimum possible value is 0.2 seconds and the maximum possible value is 10 seconds.
See bevocal.finaltimeout for a description of how the completetimeout property interacts with the bevocal.finaltimeout and incompletetimeout properties. See Appendix D of the VoiceXML 2.0 specification for a description of how the timeout property interacts with the incompletetimeout and completetimeout properties.

inputmodes
The inputmodes property specifies which input modes to enable: dtmf and/or voice. To disable speech recognition, set inputmodes to dtmf. To disable DTMF, set it to voice. To enable both input modes again after disabling one, specify both values, separated by a space:
inputmodes="dtmf voice"
You can use this property:
To turn off speech recognition in noisy environments.
To conserve speech recognition resources by turning them off where all input is expected to be DTMF.
The default value is "dtmf voice"; that is, accept both DTMF and voice input.
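For example, a form might restrict one field to DTMF while leaving the default in effect elsewhere. This is a minimal sketch; the field names and prompts are invented for illustration.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="login">
    <field name="pin" type="digits">
      <!-- Keypad only: speech recognition is turned off for this field. -->
      <property name="inputmodes" value="dtmf"/>
      <prompt>Please enter your PIN using the keypad.</prompt>
    </field>
    <field name="confirm" type="boolean">
      <!-- Explicitly accept both modes again for this field. -->
      <property name="inputmodes" value="dtmf voice"/>
      <prompt>Would you like to hear your balance?</prompt>
    </field>
  </form>
</vxml>
```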

interdigittimeout
The value of the interdigittimeout property is the amount of time that the speech-recognition engine waits for another DTMF digit before it decides that the user has finished entering digits and returns a recognition or a no-match event.
The default value is 3 seconds.
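A sketch of a DTMF-only field that shortens the interdigit timeout follows. The timeout, termchar, and termtimeout properties used alongside it are standard VoiceXML 2.0 properties; all values and the field name are arbitrary illustrations.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="account_entry">
    <field name="account" type="digits">
      <property name="inputmodes" value="dtmf"/>
      <!-- Give the caller 5 seconds to press the first key. -->
      <property name="timeout" value="5s"/>
      <!-- Wait 2 seconds, instead of the default 3, for each further digit. -->
      <property name="interdigittimeout" value="2s"/>
      <!-- The pound key ends input immediately. -->
      <property name="termchar" value="#"/>
      <!-- Time to wait for the terminating key once no more digits are possible. -->
      <property name="termtimeout" value="1s"/>
      <prompt>Enter your account number, followed by the pound key.</prompt>
    </field>
  </form>
</vxml>
```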

The interpreter uses a combination of the interdigittimeout, timeout, termtimeout, and termchar properties for DTMF recognition, as described here:

Under these conditions: The user does not press any keys and timeout seconds elapse.
The interpreter does this: Generates a no-input event.

Under these conditions: The user presses the termchar character.
The interpreter does this: Recognition immediately ends and the interpreter either returns the valid input or throws a no-match event.

Under these conditions: The keys the user has pressed so far are not a valid input and interdigittimeout seconds elapse.
The interpreter does this: Throws a no-match event.

Under these conditions: The keys the user has pressed so far are a valid input, the grammar does not allow the input to be extended to make a longer valid input, and there is not an active termchar character.
The interpreter does this: Immediately returns the valid input.

Under these conditions: The keys the user has pressed so far are a valid input, the grammar does not allow the input to be extended to make a longer valid input, and there is an active termchar character.
The interpreter does this: Waits termtimeout seconds before returning the valid input.

Under these conditions: The keys the user has pressed so far are a valid input and the grammar does allow the input to be extended to make a longer valid input.
The interpreter does this: Waits interdigittimeout seconds before returning the valid input.

maxnbest
New in VoiceXML 2.0. The value of the maxnbest property is used to enable or disable N-best recognition. If the value is 1, N-best recognition is disabled; if the value is greater than 1, N-best recognition is enabled. For additional information about multiple recognition results, see Chapter 5, Using Multiple-Recognition.
Depending on the value of the bevocal.maxinterpretations property, maxnbest may also enable or disable multiple interpretations:
If bevocal.maxinterpretations is undefined or less than 1, the maxnbest property enables and disables both N-best recognition and multiple interpretations together.
If bevocal.maxinterpretations is 1 or greater, the maxnbest property enables and disables N-best recognition alone; the value of bevocal.maxinterpretations determines whether multiple interpretations are enabled. The value of maxnbest is used in combination with the value of bevocal.maxinterpretations to determine the maximum number of results that can be returned by the speech-recognition engine. See Maximum Array Size on page 366. Note: As the value of the maxnbest property increases, recognition can slow dramatically, because the speech-recognition engine is unable to prune lower-confidence hypotheses until much later in the recognition process. The default value is 1. Tips: Leave this property set to 1 except for those fields and mixed-initiative forms for which you anticipate that spoken inputs may sound similar to more than one expected response. When you do set this property, use a low value, namely the number of possible results that your application is prepared to handle. VOICEXML PROGRAMMER S GUIDE 357

- Set this property at the lowest possible level. Typically, you set this property in an individual <field> element or in a particular mixed-initiative <form> element when different expected responses can sound similar.
- For better performance, if you anticipate that the spoken inputs do not sound similar but that a valid spoken input might be ambiguous, leave the value of the maxnbest property as 1 and set the bevocal.maxinterpretations property to a number greater than 1.

maxspeechtimeout

This property specifies the maximum duration of user speech. If this time elapses before the user stops speaking, the maxspeechtimeout event is thrown.

The default duration is 6 seconds.

recordutterance

The value of the recordutterance property controls whether the interpreter captures spoken audio for each recognition. This property is equivalent to the bevocal.audio.capture property. If this property is true, the interpreter captures spoken audio; if this property is false, the interpreter does not capture audio.

The captured audio is available to the application in variables:

- When a successful recognition occurs for a field, the audio capture is available in the audio property of the field's shadow variable (fieldname$.audio) and in the application variable application.lastaudio$.
- When a no-match event occurs in a field, the audio capture is available in application.lastaudio$ only; fieldname$.audio is cleared.
- When a no-input event occurs in a field, both application.lastaudio$ and fieldname$.audio are cleared.
- When the user's speech successfully matches a form grammar, a menu choice, or a link grammar, the audio capture is available in application.lastaudio$.

You can send the captured audio to a server using a <data> element. Doing so is useful if you need a record of the user's speech for legal reasons.

You can also access captured utterance files for later analysis, using the Media Access Service. For this purpose, you may want to use the bevocal.utterance.prefix property to modify the filenames of the utterance files.

Note: Capturing audio causes a minor delay after each speech recognition for which recordutterance is true. You should set this property to true only locally, inside the specific fields in which the capability is needed.

The default value is false.

recordutterancetype

The recordutterancetype property allows an application to record an utterance in a specified MIME type. BeVocal strongly recommends that this property not be used, because transcoding the audio files causes unnecessary delays.

Supported media type values for the recordutterancetype property are:

Value                     Description
audio/x-wav               WAV (RIFF header) 8 kHz 8-bit mono mu-law
audio/x-alaw-basic        WAV (RIFF header) 8 kHz 8-bit mono a-law
audio/vnd.wave;codec=1    WAV (RIFF header) 8 kHz 8-bit mono linear

The default value is audio/x-wav.

scriptfetchhint

The scriptfetchhint property tells the interpreter whether scripts may be prefetched. The value is one of:

safe        Fetch script files only when they are needed, never before.
prefetch    Permit, but do not require, the interpreter to prefetch script files.

The default value is prefetch.

scriptmaxage

New in VoiceXML 2.0. The scriptmaxage property specifies the maximum acceptable age, in seconds, of cached script resources. The value is a time interval expressed as an unsigned number followed by s for time in seconds (the default) or ms for time in milliseconds.

An unexpired cached file that does not exceed the maximum age will be used; a cached file that exceeds the maximum will be fetched again.

When no value is set for this property:

- If the <vxml> tag's version attribute is 2.0 or greater, unexpired cached script files are used.
- If the <vxml> tag's version attribute is 1.0, the caching property controls whether unexpired cached script files are used.

By default, no value is set for this property.

scriptmaxstale

New in VoiceXML 2.0. The scriptmaxstale property specifies the maximum acceptable time, in seconds, during which expired cached script resources can still be used. The value is a time interval expressed as an unsigned number followed by s for time in seconds (the default) or ms for time in milliseconds.

A cached file that has been expired for more than the maximum stale time will be refetched; one that has been stale for less than or equal to the maximum stale time will be used.

The default value is 0 seconds.

ssmlfetchhint

Extension. The ssmlfetchhint property tells the interpreter whether SSML documents may be prefetched. The value is one of:

safe        Fetch SSML files only when they are needed, never before.
prefetch    Permit, but do not require, the interpreter to prefetch SSML files.

The default value is safe.
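The maximum-age and maximum-stale rules described above for script resources can be sketched as a small decision function. This is only an illustration, not the interpreter's implementation: the function names parseInterval and canUseCached and the entry fields ageMs and expiredForMs are hypothetical, and the sketch assumes that both limits apply together (a file must pass the maximum-age check and, if it has expired, the maximum-stale check).

```javascript
// Convert a time interval such as "30s", "500ms", or "30" to milliseconds.
// Seconds are assumed when no unit is given, as the property text states.
function parseInterval(text) {
  const match = /^(\d+(?:\.\d+)?)(ms|s)?$/.exec(String(text).trim());
  if (match === null) {
    throw new Error("not a valid time interval: " + text);
  }
  const amount = Number(match[1]);
  return match[2] === "ms" ? amount : amount * 1000;
}

// Decide whether a cached file may be used without refetching.
// entry.ageMs        - hypothetical field: time since the file was fetched
// entry.expiredForMs - hypothetical field: time since expiry (0 if unexpired)
// maxAge, maxStale   - property values, or undefined when not set
function canUseCached(entry, maxAge, maxStale) {
  // A file older than the maximum age is always fetched again.
  if (maxAge !== undefined && entry.ageMs > parseInterval(maxAge)) {
    return false;
  }
  // The maximum stale time defaults to 0 seconds, so by default
  // only unexpired files pass this check.
  const staleLimitMs = maxStale === undefined ? 0 : parseInterval(maxStale);
  return entry.expiredForMs <= staleLimitMs;
}
```

For example, canUseCached({ ageMs: 10000, expiredForMs: 5000 }, undefined, "5s") accepts the cached copy because it has been stale for exactly the permitted 5 seconds, while the same entry is rejected when no maximum stale time is set.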

ssmlmaxage

New in VoiceXML 2.0; Extension. The ssmlmaxage property specifies the maximum acceptable age, in seconds, of cached SSML files. The value is a time interval expressed as an unsigned number followed by s for time in seconds (the default) or ms for time in milliseconds.

An unexpired cached file that does not exceed the maximum age will be used; a cached file that exceeds the maximum will be fetched again. When no value is set for this property, unexpired cached SSML files are used.

By default, no value is set for this property.

ssmlmaxstale

New in VoiceXML 2.0; Extension. The ssmlmaxstale property specifies the maximum acceptable time, in seconds, during which expired cached SSML files can still be used. The value is a time interval expressed as an unsigned number followed by s for time in seconds (the default) or ms for time in milliseconds.

A cached file that has been expired for more than the maximum stale time will be refetched; one that has been stale for less than or equal to the maximum stale time will be used.

The default value is 0 seconds.

sensitivity

The value of the sensitivity property is the sensitivity of the speech recognition. In effect, this is the gain or input volume used for the speech input. Higher values allow the speech-recognition engine to recognize softer speech but also pick up more background noise. You may need to tune this property downward if you expect your application to be used in noisy environments, or upward if you expect it to be used in very quiet environments.

The default value is 0.5. The minimum is 0.0 and the maximum is 1.0.

speedvsaccuracy

The speedvsaccuracy property controls the trade-off between speed and accuracy in the speech-recognition engine. Lower values cause potential results with a low probability to be pruned early in the search process, resulting in faster recognition. Higher values retain potential results longer, resulting in slower but more accurate recognition.

If you have extremely large grammars that cause speech recognition to take too long, you can experiment with lower values for this property. Otherwise, you probably do not need to adjust this property.

The default value of 0.5 maps to a fairly accurate setting (1200 on a scale of 0 to 1400 in the Nuance speech-recognition engine). The minimum is 0.0 and the maximum is 1.0.

termchar

The value of the termchar property is the (optional) terminating character for DTMF recognition. If this property is set, the speech-recognition engine will wait termtimeout seconds for the termination character before returning recognized DTMF input.

See interdigittimeout for a description of how the interpreter uses a combination of the interdigittimeout, timeout, termtimeout, and termchar properties for DTMF recognition.

The default value is #.

termtimeout

If the termchar property is set, the speech-recognition engine will wait termtimeout seconds for the termination character before returning recognized DTMF input.

See interdigittimeout for a description of how the interpreter uses a combination of the interdigittimeout, timeout, termtimeout, and termchar properties for DTMF recognition.

Note: When you have a fixed-length input field with a DTMF grammar, you may wish to set this property to 0 seconds to avoid a pause after the last digit is entered.

The default value is 0 seconds.

timeout

The value of the timeout property is the maximum time the application waits for user input. After the specified amount of time has elapsed, the interpreter throws a no-input event.

See Appendix D of the VoiceXML 2.0 specification for a thorough description of the way this property interacts with the completetimeout and incompletetimeout properties. See interdigittimeout for a description of how this property interacts with the interdigittimeout, termtimeout, and termchar properties for DTMF recognition.

The default value is 5 seconds.

universals

New in VoiceXML 2.0. The universals property specifies which universal grammars should be active; it deactivates all other universal grammars. The value is one of:

all     Make all universal grammars active.
none    Deactivate all universal grammars.
Space-separated list of universal grammar names
        Make the specified grammars active; deactivate all other universal grammars.

The predefined universal grammars are help, exit, cancel, and goback. You can define additional universal grammars by setting the universals property in <grammar> elements.

The following <property> element deactivates the help grammar and activates the other predefined universal grammars:

    <property name="universals" value="exit cancel goback"/>

For additional information, see Universal Commands and Grammars on page 14.

If the <vxml> tag's version attribute is 2.0 or greater, the default value is none. If the version attribute is 1.0, the default value is all.
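Returning to the DTMF properties covered in this chapter, the way interdigittimeout, timeout, termtimeout, and termchar combine can be sketched as a decision function. This is a minimal illustration of the rules given under interdigittimeout, not the interpreter's code: the function name dtmfOutcome, the state fields, and the result strings are all hypothetical.

```javascript
// Given what the user has keyed so far, report how long the interpreter
// waits for further keys and what it does if none arrive.
// state.digits          - keys pressed so far, excluding termchar
// state.termcharPressed - true if the user pressed the termchar character
// state.valid           - true if the keys so far match the grammar
// state.extendable      - true if the grammar allows a longer valid input
// state.termcharActive  - true if a termchar character is active
// props                 - timeout, interdigittimeout, termtimeout, in seconds
function dtmfOutcome(state, props) {
  if (state.termcharPressed) {
    // Recognition ends at once: valid input is returned, otherwise no-match.
    return { waitSeconds: 0, result: state.valid ? "return-input" : "nomatch" };
  }
  if (state.digits.length === 0) {
    return { waitSeconds: props.timeout, result: "noinput" };
  }
  if (!state.valid) {
    return { waitSeconds: props.interdigittimeout, result: "nomatch" };
  }
  if (state.extendable) {
    return { waitSeconds: props.interdigittimeout, result: "return-input" };
  }
  if (state.termcharActive) {
    return { waitSeconds: props.termtimeout, result: "return-input" };
  }
  return { waitSeconds: 0, result: "return-input" };
}

// Default property values as documented in this chapter.
const dtmfDefaults = { timeout: 5, interdigittimeout: 3, termtimeout: 0 };
```

With the defaults, a complete fixed-length entry that cannot be extended is returned with no pause even when a termchar is active, because termtimeout is 0 seconds.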


13 Variables

This chapter describes variables defined by VoiceXML. See:

- Variable Summary on page 363 for an overview of the variables, grouped by scope
- Variable Index on page 364 for an alphabetical list of all variables

In the cases where the BeVocal VoiceXML interpreter deviates from the VoiceXML 2.0 Specification, the difference is clearly marked below in the following ways:

Not Implemented    Functionality not currently available.
Extension          Added functionality.
Deprecated         Non-standard or superseded feature that was supported by an earlier version but has been replaced by a new feature.

Variable Summary

The following table organizes variables according to the scope in which each variable's value is available:

- Session variables are available to all applications that are executed in a particular session with the VoiceXML interpreter.
- Application variables are available throughout a particular application.
- Event-related variables are available in event handlers only.

Scope            Variable
Session          session.bevocal.timeincall    Extension
                 session.bevocal.version       Extension
                 session.iidigits              Not Implemented
                 session.telephone.ani
                 session.telephone.dnis
Application      application.lastaudio$        Extension
                 application.lastresult$       New in VoiceXML 2.0
Event handler    _event                        New in VoiceXML 2.0
                 _message                      New in VoiceXML 2.0

Variable Index

The following table lists the variables in alphabetical order.

Variable                                           Scope
_event                        New in VoiceXML 2.0  Event handler
_message                      New in VoiceXML 2.0  Event handler
application.lastaudio$        Extension            Application
application.lastresult$       New in VoiceXML 2.0  Application
session.bevocal.timeincall    Extension            Session
session.bevocal.version       Extension            Session
session.iidigits              Not Implemented      Session
session.telephone.ani         VoiceXML 1.0 only    Session
session.telephone.dnis        VoiceXML 1.0 only    Session

Variable Descriptions

This section contains variable descriptions in alphabetical order.

_event

New in VoiceXML 2.0. Within the anonymous scope of an event handler, the JavaScript variable _event is set to the name of the event that was thrown. For example:

    <error>
      <prompt>event is <value expr="_event"/></prompt>
    </error>

If the event is error.badfetch.http.500, this handler will say, "event is error.badfetch.http.500".

_message

New in VoiceXML 2.0. Within the anonymous scope of an event handler, the JavaScript variable _message is set to the message string that provides additional context about the event that was thrown. If no message was supplied when the event was thrown, this variable has the value undefined.

application.lastaudio$

Extension. If the bevocal.audio.capture property is set to true, the interpreter captures spoken audio for each recognition.

- When the user's speech matches a field grammar, a form grammar, a menu choice, or a link grammar, the application.lastaudio$ variable contains an audio capture of the user's speech.
- When a no-match event occurs in a field, the application.lastaudio$ variable contains an audio capture of the user's speech.
- When a no-input event occurs in a field, the application.lastaudio$ variable is cleared.

You can send the captured audio to a server using a <data> element. Doing so is useful if you need a record of the user's speech for legal reasons.

application.lastresult$

New in VoiceXML 2.0. When the user provides input during the execution of a <field> element or the <initial> element of a mixed-initiative form, the interpreter invokes the speech-recognition engine on the user's response. The most likely recognized utterance is used to set the relevant input variables. This utterance is chosen on the basis of speech-recognition engine confidence levels as well as grammar weighting and scoping rules. If this utterance matches multiple rules in an ambiguous grammar, the input variables are set according to an arbitrary one of those rules.

Additional information from the speech-recognition engine is available in the read-only variable application.lastresult$. This variable has a dual interpretation:

- It acts like a normal object whose properties describe the most likely recognized result; that is, the one that was used to set the input variables.
- It also acts like an array of objects, each describing one of the likely recognition results. This array always has at least one element. If multiple recognition is enabled, it may contain additional elements. See Maximum Array Size for further details.

The application.lastresult$ variable, and each member of the application.lastresult$ array, is a JavaScript object with the following properties:

Property          Description
confidence        The recognition confidence level of this result (with 0.0 representing the lowest confidence and 1.0 representing the highest).
utterance         The raw string of words that were recognized, for example "portland oregon".
inputmode         The mode in which user input was provided, one of:
                      dtmf     DTMF input
                      voice    spoken input
interpretation    The interpretation of this result. This property contains a JavaScript object whose properties correspond to the slots that can be set by the matched grammar rule.

If the application.lastresult$ array contains more than one element, each element has a different combination of utterance and interpretation; that is, different elements differ in the utterance, the interpretation, or both. Each element corresponds to one interpretation of one likely utterance. The same utterance may have different interpretations, and two or more different utterances may have a common interpretation.

The elements of the application.lastresult$ array for different possible utterances are sorted in descending order of confidence level; elements for the different interpretations of a given utterance, or for multiple utterances with the same confidence, are in an undefined order.

You can use the expression application.lastresult$.length to get the number of elements in the application.lastresult$ array.

The application.lastresult$ variable holds information about the last recognition that occurred within the application. Before the interpreter enters a waiting state (a recognition, record, or transfer), the variable is set to undefined. When a nomatch event is thrown, application.lastresult$ is set to the nomatch result. When a noinput event is thrown, application.lastresult$ is not reset to undefined.

An application can check the variable in the <filled> element of the field or form for which input was received.
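Because each array element is one interpretation of one utterance, an application that wants to know which utterances were heard, and how many ways each can be interpreted, has to group the elements itself. The following sketch shows one way to do that in plain JavaScript. It is an illustration under stated assumptions: makeLastResult builds a hypothetical stand-in for application.lastresult$ (and assumes the chosen result is the first element), and groupByUtterance and resultAt are hypothetical helper names; the sample values are invented.

```javascript
// Build a stand-in for application.lastresult$: an array of results that
// also carries the properties of the chosen result (assumed here to be
// the first element) directly on the array object.
function makeLastResult(entries) {
  const results = entries.map((entry) => ({ ...entry }));
  return Object.assign(results, results[0]);
}

// Return element i, or undefined when i is outside the array, so that
// callers never touch properties of an element that does not exist.
function resultAt(lastResult, i) {
  return i >= 0 && i < lastResult.length ? lastResult[i] : undefined;
}

// Map each distinct utterance to the list of its interpretations,
// preserving the order in which the elements appear in the array.
function groupByUtterance(lastResult) {
  const groups = new Map();
  for (let i = 0; i < lastResult.length; i++) {
    const entry = lastResult[i];
    if (!groups.has(entry.utterance)) {
      groups.set(entry.utterance, []);
    }
    groups.get(entry.utterance).push(entry.interpretation);
  }
  return groups;
}

// Invented sample data: one utterance with one interpretation and a
// second utterance with two interpretations.
const sampleResult = makeLastResult([
  { confidence: 0.37, utterance: "boston", inputmode: "voice",
    interpretation: { city: "Boston", state: "MA" } },
  { confidence: 0.36, utterance: "austin", inputmode: "voice",
    interpretation: { city: "Austin", state: "TX" } },
  { confidence: 0.36, utterance: "austin", inputmode: "voice",
    interpretation: { city: "Austin", state: "CA" } },
]);
```

An utterance whose group holds more than one interpretation is ambiguous, and the application might ask the caller to choose among them rather than accept the values already stored in the input variables.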

Maximum Array Size

The maxnbest and bevocal.maxinterpretations properties determine which multiple-recognition features are enabled and the maximum number of elements in the application.lastresult$ array. The following table shows the possible combinations of values for these properties.

maxnbest          bevocal.maxinterpretations    Maximum Array Size                       Description
1                 Unset, 1, or less than 1      1                                        Both features are disabled
1                 Greater than 1                bevocal.maxinterpretations               Only multiple interpretations is enabled
Greater than 1    Unset or less than 1          maxnbest                                 Both features are enabled
Greater than 1    1                             maxnbest                                 Only N-best recognition is enabled
Greater than 1    Greater than 1                maxnbest * bevocal.maxinterpretations    Both features are enabled

If both features are disabled, the application.lastresult$ array contains a single element with index 0, and application.lastresult$[0] is identical to application.lastresult$. If one or both of the features are enabled, the array may contain multiple elements, up to the maximum specified in the table. If N results were found, the array contains N elements with indexes from 0 to N-1.

Variable Structure

The entire structure of the application.lastresult$ variable is as follows:

    application.lastresult$ {
        confidence
        utterance
        inputmode
        interpretation {
            slotname1
            slotname2
            ...
        }
    }
    application.lastresult$[0] {
        confidence
        utterance
        inputmode
        interpretation {...}
    }
    ...
    application.lastresult$[n] {
        ...
    }

Recognition Results

You typically examine the application.lastresult$ object if multiple recognition is disabled. You typically examine the application.lastresult$ array if multiple recognition is enabled.

Whether or not multiple recognition is enabled, application.lastresult$.utterance is the most likely recognized utterance and application.lastresult$.interpretation is the chosen interpretation of that utterance: an object whose properties are the slots that were filled in, corresponding to the input variables that were set.
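The table under Maximum Array Size can be read as a small decision function. The sketch below restates it in JavaScript purely as an aid to reading the table; the function name maxArraySize and the shape of the returned object are hypothetical, the property values are assumed to be integers, and a maxnbest value below 1 is treated like 1 (an assumption, since the table does not cover it).

```javascript
// Compute the maximum number of elements in application.lastresult$ and
// which multiple-recognition features are enabled.
// maxnbest           - value of the maxnbest property (default 1)
// maxinterpretations - value of bevocal.maxinterpretations, or undefined
//                      when the property is not set
function maxArraySize(maxnbest, maxinterpretations) {
  const nbestOn = maxnbest > 1;
  const unsetOrLow = maxinterpretations === undefined || maxinterpretations < 1;

  if (!nbestOn) {
    if (!unsetOrLow && maxinterpretations > 1) {
      // Only multiple interpretations is enabled.
      return { size: maxinterpretations, nbest: false, multipleInterpretations: true };
    }
    // Both features are disabled.
    return { size: 1, nbest: false, multipleInterpretations: false };
  }
  if (unsetOrLow) {
    // maxnbest alone turns on both features.
    return { size: maxnbest, nbest: true, multipleInterpretations: true };
  }
  if (maxinterpretations === 1) {
    // Only N-best recognition is enabled.
    return { size: maxnbest, nbest: true, multipleInterpretations: false };
  }
  // Both features are enabled and the limits multiply.
  return { size: maxnbest * maxinterpretations, nbest: true, multipleInterpretations: true };
}
```

For instance, maxArraySize(4, 3) reports a maximum of 12 elements, which is why the tips for maxnbest recommend keeping both values low.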

Tip: Remember not to access properties of a particular element of the application.lastresult$ array until you have verified that the element exists. If you try to access application.lastresult$[i].utterance when i is greater than or equal to the number of results, an error.semantic event is thrown.

Multiple Recognized Utterances

If the speech-recognition engine finds multiple possible utterances, the application.lastresult$ array contains at least one element for each utterance. The elements for different utterances are ordered by speech-recognition engine confidence levels alone. When multiple utterances are recognized, the value of application.lastresult$ is always identical to application.lastresult$[0].

For example, suppose the user muttered something that sounded like Austin or Boston as the initial input to a form with an unambiguous grammar. The application.lastresult$ variable might be set as follows.

Recognition Result: Most likely utterance; chosen (and only) interpretation of the utterance

    Property                                         Value
    application.lastresult$.confidence               0.38
    application.lastresult$.utterance                "austin"
    application.lastresult$.inputmode                "voice"
    application.lastresult$.interpretation.city      "Austin"
    application.lastresult$.interpretation.state     "TX"

Recognition Result: Utterance with the highest confidence level; first (and only) interpretation of the utterance

    Property                                         Value
    application.lastresult$[0].confidence            0.38
    application.lastresult$[0].utterance             "austin"
    application.lastresult$[0].inputmode             "voice"
    application.lastresult$[0].interpretation.city   "Austin"
    application.lastresult$[0].interpretation.state  "TX"

Recognition Result: Utterance with the second highest confidence level; first (and only) interpretation of the utterance

    Property                                         Value
    application.lastresult$[1].confidence            0.37
    application.lastresult$[1].utterance             "boston"
    application.lastresult$[1].inputmode             "voice"
    application.lastresult$[1].interpretation.city   "Boston"
    application.lastresult$[1].interpretation.state  "MA"

With the application.lastresult$ variable set as shown in the preceding table, the city input variable would be set to Austin and the state input variable would be set to TX. The application could either proceed with those settings, or examine the application.lastresult$ array to determine that another interpretation of the input is possible.

Multiple Interpretations

If a particular utterance matches a single grammar rule, the application.lastresult$ array contains a single element for that utterance. The interpretation property of this element gives the slot values set by the matched grammar rule.

If the utterance matches multiple rules in an ambiguous grammar, the application.lastresult$ array contains multiple elements for that utterance; the interpretation property of each of these elements gives the slot values set by one of those grammar rules. The order of these elements within the array is undefined.

For example, suppose the user clearly said Portland. The chosen interpretation for this recognized result might be Portland, Oregon, and the application.lastresult$ variable might be set as follows.

Recognition Result: Most likely (and only) utterance; chosen interpretation of the utterance

    Property                                         Value
    application.lastresult$.confidence               0.8
    application.lastresult$.utterance                "portland"
    application.lastresult$.inputmode                "voice"
    application.lastresult$.interpretation.city      "Portland"
    application.lastresult$.interpretation.state     "OR"

Recognition Result: Most likely (and only) utterance; first interpretation of the utterance

    Property                                         Value
    application.lastresult$[0].confidence            0.8
    application.lastresult$[0].utterance             "portland"
    application.lastresult$[0].inputmode             "voice"
    application.lastresult$[0].interpretation.city   "Portland"
    application.lastresult$[0].interpretation.state  "ME"

Recognition Result: Most likely (and only) utterance; second interpretation of the utterance

    Property                                         Value
    application.lastresult$[1].confidence            0.8
    application.lastresult$[1].utterance             "portland"
    application.lastresult$[1].inputmode             "voice"
    application.lastresult$[1].interpretation.city   "Portland"
    application.lastresult$[1].interpretation.state  "OR"

With the application.lastresult$ variable set as shown in the preceding table, the city input variable would be set to Portland and the state input variable would be set to OR. The application could either proceed with those settings, or examine the application.lastresult$ array to determine that another interpretation of the input is possible.

Multiple Utterances and Interpretations

If the speech-recognition engine finds multiple possible utterances that match an ambiguous grammar, one or more of the utterances may have multiple interpretations.

For example, suppose the user muttered something that sounded like Austin or Boston. If the speech-recognition engine found two interpretations for the utterance Austin and one for Boston, the application.lastresult$ variable might be set as follows.

Recognition Result: Most likely utterance; chosen (and only) interpretation of the utterance

    Property                                         Value
    application.lastresult$.confidence               0.37
    application.lastresult$.utterance                "boston"
    application.lastresult$.inputmode                "voice"
    application.lastresult$.interpretation.city      "Boston"
    application.lastresult$.interpretation.state     "MA"

Recognition Result: Utterance with the highest confidence level; first (and only) interpretation of the utterance

    Property                                         Value
    application.lastresult$[0].confidence            0.37
    application.lastresult$[0].utterance             "boston"
    application.lastresult$[0].inputmode             "voice"
    application.lastresult$[0].interpretation.city   "Boston"
    application.lastresult$[0].interpretation.state  "MA"

Recognition Result: Utterance with the second highest confidence level; first interpretation of the utterance

    Property                                         Value
    application.lastresult$[1].confidence            0.36
    application.lastresult$[1].utterance             "austin"
    application.lastresult$[1].inputmode             "voice"
    application.lastresult$[1].interpretation.city   "Austin"
    application.lastresult$[1].interpretation.state  "TX"

Recognition Result: Utterance with the second highest confidence level; second interpretation of the utterance

    Property                                         Value
    application.lastresult$[2].confidence            0.36
    application.lastresult$[2].utterance             "austin"
    application.lastresult$[2].inputmode             "voice"
    application.lastresult$[2].interpretation.city   "Austin"
    application.lastresult$[2].interpretation.state  "CA"

With the application.lastresult$ variable set as shown in the preceding table, the city input variable would be set to Boston and the state input variable would be set to MA. The application could proceed with those settings, or examine the application.lastresult$ array to determine that other interpretations of the input are possible.

session.bevocal.timeincall

Extension. The number of milliseconds since the beginning of this call.

session.bevocal.version

Extension. The version number of the VoiceXML interpreter (for example, 1.2.3).

session.iidigits

Not Implemented. Information Indicator Digits. The session.iidigits variable is set to information about the caller's location (pay phone, and so on), when available. A complete list is available in the Local Exchange Routing Guide published by Telcordia.

session.telephone.ani

Automatic Number Identification. The session.telephone.ani variable is set to the caller's telephone number, when available.

session.telephone.dnis

Dialed Number Identification Service. The session.telephone.dnis variable is set to the number the caller dialed, when available.


14 JavaScript Functions and Objects

This chapter describes JavaScript functions and objects that are available to BeVocal VoiceXML applications. These are all extensions to the VoiceXML specification. For general information on using JavaScript in BeVocal VoiceXML applications, see JavaScript Quick Reference.

JavaScript Constants

bevocal.outboundrequestid

Extension. For calls initiated using the Outbound Notification Service, the value of bevocal.outboundrequestid provides the Outbound Request ID associated with the call. This value can be used to map specific application invocations to the request used to initiate the application.

bevocal.sessionid

Extension. Provides a unique ID for each call. This value can be used to keep track of call-specific information in the server-side component of your application if cookies cannot be used for any reason.

_addheader

Extension. Method of a JavaScript SOAP proxy object. Specifies an additional SOAP header for the SOAP service that the proxy object represents. Once the SOAP header is set on a service object, the SOAP header element is part of the SOAP message for every method executed on that service object.

Syntax

    _addheader(
        String headernamespace,
        String headername,
        Object headervalue,
        String actor,
        String mustunderstand
    )

Parameters

Parameter          Description
headernamespace    The namespace component of the desired header's name.
headername         The local component of the desired header's name.
headervalue        The value of the header.
actor              The value of the actor attribute for this header. Optional.
mustunderstand     The value of the mustunderstand attribute for this header. Optional.

See Also

Chapter 10, SOAP Client Facility

Example

You can use the Call Detail Record Access Service to access information about individual calls to a VoiceXML application. You might decide to write another VoiceXML application that presents information retrieved using the CDR service. Because this is a BeVocal platform service, you must pass security information in SOAP request headers to the service. You could do so as follows:

    <script>
      var service = bevocal.soap.servicefromwsdl(
        " e_v2?wsdl",
        "CDRAccessService_v2",
        " "CdrAccessService",
        " e_v2"
      );
      service._addheader(
        " "platformservicessessionid",
        "ACB035CB-C e-99BA-3D5B8468BB76");
      ...
    </script>

After the call to _addheader, SOAP messages sent through that proxy object would include this header:

    <soapenv:header>
      <ns1:platformservicessessionid xsi:type="xsd:string" xmlns:ns1=" >
        ACB035CB-C e-99BA-3D5B8468BB76
      </ns1:platformservicessessionid>
    </soapenv:header>

bevocal.cookies.addclientcookie

Extension. Sets a cookie on the VXML client. The created cookie is treated as an HTTP cookie and is sent with subsequent fetches in the application. The cookie lasts for the duration of the call or the session.

Syntax

    bevocal.cookies.addclientcookie(
        String domain,
        String key,
        String value
    )

Parameters

Parameter    Description
domain       A valid domain. If a valid domain is passed, the created cookie is passed as an HTTP header for subsequent fetches matching the same domain.
key          The name of the cookie.
value        The value of the cookie.

See Also

bevocal.cookies.deleteclientcookie
bevocal.cookies.getclientcookie
bevocal.cookies.getclientcookies

Example

This example demonstrates the bevocal.cookies.addclientcookie, bevocal.cookies.deleteclientcookie, bevocal.cookies.getclientcookie, and bevocal.cookies.getclientcookies methods. Note that the first two cookies are sent as HTTP cookies because the domain parameter is set to cafe.bevocal.com.

    <?xml version="1.0" encoding="iso "?>
    <vxml version="2.0" xmlns=" xmlns:bevocal="
      <var name="user_1"/>
      <var name="user_2"/>
      <var name="user_3"/>
      <var name="user"/>
      <form id="form1">
        <block>
          <prompt>testing client cookies</prompt>
          <script>
            bevocal.log("setting cookies now");
            bevocal.cookies.addclientcookie("cafe.bevocal.com", "user_1", "1234");
            bevocal.cookies.addclientcookie("cafe.bevocal.com", "user_2", "hello world");
            bevocal.cookies.addclientcookie(null, "user_3", "xyz123");
            user_1 = bevocal.cookies.getclientcookie(null, "user_1");
            user_2 = bevocal.cookies.getclientcookie(null, "user_2");
            user_3 = bevocal.cookies.getclientcookie(null, "user_3");
            bevocal.log("user_1 = " + user_1 + "; user_2 = " + user_2 + "; user_3 = " + user_3);
            user = bevocal.cookies.getclientcookies(null);
            bevocal.log("the value from map is " + user.get("user_1"));
          </script>
        </block>
        <block>
          <prompt><value expr="user_1"/><value expr="user_2"/><value expr="user_3"/> </prompt>
          <goto next="clientcookies_2.vxml"/>
        </block>
      </form>
    </vxml>

    <?xml version="1.0" encoding="iso "?>
    <vxml version="2.0" xmlns=" xmlns:bevocal="
      <var name="user_2"/>
      <form id="form1">
        <block>
          <prompt>testing client cookies in second document </prompt>
          <script>
            user_2 = bevocal.cookies.getclientcookie(null, "user_2");
            bevocal.log("user_2 = " + user_2);
            bevocal.cookies.deleteclientcookie("user_2");
          </script>
        </block>
        <block>
          <prompt><value expr="user_2"/></prompt>
          <goto next="clientcookies_3.vxml"/>
        </block>
      </form>
    </vxml>

    <?xml version="1.0" encoding="iso "?>
    <vxml version="2.0" xmlns=" xmlns:bevocal="
      <var name="user_2"/>
      <var name="user_3"/>
      <form id="form1">
        <block>
          <prompt>testing client cookies in third document </prompt>
          <script>
            user_2 = bevocal.cookies.getclientcookie(null, "user_2");
            user_3 = bevocal.cookies.getclientcookie(null, "user_3");
            bevocal.log("user_2 = " + user_2 + "; user_3 = " + user_3);
          </script>
        </block>
        <block>

bevocal.cookies.deleteclientcookie

Extension. Deletes a cookie on the VXML client.

Syntax

    bevocal.cookies.deleteclientcookie(
        String domain,
        String key
    )

Parameters

    Parameter   Description
    domain      A valid domain. If a valid domain is passed, a cookie with name key that matches the domain is deleted. Otherwise, a cookie with that name is deleted.
    key         The name of the cookie.

See Also

    bevocal.cookies.addclientcookie
    bevocal.cookies.getclientcookie
    bevocal.cookies.getclientcookies

Example

    See bevocal.cookies.addclientcookie on page 372.

bevocal.cookies.getclientcookie

Extension. Gets the value of a cookie.

Syntax

    bevocal.cookies.getclientcookie(
        String domain,
        String key
    )

Parameters

    Parameter   Description
    domain      A valid domain. If a valid domain is passed, a cookie with name key that matches the domain is returned. Otherwise, a cookie with that name is returned.
    key         The name of the cookie.

See Also

    bevocal.cookies.addclientcookie
    bevocal.cookies.deleteclientcookie

    bevocal.cookies.getclientcookies

Example

    See bevocal.cookies.addclientcookie on page 372.

bevocal.cookies.getclientcookies

Extension. Returns a HashMap of all key/value cookie pairs.

Syntax

    bevocal.cookies.getclientcookies(
        String domain
    )

Parameters

    Parameter   Description
    domain      A valid domain. If a valid domain is passed, cookies matching that domain are returned. Otherwise, all cookies are returned.

See Also

    bevocal.cookies.addclientcookie
    bevocal.cookies.deleteclientcookie
    bevocal.cookies.getclientcookie

Example

    See bevocal.cookies.addclientcookie on page 372.

bevocal.enroll.removeenrolledphrase

Extension. Deletes enrolled phrases from grammars.

Syntax

    bevocal.enroll.removeenrolledphrase(grammar, speakerid, phraseid, key,

        xml:lang)

Parameters

    Parameter   Description
    grammar     The name of the grammar containing the enrolled phrase to delete. Required.
    speakerid   The id of the speaker who enrolled the phrase to be deleted. Required.
    phraseid    The id of the phrase to be deleted. Required.
    key         The security key for accessing the specified enrollment grammar. Optional when running on the BeVocal Café (you must pass a key argument, but it can be null or an empty string). Required when running in other environments such as Enterprise Hosting.
    xml:lang    The language of the enrolled phrase to delete. Optional. The value of this attribute must be the same as the value of the xml:lang attribute of the <vxml> tag for the document.

If you are using enrollment to maintain a voice address book or other dynamic lookup mechanism, you need to be able to delete phrases from the grammar as well as add them.

bevocal.getproperty

Extension. Gets the value of a VoiceXML property.

Syntax

    bevocal.getproperty(String name)

Parameters

    Parameter   Description
    name        The name of the property.

Returns

    The value of the specified property in the current scope.

bevocal.getversion

Extension. Returns the VoiceXML interpreter version.

Syntax

    bevocal.getversion()

Parameters

    None.

Returns

    The version number of the VoiceXML interpreter, as a string. Currently, this function always returns 2.4.

bevocal.log

Extension. Writes a message to the BeVocal Café website.

Syntax

    bevocal.log(String message)

Parameters

    Parameter   Description
    message     The message to be written to the website.

The log function writes the specified message to the BeVocal Café call log, which you can view on the Café website. This function performs the same operation as a <log> VoiceXML element with no label or expr attribute.

bevocal.soap.servicefromwsdl

Extension. Method of the bevocal.soap object. Given the URL for a WSDL file, create an object to act as a proxy for one of the services described in the file. This is the preferred way to create a service proxy object. The WSDL file allows the interpreter to pre-load the proxy object with information about all of the service's methods and their parameter types, making the mapping from JavaScript types to SOAP encodings much more accurate.

Syntax

    bevocal.soap.servicefromwsdl(
        String WSDLUrl,
        String port,
        String svcnamespace,
        String svcname,
        String endpointurl
    )

Parameters

    Parameter      Description
    WSDLUrl        The URL of the WSDL file describing this service. Information on the endpoint, method names, argument types, and namespaces will be retrieved from the file.
    port           The port name of the service, as given in a <port name="..."> element inside the <service> element for the service.

    svcnamespace   The namespace used on the <service> element for the desired service. If the <service> element does not have its own namespace (that is, it does not have its own xmlns attribute), use the value of the targetnamespace attribute from the WSDL file's <definitions> element. If neither of these is present, leave the parameter blank.
    svcname        The name of the service, as given in a <service name="..."> element for the service.
    endpointurl    Optional. If provided, the SOAP endpoint for this service. If this parameter is not provided, the endpoint retrieved from the WSDL will be used.

Error Handling

If an error occurs while locating a SOAP service or creating a SOAP proxy object, an exception of type bevocal.soap.soapexception will be thrown. Possible errors include:

    Receives a malformed URL.
    Cannot open a network connection or retrieve a resource.
    Referenced WSDL file is malformed or does not contain the requested service or port.

See Also

    Chapter 10, SOAP Client Facility

bevocal.soap.servicefromendpoint

Extension. Method of the bevocal.soap object. Given the name and endpoint URL of a SOAP service, create an object to act as a proxy for the service. Use this method only when the WSDL for a service is not available: the proxy it creates has no information on method names and parameter types, so it has to make more assumptions when it makes SOAP method calls.

Syntax

    bevocal.soap.servicefromendpoint(
        String endpointurl,
        String svcnamespace,
        String svcname
    )

Parameters

    Parameter      Description
    endpointurl    The SOAP endpoint for this service.
    svcnamespace   The namespace component of the desired service's name.
    svcname        The local component of the desired service's name.

Error Handling

If an error occurs while locating a SOAP service or creating a SOAP proxy object, an exception of type bevocal.soap.soapexception will be thrown. Possible errors include:

    Receives a malformed URL.
    Cannot open a network connection or retrieve a resource.

    Referenced WSDL file is malformed or does not contain the requested service or port.

See Also

    Chapter 10, SOAP Client Facility

Example

To use a temperature service available from go there and see that the WSDL for this service is at The content of the WSDL is:

<?xml version="1.0"?>
<definitions name="temperatureservice"
    targetnamespace="
    xmlns:tns="
    xmlns:xsd="
    xmlns:soap="
    xmlns="
  <message name="gettemprequest">
    <part name="zipcode" type="xsd:string"/>
  </message>
  <message name="gettempresponse">
    <part name="return" type="xsd:float"/>
  </message>
  <porttype name="temperatureporttype">
    <operation name="gettemp">
      <input message="tns:gettemprequest"/>
      <output message="tns:gettempresponse"/>
    </operation>
  </porttype>
  <binding name="temperaturebinding" type="tns:temperatureporttype">
    <soap:binding style="rpc" transport="
    <operation name="gettemp">
      <soap:operation soapaction=""/>
      <input>
        <soap:body use="encoded" namespace="urn:xmethods-temperature" encodingstyle="
      </input>
      <output>
        <soap:body use="encoded" namespace="urn:xmethods-temperature" encodingstyle="
      </output>
    </operation>
  </binding>
  <service name="temperatureservice">
    <documentation>returns current temperature in a given U.S. zipcode</documentation>
    <port name="temperatureport" binding="tns:temperaturebinding">
      <soap:address location="
    </port>
  </service>
</definitions>

The bold lines in the definition provide the rest of the information you need to create a proxy object for this service in your VoiceXML application:

<script>
<![CDATA[
  var service = bevocal.soap.servicefromwsdl(
    // WSDL URL
    "
    // name attribute of port
    "TemperaturePort",
    // targetnamespace attribute of definitions
    "
    // name attribute of service
    "TemperatureService",
    // location attribute of soap:address child of port element
    "
    //
  );
]]>
</script>

bevocal.soap.locateservice

Extension. Method of the bevocal.soap object. Uses the BeVocal SOAP service registry to locate either the standard BeVocal SOAP service whose id is serviceid or a particular version of that service. Returns a proxy object that can be used to call its methods. Currently, there are no services in the BeVocal SOAP service registry.

Syntax

    bevocal.soap.locateservice(
        String serviceid,
        String version
    )

Parameters

    Parameter   Description
    serviceid   The ID of the SOAP service you wish to locate and create a proxy for.
    version     Optional. The desired version of the service. If not present, returns a proxy for the highest-numbered version of the service (if there are multiple versions).

Error Handling

If an error occurs while locating a SOAP service or creating a SOAP proxy object, an exception of type bevocal.soap.soapexception will be thrown. Possible errors include:

    Receives a service ID or version number that is not in the registry.
    Cannot open a network connection or retrieve a resource.

    Referenced WSDL file is malformed or does not contain the requested service or port.

See Also

    Chapter 10, SOAP Client Facility

bevocal.soap.soapexception

Extension. Exception object. If an error occurs while creating a SOAP proxy object, an exception of type bevocal.soap.soapexception will be thrown. This object has two dynamic properties and a number of constants:

    Property         Description
    type             String constant. bevocal.soap.soapexception.
    message          String. A message describing the reason for the exception.
    cause            Number. A number giving the reason for the exception. It will be one of the following constants.
    BAD_SERVICE_ID   Number constant. The serviceid passed to locateservice could not be found in the registry.
    BAD_VERSION      Number constant. The version number passed to locateservice could not be found in the registry.
    INVALID_URL      Number constant. The endpoint or WSDL URL was malformed.
    NETWORK_ERROR    Number constant. A network error occurred while accessing the registry or the WSDL file. This can include an error fetching the WSDL file, a timeout error, and so on.
    UNKNOWN_ERROR    Number constant. The exact error could not be determined; see the message for details. Currently, errors parsing the WSDL fall into this category; in a future release we hope to split them into a separate category.

Example

<script>
  var service;
  try {
    service = bevocal.soap.locateservice("myservice", "2.0");
  } catch (error if error.type == "bevocal.soap.soapexception") {
    if (error.cause == error.BAD_VERSION) {
      service = bevocal.soap.locateservice("myservice", "1.0");
    } else {
      throw error;
    }
  }
</script>

If the exception is not caught in JavaScript code in the <script>, it will propagate to the VoiceXML interpreter and be re-thrown as an error.semantic in VoiceXML.

See Also

    Chapter 10, SOAP Client Facility
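
The example above uses the conditional form catch (error if ...), which is not part of standard JavaScript. As a rough sketch, the same fallback logic can be written with an ordinary catch clause and an explicit test. The bevocal object exists only inside the interpreter, so this sketch runs against a stand-in called fakeSoap. The names fakeSoap, makeSoapException, and locateWithFallback are hypothetical, and so are the numeric values given to the constants, which the guide does not specify.

```javascript
// Assumed numeric values; the guide only says these are Number constants.
var BAD_SERVICE_ID = 1;
var BAD_VERSION = 2;

// Build an object shaped like the soapexception described above (stand-in only).
function makeSoapException(cause, message) {
  return {
    type: "bevocal.soap.soapexception",
    message: message,
    cause: cause,
    BAD_SERVICE_ID: BAD_SERVICE_ID,
    BAD_VERSION: BAD_VERSION
  };
}

// Hypothetical stand-in for bevocal.soap: a registry that only knows
// version "1.0" of a service called "myservice".
var fakeSoap = {
  locateservice: function (serviceid, version) {
    if (serviceid !== "myservice") {
      throw makeSoapException(BAD_SERVICE_ID, "unknown service: " + serviceid);
    }
    if (version !== "1.0") {
      throw makeSoapException(BAD_VERSION, "unknown version: " + version);
    }
    return { id: serviceid, version: version };
  }
};

// Try the preferred version; fall back only when the failure is BAD_VERSION.
// Any other error, SOAP-related or not, is rethrown unchanged.
function locateWithFallback(soap, serviceid, preferred, fallback) {
  try {
    return soap.locateservice(serviceid, preferred);
  } catch (error) {
    if (error.type === "bevocal.soap.soapexception" &&
        error.cause === error.BAD_VERSION) {
      return soap.locateservice(serviceid, fallback);
    }
    throw error;
  }
}

var service = locateWithFallback(fakeSoap, "myservice", "2.0", "1.0");
```

Inside a real <script> element, bevocal.soap would be passed in place of fakeSoap. Keeping the type check inside the catch body means errors that are not SOAP exceptions still propagate, as they do in the original example.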

bevocal.soap.soapfault

Extension. Fault object. This object has the following properties:

    Property           Description
    type               String constant. bevocal.soap.soapfault.
    faultstring        String. A message describing the reason for the exception.
    faultcode          String. A code describing the reason for the failure. See section of the SOAP 1.1 specification for details.
    faultactor         Number. The serviceid passed to locateservice could not be found in the registry.
    detailstring       String. A String representation of the detail element of the fault. The detail element is present whenever a fault is caused by processing of the Body element.
    details            FaultDetails. An object whose properties are the detail elements; that is, the immediate child elements of the detail element in the fault element. See section 4.4 of the SOAP 1.1 specification.
    MUST_UNDERSTAND    String constant. SOAP fault code: A child of the SOAP Header element that contained a mustunderstand attribute equal to "1" was not understood.
    VERSION_MISMATCH   String constant. SOAP fault code: The processor found an invalid namespace for the SOAP Envelope.
    SERVER             String constant. SOAP fault code: Processing of the call failed due to an error on the server. The failure was not directly related to the contents of the message, and the message may succeed at a later point in time.
    CLIENT             String constant. SOAP fault code: Processing of the call failed due to the contents of the message. This failure code is returned when a function is called with invalid parameters, bad data, and so on.

See Also

    Chapter 10, SOAP Client Facility

bevocal.soap.faultdetails

Extension. FaultDetails object. Essentially, bevocal.soap.faultdetails is an object whose property names are the local components (without namespaces) of the names of the elements under the details element, and whose property values are the text contents of those elements.

    Property     Type              Description
    type         String constant   bevocal.soap.faultdetails
    (String)     String            If you try to use a FaultDetails object as a String, the BeVocal interpreter returns a String representation of the details elements. This is intended primarily for debugging.

    (Number)     Number            If you try to use a FaultDetails object as a Number, the BeVocal interpreter returns the number of detail elements represented by the FaultDetails object. This is intended primarily for debugging.
    element id   String            Given a property name that is the local name (without namespace) of one of the child elements of the details element, the property value is the text contents of the child element.

See Also

    Chapter 10, SOAP Client Facility
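
To make the three access patterns in the table concrete, here is a minimal sketch that imitates a FaultDetails object in plain JavaScript. It is not the interpreter's implementation. The helper makeFaultDetails, the element names errorcode and reason, and the exact text of the debugging string are all assumptions made for illustration.

```javascript
// Hypothetical stand-in that mimics the FaultDetails behaviour described above.
// "elements" maps local element names to their text contents.
function makeFaultDetails(elements) {
  var details = { type: "bevocal.soap.faultdetails" };
  var names = Object.keys(elements);

  // One property per child element of the detail element.
  names.forEach(function (name) {
    details[name] = elements[name];
  });

  // Used when the object is converted to a String (format is assumed).
  Object.defineProperty(details, "toString", {
    value: function () {
      return names.map(function (n) {
        return n + "=" + elements[n];
      }).join("; ");
    }
  });

  // Used when the object is converted to a Number: the count of detail elements.
  Object.defineProperty(details, "valueOf", {
    value: function () {
      return names.length;
    }
  });

  return details;
}

// Element names below are made up for the example.
var details = makeFaultDetails({ errorcode: "42", reason: "bad zipcode" });

var reason = details.reason;     // text contents of the "reason" child element
var summary = String(details);   // debugging string for all detail elements
var count = Number(details);     // number of detail elements
```

In application code the property lookup is the form to rely on. As the table notes, the String and Number conversions are intended primarily for debugging.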