How to Use VoiceXML on a Computer or Phone




Workshop Spoken Language Dialog Systems: VoiceXML
Rolf Schwitter, schwitt@ics.mq.edu.au
Macquarie University 2004

PhD Scholarship at Macquarie University
A Natural Language Interface to a Logic Teaching Tool. HyperProof is a popular computer-based logic teaching tool. It comes with the largest-selling introductory logic textbook, Language, Proof and Logic by Jon Barwise and John Etchemendy. Goals: make it possible to write logic problems in English, and make it possible to express the resulting proofs in English. More information: http://www.clt.mq.edu.au/information/scholarships.html

Today's Program
Wednesday: VoiceXML; Prompt Design; Working with the CSLU Speech Toolkit.

Developing Speech Interfaces
Speech interfaces can be developed using general-purpose programming languages or special-purpose programming languages. A special-purpose language such as VoiceXML can simplify application development, reduce network traffic, separate interaction code from application logic code, provide portability and simplicity, and support prototyping and refinement.

Brief History of VoiceXML
In 1999, AT&T, IBM, Lucent Technologies and Motorola formed the VoiceXML Forum. The goal was to establish and promote VoiceXML for making Internet content available by phone and voice. Each company had previously developed its own markup language, and customers were reluctant to invest in proprietary technology. VoiceXML 1.0 was released in March 2000. VoiceXML 2.0 became a W3C Recommendation in March 2004.

W3C Voice Browser Working Group
Organisations participating in the Voice Browser Working Group: BeVocal, Canon, Comverse, France Telecom, Genesys, HeyAnita, Hitachi, HP, IBM, Intel, IWA/HWG, Loquendo, Microsoft, MITRE, Mitsubishi Electric, Motorola, Nokia, Nortel Networks, Nuance, PipeBeach, SAP, Scansoft, Snowshore Networks, SpeechWorks, Sun Microsystems, Syntellect, Tellme Networks, Unisys, Verascape, Vocalocity, VoiceGenie, Voxeo, and Voxpilot.

A VoiceXML Example
<?xml version="1.0"?>
<vxml version="2.0">
  <form>
    <block>
      <prompt bargein="false">
        Welcome to Ajax Travel.
        <audio src="http://www.prerecorded.audiofile..."/>
      </prompt>
    </block>
  </form>
</vxml>

VoiceXML
VoiceXML is designed to describe the speech user interface, and it reduces the amount of speech expertise required (but not the design expertise). VoiceXML documents can be static or dynamically generated by server-side code, and they can use the same business logic and databases as the visual Web. Note: The form interpretation algorithm of the voice browser drives the interaction between the VoiceXML documents and the user.

Missing Features
VoiceXML 2.0 does not support learn-to-speak applications or speaker identification and verification.

VoiceXML Architecture
[Diagram: Phone, then PSTN or Internet (SIP), then Gateway & Voice Server, then Internet (HTTP/VoiceXML), then Web Server]
Phone: regular phone, wireless phone, soft phone.
Gateway & Voice Server: telephony interface, voice browser, automated speech recognition, text-to-speech synthesis, touchtone, audio play/record.
Web Server: VoiceXML documents, audio files, service logic (CGI), transaction processing, database interface.

A VoiceXML Scenario
A customer dials the phone number of a travel agent. The VoiceXML gateway receives the call along with information about the dialed and dialing numbers. The VoiceXML gateway searches a database. If successful, it maps the dialed number to a URL. This URL is the location of the agent's main page (ajax.vxml). The gateway retrieves the ajax.vxml page, together with associated files such as grammars and recorded audio, from the HTTP server. These associated files may be cached on the VoiceXML gateway.

A VoiceXML Scenario (continued)
The VoiceXML interpreter parses and executes the VoiceXML document. The interpreter steps through ajax.vxml, playing prompts, listening for responses and passing them on to a speech recognition engine. If necessary, additional VoiceXML documents and associated files are retrieved from the HTTP server. Recorded audio is served by specifying the URL of the WAV file. Communications between the voice gateway and the HTTP server follow standard HTTP protocols.
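The last two points of the scenario can be sketched in VoiceXML. This is an illustrative sketch, not part of the original slides: the host www.example.com and the file names welcome.wav and booking.vxml are hypothetical placeholders. The text inside the audio element is spoken by the synthesiser only if the recording cannot be fetched.

```xml
<?xml version="1.0"?>
<!-- Sketch only: all URLs below are hypothetical placeholders. -->
<!-- xmlns is the VoiceXML 2.0 namespace; the slides omit it. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <block>
      <prompt>
        <!-- Recorded audio is referenced by the URL of the WAV file.
             The inner text is fallback content for text-to-speech. -->
        <audio src="http://www.example.com/audio/welcome.wav">
          Welcome to Ajax Travel.
        </audio>
      </prompt>
      <!-- Retrieve and execute a further VoiceXML document over HTTP. -->
      <goto next="http://www.example.com/booking.vxml"/>
    </block>
  </form>
</vxml>
```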

More on VoiceXML: An Example Dialog
Computer: Welcome to Ajax Travel.
Computer: Please say your name.
Caller: Sam.
Computer: Do you want to travel by air, rail, or boat?
Caller: Rail.
Computer: You have selected to travel by rail.

A VoiceXML Code Fragment
<?xml version="1.0"?>
<vxml version="2.0">
  <form>
    <block>
      <prompt>Welcome to Ajax Travel.</prompt>
    </block>
    <field name="UserName">
      <prompt>Please say your name.</prompt>
      <grammar type="application/srgs+xml" version="1.0" root="auser">
        <rule id="auser">
          <one-of>
            <item>fred</item>
            <item>sam</item>
          </one-of>
        </rule>
      </grammar>
    </field>
    <filled>
      <!-- transition to another dialog in the current document -->
      <goto next="#travel"/>
    </filled>
  </form>

  <menu id="travel">
    <prompt>Do you want to travel by air, rail, or boat?</prompt>
    <!-- the dialogs "plane" and "boat" are not shown in this fragment -->
    <choice next="#plane">
      <grammar type="application/srgs+xml" version="1.0" root="by_plane">
        <rule id="by_plane"> <item>air</item> </rule>
      </grammar>
    </choice>
    <choice next="#train">
      <grammar type="application/srgs+xml" version="1.0" root="by_train">
        <rule id="by_train"> <item>rail</item> </rule>
      </grammar>
    </choice>
    <choice next="#boat">
      <grammar type="application/srgs+xml" version="1.0" root="by_boat">
        <rule id="by_boat"> <item>boat</item> </rule>
      </grammar>
    </choice>
  </menu>

  <form id="train">
    <block>
      <prompt>
        You have selected to travel by rail.
        Details for making travel arrangements would be here in a real application.
      </prompt>
    </block>
  </form>
</vxml>

VoiceXML Elements
<vxml>     top-level element in each VoiceXML document
<form>     a dialog for presenting information and collecting data
<block>    a container of (non-interactive) executable code
<prompt>   queues speech synthesis and audio output to the user
<field>    declares an input field in a form
<filled>   an action executed when fields are filled
<menu>     a dialog for choosing amongst alternative destinations
<choice>   defines a menu item
<grammar>  specifies a speech recognition or DTMF grammar
<goto>     goes to another dialog in the same or a different document

GUI versus VUI
Characteristic | GUI (HTML) | VUI (VoiceXML)
Fonts | Choice of multiple sizes, colours, and types | Choice of multiple voices
Pictures | Present the picture to the user | Describe the picture with words (caption)
Backgrounds | Choice of multiple colours and patterns | Choice of background music or sounds
Menus | Large number of choices | Usually 7 ± 2 choices
Forms | Users may enter values in any order | Users usually enter values in a predefined order
Prompting the user | Large menu of information and links | Limited number of prompts and options
User's response | Click a field and enter a value | Speak the menu choice or field value
Navigation | Click the link or use the keyboard to enter a URL | Speak the option choice
Human memory | Screen acts as an extension to human memory | Caller is forced to use own memory
Dialog style | Mostly user-directed | Mostly application-directed
Input control | Apply the type constraints | Apply the grammar rules
Accuracy of input | Input is always recognised | Input may not be recognised correctly
Event handling | Display the event message | Present help as a prompt
Global commands | Available as a menu in a separate frame | Announced at the beginning of a session
Dates | Absolute dates are used (28.7.2003) | Relative dates are used ("next Monday")
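Two of the VUI characteristics, presenting help as a prompt and announcing global commands at the beginning of a session, can be sketched in VoiceXML as follows. This is an illustrative sketch rather than part of the original slides: the prompts, the field name and the two cities are assumptions. In VoiceXML 2.0 the universals property defaults to none, so the help command is switched on explicitly here.

```xml
<?xml version="1.0"?>
<!-- Sketch only: prompts and names are illustrative assumptions. -->
<!-- xmlns is the VoiceXML 2.0 namespace; the slides omit it. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Activate the platform's universal "help" command in every dialog. -->
  <property name="universals" value="help"/>
  <form>
    <block>
      <!-- Global commands are announced at the beginning of the session. -->
      <prompt>Welcome to Ajax Travel. You can say help at any time.</prompt>
    </block>
    <field name="destination">
      <prompt>Do you want to fly to New York or Washington?</prompt>
      <grammar type="application/srgs+xml" version="1.0" root="city">
        <rule id="city">
          <one-of>
            <item>new york</item>
            <item>washington</item>
          </one-of>
        </rule>
      </grammar>
      <!-- Help is presented as a prompt; the field prompt is then repeated. -->
      <help>
        <prompt>Please say the name of one of the two cities.</prompt>
        <reprompt/>
      </help>
    </field>
  </form>
</vxml>
```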

VoiceXML Implementations
Web-based VoiceXML development tools:
Tellme at http://studio.tellme.com
BeVocal at http://cafe.bevocal.com
HeyAnita at http://www.heyanita.com
VoiceXML platforms and graphical development tools:
Nuance at http://www.nuance.com
OptimTalk at http://www.optimtalk.cz
Motorola's Mobile ADK for Voice (old beta version only)

Tellme Studio
Tellme Studio is a suite of Web-based VoiceXML development tools. It enables you to build, test, and publish VoiceXML applications without buying or installing any hardware or software. By registering, you can develop your application for free. First, however, check which VoiceXML elements the Tellme voice interpreter supports.

MyStudio
VoiceXML scratchpad: You can write a phone application using the VoiceXML scratchpad.
Application URL: Alternatively, you can write a phone application using a text editor and store the result on a Web server. The application URL points to the initial VoiceXML document.
VoiceXML terminal: You can test the application logic and flow using the VoiceXML terminal.

[Screenshots: MyStudio; VoiceXML Scratchpad; Application URL; VoiceXML Terminal]

Grammar Scratchpad
The Tellme platform provides two choices when writing grammars: use a built-in grammar, or define your own grammar. The supported grammar languages are the Nuance Grammar Specification Language (GSL) and the Speech Recognition Grammar Specification (SRGS). You can execute both GSL and SRGS grammars in the VoiceXML scratchpad, but the Tellme grammar tools support GSL grammars only.
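In VoiceXML 2.0, a built-in grammar is selected with the type attribute of a field. The sketch below is illustrative and not taken from the slides. It uses the boolean type, which accepts yes/no answers and returns true or false. Support for built-in types is optional in the specification, so whether a given platform accepts it is an assumption. The field name confirm and the prompts are invented for the example.

```xml
<?xml version="1.0"?>
<!-- Sketch only: field name and prompts are illustrative assumptions. -->
<!-- xmlns is the VoiceXML 2.0 namespace; the slides omit it. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <!-- type="boolean" activates the built-in yes/no grammar,
         so no grammar element is needed in this field. -->
    <field name="confirm" type="boolean">
      <prompt>Do you want to travel by rail?</prompt>
      <filled>
        <if cond="confirm">
          <prompt>You have selected to travel by rail.</prompt>
        <else/>
          <prompt>Okay, no rail booking was made.</prompt>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```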

[Screenshots: Grammar Scratchpad: SRGS; Grammar Scratchpad: GSL; Grammar Phrase Checker; Grammar Phrase Checker: Returned Value; Grammar Phrase Generator; Grammar Phrase Generator: Generated Phrase]

Connecting to Tellme Studio
To preview your application, you can use a phone and call (408) 678-4465, or you can use a soft phone and call sip:8005558965@sip.studio.tellme.com

Nuance Grammar Specification Language (GSL)
Well-designed grammars are the key to understanding speech. A grammar describes the words and phrases that the recogniser can understand at a specific point in a VoiceXML document. GSL provides a rich set of functionality for writing grammars. However, GSL syntax uses characters that are reserved by XML. Therefore, in-line grammars must be protected by a CDATA section.

GSL Syntax by Example
Here is an in-line grammar in GSL format:
<grammar type="application/x-gsl" mode="voice">
  <![CDATA[
    [
      [(new york) (big apple)]   {<destination "new york">}
      [washington (the capital)] {<destination "washington">}
    ]
  ]]>
</grammar>
All words within a GSL grammar are lowercase; uppercase letters are reserved to reference subgrammars. Square brackets define an "or" condition. Parentheses define an "and" condition, that is, a sequence of words.
This grammar recognises either "New York" or the synonym "Big Apple", or "Washington" or the synonym "The Capital", as the destination.

GSL Syntax by Example (continued)
The field variable named by the name attribute of a field in an active form is set to the value returned by the grammar.
<form>
  <field name="destination">
    <prompt>Do you want to fly to New York or Washington?</prompt>
    <grammar type="application/x-gsl" mode="voice">
      <![CDATA[
        [
          [(new york) (big apple)]   {<destination "new york">}
          [washington (the capital)] {<destination "washington">}
        ]
      ]]>
    </grammar>
  </field>
</form>

What Happens
if the caller does not respond with a predefined expression, or if the caller does not respond at all? In these cases events are thrown by the platform. Events are caught by the catch element:
<catch event="nomatch noinput">
  <reprompt/>
</catch>
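VoiceXML 2.0 also provides the shorthand elements noinput and nomatch, and a count attribute that selects a handler according to how often the event has occurred. The sketch below is an illustration, not part of the slides. It assumes a platform that accepts GSL grammars, and the wording of the prompts is invented. It handles silence and misrecognition separately and gives up after the third failed recognition.

```xml
<?xml version="1.0"?>
<!-- Sketch only: prompts and the three-strikes policy are assumptions. -->
<!-- xmlns is the VoiceXML 2.0 namespace; the slides omit it. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <field name="destination">
      <prompt>Do you want to fly to New York or Washington?</prompt>
      <grammar type="application/x-gsl" mode="voice">
        <![CDATA[
          [
            [(new york) (big apple)]   {<destination "new york">}
            [washington (the capital)] {<destination "washington">}
          ]
        ]]>
      </grammar>
      <!-- Shorthand for catch event="noinput": the caller said nothing. -->
      <noinput>
        <prompt>Sorry, I did not hear you.</prompt>
        <reprompt/>
      </noinput>
      <!-- Used for the first and second unrecognised utterance. -->
      <nomatch count="1">
        <prompt>Sorry, I did not understand.</prompt>
        <reprompt/>
      </nomatch>
      <!-- Used from the third unrecognised utterance onwards. -->
      <nomatch count="3">
        <prompt>Sorry, I am still having trouble understanding you. Goodbye.</prompt>
        <exit/>
      </nomatch>
      <filled>
        <prompt>You said <value expr="destination"/></prompt>
      </filled>
    </field>
  </form>
</vxml>
```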

GSL Grammar at Work
<?xml version="1.0"?>
<vxml version="2.0">
  <form>
    <field name="destination">
      <prompt>Do you want to fly to New York or Washington?</prompt>
      <grammar type="application/x-gsl" mode="voice">
        <![CDATA[
          [
            [(new york) (big apple)]   {<destination "new york">}
            [washington (the capital)] {<destination "washington">}
          ]
        ]]>
      </grammar>
      <catch event="nomatch noinput">
        <reprompt/>
      </catch>
      <filled>
        <prompt>You said <value expr="destination"/></prompt>
      </filled>
    </field>
  </form>
</vxml>

More on GSL Syntax
The following grammar fragment
  [([enroll add] me) (sign me up)]
recognises the three sentences "enroll me", "add me" and "sign me up", because an "or" condition is nested within an "and" condition.

More on GSL Syntax (continued)
Variations in input are supported through additional operators. Operators are attached as prefixes to individual words or phrases. They indicate that a word or phrase may occur zero or one time (?), zero or more times (*), or one or more times (+).

Using Operators
Use "?" before an expression when it is completely optional. In the following example, the word "me" is optional:
  [(enroll ?me) (sign ?me up)]
The utterance should not be rejected just because the caller neglects to say the word "me".

Using Operators (continued)
Use "*" before an expression when it is optional but the caller may say it multiple times. In the following example, the word "please" is optional, but the caller may say it more than once:
  [(*please sign ?me up)]

Using Operators (continued)
Use "+" before an expression when it must occur at least one time but may occur more than once. This operator is provided for completeness only. It is not typically used.
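Since the "+" operator comes without an example, here is a hedged sketch of how it might be used, following the GSL conventions shown for "?" and "*". The field name rating, the prompts and the slot values are invented for illustration, and the grammar has not been verified on a Nuance or Tellme platform. It is meant to accept "very good", "very very good" and so on, but not "good" on its own.

```xml
<?xml version="1.0"?>
<!-- Sketch only: field name, prompts and slot values are assumptions. -->
<!-- xmlns is the VoiceXML 2.0 namespace; the slides omit it. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <field name="rating">
      <prompt>How was your trip, very good or very bad?</prompt>
      <grammar type="application/x-gsl" mode="voice">
        <![CDATA[
          [
            [(+very good)] {<rating "good">}
            [(+very bad)]  {<rating "bad">}
          ]
        ]]>
      </grammar>
      <catch event="nomatch noinput">
        <reprompt/>
      </catch>
      <filled>
        <prompt>You rated your trip as <value expr="rating"/></prompt>
      </filled>
    </field>
  </form>
</vxml>
```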

GSL Grammar for Voice Input
<grammar type="application/x-gsl" mode="voice">
  <![CDATA[
    [
      [sales]                    {<dept "010">}
      [marketing]                {<dept "020">}
      [engineering]              {<dept "030">}
      [(public relations) (p r)] {<dept "040">}
    ]
  ]]>
</grammar>

GSL Grammar for Touchtone Input
<grammar type="application/x-gsl" mode="dtmf">
  <![CDATA[
    [
      [dtmf-1] {<dept "010">}
      [dtmf-2] {<dept "020">}
      [dtmf-3] {<dept "030">}
      [dtmf-4] {<dept "040">}
    ]
  ]]>
</grammar>
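A field may contain both a voice grammar and a touchtone grammar, so that callers can either speak or press a key. The sketch below combines the two grammars above in one field. The surrounding form, the prompts and the field name dept (chosen to match the slot name) are illustrative assumptions and not part of the slides.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the form and prompts around the grammars are assumptions. -->
<!-- xmlns is the VoiceXML 2.0 namespace; the slides omit it. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <field name="dept">
      <prompt>
        For sales, say sales or press 1.
        For marketing, say marketing or press 2.
        For engineering, say engineering or press 3.
        For public relations, say public relations or press 4.
      </prompt>
      <!-- Spoken input -->
      <grammar type="application/x-gsl" mode="voice">
        <![CDATA[
          [
            [sales]                    {<dept "010">}
            [marketing]                {<dept "020">}
            [engineering]              {<dept "030">}
            [(public relations) (p r)] {<dept "040">}
          ]
        ]]>
      </grammar>
      <!-- Touchtone input returning the same department codes -->
      <grammar type="application/x-gsl" mode="dtmf">
        <![CDATA[
          [
            [dtmf-1] {<dept "010">}
            [dtmf-2] {<dept "020">}
            [dtmf-3] {<dept "030">}
            [dtmf-4] {<dept "040">}
          ]
        ]]>
      </grammar>
      <catch event="nomatch noinput">
        <reprompt/>
      </catch>
      <filled>
        <prompt>Connecting you to department <value expr="dept"/></prompt>
      </filled>
    </field>
  </form>
</vxml>
```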

Take-Home Message
You can develop a VoiceXML application using a Web-based development environment or a VoiceXML platform on a desktop computer. Tellme Studio is a suite of Web-based VoiceXML development tools that enables you to build, test and publish VoiceXML applications and that supports the Nuance Grammar Specification Language (GSL).