Thin Client Development and Wireless Markup Languages cont. VoiceXML and Voice Portals




Thin Client Development and Wireless Markup Languages (cont.)
David Tipper, Associate Professor
Department of Information Science and Telecommunications
University of Pittsburgh
tipper@tele.pitt.edu
http://www.sis.pitt.edu/~dtipper/2727.html
Slides 12

VoiceXML and Voice Portals
- VoiceXML together with a voice portal provides speech-enabled access to automated text, web, and voice information.
- Allows the user to navigate through voice web pages.
- Why VoiceXML? Remember: the device is a phone first and a computer/web device second.
- Advantages
  - Device independence: works with any digital phone (wired or wireless)
  - Easier, more natural I/O
  - Voice interaction is sometimes more appropriate or easier: while driving a car, obtaining directions, accessing email over the phone, entering information or data
  - Low cost

Standards-based VoiceXML
- VoiceXML Forum
  - Industry group (Motorola, Lucent, AT&T, etc.)
  - Developed VXML 1.0, released in 2000
  - Based on XML
- W3C Voice Browser working group
  - Developed VoiceXML 2.0
  - VoiceXML 2.1: June 2007
  - Current focus: improved speech and grammar recognition, text-to-speech translation, and multimodal applications
  - Voice + Web applications: call for directions, get a map plus voice directions

VoiceXML Applications
- A boom in VoiceXML applications is predicted, especially as a replacement for human operators.
- Sample applications
  - Information retrieval: check weather, sports scores, directions (Cingular Voice Dial Service), stock prices, etc.
  - Directory assistance: AT&T uses this
  - E-commerce: catalog ordering, tickets, bill payment, etc.
  - Telephone services: voice mail management, teleconferencing, secure phone calls
  - Unified messaging: browse and listen to email messages over the phone; record voice and have it sent via email, SMS, or voice mail

VoiceXML Architecture
- The user connects to a voice portal that contains a VoiceXML browser.
- The VoiceXML browser
  - handles interaction with the user (I/O)
  - fetches information from web servers
  - transforms VoiceXML content for delivery to the user
- The portal contains several technology components that the browser uses to handle communication and process VoiceXML documents.
[Figure: Client, WWAN, Voice Portal/Gateway with VoiceXML Browser, Internet, Server holding VoiceXML documents]

Portal Technology Components
- Automatic Speech Recognition (ASR)
  - Converts the speech signal to text or numbers
  - Strives to be speaker independent or speaker adaptive
  - Matches speech against a given set of words or phrases (called a grammar)
  - Much less computationally intensive than unconstrained speech recognition
- Text to Speech Synthesis (TTS)
  - Converts text or numeric input to synthesized speech; older systems sound robotic
  - Newer systems use waveform concatenation
  - Examples: http://research.att.com/projects/tts.demo.html
[Figure: VoiceXML Gateway containing the Voice Browser and its ASR, TTS, Telephony, Audio, and TCP/IP resources]

Portal Technology Components
- Audio resource
  - Plays prerecorded audio files
  - Records user input for post-processing
- Telephony resource
  - Call processing
  - Dual Tone Multi-Frequency (DTMF) keypad input
  - Call transfer to a third party
  - Etc.
- TCP/IP resource
  - Provides communication with web servers
[Figure: VoiceXML Gateway containing the Voice Browser and its ASR, TTS, Telephony, Audio, and TCP/IP resources]

VoiceXML Session
1. The user calls the application's phone number.
2. The VXML gateway converts the input to an HTTP request to the web server.
3. The server responds to the VXML gateway with content.
4. The gateway converts the content into an interactive audio session with the user.
[Figure: (1) the user calls the voice application over the cellular network; (2) the VoiceXML Gateway sends an HTTP request over the Internet; (3) the web server, which hosts the VoiceXML documents and audio resources, responds with VoiceXML documents and audio files; (4) interactive audio between user and voice application, e.g. "The score of the game is..."]

VoiceXML Input/Output
- In a typical session the user and the application take turns speaking and listening, so I/O is crucial.
- Methods for user input
  1. Spoken commands: interpreted by ASR; accuracy is improved by specifying a grammar
  2. DTMF (Dual Tone Multi-Frequency) key input: the user enters data on the keypad; accuracy is improved by specifying the expected input
  3. Recorded speech for post-processing: saved in a standard format (e.g., a .wav file)
[Figure: Client, WWAN, Voice Portal/Gateway with VoiceXML Browser, Internet, Server holding VoiceXML documents]

VoiceXML Input/Output
- Methods for output to the user
  1. Text to Speech (TTS): speech synthesized on the fly; can sound machine-like; how the TTS is played can be marked up
  2. Prerecorded audio files: downloaded from the server and played by the portal; sound more natural to the user and are easier to understand; often recorded by a professional
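
The input and output methods above can be combined in one small dialog. The following is a minimal sketch, not a tested deployment: the file name enter_pin.wav and the form and field names are hypothetical, and how the built-in "digits" type behaves for spoken input depends on the platform.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="pin_entry">
    <!-- The built-in "digits" type accepts spoken digits or DTMF key presses.
         Support for built-in types varies by platform (assumption). -->
    <field name="pin" type="digits?length=4">
      <prompt>
        <!-- enter_pin.wav is a hypothetical prerecorded file. If it cannot
             be fetched, the enclosed text is synthesized by TTS instead. -->
        <audio src="enter_pin.wav">
          Please say or key in your four digit PIN.
        </audio>
      </prompt>
      <filled>
        <!-- TTS output built on the fly from the collected value. -->
        <prompt>You entered <value expr="pin"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```

Putting fallback text inside <audio> keeps the natural-sounding recording as the first choice while still giving the caller something to hear when the file is unavailable.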

VoiceXML Concepts
- Session
  - Begins when the user connects to the portal and interacts with the browser
  - VoiceXML documents are loaded and unloaded as the session continues
  - The end of the session is controlled by the user, the gateway, or the document
- Application
  - A set of VoiceXML documents that share the same root document

VoiceXML Concepts
- Dialogs
  - A conversation with the user; two basic types:
    - Form: presents information and collects user input; contains fields
    - Menu: gives the user options to select from and changes the dialog state based on the input
  - Sub-dialogs are possible; they work like a function call to commonly used forms or menus
  - The dialog between the user and the application needs to be carefully designed; typically the application prompts the user and the user responds in turn
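
As an illustration of these concepts, the sketch below shows one leaf document that names a root document and calls a sub-dialog. It is an assumption-laden example: app_root.vxml, the form ids, and the field names are all hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The application attribute names the shared root document.
     app_root.vxml is a hypothetical file name. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
      application="app_root.vxml">

  <!-- First dialog in document order, so it runs first. -->
  <form id="order">
    <!-- Invoke a reusable dialog, much like a function call. -->
    <subdialog name="confirm" src="#ask_yes_no">
      <filled>
        <!-- Returned variables are read as properties of "confirm". -->
        <prompt>You answered <value expr="confirm.answer"/>.</prompt>
      </filled>
    </subdialog>
  </form>

  <!-- The reusable sub-dialog: collects a yes/no answer and returns it. -->
  <form id="ask_yes_no">
    <field name="answer" type="boolean">
      <prompt>Shall I place the order?</prompt>
      <filled>
        <return namelist="answer"/>
      </filled>
    </field>
  </form>
</vxml>
```

Because the "order" form names no successor dialog, the conversation ends once the sub-dialog has returned and the result has been spoken.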

VoiceXML Concepts
- Grammars
  - The expected user input, either spoken or DTMF key presses
  - For example: "say or enter your 5 digit zip code"
  - For spoken input, a grammar library is often specified to help interpret the input correctly
  - Specifying a grammar library greatly increases the accuracy of automatic speech recognition
  - Should always include error checking and reprompting of the user to handle mistakes in input

VoiceXML Documents
- VoiceXML documents define one or more dialogs.
- VoiceXML documents can contain
  - Spoken prompts (synthetic speech or recorded)
  - Output of audio files and streams
  - Recognition of spoken words and phrases
  - Recognition of touch-tone key presses
  - Recording of spoken input
  - Control of dialog flow
  - Links to other VoiceXML documents
  - Events: responses to interruption or incorrect input
  - Telephony control: call transfer to a third party, hang up, etc.
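
The zip code prompt above, together with the advice on error checking and reprompting, might look like the following. This is a sketch under assumptions: it relies on the built-in "digits" type rather than a vendor grammar library, the wording of the messages is invented, and the three-attempt limit is an arbitrary choice.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="zip">
    <!-- Restricting input to exactly five digits acts as the grammar. -->
    <field name="zipcode" type="digits?length=5">
      <prompt>Say or enter your five digit zip code.</prompt>

      <!-- The caller stayed silent: explain and ask again. -->
      <noinput>
        I did not hear anything.
        <reprompt/>
      </noinput>

      <!-- The input did not fit the grammar: explain and ask again. -->
      <nomatch>
        That did not sound like a five digit zip code.
        <reprompt/>
      </nomatch>

      <!-- On the third mismatch, give up rather than loop forever. -->
      <nomatch count="3">
        Sorry, I am having trouble understanding you. Goodbye.
        <exit/>
      </nomatch>

      <filled>
        <prompt>Your zip code is <value expr="zipcode"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```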

VXML Concepts
- The basic concepts are interrelated:
  - A session invokes 1 or more applications
  - An application involves 1 or more documents
  - A document can contain 0 to many dialogs

Basic VoiceXML Elements
- Follows the XML format: elements start and end with tags

    <element_name attribute_name="attribute value"> ... </element_name>

- Main elements
  - <form>: dialog for presenting and collecting data
  - <object>: platform-specific script that may gather user input and return it
  - <grammar>: set of valid expressions that a user can say or type when interacting with an application
  - <block>: a piece of non-interactive executable code

VoiceXML Output Elements
- <prompt>: outputs computer-generated speech (TTS) or audio files
- Text for TTS can be marked up to improve quality
  - <break>: inserts a pause
  - <emphasis>: increases volume (provides emphasis)
  - <say-as>: specifies a particular speaking style

        Still-ers
        <say-as type="phone">014126249421</say-as>

- <audio>: plays a prerecorded file (.wav); common audio files are cached at the portal

        <audio src="file.wav"/>

- <reprompt>: sends processing back to the original prompt

VoiceXML Example

    <?xml version="1.0"?>
    <vxml version="2.0">
      <form>
        <block>
          <prompt> Pitt is it </prompt>
        </block>
      </form>
    </vxml>

- All VoiceXML files (.vxml) begin with the xml and vxml prolog.
- This document has a single form, which contains a block that synthesizes "Pitt is it" and plays it to the user.
- Since no successor dialog is specified, the conversation ends.
[Figure: "Pitt is it" output; labels: xhtml-mp, WML, chtml]
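
The output elements listed above can be combined in a single prompt. The sketch below assumes a VoiceXML 2.0 platform, where <say-as> takes an interpret-as attribute; the accepted values (here "telephone") are platform-dependent, and welcome.wav and the phone number are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <block>
      <prompt>
        <!-- Hypothetical prerecorded greeting. -->
        <audio src="welcome.wav"/>
        <!-- Half-second pause before the synthesized part. -->
        <break time="500ms"/>
        The final score was <emphasis>three to two</emphasis>.
        For tickets, call
        <!-- "telephone" is a commonly supported value, not a guaranteed one. -->
        <say-as interpret-as="telephone">4125550100</say-as>.
      </prompt>
    </block>
  </form>
</vxml>
```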

Basic VoiceXML Elements
- Additional elements
  - <menu>: dialog for selecting among several options
  - <choice>: an alternative in a menu dialog
  - <field>: gathers user input as defined by a specified grammar
  - <filled>: block of executable code that runs after the input field is filled
  - <record>: records an audio file from the user
  - <if> <elseif> <else>: conditional logic
  - <goto>: controls flow from form to form, within and between documents, like links in HTML
  - <var>: declares variables
  - <transfer>: transfers the phone call to another number
- Scripting can be added with JavaScript.

VoiceXML Examples

    <menu>
      <prompt>
        This is the main menu. Please choose a service:
        news, weather, or sports.
      </prompt>
      <choice next="news.vxml"> news </choice>
      <choice next="weather.vxml"> weather </choice>
      <choice next="sports.vxml"> sports </choice>
    </menu>
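
Several of the additional elements (<var>, <field>, <filled>, <if>/<else>, <goto>) typically appear together. The following is an illustrative sketch only: the variable and field names are hypothetical, and weather.vxml is assumed to exist alongside this document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Document-level variable holding the service name. -->
  <var name="service" expr="'weather'"/>

  <form id="offer">
    <!-- Built-in boolean type: the caller answers yes or no. -->
    <field name="wanted" type="boolean">
      <prompt>Would you like to hear the <value expr="service"/>?</prompt>
      <filled>
        <if cond="wanted">
          <!-- Builds the target URI from the variable: weather.vxml -->
          <goto expr="service + '.vxml'"/>
        <else/>
          <prompt>Okay, goodbye.</prompt>
          <exit/>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```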

VoiceXML Examples

    <menu>
      <prompt>
        This is the main menu. For news press 6; for weather press 9;
        for sports press 7.
      </prompt>
      <choice dtmf="6" next="news.vxml"> news </choice>
      <choice dtmf="9" next="weather.vxml"> weather </choice>
      <choice dtmf="7" next="sports.vxml"> sports </choice>
    </menu>

- Note: real applications need error checking and timeouts in place to deal with user input errors.
- VoiceXML has special elements for this: <noinput>, <nomatch>, etc.

VoiceXML Error Handling
- <noinput>: catches the noinput event, thrown when the user gives no input within the timeout period

    <noinput>
      I'm sorry. I didn't hear anything.
      <reprompt/>
    </noinput>

- <nomatch>: catches the nomatch event, thrown when the input doesn't match a specified grammar

    <nomatch>
      I didn't get that.
      <reprompt/>
    </nomatch>

- <help>: executed when the user says "help"; can be made universal to the whole document or local to various parts

    <property name="universals" value="all"/>
    <help>
      <block> Now taking you to Customer Services. </block>
      <transfer name="services" bridge="true" connecttimeout="300"
                dest="phone://14088502255"/>
    </help>

- <property>: controls platform features, for example how long the application waits for input; the following times out after 10 seconds

    <property name="timeout" value="10s"/>
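
Putting the menu and the error-handling elements together gives something closer to what a real application needs. This is a hedged sketch, not a vendor-tested document: the help wording is invented, and the 10 second timeout simply mirrors the property example above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Wait up to 10 seconds for input before throwing noinput. -->
  <property name="timeout" value="10s"/>
  <!-- Activate the platform's universal "help" command. -->
  <property name="universals" value="help"/>

  <menu id="main">
    <prompt>
      This is the main menu. For news press 6; for weather press 9;
      for sports press 7.
    </prompt>
    <choice dtmf="6" next="news.vxml">news</choice>
    <choice dtmf="9" next="weather.vxml">weather</choice>
    <choice dtmf="7" next="sports.vxml">sports</choice>

    <!-- Silence: apologize and replay the menu prompt. -->
    <noinput>
      I'm sorry. I didn't hear anything.
      <reprompt/>
    </noinput>

    <!-- Unrecognized speech or key: ask again. -->
    <nomatch>
      I didn't get that.
      <reprompt/>
    </nomatch>

    <!-- The caller said "help": explain the options, then replay. -->
    <help>
      You can say news, weather, or sports, or press the matching key.
      <reprompt/>
    </help>
  </menu>
</vxml>
```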

Grammars
- A grammar specifies the natural-language words or phrases that will be matched.
- It can be included in the document, or it can reference a separate file or a standard dictionary; several formats are available.

    <grammar>
      ;GSL2.0
      grammar definition text
    </grammar>

    <grammar src="filename.gram" type="grammar type"/>

- Most VoiceXML portals specify a grammar type, for example one based on Nuance speech technology:

    <grammar type="application/x-nuance-gsl">
      [ news weather sports ]
    </grammar>

VoiceXML Example

    <block>
      <prompt> This is the BeVocal calculator. </prompt>
    </block>
    <field name="op">
      <prompt> Choose add, subtract, multiply, or divide. </prompt>
      <grammar type="application/x-nuance-gsl">
        [add subtract multiply divide]
      </grammar>
      <help>
        Please say what you want to do.
        <reprompt/>
      </help>
      <filled>
        <prompt> Okay, let's <value expr="op"/> two numbers. </prompt>
      </filled>
    </field>
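
For comparison with the Nuance GSL grammar above, the same three-word vocabulary can be written in the W3C XML grammar format (SRGS). This sketch assumes a platform that accepts inline application/srgs+xml grammars; the form, field, and rule names are hypothetical, and with no semantic tags the field is expected to receive the recognized words themselves.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="pick_service">
    <field name="service">
      <prompt>Say news, weather, or sports.</prompt>

      <!-- Inline SRGS grammar: the root rule matches exactly one word. -->
      <grammar type="application/srgs+xml" version="1.0"
               xml:lang="en-US" root="service_name" mode="voice">
        <rule id="service_name" scope="public">
          <one-of>
            <item>news</item>
            <item>weather</item>
            <item>sports</item>
          </one-of>
        </rule>
      </grammar>

      <filled>
        <prompt>You chose <value expr="service"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```

A standard grammar format is one way to reduce the cross-platform problems that vendor-specific formats can cause.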

VoiceXML Applications
- Pros
  - Easy to develop and implement; no service provider needed
  - Several hosting services available: BeVocal, Tellme, VoiceGenie, etc.
  - Easy to use and cost effective (according to Goldman Sachs, an average of $3/call if human assisted vs. $0.20/call if automated)
  - Easy to upgrade and modify
- Cons
  - Dialogs need to be carefully constructed or users get frustrated
  - Non-uniform grammars and document types can lead to cross-platform problems

VoiceXML Example

    <?xml version="1.0" encoding="utf-8"?>
    <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://www.w3.org/2001/vxml
                              http://www.w3.org/TR/voicexml20/vxml.xsd">
      <form>
        <property name="bargein" value="true"/>
        <block>
          <prompt>
            Welcome to Mad Libs. Press the pound key after you say each word.
          </prompt>
        </block>
        <record name="one" beep="true" maxtime="5s" finalsilence="4000ms"
                dtmfterm="true" type="audio/x-wav">
          <prompt timeout="5s"> Say a verb. </prompt>
          <noinput> I didn't hear anything, please try again. </noinput>
        </record>
        <record name="two" beep="true" maxtime="5s" finalsilence="4000ms"
                dtmfterm="true" type="audio/x-wav">
          <prompt timeout="5s"> Say a noun. </prompt>
          <noinput> I didn't hear anything, please try again. </noinput>
        </record>
        <!-- the remaining <record> items are not shown on the slide -->

VoiceXML Example (continued)

        <block>
          <prompt>
            To be, or not to <audio expr="one"/>
            that is the <audio expr="two"/>
            Whether 'tis nobler in the <audio expr="three"/>
            to suffer the slings and <audio expr="four"/>
            of <audio expr="five"/> fortune,
            Or to take <audio expr="six"/>
            against a sea of <audio expr="seven"/>,
            And by <audio expr="eight"/> end them.
            To die, to <audio expr="nine"/>
            No more; and by a <audio expr="nine"/>
            to say we end the <audio expr="ten"/>
            and the <audio expr="eleven"/> natural shocks
            that flesh is <audio expr="twelve"/>
          </prompt>
        </block>
      </form>
    </vxml>

Markup Language Future
- Multimodal markup languages have been proposed to combine features.
- For example, the X+V language was proposed to the W3C by Motorola, Opera Software ASA, and IBM.