An Introduction to VoiceXML



Similar documents
VoiceXML-Based Dialogue Systems

How To Use Voicexml On A Computer Or Phone (Windows)

! <?xml version="1.0">! <vxml version="2.0">!! <form>!!! <block>!!! <prompt>hello World!</prompt>!!! </block>!! </form>! </vxml>

VoiceXML Tutorial. Part 1: VoiceXML Basics and Simple Forms

Open Source VoiceXML Interpreter over Asterisk for Use in IVR Applications

Dialog planning in VoiceXML

Interfaces de voz avanzadas con VoiceXML

VoiceXML Programmer s Guide

A Development Tool for VoiceXML-Based Interactive Voice Response Systems

VOICEXML TUTORIAL AN INTRODUCTION TO VOICEXML

Thin Client Development and Wireless Markup Languages cont. VoiceXML and Voice Portals

BeVocal VoiceXML Tutorial

Mobile Application Languages XML, Java, J2ME and JavaCard Lesson 03 XML based Standards and Formats for Applications

VoiceXML. For: Professor Gerald Q. Maguire Jr. By: Andreas Ångström, and Johan Sverin, Date:

VoiceXML Overview. James A. Larson Intel Corporation (c) 2007 Larson Technical Services 1

Standard Languages for Developing Multimodal Applications

Short notes on webpage programming languages

VoiceXML Discussion.

Speech Recognition of a Voice-Access Automotive Telematics. System using VoiceXML

VoiceXML and Next-Generation Voice Services

Grammar Reference GRAMMAR REFERENCE 1

Combining VoiceXML with CCXML

Form. Settings, page 2 Element Data, page 7 Exit States, page 8 Audio Groups, page 9 Folder and Class Information, page 9 Events, page 10

Version 2.6. Virtual Receptionist Stepping Through the Basics

Traitement de la Parole

Dialogos Voice Platform

VoiceXML versus SALT: selecting a voice

VXI* IVR / IVVR. VON.x 2008 OpenSER Summit. Ivan Sixto CEO / Business Dev. Manager. San Jose CA-US, March 17th, 2008

Phone Routing Stepping Through the Basics

XML based Interactive Voice Response System

AN EXTENSIBLE TRANSCODER FOR HTML TO VOICEXML CONVERSION

Web Pages. Static Web Pages SHTML

The Future of VoiceXML: VoiceXML 3 Overview. Dan Burnett, Ph.D. Dir. of Speech Technologies, Voxeo Developer Jam Session May 20, 2010

Dialogic PowerMedia XMS VoiceXML

VOICE INFORMATION RETRIEVAL FOR DOCUMENTS. Except where reference is made to the work of others, the work described in this thesis is.

Web page creation using VoiceXML as Slot filling task. Ravi M H

Web Development. Owen Sacco. ICS2205/ICS2230 Web Intelligence

Web Hosting Features. Small Office Premium. Small Office. Basic Premium. Enterprise. Basic. General

Using VoiceXML, XHTML, and SCXML to Build Multimodal Applications. James A. Larson

Materials Software Systems Inc (MSSI). Enabling Speech on Touch Tone IVR White Paper

Developing Usable VoiceXML Applications

Personal Voice Call Assistant: VoiceXML and SIP in a Distributed Environment

Web Presentation Layer Architecture

Voice User Interface Design

2667A - Introduction to Programming

OVERVIEW OF ASP. What is ASP. Why ASP

Internet Technologies_1. Doc. Ing. František Huňka, CSc.

Dialogic PowerMedia XMS VoiceXML

07 Forms. 1 About Forms. 2 The FORM Tag. 1.1 Form Handlers

CSCI110: Examination information.

Unit 21: Hosting and managing websites (LEVEL 3)

WebSphere Voice Server for Multiplatforms. VoiceXML Programmer s Guide

ERIE COMMUNITY COLLEGE COURSE OUTLINE A. COURSE TITLE: CS WEB DEVELOPMENT AND PROGRAMMING FUNDAMENTALS

To use MySQL effectively, you need to learn the syntax of a new language and grow

A design of the transcoder to convert the VoiceXML documents into the XHTML+Voice documents

Vocalité Version 2.4 Feature Overview

ASP &.NET. Microsoft's Solution for Dynamic Web Development. Mohammad Ali Choudhry Milad Armeen Husain Zeerapurwala Campbell Ma Seul Kee Yoon

The Web Web page Links 16-3

VoiceXML Data Logging Overview

Developing and Implementing Web Applications with Microsoft Visual C#.NET and Microsoft Visual Studio.NET

ABSTRACT 2. SYSTEM OVERVIEW 1. INTRODUCTION. 2.1 Speech Recognition

Modern Web Application Framework Python, SQL Alchemy, Jinja2 & Flask

Lecture 2. Internet: who talks with whom?

INFORMATION BROCHURE Certificate Course in Web Design Using PHP/MySQL

ASP.NET: THE NEW PARADIGM FOR WEB APPLICATION DEVELOPMENT

Voice Driven Animation System

Software Requirement Specification For Flea Market System

Business & Computing Examinations (BCE) LONDON (UK)

Avaya Aura Orchestration Designer

Using Service Oriented Architecture (SOA) for Speaker-Biometrics Applications

4.2 Understand Microsoft ASP.NET Web Application Development

Application Servers G Session 2 - Main Theme Page-Based Application Servers. Dr. Jean-Claude Franchitti

Christian Leibold CMU Communicator CMU Communicator. Overview. Vorlesung Spracherkennung und Dialogsysteme. LMU Institut für Informatik

Voice User Interfaces (CS4390/5390)

A Performance Comparison of Web Development Technologies to Distribute Multimedia across an Intranet

Integrating VoiceXML with SIP services

XML: ITS ROLE IN TCP/IP PRESENTATION LAYER (LAYER 6)

12 File and Database Concepts 13 File and Database Concepts A many-to-many relationship means that one record in a particular record type can be relat

Business Information System Courses Description

Signatures. Advanced User s Guide. Version 2.0

XHTML Forms. Form syntax. Selection widgets. Submission method. Submission action. Radio buttons

Voice Processing Standards. Mukesh Sundaram Vice President, Engineering Genesys (an Alcatel company)

INTERNET PROGRAMMING AND DEVELOPMENT AEC LEA.BN Course Descriptions & Outcome Competency

Specialized Programme on Web Application Development using Open Source Tools

Introduction to Web Technologies

Transcription:

An Introduction to VoiceXML ART on Dialogue Models and Dialogue Systems François Mairesse University of Sheffield F.Mairesse@sheffield.ac.uk http://www.dcs.shef.ac.uk/~francois

Outline What is it? Why is it useful? How does it work? How to make it better? 2

Brief history 1999: AT&T, IBM, Lucent Technology and Motorola formed the VoiceXML Forum The goal was to for make Internet content available by phone and voice Each company had previously developed its own markup language Customers were reluctant to invest in proprietary technology 2000: release of VoiceXML 1.0 2005: VoiceXML 2.1 is a W3C candidate recommendation 3

What is VoiceXML? VoiceXML is a mark-up language for specifying interactive voice dialogues between a human and a computer Analogous to HTML VoiceXML browser interprets.vxml pages Can be dynamically generated by server-side scripts (JSP, ASP, CGI, Perl) Can access external databases (e.g. SQL) Example <?xml version="1.0"?> <vxml version="2.0"> <form> <prompt> Hello world! </prompt> </form> </vxml> VoiceXML platform 4

Architecture 5

Voice User Interface (VUI) Traditional web-based forms The purpose of a dialogue is to fill forms GUI vs. VUI Fonts vs. prosody Large menus vs. short utterances Hypertext navigation vs. voice commands Constraint on forms vs. recognition grammars Global options always visible vs. only uttered at the beginning of the dialogue 6

Why use VoiceXML? Advantages of VoiceXML platforms Special-purpose programming languages Reduces development costs Separation between dialogue system components Portability of application Flexibility: outsource or purchase equipment Choose best-of-breed components Re-use of Internet infrastructure VoiceXML is becoming a standard 7

The VoiceXML language XML structure < element_name attribute_name="attribute_value">...contained items... < /element_name> Basic elements prompt: specifies the system s utterance audio: play pre-recorded prompts form: set of fields field: information needed to complete task grammar: specifies possible inputs to a field 8

Basic elements filled: what to do if user input is recognized value: return a field s value goto: go to another form or file submit: go to another file and keep field values Error handling user says nothing: noinput nothing matches the grammar: nomatch Many more elements: http://www.vxml.org 9

VoiceXML document What do we want to know? Question Possible answers No answer? Wrong answer? Acceptable answer What s next? <?xml version="1.0"?> <vxml version="2.0 > <form id= get_student_name > <field name= student_name > <prompt> What's your name? </prompt> <grammar> john mary rob </grammar> <noinput> Please say your name. </noinput> <nomatch> I didn t understand that. </nomatch> <filled> Thank you, <value expr= student_name /> <submit next= next_document.vxml /> </filled> </field> </form> </vxml> 10

Recognition grammars Key to successful recognition Many platform-dependent formats (JSGF, SGL, etc.) Inline grammar External file <grammar src= mygram.gram type= application/x-jsgf /> Example with optional inputs (in brackets) #JSGF V1.0; grammar pizza; public <pizza> = [I d like a] <size> <type> [pizza] [please]; <size> = small medium large; <type> = vegetarian pepperoni cheese; 11

Built-in grammars Boolean Currency Date Digits Number Phone Time Example: <field name= get_digits type= digits > Can add additional constraints <field name= get_digits type= digits?minlength=3;maxlength=9 > 12

Events Similar to exceptions Thrown by Platform: ASR misrecognition Application: <throw> Handler Specific: <noinput>, <nomatch>, <help> General: <catch event= > Can count number of event occurrences Successive ASR errors with different repairs <nomatch count=3> What did you say? </nomatch> 13

VoiceXML properties Can be modified using the <property> element Confidence level of ASR Barge-in Time-out Voice/DTMF Properties can be defined at all levels: for the whole application, document, or a specific field 14

Mixed-initiative dialogues VoiceXML allows for simple mixed-initiative More flexible More room for errors A form-level grammar that can recognize multiple fields at once E.g. Please tell me a departure day and a destination Grammar needs to account for all possible orderings I m going to DEST on DATE I m leaving on DATE to go to DEST What if we don t have all required information at once? Back to directed dialogue Need traditional fields How to know what fields remain unfilled? 15

Form Interpretation Algorithm Defines how control flows through a VoiceXML application as it executes Makes VoiceXML declarative Just specify utterances, fields and grammars Define what happens, not how FIA deals with procedural details Keeps querying undefined fields Throw events and loop until field is filled by user <nomatch> or <noinput> <filled> 16

FIA - confirmations If a field value isn t confirmed by user, set it to undefined and the FIA will ask for it again <field name= confirm type= boolean > <prompt> Do you want details on <value expr= student_name />? </prompt> <filled> <if cond= confirm > Looking up details on <value expr= student_name /> <else /> Let s try again <clear namelist= student_name confirm /> </if> </filled> </field> 17

Limitations Simple mixed initiative How to retrieve information from a database? What about more advanced dialogue system features? Content summarization Multiple database entries Find alternatives answers Dynamic grammars If the database changes, the recognition grammar must adapt Generate VoiceXML pages dynamically 18

Dynamic VoiceXML Similar to dynamic HTML pages Content isn t stored on the server, but created on-thefly based on the user s parameters and a database Typical interaction: A static VXML page collects information from the user Submit the fields to a server-side script (JSP, PHP, ASP, Perl, etc.) The script queries the database and processes the results The script outputs VXML code which is interpreted by the browser 19

Dynamic VoiceXML JSP, PHP, Perl scripts (Database) 20

Implementation in Perl When form is filled, send fields value to the server-side script for processing <filled> Thank you <submit next= http://mywebserver/script.perl > </filled> The Perl script collects information $q = new CGI; $name = $q->param('student_name'); Connect to the SQL database $handler = DBI->connect("DBI:mysql:$db", $user, $password); Query the database for the student s name $query = $handler->prepare( SELECT * FROM students WHERE name = \ $name\ ); $query->execute; Output beginning of VoiceXML document (<xml>, <voicexml>, <form>, <prompt>) Output result, i.e. the student s phone number @row = $sth->fetchrow_array; print The phone number of $name is $row[2]\n ; Output end of VoiceXML document (</form>, </voicexml>, etc.) 21

Dynamic grammars What to do when the recognition vocabulary is not known in advance? Rewrite a grammar at each database update Better, use a server-side script to Retrieve patterns from database Write grammar to an external file Call a VXML page using this grammar 22

Conclusion VoiceXML has become a standard All-in-one solutions available Reduces dialogue system development time Comes with limited dialogue management and language generation capabilities Additional functions can be easily implemented Develop your own dialogue system with free VoiceXML browsers! https://studio.tellme.com 23