Real-World Experience Adding Speech to IVR Solutions with MRCP

Real-World Experience Adding Speech to IVR Solutions with MRCP A webinar by NMS, ScanSoft and CapitalOne

Agenda
- Introduction to speech technology: Dr. Rob Kassel, Senior Product Manager, ScanSoft, Inc.
- MRCP and Natural Access: Jack Chase, Director, Product Marketing, NMS
- MRCP integration on the TelBert IVR Platform using NMS and ScanSoft: Eric Cunningham, Enterprise Architect, Capital One
Slide 2

Introduction to Speech Technology Rob Kassel Senior Product Manager ScanSoft Slide 3

The Need For Speech Recognition
- Automation is less costly than live agents
  - Increases call handling capacity / reduces hold times
- DTMF often is pressed into service
  - Numeric entry is easy unless you are reading
  - Spelling entry is more difficult
  - Menus need to be enumerated, can't be too long
  - Deep menu structure becomes tiresome
  - Assignment inconsistent between vendors (e.g., voicemail)
  - How do you enter 5 ½% or Albuquerque?
- With speech, questions are answered naturally
  - Caller satisfaction is higher
  - Fewer zero-outs lead to additional cost savings
Slide 4

Speech Recognition Process
[Diagram. Processing stages: Speech, Speech Detector, Feature Extraction, Phoneme Classifier, Search, Confidence Scoring, Results. Supporting resources: Grammar, Grammar Compiler, Acoustic Models, System Dictionary, Pronunciation Rules.]
Slide 5

Speech Recognition Challenges
- Speech can be difficult to decode, even for humans
- Fixed, confusable vocabularies: B-C-D-E-G-P-T-V-Z
- Ambiguous boundaries: "It's hard to wreck a nice beach!"
- Speaker variability: dialect, volume, gender, etc.
- Noise rejection: hands-free, mobile, telematics
- Out-of-vocabulary rejection & confidence measures
- Processor and memory demands
Slide 6
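The point about out-of-vocabulary rejection and confidence measures can be illustrated with a small sketch. It assumes a recognizer that reports a confidence score between 0.0 and 1.0; the names `Action`, `ConfidencePolicy`, and `decide`, and both threshold values, are hypothetical and not taken from any vendor API. Real engines use their own scales, and thresholds would need tuning per application.

```cpp
// Possible dialog actions after a recognition attempt.
enum class Action { Accept, Confirm, Reject };

// Hypothetical thresholds on an assumed 0.0-1.0 confidence scale.
struct ConfidencePolicy {
    double accept_at = 0.80;   // at or above: use the result directly
    double confirm_at = 0.45;  // at or above: ask the caller to confirm
};

// Map a confidence score to a dialog action.
Action decide(double confidence,
              const ConfidencePolicy& policy = ConfidencePolicy{}) {
    if (confidence >= policy.accept_at) return Action::Accept;
    if (confidence >= policy.confirm_at) return Action::Confirm;
    return Action::Reject;  // likely noise or out-of-vocabulary: reprompt
}
```

Raising `accept_at` trades more confirmation prompts for fewer misrecognitions that slip through unnoticed.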

Speech Recognition: State of the Art
- Callers speak naturally in directed dialogs
- Million-word vocabularies: stocks, names, addresses
- Open-ended responses, coupled with language understanding: "How may I help you?"
- High accuracy, infrequent confirmation
- Transaction completion rate over 90% is typical
- Automatically adapt to caller population and channel characteristics
Slide 7

The Need For Text-To-Speech
- Professional recordings costly and time-consuming
- Large output vocabularies common (e.g. city names)
- Word concatenation is difficult to do well
- Often used for numeric output
- Can sound mechanical; irritating when frequent
- Some applications defy recordings (e.g. messaging)
Slide 8

Text-To-Speech Process
[Diagram. Processing stages: Text, Text Normalization, Pronunciation Generation, Prosody Generation, Unit Selection, Concatenate and Smooth, Speech. Supporting resources: System Dictionary, Pronunciation Rules, Voice Database.]
Slide 9

Text-To-Speech Challenges
- Text Normalization
  - Numerics: 12535 (number / zip code), 2x4
  - Abbreviations: OR (or / Oregon), Dr. Jones on Elm Dr.
  - Acronyms: IBM is listed on NASDAQ
  - Evolving usage: CUL8R
- Pronunciation Generation
  - Homographs: minute (60 seconds / tiny)
  - Vowel reduction: he came to town vs. he came to
- Prosody Generation
  - Phrasing: he is physically and mentally exhausted
  - Emphasis: Are you flying tomorrow?
  - Emotion: upbeat vs. serious, calming vs. urgency
Slide 10
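The "Dr. Jones on Elm Dr." example shows why text normalization needs context. The sketch below is a deliberately crude heuristic, not how any TTS engine is known to work: it assumes "Dr." is a title when a capitalized word follows it and a street suffix otherwise. The function name `expand_dr` is hypothetical, and the rule fails on cases such as "Elm Dr." followed by a new sentence.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Expand "Dr." to "Doctor" when a capitalized word follows it,
// and to "Drive" otherwise. Whitespace is collapsed to single spaces.
std::string expand_dr(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> words;
    for (std::string w; in >> w;) words.push_back(w);

    std::string out;
    for (std::size_t i = 0; i < words.size(); ++i) {
        std::string w = words[i];
        if (w == "Dr.") {
            // Words read with >> are never empty, so [0] is safe.
            bool name_follows =
                i + 1 < words.size() &&
                std::isupper(static_cast<unsigned char>(words[i + 1][0]));
            w = name_follows ? "Doctor" : "Drive";
        }
        if (!out.empty()) out += ' ';
        out += w;
    }
    return out;
}
```

Calling `expand_dr("Dr. Jones on Elm Dr.")` returns `"Doctor Jones on Elm Drive"`. A production normalizer would rely on dictionaries and richer context rather than a single capitalization test.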

Text-to-Speech: State of the Art
- Natural sounding output, no more "drunken Swede"
- Seamlessly mix dynamic data with recorded prompts
- Accurate pronunciation, including proper names
- A variety of voices to choose from
- Custom voices to maintain brand identity
- Listen here: http://www.scansoft.com/speechworks/realspeak/teleco/
Slide 11

MRCP and Natural Access Jack Chase Director, Product Marketing NMS Slide 12

What Is MRCP v1?
[Diagram labels: MRCP Server, PSTN, IVR Servers, IP, Speech Servers]
- Control: MRCP / RTSP / TCP / IP
- Speech: G.711 / RTP / UDP / IP
- Speech servers are connected by VoIP to IVR servers
- Standard API for ASR and TTS
- Easy to reconfigure system as needs change
- Easy to implement redundancy
Slide 13

Natural Access and MRCP
[Diagram: Natural Access software stack. Services: Call Control, PSTN Trunking, Conferencing, Universal Speech Access (MRCP), IVR Services, Fusion (VoIP), Fax Services, Video Access, OAM. Middle layer: Service Managers, Libraries, SNMP. Drivers over IPC, PCI, and IP. Platforms: HMP, CX Boards, AG Boards, CG Boards, PacketMedia HMP.]
Slide 14

Universal Speech Access Makes Speech Integration Easy Slide 15

Current Support for Universal Speech Access

Vendor   | Type | Universal Speech Access 1.0  | Universal Speech Access 1.1
ScanSoft | ASR  | OMS 2.0.1, OSR 2.0           | SWMS 3.1, OSR 3.0
ScanSoft | TTS  | OMS 2.0.1, Speechify 2.0     | SWMS 3.1, RealSpeak 4.0
Loquendo | ASR  | N/A                          | Loquendo ASR LSS 6.0
Nuance   | ASR  | MRCP Server SP5, Nuance 8.5  | MRCP Server SP7, Nuance 8.5
Nuance   | TTS  | Vocalizer 3.0                | Vocalizer 3.0.8
Telisma  | ASR  | Philsoft 3.2                 | telispeech 1.0 SP4

Slide 16

What's Next for MRCP?
- MRCP v2
  - draft-ietf-speechsc-mrcpv2-06, Feb 20, 2005
  - Adds SIP/SDP for session setup; replaces RTSP
  - Adds support for speaker verification
  - Little deployment yet
- NMS will update Universal Speech Access when deployments occur
Slide 17

MRCP Integration on the TelBert IVR Platform using NMS and ScanSoft Eric Cunningham Enterprise Architect Capital One Slide 18

Agenda
- Why use MRCP
- Main business drivers for voice enablement
- Overview of architecture
- Lessons learned
Slide 19

Why Use MRCP
- Capital One has built its own IVR system (TelBert)
  - Internally built and maintained
  - Linux based C/C++ system
  - 5000+ ports in production
  - Handles nearly 100% of all in-bound credit card calls
- Business wants to have speech enabled applications
- Leading speech vendors are embracing MRCP for integration
- Centralizes automated speech recognition (ASR) and text-to-speech (TTS) resources in the network
- Standards based protocol, allowing multi-vendor interoperability
(continued)
Slide 20

Why Use MRCP (cont'd)
Benefits to Capital One
- MRCP allows integration with leading vendors and avoids vendor lock-in
- NMS APIs simplify the learning of MRCP and RTP protocols and integration; accelerated the adoption of MRCP into TelBert
- Migration from AG 4000 to CG 6000: clean evolution
  - CG 6000 provides on-board Ethernet and T1 terminations; eliminates host based processing of RTP data
  - Current AG 4000 code compatible with CG 6000; quick upgrade to existing platform
Slide 21

Overview of TelBert Architecture
[Architecture diagram with the following callouts]
- Where applications run. They control which grammars are used, the processing of results, and user prompting.
- Where NMS libraries are integrated. Single state-machine model handling 184 ISDN callers, voice processing commands, and the new ASR/TTS commands via Universal Speech Access.
- ScanSoft has its MRCP server (SWMS) co-located on the same machine as the OSR and RealSpeak servers. Note: This means that load balancing and failover are done by TelBert, not the MRCP server.
- Private network (100MB switch) to encapsulate the RTP traffic.
Slide 22

Main Business Drivers for Voice Enablement
- Improve customer experience
  - Provide both touch-tone and speech-enabled handling
  - Switch between modes
- Provide additional automated customer servicing
  - Automating time-consuming call center activities
  - Frees call center representatives for more complex tasks
- Basically, all of the standard reasons a business wants to start using voice recognition technologies
Slide 23

Lessons Learned
- NMS Universal Speech Access and Fusion APIs front-end the complexity of RTSP, MRCP, and RTP protocols
- You still need to read the specifications to troubleshoot problems
- You need to understand the specifications in order to talk to vendors you are integrating with (ScanSoft)
(continued)
Slide 24

Lessons Learned (cont'd)
Example:

NMS code
    if( (nrtn = saicreatesynthesizer(m_cta_context_handle,
                                     m_strtpendpointtts,
                                     m_ob_locate.get_server(),
                                     TELBERT_CONTEXT_TTS,
                                     &m_stttshd)) != SUCCESS){
    }

RTSP/MRCP sniffer trace (what the MRCP server sees)

Request
    SETUP rtsp://newbox36/synthesizer/ RTSP/1.0
    CSeq: 7
    Transport: RTP/AVP;unicast;destination=10.87.204.8;client_port=3000-3001
    Content-Type: application/sdp
    Content-Length: 167

    v=0
    o=139112752 0 127.0.0.1
    s=nms speech
    c=in IP4 0.0.0.0
    t=0 0
    m=audio 3000 RTP/AVP 0 96
    a=rtpmap:0 pcmu/8000
    a=rtpmap:96 telephone-event/8000

Response
    RTSP/1.0 200 OK
    CSeq: 7
    Session: RQKCRCSPWX0000000368fgJiuWPnxz
    Transport: RTP/AVP;unicast;client_port=3000-3001
    Content-Length: 215
    Content-Type: application/sdp

    v=0
    o=- RQKCRCSPWX0000000368fgJiuWPnxz RQKCRCSPWX0000000368fgJiuWPnxz IN IP4 10.87.204.36
    s=speechworks OpenSpeech Media Server version 2.0
    c=in IP4 0.0.0.0
    t=0 0
    m=audio 3000 RTP/AVP 0
    a=rtpmap: 0 pcmu/8000
Slide 25
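To make the request side of the trace easier to read when troubleshooting, here is a minimal sketch of how such an RTSP SETUP message could be assembled by hand. It is not the NMS implementation: `build_setup` and its parameters are hypothetical, the header set simply mirrors the trace above, and the SDP body is passed in by the caller. The one detail worth noting is that Content-Length must equal the byte length of the body.

```cpp
#include <string>

// Assemble an RTSP SETUP request carrying an SDP body.
// The RTP/RTCP port pair is rtp_port and rtp_port + 1, as in the trace.
std::string build_setup(const std::string& url, int cseq,
                        const std::string& dest_ip, int rtp_port,
                        const std::string& sdp) {
    std::string req;
    req += "SETUP " + url + " RTSP/1.0\r\n";
    req += "CSeq: " + std::to_string(cseq) + "\r\n";
    req += "Transport: RTP/AVP;unicast;destination=" + dest_ip +
           ";client_port=" + std::to_string(rtp_port) + "-" +
           std::to_string(rtp_port + 1) + "\r\n";
    req += "Content-Type: application/sdp\r\n";
    // Content-Length counts the bytes of the body only.
    req += "Content-Length: " + std::to_string(sdp.size()) + "\r\n";
    req += "\r\n";  // blank line separates headers from body
    req += sdp;
    return req;
}
```

Comparing a hand-built message like this against a sniffer capture is one way to check which headers a given MRCP server actually expects.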

Lessons Learned (cont'd)
Load Balancing
- The MRCP specification allows the MRCP server to coordinate where to set up the RTP connection with the ASR/TTS server, which lets it perform load balancing
- Currently ScanSoft's MRCP server does not provide load balancing, but their engineers are looking at providing this
- Until then, your IVR will have to create its own load balancing and failover logic for the ASR/TTS server farm
(continued)
Slide 26
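IVR-side load balancing and failover of this kind can be sketched as a round-robin pool that skips servers it cannot reach. This is an illustration under stated assumptions, not TelBert's actual logic: the `ServerPool` class is hypothetical, and `connect` stands in for whatever call opens a session to a speech server (for example, a wrapper around the RTSP SETUP exchange).

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Round-robin selection over speech servers, with failover.
class ServerPool {
public:
    explicit ServerPool(std::vector<std::string> hosts)
        : hosts_(std::move(hosts)) {}

    // Try each server at most once, starting after the last one tried.
    // Returns the first host for which connect() succeeds, or an empty
    // string if every server fails (or the pool is empty).
    std::string acquire(
        const std::function<bool(const std::string&)>& connect) {
        for (std::size_t attempt = 0; attempt < hosts_.size(); ++attempt) {
            const std::string& host = hosts_[next_];
            next_ = (next_ + 1) % hosts_.size();
            if (connect(host)) return host;
        }
        return "";
    }

private:
    std::vector<std::string> hosts_;
    std::size_t next_ = 0;  // index of the next server to try
};
```

Round-robin spreads sessions evenly only when calls are of similar length; a pool that tracks active sessions per server would balance better, at the cost of more bookkeeping.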

Lessons Learned (cont'd)
Lots of specifications to be learned, and not just by the integration team

- Media Resource Control Protocol (MRCP) Specification
  Location: ftp://ftp.rfc-editor.org/in-notes/internet-drafts/draft-shanmugham-mrcp-05.txt
  Who needs to understand / be aware of this spec: Integration Team, Application Interface Team
- Real Time Streaming Protocol (RTSP) Specification
  Location: ftp://ftp.rfc-editor.org/in-notes/rfc2326.txt
  Who needs to understand / be aware of this spec: Integration Team
- Real-time Transport Protocol (RTP) Specification
  Location: ftp://ftp.rfc-editor.org/in-notes/std/std64.txt
  Who needs to understand / be aware of this spec: Integration Team
- Speech Recognition Grammar Specification
  Location: http://www.w3.org/TR/2004/REC-speech-grammar-20040316/
  Who needs to understand / be aware of this spec: Application Interface Team, Application Developers
- Natural Language Semantics Markup Language for Speech Interface Framework (nl-spec) Specification
  Location: http://www.w3.org/TR/nl-spec/
  Who needs to understand / be aware of this spec: Application Interface Team, Application Developers
Slide 27

Thank You!
Note:
- PDF will be posted today
- Recorded version posted in a few days
Slide 28

Q & A Session Please use the text messaging feature to send your questions Slide 29

For more information

Contact
- Dr. Rob Kassel, Senior Product Manager, ScanSoft: +1 617 428 4444; rob.kassel@scansoft.com
- Jack Chase, Director, Product Marketing, NMS: +1 508 271 1109; jack_chase@nmss.com
- Eric Cunningham, Enterprise Architect, Capital One: +1 804 855 3597; eric.cunningham@capitalone.com

Upcoming Events
- VON Europe, May 23-26, Stockholm, Sweden, Booth # 1040

Upcoming Webinars
- June: Ready for Mainstream: AdvancedTCA Solutions Become Reality
- July: Transforming Speech Applications With NMS' new VoiceXML Server
Slide 30