Loquendo Speech Technologies as key differentiating factor




Paolo Baggia, Director of International Standards
March 3rd, 2009
Voice Search Conference 2009

Agenda
Loquendo Today
Loquendo Products
Loquendo Speech Technologies
Automatic Speech Recognition
Text To Speech
Loquendo MRCP Server
VoxNauta Platform
The value of Speech in Customer Relationship

Company Profile
Privately held company (fully owned by Telecom Italia), founded in 2001 as a spin-off from Telecom Italia Labs, capitalizing on 30 years of experience and expertise in voice processing.
Global company, leader in Europe and South America for award-winning, high-quality voice technologies (synthesis, recognition, authentication and identification), available in 26 languages and 62 voices.
Multilingual, proprietary technologies protected by over 100 patents worldwide.
Financially robust: break-even reached in 2004, revenues and earnings growing year on year.
Growth-plan investment approved for the evolution of products and services.
Headquarters in Torino; offices in New York; local representative sales offices in Rome, Madrid, Paris, London and Munich.
Flexible: about 100 employees, plus a vibrant ecosystem of local freelancers.

International Awards
2008 Frost & Sullivan European Telematics and Infotainment Emerging Company of the Year Award
Winner of the Market Leader - Best Speech Engine Speech Industry Award, 2007 and 2008
Loquendo MRCP Server: winner of the 2008 IP Contact Center Technology Pioneer Award
Best Innovation in Automotive Speech Synthesis Prize, AVIOS-SpeechTEK West 2007
Best Innovation in Expressive Speech Synthesis Prize, AVIOS-SpeechTEK West 2006
Best Innovation in Multi-Lingual Speech Synthesis Prize, AVIOS-SpeechTEK West 2005

Loquendo main points
A complete set of speech technologies and voice platforms (TTS, ASR, SV)
  focus on quality & innovative features, simplifying application development
  multilingual worldwide coverage
Extensive support for international standards
  all speech-related W3C and IETF standards
A full range of integration options
  APIs, standard interfaces and protocols, client-server configurations
Same technologies on a wide spectrum of platforms
  same core engine for server, desktop, embedded & mobile devices, guaranteeing platform-independent software engineering
Partnership as a key factor
  strategic alliance portfolio for each vertical market
  set of powerful tools made available to our partners for tuning and improving speech applications, without the need for costly professional services

Same Core Engine
The only embedded engines with server quality and features
Same core engines for all versions: Server, Multimedia and Embedded
Same languages (voices) available for all versions
Same APIs and support for standards (W3C, SAPI, ...)
Multiplatform: Symbian Series 60 (7, 8 and 9), Pocket PC 2003, CE.NET 4.2 and 5, Windows Mobile 2005/6, Windows Automotive, SmartPhone 2003, WindRiver VxWorks, QNX, Linux, Windows XP Embedded, Windows XP TabletPC Edition, Windows Vista, ...

Powered by Loquendo: solutions in vertical markets
Telco
Media Center and Set-top-box
Local and Central Government
Mobile devices
Banking and Insurance
Transport
Automotive and Navigation
Healthcare and Differently able
Industry

Loquendo Products

Loquendo: Value chain & Product positioning
Value chain layers: Applications, Solutions, Servers, Speech Engines, O.S., Hardware
Loquendo focus:
  Solutions: turnkey solutions (Auto Attendant, DA, Banking, CRM, Self-service, Voice Controlled Media Center) and basic resources for application development (specialized lexicons, grammars, Reusable Dialogue Objects)
  Servers, VoxNauta: VoiceXML & CCXML platform for vocal applications on any network (fixed, mobile, VoIP)
  Servers, MRCP Server: turnkey MRCP (v1 & v2) server for interfacing with IVRs and third-party voice platforms
  Speech Engines (LTTS, LASR, LSV), software only: Text-To-Speech, Automatic Speech Recognition, Speaker Verification and Identification, Language Identification; for servers, desktops, embedded & mobile devices
O.S.: Windows, Linux, WinMobile, Symbian, ...

Language Coverage (number of voices; one mark per voice on the slide)

Language                         Female   Male
English US                       2        3
English UK                       2        1
Spanish (Castilian)              2        2
Catalan (bilingual)              1        1
Valencian (bilingual)            1 voice (column not recoverable)
Galician (bilingual)             1 voice (column not recoverable)
American Spanish / Colombian     1        1
Mexican                          1 voice (column not recoverable)
Chilean                          1 voice (column not recoverable)
Argentinean                      1 voice (column not recoverable)
Italian                          4        4
French                           2        2
Canadian French                  1        1
Portuguese                       1        1
Brazilian Portuguese             2        1
German                           2        1
Dutch                            1        1
Greek                            2 voices (column not recoverable)
Danish                           1        1
Finnish                          1        1
Swedish                          1        1
Russian                          1        1
Polish                           1        1
Turkish                          1        1
Chinese                          2 voices (column not recoverable)
Esperanto (robotic)              1 voice (column not recoverable)
Japanese / Arabic                PARTNER  PARTNER

Loquendo Speech Technologies

Text To Speech

Loquendo TTS: Text To Speech
Multi-language: 26 languages, 62 voices and more coming!
Truly natural and expressive-sounding voices for highly emotional pronunciation: commonly used phrases such as "How are you?" or "You've got to be kidding!", and paralinguistic events such as yawning, coughing and laughing, to confirm, exclaim, thank, express doubt, etc.
Reading styles and specialized support (e.g. addresses, SMS, etc.)
Audio Mixer: complete control over all audio sources (including sampling rates and coding); audio files can be mixed, looped, faded in/out, and synchronized with synthetic speech
Mixed Language Capability:
  Language Guesser: automatically identifies the language of any text
  Phonetic Mapping: lets any of Loquendo's voices correctly pronounce any foreign word (e.g. English words in an Italian text)
Voice Creator: tool for new voice generation
TTS Director: for designing effective prompts
Lexicon Manager: tool for creating personalized user lexicons

Loquendo TTS Director
Loquendo TTS Director is a complete development environment for creating your own voice prompts, and for designing your own personalised voices.
Target: clients wanting to edit their prompts at a more complex level and adjust parameters with far more precision, as well as to add pauses, phonetic transcriptions, and tailor-made lexicons for atypical pronunciation.

Loquendo TTS Tools: Lexicon Manager
Loquendo Lexicon Manager helps to define the pronunciation of foreign-language words, toponyms, proper names, acronyms, abbreviations, etc.
The virtual keyboard helps to write the phonetic transcriptions
Sections for different languages in the same lexicon
Future support for the PLS (Pronunciation Lexicon Specification) standard format

Automatic Speech Recognition

Loquendo ASR: Automatic Speech Recognition
A reliable speaker-independent technology
Broad vocabulary & flexible recognition: recognizes up to 1,000,000 words; supports isolated and continuous speech, even in the noisiest environments such as wireless
Highly accurate speech recognition: thanks to the integration of neural networks and hidden Markov models, and detailed acoustic-phonetic units trained on large speech corpora
Multi-language speech recognition (20 languages)
Barge-in capability to guarantee high reactivity and robustness to noise and background speech
Garbage rules definition to match arbitrary spoken sequences not modeled by the grammar
Powers Loquendo Speaker Verification
Tool package that automatically analyzes data collected in the field to improve service performance, including:
  Acoustic Model Adaptation (to the environment, speaker, channel, etc.)
  Phonetic Learning to identify frequent formulations that have not been covered and additional pronunciation variants

Loquendo ASR: Automatic Speech Recognition (continued)
ASR engines typically use either HMM (Hidden Markov Models) or NN (Neural Networks) for their core algorithms. Loquendo ASR combines both approaches, giving high performance and increased efficiency with large vocabularies.
ASR efficiency reduces hardware costs: lower PC-power requirements enable more recognition channels to run simultaneously. Loquendo ASR is so efficient that the core ASR engine is used on embedded platforms, e.g. smartphones and navigation devices.
Extended standards support future-proofs customer investments:
  MRCP compliance (for client-server architectures)
  complete support for grammar standards, such as W3C SRGS & SISR, enables optimization for VoiceXML applications
  support for AURORA DSR (for distributed speech recognition)
A highly accurate phonetic transcriber enhances recognition results. Loquendo ASR is based on the same phonetic transcriber as Loquendo TTS, tested both automatically and by painstaking human listening.
Loquendo Speaker Verification, an extension to the ASR module, combines both speaker and knowledge verification (i.e. matches "what was said" with "who said it").

Learn to distinguish a best-in-breed ASR from those not up to the job
Key Factors For Success
ASR tuning, e.g. to the environment, to the speaker
Tools enabling ASR to learn from the field, avoiding the need for costly professional services, e.g. Acoustic Model Adaptation Tool, Phonetic Learning Tool
De-noising module: improves performance in noisy environments by cleaning the signal while computing spectral parameters
Loquendo ASR specialized tasks, including:
  Word Spotting: recognizes keywords in audio streams
  Garbage Rules definition: enables free speech, simplifying grammars and application development; matches expressions like "Um", "Er", "Well", "Let me think", etc., giving greater flexibility and a more natural interaction experience
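The garbage-rule idea can be illustrated with the W3C SRGS grammar format, which defines a special rule named GARBAGE that absorbs speech the grammar does not model. The sketch below builds such a grammar using only the Python standard library. The function name, the rule id `main` and the keyword list are hypothetical, and the slides do not say how Loquendo ASR itself exposes garbage rules, so this shows the standard mechanism rather than the product's API.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_keyword_grammar(keywords, lang="en-US"):
    """Return SRGS XML text matching: filler speech, one keyword, filler speech.

    The filler is expressed with the SRGS special rule GARBAGE, so phrases
    such as "um, well, coffee please" can still yield the keyword "coffee".
    """
    ET.register_namespace("", SRGS_NS)  # serialize SRGS as the default namespace
    grammar = ET.Element(
        f"{{{SRGS_NS}}}grammar",
        {"version": "1.0", XML_LANG: lang, "root": "main", "mode": "voice"},
    )
    rule = ET.SubElement(grammar, f"{{{SRGS_NS}}}rule", {"id": "main", "scope": "public"})
    # Leading filler ("um", "let me think", ...)
    ET.SubElement(rule, f"{{{SRGS_NS}}}ruleref", {"special": "GARBAGE"})
    # The keywords the application actually cares about
    one_of = ET.SubElement(rule, f"{{{SRGS_NS}}}one-of")
    for word in keywords:
        item = ET.SubElement(one_of, f"{{{SRGS_NS}}}item")
        item.text = word
    # Trailing filler ("please", "thanks", ...)
    ET.SubElement(rule, f"{{{SRGS_NS}}}ruleref", {"special": "GARBAGE"})
    return ET.tostring(grammar, encoding="unicode")


if __name__ == "__main__":
    print(build_keyword_grammar(["coffee", "tea"]))
```

Wrapping the keyword list in GARBAGE references keeps the grammar small: the application lists only the words it needs instead of enumerating every hesitation or politeness phrase a caller might add.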

Loquendo ASR SDK Tool Suite
Evaluation Kit
  Tool to select recorded audio material
  Offline execution of recognition tests
  Statistics about recognition performance
  Semantic parsing over text strings
Acoustic Model Adaptation Tool: efficiently adapts the recognition engine to difficult conditions, such as:
  different audio channels (wireless, multimedia microphone, PDA, etc.)
  different environments (in-car, public areas, factory)
  application-dependent vocabulary (specific jargon, such as aeronautical terms)
  ways of speaking (regional accents, fast speech)
Phonetic Learning Tool: improves performance using data collected in the field to deal with:
  additional linguistic formulations
  complex speech recognition applications
  non-native speakers or regional accents
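The slides do not say which statistics the Evaluation Kit reports. One widely used measure for offline recognition tests is word error rate (WER): the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that generic metric, not a description of the Loquendo tool; the function name is hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word error rate: (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words (classic Levenshtein recurrence).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = cur[j - 1] + 1   # extra word in hypothesis
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("john" -> "jon") and one deletion ("now") over 4 words.
    print(word_error_rate("call john smith now", "call jon smith"))
```

Running the same test set before and after an adaptation step gives a simple way to check whether tuning actually helped.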

Loquendo MRCP Server

Loquendo MRCP Server
Optimized client-server solution for the large-scale deployment of speech technologies in the telephony field, such as call centers, CRM, news and email reading, self-service applications, etc.
Full benefits of Loquendo's high-performance technologies using standard protocols and languages.
Easy to integrate through the standard IETF protocol MRCP (Media Resource Control Protocol).
Both MRCP versions are supported: MRCP v1 (RFC 4463), based on RTSP/RTP, and MRCP v2, the new IETF protocol, based on SIP/RTP and offering the new audio recording and Speaker Verification functionalities.
Loquendo MRCP Server is fully configurable and makes software component status available to both its onboard Management Console and external Management Systems through the SNMP protocol.
Its modular architecture leaves Loquendo MRCP Server independent from ASR/TTS engine releases.

MRCP Standard benefits
MRCP (Media Resource Control Protocol) v1 and v2 are IETF specifications:
  MRCP v1 is RFC 4463, http://www.ietf.org/rfc/rfc4463.txt
  MRCP v2 is an Internet Draft, http://tools.ietf.org/html/draft-ietf-speechsc-mrcpv2-17
Provides a mechanism to control Speech Recognition and Text-To-Speech servers in distributed environments, allowing for implementation of distributed IVR platforms
Standard for speech technologies integration: cut costs and protect investments
Standards are key drivers for the market. Support for standards is one of the key considerations solution providers should make when selecting a speech system/platform
We do not only adopt standards. We drive them!
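To make the "control a TTS server in a distributed environment" point concrete, here is a hedged sketch of how an MRCP v2 request is framed, as I understand the MRCP v2 specification: a start line `MRCP/2.0 <message-length> <method> <request-id>`, header lines, a blank line, then the body. The message length covers the whole message including the start line itself, so it is computed iteratively. The function name and the channel identifier value are hypothetical, and no network I/O or SIP session setup is shown.

```python
def build_mrcpv2_request(method, request_id, channel_id, body,
                         content_type="application/ssml+xml"):
    """Frame an MRCP v2 request (e.g. SPEAK) as bytes.

    channel_id is the Channel-Identifier obtained when the SIP session was
    set up; the value used in the demo below is only a placeholder.
    """
    body_bytes = body.encode("utf-8")
    headers = (
        f"Channel-Identifier: {channel_id}\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body_bytes)}\r\n"
        "\r\n"
    ).encode("ascii")

    # The length field counts its own digits, so iterate until it is stable.
    length = 0
    while True:
        start_line = f"MRCP/2.0 {length} {method} {request_id}\r\n".encode("ascii")
        total = len(start_line) + len(headers) + len(body_bytes)
        if total == length:
            return start_line + headers + body_bytes
        length = total


if __name__ == "__main__":
    ssml = ('<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
            'xml:lang="en-US">Welcome to the service.</speak>')
    message = build_mrcpv2_request("SPEAK", 1, "32AECB23433802@speechsynth", ssml)
    print(message.decode("utf-8"))
```

Because the framing is standardized, the same client code can in principle address any compliant speech server, which is the investment-protection argument the slide makes.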

VoxNauta Platform

Multi-Purpose Server Solution
VoxNauta 7.0 platform offers all the key drivers for voice, multimodal and multimedia applications, to create both telco and enterprise solutions:
VoIP: the SIP/RTP interface is based on a SW-only implementation, with no additional cost for HW boards
DSR (Distributed Speech Recognition) support minimizes the network load for voice transmission and allows the creation of multimodal applications on GPRS networks
SIP RFC 4240 (NETANN) support is required by the IMS (IP Multimedia Subsystem) architecture
Video applications are possible through a subset of VoiceXML 3.0 (2009)
Highly scalable and flexible solution allows a wide range of deployed applications, with support for advanced management for both telcos and enterprises
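RFC 4240 (NETANN) selects a media-server service through the SIP Request-URI: the user part `annc` with a `play=` parameter requests an announcement, and the user part `dialog` with a `voicexml=` parameter asks the server to run a VoiceXML script. The sketch below only composes such URIs. The host names and script URLs are hypothetical, the percent-encoding policy is my assumption, and the slides do not describe how VoxNauta is addressed in practice.

```python
from urllib.parse import quote


def _escape(uri_value):
    # Assumption: keep ':' and '/' readable, percent-encode everything else
    # that could be confused with SIP URI syntax (';', '?', '=', '&', ...).
    return quote(uri_value, safe=":/")


def netann_announcement_uri(media_server, prompt_url):
    """Request-URI asking a NETANN media server to play a prompt."""
    return f"sip:annc@{media_server};play={_escape(prompt_url)}"


def netann_dialog_uri(media_server, vxml_url):
    """Request-URI asking a NETANN media server to run a VoiceXML dialog."""
    return f"sip:dialog@{media_server};voicexml={_escape(vxml_url)}"


if __name__ == "__main__":
    print(netann_announcement_uri("ms.example.net", "http://audio.example.net/welcome.wav"))
    print(netann_dialog_uri("ms.example.net", "http://apps.example.net/menu.vxml?id=1"))
```

An application server in an IMS network would place a URI of this shape in the INVITE it sends to the media server; the call signalling itself is outside the scope of this sketch.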

VoxNauta 7.0 Key Points
A renewed architecture: the latest release of the Loquendo speech platform is based on a renewed architecture that exploits the MRCP v2 protocol for technology integration, with a pervasive modularity that ensures the highest efficiency.
Full standard compliance: VoxNauta 7.0 and Loquendo technologies implement ALL the most advanced standards in the speech area: VoiceXML 2.1, CCXML 1.0, MRCP v2, SRGS, SISR 1.0, SSML 1.0.
VoiceXML Forum certified: VoxNauta 7.0 has been formally certified by the VoiceXML Forum to be VoiceXML 2.0 compliant.
SW technology: the SW has been re-engineered to allow Operating System independence to a great extent, without any loss in efficiency.
Call control: the new platform incorporates a CCXML interpreter that allows complete service development and control in the platform back-end (e.g. any third-party application server).
Service development: the new version of the Loquendo VoiceXML interpreter supports both 2.0 and 2.1, giving the customer increased possibilities for service deployment.
Management and configuration: the VoxNauta Management Console brings all OA&M needs together in a single user-friendly hierarchical graphic interface. In addition, proprietary provisioning mechanisms are no longer required, relying instead on standard URI access to files, grammars and lexicons whenever required.
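The "standard URI access" point can be pictured with a minimal VoiceXML 2.1 document whose field fetches its grammar by URI instead of relying on a proprietary provisioning step. The sketch below generates such a document with the Python standard library. The function name, form id, prompt text and grammar URL are hypothetical, and nothing here is specific to VoxNauta.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

VXML_NS = "http://www.w3.org/2001/vxml"


def build_vxml(prompt, grammar_uri, field="answer"):
    """Return a one-field VoiceXML 2.1 document that loads its grammar by URI."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<vxml version="2.1" xmlns="{VXML_NS}" xml:lang="en-US">\n'
        '  <form id="main">\n'
        f'    <field name={quoteattr(field)}>\n'
        f'      <prompt>{escape(prompt)}</prompt>\n'
        # The grammar is referenced by URI; the interpreter fetches it on demand.
        f'      <grammar src={quoteattr(grammar_uri)} type="application/srgs+xml"/>\n'
        f'      <filled><prompt>You said <value expr={quoteattr(field)}/>.</prompt></filled>\n'
        '    </field>\n'
        '  </form>\n'
        '</vxml>\n'
    )


if __name__ == "__main__":
    document = build_vxml("Which city?", "http://example.com/grammars/city.grxml", field="city")
    ET.fromstring(document.encode("utf-8"))  # raises ParseError if not well-formed
    print(document)
```

Since the grammar lives behind an ordinary URL, updating it is a matter of replacing a file on a web server, with no platform-specific deployment step.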

VoxNauta Management and Reporting
Management Console: Configuration, Logging, Monitoring, Reporting
  Multiplatform graphic tool (Windows and Linux) based on SNMP
  Network control of multiple platforms
Trap Viewer: real-time visualization of SNMP traps sent by VoxNauta components in case of errors
Reporting:
  Statistics: calls, duration, etc.
  Service Log Analyzer: in-depth analysis of VoiceXML execution

VoxNauta Compliance with Standards
Full standard compliance: complete support of all the relevant speech IETF and W3C standards
VoiceXML: complete support of VoiceXML 2.0 and 2.1, certified by the VoiceXML Forum Certification Program
CCXML (Call Control XML): standard for call control
ASR: the W3C SRGS 1.0 (Speech Recognition Grammar Specification) grammar format in both its XML and ABNF (Augmented Backus-Naur Form) forms, and also complete support of SISR 1.0 (Semantic Interpretation for Speech Recognition)
DTMF: even DTMF applications can take advantage of the SRGS 1.0 and SISR 1.0 standards, so that a voice/DTMF application can be given uniform results from voice and DTMF interactions
TTS: the W3C SSML (Speech Synthesis Markup Language) is the standard for enhancing text-to-speech rendering and for accessing the many unique features of Loquendo TTS
EMMA: published by the MMI (Multimodal Interaction) group of the W3C, it is a language for returning results from different modalities (voice, gesture, keyboard) to the application
Pronunciation Lexicon: PLS (Pronunciation Lexicon Specification) standard for TTS and ASR (Loquendo is editor of this specification)
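As an illustration of how SSML lets an application steer text-to-speech rendering, the sketch below builds an SSML 1.0 document in which a foreign-language fragment is wrapped in a `voice` element carrying its own `xml:lang`. It uses only standard SSML 1.0 elements and the Python standard library. The function name and the sample sentence are hypothetical, and Loquendo-specific extensions are not shown because the slides do not describe them.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_ssml(base_lang, segments):
    """Build SSML 1.0 text from (lang, text) pairs.

    Segments whose lang is None or equals base_lang become plain text;
    other segments are wrapped in <voice xml:lang="...">.
    """
    ET.register_namespace("", SSML_NS)  # serialize SSML as the default namespace
    speak = ET.Element(f"{{{SSML_NS}}}speak", {"version": "1.0", XML_LANG: base_lang})
    last_child = None
    for lang, text in segments:
        if lang is None or lang == base_lang:
            # Plain text goes before the first child or after the latest one.
            if last_child is None:
                speak.text = (speak.text or "") + text
            else:
                last_child.tail = (last_child.tail or "") + text
        else:
            last_child = ET.SubElement(speak, f"{{{SSML_NS}}}voice", {XML_LANG: lang})
            last_child.text = text
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("it-IT", [
        (None, "Benvenuti alla "),
        ("en-US", "Voice Search Conference"),
        (None, "."),
    ]))
```

Marking the language explicitly leaves the pronunciation decision to the synthesizer while keeping the prompt text itself unchanged.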

The value of Speech in Customer Relationship

Why Choose Loquendo? - the key differentiating factors
Loquendo is a global provider of high-quality and reliable TTS, ASR and Speaker Verification worldwide, covering 26 languages, 62 voices and rising!
TTS: our most renowned best-in-breed product
ASR: providing innovative features such as Garbage techniques
Full standards support. We strive for an open world: standards drive the speech industry. Customers are free to choose us without proprietary bindings
Highly professional, customer-oriented technical assistance
Price competitive

Looking Over the Engine, Checking Under the Hood
The key to success in voice applications and CRM:
  A natural, accurate, well-designed speech interface
  A first-rate ASR and TTS to power it
Your voice-enabled service is the face your company presents to your customers!
The naturalness and user-friendliness of your voice interface is key to enhancing customer experience.

THANK YOU
For more information:
Keep an eye on: www.loquendo.com
Contact: paolo.baggia@loquendo.com
Keep in touch with Loquendo news: subscribe to the Loquendo Newsletter
Try our interactive TTS demo: insert your text, choose a language, and listen
The latest news at a click
Consult the Loquendo Newsletter online
Keep up to date on events and initiatives
For further information, fill in our Contacts Form

Loquendo S.p.A.
745 Fifth Ave, 27th Floor
New York, NY 10151 USA
Tel. +1 212.310.9075
Fax +1 212.310.9001
www.loquendo.com

Loquendo S.p.A.
Via Arrigo Olivetti, 6
10148 TORINO, Italy
Tel. +39 011 291 3111
Fax +39 011 291 3199
www.loquendo.com