Design Grammars for High-performance Speech Recognition




Copyright 2011 Chant Inc. All rights reserved. Chant, SpeechKit, Getting the World Talking with Technology, talking man, and headset are trademarks or registered trademarks of Chant Inc. Other marks are trademarks or registered trademarks of their respective holders.

A speech recognition grammar is a collection of rules made up of the words and phrases to be recognized from speech. A speech recognition engine (i.e., recognizer) uses a grammar to improve its ability to recognize specific combinations of spoken words and phrases.

With dictation recognition, a recognizer matches against all the word possibilities in a large dictionary and applies contextual analysis to ensure it returns the correct word (i.e., spelling) for homophones (e.g., right or write). Unlike dictation recognition, grammar recognition is context-free: a recognizer matches only against the rule definitions in the grammar. Context-free grammar recognition enables your applications to capture data very efficiently. Grammars also enable your applications to enforce domain constraints that improve data capture accuracy automatically.

WHAT IS GRAMMAR MANAGEMENT?

Grammar management enables you to:
- customize and tailor grammars in your development environment,
- compile grammars before application deployment, and
- integrate grammar generation and compilation as part of your deployed application.

This gives your application the flexibility to run with information that is unknown until configuration time or runtime and to work with the technology available on the deployed system.
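To make the earlier contrast with dictation concrete, here is a minimal sketch of what "matching only against the rule definitions in the grammar" means. It is not the GrammarKit API or any recognizer's internal algorithm. The rule names, the `expand` and `recognize` helpers, and the dictionary layout are all hypothetical, and the sketch assumes a small grammar with no recursive rules.

```python
# Hypothetical illustration; not the GrammarKit API or a recognizer's internals.
from itertools import product

# Each rule maps a name to a list of alternatives. An alternative is a
# sequence of literal words or <rule> references.
GRAMMAR = {
    "command": [["<action>", "the", "<object>"]],
    "action": [["open"], ["close"]],
    "object": [["window"], ["door"]],
}


def expand(rule, grammar):
    """Return every word sequence the named rule can produce.

    Assumes the grammar has no recursive rules, so expansion terminates.
    """
    phrases = []
    for alternative in grammar[rule]:
        # Expand each token into its list of possible word tuples.
        options = []
        for token in alternative:
            if token.startswith("<") and token.endswith(">"):
                options.append(expand(token[1:-1], grammar))
            else:
                options.append([(token,)])
        # Combine one choice per token, preserving token order.
        for combo in product(*options):
            phrases.append(tuple(word for part in combo for word in part))
    return phrases


def recognize(utterance, grammar, root="command"):
    """Accept the utterance only if the root rule can produce it."""
    words = tuple(utterance.lower().split())
    return words in set(expand(root, grammar))
```

With this grammar, `recognize("open the window", GRAMMAR)` is accepted, while `recognize("write the window", GRAMMAR)` is rejected because "write" appears in no rule. A dictation recognizer would instead have to choose between "right" and "write" from context.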

WHAT IS GRAMMARKIT?

Chant GrammarKit is composed of application-ready software components that handle the complexities of generating, compiling, and persisting the compiled grammar binary. It simplifies the process of managing grammars declared with IBM SRCL (IBM ViaVoice), Microsoft SAPI 4 Grammar Text File, Microsoft SAPI 5 XML Grammar, Nuance BNF+ (VoCon 3200), Java Speech Grammar Format (JSGF), W3C ABNF, or W3C XML grammar syntax to use with your favorite speech recognizer.

GrammarKit includes ActiveX, C++, C++Builder, Delphi, Java, .NET Framework, Silverlight, and Web component library formats to support all your programming languages and provides sample projects for popular IDEs such as Microsoft Visual Studio 2010. The component libraries can be integrated with 32-bit, 64-bit, and mobile applications.

GRAMMARKIT FEATURES

The goal of good grammar design is to maximize application performance. With GrammarKit you can:
- Generate syntax-independent and syntax-specific grammars.
- Compile grammar source from buffer, file, resource, stream, and string formats.
- Persist compiled grammar binary to buffer, file, and stream formats.
- Generate pronunciation phonemes from SAPI 4, SAPI 5, and VoCon 3200 recognizers.
- Dynamically switch among grammar compilers and syntax formats.

Because GrammarKit handles the complexities of constructing, compiling, and persisting grammars, you can distribute compiled grammar binary files with your application, generate and compile grammars as part of your deployed application, and optimize grammar enablement at runtime by using compiled binary files.

Each recognizer has its own syntax for expressing grammars. GrammarKit supports the following recognizers and their grammar syntax:

Recognizer                                               Speech API   Grammar Syntax
-------------------------------------------------------  -----------  ------------------------------------------------------
Nuance Dragon NaturallySpeaking V6 - V9 (all languages)  SAPI 4       SAPI 4 Grammar Text File
IBM ViaVoice (all languages)                             SMAPI        IBM SRCL
Microsoft SAPI 4 (all languages)                         SAPI 4       SAPI 4 Grammar Text File
Microsoft SAPI 5 (all languages)                         SAPI 5       SAPI 5 XML Grammar, W3C SRGS XML
Nuance VoCon 3200 (all languages)                        VoCon 3200   Nuance BNF+ V1.0, V1.1, V2.0, W3C SRGS, ABNF,
                                                                      Java Speech Grammar Format (JSGF)

GRAMMAR MANAGEMENT COMPONENT ARCHITECTURE

The GrammarKit component library includes a grammar management class that gives you a simple way to generate and compile speech recognition grammars. Your application can build and compile grammars as part of its runtime operation, which enables real-time customization and tailoring of your speech recognition environment.

The grammar management class, ChantGM, enables you to build a grammar independent of grammar syntax. Your application uses the ChantGrammar and adjunct classes to construct and modify grammar objects as needed and to generate compiler-specific syntax on demand.

With the ChantGM class, you can select a grammar compiler, compile the grammar, and optionally persist the compiled grammar binary. The ChantGM class manages the compilation activities on behalf of your application: it manages the resources and interacts directly with the applicable grammar compiler. It supports the following grammar syntax: IBM Speech Recognition Control Language (SRCL), Microsoft SAPI 4 context-free grammar, Microsoft SAPI 5 XML grammar, Nuance VoCon 3200 BNF+ V1.0, V1.1, V2.0, Java Speech Grammar Format (JSGF), W3C SRGS ABNF, and W3C SRGS XML.

Your application receives the compiled grammar binary, warnings, and error messages through event callbacks. The ChantGM class encapsulates all of the technologies necessary to make building and compiling grammars simple and efficient for your application. Optionally, it can persist the grammar binary across application invocations.
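The callback-driven flow described above (compile, receive binary, warnings, and errors as events, optionally persist) can be sketched as follows. This is an assumption-laden illustration, not the ChantGM interface. The `SketchCompiler` class, its method names, the callback names, and the rule format are hypothetical, and the "binary" is only a JSON serialization standing in for a real compiled grammar.

```python
# Hypothetical sketch of an event-callback compile flow; not the ChantGM API.
import json


class SketchCompiler:
    """Stand-in for a grammar compiler wrapper that reports via callbacks."""

    def __init__(self, on_compiled, on_warning, on_error):
        self.on_compiled = on_compiled
        self.on_warning = on_warning
        self.on_error = on_error

    def compile(self, rules, root, persist_path=None):
        """Check the rules, then deliver results through the callbacks.

        `rules` maps a rule name to a list of phrases; "<name>" inside a
        phrase refers to another rule. Returns True on success.
        """
        defined = set(rules)
        referenced = set()
        for alternatives in rules.values():
            for phrase in alternatives:
                for token in phrase.split():
                    if token.startswith("<") and token.endswith(">"):
                        referenced.add(token[1:-1])

        # Errors: the root or any referenced rule has no definition.
        missing = sorted((referenced | {root}) - defined)
        if missing:
            for name in missing:
                self.on_error(f"undefined rule: {name}")
            return False

        # Warnings: rules that are defined but never reachable by reference.
        for name in sorted(defined - referenced - {root}):
            self.on_warning(f"unused rule: {name}")

        # Placeholder "binary": a real compiler would emit its own format.
        binary = json.dumps(rules, sort_keys=True).encode("utf-8")
        if persist_path is not None:
            with open(persist_path, "wb") as handle:
                handle.write(binary)
        self.on_compiled(binary)
        return True
```

An application would pass three functions when constructing the object and then call `compile(...)`. Supplying `persist_path` writes the placeholder binary to disk so that a later run could load it instead of recompiling.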

[Figure: component architecture. Labels: Your Application; ChantGM; Dragon, SAPI 4, SAPI 5, SMAPI, VoCon; SAPI 4 CFG, SAPI 5 XML, W3C XML, IBM SRCL, Nuance BNF+, JSGF, W3C ABNF.]

The ChantGM class simplifies the process of building and compiling grammars by handling the low-level activities directly with the grammar compiler. You instantiate a ChantGM class object before you build or compile a grammar within your application. You destroy the ChantGM class object and release its resources when you no longer need to compile grammars within your application.

The GrammarKit management component is designed to give you a lot of flexibility and to minimize the programming necessary to manage the construction and compilation of your grammars. Your grammar source can be in a variety of formats (e.g., buffer, stream, and file), and your compiled binary can be saved to a variety of formats (e.g., buffer, stream, and file).

To simply compile your grammar and determine whether there are any errors, all you need to do is pass the name of your grammar source file. You may optionally provide compiler-specific options to use when compiling your grammar and indicate whether the compilation process is synchronous or asynchronous.

You can instantiate syntax-independent grammar objects from which you can generate compiler-specific syntax. These objects support generic and syntax-specific definitions that enable you to tailor grammars to leverage features across recognizers.
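As a rough illustration of a syntax-independent grammar object that generates compiler-specific syntax on demand, the sketch below emits the same two rules as JSGF and as W3C SRGS ABNF text. The `Rule` and `Grammar` classes and their methods are hypothetical and are not the ChantGrammar classes. Only basic alternatives and rule references are covered, and the generated text should be checked against the target recognizer's documentation before use.

```python
# Hypothetical sketch; these classes are not the ChantGrammar API.
import re
from dataclasses import dataclass, field


@dataclass
class Rule:
    name: str
    alternatives: list  # phrases; "<x>" inside a phrase refers to rule x
    public: bool = False


@dataclass
class Grammar:
    name: str
    root: str
    rules: list = field(default_factory=list)

    def to_jsgf(self):
        """Render the rules as Java Speech Grammar Format text."""
        lines = ["#JSGF V1.0;", f"grammar {self.name};"]
        for rule in self.rules:
            prefix = "public " if rule.public else ""
            body = " | ".join(rule.alternatives)
            lines.append(f"{prefix}<{rule.name}> = {body};")
        return "\n".join(lines)

    def to_abnf(self, language="en-US"):
        """Render the rules as W3C SRGS ABNF text."""
        lines = ["#ABNF 1.0;", f"language {language};", f"root ${self.root};"]
        for rule in self.rules:
            prefix = "public " if rule.public else ""
            # ABNF writes rule references as $name instead of <name>.
            body = " | ".join(
                re.sub(r"<(\w+)>", r"$\1", phrase) for phrase in rule.alternatives
            )
            lines.append(f"{prefix}${rule.name} = {body};")
        return "\n".join(lines)


commands = Grammar(
    name="commands",
    root="command",
    rules=[
        Rule("command", ["<action> the door"], public=True),
        Rule("action", ["open", "close"]),
    ],
)
```

Calling `commands.to_jsgf()` and `commands.to_abnf()` yields two renderings of one definition, which is the point of keeping the grammar object separate from any single syntax.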

MORE INFORMATION

To learn more about developing software that speaks and listens, explore how easily you can manage grammars, profiles, lexicons, recognizers, synthesizers, and text-to-speech markup directly within application software you develop in the following documents:
- Develop Software That Speaks and Listens,
- Integrate Speech Technology for Hands-free Operation,
- Tailor Pronunciations for Maximum Clarity,
- Administer Speaker Profiles for Accurate Speech Recognition, and
- Fine-tune Speech Synthesis Using Text-to-Speech Markup.