VECSYS LIMSI ARCHITECTURE



Samir Bennacef, Vecsys

Centralized Architecture

Fig 1. Spoken Dialogue System Architecture
Telephone Interface
Speech Recognizer (acoustic models, language models): produces text
Semantic Analyzer (caseframe grammar): produces a semantic frame
Dialog Manager (task model): sends queries to Information Retrieval and receives results
Information Retrieval (database)
Sentence Generator (generation grammar): turns a semantic frame into a sentence
Speech Synthesis (unit dictionary): speaks the sentence

Telephone Interface: phone program
Input: commands and speech
Output: events and speech
Recording and playback, DTMF detection and generation, pickup, hangup and call transfer
Hardware echo cancellation
Barge-in based on adaptive speech detection
NMS QX2000 hardware
Computer Telephony Access API

Speech Recognizer

Cepstral Features Computation: sig2mfcc
Input: speech recorded by the phone program
Output: one 13-component cepstral vector every 10 ms, on an 8 kHz bandwidth

Speech recognizer: nsearch
Inputs: commands and cepstral coefficients
Output: recognized text
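To make the frame rate concrete, the following sketch splits a signal sampled at 8000 Hz into overlapping analysis windows. The 80-sample shift follows from one vector every 10 ms at that rate. The 240-sample (30 ms) window is an assumption read off the -w240 -s80 options in the launch script, whose exact meaning the slides do not state. The function name is hypothetical, and the cepstral computation itself is not shown.

```python
def frame_starts(num_samples, window=240, shift=80):
    """Return the start index of every full analysis window.

    window and shift are in samples. At 8000 samples/s, shift=80 gives
    one frame every 10 ms; window=240 (30 ms) is an assumed value.
    """
    if num_samples < window:
        return []
    return list(range(0, num_samples - window + 1, shift))


SAMPLE_RATE = 8000  # Hz, assumed sampling rate for telephone speech
one_second = frame_starts(SAMPLE_RATE)
frame_period_ms = 1000 * 80 / SAMPLE_RATE

print(len(one_second), "full frames in one second")
print(frame_period_ms, "ms between frames")
```

Each start index would yield one 13-component vector, so the recognizer receives roughly 100 vectors per second of speech.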

Semantic Analyzer

Lexical normalization and labelling: sentprocess
Input: recognized sentence
Output: labelled sentence

Caseframe analysis: cases
Input: labelled sentence
Output: semantic frame
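The two steps can be sketched in a few lines. This is a much-simplified illustration, not the sentprocess or cases implementation: the lexicon, the label-to-slot table and both function names are hypothetical, and real caseframe analysis would also use markers such as *to to decide which place is the origin and which the destination.

```python
# Hypothetical lexicon: surface word -> (label, normalized value).
LEXICON = {
    "paris": ("$place", "paris"),
    "lille": ("$place", "lille"),
    "pour": ("*to", "pour"),
    "demain": ("demain", "demain"),
    "matin": ("*matin", "matin"),
}

# Hypothetical caseframe table: label -> slot of the semantic frame.
CASES = {
    "$place": "place",
    "*matin": "departure-period",
    "demain": "departure-date",
}


def label_sentence(sentence):
    """Lexical normalization and labelling: keep only known words."""
    labelled = []
    for word in sentence.lower().split():
        if word in LEXICON:
            labelled.append(LEXICON[word])
    return labelled


def fill_frame(labelled):
    """Caseframe analysis: collect (slot, value) pairs in sentence order."""
    frame = []
    for label, value in labelled:
        slot = CASES.get(label)
        if slot is not None:
            frame.append((slot, value))
    return frame


frame = fill_frame(label_sentence("Paris Lille pour demain matin"))
print(frame)
```

The frame is kept as a list of pairs rather than a dictionary because the same slot (here, place) can occur more than once in a sentence.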

Dialog Manager: Dialog
Input: semantic frame resulting from cases
Output: semantic frame to be converted into natural language
Contextual understanding
Database query generation
Semantic frame generation
Uses a powerful scripting language
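As an illustration of contextual understanding and query generation, the sketch below merges each new frame into a running context and builds a request from whatever slots are known so far. The slot names and the request layout imitate the request shown in the trace later in these slides; the functions, the default column list and the merge policy (newer values override older ones) are assumptions, not the behaviour of the actual dialog manager.

```python
def update_context(context, frame):
    """Contextual understanding: newer slot values override older ones."""
    merged = dict(context)
    merged.update(frame)
    return merged


def build_query(context, columns=("from", "deph", "to", "arrh")):
    """Database query generation from the slots known so far."""
    conditions = [
        f"{slot}={context[slot]}"
        for slot in ("from", "to", "day")
        if slot in context
    ]
    query = "select " + ", ".join(columns)
    if conditions:
        query += " WHERE " + " AND ".join(conditions)
    return query


# First turn gives the places, a later turn adds the day.
context = update_context({}, {"from": "paris", "to": "lille"})
context = update_context(context, {"day": "17/5/101"})
print(build_query(context))
```

Keeping the context separate from the current frame lets a later turn supply a missing slot without the caller repeating the whole request.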

Natural Language Generator: Genere
Input: semantic frame resulting from dialog
Output: natural language sentence
Uses hierarchic rules
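One way to picture hierarchic rules is a set of templates in which a rule may call other rules before the slots of the semantic frame are filled in. The rule syntax, the rule names and the English wording below are all hypothetical; the slides do not show the format of the generation grammar.

```python
import re

# Hypothetical rule set: <name> expands another rule, {slot} reads the frame.
RULES = {
    "answer": "There is a train <route> <times>.",
    "route": "from {from-place} to {to-place}",
    "times": "leaving at {dep} and arriving at {arr}",
}


def generate(rule, frame):
    """Expand a rule top-down, then substitute slot values from the frame."""
    body = RULES[rule]
    # Expand nested rules first; each expansion is itself fully generated.
    body = re.sub(r"<([\w-]+)>", lambda m: generate(m.group(1), frame), body)
    # Fill the slots that belong to this rule.
    return re.sub(r"\{([\w-]+)\}", lambda m: frame[m.group(1)], body)


frame = {
    "from-place": "Paris-Gare-du-Nord",
    "to-place": "Lille-Flandres",
    "dep": "08:58",
    "arr": "09:59",
}
sentence = generate("answer", frame)
print(sentence)
```

Splitting the sentence into sub-rules means the route and time phrases can be reused by other top-level rules, such as a confirmation or a summary.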

Information Retrieval Interface: Dbserver
Input: SQL query
Output: database result
Query parsing and translation
Retrieves information from the target database
Provides the result table
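The parsing half of that job can be sketched as follows. The request layout (select, a column list, WHERE, and clauses joined by AND) imitates the request shown in the trace later in these slides, including its ~= operator, whose semantics the slides do not define. The function name and the returned structure are assumptions, and translation to the target database format is not attempted.

```python
import re

# One clause: name, operator (= or ~=), value without spaces.
CLAUSE = re.compile(r"(\w+)\s*(~=|=)\s*(\S+)")


def parse_request(request):
    """Split 'select a, b WHERE x=1 AND y ~= 2' into columns and conditions."""
    head, _, tail = request.partition(" WHERE ")
    if not head.startswith("select "):
        raise ValueError("not a select request")
    columns = [c.strip() for c in head[len("select "):].split(",")]
    conditions = [m.groups() for m in CLAUSE.finditer(tail)]
    return columns, conditions


columns, conditions = parse_request(
    "select from, deph, to, arrh WHERE from=paris AND to=lille AND arrh ~= 1000"
)
print(columns)
print(conditions)
```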

Speech Synthesis System: Syn
Input: sentence resulting from genere
Output: speech signal, which is played by the telephone interface
Uses a unit dictionary
Selects the best sequence of units using a dynamic programming algorithm
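A generic form of that dynamic programming search is sketched below: for every position in the sentence there are several candidate units, and the search keeps, for each candidate, the cheapest path that ends there. The cost functions and the toy unit representation (text plus a position in the source recording) are assumptions for illustration; the slides do not describe the costs Syn actually uses.

```python
def select_units(candidates, target_cost, join_cost):
    """Pick one unit per position with minimal total cost.

    candidates is a non-empty list with one non-empty list of units
    per position. Returns (total_cost, chosen_units).
    """
    # best[i][j] = (cost of the best path ending in candidates[i][j], back-pointer)
    best = [[(target_cost(0, unit), None) for unit in candidates[0]]]
    for i in range(1, len(candidates)):
        row = []
        for unit in candidates[i]:
            cost, back = min(
                (best[i - 1][k][0] + join_cost(prev, unit), k)
                for k, prev in enumerate(candidates[i - 1])
            )
            row.append((cost + target_cost(i, unit), back))
        best.append(row)

    # Trace back from the cheapest final unit.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    total = best[-1][j][0]
    path = []
    for i in range(len(candidates) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    path.reverse()
    return total, path


# Toy units: (text, index in the source recording). Joining two units that
# were adjacent in the recording is free, any other join costs 1.
candidates = [
    [("le", 10), ("le", 40)],
    [("train", 7), ("train", 41)],
    [("part", 42), ("part", 90)],
]
total, path = select_units(
    candidates,
    target_cost=lambda position, unit: 0,
    join_cost=lambda a, b: 0 if b[1] == a[1] + 1 else 1,
)
print(total, path)
```

The search costs time proportional to the number of positions times the square of the candidates per position, instead of enumerating every possible sequence.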

C-shell script

# ------------------------ Phone interface ----------------------- #
rsh $remote $bin/phone.exe h$dialhost $dialport t70 x8192 n2 \
    l2 g f$cfg/cta.cfg a$data&

# -------------------- Speech recognizer loading ----------------- #
# ---- SigToCep ---- #
set CEP = \
    $bin/sig2mfcc -w240 -s80 -l20 -n12 -r8000 -b0:3500 -c -en0-0 \
    --$fifo/tosentrec.fifo$i $fifo/tosig2mfcc.fifo$i -:

# ---- Speech Recognizer ----#
set RECO = \
    $bin/nsearch -@$phones -d$fifo/tosentrec.fifo$i -t \
    -p${plist}:$stbl -s0:160:0:f -l$voc -z3 -w4:25 -n1 -q63,12:8:3 \
    -zb$tg -zw30 -zr -xg$gsl -zy$clst -sw50 -sh25000 \
    -cmr${cepmean}:0.996 -en4.5 -- $hmm -xf

$bin/recocheck -r $fifo/torecord.fifo$i -c$cep d$reco \
    -t$fifo/fromdial.fifo$i -v < \
    $fifo/pushtotalk.fifo$i > $fifo/tocases.fifo$i &

#-------- Semantic Analyzer and Dialogue loading ---------#
$bin/sentprocess -k -t -d -c -v2 $dial/rules.txt < \
    $fifo/tocases.fifo$i | \
$bin/cases -k -o -m -v $dial/caseframe.txt | \
$bin/dialogue -i -v1 $dial/task.txt $dial/dial.arg \
    -tr$fifo/pushtotalk.fifo$i -fp$fifo/fromplay.fifo$i \
    -fn$fifo/todial.fifo$i -rf$tmp/reco.tmp$i \
    -e$fifo/fromdial.fifo$i -fg$fifo/fromgenere.fifo$i \
    -tt$fifo/todb.fifo$i -ft$fifo/fromdb.fifo$i | \
$bin/genere $dial/genere.txt -f$fifo/fromgenere.fifo$i v > $fifo/tosyn.fifo$i&

# ----------------------- Dispatcher ----------------------------- #
$bin/dispatcher -v -p$synt/sig/prompt.sig -f$synt/sig/dtmf.sig \
    -l"$logcmd" -s$fifo/torecord.fifo$i -db$fifo/fromplay.fifo$i \
    -dt$fifo/todial.fifo$i -df$fifo/fromdial.fifo$i -dp$dialpid \
    -kf$fifo/fromdbconn.fifo$i -kt$fifo/todb.fifo$i \
    -kw$synt/sig/waitdb.sig -kl$synt/sig/wait.sig -v r \
    < $fifo/fromphone.fifo$i > $fifo/tophone.fifo$i &

# ------------------- Database Loading --------------------------- #
$bin/dbserver -t$fifo/todbtarg.fifo$i -f$fifo/fromdbtarg.fifo$i \
    -c$db/table1.txt -s$db/table2.txt -p$db/table3.txt \
    -d$fifo/fromdbconn.fifo${i}:120 -m10 -a -v2 \
    < $fifo/todb.fifo$i > $fifo/fromdb.fifo$i &

# -------------------- Synthesis loading ------------------------- #
$bin/syn -s${sig}:2 -l$wd -w4:2:0 -o$fifo/toplay.fifo$i -c \
    $synt/wdlist.lst &
$bin/play -i$fifo/toplay.fifo$i -o$fifo/tophone.fifo$i -p v &
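The launch script wires its components together through named pipes (the .fifo files). The sketch below shows the same mechanism in miniature on a Unix system: one stage reads lines from a FIFO, transforms them and writes to the next FIFO. The two file names are borrowed from the script only for flavour; the stage function, the upper-casing transform and the use of threads in place of separate processes are assumptions for illustration.

```python
import os
import shutil
import tempfile
import threading


def stage(in_path, out_path, transform):
    """Read lines from one FIFO, transform them, write them to the next."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(transform(line.rstrip("\n")) + "\n")


def run_pipeline(lines):
    """Push lines through a single stage connected by two named pipes."""
    workdir = tempfile.mkdtemp()
    first = os.path.join(workdir, "tocases.fifo")
    second = os.path.join(workdir, "todial.fifo")
    os.mkfifo(first)
    os.mkfifo(second)

    def feed():
        # Opening a FIFO blocks until the other end is opened too.
        with open(first, "w") as out:
            for line in lines:
                out.write(line + "\n")

    worker = threading.Thread(target=stage, args=(first, second, str.upper))
    feeder = threading.Thread(target=feed)
    worker.start()
    feeder.start()
    try:
        with open(second) as result:
            received = [line.rstrip("\n") for line in result]
    finally:
        feeder.join()
        worker.join()
        shutil.rmtree(workdir)
    return received


print(run_pipeline(["paris lille", "oui"]))
```

Because each end of a FIFO blocks until its peer is opened, components started in the background with & can come up in any order and still connect.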

How the system works

server.csh: telephone interface loading
server.csh: speech recognizer loading
server.csh: dialog loading
server.csh: dispatcher loading
server.csh: dbserver loading
server.csh: synthesis loading
telephone: pickup
telephone: line number=[0]
telephone: play
telephone: get dtmf [*]
dialogue: frame: { concept: (acte formalite-ouverture). }
genere: Quel voyage souhaitez-vous effectuer? [What journey would you like to make?]
telephone: play
telephone: end of play
telephone: recording
nsearch: <s> Paris Lille pour demain matin </s> [Paris Lille for tomorrow morning]

sentprocess: Paris -> $place Lille -> $place matin -> *matin
             $place(paris) $place(lille) *to(pour) demain(demain) *matin(matin)
cases: <defaut> { place: Paris. place: Lille. departure-period: *matin. departure-date: demain. }
dialogue: request=[select from, deph, to, arrh, chg, day, stopa, stopah, stopd, stopdh, stopdur, type WHERE from=paris AND to=lille AND day=17/5/101 AND arrh ~= 1000]
dbserver: target query=[00043 00000001? 12 FRPAR FRLIL 17 MAY 1000]
dbserver: result=[1 ( from deph to arrh chg day stopa stopah stopd stopdh stopdur type )( Paris-Gare-du-Nord 0858 Lille-Flandres 0959 0 17/5/101 ----- 0959 ----- ----- ----- TGN )]

dialogue: { concept: (acte response) (type positive) (value train-hour).
            nb-trains: (value 1).
            concept2: (acte confirmation) (value hour).
            from-place: (value Paris-Gare-du-Nord).
            to-place: (value Lille-Flandres).
            departure-wday: (value jeudi).
            departure-day: (value 17/5/101).
            departure-period: (value *matin).
            stop: (value 0).
            sched: (dep 0858) (arr 0959). }
genere: le matin, jeudi dix-sept mai vous avez un train de Paris-Nord à Lille-Flandres à huit heures cinquante-huit arrivant à neuf heures cinquante-neuf. Cet horaire vous convient-il?
        [In the morning, Thursday 17 May, there is a train from Paris-Nord to Lille-Flandres at 8:58, arriving at 9:59. Does this time suit you?]
nsearch: <s> oui </s> [yes]
cases: <defaut> { mode: *affirmatif. }
dialogue: { concept: (acte relance) (value retour). }
genere: Souhaitez-vous le retour? [Would you like the return journey?]
nsearch: recognized string: <s> non merci </s> [no thank you]
genere: vous avez donc un aller Paris-Nord Lille-Flandres le jeudi dix-sept mai départ huit heures cinquante-huit, arrivée neuf heures cinquante-neuf. Souhaitez-vous un autre trajet?
        [So you have a one-way Paris-Nord to Lille-Flandres on Thursday 17 May, departing 8:58, arriving 9:59. Would you like another journey?]
nsearch: recognized string: <s> non merci </s> [no thank you]
genere: au-revoir, le système Recital vous remercie et vous souhaite un bon voyage.
        [Goodbye, the Recital system thanks you and wishes you a pleasant journey.]
telephone: hangup

Des Hommes de Parole, October 2001
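The dbserver result line in the trace is compact enough to parse by hand: a row count, a parenthesized header, then one parenthesized group per row. The sketch below turns it into dictionaries and decodes the day field. Both function names are hypothetical, and reading the year 101 as 1900 + 101 is an assumption, although it agrees with the trace, which reports the departure day as jeudi (Thursday).

```python
import re
from datetime import date

RESULT = (
    "1 ( from deph to arrh chg day stopa stopah stopd stopdh stopdur type )"
    "( Paris-Gare-du-Nord 0858 Lille-Flandres 0959 0 17/5/101 "
    "----- 0959 ----- ----- ----- TGN )"
)


def parse_result(result):
    """Parse '<n> ( col ... )( val ... )' into a list of row dictionaries."""
    count_text, _, rest = result.strip().partition(" ")
    groups = [group.split() for group in re.findall(r"\(([^()]*)\)", rest)]
    header, rows = groups[0], groups[1:]
    if len(rows) != int(count_text):
        raise ValueError("row count does not match the announced number")
    return [dict(zip(header, row)) for row in rows]


def parse_day(text):
    """Decode 'd/m/y', assuming the year counts from 1900 (101 -> 2001)."""
    day, month, year = (int(part) for part in text.split("/"))
    return date(1900 + year, month, day)


rows = parse_result(RESULT)
travel_day = parse_day(rows[0]["day"])
print(rows[0]["from"], rows[0]["deph"], rows[0]["to"], rows[0]["arrh"])
print(travel_day.isoformat(), travel_day.weekday())  # weekday(): Monday is 0
```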

Distributed Architecture

Host 1 (Master): Vnetd daemon; recognizer, dialogue, speech synthesis, net audio server, other services
Host n (Slave): Vnetd daemon; recognizer, dialogue, speech synthesis, other services
Network (TCP/UDP)
Application Programming Interface (data exchange protocols, service name-address resolution)
Client applications 1, 2, ..., m
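Service name-address resolution can be pictured as a small registry that maps a service name to the hosts offering it. The class below is a hypothetical sketch: the host names, port numbers and the round-robin choice between hosts are assumptions, and nothing here reflects the actual Vnetd protocol.

```python
class ServiceRegistry:
    """Maps a service name to the (host, port) addresses that offer it."""

    def __init__(self):
        self._addresses = {}
        self._next = {}

    def register(self, name, host, port):
        """Record that a host offers the named service."""
        self._addresses.setdefault(name, []).append((host, port))

    def resolve(self, name):
        """Return one address for the service, rotating between hosts."""
        addresses = self._addresses.get(name)
        if not addresses:
            raise LookupError(f"no host offers service {name!r}")
        index = self._next.get(name, 0)
        self._next[name] = (index + 1) % len(addresses)
        return addresses[index]


registry = ServiceRegistry()
registry.register("recognizer", "host1", 5001)
registry.register("recognizer", "hostn", 5001)
registry.register("synthesis", "host1", 5003)

first = registry.resolve("recognizer")
second = registry.resolve("recognizer")
third = registry.resolve("recognizer")
print(first, second, third)
```

A client application then needs to know only the service name; which host answers is decided at resolution time.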

Services
1. Audio
2. Speech recognition
3. Dialog (understanding, dialog and generation)
4. Information retrieval
5. Speech synthesis
6. Application manager

Galaxy Communicator

Similarities between GC and Oasis:
A distributed client/server architecture
A central manager: the hub in GC and the application manager in Oasis
A set of services listening for client connections and requests

Make Oasis Services GC Compliant

Include the GC server functions in all services:
    perform initialization
    include a dispatch function
    invoke the hub by using the GalIO_Comm family of functions
Use the brokering mechanism
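The idea of a dispatch function can be shown generically: a service receives a message naming an operation and routes it to the matching handler. The sketch below is not the Galaxy Communicator API; the message layout, the operation names and the handlers are hypothetical.

```python
def make_dispatcher(handlers):
    """Build a dispatch function from a table of operation handlers."""

    def dispatch(message):
        operation = message.get("operation")
        handler = handlers.get(operation)
        if handler is None:
            return {"status": "error", "reason": f"unknown operation {operation!r}"}
        return {"status": "ok", "result": handler(message.get("data"))}

    return dispatch


# Hypothetical service exposing two operations.
dispatch = make_dispatcher(
    {
        "normalize": lambda text: text.lower().split(),
        "count": lambda text: len(text.split()),
    }
)

print(dispatch({"operation": "normalize", "data": "Paris Lille"}))
print(dispatch({"operation": "translate", "data": "oui"}))
```

Keeping the handlers in a table means a service can add an operation without touching the routing code.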

Tests and Evaluation

The speech recognizer only
The dialog connected to the database
The dialog with the recognizer
The whole system
Supported platforms: Dec, Sgi, Linux, Windows