INTERACTIVE VOICE RESPONSE WITH AUTOMATED SPEECH RECOGNITION AND BIOMETRICS FOR BANWEB




SCHOOL OF ENGINEERING AND APPLIED SCIENCE
THE GEORGE WASHINGTON UNIVERSITY

PROJECT REPORT

INTERACTIVE VOICE RESPONSE WITH AUTOMATED SPEECH RECOGNITION AND BIOMETRICS FOR BANWEB

Group Name: Telecommunication For Academic Purpose
Presented By: Jaywant Kapadnis, Mandar Patil, Parameswaran Krishnan

Abstract

In this project we have built a student university account system, called BanWeb, for GWU using FreeSWITCH. A student can simply dial their university ID number (i.e. Gworld number) to access their university account, which contains their academic, employment and personal information. The system is fully automated and is secured using voice-recognition biometrics and a speech-operated IVR.

Contents
Chapter 1 Block Diagram and Flow Chart
Chapter 2 All You Need to Know About FreeSWITCH
Chapter 3 Biometrics Voice Recognition
Chapter 4 Speech Operated IVR
Chapter 5 Data Base Connect
Chapter 6 Hacking Banweb Server

Chapter 1 Block Diagram And Flow Chart

The block diagram consists of three major parts: FreeSWITCH, Matlab and the database. The student first uses a softphone (X-Lite 4) to establish a call to their university banweb account by dialing their Gworld number. This extension triggers a Lua script, which calls Matlab and runs a Matlab program for voice recognition. After user authentication, Matlab returns its result to Lua, which then calls a JavaScript script. The JavaScript implements a speech-operated IVR that interacts directly with the student and gives access to their banweb menu. All the data is extracted from a database. Students can also edit their banweb information.
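The call flow described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not the project's Lua, Matlab or JavaScript code: the names `CallSession`, `handle_call` and `verify_voice`, the in-memory dictionary standing in for the database, and the returned strings are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CallSession:
    """State carried through one call (all fields are illustrative)."""
    gworld_number: str
    log: List[str] = field(default_factory=list)


def handle_call(
    session: CallSession,
    verify_voice: Callable[[str], bool],
    database: Dict[str, Dict[str, str]],
) -> str:
    """Mirror the report's flow: entry script, voice check, database, IVR."""
    # Stands in for the Lua script started by the dialed extension.
    session.log.append("entry script started")

    # Stands in for the Matlab voice-recognition program.
    if not verify_voice(session.gworld_number):
        session.log.append("authentication failed")
        return "access denied"
    session.log.append("authentication passed")

    # Stands in for the database lookup (hypothetical in-memory dict).
    record = database.get(session.gworld_number)
    if record is None:
        session.log.append("no record found")
        return "no account"

    # Stands in for the JavaScript speech-operated IVR.
    session.log.append("IVR menu started")
    return "menu: " + ", ".join(sorted(record))


if __name__ == "__main__":
    demo_db = {
        "031881753": {
            "academic": "placeholder",
            "employment": "placeholder",
            "personal": "placeholder",
        }
    }
    session = CallSession("031881753")
    print(handle_call(session, lambda number: True, demo_db))
    print(session.log)
```

Passing the voice check in as a function keeps the authentication step separate from the menu logic, which loosely mirrors the report's split between Matlab and the scripts that call it.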

[Figure: Flow Chart]

Chapter 2 All You Need To Know About FreeSWITCH

A major advantage of FreeSWITCH is that it is open-source software. For this project FreeSWITCH 1.0.6, the latest version at the time of the project, was used. The software was built on Mac OS as well as Windows. A VoIP softphone integrated with FreeSWITCH is used at the endpoints for communication: X-Lite 4, which uses the Session Initiation Protocol (SIP) and is developed by CounterPath Corporation.

The most used feature of FreeSWITCH is its modules. The modules used in this project are listed below.

Applications
mod_conference - Conference room module.
mod_directory - Dial by Name directory.
mod_dptools - Dialplan Tools: provides a number of apps and utilities for the dialplan.
mod_fifo - FIFO module.

Speech Recognition / Text-to-Speech
mod_flite - Free open source Text to Speech.
mod_pocketsphinx - Free open source Speech Recognition.

Dialplan
mod_dialplan_xml - Allows you to program dialplans in XML format.

File Formats
mod_local_stream - Multiple channels connected to same looped file stream.

Languages
mod_lua - Lua support.
mod_spidermonkey - JavaScript support.
  o mod_spidermonkey_core_db - JavaScript support for the FreeSWITCH SQLite.
  o mod_spidermonkey_skel - JavaScript dummy module.
  o mod_spidermonkey_teletone - JavaScript support for lib_teletone.
  o mod_spidermonkey_odbc - JavaScript support for ODBC.

The most extensively used modules in this project are mod_spidermonkey, mod_lua, mod_flite and mod_pocketsphinx. The outer core of the project is a Lua script. Lua is a powerful, fast, lightweight, embeddable scripting language. It combines simple procedural syntax with powerful data description constructs based on associative arrays and extensible semantics. Lua is dynamically typed, runs by interpreting bytecode for a register-based virtual machine, and has automatic memory management with incremental garbage collection, making it ideal for configuration, scripting and rapid prototyping.

mod_flite is an open-source text-to-speech engine that converts written text to speech. mod_pocketsphinx allows FreeSWITCH to recognize speech. This is the basic difference between the two: mod_flite converts text to speech, while mod_pocketsphinx recognizes speech. mod_pocketsphinx uses 8k and 16k acoustic models and semi-continuous recognition, and is well suited to small grammars. mod_spidermonkey embeds the Mozilla SpiderMonkey JavaScript (ECMAScript) engine. It supports all the standard JavaScript language elements, for example for and while loops, regexps and many others.

Chapter 3 Biometrics Voice Recognition

It was very important to secure the student's account (banweb) over the phone so that no one other than the intended user could use it. As voice was the only medium of data transfer, voice recognition was used to secure the interaction. The most common approaches to voice recognition fall into two classes: "template matching" and "feature analysis". Template matching has the highest accuracy when used properly, but it also suffers from the most limitations. This project uses the template matching approach. When the student first registers for the service, they record their name ten times; these recordings are combined into a template that is matched against later, when the student logs in by calling. After calling the system by dialing zero followed by their Gworld or identification number (e.g. 031881753), the first step is to speak their name. The electrical signal from the microphone is digitized by an analog-to-digital (A/D) converter and stored in memory.
To determine the "meaning" of this voice input, the computer attempts to match the input with a digitized voice sample, or template, that has a known meaning. This technique is closely analogous to traditional command input from a keyboard. The program contains the input template and attempts to match it against the actual input using a simple conditional statement. A more general form of voice recognition is available through feature analysis, and this technique usually leads to "speaker-independent" voice recognition. Thus by working on the feature aspect of voice it was possible to get a very precise match to the actual user and also to suppress the background noise to a great extent.

Matlab

Matlab is a very powerful tool for working with voice. Its audio-processing capabilities enable it to perform many operations on any voice file. The following is the block diagram of the function of Matlab in the project.
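As a rough illustration of the template-matching idea described above, the following Python sketch builds a template by averaging the normalized power spectra of several enrollment recordings and then compares a new recording against it. It is not the project's Matlab program. The function names, the naive DFT, the synthetic test tones, the Euclidean distance measure and the threshold value are all assumptions made for the example; in particular, the distance test here is a stand-in and not the standard-deviation comparison the report uses.

```python
import cmath
import math
from typing import List, Sequence


def power_spectrum(samples: Sequence[float], n_bins: int) -> List[float]:
    """Return |DFT|^2 for the first n_bins bins, using a naive DFT."""
    n = len(samples)
    spectrum = []
    for k in range(n_bins):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        spectrum.append(abs(acc) ** 2)
    return spectrum


def normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit Euclidean length (zero vectors stay unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def build_template(recordings: Sequence[Sequence[float]], n_bins: int) -> List[float]:
    """Average the normalized power spectra of the enrollment recordings."""
    length = min(len(r) for r in recordings)  # truncate to the shortest one
    spectra = [normalize(power_spectrum(r[:length], n_bins)) for r in recordings]
    mean = [sum(column) / len(spectra) for column in zip(*spectra)]
    return normalize(mean)


def matches(
    template: Sequence[float],
    attempt: Sequence[float],
    n_bins: int,
    threshold: float,
) -> bool:
    """Accept when the spectral distance to the template is below threshold.

    Assumes the attempt has the same length as the truncated enrollment
    recordings, so that frequency bins line up.
    """
    spectrum = normalize(power_spectrum(attempt, n_bins))
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(template, spectrum)))
    return distance < threshold


def tone(freq_bin: int, n: int, phase: float) -> List[float]:
    """Synthetic stand-in for a recording: one sine wave at a given DFT bin."""
    return [math.sin(2 * math.pi * freq_bin * t / n + phase) for t in range(n)]


if __name__ == "__main__":
    enrollment = [tone(5, 64, p) for p in (0.0, 0.3, 0.6)]
    template = build_template(enrollment, n_bins=20)
    print(matches(template, tone(5, 64, 1.0), n_bins=20, threshold=0.5))
    print(matches(template, tone(12, 64, 0.0), n_bins=20, threshold=0.5))
```

Real speech would need far longer signals and a proper FFT routine. The short synthetic tones only keep the example self-contained and quick to run.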

A separate Lua program is written so that the student can initially record their 10 voice samples to the database, to be used later for matching. On registering for banweb, the student calls an extension and records the 10 voice samples.

The Matlab program for voice recognition is explained step by step below.

Step 1 Call the wav file recorded in FreeSWITCH.
Step 2 Read the contents of the wav file.
Step 3 Work out the number of samples in 2 seconds.
Step 4 The program is designed for 88200 samples. This number follows from the recording frequency: the recording frequency in the dialplan is changed to 44100 Hz and the recording is kept as short as 2 seconds (44100 samples/s x 2 s = 88200 samples). Write the resampled output wav file over the original wav file.
Step 5 The newly recorded file is then cropped and placed in a 88200x20 matrix.
Step 6 The rows of the matrix are truncated to the smallest length of the 10 recordings.
Step 7 Convert the individual columns into the frequency domain by applying the Fast Fourier Transform. Then take the modulus squared of all the entries in the matrix.
Step 8 Normalize the spectrum of each recording and place it into the matrix fn. Only frequencies up to 600 are needed to represent the speech of most humans.
Step 9 Find the average vector.
Step 10 Normalize the average vector.
Step 11 Find the standard deviation of the matrix of voice samples recorded from the user.
Step 12 The 10 recordings form a database to match against the user's newly recorded sample. The 10 recordings are then cropped and placed in a 88200x20 matrix.
Step 13 The rows of the matrix are truncated to the smallest length of the 10 recordings.
Step 14 Convert the individual columns into the frequency domain by applying the Fast Fourier Transform. Then take the modulus squared of all the entries in the matrix.
Step 15 Normalize the spectrum of each recording and place it into the matrix fn. Only frequencies up to 600 are needed to represent the speech of most humans.
Step 16 Find the average vector and normalize it.
Step 17 Find the standard deviation of the pre-recorded samples. A pre-saved doc file is opened and the value in the doc file is changed.

Step 18 The standard deviation of the 10 pre-recorded samples (std) is subtracted from the standard deviation (std1) of the recorded sample that is to be matched.
Step 19 If the user's voice matches the samples in the database, the value 1 is written to the doc file; otherwise 0 is written. The value in the doc file determines whether the user is authorized to enter the system.

Chapter 4 Speech Operated IVR

Interactive Voice Response (IVR) is an automated telephony system that interacts with callers, gathers information and routes calls to the appropriate destination. IVR allows customers to interact with a company's database through speech recognition, after which they can service their own inquiries by following the IVR dialogue. IVR systems can respond with prerecorded or dynamically generated audio to further direct users on how to proceed. IVR applications can be used to control almost any function where the interface can be broken down into a series of simple interactions. IVR systems deployed in the network are sized to handle large call volumes. In this project the function of the IVR is to simplify the user interface. FreeSWITCH IVRs can be written in any language that FreeSWITCH supports, including JavaScript, Python, Perl, Lua and an XML macro format; of these, JavaScript is used in this project for better results. In order to run JavaScript, the mod_spidermonkey module must be loaded in FreeSWITCH. The flowchart of the IVR is given below.
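The idea of an interface "broken down into a series of simple interactions" can be illustrated with a small menu state machine. The Python sketch below is not the project's JavaScript IVR and does not use any FreeSWITCH API: the `MENU` table, the `back` command, the prompt wording and the function names are assumptions made for the example, and the recognized words are passed in as plain strings where the real system would use a speech recognizer.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical menu: each state maps a recognized word to the next state.
MENU: Dict[str, Dict[str, str]] = {
    "main": {
        "academic": "academic",
        "employment": "employment",
        "personal": "personal",
    },
    "academic": {"back": "main"},
    "employment": {"back": "main"},
    "personal": {"back": "main"},
}


def prompt_for(state: str) -> str:
    """Build the prompt a text-to-speech engine would read in this state."""
    return f"You are in the {state} menu. Say: " + ", ".join(sorted(MENU[state]))


def run_ivr(utterances: Iterable[str], start: str = "main") -> List[Tuple[str, str]]:
    """Walk the menu and return the (state, prompt) pair after each step."""
    state = start
    trace = [(state, prompt_for(state))]
    for heard in utterances:
        word = heard.strip().lower()
        next_state = MENU[state].get(word)
        if next_state is None:
            # Unrecognized word: stay in the same state and re-prompt.
            trace.append((state, "Sorry, I did not understand. " + prompt_for(state)))
        else:
            state = next_state
            trace.append((state, prompt_for(state)))
    return trace


if __name__ == "__main__":
    for state, prompt in run_ivr(["academic", "hello", "back"]):
        print(state, "|", prompt)
```

Keeping the menu in a data table means new options can be added without touching the dialogue loop.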

Chapter 5 Data Base Connect

A database is a system intended to organize, store, and retrieve large amounts of data easily. A database connection was implemented for retrieving student information because of the large number of entries and their detailed subsections. A database also makes it easy to edit, sort and search student details, which makes the banweb system more effective, and it provides secure data transactions with high flexibility. MySQL, a database management system queried with Structured Query Language (SQL), was used to prepare a database with all the entries. A phpMyAdmin server was set up on the host computer and a database named freeswitch was built on it. The database stored student details in the form of text. Each entry was fetched into FreeSWITCH and stored in a variable. This variable was read out by FreeSWITCH using the mod_flite module. These variables were loaded into FreeSWITCH using PHP.

The BANWEB IVR written in JavaScript works as follows:

1. The user uses voice-operated options to move through the JavaScript.
2. The main speech detection module is PocketSphinx, which can be loaded in FreeSWITCH and is used in the JavaScript.
3. JSGF grammar files are used, which enable the speech detection.
4. The Java Speech Grammar Format (JSGF) is a BNF-style, platform-independent, and vendor-independent textual representation of grammars for use in speech recognition.
5. The module defines the speech tool, which is used to detect the speech. The default language is English. (SpeechTool.jm)
6. A JSGF grammar can be customized as needed and used in the JavaScript.
7. PocketSphinx uses a default English grammar file. PocketSphinx uses this dictionary when no other dictionary is specified in the script.
8. We have used the default English grammar, which suffices for the application we have made.
9. When a user says a word or a sentence, it is matched against the predefined word or sentence, and if a match occurs the script moves forward.

10. The user has also been given the privilege to edit a few options in BANWEB.
11. The user-edited option is saved as a wav file. This wav file can be retrieved by BANWEB personnel, who listen to it, authenticate it and then write it to a text file in a database.
12. This written entry is then fetched by the JavaScript, so that when the user returns the next time they can hear the edited information.
13. Once an option has been edited, the user can hear the recorded information at any time, provided that they access BANWEB a second time.

Chapter 6 Hacking Banweb Server

It is important that a student server is secure and that the information on it cannot be read or changed easily. For this hacking exercise a network was set up in which a user establishes a call to the banweb server, and a man-in-the-middle attack was initiated. The attacker intercepts the traffic of the user calling the server by Address Resolution Protocol (ARP) spoofing with ARPSPOOF, run from BackTrack over the wireless network. This allows the attacker to sniff data frames on a wireless local area network. ARPSPOOF sends fake ARP messages onto the WLAN. The tools used in the attack are SMAP, which was used as a SIP scanner, ARPSPOOF, and Wireshark to capture the packets. An RTP capture was made successfully, then decoded and played back. The portion of the recorded file where the user/student says their name is clipped and injected when Matlab asks for user authentication.

Steps
1. Bridged the main OS network with the VM network and used SMAP to scan for SIP.
2. A call was established between the user and the server.
3. The packets were captured by the attacker performing the man-in-the-middle attack using Address Resolution Protocol spoofing (ARPSPOOF).
4. The SIP and RTP packets were captured in Wireshark.
5. The attacker then dialed the banweb number and played the RTP packets back into the FreeSWITCH server, and was able to log in to the user's banweb successfully.