Text Independent Speaker Verification System

Project Advisor: Professor Lawrence Saul

Abstract: User identification and verification are essential to any security system today, as attackers find more and more ways to defeat even the most complex security measures. Biometric recognition systems are in demand because they rely on human features that are unique to a person and cannot easily be forged, such as the face, fingerprints and voice. Like a fingerprint, a person's voice has unique features, and this voiceprint can be used to verify their identity. The goal of my project is to design and implement a text-independent speaker verification system. This means that regardless of what the user says, the system should be able to verify whether the user is the person they claim to be. Such a system would be useful in banks, at ATMs, and in telephone-based applications, where there is no way to identify a user by fingerprint or face.

Related Work: Speech recognition is not a new subject, but it is a growing industry, and new methods to exploit this human trait are continuously being developed. A lot of research has been done on text-independent speaker verification systems using Gaussian mixture models, and my project is a simple implementation of that approach. I will be using published papers on this topic to assist me in my goal.

Technical Approach: The objective of this project is to implement a single-speaker verification system. Statistically speaking, it is a test between two hypotheses:

H0: Y is from the hypothesized speaker S
H1: Y is not from the hypothesized speaker S

The decision is made with the likelihood ratio test

$$\frac{p(Y \mid H_0)}{p(Y \mid H_1)} \; \begin{cases} > \theta, & \text{accept } H_0 \\ < \theta, & \text{accept } H_1 \end{cases}$$

where θ is the decision threshold. (Figure taken from "A Tutorial on Text-Independent Speaker Verification"; the image is not included in this transcription.)

The output of front-end processing is a sequence of feature vectors $X = \{x_1, x_2, \ldots, x_T\}$, where $x_t$ is the feature vector at discrete time $t \in \{1, 2, \ldots, T\}$. These features are then used to compute the likelihoods of H0 and H1. The log of the likelihood ratio above is then:

$$\Lambda(X) = \log p(X \mid H_0) - \log p(X \mid H_1)$$

We need two models for this test to work: the speaker model and the background model. I have planned three stages for implementing this system:

Training Phase: Generate the background model
Tuning Phase: Generate the individual speaker models
Testing Phase: Test the system using new wave files from test speakers

I am using the Gaussian mixture model (GMM) for the likelihood function, so the mixture density for a D-dimensional feature vector x is:
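The equation image for this density is missing from the transcription. As an assumption about what it showed, the form commonly given in the GMM speaker verification literature (for example the Reynolds et al. paper listed in the references) is sketched here, with M the number of mixture components, $w_i$ the mixture weights, and $\mu_i$, $\Sigma_i$ the mean vector and covariance matrix of component i:

```latex
\begin{align}
p(x \mid \lambda) &= \sum_{i=1}^{M} w_i \, p_i(x), \qquad \sum_{i=1}^{M} w_i = 1 \\
p_i(x) &= \frac{1}{(2\pi)^{D/2} \, |\Sigma_i|^{1/2}}
          \exp\left\{ -\tfrac{1}{2} (x - \mu_i)^{\top} \Sigma_i^{-1} (x - \mu_i) \right\}
\end{align}
```

Here $\lambda = \{w_i, \mu_i, \Sigma_i\}$ collects the model parameters. Whether the report used full or diagonal covariance matrices is not stated.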

[Equation image not available in this transcription.]

The GMM parameters (mean, variance, etc.) are calculated using the Expectation-Maximization (EM) algorithm. It is an iterative process that monotonically increases the likelihood of the estimated model for the observed feature vectors, such that for iterations k and k+1,

$$p(X \mid \lambda^{(k+1)}) \ge p(X \mid \lambda^{(k)})$$

The weight, mean, and variance parameters:

[Equation image not available in this transcription.]
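As an illustration of the monotonic-likelihood property, here is a minimal sketch of one EM iteration for a diagonal-covariance GMM in plain Python. It is not the training code used in the project. The function names, the diagonal-covariance choice, the variance floor and the toy data are all assumptions made for the example. The update rules in the comments are the standard textbook ones, which the missing equation image most likely showed.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return -0.5 * sum(math.log(2 * math.pi * v) + (a - m) ** 2 / v
                      for a, m, v in zip(x, mean, var))

def log_likelihood(data, weights, means, variances):
    """Total log-likelihood of the data under the mixture."""
    return sum(
        math.log(sum(w * math.exp(log_gauss(x, m, v))
                     for w, m, v in zip(weights, means, variances)))
        for x in data
    )

def em_step(data, weights, means, variances, var_floor=1e-3):
    """One EM iteration; returns updated (weights, means, variances)."""
    T, D, M = len(data), len(data[0]), len(weights)
    # E-step: posterior probability of each component for each frame.
    resp = []
    for x in data:
        joint = [w * math.exp(log_gauss(x, m, v))
                 for w, m, v in zip(weights, means, variances)]
        norm = sum(joint)
        resp.append([j / norm for j in joint])
    # M-step: re-estimate weight, mean and variance of each component.
    new_w, new_m, new_v = [], [], []
    for i in range(M):
        n_i = sum(r[i] for r in resp)                       # soft count
        mean = [sum(r[i] * x[d] for r, x in zip(resp, data)) / n_i
                for d in range(D)]
        var = [max(sum(r[i] * x[d] ** 2 for r, x in zip(resp, data)) / n_i
                   - mean[d] ** 2, var_floor)               # floor is an assumption
               for d in range(D)]
        new_w.append(n_i / T)
        new_m.append(mean)
        new_v.append(var)
    return new_w, new_m, new_v

# Toy 1-D data with two clusters (made-up values, not project data).
data = [[-2.1], [-1.9], [-2.0], [-1.5], [2.0], [2.2], [1.8], [2.5]]
params = ([0.5, 0.5], [[-1.0], [1.0]], [[1.0], [1.0]])
history = [log_likelihood(data, *params)]
for _ in range(5):
    params = em_step(data, *params)
    history.append(log_likelihood(data, *params))
```

After the loop, `history` holds the log-likelihood before training and after each iteration; on this toy data it never decreases, which is the behaviour the inequality above describes.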

Data Collection: To implement this system, recorded speech data is required. I have recorded clips from 25 speakers. Each speaker's data set consists of 15 speech clips of varying lengths. I split this data set into three categories: Training, Tuning and Testing. These are the three phases of the project, and data is required in each of them. I have used 9 of the 15 clips for training, 3 more for tuning, and the remaining 3 for testing the application. To record the clips, I used a microphone and recording software called GoldWave. One factor that affected the results was the distance between the microphone and the speaker's mouth: if it was too close or too far, the results were skewed. I realized this at a later stage, and so had to ask a few speakers to record more test clips.

Training Phase: In the training phase, the background model is created. The background model is basically a large pool of all sample data, modeled as one large Gaussian mixture model. I have converted the wave files into a different format so that they can be used for this analysis. The wave file is a continuous signal, which must be broken down into discrete parameter vectors. Each vector covers about 10 ms, because we assume that the signal is stationary over this duration. This is not strictly true, but it is a reasonable approximation to make. The format I have used is MFCC, which stands for Mel Frequency Cepstral Coefficients. The conversion can be done as follows (Logan, Beth, "Mel Frequency Cepstral Coefficients for Music Modeling"):

1. Divide the signal into frames.
2. For each frame, obtain the amplitude spectrum.
3. Take the logarithm.
4. Convert to the Mel (a perceptually based) spectrum.
5. Take the discrete cosine transform (DCT).

However, instead of doing this manually, I used the HTK Toolkit to automate the process. Once the files are in the correct format, it is important to discard all the silence frames and keep only the speech samples. From these I generate the mfcc.speech files.
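The five conversion steps listed above can be sketched in plain Python as follows. This is only an illustration of the steps in the order the list gives them; it is not HTK's implementation, and the frame length, hop size, Hamming window, filter count and number of coefficients are assumed values chosen to keep the example small. The mel-scale formula is the commonly used 2595·log10(1 + f/700) variant.

```python
import cmath
import math

def frame_signal(signal, frame_len, hop):
    """Step 1: split the signal into overlapping, Hamming-windowed frames."""
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frames.append([
            signal[start + n]
            * (0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1)))
            for n in range(frame_len)
        ])
    return frames

def amplitude_spectrum(frame):
    """Step 2: magnitude of the DFT for bins 0..N/2 (naive O(N^2) transform)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(num_filters, num_bins, sample_rate):
    """Triangular filters with centres spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (num_filters + 1)) for i in range(num_filters + 2)]
    bin_hz = (sample_rate / 2.0) / (num_bins - 1)
    bank = []
    for j in range(1, num_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for b in range(num_bins):
            f = b * bin_hz
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def dct(values, num_coeffs):
    """Step 5: DCT-II, keeping the first num_coeffs coefficients."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(num_coeffs)
    ]

def mfcc(signal, sample_rate, frame_len=64, hop=32, num_filters=12, num_coeffs=8):
    """Return one list of cepstral coefficients per frame."""
    bank = mel_filterbank(num_filters, frame_len // 2 + 1, sample_rate)
    features = []
    for frame in frame_signal(signal, frame_len, hop):
        # Steps 2 and 3: amplitude spectrum, then logarithm (small offset avoids log 0).
        log_spec = [math.log(a + 1e-10) for a in amplitude_spectrum(frame)]
        # Step 4: project the log spectrum onto the mel filterbank.
        mel_spec = [sum(w * s for w, s in zip(row, log_spec)) for row in bank]
        features.append(dct(mel_spec, num_coeffs))
    return features
```

Many toolkits apply the mel filterbank before taking the logarithm; the sketch keeps the order of the list above. A call such as `mfcc(samples, 8000)` on a list of float samples returns one 8-element vector per frame.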

One of the features extracted is energy, which corresponds to the loudness or softness of the speaker's voice. To keep loudness from distorting the results, I removed the energy component from the speech files. Now the speech files can be combined to generate the background model file. This model must now be trained.

We must decide on the number of Gaussians to work with. To make that decision, I looked at the log_likelihood values at the end of the training process and compared them. For example:

[Table: Number of samples | Log_Likelihood in Loop 4 | Number of Gaussians. The values are not available in this transcription.]

The optimal number of Gaussians is the one where the log likelihood value drops for the first time, because this means that the likelihood is actually increasing. During my earlier training phase, the optimal number of Gaussians was 300, with the lowest log likelihood value. However, as the number of samples increased, I continued testing with higher numbers of Gaussians and finally achieved the best results at 600 Gaussians. As the system is scaled for use by a large number of speakers, this number will increase substantially. I keep the number of Gaussians fixed for the background model and the speaker models.

Tuning Phase: In the tuning phase, the individual speaker models are generated. The process is very similar to that for generating the background model, with a few minor changes. I take the mfcd.speech files (these are the mfcc files with the energy feature removed) and use them to generate a model for that speaker. I keep the number of Gaussians the same as in the background model, in this case 600 Gaussians.

The purpose of this system is to test whether a given voiceprint belongs to the person the speaker claims to be. To achieve this, I needed to devise a method to calculate a threshold value that would make it easy to tell the speaker from an imposter. An imposter is a user who claims to be somebody else in an attempt to cheat the system. To do this, I used three test files from each user. I compared each file of the speaker to the speaker model and, based on the matching of the features, calculated the likelihood value of a test recording belonging to that speaker. For each speaker, I compared not only the speaker's own test files but also the files from other speakers in the background model. This provided a range of values that would be useful in calculating a threshold. Below is a sample of the data I got from running the above test.

[Table: columns are the speaker model (.dat) files dip, divye, jiten, khush and madhu; rows are the speech files, three each for dip, divye, jiten, khush and madhu. The likelihood values are not available in this transcription.]

Each speech file belongs to some speaker, and the highlighted likelihood values are the results of comparing a speaker's test file to the same speaker's model. The most important point I noticed in the tuning phase results was that the likelihood value of a test file is positive when the file actually belongs to the speaker and negative when the file belongs to an imposter. I decided that the threshold had to be some function of the average of the likelihood values of the speaker files that also took the imposter values into account. The threshold function I used is:

$$\mu + x\sigma$$

where µ is the mean and σ is the standard deviation of all the likelihood values, and x is an integer whose value can be varied. I varied x, starting with x = 2. Using this threshold function, I computed the thresholds of all the speakers in my background model.
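A minimal sketch of the threshold function µ + xσ in Python follows. The function name and the sample scores are made up for illustration, and the use of the population standard deviation (`statistics.pstdev`) rather than the sample standard deviation is an assumption, since the report does not say which was used.

```python
import statistics

def speaker_threshold(scores, x=2):
    """Return mu + x * sigma over all tuning scores for one speaker model.

    scores: likelihood values of every tuning file (the speaker's own files
    and the other speakers' files) scored against this speaker's model.
    """
    mu = statistics.mean(scores)
    sigma = statistics.pstdev(scores)  # population std dev: an assumption
    return mu + x * sigma

# Made-up scores with mean 5 and standard deviation 2.
example_scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
threshold_x2 = speaker_threshold(example_scores, x=2)  # 5 + 2 * 2
threshold_x4 = speaker_threshold(example_scores, x=4)  # 5 + 4 * 2
```

Raising x moves the threshold up, which makes the system stricter: fewer imposters pass, at the cost of possibly rejecting more genuine speakers.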

Testing Phase: Once we have all the threshold values and speaker models, it is time to test the remaining files. This helps determine whether the analysis above is accurate enough. Using the threshold values calculated in the tuning phase, I tested the remaining speaker files. To ensure that the system verifies users accurately, we need to test the threshold values in two ways: for false alarms and for false rejections. If the likelihood value of an imposter file is higher than the threshold for the speaker being tested, the system will validate the imposter as the speaker. This is a false alarm. On the other hand, a speaker's own file may sometimes not have a likelihood value higher than the threshold, and so the speaker is falsely identified as an imposter. This is a false rejection. An optimal threshold value would minimize both counts, keeping the error rate low.

I maintain a summary file for each user, which is generated when the testing scripts are run. It records the likelihood values together with the mean, variance and standard deviation of the results, and it also tracks the number of false alarms and false rejections. A sample follows (the numeric values are missing from this transcription):

mean = [missing]
var = [missing]
stdev = [missing]
threshold for khush is [missing]
number of false alarms with threshold are 1
number of false rejections with threshold are 0

As mentioned above, I started with x = 2. This threshold gave a very high error rate, allowing many imposters to be validated as another speaker, although there were very few false rejections. So I experimented by raising x to 3 and finally to 4. Currently, I have fixed the value of x at 4. With an increase in the number of speakers, however, this value may need to change.
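The two error counts described above can be computed with a few lines of Python. This is a sketch with a hypothetical function name and made-up scores, not the project's Perl testing scripts. Following the text, an imposter score strictly higher than the threshold counts as a false alarm, and a genuine score that is not higher than the threshold counts as a false rejection.

```python
def count_errors(genuine_scores, imposter_scores, threshold):
    """Return (false_alarms, false_rejections) for one speaker's threshold."""
    false_alarms = sum(1 for s in imposter_scores if s > threshold)
    false_rejections = sum(1 for s in genuine_scores if not s > threshold)
    return false_alarms, false_rejections

# Made-up likelihood values for one speaker model.
genuine = [1.2, 0.9, 1.5]            # the speaker's own test files
imposters = [-2.0, -1.0, 0.5, -3.0]  # other speakers' test files

loose = count_errors(genuine, imposters, threshold=0.0)   # lets one imposter in
strict = count_errors(genuine, imposters, threshold=1.0)  # rejects one genuine file
```

Running the same count for several candidate thresholds shows the trade-off the text describes: a low threshold produces false alarms, a high one produces false rejections.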

The User Interface: To make this system user friendly, I have developed a GUI application that is simple and hides the underlying complexity from the user. It has two parts: one for training a new speaker and one for testing a returning user. Both operations need to complete in a very short time when the application is being demonstrated. I have incorporated a recorder in the GUI, so that no separate recording software is required.

In theory, a new speaker would be added to the system offline. The background model need not contain all the users that are added to the system, but if there were a large discrepancy between the actual number of users and the user data in the background sample pool, the results would be skewed. For the purpose of demonstration, however, the background model is not modified when a new user is added.

The entire procedure is automated using Perl scripts. Once the user records a voice clip and chooses either to be added to the system or to be identified as a particular speaker, all the processing is carried out and the result is shown on the screen.

A new speaker is added by the following procedure:

1. The speaker records a voice clip.
2. The voice clip is converted into a speech file of the correct format.
3. Using the data, the speaker model is generated.
4. Using the same speech file and the tuning files of the existing users, the threshold value of the speaker is generated.

A speaker's identity is verified by the following procedure:

1. The user records a voice clip.
2. The voice clip is converted into a speech file.
3. The user selects his username from a drop-down menu.
4. Based on the user's selection, the likelihood value of the speech file is compared with the threshold value of the selected identity.
5. If the likelihood value is higher than the threshold value, the user is identified as the speaker. If it is below the threshold value, the user is identified as an imposter.
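The scoring and comparison in the verification procedure can be sketched as below. This is an illustration under stated assumptions, not the project's Perl scripts: the "likelihood value" is taken to be the log-likelihood ratio between the claimed speaker's model and the background model, averaged over frames, the models use diagonal covariances, and the tuple layout `(weight, mean, variance)` and all function names are hypothetical.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of frames under a GMM.

    gmm is a list of (weight, mean, var) tuples (assumed layout).
    """
    total = 0.0
    for x in frames:
        # Log-sum-exp over the components for numerical stability.
        terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
        peak = max(terms)
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total / len(frames)

def verify(frames, speaker_gmm, background_gmm, threshold):
    """Return (score, accepted) for the claim that frames come from the speaker."""
    score = (gmm_log_likelihood(frames, speaker_gmm)
             - gmm_log_likelihood(frames, background_gmm))
    return score, score > threshold

# Toy one-dimensional models (made-up parameters).
speaker_model = [(1.0, [0.0], [1.0])]
background_model = [(1.0, [3.0], [1.0])]

claimed_ok = verify([[0.0], [0.2]], speaker_model, background_model, threshold=0.0)
claimed_bad = verify([[3.0]], speaker_model, background_model, threshold=0.0)
```

In a real run, `frames` would be the feature vectors from the user's speech file, and `speaker_model` and `threshold` would be looked up from the username chosen in the drop-down menu.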

Conclusion: The aim of this project was to implement an application that would verify a speaker's identity by using the speaker's voiceprint, the characteristics that distinguish the speaker from other speakers. I wanted to implement a simple application using algorithms already in existence.

Data collection was a very important aspect of this project. It was a challenge to figure out how many speakers I should use. I initially had about 10, but then increased that number to 25. It was also important to figure out what kind of data I should work with. Should I have multiple files, or just one with a lot of speech? How many files should go to the testing phase and how many to the tuning phase? I decided the details of data collection after a lot of trial and error.

One of the challenges was understanding how the Hidden Markov Model Toolkit (HTK) worked. It was important to extract the features I needed for my experiments and to be able to manipulate the data the right way. One of the features extracted from the voice recording is energy. This energy corresponds to the loudness of the speaker's voice and would skew results if taken into account. So I had to figure out how to remove the energy component from the feature vectors that HTK generated.

While the application gives fairly accurate results, it works well only under certain environmental conditions. I recorded most of the data in a room with very little background disturbance. This is meant to be a single-speaker verification system, so no other speakers should be heard in the background. Also, the same microphone was used for all the test speakers, and it was placed at a fixed distance from the speaker's mouth while recording each clip. Using a different microphone or changing the distance between the microphone and the speaker's mouth causes results to be skewed. So the application works under this scenario, but not necessarily under any other circumstances. I would have liked to make it work under other conditions as well, but I was not successful.
Overall, I enjoyed working on this project because the topic interested me. My lack of background in this field was a blessing in disguise, as it forced me to read and learn a lot on my own. I also learnt how to work on a large project with very little structure. It was important to set deadlines for myself and keep working towards the end goal. There were times when everything went wrong, and it was important not to give up. I am glad that I was able to achieve the goals I set for myself.

References:

Logan, Beth. Mel Frequency Cepstral Coefficients for Music Modeling.
Schmidt, Regina. Identity Confirmed, Access Permitted: The Basics On Voice Authentication, Security And Consumer Use Of An Emerging Biometric. BiometriTech. 3 Sep.
A Tutorial on Text-Independent Speaker Verification. EURASIP Journal on Applied Signal Processing, 2004.
Reynolds, Douglas A., Quatieri, Thomas F., Dunn, Robert B. Speaker Verification Using Adapted Gaussian Mixture Models. Digital Signal Processing.
The HTK Book.


More information

Creating a NL Texas Hold em Bot

Creating a NL Texas Hold em Bot Creating a NL Texas Hold em Bot Introduction Poker is an easy game to learn by very tough to master. One of the things that is hard to do is controlling emotions. Due to frustration, many have made the

More information

Advanced Signal Processing and Digital Noise Reduction

Advanced Signal Processing and Digital Noise Reduction Advanced Signal Processing and Digital Noise Reduction Saeed V. Vaseghi Queen's University of Belfast UK WILEY HTEUBNER A Partnership between John Wiley & Sons and B. G. Teubner Publishers Chichester New

More information

Towards usable authentication on mobile phones: An evaluation of speaker and face recognition on off-the-shelf handsets

Towards usable authentication on mobile phones: An evaluation of speaker and face recognition on off-the-shelf handsets Towards usable authentication on mobile phones: An evaluation of speaker and face recognition on off-the-shelf handsets Rene Mayrhofer University of Applied Sciences Upper Austria Softwarepark 11, A-4232

More information

Writer Identification for Smart Meeting Room Systems

Writer Identification for Smart Meeting Room Systems Writer Identification for Smart Meeting Room Systems Marcus Liwicki 1, Andreas Schlapbach 1, Horst Bunke 1, Samy Bengio 2, Johnny Mariéthoz 2, and Jonas Richiardi 3 1 Department of Computer Science, University

More information

Thirukkural - A Text-to-Speech Synthesis System

Thirukkural - A Text-to-Speech Synthesis System Thirukkural - A Text-to-Speech Synthesis System G. L. Jayavardhana Rama, A. G. Ramakrishnan, M Vijay Venkatesh, R. Murali Shankar Department of Electrical Engg, Indian Institute of Science, Bangalore 560012,

More information

Online Diarization of Telephone Conversations

Online Diarization of Telephone Conversations Odyssey 2 The Speaker and Language Recognition Workshop 28 June July 2, Brno, Czech Republic Online Diarization of Telephone Conversations Oshry Ben-Harush, Itshak Lapidot, Hugo Guterman Department of

More information

2WB05 Simulation Lecture 8: Generating random variables

2WB05 Simulation Lecture 8: Generating random variables 2WB05 Simulation Lecture 8: Generating random variables Marko Boon http://www.win.tue.nl/courses/2wb05 January 7, 2013 Outline 2/36 1. How do we generate random variables? 2. Fitting distributions Generating

More information

Security in Voice Authentication

Security in Voice Authentication Security in Voice Authentication by Chenguang Yang A Dissertation Submitted to the Faculty of the WORCESTER POLYTECHNIC INSTITUTE In partial fulfillment of the requirements for the Degree of Doctor of

More information

Spot me if you can: Uncovering spoken phrases in encrypted VoIP conversations

Spot me if you can: Uncovering spoken phrases in encrypted VoIP conversations Spot me if you can: Uncovering spoken phrases in encrypted VoIP conversations C. Wright, L. Ballard, S. Coull, F. Monrose, G. Masson Talk held by Goran Doychev Selected Topics in Information Security and

More information

Linear Threshold Units

Linear Threshold Units Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear

More information

IMDA Systems: Digital Signature Verification

IMDA Systems: Digital Signature Verification IMDA Systems: Digital Signature Verification ECE-492/3 Senior Design Project Spring 2011 Electrical and Computer Engineering Department Volgenau School of Engineering George Mason University Fairfax, VA

More information

Biometrics in Physical Access Control Issues, Status and Trends White Paper

Biometrics in Physical Access Control Issues, Status and Trends White Paper Biometrics in Physical Access Control Issues, Status and Trends White Paper Authored and Presented by: Bill Spence, Recognition Systems, Inc. SIA Biometrics Industry Group Vice-Chair & SIA Biometrics Industry

More information

Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus

Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus 1. Introduction Facebook is a social networking website with an open platform that enables developers to extract and utilize user information

More information

The Role of Automation Systems in Management of Change

The Role of Automation Systems in Management of Change The Role of Automation Systems in Management of Change Similar to changing lanes in an automobile in a winter storm, with change enters risk. Everyone has most likely experienced that feeling of changing

More information

Waves Trans-X. Software Audio Processor. User s Guide

Waves Trans-X. Software Audio Processor. User s Guide Waves Trans-X Software Audio Processor User s Guide Waves Trans-X software guide page 1 of 8 Chapter 1 Introduction and Overview The Waves Trans-X transient processor is a special breed of dynamics processor

More information

Random Fibonacci-type Sequences in Online Gambling

Random Fibonacci-type Sequences in Online Gambling Random Fibonacci-type Sequences in Online Gambling Adam Biello, CJ Cacciatore, Logan Thomas Department of Mathematics CSUMS Advisor: Alfa Heryudono Department of Mathematics University of Massachusetts

More information

159.334 Computer Networks. Network Security 1. Professor Richard Harris School of Engineering and Advanced Technology

159.334 Computer Networks. Network Security 1. Professor Richard Harris School of Engineering and Advanced Technology Network Security 1 Professor Richard Harris School of Engineering and Advanced Technology Presentation Outline Overview of Identification and Authentication The importance of identification and Authentication

More information

Monophonic Music Recognition

Monophonic Music Recognition Monophonic Music Recognition Per Weijnitz Speech Technology 5p per.weijnitz@gslt.hum.gu.se 5th March 2003 Abstract This report describes an experimental monophonic music recognition system, carried out

More information

Monotonicity Hints. Abstract

Monotonicity Hints. Abstract Monotonicity Hints Joseph Sill Computation and Neural Systems program California Institute of Technology email: joe@cs.caltech.edu Yaser S. Abu-Mostafa EE and CS Deptartments California Institute of Technology

More information

Automatic Detection of Laughter and Fillers in Spontaneous Mobile Phone Conversations

Automatic Detection of Laughter and Fillers in Spontaneous Mobile Phone Conversations Automatic Detection of Laughter and Fillers in Spontaneous Mobile Phone Conversations Hugues Salamin, Anna Polychroniou and Alessandro Vinciarelli University of Glasgow - School of computing Science, G128QQ

More information

Structural Health Monitoring Tools (SHMTools)

Structural Health Monitoring Tools (SHMTools) Structural Health Monitoring Tools (SHMTools) Getting Started LANL/UCSD Engineering Institute LA-CC-14-046 c Copyright 2014, Los Alamos National Security, LLC All rights reserved. May 30, 2014 Contents

More information

Semi-Supervised Learning in Inferring Mobile Device Locations

Semi-Supervised Learning in Inferring Mobile Device Locations Semi-Supervised Learning in Inferring Mobile Device Locations Rong Duan, Olivia Hong, Guangqin Ma rduan,ohong,gma @research.att.com AT&T Labs 200 S Laurel Ave Middletown, NJ 07748 Abstract With the development

More information

SIGNATURE VERIFICATION

SIGNATURE VERIFICATION SIGNATURE VERIFICATION Dr. H.B.Kekre, Dr. Dhirendra Mishra, Ms. Shilpa Buddhadev, Ms. Bhagyashree Mall, Mr. Gaurav Jangid, Ms. Nikita Lakhotia Computer engineering Department, MPSTME, NMIMS University

More information

COMPARATIVE STUDY OF RECOGNITION TOOLS AS BACK-ENDS FOR BANGLA PHONEME RECOGNITION

COMPARATIVE STUDY OF RECOGNITION TOOLS AS BACK-ENDS FOR BANGLA PHONEME RECOGNITION ITERATIOAL JOURAL OF RESEARCH I COMPUTER APPLICATIOS AD ROBOTICS ISS 2320-7345 COMPARATIVE STUDY OF RECOGITIO TOOLS AS BACK-EDS FOR BAGLA PHOEME RECOGITIO Kazi Kamal Hossain 1, Md. Jahangir Hossain 2,

More information

JdB Sound Acoustics Presents

JdB Sound Acoustics Presents JdB Sound Acoustics Presents KINGSTON ROAD UNITED CHURCH S A N C T U A RY A C O U S T I C S A N D S O U N D S Y S T E M B Y J O S E P H D E B U G L I O 2 0 0 8 Copyright by Joseph De Buglio 2008 The Beginning

More information

Separation and Classification of Harmonic Sounds for Singing Voice Detection

Separation and Classification of Harmonic Sounds for Singing Voice Detection Separation and Classification of Harmonic Sounds for Singing Voice Detection Martín Rocamora and Alvaro Pardo Institute of Electrical Engineering - School of Engineering Universidad de la República, Uruguay

More information

Multi-factor Authentication in Banking Sector

Multi-factor Authentication in Banking Sector Multi-factor Authentication in Banking Sector Tushar Bhivgade, Mithilesh Bhusari, Ajay Kuthe, Bhavna Jiddewar,Prof. Pooja Dubey Department of Computer Science & Engineering, Rajiv Gandhi College of Engineering

More information

Error Log Processing for Accurate Failure Prediction. Humboldt-Universität zu Berlin

Error Log Processing for Accurate Failure Prediction. Humboldt-Universität zu Berlin Error Log Processing for Accurate Failure Prediction Felix Salfner ICSI Berkeley Steffen Tschirpke Humboldt-Universität zu Berlin Introduction Context of work: Error-based online failure prediction: error

More information

Statistical Analysis of Signature Features with Respect to Applicability in Off-line Signature Verification

Statistical Analysis of Signature Features with Respect to Applicability in Off-line Signature Verification Statistical Analysis of Signature Features with Respect to Applicability in Off-line Signature Verification BENCE KOVARI, HASSAN CHARAF Department of Automation and Applied Informatics Budapest University

More information

Automatic parameter regulation for a tracking system with an auto-critical function

Automatic parameter regulation for a tracking system with an auto-critical function Automatic parameter regulation for a tracking system with an auto-critical function Daniela Hall INRIA Rhône-Alpes, St. Ismier, France Email: Daniela.Hall@inrialpes.fr Abstract In this article we propose

More information

RF Network Analyzer Basics

RF Network Analyzer Basics RF Network Analyzer Basics A tutorial, information and overview about the basics of the RF Network Analyzer. What is a Network Analyzer and how to use them, to include the Scalar Network Analyzer (SNA),

More information

Welcome to the training on the TransCelerate approach to Risk-Based Monitoring. This course will take you through five modules of information to

Welcome to the training on the TransCelerate approach to Risk-Based Monitoring. This course will take you through five modules of information to Welcome to the training on the TransCelerate approach to Risk-Based Monitoring. This course will take you through five modules of information to introduce you to the concepts behind risk-based monitoring,

More information

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

More information

User Authentication Methods for Mobile Systems Dr Steven Furnell

User Authentication Methods for Mobile Systems Dr Steven Furnell User Authentication Methods for Mobile Systems Dr Steven Furnell Network Research Group University of Plymouth United Kingdom Overview The rise of mobility and the need for user authentication A survey

More information

The Sonometer The Resonant String and Timbre Change after plucking

The Sonometer The Resonant String and Timbre Change after plucking The Sonometer The Resonant String and Timbre Change after plucking EQUIPMENT Pasco sonometers (pick up 5 from teaching lab) and 5 kits to go with them BK Precision function generators and Tenma oscilloscopes

More information

My DevOps Journey by Billy Foss, Engineering Services Architect, CA Technologies

My DevOps Journey by Billy Foss, Engineering Services Architect, CA Technologies About the author My DevOps Journey by Billy Foss, Engineering Services Architect, CA Technologies I am going to take you through the journey that my team embarked on as we looked for ways to automate processes,

More information

SELECTING NEURAL NETWORK ARCHITECTURE FOR INVESTMENT PROFITABILITY PREDICTIONS

SELECTING NEURAL NETWORK ARCHITECTURE FOR INVESTMENT PROFITABILITY PREDICTIONS UDC: 004.8 Original scientific paper SELECTING NEURAL NETWORK ARCHITECTURE FOR INVESTMENT PROFITABILITY PREDICTIONS Tonimir Kišasondi, Alen Lovren i University of Zagreb, Faculty of Organization and Informatics,

More information

CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.

CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning. Lecture Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott

More information

Framework for Biometric Enabled Unified Core Banking

Framework for Biometric Enabled Unified Core Banking Proc. of Int. Conf. on Advances in Computer Science and Application Framework for Biometric Enabled Unified Core Banking Manohar M, R Dinesh and Prabhanjan S Research Candidate, Research Supervisor, Faculty

More information

What makes a good coder and technology user at Mountfields Lodge School?

What makes a good coder and technology user at Mountfields Lodge School? What makes a good coder and technology user at Mountfields Lodge School? Pupils who persevere to become competent in coding for a variety of practical and inventive purposes, including the application

More information

Logistic Regression. Vibhav Gogate The University of Texas at Dallas. Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld.

Logistic Regression. Vibhav Gogate The University of Texas at Dallas. Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld. Logistic Regression Vibhav Gogate The University of Texas at Dallas Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld. Generative vs. Discriminative Classifiers Want to Learn: h:x Y X features

More information

Probabilistic user behavior models in online stores for recommender systems

Probabilistic user behavior models in online stores for recommender systems Probabilistic user behavior models in online stores for recommender systems Tomoharu Iwata Abstract Recommender systems are widely used in online stores because they are expected to improve both user

More information

FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL

FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL STATIsTICs 4 IV. RANDOm VECTORs 1. JOINTLY DIsTRIBUTED RANDOm VARIABLEs If are two rom variables defined on the same sample space we define the joint

More information

degrees of freedom and are able to adapt to the task they are supposed to do [Gupta].

degrees of freedom and are able to adapt to the task they are supposed to do [Gupta]. 1.3 Neural Networks 19 Neural Networks are large structured systems of equations. These systems have many degrees of freedom and are able to adapt to the task they are supposed to do [Gupta]. Two very

More information

On sequence kernels for SVM classification of sets of vectors: application to speaker verification

On sequence kernels for SVM classification of sets of vectors: application to speaker verification On sequence kernels for SVM classification of sets of vectors: application to speaker verification Major part of the Ph.D. work of In collaboration with Jérôme Louradour Francis Bach (ARMINES) within E-TEAM

More information

Single voice command recognition by finite element analysis

Single voice command recognition by finite element analysis Single voice command recognition by finite element analysis Prof. Dr. Sc. Raycho Ilarionov Assoc. Prof. Dr. Nikolay Madzharov MSc. Eng. Georgi Tsanev Applications of voice recognition Playing back simple

More information

Camtasia: Importing, cutting, and captioning your Video Express movie Camtasia Studio: Windows

Camtasia: Importing, cutting, and captioning your Video Express movie Camtasia Studio: Windows Camtasia: Importing, cutting, and captioning your Video Express movie Camtasia Studio: Windows Activity 1: Adding your Video Express output into Camtasia Studio Step 1: the footage you shot in the Video

More information