Stress management with Music Therapy

Sougata Das 1 and Ayan Mukherjee 2
1 Senior Systems Engineer, IBM India Private Limited, Kolkata, India
2 Assistant Professor, Dept. of MCA, Brainware Group of Institutions, Barasat, Kolkata, India

Abstract
This paper brings together two dynamic fields of major practical importance: Voice Recognition and Music Therapy. Voice Recognition is an important field under development by commercial and research organizations all over the world. It is used in critical applications such as healthcare and the military, as well as in everyday applications such as smartphones, microwaves and biometric security. However, one portion of speech that is less quantitative in nature is the emotion it carries; it is this portion that adds dynamism to speech. This paper attempts to extract and detect the emotional content of a speech sample. The detection does not depend on any adjective-oriented words spoken, but rather on the pitch, tonality and stress placed on particular words.

Keywords: DCT, DFT, FFT, MFCC, Cepstrum, Mel Scale, Mel Spectrum, Hamming Window, Music Therapy, Emotion Detection.

1. INTRODUCTION

1.1 History of Music Therapy
Music Therapy [6] is the systematic application of music in the treatment of the physiological and psychosocial aspects of an illness or disability. It focuses on the acquisition of non-musical skills and behaviors, as determined by a board-certified music therapist through systematic assessment and treatment planning. It is therefore an allied health profession and one of the expressive therapies, consisting of an interpersonal process in which a certified music therapist uses music and all of its facets (physical, emotional, mental, social, aesthetic, and spiritual) to help clients improve or maintain their health. Music therapy in the United States of America began in the late 18th century.
However, the use of music as a healing medium dates back to ancient times. This is evident in biblical scriptures and in the historical writings of ancient civilizations such as Egypt, China, India, Greece and Rome. Today the power of music remains the same, but music is used very differently than it was in ancient times. The profession of music therapy in the United States began to develop during World War I and World War II, when music was used in Veterans Administration hospitals as an intervention for traumatic war injuries. Veterans engaged, actively and passively, in music activities that focused on relieving pain perception. Numerous doctors and nurses witnessed the effect music had on the veterans' psychological, physiological, cognitive, and emotional state. Since then, colleges and universities have developed programs to train musicians to use music for therapeutic purposes. In 1950, a professional organization was formed by music therapists who worked with veterans, people with intellectual disabilities, people with hearing or visual impairments, and psychiatric populations. This was the birth of the National Association for Music Therapy (NAMT). In 1998, NAMT joined forces with another music therapy organization to become what is now known as the American Music Therapy Association (AMTA).

1.2 History of Voice Recognition
Voice recognition, or more accurately speech recognition, is a technology developed to replace typing or writing with the human voice as an input. Speech recognition was first developed at Bell Labs, the birthplace of many computing inventions. The first voice-operated system, AUDREY, was developed in 1952 and could identify spoken digits.
Exactly 10 years later, IBM, a leading corporation and another powerhouse of innovation, first demonstrated the ShoeBox, a voice recognition system that could recognize 16 spoken English words. The idea of speech recognition gradually spread, and laboratories in the United States, Japan, England and the Soviet Union developed voice recognition systems along with dedicated hardware to support them. However modest these efforts may sound, they were an impressive beginning given how primitive computation was at the time. In the 1970s, the US Department of Defense took an interest in voice recognition, and a Speech Understanding Research (SUR) cell was formed. Research was under way all over the world when Carnegie-Mellon University developed the HARPY speech understanding system, which was capable of recognizing 1011 words, roughly the vocabulary of an average 3-year-old child. The most interesting aspect of HARPY was its search technique: a heuristic search algorithm called BEAM SEARCH, which efficiently provided a near-optimal solution.

Volume 3, Issue 6, November-December 2014 Page 273

At the end of the 1970s, voice recognition moved from handling a single voice to identifying and operating on the voices of multiple people. Also, owing to developments in linguistics dedicated to speech recognition, vocabularies grew from a few hundred words to a few thousand words, with the potential to recognize an unlimited number of words.

2. PROBLEM STATEMENT
This project aims to detect negative emotion in a given sound sample and determine which music should be applied as therapy for the subject. The steps are as follows:
Voice Recording: At the application level, a voice is recorded and fed in as input.
Detection of emotion: The program outputs the emotion detected.
Final Result: On the basis of the emotion detected, a mapping function determines which music will be used as therapy for the subject.

10 OBJECTIVES
The objective of this project is stress management through music therapy; the innovation lies in applying the therapy through an automated method of detecting stress in a voice sample. The primary objective and motivation is to reduce stress problems by means of music therapy. The ragas used in the therapy produce a positive effect on the subject and reduce the temporary negative emotions present.

11 BENEFITS
It is an automated system that detects negative emotion and helps keep the subject from slipping into a negative psychological state.
It removes the need for human intervention in the process.
It may reduce mortality by lowering the chance of deaths related to psychiatric issues.
It can be implemented on any device, such as a smartphone, making it available to everyone.

12 CONTROL FLOW DIAGRAM

12.1 Block Diagram
[Figure 1: Block Diagram]

12.2 Flow Diagram
[Figure 2: Flow Diagram]

13 PROCESS DESCRIPTION
Stress Management [1] using Music Therapy brings together two well-known fields. It combines Music Therapy with Voice Recognition, extended with the ability to detect emotional content in a given voice or speech sample. The entire system is divided into two sections. One section detects the emotion in the given voice or speech sample; the other maps the detected emotion to the raga that can be used to calm that emotion.

13.1 Framing
A speech sample is generally not stationary over time, so its waveform shows no long-term consistency. In short-term analysis, however, if we consider an interval that is short enough, the wave can be treated as stationary. The non-uniformity of the speech waveform arises from the movement of the speech articulators (the lips, jaw, tongue, etc.), since the change in the voice spectrum depends directly on the rate of change of articulation. We therefore take frames from the speech waveform after cropping it to remove any silence or acoustic interference present at the start or end of the file.

13.2 Windowing
The frames are processed to remove signal discontinuities at the beginning and end of each frame. The idea is to minimize spectral distortion by using a window to taper the signal to zero at both ends of the frame. In other words, the Fourier Transform assumes that the signal in the frame repeats, and the end of one repetition does not connect smoothly with the beginning of the next. This introduces glitches at regular intervals, so the ends of each frame have to be made smooth enough to connect with each other. Preparing the windowed signal requires a soft window. The Hamming Window is the most commonly preferred choice because it tapers smoothly at both ends. The Hamming Window function is:

$w(n) = 0.54 - 0.46 \cos\left(\frac{2\pi n}{N-1}\right), \quad 0 \le n \le N-1$  (1)

where
w(n) = Hamming window function,
n = index of a sample within the frame,
N = total number of samples in the frame.

13.3 Fourier Transformation
Each windowed frame is now processed by a Discrete Fourier Transformation, which converts the N samples of the frame from the time domain to the frequency domain. Before the transformation the frame is a plot of amplitude against time; after it, the frame is represented by its frequency content. The transformation is implemented with the Fast Fourier Transform (FFT), a fast algorithm for computing the Discrete Fourier Transform (DFT), which is defined on the set of N samples {x_k} as follows:

$X_n = \sum_{k=0}^{N-1} x_k \, e^{-2\pi i k n / N}, \quad n = 0, 1, 2, \ldots, N-1$  (2)

13.4 Mel-Frequency Wrapping
The spectrum obtained in the frequency domain is the input to this stage. The signal is plotted against the Mel scale [3] to mimic human hearing. Human perception of the frequency content of speech sounds does not follow a linear scale. Thus, for each tone with an actual frequency f, measured in Hz, a subjective pitch is measured on a scale called the Mel scale. The Mel-frequency scale has linear frequency spacing below 1000 Hz and logarithmic spacing above 1000 Hz. As a reference point, the pitch of a 1 kHz tone, 40 dB above the perceptual hearing threshold, is defined as 1000 Mels. We can therefore use the following approximate formula to compute the Mel value for a given frequency:

$\mathrm{Mel}(f) = 2595 \log_{10}\left(1 + \frac{f}{700}\right)$  (3)

One approach to simulating the subjective spectrum is to use a filter bank, with one filter for each desired Mel-frequency component. The filter bank has a triangular band-pass frequency response, and both the spacing and the bandwidth are determined by a constant Mel-frequency interval.
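The windowing, Fourier transformation and Mel conversion steps above can be illustrated with a minimal Python sketch that uses only the standard library. It is not the authors' implementation: the function names (`hamming`, `dft`, `hz_to_mel`, `mel_to_hz`, `power_spectrum`) are hypothetical, and a direct O(N^2) DFT is used in place of an FFT purely to keep the example short.

```python
import cmath
import math


def hamming(N):
    """Hamming window of length N: w(n) = 0.54 - 0.46*cos(2*pi*n/(N-1))."""
    if N == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]


def dft(x):
    """Direct DFT: X_n = sum_k x_k * exp(-2*pi*i*k*n/N), n = 0..N-1."""
    N = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * math.pi * k * n / N) for k in range(N))
        for n in range(N)
    ]


def hz_to_mel(f):
    """Approximate Mel value of a frequency f in Hz (equation 3)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel, useful for placing filter-bank centre frequencies."""
    return 700.0 * (10 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Window one frame and return the power in bins 0..N/2."""
    window = hamming(len(frame))
    windowed = [s * w for s, w in zip(frame, window)]
    spectrum = dft(windowed)
    half = len(frame) // 2 + 1  # a real signal has a symmetric spectrum
    return [abs(X) ** 2 for X in spectrum[:half]]


# Example: a 32-sample frame holding a pure tone centred on bin 4.
frame = [math.cos(2 * math.pi * 4 * n / 32) for n in range(32)]
ps = power_spectrum(frame)
peak_bin = ps.index(max(ps))
```

In practice an FFT routine from a numerical library would replace `dft`, and the frame length and overlap would be chosen to suit the sampling rate; the paper does not state those values.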
The Mel scale filter bank is a series of $l$ triangular band-pass filters designed to simulate the band-pass filtering believed to occur in the auditory system. This corresponds to a series of band-pass filters with constant bandwidth and spacing on a Mel frequency scale.

13.5 Cepstrum
The name Cepstrum [4] is derived from the word Spectrum by reversing its first four letters: "spec" becomes "ceps". The cepstrum is the Fourier Transform of the logarithm of the Fourier Transform of the windowed signal:

$\text{Cepstrum} = \mathrm{FT}\left(\log\left(\mathrm{FT}(\text{windowed signal})\right) + j 2\pi m\right)$  (4)

The real cepstrum uses the logarithm function defined for real values, whereas the complex cepstrum uses the complex logarithm function defined for complex values. The real cepstrum uses only the magnitude of the spectrum, whereas the complex cepstrum holds information about both the magnitude and the phase of the initial spectrum, which allows the signal to be reconstructed. The cepstral representation of the speech spectrum provides a very good representation of the local spectral properties of the signal for a given analysis frame.
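As an illustration of the real cepstrum described above, the following minimal Python sketch computes it for a single frame. It is a sketch under stated assumptions rather than the authors' code: the names `dft` and `real_cepstrum` and the `floor` parameter are hypothetical, and the second transform is taken as an inverse DFT, which is the common convention. For a real input frame the log-magnitude spectrum is real and even, so the forward and inverse transforms differ only by a factor of N.

```python
import cmath
import math


def dft(x, inverse=False):
    """Direct DFT of a sequence; with inverse=True, the inverse DFT (scaled by 1/N)."""
    N = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[k] * cmath.exp(sign * 2j * math.pi * k * n / N) for k in range(N))
        for n in range(N)
    ]
    if inverse:
        out = [v / N for v in out]
    return out


def real_cepstrum(frame, floor=1e-12):
    """Real cepstrum: inverse DFT of the log-magnitude of the DFT of the frame.

    `floor` (an assumption of this sketch) keeps log() away from zero-valued bins.
    """
    spectrum = dft(frame)
    log_magnitude = [math.log(max(abs(X), floor)) for X in spectrum]
    return [c.real for c in dft(log_magnitude, inverse=True)]


# Example: an impulse of height e has a flat spectrum |X| = e, so log|X| = 1
# in every bin and the cepstrum is 1 at index 0 and 0 elsewhere.
cep = real_cepstrum([math.e, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
```

Only the magnitude enters this computation, which is why the real cepstrum cannot be inverted back to the original signal, in contrast to the complex cepstrum.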

[Figure 3: Cepstral Representation]

13.6 Mel Frequency Cepstrum Coefficient (MFCC)
Mel Frequency Cepstrum Coefficients are coefficients that represent audio on the basis of perception. They were developed by Paul Mermelstein, building on an idea proposed by Bridle and Brown. The procedure generates a 20-dimensional matrix from the signal; we use these values in an algorithm that deduces which emotion the sample carries. Algorithmically, the concept of the cepstrum [2] is presented here in the form of a block diagram. The figure below shows the flow chart for obtaining the cepstrum from a signal.

[Figure 3: Flow chart of Cepstrum]

The values of the cepstrum are then converted from the frequency domain to the time domain using the Discrete Cosine Transformation. Thus we can calculate the MFCCs as:

[equation lost in transcription]

The MFCCs are the amplitudes of the resulting spectrum. This procedure is represented stepwise in the figure below.

[Figure 4: MFCC Flow Chart]

14 TEST RESULTS
On the basis of the value generated, we play the corresponding raga file. A mapping table drawn from the music therapy literature [8] lists the conditions that a particular music file can be used to treat.

Table 1: Classification of various moods according to the Ragas

Mood          Raga
Sad           Kafi
Depression    Kapi
Hypertension  Bageshri
Anger         Sahana
Fear          Mishra Mand

The following set of figures shows the plot of the signal on the left side and the power spectrum on the right.

[Figures 5-7: signal plot (left) and power spectrum (right); panel titles: Fear, Anger, Depression]
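The mapping from a detected mood to a raga file can be sketched as a simple lookup. This is a hypothetical illustration only: the names `RAGA_FOR_MOOD` and `select_raga` are not from the paper, and the mood-to-raga pairing is an assumption inferred from the order in which the moods and ragas are listed in Table 1.

```python
# Mood-to-raga pairs read from Table 1. The pairing is inferred from the
# listing order of the table and should be checked against the source [8].
RAGA_FOR_MOOD = {
    "sad": "Kafi",
    "depression": "Kapi",
    "hypertension": "Bageshri",
    "anger": "Sahana",
    "fear": "Mishra Mand",
}


def select_raga(detected_mood):
    """Return the raga name for a detected mood, or None if the mood is unmapped."""
    return RAGA_FOR_MOOD.get(detected_mood.strip().lower())


# Example: the emotion detector reports "Anger".
chosen = select_raga("Anger")
```

Returning `None` for an unmapped mood leaves it to the caller to decide whether to play nothing or fall back to a default file; the paper does not specify this behaviour.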

[Figure 8: signal plot and power spectrum; panel title: Hypertension]
[Figure 9: Data Store for sad mood]

15 CONCLUSION
The project is an example of two distinct fields, Computer Science and paramedicine, merged into a single field. Although it is at a nascent stage with very narrow production, it points towards a very promising field. The project offers a solution for stress management, a problem that is very prevalent in the modern world, and if taken up at a higher level it could also be applied commercially. The technique used to emulate human hearing and perception, the Mel Frequency Cepstrum Coefficient, is very prone to noise, and even after the noise is reduced the results are sometimes limited because the emotion cannot be detected properly. However, with proper scales and further research, it should be possible to detect the correct emotion with much greater accuracy. Despite these limitations, we have tried our best to produce a satisfactory result and a solution that maps negative emotions to the ragas that can be used to relieve them.
16 LIMITATIONS AND SCOPE FOR FUTURE
Firstly, this application is still at a nascent stage, and because of some hardware irregularities we had to work on stored audio files instead of real-time recordings. Secondly, highly noisy audio input produces deviating, incorrect results. Thirdly, not all emotions can be detected at present because source audio files for each exact mood are unavailable. In future, the system can be implemented as a smartphone app or web application. The range of detectable emotions can be extended with samples recorded by professional actors. A professional database application can be used for efficient storage and retrieval of data.

References
[1] Kumar, Ch. Srinivasan (2011), Design of an Automatic Speaker Recognition System
[2] Neiberg, Daniel (2006), Emotion Recognition in Spontaneous Speech Using GMMs
[3] Cornaz, Christian (2003), An Automatic Speaker Recognition System, February 03
[4] Tiwari, Vibha (2010), MFCC and Its Applications in Speaker Recognition, February 10
[5] Sairam, T.V.: Music and Moods 2, [Online] (Sept 13, 2013)
[6] Music Therapy, [Online] Music_therapy (Aug 21, 2013)
[7] Mahesh, Anuradha: Music-Therapy for Wellness, [Online] com/music-therapy/ (Sept 20, 2013)
[8] Raga Therapy for Healing Mind and Body, [Online] ntinfo/raga-therapy-for-healing-mind-and-bodyhealing-ragas_print.htm (Aug 30, 2013)
[9] Music as Medicine, [Online] (Aug 29, 2013)


More information

FFT Algorithms. Chapter 6. Contents 6.1

FFT Algorithms. Chapter 6. Contents 6.1 Chapter 6 FFT Algorithms Contents Efficient computation of the DFT............................................ 6.2 Applications of FFT................................................... 6.6 Computing DFT

More information

FTIR Instrumentation

FTIR Instrumentation FTIR Instrumentation Adopted from the FTIR lab instruction by H.-N. Hsieh, New Jersey Institute of Technology: http://www-ec.njit.edu/~hsieh/ene669/ftir.html 1. IR Instrumentation Two types of instrumentation

More information

B3. Short Time Fourier Transform (STFT)

B3. Short Time Fourier Transform (STFT) B3. Short Time Fourier Transform (STFT) Objectives: Understand the concept of a time varying frequency spectrum and the spectrogram Understand the effect of different windows on the spectrogram; Understand

More information
