School Class Monitoring System Based on Audio Signal Processing
C. R. Rashmi (1), C. P. Shantala (2) and T. R. Yashavanth (3)
(1) Department of CSE, PG Student, CIT, Gubbi, Tumkur, Karnataka, India.
(2) Department of CSE, Vice Principal & HOD, CIT, Gubbi, Tumkur, Karnataka, India.
(3) Department of CNE, PG Student, VTU, Belgaum, Karnataka, India.
rashmicr46@gmail.com; shantala.cp@cittumkur.org; yashavanthtr@gmail.com

Abstract. This paper aims to develop a proof-of-concept system for monitoring the functioning of a school class. We envisage this system (a) serving as a supporting tool for monitoring, (b) providing reliable quantitative data with which the school system can be evaluated effectively, and (c) being used at primary, secondary and higher education levels. It is an ICT (Information & Communications Technology) based intervention. Currently, one aspect of effective functioning, namely the functioning of the classrooms expressed as quantitative data, is covered ineffectively. The proposed system aims to monitor every classroom of the school and to provide daily, weekly, and monthly reports to the Head-teacher of the school, the SDMCs (School Development and Monitoring Committees) and the BEOs (Block Education Officers), respectively.

Keywords: Speech processing, MFCC, Vector quantization.

1. Introduction

Sarva Shiksha Abhiyan (SSA) is the Government of India's first flagship programme for the achievement of Universalization of Elementary Education (UEE), which was launched in [year omitted in source]. Community-based monitoring is one of the strengths of this programme. The community, through its representative institutions such as Village Education Committees (VEC) and School Development and Monitoring Committees (SDMC), has been entrusted with the primary level of ensuring that the schools are functioning effectively. To monitor the functioning of a class we use audio samples recorded in the school environment. The collected audio samples are then subjected to speech processing in order to develop the software system.
Speech processing can be divided into five categories. Speech coding encodes the voice signal, for example digitizing it into WAV or MP3 format. Speech recognition identifies what the speaker is saying; for example, dictation software recognizes speech and translates it into text. Speech enhancement emphasizes the speaker's voice; for example, audio players apply filters to enhance the voice in a song. Speech synthesis converts text to voice; it helps people who cannot use their vocal cords, and large companies use it in automatic telephone answering systems. Speaker recognition automatically recognizes who is speaking on the basis of information contained in the speech waves. Because the speaker's voice is used to verify identity, several applications become possible, such as banking by telephone, telephone shopping, voice mail, database access services, remote access to computers, and voice dialling.

Speaker recognition has two main types: text-dependent and text-independent. A text-independent system recognizes the speaker without any knowledge of the words spoken; it extracts singular characteristics of the speaker's voice, which makes recognition possible without the speaker saying any precise word. A text-dependent system relies on words or phrases that were previously recorded and stored in a database; for example, a speaker says a PIN or his name as a password to open the door to his office [1].

Elsevier Publications 2013.
Figure 1. Central system.

2. System Functioning

Each microphone has a one-to-one association with a classroom. The microphones pick up the audio data of the classrooms and transmit it to the central system shown in figure 1. The central system works on this audio data and uses its built-in algorithm to decide the state of each classroom. The classroom functioning is categorized suitably [the categorization depends on the needs of the concerned parties such as the Head-Teacher, SDMC, BEO, etc.] and stored for future transmission. Periodically (once a day/week/month), the stored data is summarized and sent as SMS (Short Message Service) messages.

3. Overview of Implementation Steps

1. Students will visit a few government schools to collect the audio data: classroom audio recordings under different cases [interactive class, lecture-only class, no-teacher class, etc.].
2. This audio data will be used to develop a software system that analyzes the noise levels to make proper estimations. The better step 1 is performed, the better the estimations become.
3. The developed software system can be installed on an embedded PC.
4. A multi-audio-in interface can be built for the embedded PC by combining off-the-shelf components.
5. The necessary modem for SMS transmission will be connected.
6. The system can be deployed in a school and tested.

4. Software System

To develop the software system that checks the functioning of a class we use a text-independent speech processing technique. We are not concerned with the text of the speech samples, so a text-dependent technique is not necessary. The collected speech samples are categorized into three types:

1. Interactive class
2. Teacher-only class
3. Noise

In our system, noise is taken to mean that the class is not functioning. These three types are used for speech processing, which has two main stages:

a. Feature extraction
b. Feature matching

Feature extraction is the process of extracting a small amount of data from an audio sample that will later represent each of the speech sample types listed above. Feature matching is the process of comparing the extracted features with unknown speech samples in order to identify them. There is a wide range of possibilities for representing speech samples parametrically, including Linear Prediction Coding (LPC), Mel-Frequency Cepstrum Coefficients (MFCC), and others. MFCC is a well-known and popular method and is used in our system for feature extraction. Feature matching techniques include Dynamic Time Warping (DTW), Hidden Markov Modeling (HMM), and Vector Quantization (VQ). In this system, the VQ approach is used.

4.1 Feature Extraction (MFCC)

To get better recognition performance we should extract the best parametric representation of the acoustic signals. This phase must be efficient, and it is important for the next phase because it affects that phase's behaviour [2].
Figure 2. MFCC process [3].
Figure 3. Mel scale filter bank [2].

MFCC process

MFCC (Mel-Frequency Cepstrum Coefficients) is a method based on human hearing behaviour: the ear resolves frequencies approximately linearly below 1 kHz and logarithmically above it, so the coefficients are based on the frequency differences that the human ear can distinguish. The speech signal is expressed on the Mel scale, a scale of pitches judged by observers to be equally spaced. The Mel scale uses filters that are spaced linearly at frequencies below 1000 Hz and logarithmically above 1000 Hz [1]. The MFCC process has a few steps, explained as follows. Figure 2 shows the complete process.

Framing

In this step the continuous speech sample is segmented, or blocked, into frames. Each frame has N samples, and adjacent frames start M samples apart (M < N), so consecutive frames overlap by N - M samples. Usual values of M and N are 100 and 256 respectively [1].

Windowing

To minimize the spectral distortion caused by signal discontinuities at the beginning and end of each frame, every frame is windowed [4]. A Hamming window is used for this step.

Fast Fourier transform (FFT)

The FFT converts every frame of N samples from the time domain to the frequency domain. In the time domain the speech signal is the convolution of the glottal pulse and the vocal tract impulse response; the FFT turns this convolution into a multiplication in the frequency domain [2].

Mel filters

The Mel filter bank filters the input power spectrum through a bank of Mel filters [4]. Figure 3 shows a set of triangular filters, each of which computes a weighted sum of spectral components so that the output approximates the Mel scale. The magnitude frequency response of every filter is triangular: it equals unity at the filter's centre frequency and decreases linearly to zero at the centre frequencies of the two adjacent filters. Each filter output is then the sum of its filtered spectral components [2].
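The framing, windowing and FFT steps described above can be sketched as follows. This is a minimal illustration, not the authors' Matlab implementation: the function names `frame_signal`, `hamming` and `power_spectrum` are hypothetical, only the Python standard library is used, and a direct O(N^2) DFT stands in for the FFT (it produces the same spectrum, only more slowly). The values N = 256 and M = 100 and the 8000 Hz sampling rate are taken from the text.

```python
import cmath
import math


def frame_signal(signal, frame_len=256, hop=100):
    """Block a sample sequence into frames of frame_len samples (N),
    with consecutive frames starting hop samples apart (M < N)."""
    frames = []
    start = 0
    while start + frame_len <= len(signal):
        frames.append(signal[start:start + frame_len])
        start += hop
    return frames


def hamming(n_points):
    """Hamming window: w[n] = 0.54 - 0.46 * cos(2*pi*n / (N - 1))."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_points - 1))
            for n in range(n_points)]


def power_spectrum(frame):
    """Power spectrum of one real-valued frame.
    Assumption: a naive DFT replaces the FFT to avoid external libraries."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):  # non-negative frequency bins only
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


# Usage: a 1 kHz tone sampled at 8000 Hz. The bin spacing is
# 8000 / 256 = 31.25 Hz, so the spectral peak falls in bin 1000 / 31.25 = 32.
fs = 8000
tone = [math.sin(2 * math.pi * 1000 * t / fs) for t in range(1000)]
frames = frame_signal(tone)
window = hamming(256)
windowed = [s * w for s, w in zip(frames[0], window)]
spec = power_spectrum(windowed)
peak_bin = max(range(len(spec)), key=spec.__getitem__)
print(len(frames), peak_bin)  # prints "8 32"
```

Windowing before the transform tapers each frame towards its edges, which is what limits the leakage caused by cutting the signal at arbitrary points.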
Finally, the following equation is used to compute the Mel value for a given frequency f in Hz:

\[ F(\mathrm{Mel}) = 2595 \log_{10}\left(1 + \frac{f}{700}\right) \tag{1} \]

Logarithm

Compression is carried out in this step. The set of values generated by the Mel filter bank is reduced by replacing each value with its logarithm [4].
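Equation (1), the triangular filter bank and the logarithm step can be sketched together as below. This is an illustrative sketch under stated assumptions: the names `hz_to_mel`, `mel_to_hz`, `mel_filterbank` and `log_mel_energies` are hypothetical, the choice of 20 filters is arbitrary, the natural logarithm is used for compression, and a small floor value guards against taking the logarithm of zero. None of these choices comes from the paper.

```python
import math


def hz_to_mel(f_hz):
    """Equation (1): Mel value of a frequency given in Hz."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of equation (1)."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters whose centres are equally spaced on the Mel scale.
    Each filter is 1 at its own centre and falls linearly to 0 at the
    centres of its two neighbours."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(fs / 2.0)
    # n_filters centres plus one extra edge at each end
    edges = [mel_to_hz(mel_max * i / (n_filters + 1))
             for i in range(n_filters + 2)]
    bin_hz = fs / n_fft
    bank = []
    for m in range(1, n_filters + 1):
        lo, centre, hi = edges[m - 1], edges[m], edges[m + 1]
        weights = []
        for k in range(n_bins):
            f = k * bin_hz
            if lo < f <= centre:
                weights.append((f - lo) / (centre - lo))
            elif centre < f < hi:
                weights.append((hi - f) / (hi - centre))
            else:
                weights.append(0.0)
        bank.append(weights)
    return bank


def log_mel_energies(power_spec, bank, floor=1e-10):
    """Weighted sum of the power spectrum per filter, then log compression.
    Assumption: floor avoids log(0) for filters that receive no energy."""
    return [math.log(max(sum(w * p for w, p in zip(weights, power_spec)),
                         floor))
            for weights in bank]


# Usage with a flat power spectrum of 129 bins (N = 256, fs = 8000 Hz).
bank = mel_filterbank(n_filters=20, n_fft=256, fs=8000)
energies = log_mel_energies([1.0] * 129, bank)
print(len(energies))  # prints "20": one log energy per filter
```

Because the edges are spaced evenly in Mel rather than in Hz, the filters are narrow at low frequencies and progressively wider at high frequencies.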
Discrete cosine transform (DCT)

In this step, the log Mel spectrum is converted back to a time-like (cepstral) domain using the DCT. The output of this step is the set of Mel Frequency Cepstrum Coefficients (MFCCs). Each set of coefficients is called an acoustic vector, and each input utterance is transformed into a sequence of acoustic vectors [2]. The steps above extract the best parametric representation of the acoustic signals. These coefficients are used at a later stage.

4.2 Feature matching (VQ)

We use Vector Quantization (VQ) for feature matching.

Vector quantization (VQ)

VQ is the process of mapping a large number of vectors to a finite number of regions. Every region is called a cluster and is represented by its centre. The centre is called a codeword, and the collection of codewords is called a codebook [5]. In 1980, Linde, Buzo, and Gray (LBG) proposed a VQ design algorithm based on a training sequence. The LBG algorithm is as follows.

1. Design a 1-vector codebook; this is the centroid of the entire set of training vectors.
2. Double the size of the codebook by splitting each current codeword y_n according to the rule

\[ y_n^{+} = y_n (1 + \varepsilon) \tag{2} \]
\[ y_n^{-} = y_n (1 - \varepsilon) \tag{3} \]

   where n varies from 1 to the current size of the codebook, and ε is a splitting parameter.
3. Nearest-neighbour search: for each training vector, find the codeword in the current codebook that is closest and assign that vector to the corresponding cell.
4. Update the codeword in each cell using the centroid of the training vectors assigned to that cell.
5. Repeat steps 3 and 4 until the average distance falls below a preset threshold.
6. Repeat steps 2, 3 and 4 until a codebook of size M is designed.

This VQ algorithm gives a fixed-size codebook of size Q x T. Here Q is the number of Mel filters, which is fixed, and T is any number that satisfies the following condition:

\[ T = 2^{i}, \quad i = 1, 2, 3, \ldots \]

Following the above steps creates the codebook [6].

5. Methodology

We use Matlab for implementation. There are two main stages: the training phase and the testing phase. The requirements of the training phase are shown in table 1. In the testing phase, audio samples are taken at random and tested. The requirements of the testing phase are similar to those of the training phase, but the testing phase additionally computes the minimum distance (Euclidean distance) and the minimum distortion. By comparing a random audio sample with each codebook, the system assigns it to the one of the three cases that gives the minimum distortion. If the random audio sample matches noise, the class is not functioning; otherwise the class is functioning.

Table 1. Training requirement.

Sl. No.  Process                 Description
1        Data                    Audio samples: 1. Interactive class, 2. Teacher-only class, 3. Noise
2        Sampling frequency, Hz  8000 Hz
3        Audio format            .wav
4        Number of cases         15 in each case, taken in the school environment
5        Duration                Each sample of 20 s
6        Feature extraction      Uses MFCC and VQ to create the codebook
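The LBG training procedure of section 4.2 and the minimum-distortion test of the testing phase can be sketched as below. This is a hedged illustration rather than the authors' Matlab code. The names `lbg`, `avg_distortion` and `classify` are hypothetical, squared Euclidean distance is used as the distortion measure, and the inner loop stops when the relative improvement in average distortion drops below `tol`, which is a common reading of the "preset threshold" in step 5 but is an assumption here. The two-dimensional toy vectors stand in for real MFCC vectors.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def nearest(vector, codebook):
    """Index of the codeword closest to vector (step 3)."""
    return min(range(len(codebook)),
               key=lambda i: sq_dist(vector, codebook[i]))


def lbg(train, size, eps=0.01, tol=1e-3, max_iter=100):
    """Grow a codebook to `size` codewords (a power of two) by splitting."""
    codebook = [centroid(train)]                       # step 1
    while len(codebook) < size:                        # step 6
        # Step 2: split every codeword into y(1 + eps) and y(1 - eps).
        # Note: a codeword that is exactly zero would not separate.
        codebook = [[c * (1 + s * eps) for c in cw]
                    for cw in codebook for s in (1, -1)]
        prev = float("inf")
        for _ in range(max_iter):                      # step 5
            cells = [[] for _ in codebook]
            total = 0.0
            for v in train:                            # step 3
                i = nearest(v, codebook)
                cells[i].append(v)
                total += sq_dist(v, codebook[i])
            # Step 4: move each codeword to the centroid of its cell;
            # an empty cell keeps its previous codeword.
            codebook = [centroid(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
            avg = total / len(train)
            if prev - avg <= tol * max(avg, 1e-12):    # assumed stop rule
                break
            prev = avg
    return codebook


def avg_distortion(vectors, codebook):
    """Mean distance from each vector to its nearest codeword."""
    return sum(sq_dist(v, codebook[nearest(v, codebook)])
               for v in vectors) / len(vectors)


def classify(vectors, codebooks):
    """Label of the codebook that gives the minimum distortion."""
    return min(codebooks, key=lambda lbl: avg_distortion(vectors, codebooks[lbl]))


# Usage with made-up 2-D feature vectors for two of the three cases.
codebooks = {
    "interactive": lbg([[1.0, 1.0], [1.2, 0.9], [5.0, 5.0], [5.1, 4.8]], 2),
    "noise": lbg([[10.0, 0.5], [10.2, 0.4], [12.0, 3.0], [11.8, 3.1]], 2),
}
label = classify([[1.1, 1.0], [5.0, 4.9]], codebooks)
print(label, "-> class functioning:", label != "noise")
```

Training one codebook per case keeps the test phase simple: an unknown recording is scored against every codebook and the case with the smallest distortion wins.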
Table 2. Performance measure for classification among all cases. Columns: number of audio samples in training phase; number of audio samples in testing phase; efficiency (%).

6. Results

For experimentation we used different numbers of audio samples and obtained the performance shown in table 2. As the amount of training data increases, efficiency also increases. If a random audio sample used in the testing phase is classified as an interactive or teacher-only class, we say that the class is functioning; otherwise the class is not functioning. The result also depends on the noise levels, because noise may be present even in a teacher-only class, which reduces efficiency.

7. Conclusion

This paper provides a proof-of-concept system for monitoring the functioning of a classroom. For real-time data samples we obtained 80% efficiency. The efficiency can be improved by increasing the training data, and other methods such as Linear Prediction Coding (LPC), Hidden Markov Model (HMM) and Artificial Neural Networks (ANN) can also be tried. In future, the system can be enhanced by deploying it in a school, with the software system embedded in a PC together with the necessary components and a modem for SMS transmission.

References

[1] Jorge Martinez, Hector Perez, Enrique Escamilla and Masahisa Mabo Suzuki, Speaker recognition using Mel Frequency Cepstral Coefficients (MFCC) and Vector Quantization (VQ) Techniques.
[2] Lindasalwa Muda, Mumtaj Begam and I. Elamvazuthi, Voice Recognition Algorithms using Mel Frequency Cepstral Coefficient (MFCC) and Dynamic Time Warping (DTW) Techniques, vol. 2, issue 3, March.
[3] B. G. Nagaraja and H. S. Jayanna, Mono and Cross Lingual Speaker Identification with the Constraint of Limited Data, March 21-23.
[4] Ali Zulfiqar, Aslam Muhammad and A. M. Martinez Enriquez, A Speaker Identification System using MFCC Features with VQ Technique.
[5] Fatma Zohra Chelali and Amar Djeradi, MFCC and vector quantization for Arabic fricatives Speech/Speaker recognition.
[6] Amruta Anantrao Malode and Shashikant Sahare, Advanced Speaker Recognition, vol. 4, issue 1, July.
International Journal of Computer Sciences and Engineering Open Access Research Paper Volume-4, Special Issue-2, April 216 E-ISSN: 2347-2693 An Experimental Study of the Performance of Histogram Equalization
More informationIntroduction to Digital Audio
Introduction to Digital Audio Before the development of high-speed, low-cost digital computers and analog-to-digital conversion circuits, all recording and manipulation of sound was done using analog techniques.
More informationCourse overview Processamento de sinais 2009/10 LEA
Course overview Processamento de sinais 2009/10 LEA João Pedro Gomes jpg@isr.ist.utl.pt Instituto Superior Técnico Processamento de sinais MEAer (IST) Course overview 1 / 19 Course overview Motivation:
More informationUsing Data Mining for Mobile Communication Clustering and Characterization
Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer
More informationKeywords image processing, signature verification, false acceptance rate, false rejection rate, forgeries, feature vectors, support vector machines.
International Journal of Computer Application and Engineering Technology Volume 3-Issue2, Apr 2014.Pp. 188-192 www.ijcaet.net OFFLINE SIGNATURE VERIFICATION SYSTEM -A REVIEW Pooja Department of Computer
More informationAn Overview of Knowledge Discovery Database and Data mining Techniques
An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,
More informationAUTOMATIC PHONEME SEGMENTATION WITH RELAXED TEXTUAL CONSTRAINTS
AUTOMATIC PHONEME SEGMENTATION WITH RELAXED TEXTUAL CONSTRAINTS PIERRE LANCHANTIN, ANDREW C. MORRIS, XAVIER RODET, CHRISTOPHE VEAUX Very high quality text-to-speech synthesis can be achieved by unit selection
More informationAn Arabic Text-To-Speech System Based on Artificial Neural Networks
Journal of Computer Science 5 (3): 207-213, 2009 ISSN 1549-3636 2009 Science Publications An Arabic Text-To-Speech System Based on Artificial Neural Networks Ghadeer Al-Said and Moussa Abdallah Department
More informationKeywords: Image complexity, PSNR, Levenberg-Marquardt, Multi-layer neural network.
Global Journal of Computer Science and Technology Volume 11 Issue 3 Version 1.0 Type: Double Blind Peer Reviewed International Research Journal Publisher: Global Journals Inc. (USA) Online ISSN: 0975-4172
More informationMatlab GUI for WFB spectral analysis
Matlab GUI for WFB spectral analysis Jan Nováček Department of Radio Engineering K13137, CTU FEE Prague Abstract In the case of the sound signals analysis we usually use logarithmic scale on the frequency
More informationSachin Dhawan Deptt. of ECE, UIET, Kurukshetra University, Kurukshetra, Haryana, India
Abstract Image compression is now essential for applications such as transmission and storage in data bases. In this paper we review and discuss about the image compression, need of compression, its principles,
More informationInternational Journal of Computer Trends and Technology (IJCTT) volume 4 Issue 8 August 2013
A Short-Term Traffic Prediction On A Distributed Network Using Multiple Regression Equation Ms.Sharmi.S 1 Research Scholar, MS University,Thirunelvelli Dr.M.Punithavalli Director, SREC,Coimbatore. Abstract:
More informationAlternative Biometric as Method of Information Security of Healthcare Systems
Alternative Biometric as Method of Information Security of Healthcare Systems Ekaterina Andreeva Saint-Petersburg State University of Aerospace Instrumentation Saint-Petersburg, Russia eandreeva89@gmail.com
More informationAudio Scene Analysis as a Control System for Hearing Aids
Audio Scene Analysis as a Control System for Hearing Aids Marie Roch marie.roch@sdsu.edu Tong Huang hty2000tony@yahoo.com Jing Liu jliu 76@hotmail.com San Diego State University 5500 Campanile Dr San Diego,
More informationThe Fourier Analysis Tool in Microsoft Excel
The Fourier Analysis Tool in Microsoft Excel Douglas A. Kerr Issue March 4, 2009 ABSTRACT AD ITRODUCTIO The spreadsheet application Microsoft Excel includes a tool that will calculate the discrete Fourier
More informationAuto-Tuning Using Fourier Coefficients
Auto-Tuning Using Fourier Coefficients Math 56 Tom Whalen May 20, 2013 The Fourier transform is an integral part of signal processing of any kind. To be able to analyze an input signal as a superposition
More informationCELL PHONE AUDIO CONTROLLED POINT OF SALE TECHNOLOGY REVIEW REPORT. Alana Sweat Group 9 October 18, 2008
CELL PHONE AUDIO CONTROLLED POINT OF SALE TECHNOLOGY REVIEW REPORT Alana Sweat Group 9 October 18, 2008 1. REVISION HISTORY 1 2. INTRODUCTION 2 2.1 CUSTOMER REQUIREMENTS & PROJECT BACKGROUND 2 2.2 PROJECT
More informationDATA MINING TECHNIQUES AND APPLICATIONS
DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,
More informationQuarterly Progress and Status Report. Measuring inharmonicity through pitch extraction
Dept. for Speech, Music and Hearing Quarterly Progress and Status Report Measuring inharmonicity through pitch extraction Galembo, A. and Askenfelt, A. journal: STL-QPSR volume: 35 number: 1 year: 1994
More informationSpeech Recognition in Hardware: For Use as a Novel Input Device
Speech Recognition in Hardware: For Use as a Novel Input Device Nicholas Harrington Tao B. Schardl December 10, 2008 i Abstract Conventional computer input devices often restrict programs to exclusively
More informationMachine Learning using MapReduce
Machine Learning using MapReduce What is Machine Learning Machine learning is a subfield of artificial intelligence concerned with techniques that allow computers to improve their outputs based on previous
More informationInternational Journal of Advanced Computer Technology (IJACT) ISSN:2319-7900 PRIVACY PRESERVING DATA MINING IN HEALTH CARE APPLICATIONS
PRIVACY PRESERVING DATA MINING IN HEALTH CARE APPLICATIONS First A. Dr. D. Aruna Kumari, Ph.d, ; Second B. Ch.Mounika, Student, Department Of ECM, K L University, chittiprolumounika@gmail.com; Third C.
More informationNon-Data Aided Carrier Offset Compensation for SDR Implementation
Non-Data Aided Carrier Offset Compensation for SDR Implementation Anders Riis Jensen 1, Niels Terp Kjeldgaard Jørgensen 1 Kim Laugesen 1, Yannick Le Moullec 1,2 1 Department of Electronic Systems, 2 Center
More informationSimple Voice over IP (VoIP) Implementation
Simple Voice over IP (VoIP) Implementation ECE Department, University of Florida Abstract Voice over IP (VoIP) technology has many advantages over the traditional Public Switched Telephone Networks. In
More informationSpam Detection and the Types of Email
Volume 3, Issue 7, July 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Spam Detection
More informationIntroduction to Medical Image Compression Using Wavelet Transform
National Taiwan University Graduate Institute of Communication Engineering Time Frequency Analysis and Wavelet Transform Term Paper Introduction to Medical Image Compression Using Wavelet Transform 李 自
More informationLinear Predictive Coding
Linear Predictive Coding Jeremy Bradbury December 5, 2000 0 Outline I. Proposal II. Introduction A. Speech Coding B. Voice Coders C. LPC Overview III. Historical Perspective of Linear Predictive Coding
More informationInternational Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014
RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer
More informationPERCENTAGE ARTICULATION LOSS OF CONSONANTS IN THE ELEMENTARY SCHOOL CLASSROOMS
The 21 st International Congress on Sound and Vibration 13-17 July, 2014, Beijing/China PERCENTAGE ARTICULATION LOSS OF CONSONANTS IN THE ELEMENTARY SCHOOL CLASSROOMS Dan Wang, Nanjie Yan and Jianxin Peng*
More informationANALYSIS OF THE EFFECTIVENESS IN IMAGE COMPRESSION FOR CLOUD STORAGE FOR VARIOUS IMAGE FORMATS
ANALYSIS OF THE EFFECTIVENESS IN IMAGE COMPRESSION FOR CLOUD STORAGE FOR VARIOUS IMAGE FORMATS Dasaradha Ramaiah K. 1 and T. Venugopal 2 1 IT Department, BVRIT, Hyderabad, India 2 CSE Department, JNTUH,
More informationB reaking Audio CAPTCHAs
B reaking Audio CAPTCHAs Jennifer Tam Computer Science Department Carnegie Mellon University 5000 Forbes Ave, Pittsburgh 15217 jdtam@cs.cmu.edu Sean Hyde Electrical and Computer Engineering Carnegie Mellon
More information