Audio Compression Codec Using a Dynamic Gammachirp Psychoacoustic Model And a D.W.T Multiresolution Analysis



Similar documents
The Development of Web Log Mining Based on Improve-K-Means Clustering Analysis

An Alternative Way to Measure Private Equity Performance

Module 2 LOSSLESS IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Frequency Selective IQ Phase and IQ Amplitude Imbalance Adjustments for OFDM Direct Conversion Transmitters

An Enhanced Super-Resolution System with Improved Image Registration, Automatic Image Selection, and Image Enhancement

Adaptive Fractal Image Coding in the Frequency Domain

An Interest-Oriented Network Evolution Mechanism for Online Communities

A Secure Password-Authenticated Key Agreement Using Smart Cards

Vision Mouse. Saurabh Sarkar a* University of Cincinnati, Cincinnati, USA ABSTRACT 1. INTRODUCTION

VRT012 User s guide V0.1. Address: Žirmūnų g. 27, Vilnius LT-09105, Phone: (370-5) , Fax: (370-5) , info@teltonika.

What is Candidate Sampling

Effective wavelet-based compression method with adaptive quantization threshold and zerotree coding

VoIP Playout Buffer Adjustment using Adaptive Estimation of Network Delays

INVESTIGATION OF VEHICULAR USERS FAIRNESS IN CDMA-HDR NETWORKS

A Crossplatform ECG Compression Library for Mobile HealthCare Services

Study on Model of Risks Assessment of Standard Operation in Rural Power Network

Audio coding: 3-dimensional stereo and presence

Forecasting the Direction and Strength of Stock Market Movement

Calculating the high frequency transmission line parameters of power cables

DEFINING %COMPLETE IN MICROSOFT PROJECT

On-Line Fault Detection in Wind Turbine Transmission System using Adaptive Filter and Robust Statistical Features

Forecasting the Demand of Emergency Supplies: Based on the CBR Theory and BP Neural Network

Project Networks With Mixed-Time Constraints

Implementation of Deutsch's Algorithm Using Mathcad

APPLICATION OF PROBE DATA COLLECTED VIA INFRARED BEACONS TO TRAFFIC MANEGEMENT

An Evaluation of the Extended Logistic, Simple Logistic, and Gompertz Models for Forecasting Short Lifecycle Products and Services

Damage detection in composite laminates using coin-tap method

RESEARCH ON DUAL-SHAKER SINE VIBRATION CONTROL. Yaoqi FENG 1, Hanping QIU 1. China Academy of Space Technology (CAST)

Quantization Effects in Digital Filters

On the Optimal Control of a Cascade of Hydro-Electric Power Stations

STANDING WAVE TUBE TECHNIQUES FOR MEASURING THE NORMAL INCIDENCE ABSORPTION COEFFICIENT: COMPARISON OF DIFFERENT EXPERIMENTAL SETUPS.

benefit is 2, paid if the policyholder dies within the year, and probability of death within the year is ).

NEURO-FUZZY INFERENCE SYSTEM FOR E-COMMERCE WEBSITE EVALUATION

"Research Note" APPLICATION OF CHARGE SIMULATION METHOD TO ELECTRIC FIELD CALCULATION IN THE POWER CABLES *

Data Broadcast on a Multi-System Heterogeneous Overlayed Wireless Network *

Methodology to Determine Relationships between Performance Factors in Hadoop Cloud Computing Applications

RequIn, a tool for fast web traffic inference

8.5 UNITARY AND HERMITIAN MATRICES. The conjugate transpose of a complex matrix A, denoted by A*, is given by

A Multi-Camera System on PC-Cluster for Real-time 3-D Tracking

IMPACT ANALYSIS OF A CELLULAR PHONE

A DATA MINING APPLICATION IN A STUDENT DATABASE

METHODOLOGY TO DETERMINE RELATIONSHIPS BETWEEN PERFORMANCE FACTORS IN HADOOP CLOUD COMPUTING APPLICATIONS

Multiplication Algorithms for Radix-2 RN-Codings and Two s Complement Numbers

Face Verification Problem. Face Recognition Problem. Application: Access Control. Biometric Authentication. Face Verification (1:1 matching)

Feature selection for intrusion detection. Slobodan Petrović NISlab, Gjøvik University College

An RFID Distance Bounding Protocol

CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK. Sample Stability Protocol

A DYNAMIC CRASHING METHOD FOR PROJECT MANAGEMENT USING SIMULATION-BASED OPTIMIZATION. Michael E. Kuhl Radhamés A. Tolentino-Peña

Politecnico di Torino. Porto Institutional Repository

Performance Analysis of Energy Consumption of Smartphone Running Mobile Hotspot Application

Statistical Approach for Offline Handwritten Signature Verification

Gender Classification for Real-Time Audience Analysis System

Can Auto Liability Insurance Purchases Signal Risk Attitude?

A hybrid global optimization algorithm based on parallel chaos optimization and outlook algorithm

COMPUTER SUPPORT OF SEMANTIC TEXT ANALYSIS OF A TECHNICAL SPECIFICATION ON DESIGNING SOFTWARE. Alla Zaboleeva-Zotova, Yulia Orlova

A Performance Analysis of View Maintenance Techniques for Data Warehouses

Stochastic Protocol Modeling for Anomaly Based Network Intrusion Detection

A Design Method of High-availability and Low-optical-loss Optical Aggregation Network Architecture

1.1 The University may award Higher Doctorate degrees as specified from time-to-time in UPR AS11 1.

Linear Circuits Analysis. Superposition, Thevenin /Norton Equivalent circuits

Activity Scheduling for Cost-Time Investment Optimization in Project Management

Automated information technology for ionosphere monitoring of low-orbit navigation satellite signals

Comparison of Control Strategies for Shunt Active Power Filter under Different Load Conditions

An ILP Formulation for Task Mapping and Scheduling on Multi-core Architectures

RELIABILITY, RISK AND AVAILABILITY ANLYSIS OF A CONTAINER GANTRY CRANE ABSTRACT

Causal, Explanatory Forecasting. Analysis. Regression Analysis. Simple Linear Regression. Which is Independent? Forecasting

Lecture 2: Single Layer Perceptrons Kevin Swingler

Section 5.4 Annuities, Present Value, and Amortization

Simple Interest Loans (Section 5.1) :

A Replication-Based and Fault Tolerant Allocation Algorithm for Cloud Computing

Risk-based Fatigue Estimate of Deep Water Risers -- Course Project for EM388F: Fracture Mechanics, Spring 2008

ANALYZING THE RELATIONSHIPS BETWEEN QUALITY, TIME, AND COST IN PROJECT MANAGEMENT DECISION MAKING

PAS: A Packet Accounting System to Limit the Effects of DoS & DDoS. Debish Fesehaye & Klara Naherstedt University of Illinois-Urbana Champaign

Dynamic Pricing for Smart Grid with Reinforcement Learning

The Effect of Mean Stress on Damage Predictions for Spectral Loading of Fiberglass Composite Coupons 1

Faraday's Law of Induction

An MILP model for planning of batch plants operating in a campaign-mode

Proceedings of the Annual Meeting of the American Statistical Association, August 5-9, 2001

Parallel Numerical Simulation of Visual Neurons for Analysis of Optical Illusion

Extending Probabilistic Dynamic Epistemic Logic

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic

Biometric Signature Processing & Recognition Using Radial Basis Function Network

Design and Development of a Security Evaluation Platform Based on International Standards

Course outline. Financial Time Series Analysis. Overview. Data analysis. Predictive signal. Trading strategy

How To Understand The Results Of The German Meris Cloud And Water Vapour Product

Risk Model of Long-Term Production Scheduling in Open Pit Gold Mining

Multimedia Terminal System-on-Chip Design and Simulation

Improved SVM in Cloud Computing Information Mining

IWFMS: An Internal Workflow Management System/Optimizer for Hadoop

Luby s Alg. for Maximal Independent Sets using Pairwise Independence

Transcription:

Khall Abd et. al. / (IJSE) Internatonal Journal on omputer Scence and Engneerng Vol., No. 4,, 34-354 Audo ompresson odec Usng a Dynamc Gammachrp Psychoacoustc Model And a D.W.T Multresoluton Analyss Khall Abd Natonal Engneerng School of Tuns ( ENIT ) Laboratory of Systems and Sgnal Processng (LSTS) BP 37, Le Belvédère, Tuns, Tunsa Khallab6@yahoo.fr Kas Oun and Noureddne Ellouze Natonal Engneerng School of Tuns ( ENIT ) Laboratory of Systems and Sgnal Processng (LSTS) BP 37, Le Belvédère, Tuns, Tunsa Abstract Audo compresson algorms are used to obtan compact dgtal representatons of hgh-fdelty audo sgnals for e purpose of effcent transmsson over larger dstances and storage. Ths paper presents an audo coder for real-tme mutmeda applcatons. Ths coder apples a dscrete wavelet transform to decompose audo test fles nto subbands to elmnate redundant data usng spectral and temporal maskng propertes. Ths archtecture s combned w a psychoacoustc model charactersed by an external and mddle ear model as a frst part and a dynamc Gammachrp flter as a second part whose connectons are selected n order to come close to e crtcal bands of e ear. Expermental results show e best performance of s archtecture Keywords-Audo ompressont; Dscrete Wavelet Transform; Psychoacousc Model; External and Mddle Ear Model; Dynamc Gammachrp Flter Bank;MUSHRA lstenng test; I. INTRODUTION Alough wreless communcaton has played a great role n our lfestyle, transmsson of hgh fdelty audo sgnal wrelessly at a reasonable cost s stll challengng. Presently avalable audo codng technques am at reducng btrate and put less concern on complexty and effcent wreless transmsson. Such audo ODEs lke ISO/MPEG and AA [9] are sutable for non real tme applcatons and audo archve. In audo codng ere s a trade-off between e compresson rato and reconstructon qualty. Hgher compresson ratos mply more degradaton on e sound qualty when reconstructed. Any compresson scheme should consder ese facts and should also provde a good control mechansm to use e lmted bandwd and mantan acceptable qualty, keepng n mnd e computatonal complexty. The varablty of audo compresson technques offer varous compressed audo qualty, amount of data compresson and levels of complexty. Despte s varablty e present audo encoders are based on e Fourer transform whch s localsed only n frequency unlke e wavelet transform. Ths technque s a tme-frequency localzaton analyss meod for non-statonary sgnal and has been dentfed as an effectve tool for data compresson. The wndow wd of wavelet analyss s adjustable. It s shorter at hgher frequences and larger at lower frequences. The tmefrequency characterstc of a wavelet flter bank s a natural match to some of e propertes of wdeband speech and audo sgnal. Ths paper proposes an audo codng scheme based on subband analyss usng e dscrete wavelet transform and adopts an analyss of e frequency bands at come closer to e crtcal bands of e ear. Our goal s not to propose a new wavelet type but to apply e wavelet formalsm for speech/musc dscrmnaton. Our motvaton to apply wavelets to speech/musc dscrmnaton s due to er ablty to extract tme-frequency features and to deal w nonstatonary sgnals [8]. Ths paper s arranged as follows: Secton elaborates e proposed D.W.T encoder. Secton 3 wll focus on e e Gammachrp model. Secton 4 ams at presentng e archtecture of e psychoacoustc model usng e dynamc Gammachrp wavelet. Expermental results are shown n secton 5. A sound qualty evaluaton wll be presented n secton 6. Fnally, a hardware mplementaton of e D.W.T ODE wll be carred out to conclude s work. II. THE D.W.T AUDIO ENODER The archtecture of e classcal perceptual MPEG audo compresson [] s shown n e followng Fgure. Ths model use sgnal analyss, psychoacoustc models, bt allocaton and codng blocks. At e ISO/MPEG layer III (MP3) codng scheme [], e tme to frequency mappng block ncludes a polyphase analyss flter bank followed by decmaton of a factor of 3 [], feedng a modfed dscrete cosne transform (MDT) [] and adaptve segmentaton block, also connected w e psychoacoustc model. The bt allocaton block ncludes block compandng, quantzaton and Huffman codng []. ISSN : 975-3397 34

Khall Abd et. al. / (IJSE) Internatonal Journal on omputer Scence and Engneerng Vol., No. 4,, 34-354 Sgnal.wav Flter Bank (3 equal subbands) lasscal Psychoacoustc Model Quantzaton and odng Sde nformaton Btstream Formatng Audo encoded sgnal The D.W.T s appled to e audo nput sgnal w 8 levels of decomposton, us concentratng energy n lower frequency bands (hgher levels). Ths part decomposes and transforms e audo frame [7] nto wavelet doman. The decomposton tree conssts of 8 subbands chosen to resemble e crtcal band [][9] of human hearng as depcted n Fgure 3. Each subband wll be quantzed accordng to e sgnal to mask rato (SMR) calculated by e dynamc Gammachrp Psychoacoustc Model. Fgure. The classcal MPEG audo encoder A. Stucture of The D.W.T Encoder Ths secton explans e structure of e proposed D.W.T codec usng as nput wave fle audo. Each fle has e followng propertes (Sample rate Fe=44.Khz and Btrate=75Kbts/s) and s troncatured at 4 samples. The encoder s based on dscrete wavelet transform D.W.T, whch has strong relatons w subband analyss and flter bank. Normalsaton of e wavelet coeffcents Quantzaton of e normalzed wavelet coeffcents Btstream Formatng R-heck Audo Sgnal Dscrete Wavelet Transform: (8 levels of decomposton) (8 Subbands) 8 Determnaton of e correspondng Subband Scale 8 enters Frequences External and Mddle Ear model Psychoacoustc Model Usng A Dynamc Gammachrp Model SMR(dB) SNR(dB) Bt allocaton Fgure 3. The dscrete wavelet transform repartton TABLE I. THE D.W.T SUBBAND REPARTITION SMR(dB) Return to SNR SNR(dB) - SMR(dB) = MNR(dB) MNR< Fgure. The D.W.T Encoder Subband [ F - F ] mn max (HZ) enter Frequency f (Hz) Number of samples [ 9] 45 4 [9 7] 3 4 3 [7 6] 5 4 4 [6 34] 3 4 5 [34-5] 43 8 6 [5 69] 65 8 7 [69 86] 775 8 8 [86 3] 945 8 ISSN : 975-3397 34

Khall Abd et. al. / (IJSE) Internatonal Journal on omputer Scence and Engneerng Vol., No. 4,, 34-354 9 [3 ] 5 8 [ 4] 3 8 [4 7] 55 6 [7 ] 9 6 3 [ 4] 5 6 4 [4 8] 6 6 5 [8 3] 95 6 6 [3 34] 35 6 7 [34 4] 375 3 8 [4 48] 445 3 9 [48 55] 55 3 [55 6] 585 3 [6 69] 655 3 [69 83] 76 64 3 [83 96] 895 64 4 [96 ] 3 64 5 [ 38] 4 8 6 [38 65] 55 8 7 [65 93] 79 8 8 [93 ] 65 8 In order to evaluate e proposed D.W.T repartton, we compared e postons of ts center frequences w e real one. As shown n Fgure 4 e second repartton usng D.W.T has approxmately e same repartton. Bark 4 3 Real enters Frequences repartton 3 4 Frequency (Hz) two at each level of e tree, we used e followng equaton: M ( S, W ) Samp( ) Y ( f, g ) () Where: j j k k jn jn n n n k k S S S S M W W W W M S S S S M3 M M W W W W M3 M M MS ( j, Wj ) j M j M S S S M M S S M3 W W W M M W W M3 Sam p( ) n 3 4 5 n n n ( nxn ) () (3) Bark 4 3 enters Frequences repartton usng The proposed DWT 3 4 Frequency (Hz) Fgure 4. omparson between e real and D.W.T centers frquences repartton Y ( f, g ) k k n n k k f g f g f g f 3 g 3 f n g n n (4) To avod cases where wavelet coeffcents exceeded e number of samples n tme doman, each frame s vewed as perodc. In order to decompose frame and down-samplng by The number of samples ( ) s equal to n. The number of wavelet functon coeffcents (W ) s equal to m ISSN : 975-3397 34

Khall Abd et. al. / (IJSE) Internatonal Journal on omputer Scence and Engneerng Vol., No. 4,, 34-354 The number of scalng functons ( S ) s equal to m J= Detals Approxmatons 5 x [9.3 KHz] -4 5 x [6.5 9.3KHz] -4 wll be stored n e feld of 68 bts data as shown n Fgure 6. In order to encode e 8 subbands, we used bts data (4 bts data for each one). The varable bts feld contans e quantzed wavelet coeffcents whose leng depends on e bnary encodng of all subbands. Fnally, n order to make easy e decodng and to mantan a fxed btrate, we pad bt s J= -5 5 5 x -4 [3.8 6.5KHz] -5 5-5 5 5 x -4 [ 3.8KHz] -5 5 Sync 4 Bts 4 Bts Subband 4 Bts 4x8= Bts Subband 8 4 Bts 6X8 Bts 834 Bts The quantzed scalf Wavelet oeffcent 68 Bts 53 Bts 53 Bts Zero-pad 5 x -4 [9.6 KHz] 5 x -4 [8.3 9.6KHz] Fgure 7. The frame header structure of e D.W.T encoder J=3-5 4 6-5 4 6 Fgure 5. Example of wavelet coeffcents usng dfferent subbands n dfferent resolutons J B. Bts Allocaton The SMR, e output of e psychoacoustc model, s used for bt allocaton to e quantzed frequency samples resultng from analyss subband flterng. To allocate bts for e subband, we fnd e mask to nose rato (n db) MNR=SNR SMR. The bts are ncrementally allocated to subbands from lowest MNR to hghest MNR. Ths gves a set of step szes. We recompute SNR and MNR w e new step szes and terate untl all bts have been allocated D. Normalzaton and Quantzaton of e wavelet oeffcents The non-statonarty and dynamc range of audo sgnals s taken nto account by segmentaton and normalzaton after e sgnal has been decomposed nto subbands [8]. It s typcally assumed at audo sgnals can be consdered statonary over approxmately ms. At an nput samplng rate of 44.kHz, a segment leng equvalent to 4 nput samples was chosen. Ths corresponds to a segment leng of approxmately 3ms. We appled e correspondng nput troncatured audo wave fle to e proposed D.W.T structure. We obtaned fnally e output wavelet coeffcents. After downsamplng, e number of samples per segment per subband vares from 4 to 8. Each of ese segments s en normalzed. We defne 64 scale factors to be selected as approprate for each subband. The scaled coeffcents are unformly quantzed accordng to e number of bts receved from bt allocaton algorm.. Frame Header Structure of The Wavelet Encoder The wavelet encoder fle s bult up from smaller parts called frames. The frst part uses 4 bts and s called sync. The wavelet encoder forms frames of 4 samples per audo channel. 4 Audo Samples The wavelet encoder Fgure 6. The wavelet encoder functon Frame header of e wavelet encoder III. THE GAMMAHIRP MODEL FOR OHLEAR FILTERS Several models have been proposed to smulate e workng of e cochlear flters. Seen as ts temporal-specter propertes, e Gammachrp flter has been successful n e psychoacoustc research []. In addton to ts good approxmaton n e psychoacoustcal apparellement, t has a temporal spectar optmzaton of e human audtory flter (Irno and Paterson, 997). The noton of e wavelet transform has a bg mportance n e sgnal treatment doman and n e speech analyss. The complex mpulse response of e Gammachrp s [][3]: Ths encoder codes data n dfferent groups varyng from 4 to 8 samples for each subband. The encoder uses dfferent scale factor for each samples group. Each scalng factor wll be coded n 6 bts data for each subband and s sent only for e subband w non zero bts allocaton. Fnally all scales g () t. t. e. e c n. berb. ( fr) t.(. fr. tc.ln( t) ) (5) Where tme t, s e ampltude, n and b are parameters defnng e dstrbuton, fr s e asymptotc frequency, s ISSN : 975-3397 343

Khall Abd et. al. / (IJSE) Internatonal Journal on omputer Scence and Engneerng Vol., No. 4,, 34-354 e parameter for e frequency modulaton and s e ntal phase. ERB( f r ) s e equvalent rectangular bandwd [] of e flter at e center frequency f r. It has e followng equaton [][]: ERB( fr) 4.7.8 f (6) r It has been demonstrated at e Gammachrp flter fts human psychoacoustc maskng data well when e parameter s assocated w e sound pressure level P (db) S typcally as [4]: 3.38.7. P S (7) A. Ampltude Spectrum of e Gammachrp flter The ampltude spectrum of The Gammachrp flter has e followng expresson: Where:. ( nc. ) Gc ( f ) e n. berb. ( f ). ( f f ) r. ( nc. ) Gc ( f ). S( f ) n. berb. ( f ). ( f f ) r f fr.arctan( ) berb. ( fr ) r r f fr.arctan( ) berb. ( fr ) (8) (9) S( f ) e () The peak frequency s obtaned as follows: f p f r cberb.. ( fr) () n The frst term n Eq.(8) represents e ampltude spectrum of e Gammatone snce S( f ) when c. Thus, e term S( f ) produces a shft n e peak frequency accordng to Eq.() and ntroduces asymmetry nto e ampltude spectrum. The ampltude of Eq.(7) s rewrtten as: G ( f ) G ( f ). S( f ) () c Where GT ( f ) s e ampltude spectrum of e Gammatone, whch s level-ndependent and nvarant snce n and b are constant. The gammachrp s represented by two cascaded flters: an nvarant Gammatone flter and an asymmetrc, level-dependent flter S( f ). T Fgure 8. The Fourer magntude spectrum of e Gammachrp flter usng dfferent values of (=-3 3) IV. THE DYNAMI PSYHOAOUSTI MODEL.The proposed model conssts of several processng stages, reflectng e above descrbed structure of human audtory paway as shown n e followng Fgure 9: wave sgnal Outer and Mddle Ear Model ochlea Fgure 9. The human audtory paway The fundamental concept of e perceptual encoder s to elmnate e audo sgnals at human cannot perceve because of sgnal maskng or ambguty. In fact, we need not encode nor be nterested n sgnal components at are below e hearng reshold. Three maskng patterns, whch are e absolute reshold of hearng, frequency maskng and temporal maskng [7][6], are erefore used to fnd e approprate quantzng bts for each subband whle mnmzng e quantzaton nose. The ear model wll be dvded nto two archtectures, e frst one representng e external and mddle ear model, and e second representng e nner ear model. The operatng of e new psychoacoustc model s as follows: We segmented e audo wave fle sgnal usng a 4 ponts Hannng wndow. The segmented sgnal s fltered usng e non lnear external and mddle ear model. Its transfer propertes may be affected by e actvty of mddle ear muscles n case of hgh level sounds. Under s assumpton, t s possble to model s model utlzng only a dgtal flter w approprate frequency response whch s gven by e followng analytcal expresson [4]: H( f).84. f 6.5. e. f.8.6( f 3.3) 3 3.6 (3) y(t) ISSN : 975-3397 344

35 3 5 5 5 4 6 8 4 6 8 Frequency(Hz) 5 5 5-5 4 6 8 4 6 8 Fréquency(Hz) 4 6 8 4 6 8 Fréquency(Hz) 5 5-5 - Khall Abd et. al. / (IJSE) Internatonal Journal on omputer Scence and Engneerng Vol., No. 4,, 34-354 Fgure. Determnaton of e Gammachrp flter bank correspondng to each troncatured audo wave sgnal (4 samples) Note: The Gammatone flter bank s seres of 8 subbands. Ht () s e flterng result of e nput wave sgnal (4 samples) usng e correpondng external and mddle ear model flter Fgure. The transfer functon of e external and mddle ear model The output sgnal of e outer and mddle ear model flter s appled to a Gammatone flter bank (8 subbands) caracterzed by 8 centers frequences proposed by e dscrete wavelet transform repartton. On each subband we calculate e sound pressure level Ps (db) n order to have e correspondng subband chrp term. Those 8 values of chrp term correspondng to 8 subbands of e Gammatone flter bank lead to e correspondng Gammachrp flter bank G () t : s e flterng result of e nput Ht () sgnal usng e correspondng ( 8 charactersed by ts subband() Ps : s e sound pressure level of G () t : s e correspondng Ps chrp term ) Gammatone flter Audo wave sgnal (4 Samples) Ampltude(dB) Gammatone Flter Ampltude(dB) Asymetrc Functon (=) The external and mddle ear model flter Ht () X Gammatone Flter Bank x 4 Ampltude(dB) Gammachrp Flter (=) Fgure. Decomposton of e Gammachrp Flter G () t Ps The Sound Pressure Level Ps <= <=8 (The hrp Term ) <= <=8 G () t 7 Ps7 7 G () t 8 Ps8 8 On each subband of e dynamc Gammachrp flter bank we determne tonal and non tonal components []. Ths step begns w e determnaton of e local maxma, followed by extractng e tonal components (snusodal) and non tonal components (nose) n every bandwd of a crtcal band. The selectve suppresson of tonal and non tonal components of maskng s a procedure used to reduce e number of maskers taken nto account for e calculaton of e global maskng reshold..5..4.6.8..4.6.8 The correspondng Gammachrp Flter Bank x 4 ISSN : 975-3397 345

Khall Abd et. al. / (IJSE) Internatonal Journal on omputer Scence and Engneerng Vol., No. 4,, 34-354 Am pltude(db) 8 6 4 Tonal omponent Non tonal component Tonal Masker Non tonal Masker Spectrum of e w av sgnal 5 5 Frequency(Hz) Fgure 3. Example of e psychoacoustc model graphcal output The remanng tonal and non tonal components are ose whch are above e hearng absolute reshold [][3]. Indvdual maskng reshold takes account of e maskng reshold for each remanng component. Lastly, global maskng reshold s calculated by e sum of tonal and non tonal components whch are deduced from e spectrum to determne fnally e sgnal to mask rato Start Read Fle (.wav) Dvson of e sgnal nto small frames of 4 samples usng a Hannng wndow Determnaton of e local maxmas Elmnaton of e frequency components masked by e absolute reshold of hearng SMR Sgnal wav GFB P. A. M 4. Step 8 7 SMR samples (8 7 8 Sub Step4 SMR 8 bands) SMR Sgnal. wav GFB P. A. M 4 Step8 samples 7 SMR (8 7 8 Sub Step4 SMR 8 bands) Fgure 5. hange of e dynamc psychoacoustc model from one troncatured wave sgnal (4 samples) to anoer We used e term dynamc because f we move from one toncatured wave sgnal to anoer e shape of e Gammachrp flter bank used by e psychoacoustc model wll be modfed Note: Sgnal.wav=Nx ( troncatured sgnal (4 samples)) ) SMR Sgnal wav GFB P. A. M 4. Step 8 7 SMR samples (8 7 N 8 Sub Step4 SMR N 8 bands) N N N Sgnal. wav : The ( N ) troncatured wave 4 samples Flterng of e troncatured sgnal usng e combned external and mddle ear model Gammatone flter bank Estmaton of Ps n each Gammatone subband and calculaton of Asymmetrc functon characterzed by ts chrp term Localsaton of tonal and non tonal components Selectve suppresson of non tonal and tonal maskng components Indvdual maskng reshold Global maskng reshold sgnal usng e Hannng wndow (4 ponts) 7 8 :The ( N ) vector contanng e 8 chrp term values. It corresponds to e Sgnal. wav. 4 samples Gammachrp flter bank Sgnal To Mask Rato Fgure 4. The dfferent steps of e dynamc psychoacoustc model ISSN : 975-3397 346

Khall Abd et. al. / (IJSE) Internatonal Journal on omputer Scence and Engneerng Vol., No. 4,, 34-354 Sgnal.wav= Nx Frame of 4 Audo Samples GFB (8 Sub bands) : The ( N ) Gammachrp flter bank (8 4 Audo Samples + ---------- + N 4 Audo Samples subbands) correspondng to e Sgnal. wav 4 samples PAM.. : The Psychoacoustc Model as shown n Fgure Step8 Step4 D.W.T (8 Levels) (8 Subbands) 4 Audo Samples Quantzaton Btstream Formattng R-heck Audo Encoded Sgnal 4, correspondng to e Sgnal. wav and begnnng 4 sam ples from step 8 (Determnaton of e local maxmas) to step 4 (Sgnal To Mask Rato) SM R :The SM R 7 SM R8 ( N ) vector contanng e 8 Flter of External and Mddle Ear Model Gammatone Flter Bank 8 enters Frequences SNR(dB) Bt allocaton SMR values. It corresponds to e Sgnal. wav 4 samples V. EVALUATION OF SOUND OMPRESSION RATIO The compresson rato R s defned by e followng expresson: R sze _ wav _ sgnal (4) sze _ compressed _ sgnal Ps Gammachrp Flter Bank Ps8 8 Return To SNR SNR(dB) -SMR(dB) = MNR(dB) MNR< MNR> The nput sgnal s dvded n N dfferent troncatured audo wave samples (4 samples). Each one wll be encoded usng e followng D.W.T encoder as shown n Fgure 6 to obtan fnally e specfc frame header. oded data Determnaton of e local maxmas Global Maskng Thershold SMR(dB) Fgure 6. The detal archtecture of e D.W.T encoder The N all header frames wll be decoded usng e decoder block system as shown n Fgure 7 to obtan fnally our compressed sgnal. Sde nformaton ISSN : 975-3397 347

Khall Abd et. al. / (IJSE) Internatonal Journal on omputer Scence and Engneerng Vol., No. 4,, 34-354 Bt allocaton fles used for e test, er capactes, er duratons and e compresson ratos calculated for each btrate. Bt Unpackng Decode Wavelet oeffcents Wavelet oeffcents Inverse Wavelet Transform Reconstructed Output TABLE II. Sgnal. wav D.W.T OMPRESSION RATIO VALUES USING A BITRATE OF 96 KBITS/S, 8 KBITS/S AND 6 KBITS/S Duraton (s) apacty (Ko) 96 Kbt/s 8 Kbts/s 6 Kbts/s Slow 8 46.578.66 8.39 Soul 9.69.734 8.48 Rock 68.893.954 8.6 Jazz 5 66.735.84 8.569 Fgure 7. The D.W.T decoder Sgnal.wav= Nx Frame of 4 Audo Samples 4 Audo samples 4 Audo samples ---- ------------- D.W.T encoder D.W.T encoder The frame header : 834 bts The reconstructed sgnal D.W.T decoder D.W.T decoder The weakest compresson rato s gven for e flow of 6 Kbts/s and e hghest s gven for e flow of 64 Kbts/s. However, e weaker e btrate, e hgher e compresson rato s and e less ntellgble ts qualty becomes. Based on e results obtaned n Table, e proposed encoder usng e D.W.T repartton (8 subbands) and e dynamc Gammachrp flter bank takes account of e maskng phenomenon and e crtcal bands. The second part of our evaluaton wll focus on comparng e compresson rato obtaned from e proposed D.W.T model and e classcal MPEG one usng a btrate of 8kbts/s whch gves e best compromse between compresson rato and sound qualty [4]. TABLE III. OMPARISON BETWEEN THE LASSIAL MPEG ENODER AND THE PROPOSED D.W.T ENODER The classcal MPEG encoder The proposed D.W.T encoder Number of subbands 3 8 Number of samples per subband 3 (for each subband) Between : 4 and 8 (as shown n Table ) Sze of e troncatured sgnal 4 samples 4 samples N 4 Audo samples N D.W.T encoder N D.W.T decoder Psycho- Acoustc model lasscal Psychoacoustc model Psychoacoustc Model usng an external and mddle ear model and a dynamc Gammachrp flter bank The global compressed sgnal Fgure 8. The D.W.T encoder and decoder In order to evaluate e proposed codec usng e D.W.T and e dynamc Gammachrp psychoacoustc model, we used for varous btrates some types of sound such as Slow, Soul and Rock. The evaluaton s based on e compresson rato defned as e quotent between e sze of e orgnal fle and e compressed fle. Table contans e types of e sound Table 4 contans e type of e sound fles used for e test, e compresson rato values usng e classcal MPEG codec and e proposed D.W.T one TABLE IV. THE SOUND OMPRESSION RATIO OMPARISON USING THE LASSIAL MPEG ENODER AND THE PROPOSED D.W.T ENODER (BITRATE: 8KBITS/S) Sgnal.wav The classcal MPEG codec The proposed D.W.T codec Slow 7.393.66 Soul 7.64.734 Slow 7.393.66 The Table 4 reveals at sound compresson usng D.W.T s e best one. In fact, ts average compresson rato presents an mprovement of 4.3% n comparson to e classcal ISSN : 975-3397 348

Khall Abd et. al. / (IJSE) Internatonal Journal on omputer Scence and Engneerng Vol., No. 4,, 34-354 MPEG coder. The spectrum of orgnal and compressed sgnal usng e D.W.T and e classcal MPEG model are shown n Fgures 9 and. Am pltude(db) 7 6 5 4 3 Spectrum of e wav sgnal Global maskng ershold Spectrum of e DWT compressed sgnal Sgnal wav masked by e global maskng ershold The no masked sgnal wav - 5 5 Frequency(Hz) Fgure 9. Spectrum of e orgnal and e compressed sgnal usng e D.W.T model Ampltude(dB) 7 6 5 4 3 Spectrum of e wav sgnal Global Maskng Thershold Spectrum of e classcal MPEG compressed sgnal The masked sgnal wav by e global maskng ershold The no masked sgnal wav 5 5 Frequency(Hz) naudblty for all frequences superor to s frequency (7Hz). All ose frequences are naudble snce ers ampltudes are below e absolute reshold of hearng. In our vew, e D.W.T model s better n comparson to e classcal MPEG one. In fact, f we compare e spectrum of e two compressed sgnals appearng n Fgures 9 and we remark at e frequency degradaton begns at 65Hz concernng e classcal MPEG model and 7Hz for e D.W.T model. We remark also at D.W.T frequences degradaton s more mportant. So e average of all D.W.T frequences ampltude s about db whereas t s db for e classcal model VI. AUDIO TEST QUALITY A. Evaluaton Usng e MUSHRA Lstenng Test Subjectve assessment of sound qualty s e best way to evaluate encoders performance. As a result of a heavy compresson, we always have to expect some knd of mparment. The man objectve of every compresson algorm s to make s mparment as much pleasant for a human ear as possble. There are several known meods used for subjectve assessment of audo codecs qualty. For s purpose we used e EBU MUSHRA meod, e most sutable for testng compressed audo samples w a very low btrates. MUSHRA stands for MUltple Stmulus w Hdden Reference and Anchors [5] and s a meodology for subjectve evaluaton of audo qualty, to evaluate e perceved qualty of e output from lossy audo compresson algorms. It s defned by ITU-R recommendaton BS.534- [5]. The MUSHRA test meod s e result of 6 years work by many people [6]. It was devsed to meet a need for a subjectve test meod at was approprate to ntermedate audo qualty systems. It s used n tests at are repeatable and reproducble. The fundamental characterstcs are:. multple stmul beng offered to e subject;. one of e stmul s a known reference; 3. one of e stmul s a hdden reference; 4. oer stmul must nclude hdden anchors w clearly defned parameters. The test meod has been used by several dfferent laboratores and s provng to be very effectve. In e Mushra test e lstener s presented w several audo stmul for whch e lstener must gve a score between and dependng upon er opnon of e qualty. The scale has fve ranges : excellent (-8), good (8-6), far (6-4), poor (4-) and bad (-) [6]. Fgure. Spectrum of e orgnal and e compressed sgnal usng e classcal MPEG model The compressed sgnal s e mask effect of e global maskng reshold on e orgnal audo fle spectrum. oncernng e D.W.T model, we note at e global maskng ershold masked e spectrum of e orgnal wave sgnal from 7Hz. Ths acton ntroduces a degradaton and ISSN : 975-3397 349

Khall Abd et. al. / (IJSE) Internatonal Journal on omputer Scence and Engneerng Vol., No. 4,, 34-354 Excellent 8 Good 6 Far 4 Poor Bad Fgure. The gradng scale for e MUSHRA lstenng test. MUSHRA SORE 8 6 3Kbts/s 4 64Kbts/s DWT AA WMA OGG ODE Fgure. The MUSHRA lstenng test (Sequence ) The selecton of compresson formats and er specfc encoders s a key part of e test preparaton phase. Due to extreme tme consumpton for s type of tests, we decded to compare e proposed audo D.W.T format w only ree most wdespread audo formats- OGG, AA and WMA. We pcked e followng partcular encoders: The proposed D.W.T encoder AA encoder ncluded n Apple Tunes 7.. WM Encoder usng WMA 9. codec OggEnc.. (part of VorbsTools..) MUSHRA SORE 8 6 4 DWT AA WMA OGG 3Kbts/s 64Kbts/s In order to do a good test, t s mportant to choose e test audo sgnals w some crtcal elements (such as percusson) to be sure at assessors wll be able to even dentfy e compressed sample. To keep a reasonable leng of an entre test sesson, we decded to pck only four muscal samples. Here s a bref lst of chosen musc sequences defned n e followng Table 5: TABLE V. THE MUSHRA TEST SEQUENES Sequence Type Artst Sound Duraton (s) Sequence Soul Steve Lvng For Wonder The ty 3 Sequence Slow Wtney A song Houston for you 5 Sequence3 Rock Kevn Barker Wor t Sequence4 Jazz Maurce Tme Tck Brown Tock 5 MUSHRA SORE ODE Fgure 3. The MUSHRA lstenng test (Sequence ). 8 6 4 DWT AA WMA OGG ODE 3Kbts/s 64Kbts/s Fgure 4. The MUSHRA lstenng test (Sequence 3) Ths test was ntended for comparson of codecs on low btrates. Our choce was 3 Kbts/s and 64 Kbts/s. The man goal of s experment was to compare sound qualty of e proposed D.W.T audo codec w oer wdely used audo codecs on very low btrates. Those codec are already mentoned above. Twenty persons took part n s subjectve test. Lsteners were mostly between 5 and 5. So ey lkely stll have a good hearng. Seven of em had a prevous experence w smlar lstenng tests. MUSHRA SORE 8 6 4 DWT AA WMA OGG ODE 3Kbts/s 64Kbts/s Fgure 5. The MUSHRA lstenng test (Sequence 4) ISSN : 975-3397 35

Khall Abd et. al. / (IJSE) Internatonal Journal on omputer Scence and Engneerng Vol., No. 4,, 34-354 The test results for each sequence are dsplayed separately n Fgures, 3, 4 and 5. As for e 3 Kbts/s btrate, e proposed D.W.T model s e best assessed codec over all sequences. As for e 64 Kbts/s btrate, e results are much more balanced and e D.W.T model s not always e best. Fgure 6 s a numercal representaton of overall test results. On btrate 64 kbts/s, ere are no sgnfcant qualty dfferences. The qualty of most codecs was evaluated as "Good". Only AA was rated as "Far". A dfferent stuatons are on e 3 Kbts/s btrate. Qualty of Ogg and AA fell down almost to "Poor", n e case of WMA to "Far". It s very nterestng at D.W.T model has a hgher ratng on btrate 3 Kbts/s an on 64 Kbts/s. It may be because e qualty of oer samples n comparson w D.W.T model was so poor at lsteners were much more generous whle gvng "marks". MUSHRA SORE 8 6 4 DWT AA WMA OGG ODE Fgure 6. The global MUSHRA lstenng test 3Kbts/s 64Kbts/s Fgure 6 s a graphcal representaton of overall test results. As mentoned above, e D.W.T model s a good choce when e fnest sound qualty at e very low btrates s requred. B. SNR Evaluaton of The proposed ODE The second part of our sound qualty evaluaton conssts n measurng e SNR correspondng to e proposed D.W.T codec whch s gven by e followng equaton []: ( S X ) SNR.log( ) (5) ( SY ) Where S X s e mean square of e speech sgnal and S Y s e mean square dfference between e orgnal and reconstructed sgnal. For s, we select 5 mono wave sgnals at 75.6kbts/s. Each one s nputed to e proposed D.W.T codec. Next we play e 5 D.W.T audo decoded sgnals whch are compared to e audo sgnals decoded by e AA, WMA and Ogg/Vorbs software decoder usng e 64kbts/s btrate. It s wdely accepted at SNR cannot truly represent audo qualty under perceptual codec and our results confrm s assumpton. The SNR results are dsplayed n e followng Fgure 7. SNR(dB) 35 3 5 5 5 DWT AA WMA OGG Sunset Bolean Rasta Sunny The flght SOUND Fgure 7. :SNR Evaluaton of e proposed D.W.T ODE n comparson to oer schemes VII. ELETRONI IMPLEMENTATION OF THE PROPOSED D.W.T ODE A. EvalHardware Descrpton of The D.W.T ODE The block dagram of e D.W.T codec s represented n e followng Fgure 8: ISSN : 975-3397 35

Khall Abd et. al. / (IJSE) Internatonal Journal on omputer Scence and Engneerng Vol., No. 4,, 34-354 Btstream Input Audo Output I²S S/P -DIF RIS DSP Host/PI Interface External Host PI Devce DEMUX MUX ROM Interface D.W.T Decoder ore D.W.T Encoder ore Audo Input Decoder Memory Interface Encoder Memory Interface GPIO ID PMU LK PLL OS SDRAM 4MByte SDRAM 4MByte I²S Btstream Output Fgure 8. Block dagram of e proposed D.W.T codec The D.W.T ODE s very flexble encoder and decoder card. It utlzes hgh speed dgtal sgnal processng for real tme D.W.T encodng or decodng of analog audo sgnals. Half duplex functonalty permts for e same channel card to be confgured to eer encode or decode. Incorporaton of e D.W.T encodng decodng algorm results n dgtal compresson of audo sgnals at reduces e requred dgtal bandwd to less an half of at requred for conventonal 64 Kbts/s PM [5] dgtzaton technques. As shown n Fgure 8, e D.W.T ODE conssts of an embedded RIS/DSP processor, a number of modules at comprse e encoder, a number of modules at comprse e decoder, and a number of I/O modules. As an audo encoder, t compresses audo data usng a propretary patent-pendng moton estmaton and rate control algorm at has been optmzed for low latency, fast scene detecton and smoo btrate allocaton. The embedded RIS/DSP processor packetzes e compressed data nto an output stream format selected by e user. The D.W.T odec accepts two channels of I S audo, and uses e audo encoder unctons n e RIS/DSP to generate compressed audo data, whch may be multplexed nto e output D.W.T audo compressed stream. As an audo decoder, e D.W.T ODE demultplexes e audo btstream nto ts audo and user data components and decompresses e audo nformaton. The output audo s avalable n I S or S/P-DIF dgtal audo format. The D.W.T ODE ncludes a power management module (PMU) and a number of I/O nterface modules such as Host/PI nterface (for an external processor, storage and oer devces), ROM nterface (for boot and oer mcrocode), General purpose nput / output (GPIO), and Phase-locked loops (audo PLL). Two dedcated hgh-speed SDRAM buses connect e RIS to e two SDRAM nterfaces. The encoder SDRAM bus s 3 bts wde, and e decoder SDRAM bus s 64 bts wde. Bo nterfaces to e external SDRAM devces are 3 bts wde. All modules n e encoder exchange data among emselves va e encoder SDRAM bus. All modules n e decoder exchange data among emselves va e decoder SDRAM bus. The audo encoder core retreves s data from SDRAM as needed. The audo encodng core reads and wrtes to e SDRAM e ntermedate and fnal results of e D.W.T encodng process. Ths compressed data s avalable to e multplexer n e RIS va e SDRAM. The audo nput unt sends ts nput data to e RIS va e encoder SDRAM. Audo compresson s done by e audo dgtal sgnal processng (DSP) extensons n e RIS. The DSP puts e compressed audo back nto SDRAM so e data wll be avalable to e multplexer n e RIS. B. Audo ompresson and odng Decodng functons have lower resource requrements. To reach prelmnary results as soon as possble, we have mplemented all e source program n language after transformaton from Matlab to w mcc command. The ISSN : 975-3397 35

Khall Abd et. al. / (IJSE) Internatonal Journal on omputer Scence and Engneerng Vol., No. 4,, 34-354 fnal code has 478 lnes, and occupes 764Kbytes of memory, ncludng nput/output lbrares from e emulaton system. Ths code s tested nto a P-based prototypng board TMS354x DSK Alternatve smulatons on fxed pont TMS3546 DSP have been tred w very few success due to e leng of smulaton tme requred for e audo D.W.T compresson. The choce of s DSP s due to e hgh performance requred for e applcaton. Ths DSP contans 8 functonal unts ncludng two multplers and two adders. We have measured on dfferent peces of compled code (.asm fles) e number of nstructons at contan concurrent executon of functonal unts. Table 6 shows e number of nstructons w concurrent executon upon e number of concurrent unts. The frst column desgns e compled fle (.asm fle) tested. Under column #Instr, e total number of nstructons executed for each compled fle s found. Each followng column transform audo codec at would gan more momentum because t can match e current PM btrate at much better audo qualty. A hgh selectvty was notced and can lead to some nterestng perspectves on audo codng usng s type of model. TABLE VI. NUMBER OF INSTRUTIONS USING ONURENT FUNTIONAL UNITS Fle. asm #nstr 3 4 5 6 PAM 5874 486 8 6 Man 36 D.W.T-Enc 49 34 74 54 4 8 D.W.T-Dec 36 88 64 54 4 8 D.W.T 8 58 76 8 6 8 FFT 437 96 54 3 7 Frame Header 553 64 4 8 3 Quantze 74 8 4 4 Note: PAM s e abbrevaton of e psychoacoustc model VIII. ONLUSION The demand for compresson technology ncreases every year n parallel w e ncrease n aggregate bandwd for e transmsson of audo sgnals. A smple audo encoder scheme based on e Dscrete Wavelet Transform and a dynamc Gammachrp psychoacoustc model s presented n s paper. D.W.T usage allows frequency dependant resoluton n psychoacoustc model and frequency dependant wndowng for quantsaton. Maskng effects may be better nvolved by such procedure and artefacts lke pre-echo may be suppressed. Our D.W.T encoder was tested usng e MUSHRA test sound and compared w oer codecs usng dfferents btrates. Ths model can acheve transparent qualty at btrate 64Kbts/s and 3Kbts/s n real tme for e monophonc D qualty audo sgnals. We currently mplement e codec n hardware. The Electronc D.W.T codec card correspondng to e Fgure 8 s shown n Fgure 9. The result should confrm ts real tme performance and all oer clams made here. We are also workng towards lower btrate (64kbts/s) dscrete wavelet Fgure 9. Electronc card of e D.W.T codec REFERENES [] ISO/IE 7-3 (F), Informaton technology - odng of movng pcture and assocated audo for dgtal storage meda at up to about.5mbts/s Part3 : Audo, 999 [] T. Irno and R. D. Paterson, A compressve gammachrp audtory flter for bo physolgcal and psychophyscal data, J. Acoust. Soc. Amer., pp. 8- [3] B..Moore, An Introducton of e Psychology of Hearng, 5 ed. Oxford, UK: Academc, 3. [4] G. Davdson, L. Felder, and M. Antll, Hgh Qualty audo transform codng at 8Kbts/s, Proc. IEEE IASSP, vol.., 99, pp. 7- [5] ITU, ITU- R BS 534. Meod for subjectve assessment of ntermedate qualty level of codng systems, [6] G. Stoll and F. Kozamernk, EBU Lstenng tests on nternet audo codecs, n EBU Techncal Revew, No.8, [7] E. Zwcker and H. Fastl, Psychoacoustcs. Facts and Models, Sprngers- Verlag, Berln, Germany, nd Edton, 999 [8] S.Mallat, A Wavelet Tour of Sgnal Processng, Academc Press, San Dego, alf, USA, nd edton,. [9] ISO/IE JT/S9/WG (MPEG), Generc odng of Movng Pctures and Assocated Audo: Advanced Audo odng, Internatonal Standard ISO/IE IS 388-7, 997. [] J. O. Sm III and J.S. Abel, Bark and ERB blnear transforms, IEEE Tran. On speech and Audo Processng, Vol. 7, No. 6, November 999. []. Wang and Y.. Tong, An mproved crtcal-band transform processor for speech applcatons, rcuts and Systems, May 4, pp 46 464. [] A.M.M.A. Najh, A.R. Raml, A. Ibrahm, and A.R. Syed, omparng speech compresson usng wavelets w oer speech compresson schemes, Student onference on Research and Development (SOReD 3), Aug 3, pp 55 58 [3] K. Oun, ontrbuton to e vocal sgnal analyss usng knowledges on e audtory percepton and multresoluton tme frequency ISSN : 975-3397 353

Khall Abd et. al. / (IJSE) Internatonal Journal on omputer Scence and Engneerng Vol., No. 4,, 34-354 representaton of e speech sgnals, PhD Thess on Electrcal Engneerng, Natonal Engneerng School of Tuns, February 3. [4] Z. Hajayej, Etude, mse en oeuvre et évaluaton des technques de paramétrsaton perceptve des sgnaux de parole. Applcaton à la reconnassance de la parole par les modèles de Markov cachés, PhD Thess on Electrcal Engneerng, Natonal Engneerng School of Tuns, October 9. [5] T. Panter and A. Spanas, Perceptuel odng of Dgtal Audo, Proceedng of e IEEE, Vol. 88, Issue 4, pp. 45-55, Avrl [6] P. Rajmc and J. Vlach, Real-tme Audo Processng Va Segmented wavelet Transform, Internatonal onference on Dgtal Audo Effect, Bordeaux, France, Sept. 7 [7] P.R. Deshmukh, Mult-wavelet Decomposton for Audo ompresson, IE(I) Journal ET, Vol 87, July 6 [8] P. Phlppe, F. M. de Sant-Martn, and M. Lever, Wavelet packet flter banks for low tme delay audo codng, IEEE Trans. on Speech and Audo Processng, Vol. 7, No. 3, pp. 3-3, May 999. [9] F. Baumgarte, A computatonally effcent cochlear flter bank for perceptual audo codng, IASSP, Vol 5, pp. 365 368 [] H.G. Musmann, Geness of e MP3 audo codng standard, IEEE Trans. on onsumer Electroncs, Vol. 5, pp. 43 49, Aug. 6 AUTHORS PROFILE K. Abd receved e B.S. degree n Electrcal Engneerng from e Natonal School of Engneerng of Tuns, (ENIT), Tunsa, n 5, and e M.S degree n Automatc and Sgnal Processng n 6 from e same school. He started preparng hs Ph.D. degree n Electrcal Engneerng n 7. Hs research nteresets n Audo ompresson Usng Multresoluton Analyss K. Oun receved a Ph. D. degree n Electrcal Engneerng from ENIT n 3. He s teatchng n Electrcal Engneerng Department at ISTMT, Tunsa. He s also a researcher at Sgnal, Image and Pattern Recognton Laboratory, Natonal Engneerng School of Tuns (ENIT) Tunsa. Hs research concern contrbuton to e vocal sgnal analyss usng knowledges on e audtory percepton and multresoluton tme frequency representaton of e speech sgnals N. Ellouze was born n 9 December, 945. He receved a Ph.D. degree n 977 at INP (Toulouse- France), and Electronc Engneerng Dploma from ENSEEIHT n 968 Unversty P. Sabater. n 978. Pr. Ellouze joned e Electrcal Engneerng Department at ENIT (Tunsa). In 99, he became Professor n sgnal processng, dgtal sgnal processng and stochastc process. He was e head of e Electrcal Department from 978 to 983 and General Manager and Presdent of IRSIT from 987-994. He s now Drector of Research Laboratory LSTS at ENIT, and s n charge of ATS Master Degree at ENIT. Pr. Ellouze has drected multple Masters and Thess and publshed more an 3 scentfc papers n journals and proceedngs, n e doman of sgnal processng, speech processng, bomedcal applcatons, pattern recognton, and man machne communcaton. ISSN : 975-3397 354