Audio Engineering Society. Convention Paper. Presented at the 119th Convention, 2005 October 7-10, New York, New York, USA




This convention paper has been reproduced from the authors' advance manuscript, without editing, corrections, or consideration by the Review Board. The AES takes no responsibility for the contents. Additional papers may be obtained by sending request and remittance to Audio Engineering Society, 60 East 42nd Street, New York, New York 10165-2520, USA; also see www.aes.org. All rights reserved. Reproduction of this paper, or any portion thereof, is not permitted without direct permission from the Journal of the Audio Engineering Society.

Multi-Channel Audio Time-Scale Modification

David Dorran (1), Robert Lawlor (2), and Eugene Coyle (3)

(1) Digital Media Centre, Dublin Institute of Technology, Aungier Street, Dublin 2, Ireland. david.dorran@dit.ie
(2) Department of Electronic Engineering, National University of Ireland, Maynooth, Co. Kildare, Ireland. rlawlor@eeng.may.ie
(3) School of Control Systems and Electrical Engineering, Dublin Institute of Technology, Dublin 8, Ireland. eugene.coyle@dit.ie

ABSTRACT

Phase vocoder based approaches to audio time-scale modification introduce a reverberant artefact into the time-scaled output. Recent techniques have been developed to reduce the presence of this artefact; however, these techniques introduce additional issues when applied to multi-channel recordings. This paper addresses these issues by collectively analysing all channels prior to time-scaling each individual channel.

1. INTRODUCTION

Time-scale modification of audio alters the duration of an audio signal whilst retaining the signal's local frequency content, resulting in the overall effect of speeding up or slowing down the perceived playback rate of a recorded audio signal without affecting its perceived pitch or timbre. There are two broad approaches used to achieve a time-scaling effect, i.e. time-domain and frequency-domain.
Time-domain algorithms, such as the synchronized overlap-add (SOLA) algorithm [1], are generally more efficient than their frequency-domain counterparts, but require the existence of a strong quasi-periodic element within the signal to be time-scaled in order to produce a high quality output. This makes them generally unsuitable for application to complex audio such as multi-pitched polyphonic music. Frequency-domain techniques, such as the phase vocoder [2] and sinusoidal modelling [3], are capable of time-scaling complex audio but introduce a reverberant/phasy artifact into the time-scaled output. This artifact is generally more objectionable in speech than in music, since music recordings typically contain a significantly higher level of reverberation than speech, so that additional reverberation introduced by time-scaling is not as noticeable.

In [4], a hybrid time-frequency domain algorithm is presented that takes advantage of certain aspects of each broad approach to realize an efficient and robust time-scaling implementation, which reduces the presence of the phasiness artifact associated with frequency-domain implementations. The hybrid implementation introduces additional considerations when applied to multi-channel recordings. This paper addresses those issues.

This paper is structured as follows: Section 2 provides an overview of SOLA; Section 3 outlines the basic operation of the improved phase vocoder [5], which makes use of sinusoidal modeling techniques to improve upon the standard phase vocoder; Section 4 discusses the phase tolerance allowed within phase vocoder implementations [6] and demonstrates how this tolerance can be used to push/pull phases back into a phase coherent state; Section 5 describes the hybrid approach, which incorporates both time-domain and frequency-domain features through manipulation of the phase tolerance identified; Section 6 addresses the issues associated with multi-channel recordings; Section 7 concludes.

2. SYNCHRONIZED OVERLAP-ADD

Time-domain algorithms operate by appropriately discarding or repeating suitable segments of the input, with the duration of these segments typically being an integer multiple of the local pitch period (when it exists). Time-domain techniques are capable of producing a very high quality output when dealing with quasi-periodic signals, such as speech, but have difficulty with more complex audio, such as multi-pitched polyphonic audio [7]. It should be noted that fewer discard/repeat segments are required the closer the desired time-scaled duration is to the original duration [7]. Therefore time-domain algorithms produce particularly high quality results for time-scale factors close to one, since significant portions of the output are directly copied, without processing, from the input.
The SOLA algorithm achieves the discard/repeat process by first segmenting the input into overlapping frames of length $N$, with each frame $S_a$ samples apart, where $S_a$ is the analysis step size. The time-scaled output $y$ is synthesized by overlapping successive frames, with each frame a distance of $S_s + \tau$ samples apart. $S_s$ is the synthesis step size, and is related to $S_a$ by $S_s = \alpha S_a$, where $\alpha$ is the time-scaling factor. $\tau$ is an offset that ensures that successive synthesis frames overlap synchronously. Figure 1 illustrates an iteration of this process, whereby an input frame is appended to the current output.

Figure 1: SOLA iteration

Standard SOLA parameters are generally fixed; however, in [8] an adaptive and efficient parameter set is derived, which is used in the hybrid implementation (section 5) and is given by

$$S_a = \frac{L_{stat} - SR}{|1 - \alpha|} \tag{1}$$

$$N = SR + \alpha \frac{L_{stat} - SR}{|1 - \alpha|} \tag{2}$$

where $L_{stat}$ is the stationary length (approx. 25-30 ms) and $SR$ is the search range over which $\tau$ is determined (approx. 12-20 ms).

3. IMPROVED PHASE VOCODER

Time-domain techniques maintain horizontal synchronization between successive frames by determining regions of similarity between the frames prior to overlap-adding; as such, time-domain techniques require the input to be suitably periodic in nature. Phase vocoder implementations operate by maintaining horizontal synchronization along subbands; such an approach removes the necessity for a quasi-periodic broadband signal. Within phase vocoder implementations it is assumed that each subband contains a quasi-sinusoidal component [2]. Standard implementations of the phase vocoder make use of uniform width filterbanks to extract the quasi-sinusoidal subbands, typically through the efficient use of a short-time Fourier transform (STFT).
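To make the STFT-based subband extraction concrete, the following is a minimal sketch in Python using only the standard library. The function names `hann`, `dft` and `stft`, the periodic Hann window and the naive O(N^2) DFT are illustrative assumptions rather than the authors' implementation; a practical system would use an FFT.

```python
import cmath
import math

def hann(length):
    """Periodic Hann window of the given length (assumed window shape)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / length)
            for n in range(length)]

def dft(frame):
    """Naive discrete Fourier transform; adequate for short illustrations."""
    size = len(frame)
    return [sum(frame[n] * cmath.exp(-2j * math.pi * k * n / size)
                for n in range(size))
            for k in range(size)]

def stft(signal, window_length, hop):
    """Return one list of complex DFT bins per analysis window."""
    window = hann(window_length)
    frames = []
    for start in range(0, len(signal) - window_length + 1, hop):
        segment = [signal[start + n] * window[n] for n in range(window_length)]
        frames.append(dft(segment))
    return frames
```

Each DFT bin of each returned frame plays the role of one uniform-width subband sample; the magnitude and phase of a bin are obtained with `abs` and `cmath.phase`.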

Horizontal synchronization (or horizontal phase coherence [5]) is maintained at a subband level by ensuring that the expected phase of each sinusoidal component follows the sinusoidal phase propagation rule, i.e.

$$\phi_2 = \phi_1 + \omega (t_2 - t_1) \tag{3}$$

where $\phi_1$ is the instantaneous phase at time $t_1$, $\omega$ is the frequency of the sinusoidal component, and $\phi_2$ is the expected phase of the sinusoidal component at time $t_2$. During time-scale modification, magnitude values of the sinusoidal subband components are simply interpolated or decimated to the desired duration. In [9] time-scale expansion is achieved by appropriately repeating STFT windows, e.g. to time-scale by a factor of 1.5 every second window is repeated; similarly, time-scale compression is achieved by omitting windows, e.g. to time-scale by a factor of 0.9 every tenth analysis window is omitted. The phase propagation formula of equation (3) is then applied to each subband (or discrete Fourier transform (DFT) bin), from window to window.

In [5] it is recognized that not all subbands are true sinusoidal components, and some are essentially interference terms introduced by the windowing process of the STFT analysis. [5] notes that applying the phase propagation rule to these interference terms results in a loss of vertical phase coherence between subbands, which introduces a reverberant or phasy artifact into the time-scaled output. The solution to this problem is to identify true sinusoidal components through a magnitude spectrum peak picking procedure and to apply the phase propagation rule to these components only. The phases of the subband components in the region of influence of a peak/sinusoidal subband are updated in such a manner as to preserve the original phase relationships [5]. Whilst [5] results in improved vertical phase coherence between a true sinusoidal component and its neighboring interference components, it does not attempt to maintain the original phase relationships that exist between true sinusoidal components.
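The phase propagation rule of equation (3) and the peak picking step can be sketched as follows. This is a minimal illustration, not the procedure of [5]: the names `princarg`, `find_peaks` and `propagate_phase` are hypothetical, and the peak criterion used here (a bin larger than its two neighbours on either side) is an assumption, since the text does not specify one.

```python
import math

def princarg(phase):
    """Wrap a phase value into the interval (-pi, pi]."""
    wrapped = math.fmod(phase + math.pi, 2.0 * math.pi)
    if wrapped <= 0.0:
        wrapped += 2.0 * math.pi
    return wrapped - math.pi

def find_peaks(magnitudes):
    """Indices of bins larger than their two neighbours on each side
    (assumed criterion for a true sinusoidal component)."""
    peaks = []
    for k in range(2, len(magnitudes) - 2):
        neighbours = magnitudes[k - 2:k] + magnitudes[k + 1:k + 3]
        if all(magnitudes[k] > value for value in neighbours):
            peaks.append(k)
    return peaks

def propagate_phase(phi_1, omega, t_1, t_2):
    """Equation (3): expected phase at t_2 of a sinusoid with angular
    frequency omega (rad/s) whose phase was phi_1 at t_1."""
    return princarg(phi_1 + omega * (t_2 - t_1))
```

Only the bins returned by `find_peaks` would be advanced with `propagate_phase`; the remaining bins follow their peak so that the original phase relationships are kept.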
The loss of phase coherence between true sinusoidal components also results in the introduction of reverberation. This problem is addressed in the literature, whereby the phase relationship or relative phase difference between harmonically related components of a harmonic signal is maintained through various techniques, e.g. [9-11]. These approaches, however, require the determination of the local pitch period. Whilst the techniques of [9-11] attempt to maintain vertical phase coherence through the manipulation of the phase values of harmonically related sinusoidal components, time-domain approaches implicitly maintain vertical phase coherence by virtue of the fact that the broadband signal is not partitioned into subbands.

4. PHASE FLEXIBILITY WITHIN PHASE VOCODER

In [6] it is shown that displacing the horizontal phase of a pure sinusoidal component from its ideal/expected value, within a window of the phase vocoder, results in a certain amount of amplitude and frequency modulation being introduced into the sinusoidal component. Furthermore, in [6] it is shown, through a psychoacoustic analysis, that if the phase deviation introduced is less than a particular value, the amplitude and frequency modulations will not be perceived. The phase deviation that is perceptually tolerated is dependent on the hop size and window length of the STFT. From [6], the maximum phase deviation tolerated, $\theta$, for a 50% analysis window overlap is

$$\theta = \min\{0.5676,\ 2\arctan(3.6L)\}\ \text{radians} \tag{4}$$

where $L$ is the duration of the analysis window in seconds. The derivation of the equivalent equation for a 75% overlap is somewhat verbose and follows the methodology outlined in [6]; for convenience, the result is provided here. The maximum phase deviation tolerated, $\theta$, is given by

$$\theta = \min\{0.27,\ 2\arcsin(2.53L)\}\ \text{radians} \tag{5}$$

It should be noted that (5) is an approximation, valid within 0.2% for values of $\theta$ less than 0.27 radians.
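Equations (4) and (5) translate directly into code. The sketch below is illustrative only; the function name `max_phase_deviation` and the clamping of the arcsine argument to 1 (so that very long windows do not leave the domain of `asin`) are assumptions added here.

```python
import math

def max_phase_deviation(window_seconds, overlap=0.5):
    """Maximum tolerated phase deviation (radians) for an analysis window
    of the given duration, for 50% or 75% window overlap."""
    if overlap == 0.5:
        # Equation (4)
        return min(0.5676, 2.0 * math.atan(3.6 * window_seconds))
    if overlap == 0.75:
        # Equation (5); the clamp keeps asin inside its domain (assumption)
        argument = min(1.0, 2.53 * window_seconds)
        return min(0.27, 2.0 * math.asin(argument))
    raise ValueError("only 50% and 75% overlap are covered by (4) and (5)")
```

For a 60 ms window, for instance, the 50% case is governed by the arctangent term (roughly 0.43 radians), whereas the 75% case is capped at 0.27 radians.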
[6] also shows how the phase tolerance can be used to push or pull a modified STFT representation into a phase coherent state; the basic principle is briefly explained as follows. Consider the situation illustrated in Figure 2; assume that the phases of synthesis window 1 are equal to those of analysis window 1; the phases of the repeated synthesis window 2 are then determined such that horizontal phase coherence is maintained between true sinusoidal components (peaks), whilst phases of neighboring components are updated so as to maintain vertical phase coherence. Horizontal phase coherence between the peaks of synthesis windows 1 and 2 can be preserved by keeping the same phase difference between them that exists between analysis windows 1 and 2 [9]; then synthesis window 1 comprises the magnitudes and phases of analysis window 1 (and is therefore perfectly phase coherent), whilst synthesis window 2 comprises the magnitudes of analysis window 1 and a set of phases close to those of analysis window 2 (and is therefore generally not perfectly phase coherent). It follows that, in general, synthesis window $n$ comprises the magnitudes of analysis window $n-1$ and phases close to those of analysis window $n$, for all windows up to the next discard/repeat frame.

In [6] the synthesis phase values of synthesis window $n$ are pushed or pulled toward the phase values of analysis window $n-1$ using the horizontal phase tolerance established. Once the phases of window $n$ equal the target phases of analysis window $n-1$, perfect phase coherence is restored. It follows that subsequent windows up to the next discard/repeat window will also be perfectly phase coherent. From Figure 2, once phase coherence is realized (at synthesis window 7 in Figure 2), there is no need for further frequency-domain processing and a segment of the original time-domain input can simply be inserted into the output, in a similar manner to time-domain implementations, as shown in Figure 2. This has the added benefit of reducing the computational cost whilst bringing the time-scaled output into a phase coherent state.

This process requires that a certain number of windows exist before the next discard/repeat operation; for example, given a phase tolerance of 0.314 (i.e. $\pi/10$) radians, perfect phase coherence is assured to be established for time-scale factors between 0.9 and 1.1, since phase values can be at most $\pm\pi$ radians from perfect phase coherence. It should be noted that if the phase values of synthesis window 2 were close to those of analysis window 1 then perfect phase coherence would be established quickly; the following section addresses this issue by making use of time-domain techniques to identify good initial phase values, thereby reducing the transition time to perfect phase coherence.

Figure 2: Time-scaling process

5. HYBRID IMPLEMENTATION

The original motivation behind the SOLA algorithm [1] was to provide an initial set of phase estimates for the reconstruction of a magnitude-only STFT representation of a signal. The same principle is used here to provide a set of phase estimates for use within the procedure outlined in section 4. The remainder of this section describes the approach used to determine the initial phase estimates and their use within the hybrid implementation.

Consider the situation shown in Figure 3, in which a frame extracted from the input is shown overlapping with the current output. As with the standard SOLA implementation, the overlap shown is determined through the use of a correlation function. For the $m$-th iteration of the algorithm the offset $\tau$ is chosen such that the correlation function $R_m(\tau)$, given by

$$R_m(\tau) = \frac{\sum_{j=0}^{L-1} y(mS_s + \tau + j)\, x(mS_a + j)}{\sqrt{\sum_{j=0}^{L-1} x^2(mS_a + j) \sum_{j=0}^{L-1} y^2(mS_s + \tau + j)}} \tag{6}$$

is a maximum for $\tau = \tau_m$, where $x$ is the input signal, $y$ is the time-scaled output, $L$ is the length of the overlapping region and $\tau$ is in the range $0 < \tau < \tau_{max}$, where $\tau_{max}$ is typically the number of samples which equates to approximately 20 ms. $S_a$ and $S_s$ are defined in section 2. The optimum frame overlap $L_{ov}$ shown in Figure 3 is then given by

$$L_{ov} = N - S_s - \tau_m \tag{7}$$

where $N$ is the frame length, defined in section 2.
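The parameter set of equations (1) and (2), the offset search of equation (6) and the overlap of equation (7) can be sketched together as below. This is a minimal illustration under stated assumptions: the function names are hypothetical, the default values of 25 ms and 12 ms are taken from the low end of the ranges quoted in section 2, and the correlation length is held fixed for every candidate offset, which simplifies the overlapping region described in the text. The caller is assumed to supply signals long enough for every index used.

```python
import math

def sola_parameters(alpha, sample_rate, l_stat_ms=25.0, sr_ms=12.0):
    """Equations (1) and (2): return (S_a, S_s, N) in samples."""
    if alpha <= 0.0 or alpha == 1.0:
        raise ValueError("alpha must be positive and not equal to one")
    l_stat = l_stat_ms * sample_rate / 1000.0    # stationary length, samples
    search = sr_ms * sample_rate / 1000.0        # search range SR, samples
    s_a = (l_stat - search) / abs(1.0 - alpha)   # equation (1)
    n = search + alpha * s_a                     # equation (2)
    return round(s_a), round(alpha * s_a), round(n)

def best_offset(x, y, m, s_a, s_s, length, tau_max):
    """Return the offset in 0 < tau < tau_max that maximises equation (6)."""
    best_tau, best_score = 1, -math.inf
    for tau in range(1, tau_max):
        cross = energy_x = energy_y = 0.0
        for j in range(length):
            x_j = x[m * s_a + j]
            y_j = y[m * s_s + tau + j]
            cross += x_j * y_j
            energy_x += x_j * x_j
            energy_y += y_j * y_j
        norm = math.sqrt(energy_x * energy_y)
        score = cross / norm if norm > 0.0 else 0.0
        if score > best_score:
            best_tau, best_score = tau, score
    return best_tau

def overlap_length(n, s_s, tau):
    """Equation (7): optimum frame overlap in samples."""
    return n - s_s - tau
```

For example, with a time-scale factor of 1.1 at a 16 kHz sample rate and the default values above, the sketch gives an analysis step of 2080 samples, a synthesis step of 2288 samples and a frame length of 2480 samples.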

Figure 3: Hybrid iteration

Also shown in Figure 3, below the input frame, are the synthesis windows and the synthesis frame; it is this synthesis frame which is appended to the current output within the hybrid approach and not the input frame, as is the case in SOLA. The following details the generation of the synthesis frame.

Window $b$ is first extracted from the output $y$ and is positioned such that it has its center at the center of the optimum overlap, as shown in the diagram. More specifically, for the $m$-th iteration of the algorithm, window $b$ is given by

$$b(j) = y(mS_s + \tau_m + L_{ov}/2 - L/2 + j)\, w(j) \quad \text{for } 0 < j \le L \tag{8}$$

where $w$ is the STFT analysis window, typically Hanning, and $L$ is the STFT window length, typically the number of samples which equates to approximately 60 ms. (Both shorter and longer windows have been proposed in the literature; however, 60 ms was found to be suitable for an implementation which is intended to cater for both speech and a wide range of polyphonic music.) The window $f_1$ is extracted from the input $x$ and is positioned such that it is aligned with window $b$. Subsequent windows are sequentially spaced by the STFT hop size $H$. More specifically, for the $m$-th iteration of the algorithm, window $f_n$ is given by

$$f_n(j) = x(mS_a + L_{ov}/2 + H(n-1) - L/2 + j)\, w(j) \quad \text{for } 0 < j \le L \tag{9}$$

$F'_1$, the DFT representation of the first synthesis window, is then derived using the magnitudes of $F_1$ and the phase values of $B$, where $F_n$ and $B$ are the DFT representations of $f_n$ and $b$, respectively; then

$$F'_1(k) = |F_1(k)| \exp(i \angle B(k)) \quad \text{for all } k \text{ in the set } P_1 \tag{10}$$

where $P_1$ is the set of peak bins found in $F_1$. All other bins are updated so as to maintain the original phase difference between a peak and bins in its region of influence, as described in [5]. The phase values of STFT window $B$ are chosen since they provide a set of phase values that naturally follow the window labeled $a$ in Figure 3 and therefore maintain horizontal phase coherence.
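Equation (10) and the accompanying update of the non-peak bins can be sketched as follows. The function name `first_synthesis_window` is hypothetical, and assigning each bin to its nearest peak is an assumed, simplified definition of the region of influence; [5] should be consulted for the definition actually used.

```python
import cmath

def first_synthesis_window(f1, b, peaks):
    """Build the first synthesis spectrum from the analysis spectrum f1,
    the spectrum b of the output window, and the peak bin indices of f1."""
    if not peaks:
        return list(f1)
    out = []
    for k in range(len(f1)):
        # Nearest peak defines the region of influence (assumption).
        peak = min(peaks, key=lambda p: abs(p - k))
        # Rotation that gives the peak the phase of b, as in equation (10).
        rotation = cmath.phase(b[peak]) - cmath.phase(f1[peak])
        # Applying the same rotation to neighbouring bins preserves their
        # original phase difference relative to the peak.
        out.append(f1[k] * cmath.exp(1j * rotation))
    return out
```

At a peak bin the result has the magnitude of `f1` and the phase of `b`; every other bin keeps its magnitude and its original phase offset from the peak it follows.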
Subsequent synthesis windows are derived from

$$F'_n(k) = |F_n(k)| \exp\big(i(\angle F'_{n-1}(k) + \angle F_n(k) - \angle F_{n-1}(k) + D(k))\big) \tag{11}$$

for all $k$ in the set $P_n$, where $P_n$ is the set of peak bins found in $F_n$. As above, all other bins are updated so as to maintain the original phase difference between a peak and bins in its region of influence. For the hybrid case, perfect phase coherence is achieved when synthesis STFT window $F'_n$ has the magnitude and phase values of window $F_n$. $D$ is the phase deviation which is used to push or pull the frames into a phase coherent state. $D$ is dependent on the bin number, denoted by $k$, and is given by

$$D(k) = \Delta\phi(k) \quad \text{if } |\Delta\phi(k)| \le \theta \tag{12}$$

or

$$D(k) = \mathrm{sign}(\Delta\phi(k))\,\theta \quad \text{if } |\Delta\phi(k)| > \theta \tag{13}$$

where $\Delta\phi(k)$ is the principal argument (princarg) of the difference between the target phase $\angle F_n(k)$ and the phase that equation (11) yields with $D(k) = 0$, and $\theta$ is the maximum phase tolerance (see section 4).

The number of synthesis STFT windows required is such that an inverse STFT on these windows results in a synthesis frame of duration $N + 3L/2$. This is to ensure that window $b$ is available for the next iteration of the algorithm. It should be noted that the number of synthesis windows also controls the ability of the algorithm to recover phase coherence; if $N$ is large (which is the case when $\alpha$ is close to one, see equation (2)) phase coherence is recovered more easily.

The synthesis frame $x'$ is obtained through the application of an inverse STFT on windows $F'_1, F'_2, F'_3, \ldots$ The output $y$ is then updated by

$$y(mS_s + \tau_m + L_{ov}/2 - L/2 + j) := E(j)\, y(mS_s + \tau_m + L_{ov}/2 - L/2 + j) + x'(j) \quad \text{for } 0 < j \le L - H \tag{14}$$

$$y(mS_s + \tau_m + L_{ov}/2 - L/2 + j) = x'(j) \quad \text{for } L - H < j \le N + 3L/2 \tag{15}$$

where := in equation (14) means "becomes equal to" and $E$ is an envelope function which ensures that the output $y$ sums to a constant during the overlap-add procedure.
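The push/pull term $D$ of equations (12) and (13) amounts to clamping the remaining phase error to the tolerance. The sketch below illustrates this for a single bin and counts how many synthesis windows are needed before the error vanishes; the function names are hypothetical, and the small numerical threshold used to detect a zero error is an assumption of this illustration.

```python
import math

def princarg(phase):
    """Wrap a phase value into the interval (-pi, pi]."""
    wrapped = math.fmod(phase + math.pi, 2.0 * math.pi)
    if wrapped <= 0.0:
        wrapped += 2.0 * math.pi
    return wrapped - math.pi

def phase_deviation(target_phase, unpushed_phase, theta):
    """Equations (12)-(13): push/pull term D for one bin, where
    unpushed_phase is the synthesis phase obtained with D = 0."""
    error = princarg(target_phase - unpushed_phase)
    if abs(error) <= theta:
        return error                      # equation (12): close the gap fully
    return math.copysign(theta, error)    # equation (13): move by theta only

def windows_to_coherence(initial_error, theta):
    """Number of synthesis windows before the phase error reaches zero."""
    error, count = princarg(initial_error), 0
    while abs(error) > 1e-12:
        error -= phase_deviation(error, 0.0, theta)
        count += 1
    return count
```

With a tolerance of $\pi/10$ and the worst-case initial error of $\pi$, the sketch needs ten windows, which agrees with the observation in section 4 that coherence is assured for time-scale factors between 0.9 and 1.1.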

The envelope function $E$ is dependent on the STFT hop size $H$ and on whether a synthesis window is employed during the inverse STFT procedure. For the case where a synthesis window is employed, which is equal to the analysis Hanning window $w$, and $H = L/4$,

$$E(j) = w^2(H + j) + w^2(2H + j) + w^2(3H + j) \quad \text{for } 0 < j \le L - H \tag{16}$$

It should be noted that for the case where the input is perfectly periodic, the initial phase estimates provided by STFT window $B$ are assured to be equal to the target phase values of window $F_1$ and the time-scaled output is always perfectly phase coherent. For quasi-periodic signals, such as speech, the initial phase estimates are generally close to the target phase, and the transition period to perfect phase coherence is generally short. For the case where more complex audio is being time-scaled, the transition to perfect phase coherence is relatively long; nevertheless, the reverberant artifact introduced, due to the loss of perfect phase coherence, is perceptually less objectionable in these types of signals, due to the reverberation level generally already present. The hybrid approach described does, however, have the benefit of noticeably reducing the effects of transient smearing without the necessity of explicit transient detection. As with time-domain implementations, the quality and efficiency improvements offered by the hybrid approach over frequency-domain approaches are most noticeable for time-scaling factors close to one, with results being particularly good for factors in the range 0.8 to 1.2.

6. CONSIDERATIONS FOR MULTI-CHANNEL RECORDINGS

In [9] the implications of applying a phase vocoder based time-scale modification algorithm to stereo recordings are outlined. [9] maintains the stereo image by ensuring that both magnitude and phase differences between related channel components are preserved.
Magnitude differences are maintained within standard phase vocoder implementations if the same parameters are used to time-scale each channel, whilst phase differences are explicitly maintained. Within the hybrid implementation, segments of different duration could be discarded/repeated from each channel if the channels are time-scaled separately, even if the same algorithm parameters are applied to each channel. This could result in an alteration of the stereo image, since magnitude differences between channels are unlikely to be maintained. The solution to this potential problem is to sum the channels before applying the correlation function of equation (6). The offset identified, by finding the maximum of the correlation function, is then applied to both channels for each iteration of the algorithm. Phase differences are preserved between peaks, at the same bin location, between channels, by first updating the peak with the greater magnitude in the manner described earlier; the peak with the lesser magnitude is updated so as to preserve the original phase relationship. Bins in the region of influence of a peak are updated in the usual manner.

7. CONCLUSIONS

In [4] a robust and efficient hybrid time-scaling algorithm is developed; the approach draws upon features from existing time-domain and frequency-domain time-scaling implementations. The hybrid approach introduces difficulties when applied to multi-channel audio; this issue is addressed in this paper.

8. ACKNOWLEDGMENTS

The authors wish to express their gratitude to Dan Barry for his fruitful discussions during the development of the algorithms.

9. REFERENCES

[1] S. Roucos, A.M. Wilgus, "High quality time-scale modification for speech", IEEE Int. Conf. on Acoustics, Speech and Signal Processing, pp. 493-496, 1985.
[2] M. Dolson, "The phase vocoder: A tutorial", Computer Music Journal, vol. 10, pp. 14-27, 1986.
[3] R. McAulay, T. Quatieri, "Speech analysis/synthesis based on a sinusoidal representation", IEEE Trans. on Acoustics, Speech and Signal Processing, vol. 34(4), pp. 744-754, 1986.
[4] D. Dorran, R. Lawlor, E. Coyle, "Audio time-scale modification using a hybrid time-frequency domain approach", accepted for publication at the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2005.
[5] J. Laroche, M. Dolson, "Improved phase vocoder time-scale modification of audio", IEEE Transactions on Speech and Audio Processing, vol. 7(3), pp. 323-332, 1999.
[6] D. Dorran, R. Lawlor, E. Coyle, "An efficient phasiness reduction technique for moderate audio time-scale modification", Proc. of DAFX-04, pp. 83-88, 2004.
[7] J. Laroche, "Autocorrelation method for high-quality time/pitch-scaling", IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pp. 131-134, 1993.
[8] D. Dorran, R. Lawlor, "An efficient time-scale modification algorithm for use within a subband implementation", Proc. of DAFX-03, pp. 339-343, 2003.
[9] J. Bonada, "Automatic technique in frequency domain for near-lossless time-scale modification of audio", Proc. of International Computer Music Conference, 2000.
[10] T. Quatieri, R. McAulay, "Shape invariant time-scale and pitch-scale modification of speech", IEEE Transactions on Signal Processing, vol. 40(3), pp. 497-510, 1992.
[11] R. Di Federico, "Waveform preserving time stretching and pitch shifting for sinusoidal models of sound", Proc. of DAFX-98, pp. 44-48, 1998.
[12] J. Laroche, "Frequency-domain techniques for high quality voice modification", Proc. of DAFX-03, pp. 328-322, 2003.
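Returning to the multi-channel strategy of Section 6, the two steps (a single offset found on the channel sum, and a phase update led by the stronger peak) can be sketched as below. This is an illustration under assumptions rather than the authors' code: the function names are hypothetical, both the input channels and the output channels are summed before correlation, the correlation length is fixed, and the two-channel peak update is written for one bin at a time.

```python
import cmath
import math

def shared_offset(x_channels, y_channels, m, s_a, s_s, length, tau_max):
    """Find one offset for all channels by evaluating the normalised
    correlation of equation (6) on the sums of the channels."""
    x_sum = [sum(samples) for samples in zip(*x_channels)]
    y_sum = [sum(samples) for samples in zip(*y_channels)]
    best_tau, best_score = 1, -math.inf
    for tau in range(1, tau_max):
        cross = energy_x = energy_y = 0.0
        for j in range(length):
            x_j = x_sum[m * s_a + j]
            y_j = y_sum[m * s_s + tau + j]
            cross += x_j * y_j
            energy_x += x_j * x_j
            energy_y += y_j * y_j
        norm = math.sqrt(energy_x * energy_y)
        score = cross / norm if norm > 0.0 else 0.0
        if score > best_score:
            best_tau, best_score = tau, score
    return best_tau

def update_peak_pair(bin_a, bin_b, new_phase):
    """Give the stronger of two same-index peak bins the phase new_phase
    and rotate the weaker bin by the same angle, so that the original
    inter-channel phase difference and magnitudes are kept."""
    dominant = bin_a if abs(bin_a) >= abs(bin_b) else bin_b
    rotation = cmath.exp(1j * (new_phase - cmath.phase(dominant)))
    return bin_a * rotation, bin_b * rotation
```

Because the same offset is returned for every channel, identical segments are discarded or repeated in each channel, which is the property the section relies on to keep the stereo image intact.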