TEMPO AND BEAT ESTIMATION OF MUSICAL SIGNALS
Miguel Alonso, Bertrand David, Gaël Richard
ENST-GET, Département TSI
46, rue Barrault, Paris cedex 13, France

ABSTRACT

Tempo estimation is fundamental in automatic music processing and in many multimedia applications. This paper presents an automatic tempo tracking system that processes audio recordings and determines the beats per minute and temporal beat locations. The concept of spectral energy flux is defined and leads to an efficient note onset detector. The algorithm involves three stages: a front-end analysis that efficiently extracts onsets, a periodicity detection block and the temporal estimation of beat locations. The performance of the proposed method is evaluated using a large database of 489 excerpts from several musical genres. The global recognition rate is 89.7%. Results are discussed and compared to those of other tempo estimation systems.

Keywords: beat, tempo, onset detection.

1. INTRODUCTION

It is very difficult to understand western music without perceiving beats, since a beat is a fundamental unit of the temporal structure of music [4]. For this reason, automatic beat tracking, or tempo tracking, is an essential task for many applications such as musical analysis, automatic rhythm alignment of multiple musical instruments, cut-and-paste operations in audio editing, and beat-driven special effects. Although it might appear simple at first, tempo tracking has proved to be a difficult task when dealing with a broad variety of musical genres, as shown by the large number of publications on this subject in recent years [2, 5, 6, 8, 9, 10, 12]. Earlier tempo tracking approaches focused on MIDI or other symbolic formats, where note onsets are already available to the estimation algorithm. More recent approaches deal directly with ordinary CD audio recordings.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. © 2004 Universitat Pompeu Fabra.

The system that we present in this paper falls into this category. For musical genres with a straightforward rhythm, such as rap, rock and reggae, where a strong percussive strike drives the rhythm, current beat trackers achieve high performance, as pointed out in [5, 9, 12]. However, the robustness of beat tracking systems is often much less assured when dealing with classical music, because of the weakness of the techniques employed in attack detection and the tempo variations inherent to that kind of music. In the present article, we describe an algorithm to estimate the tempo of a piece of music (in beats per minute, or BPM) and to identify the temporal locations of the beats. Like most of the systems available in the literature, this algorithm relies on a classical scheme: a front-end processor extracts the onset locations from a time-frequency or subband analysis of the signal, traditionally using a filter bank [1, 7, 10, 12] or the discrete Fourier transform [3, 5, 6, 8, 9]. Then, a periodicity estimation algorithm finds the rate at which these events occur. A large variety of methods has been used for this purpose, for example: a bank of oscillators that resonate at integer multiples of their characteristic frequency [6, 9, 12], pitch detection methods [1, 10], histograms of the inter-onset intervals [2, 13], and probabilistic approaches such as a Gaussian mixture model expressing the likelihood of the onset locations [8]. In this paper, following Laroche's approach [9], we define a quantity called the spectral energy flux as the derivative of the signal frequency content with respect to time.
Although this principle has been used before [3, 6, 8, 9], a significant improvement is obtained here by using an optimal filter to approximate this derivative. We exploit this approach to obtain a high-performance onset detector and integrate it into a tempo tracking algorithm. We demonstrate the usefulness of this approach by validating the proposed system on a large, manually annotated database that contains excerpts from rock, latin, pop, soul, classical, rap/hip-hop and other genres.

The paper is organized as follows: Section 2 provides a detailed description of the three main stages that compose the system. In Section 3, test results are provided and compared to other methods; the system parameters used during the
validation procedure are given, as well as comments on the issues of the algorithm. Finally, Section 4 summarizes the achievements of the presented algorithm and discusses possible directions for further improvement.

Figure 1. Architecture of the beat tracking algorithm: spectral analysis (STFT) and spectral energy flux yield a detection function, followed by periodicity estimation and beat location, producing a synchronised beat signal from the audio signal.

2. DESCRIPTION OF THE ALGORITHM

In this paper, it is assumed that the tempo of the audio signal is constant over the duration of the analysis window and that it evolves only slowly from one window to the next. In addition, we suppose that the tempo lies between 60 and 200 BPM; this entails no loss of generality, since any other value can be mapped into this range. The proposed algorithm is composed of three major steps (see Figure 1):

- onset detection: a detection function is computed, based on the spectral energy flux of the input audio signal;
- periodicity estimation: the periodicity of the detection function is estimated using pitch detection techniques;
- beat location estimation: the positions of the corresponding beats are obtained from the cross-correlation between the detection function and an artificial pulse-train.

2.1. Onset detection

The aim of onset detection is to extract a detection function that indicates the locations of the most salient features of the audio signal, such as note changes, harmonic changes and percussive events. These events are particularly important in the beat perception process. Note onsets are easily masked in the overall signal energy by continuous tones of higher amplitude [9], while they are more likely to be detected after separating the signal into frequency channels.
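The tempo-range assumption above relies on any tempo being mappable into 60–200 BPM by repeated doubling or halving, which works because the range spans more than one octave. A minimal sketch of that mapping (`fold_tempo` is a hypothetical helper name, not from the paper):

```python
def fold_tempo(bpm, lo=60.0, hi=200.0):
    """Fold a tempo into [lo, hi) by octave doubling/halving.
    Terminates because hi > 2 * lo, so halving from >= hi
    cannot undershoot lo."""
    while bpm < lo:
        bpm *= 2.0
    while bpm >= hi:
        bpm /= 2.0
    return bpm
```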
We propose to follow a frequency-domain approach [3, 5, 6, 8, 9], as it proves to outperform time-domain methods based on direct processing of the temporal waveform as a whole.

2.1.1. Spectral analysis and spectral energy flux

The input audio signal is analyzed using a decimated version of the short-time Fourier transform (STFT), i.e., short signal segments are extracted at regular time intervals, multiplied by an analysis window and transformed into the frequency domain by means of a Fourier transform. This leads to

X(f, m) = \sum_{n=0}^{N-1} w(n) x(n + mM) e^{-j 2\pi f n}    (1)

where x(n) denotes the audio signal, w(n) the finite analysis window of size N in samples, M the hop size in samples, m the frame index and f the frequency.

Motivated by the work of Laroche [9], we define the spectral energy flux E(f, k) as an approximation to the derivative of the signal frequency content with respect to time:

E(f, k) = \sum_m h(m - k) G(f, m)    (2)

where h(m) approximates a differentiator filter, i.e.

H(e^{j 2\pi f}) \approx j 2\pi f    (3)

and the transformation

G(f, m) = F{ |X(f, m)| }    (4)

is obtained via a two-step process: a low-pass filtering of |X(f, m)| that performs energy integration in a way similar to the auditory system, emphasizing the most recent inputs but masking rapid modulations [14], followed by a non-linear compression. For example, in [9] Laroche proposes h(m) as a first-order differentiator (h = [1; -1]); no low-pass filtering is applied and the non-linear compression function is G(f, m) = arcsinh(|X(f, m)|). In [6], Klapuri uses the same first-order differentiator, but performs the low-pass filtering after applying a logarithmic compression function. In the present work, we propose h(m) to be a FIR differentiator filter. Such a filter is designed by a Remez optimisation procedure, which leads to the best approximation to Eq. (3) in the minimax sense [11].
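A minimal sketch of this front end, assuming scipy is available; the length of the half-Hanning smoothing window is an assumption (the paper does not give it), and the STFT parameters follow the values quoted later in Section 3.2:

```python
import numpy as np
from scipy import signal

def spectral_energy_flux(x, fs=16000, n_win=64, hop=32, n_fft=128, order=8):
    """Onset front end of Section 2.1.1: STFT magnitude -> per-channel
    smoothing (second half of a Hanning window) -> log compression ->
    Remez-designed FIR differentiator along time -> sum of positive parts."""
    _, _, X = signal.stft(x, fs=fs, window='hann', nperseg=n_win,
                          noverlap=n_win - hop, nfft=n_fft)
    mag = np.abs(X)                              # |X(f, m)|
    lp = np.hanning(21)[10:]                     # half window; length assumed
    lp /= lp.sum()
    smooth = np.apply_along_axis(lambda r: np.convolve(r, lp, mode='same'),
                                 1, mag)
    G = np.log1p(smooth)                         # logarithmic compression
    # FIR differentiator approximating H(f) ~ j*2*pi*f (Eq. 3), minimax design.
    h = signal.remez(order + 1, [0.0, 0.45], [1.0],
                     type='differentiator', fs=1.0)
    E = signal.lfilter(h, 1.0, G, axis=1)        # spectral energy flux (Eq. 2)
    return np.maximum(E, 0.0).sum(axis=0)        # v(k): sum positive parts
```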
Compared to the first-order difference used in [6, 8, 9], this new approach greatly improves the extraction of musically meaningful features such as percussive attacks and chord changes. In addition, G(f, m) is obtained via low-pass filtering with the second half of a Hanning window followed by a logarithmic compression function, as suggested by Klapuri [7], since the logarithmic difference gives the amount of change in a signal relative to its absolute level. This is a psycho-acoustically relevant measure, since the perceived signal amplitude relates to its level, the same amount of increase being more prominent in a quiet signal [7]. During system development, several orders for the differentiator filter h(m) were tested; an order-8 filter proved to be the best tradeoff between complexity and efficiency.

In practice, the algorithm uses an N-point FFT to evaluate the STFT; the frequency channels 1 to N/2 of the signal's time-frequency representation are then filtered using h(m) to obtain the spectral energy flux. All the positive contributions of these channels are summed to produce a temporal waveform v(k) that exhibits sharp maxima at transients and note onsets, i.e., at those instants where the energy flux is large.

Beats tend to occur at note onsets, so we must first distinguish the true onset peaks from the spurious ones in v(k) to obtain a proper detection function p(k). We work under the assumption that these unwanted peaks are much smaller in amplitude than the note-attack peaks. Thus, a peak-picking algorithm that selects peaks above a dynamic threshold computed with the help of a median filter is a simple and efficient solution to this problem. The median filter is a nonlinear technique that computes the pointwise median inside a window of length 2i + 1 formed by a subset of v(k); the median threshold curve is thus given by

θ(k) = C · median(g_k)    (5)

where g_k = {v(k − i), ..., v(k), ..., v(k + i)} and C is a predefined scaling factor that raises the threshold curve slightly above the steady-state level of the signal. To ensure accurate detection, the length of the median filter must be longer than the average width of the peaks of the detection function; in practice, we set it to 200 ms. We then form the signal p̂(k) = v(k) − θ(k), which is half-wave rectified to produce the detection function p(k):

p(k) = p̂(k) if p̂(k) > 0, and 0 otherwise.    (6)

In our tests, the onset detector described above has proved to be a robust scheme that provides good results for a wide range of musical instruments and attacks at a relatively low computational cost. For example, Figure 2-a shows the time waveform of a piano recording containing seven attacks; these attacks can be easily observed in the signal's spectrogram in Figure 2-b.
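The dynamic thresholding of Eqs. (5)-(6) can be sketched as follows, using scipy's median filter; i = 25 and C = 2 are the values quoted later in Section 3.2:

```python
import numpy as np
from scipy.ndimage import median_filter

def detection_function(v, i=25, C=2.0):
    """Eqs. (5)-(6): theta(k) is C times the median of v over a
    (2i+1)-sample window; p(k) keeps only the positive excess
    v(k) - theta(k) (half-wave rectification)."""
    theta = C * median_filter(v, size=2 * i + 1, mode='nearest')
    return np.maximum(v - theta, 0.0)
```

Because the median of a window dominated by the quiet background is near zero, isolated attack peaks survive the subtraction while small spurious bumps are suppressed.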
Figure 2-c can be interpreted physically as the rate at which the frequency-content energy of the audio signal varies at a given time instant, i.e., the spectral energy flux. In this example, seven vertical stripes mark the reinforcement of the energy variation, clearly indicating the location of the attacks (the position of the spectrogram edges). When all the positive energy variations are summed across frequency and thresholded, we obtain the detection function p(k) shown in Figure 2-d. An example of an instrument with smooth attacks, a violin, is shown in Figure 3. Large energy variations in the frequency content of the audio signal can still be observed as vertical stripes in Figure 3-c. After summing the positive contributions, six of the seven attacks are properly detected, as shown by the corresponding largest peaks in Figure 3-d.

Figure 2. From top to bottom: time waveform of a piano signal, its spectrogram, its spectral energy flux and the detection function p(k).

2.2. Periodicity estimation

The detection function p(k) at the output of the onset detection stage can be seen as a quasi-periodic and noisy pulse-train that exhibits large peaks at note attacks. The next step is to estimate the tempo of the audio signal, i.e., the periodicity of the note onset pulses. Two methods borrowed from traditional pitch determination techniques are employed: the spectral product and the autocorrelation function. These techniques have already been used for this purpose in [1].

2.2.1. Spectral product

The spectral product principle assumes that the power spectrum of the signal is formed of strong harmonics located at integer multiples of the signal's fundamental frequency. To find this frequency, the power spectrum is compressed by a factor m; the obtained spectra are then multiplied, leading to a reinforced fundamental frequency.
For a normalized frequency f, this is given by

S(e^{j 2\pi f}) = \prod_{m=1}^{M} P(e^{j 2\pi m f}),  for |f| < 1/(2M)    (7)

where P(e^{j 2\pi f}) is the discrete Fourier transform of p(k). The estimated tempo T is then easily obtained by picking the frequency index corresponding to the largest peak of S(e^{j 2\pi f}). The tempo is constrained to fall in the range 60 < T < 200 BPM.
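A sketch of this estimator on a discrete detection function: the detection-function rate of 500 Hz matches the value quoted in Section 3.2, while M = 3 compressed spectra is an assumed choice (the paper does not state M):

```python
import numpy as np

def spectral_product_tempo(p, fs_det=500.0, M=3, bpm_range=(60.0, 200.0)):
    """Spectral product periodicity estimate (Eq. 7): multiply M
    frequency-compressed copies of the detection function's spectrum,
    then pick the largest peak inside the allowed tempo range."""
    n = len(p)
    P = np.abs(np.fft.rfft(p))
    S = np.ones(n // (2 * M))
    for m in range(1, M + 1):
        S *= P[::m][:len(S)]            # P evaluated at m*f, i.e. bin m*k
    bpm = np.arange(len(S)) * fs_det / n * 60.0   # bin -> beats per minute
    cand = np.flatnonzero((bpm >= bpm_range[0]) & (bpm <= bpm_range[1]))
    return bpm[cand[np.argmax(S[cand])]]
```

Compression by a factor m amounts to keeping every m-th spectral bin, so harmonics of the pulse rate align at the fundamental bin and their product reinforces it.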
Figure 3. From top to bottom: time waveform of a violin signal, its spectrogram, its spectral energy flux and the detection function p(k).

Table 1. Genre distribution of the test database (489 pieces in total: classical, jazz, latin, pop, rock, reggae, soul, rap/hip-hop, techno and other).

2.2.2. Autocorrelation function

This is a classical method in periodicity estimation. The non-normalized deterministic autocorrelation function of p(k) is calculated as

r(τ) = \sum_k p(k + τ) p(k)    (8)

Again, we suppose that 60 < T < 200 BPM; hence, only the values of r(τ) corresponding to lags in the range of 300 ms to 1 s are calculated. To find the estimated tempo T, the lags of the three largest peaks of r(τ) are analyzed and a multiplicity relationship between them is searched for. If no such relation is found, the lag of the largest peak is taken as the beat period.

2.3. Beat location

To find the beat locations, we use a method based on the comb filter idea that resembles previous work carried out in [6, 9, 12]. We create an artificial pulse-train q(t) at the tempo T previously calculated, as explained in Section 2.2, and cross-correlate it with p(k). This operation has a low computational cost, since the correlation is evaluated only at the indices corresponding to the maxima of p(k). Then, we call t_0 the time index where this cross-correlation is maximal and consider it the starting location of the beat. For the second and successive beats in the analysis window, a beat period T is added to the previous beat location, i.e., t_i = t_{i-1} + T, and a corresponding peak in p(k) is searched for within a tolerance area around t_i. If no peak is found, the beat is placed at its expected position t_i. When the last beat of the window occurs, its location is stored in order to ensure continuity with the first beat of the new analysis window. Where the tempo of the new analysis window differs by more than 10% from the previous tempo, a new beat phase is estimated: the peaks are searched using the new beat period, without reference to the previous beat phase.

3. PERFORMANCE ANALYSIS

3.1. Database, annotation and evaluation protocol

The proposed algorithm was evaluated using a corpus of 489 musical excerpts taken from commercial CD recordings. These pieces were selected to cover as many characteristics as possible: various tempi in the 50 to 200 BPM range, a wide variety of instruments, dynamic ranges, studio/live recordings, old/recent recordings, with/without vocals, male/female vocals and with/without percussion. They were also selected to represent a wide diversity of musical genres, as shown in Table 1. From each selected recording, an excerpt of 20 seconds with a relatively constant tempo was extracted and converted to a monophonic signal sampled at 16 kHz.

The procedure for manually estimating the tempo of each musical piece is the following: the musician listens to a musical excerpt using headphones (if required, several times in a row to become accustomed to the tempo); while listening, he/she taps the tempo; the tapping signal is recorded and the tempo is extracted from it. As pointed out by Goto in [4], the beat is a perceptual concept that people feel in music, so it is generally difficult to define the correct beat in an objective way. People have a tendency to tap at different metric levels. For
example, in a piece that has a 4/4 time signature, it is correct to tap every quarter-note or every half-note. In general, a ground-truth tempo cannot be established unless the musical score of the piece is available. Tapping twice as fast or twice as slow as the true tempo is a very common problem when humans tap along with music; whenever this occurred during the database annotation, the slower tempo was taken as the reference T_R. In a similar way to humans, automatic tempo estimation methods also double or halve the true tempo. Thus, for evaluation purposes, the tempo estimate T provided by the algorithm is labeled as correct if it disagrees by less than 5% with the manually annotated reference tempo T_R, i.e., if

0.95 α T < T_R < 1.05 α T  with α ∈ {1/2, 1, 2}.

Method          Recognition rate
Paulus [10]     56.3 %
Scheirer [12]   67.4 %
SP              %
AC              %
SP using SEF    84. %
AC using SEF    89.7 %

Table 2. Tempo estimation performance. SEF stands for spectral energy flux, SP for spectral product and AC for autocorrelation.

3.2. Results

During the evaluation, the algorithm parameters were set as follows. The length of the analysis window for tempo estimation was set to four seconds, with an overlap factor of 50%; smaller window sizes reduce the algorithm's performance. For the spectral energy flux calculation, the length of the analysis window used in the computation of the STFT was 64 samples (4 ms) with an overlap factor of 50% and a 128-point FFT; the detection function v(k) can thus be seen as a signal sampled at 500 Hz. As mentioned above, the order of the differentiator FIR filter was set to L = 8. In the beat location stage, the median filter half-length i was set to 25 samples, C was set to 2, and the tolerance for the peak location was set to 10% of the beat period.
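The evaluation rule of Section 3.1 can be sketched as a one-line check:

```python
def tempo_correct(T, T_ref):
    """An estimate T is accepted when 0.95*alpha*T < T_ref < 1.05*alpha*T
    for some alpha in {1/2, 1, 2}, i.e. within 5% of the reference tempo
    at the same, half or double metric level."""
    return any(0.95 * a * T < T_ref < 1.05 * a * T for a in (0.5, 1.0, 2.0))
```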
To get a better idea of the performance of our algorithm, we compared it with our own implementations of the algorithms proposed by Paulus [10] and Scheirer [12]. We also compared it with our previous work on tempo estimation [1]; in this case, the main difference between the previous and the current system lies in the onset extraction stage. Table 2 summarizes the overall recognition rate for the evaluated systems; in this table, SP stands for spectral product, AC for autocorrelation and SEF for spectral energy flux. The performance of these methods is detailed by musical genre in Table 3, where PLS stands for Paulus and SCR for Scheirer. As expected, the results indicate that classical music is the most difficult genre; nevertheless, the proposed algorithm displayed promising results. For the other genres it shows good performance, particularly for music with a straightforward rhythm.

Table 3. Tempo estimation performance by musical genre (classical, jazz, latin, pop, rock, reggae, soul, rap, techno, other). PLS stands for Paulus [10], SCR for Scheirer [12].

Several authors have pointed out the difficulty of evaluating beat tracking systems [4, 6, 9], due to the subjective interpretation of the beat and the absence of a consensual database of beat-labeled audio tracks. In our case, the beat location evaluation was done at a subjective level: artificial sound clicks were superimposed on the tested signal at the calculated beat locations and tempo. During the validation procedure, we noted that the proposed algorithm produces erroneous results under the following circumstances. When dealing with signals having weak or long fading-in attacks, the hypothesis that spurious peaks are smaller than attack peaks no longer holds, leading to false onset detections. Moreover, the spectral energy flux relies on the principle that stable spectral regions are followed by transition regions.
When many instruments play simultaneously, as in an orchestra, their spectral mixture lacks stable regions, leading to false onset detections. Finally, when the tempo varies too quickly over short time segments, or if there are large beat gaps in the signal, the periodicity estimation stage cannot keep up with the changes. The reader is welcome to listen to the sound examples available at malonso/ismir4.

4. CONCLUSIONS

In this paper we have presented an efficient beat tracking algorithm that processes audio recordings. We have also defined the concept of spectral energy flux and used it to derive a new and effective onset detector based on the STFT, an efficient differentiator filter and dynamic thresholding using a median filter. This onset detector displays high performance for a large range of audio signals. In addition, the proposed tempo tracking system is straightforward to implement and has a relatively low computational cost. The performance of the algorithm presented
was evaluated on a large database containing 489 musical excerpts from several musical genres. The results are encouraging, since the global success rate for tempo estimation was 89.7%. The method presented works offline; a real-time implementation is under consideration, but several issues remain to be resolved, such as the block-wise processing, which requires access to future signal samples, and the non-causality of the thresholding filter. Future work should explore other periodicity estimation techniques and an analysis of the residual part after a harmonic/noise decomposition.

5. REFERENCES

[1] Alonso M., David B. and Richard G., "A Study of Tempo Tracking Algorithms from Polyphonic Music Signals", Proceedings of the 4th COST 276 Workshop, Bordeaux, France, March 2003.
[2] Dixon, S., "Automatic Extraction of Tempo and Beat from Expressive Performances", Journal of New Music Research, vol. 30, No. 1, 2001.
[3] Duxbury C., Sandler M. and Davies M., "A Hybrid Approach to Musical Note Onset Detection", Proceedings of the 5th Int. Conf. on Digital Audio Effects (DAFx), Hamburg, Germany, September 2002.
[4] Goto M. and Muraoka Y., "Issues in Evaluating Beat Tracking Systems", Working Notes of the IJCAI-97 Workshop on Issues in AI and Music, August 1997.
[5] Goto, M., "An Audio-based Real-time Beat Tracking System for Music With or Without Drum-sounds", Journal of New Music Research, vol. 30, No. 2, pp. 159-171, 2001.
[6] Klapuri, A., "Musical meter estimation and music transcription", Cambridge Music Processing Colloquium, Cambridge University, UK, March 2003.
[7] Klapuri, A., "Sound Onset Detection by Applying Psychoacoustic Knowledge", Proceedings IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP), Phoenix, AZ, USA, March 1999.
[8] Laroche, J., "Estimating Tempo, Swing and Beat Locations in Audio Recordings", Proceedings IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, October 2001.
[9] Laroche, J.,
"Efficient Tempo and Beat Tracking in Audio Recordings", J. Audio Eng. Soc., vol. 51, No. 4, April 2003.
[10] Paulus J. and Klapuri A., "Measuring the Similarity of Rhythmic Patterns", Proceedings of the International Symposium on Music Information Retrieval (ISMIR), Paris, France, 2002.
[11] Proakis, J. G. and Manolakis, D., Digital Signal Processing: Principles, Algorithms and Applications. Prentice Hall, New York, 1995.
[12] Scheirer, E. D., "Tempo and Beat Analysis of Acoustic Music Signals", J. Acoust. Soc. Am., vol. 103, No. 1, pp. 588-601, January 1998.
[13] Seppänen, J., "Tatum grid analysis of musical signals", Proceedings IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, October 2001.
[14] Todd, N. P. McA., "The Auditory Primal Sketch: A Multiscale model of rhythmic grouping", Journal of New Music Research, vol. 23, No. 1, pp. 25-70, 1994.
Monophonic Music Recognition Per Weijnitz Speech Technology 5p [email protected] 5th March 2003 Abstract This report describes an experimental monophonic music recognition system, carried out
Music Mood Classification
Music Mood Classification CS 229 Project Report Jose Padial Ashish Goel Introduction The aim of the project was to develop a music mood classifier. There are many categories of mood into which songs may
HOG AND SUBBAND POWER DISTRIBUTION IMAGE FEATURES FOR ACOUSTIC SCENE CLASSIFICATION. Victor Bisot, Slim Essid, Gaël Richard
HOG AND SUBBAND POWER DISTRIBUTION IMAGE FEATURES FOR ACOUSTIC SCENE CLASSIFICATION Victor Bisot, Slim Essid, Gaël Richard Institut Mines-Télécom, Télécom ParisTech, CNRS LTCI, 37-39 rue Dareau, 75014
Audio Content Analysis for Online Audiovisual Data Segmentation and Classification
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 9, NO. 4, MAY 2001 441 Audio Content Analysis for Online Audiovisual Data Segmentation and Classification Tong Zhang, Member, IEEE, and C.-C. Jay
Matlab GUI for WFB spectral analysis
Matlab GUI for WFB spectral analysis Jan Nováček Department of Radio Engineering K13137, CTU FEE Prague Abstract In the case of the sound signals analysis we usually use logarithmic scale on the frequency
TRAFFIC MONITORING WITH AD-HOC MICROPHONE ARRAY
4 4th International Workshop on Acoustic Signal Enhancement (IWAENC) TRAFFIC MONITORING WITH AD-HOC MICROPHONE ARRAY Takuya Toyoda, Nobutaka Ono,3, Shigeki Miyabe, Takeshi Yamada, Shoji Makino University
Audio-visual vibraphone transcription in real time
Audio-visual vibraphone transcription in real time Tiago F. Tavares #1, Gabrielle Odowichuck 2, Sonmaz Zehtabi #3, George Tzanetakis #4 # Department of Computer Science, University of Victoria Victoria,
Beethoven, Bach und Billionen Bytes
Meinard Müller Beethoven, Bach und Billionen Bytes Automatisierte Analyse von Musik und Klängen Meinard Müller Tutzing-Symposium Oktober 2014 2001 PhD, Bonn University 2002/2003 Postdoc, Keio University,
Silver Burdett Making Music
A Correlation of Silver Burdett Making Music Model Content Standards for Music INTRODUCTION This document shows how meets the Model Content Standards for Music. Page references are Teacher s Edition. Lessons
Figure1. Acoustic feedback in packet based video conferencing system
Real-Time Howling Detection for Hands-Free Video Conferencing System Mi Suk Lee and Do Young Kim Future Internet Research Department ETRI, Daejeon, Korea {lms, dyk}@etri.re.kr Abstract: This paper presents
BLIND SOURCE SEPARATION OF SPEECH AND BACKGROUND MUSIC FOR IMPROVED SPEECH RECOGNITION
BLIND SOURCE SEPARATION OF SPEECH AND BACKGROUND MUSIC FOR IMPROVED SPEECH RECOGNITION P. Vanroose Katholieke Universiteit Leuven, div. ESAT/PSI Kasteelpark Arenberg 10, B 3001 Heverlee, Belgium [email protected]
The IRCAM Musical Workstation: A Prototyping and Production Tool for Real-Time Computer Music
The IRCAM Musical Workstation: A Prototyping and Production Tool for Real-Time Computer Music Cort Lippe, Zack Settel, Miller Puckette and Eric Lindemann IRCAM, 31 Rue St. Merri, 75004 Paris, France During
Ericsson T18s Voice Dialing Simulator
Ericsson T18s Voice Dialing Simulator Mauricio Aracena Kovacevic, Anna Dehlbom, Jakob Ekeberg, Guillaume Gariazzo, Eric Lästh and Vanessa Troncoso Dept. of Signals Sensors and Systems Royal Institute of
MPEG Unified Speech and Audio Coding Enabling Efficient Coding of both Speech and Music
ISO/IEC MPEG USAC Unified Speech and Audio Coding MPEG Unified Speech and Audio Coding Enabling Efficient Coding of both Speech and Music The standardization of MPEG USAC in ISO/IEC is now in its final
Workshop Perceptual Effects of Filtering and Masking Introduction to Filtering and Masking
Workshop Perceptual Effects of Filtering and Masking Introduction to Filtering and Masking The perception and correct identification of speech sounds as phonemes depends on the listener extracting various
THIS paper reports some results of a research, which aims to investigate the
FACTA UNIVERSITATIS (NIŠ) SER.: ELEC. ENERG. vol. 22, no. 2, August 2009, 227-234 Determination of Rotor Slot Number of an Induction Motor Using an External Search Coil Ozan Keysan and H. Bülent Ertan
AP1 Waves. (A) frequency (B) wavelength (C) speed (D) intensity. Answer: (A) and (D) frequency and intensity.
1. A fire truck is moving at a fairly high speed, with its siren emitting sound at a specific pitch. As the fire truck recedes from you which of the following characteristics of the sound wave from the
The Sonometer The Resonant String and Timbre Change after plucking
The Sonometer The Resonant String and Timbre Change after plucking EQUIPMENT Pasco sonometers (pick up 5 from teaching lab) and 5 kits to go with them BK Precision function generators and Tenma oscilloscopes
AN1200.04. Application Note: FCC Regulations for ISM Band Devices: 902-928 MHz. FCC Regulations for ISM Band Devices: 902-928 MHz
AN1200.04 Application Note: FCC Regulations for ISM Band Devices: Copyright Semtech 2006 1 of 15 www.semtech.com 1 Table of Contents 1 Table of Contents...2 1.1 Index of Figures...2 1.2 Index of Tables...2
Quarterly Progress and Status Report. Measuring inharmonicity through pitch extraction
Dept. for Speech, Music and Hearing Quarterly Progress and Status Report Measuring inharmonicity through pitch extraction Galembo, A. and Askenfelt, A. journal: STL-QPSR volume: 35 number: 1 year: 1994
Introduction to Digital Audio
Introduction to Digital Audio Before the development of high-speed, low-cost digital computers and analog-to-digital conversion circuits, all recording and manipulation of sound was done using analog techniques.
Fermi National Accelerator Laboratory. The Measurements and Analysis of Electromagnetic Interference Arising from the Booster GMPS
Fermi National Accelerator Laboratory Beams-Doc-2584 Version 1.0 The Measurements and Analysis of Electromagnetic Interference Arising from the Booster GMPS Phil S. Yoon 1,2, Weiren Chou 1, Howard Pfeffer
The Computer Music Tutorial
Curtis Roads with John Strawn, Curtis Abbott, John Gordon, and Philip Greenspun The Computer Music Tutorial Corrections and Revisions The MIT Press Cambridge, Massachusetts London, England 2 Corrections
RF Measurements Using a Modular Digitizer
RF Measurements Using a Modular Digitizer Modern modular digitizers, like the Spectrum M4i series PCIe digitizers, offer greater bandwidth and higher resolution at any given bandwidth than ever before.
Jitter Measurements in Serial Data Signals
Jitter Measurements in Serial Data Signals Michael Schnecker, Product Manager LeCroy Corporation Introduction The increasing speed of serial data transmission systems places greater importance on measuring
Artificial Neural Network for Speech Recognition
Artificial Neural Network for Speech Recognition Austin Marshall March 3, 2005 2nd Annual Student Research Showcase Overview Presenting an Artificial Neural Network to recognize and classify speech Spoken
MP3 Player CSEE 4840 SPRING 2010 PROJECT DESIGN. [email protected]. [email protected]
MP3 Player CSEE 4840 SPRING 2010 PROJECT DESIGN Zheng Lai Zhao Liu Meng Li Quan Yuan [email protected] [email protected] [email protected] [email protected] I. Overview Architecture The purpose
PCM Encoding and Decoding:
PCM Encoding and Decoding: Aim: Introduction to PCM encoding and decoding. Introduction: PCM Encoding: The input to the PCM ENCODER module is an analog message. This must be constrained to a defined bandwidth
POWER SYSTEM HARMONICS. A Reference Guide to Causes, Effects and Corrective Measures AN ALLEN-BRADLEY SERIES OF ISSUES AND ANSWERS
A Reference Guide to Causes, Effects and Corrective Measures AN ALLEN-BRADLEY SERIES OF ISSUES AND ANSWERS By: Robert G. Ellis, P. Eng., Rockwell Automation Medium Voltage Business CONTENTS INTRODUCTION...
Automatic Detection of Emergency Vehicles for Hearing Impaired Drivers
Automatic Detection of Emergency Vehicles for Hearing Impaired Drivers Sung-won ark and Jose Trevino Texas A&M University-Kingsville, EE/CS Department, MSC 92, Kingsville, TX 78363 TEL (36) 593-2638, FAX
Graham s Guide to Synthesizers (part 1) Analogue Synthesis
Graham s Guide to Synthesizers (part ) Analogue Synthesis Synthesizers were originally developed to imitate or synthesise the sounds of acoustic instruments electronically. Early synthesizers used analogue
DIODE CIRCUITS LABORATORY. Fig. 8.1a Fig 8.1b
DIODE CIRCUITS LABORATORY A solid state diode consists of a junction of either dissimilar semiconductors (pn junction diode) or a metal and a semiconductor (Schottky barrier diode). Regardless of the type,
Linear Parameter Measurement (LPM)
(LPM) Module of the R&D SYSTEM FEATURES Identifies linear transducer model Measures suspension creep LS-fitting in impedance LS-fitting in displacement (optional) Single-step measurement with laser sensor
Analog and Digital Signals, Time and Frequency Representation of Signals
1 Analog and Digital Signals, Time and Frequency Representation of Signals Required reading: Garcia 3.1, 3.2 CSE 3213, Fall 2010 Instructor: N. Vlajic 2 Data vs. Signal Analog vs. Digital Analog Signals
Product Data Bulletin
Product Data Bulletin Power System Harmonics Causes and Effects of Variable Frequency Drives Relative to the IEEE 519-1992 Standard Raleigh, NC, U.S.A. INTRODUCTION This document describes power system
Broadband Networks. Prof. Dr. Abhay Karandikar. Electrical Engineering Department. Indian Institute of Technology, Bombay. Lecture - 29.
Broadband Networks Prof. Dr. Abhay Karandikar Electrical Engineering Department Indian Institute of Technology, Bombay Lecture - 29 Voice over IP So, today we will discuss about voice over IP and internet
Signal to Noise Instrumental Excel Assignment
Signal to Noise Instrumental Excel Assignment Instrumental methods, as all techniques involved in physical measurements, are limited by both the precision and accuracy. The precision and accuracy of a
Real-Time Audio Watermarking Based on Characteristics of PCM in Digital Instrument
Journal of Information Hiding and Multimedia Signal Processing 21 ISSN 273-4212 Ubiquitous International Volume 1, Number 2, April 21 Real-Time Audio Watermarking Based on Characteristics of PCM in Digital
Short-time FFT, Multi-taper analysis & Filtering in SPM12
Short-time FFT, Multi-taper analysis & Filtering in SPM12 Computational Psychiatry Seminar, FS 2015 Daniel Renz, Translational Neuromodeling Unit, ETHZ & UZH 20.03.2015 Overview Refresher Short-time Fourier
Advanced Signal Processing and Digital Noise Reduction
Advanced Signal Processing and Digital Noise Reduction Saeed V. Vaseghi Queen's University of Belfast UK WILEY HTEUBNER A Partnership between John Wiley & Sons and B. G. Teubner Publishers Chichester New
This unit will lay the groundwork for later units where the students will extend this knowledge to quadratic and exponential functions.
Algebra I Overview View unit yearlong overview here Many of the concepts presented in Algebra I are progressions of concepts that were introduced in grades 6 through 8. The content presented in this course
Computer Networks and Internets, 5e Chapter 6 Information Sources and Signals. Introduction
Computer Networks and Internets, 5e Chapter 6 Information Sources and Signals Modified from the lecture slides of Lami Kaya ([email protected]) for use CECS 474, Fall 2008. 2009 Pearson Education Inc., Upper
Lock-in amplifiers. A short tutorial by R. Scholten
Lock-in amplifiers A short tutorial by R. cholten Measuring something Common task: measure light intensity, e.g. absorption spectrum Need very low intensity to reduce broadening Noise becomes a problem
A Computationally Efficient Multipitch Analysis Model
708 IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 8, NO. 6, NOVEMBER 2000 A Computationally Efficient Multipitch Analysis Model Tero Tolonen, Student Member, IEEE, and Matti Karjalainen, Member,
Timing Errors and Jitter
Timing Errors and Jitter Background Mike Story In a sampled (digital) system, samples have to be accurate in level and time. The digital system uses the two bits of information the signal was this big
* Published as Henkjan Honing, Structure and interpretation of rhythm and timing, in: Tijdschrift voor
Structure and Interpretation of Rhythm and Timing* Henkjan Honing Music, Mind, Machine Group NICI, University of Nijmegen Music Department, University of Amsterdam Rhythm, as it is performed and perceived,
LOW COST HARDWARE IMPLEMENTATION FOR DIGITAL HEARING AID USING
LOW COST HARDWARE IMPLEMENTATION FOR DIGITAL HEARING AID USING RasPi Kaveri Ratanpara 1, Priyan Shah 2 1 Student, M.E Biomedical Engineering, Government Engineering college, Sector-28, Gandhinagar (Gujarat)-382028,
The Combination Forecasting Model of Auto Sales Based on Seasonal Index and RBF Neural Network
, pp.67-76 http://dx.doi.org/10.14257/ijdta.2016.9.1.06 The Combination Forecasting Model of Auto Sales Based on Seasonal Index and RBF Neural Network Lihua Yang and Baolin Li* School of Economics and
Realtime FFT processing in Rohde & Schwarz receivers
Realtime FFT in Rohde & Schwarz receivers Radiomonitoring & Radiolocation Application Brochure 01.00 Realtime FFT in Rohde & Schwarz receivers Introduction This application brochure describes the sophisticated
STUDY OF MUTUAL INFORMATION IN PERCEPTUAL CODING WITH APPLICATION FOR LOW BIT-RATE COMPRESSION
STUDY OF MUTUAL INFORMATION IN PERCEPTUAL CODING WITH APPLICATION FOR LOW BIT-RATE COMPRESSION Adiel Ben-Shalom, Michael Werman School of Computer Science Hebrew University Jerusalem, Israel. {chopin,werman}@cs.huji.ac.il
1. interpret notational symbols for rhythm (26.A.1d) 2. recognize and respond to steady beat with movements, games and by chanting (25.A.
FIRST GRADE CURRICULUM OBJECTIVES MUSIC I. Rhythm 1. interpret notational symbols for rhythm (26.A.1d) 2. recognize and respond to steady beat with movements, games and by chanting (25.A.1c) 3. respond
Acoustics for Musicians
Unit 1: Acoustics for Musicians Unit code: QCF Level 3: Credit value: 10 Guided learning hours: 60 Aim and purpose J/600/6878 BTEC National The aim of this unit is to establish knowledge of acoustic principles
Control of affective content in music production
International Symposium on Performance Science ISBN 978-90-9022484-8 The Author 2007, Published by the AEC All rights reserved Control of affective content in music production António Pedro Oliveira and
Score following in practice
Score following in practice Miller Puckette and Cort Lippe IRCAM, 31, rue St-Merri, 75004 Paris, FRANCE [email protected], [email protected] ABSTRACT Score following was first presented at the 1984 ICMC, independently,
Mathematical Harmonies Mark Petersen
1 Mathematical Harmonies Mark Petersen What is music? When you hear a flutist, a signal is sent from her fingers to your ears. As the flute is played, it vibrates. The vibrations travel through the air
