Auto-Tuning Using Fourier Coefficients




Math 56
Tom Whalen
May 20, 2013

The Fourier transform is an integral part of signal processing of any kind. Analyzing an input signal as a superposition of infinitely many signals of different frequencies allows one to manipulate the coefficients and, after applying the inverse Fourier transform, to produce interesting effects on the initial signal. In my project I focused on creating pitch correction effects like those of the software Auto-Tune, and in this paper I describe the process of writing Matlab code that performs this effect.

The Fast Fourier Transform algorithm is effective at breaking down mathematical functions into a Fourier series of complex exponentials of different frequencies and their corresponding coefficients. However, when analyzing audio signals it is impractical and inefficient to take one Fourier transform of an entire audio clip: for a band-limited signal $t$ seconds long sampled at 44.1 kHz, most of the $44100t$ Fourier coefficients will contain no information about the original signal, and there is no way to tell which frequencies are dominant at different times in the signal. Thus, audio signals are most often processed using the Short-Time Fourier Transform (STFT).

The basic idea of the STFT is that one can window the signal multiple times and take the FFT of each windowed piece. Windowing is done simply by multiplying the signal by another function that is mostly zero and has a maximum of 1. This kills off the signal at times that are not of interest, so that the FFT gives the frequencies present around a certain point in time.

There are several types of windowing functions that can be used in the STFT process. The most commonly used windows are of the form

$$w(n) = \alpha - (1 - \alpha)\cos\left(\frac{2\pi n}{N-1}\right), \quad 0 \le n < N,$$

where $\alpha$ is a constant that determines the shape of the function and $N$ is the length of the windowed signal.
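The window family above can be evaluated directly. The following is a minimal sketch in Python (standard library only, chosen for illustration; it is not the Matlab code written for the project), and the function name `cosine_window` is hypothetical.

```python
import math

def cosine_window(N, alpha):
    """Return w(n) = alpha - (1 - alpha) * cos(2*pi*n / (N - 1)) for 0 <= n < N."""
    if N < 2:
        raise ValueError("N must be at least 2")
    return [alpha - (1.0 - alpha) * math.cos(2.0 * math.pi * n / (N - 1))
            for n in range(N)]

# alpha = 0.5 gives a smooth bump: zero at both ends, 1 in the middle.
smooth = cosine_window(9, 0.5)
# alpha = 1 removes the cosine term entirely, leaving a constant (square) window.
square = cosine_window(9, 1.0)
```

With an odd length such as 9, the middle sample `smooth[4]` lands exactly on the maximum of the window.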
The windowing function that I used most in my analysis was the Hanning window, which is the above equation with $\alpha = 0.5$. Some other possible windowing functions are the Hamming window ($\alpha = 0.54$) and the square function ($\alpha = 1$). One can even form functions like $\cos^3\left(\frac{\pi n}{N-1} - \frac{\pi}{2}\right) = \sin^3\left(\frac{\pi n}{N-1}\right)$, $0 \le n < N$ (the cosine has to be shifted so that its maximum falls mid-window), which are very similar in shape to the Hanning window but have a more defined peak near $N/2$.

Figure 1: The Hanning window, $\frac{1}{2}\left(1 - \cos\left(\frac{2\pi n}{N-1}\right)\right)$, $0 \le n < N$.

Windows like the Hanning are much preferred to the square window because the jumps in the square window cause the Fourier coefficients to decay much more gradually than those obtained via one of the smooth windows. However, when using a cosine-based window, it is necessary to choose a jump size between windows that enables full reconstruction of the original signal. This reconstruction condition is

$$f_j\,w(j - t_0) + f_j\,w(j - t_1) + \cdots = f_j, \quad \text{that is,} \quad \sum_i w(j - t_i) = 1,$$

which means that every sample is weighted evenly across the overlapping spectral frames. For the square window, the obvious choice for the jump size is $N$, the number of samples in the windowed signal, which prevents any component of the signal from being processed more than once. For the Hanning window, the jump size is $N/4$; the overlapping windows then sum to 2 rather than 1 (up to a small end-point error), so the signal is interpreted as having twice its actual value, hence a scaling factor of 1/2 to recover the signal.

This brings us to the actual form of the STFT:

$$\hat f_{m,t} = \sum_{j \ge 0} w(j - t)\, f_j\, \omega^{-mj},$$

where $\omega = e^{2\pi i/N}$ is the $N$-th root of unity of the DFT. In practice, the result of running the ST-FFT algorithm is a matrix in which each row holds the Fourier coefficients of the signal at some point in time.

The final step in the STFT process is choosing a suitable value for $N$, the number of samples of the signal included in each window. For a typical sampling rate of 44.1 kHz, the most common $N$ is 1024, which is large enough to ensure that the Fourier coefficients decay and there are no aliasing effects. It is also small enough that the time resolution is decent: each window spans about 23 ms, so roughly 43 windows fit in each second of signal (and four times as many spectral frames are produced when the jump size is $N/4$). Figures 2 and 3 show a generic audio signal and its STFT, displayed as a color-scale image with time on the vertical axis and frequency on the horizontal.
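The windowing, jump size and STFT matrix described above can be sketched in a few lines. This is an illustrative Python sketch using only the standard library, not the Matlab code written for the project: a naive $O(N^2)$ DFT stands in for the FFT, the sizes are deliberately small, and all function names are hypothetical.

```python
import cmath
import math

def hann(N):
    # Hanning window as defined in the text (denominator N - 1).
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * n / (N - 1))) for n in range(N)]

def dft(x):
    # Naive DFT, X[m] = sum_j x[j] * exp(-2*pi*i*m*j/N); an FFT returns the same values faster.
    N = len(x)
    return [sum(x[j] * cmath.exp(-2j * math.pi * m * j / N) for j in range(N))
            for m in range(N)]

def stft(signal, N, hop):
    # One row of Fourier coefficients per window position t = 0, hop, 2*hop, ...
    w = hann(N)
    return [dft([w[j] * signal[t + j] for j in range(N)])
            for t in range(0, len(signal) - N + 1, hop)]

def overlap_sum(N, hop, length):
    # Sum of all shifted windows at each sample; reconstruction needs this to be constant.
    w = hann(N)
    total = [0.0] * length
    for t in range(0, length - N + 1, hop):
        for j in range(N):
            total[t + j] += w[j]
    return total

# Small illustrative sizes (assumptions), not the N = 1024 at 44.1 kHz used in the text.
N, hop, fs = 64, 16, 1024
tone = [math.sin(2.0 * math.pi * 128 * n / fs) for n in range(512)]  # 128 Hz sits on bin 8
frames = stft(tone, N, hop)
peak_bins = [max(range(N // 2), key=lambda m: abs(row[m])) for row in frames]
interior = overlap_sum(N, hop, 512)[N:-N]   # samples covered by four windows each
```

Away from the ends of the signal, the entries of `interior` stay close to 2, which is where the scaling factor of 1/2 comes from. With the $N - 1$ denominator the sum is only approximately constant; using $N$ in the denominator instead makes it exactly 2.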
Figure 2: The input signal (signal amplitude against time).

Figure 3: Its STFT on a color-scale map.

To invert the STFT matrix, one must loop through all of the spectral frames and essentially add up all of the resulting signals that are produced by each set of coefficients, which is commonly referred to as the overlap-add method of reconstruction. Shifting by the jump size each iteration, multiplying the inverse FFT of each frame by the window, and then adding it to the total output signal reconstructs the signal from its set of STFT coefficients, up to the constant scaling factor set by the window overlap.

The problem with the STFT is that, for any given spectral frame, it doesn't tell us the instantaneous frequencies, because there isn't much frequency resolution. The solution to this problem lies in the fact that a complex coefficient $a + ib$ can be written in the polar form $\beta e^{i\phi}$, where $\beta = \sqrt{a^2 + b^2}$ and $\phi = \tan^{-1}(b/a)$. The $\phi$ term acts as a phase of the signal, due to the shift of a particular frequency in the windowing process. Looking at the term in the Fourier series for which the coefficient is $a + ib$, we can now write this as $\beta e^{i\phi} e^{im} = \beta e^{i(\phi + m)}$. The new interpretation is that the amplitude of the wave is the magnitude of the original coefficient and the new frequency is the old frequency shifted by the phase $\phi$. Thus, by converting to polar coordinates we can keep track of the actual values of the frequencies in an audio signal rather than only those determined by the length of the DFT matrix.

The phase vocoder is implemented by finding the change in phase for a specified frequency bin between successive spectral frames, as shown in the following equations:

$$\Delta\omega(k,t) = \phi(k,t) - \phi(k,t - \Delta t) - \Delta t\,\omega(k,t)$$

$$\Delta\omega_{\text{wrapped}}(k,t) = \left[\left(\Delta\omega(k,t) + \pi\right) \,\%\, 2\pi\right] - \pi$$

$$\omega_{\text{new}}(k,t) = \Delta\omega_{\text{wrapped}}(k,t) + \omega(k,t)$$

Here $\Delta t$ is the jump between successive frames and $\omega(k,t)$ is the nominal frequency of bin $k$. The last line is written in radians per jump; to work in radians per sample, divide the wrapped deviation by $\Delta t$ before adding it. Because the phases are wrapped from $-\pi$ to $\pi$, we can't simply calculate the new frequency by finding the phase difference and adding it to the frequency denoted by $k$. We instead have to find the phase difference, subtract its expected value, shift the result by $\pi$ and reduce it modulo $2\pi$ so that it lands back between $-\pi$ and $\pi$, and finally add it to the frequency value of the bin to get the true frequency of the signal.
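The phase-difference step can be tried on a pure tone. The sketch below is Python with the standard library only (not the project's Matlab code) and all names are hypothetical. It computes a single DFT coefficient for two frames one jump apart, wraps the phase deviation as in the equations above, and divides by the jump size so that the result is in radians per sample before converting to Hertz.

```python
import cmath
import math

def wrap_phase(x):
    # Map any angle into [-pi, pi): shift by pi, reduce modulo 2*pi, shift back.
    return (x + math.pi) % (2.0 * math.pi) - math.pi

def bin_coefficient(signal, start, N, k):
    # Single DFT coefficient (bin k) of a Hanning-windowed frame beginning at sample `start`.
    total = 0j
    for j in range(N):
        w = 0.5 * (1.0 - math.cos(2.0 * math.pi * j / (N - 1)))
        total += w * signal[start + j] * cmath.exp(-2j * math.pi * k * j / N)
    return total

def true_frequency(signal, start, N, hop, k, fs):
    # Estimate the frequency in Hz seen by bin k from the phase change over one jump.
    phi_prev = cmath.phase(bin_coefficient(signal, start, N, k))
    phi_curr = cmath.phase(bin_coefficient(signal, start + hop, N, k))
    omega_bin = 2.0 * math.pi * k / N                 # nominal bin frequency, radians per sample
    deviation = wrap_phase(phi_curr - phi_prev - hop * omega_bin)
    omega_new = omega_bin + deviation / hop           # refined frequency, radians per sample
    return omega_new * fs / (2.0 * math.pi)

fs, N, hop = 44100, 1024, 256
tone = [math.sin(2.0 * math.pi * 440.0 * n / fs) for n in range(N + hop)]
estimate = true_frequency(tone, 0, N, hop, 10, fs)    # bin 10 is centered near 430.7 Hz
```

Bin 10 on its own only says that the tone lies somewhere around 430.7 Hz; the phase difference recovers a value very close to the 440 Hz that was actually synthesized.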
Now that we know exactly which frequencies we are dealing with, it is possible to manipulate the coefficients directly. Pitch correction is carried out by first finding the frequency values of the musical notes and making them target bins for the signal coefficients. Once this has been done, the next step is to find the dominant overtone of the input signal. Since pitch correction is employed mainly for human singing, there will normally be only one tone that dominates the spectrum of Fourier coefficients. It is also necessary to convert the frequencies, which have now been processed by the phase vocoder, into Hertz in order to match them correctly to the target bins. The conversion is given by the equation

$$\omega_{\text{Hz}}(k) = \frac{k\,(\text{sampling rate})}{\text{length of window}},$$

where $k$ is the bin number, ranging from 0 to one less than the length of the window (one less than Matlab's 1-based array index). The next step is to loop through all of the spectral frames, identifying the dominant tone in each one and pushing each into the target bin of the nearest musical note.

However, because Matlab's implementation of the FFT doesn't allow for changing the actual frequencies that correspond to the coefficients, I had to change the frequencies effectively by altering the values of the amplitudes. Since we want to rebin the dominant tone of the signal, we want to go from frequency $m$ to frequency $m + \phi$, where $\phi$ is some phase factor that moves this frequency to the desired one. Writing out the series term as

$$\hat f_m e^{i(m + \phi)} = \hat f_m e^{i\phi} e^{im}$$

tells us that we should multiply the Fourier coefficients by a factor $e^{i\phi}$ that depends on the distance from the nearest desired frequency. This will shift all the frequencies to output the pitch-corrected signal, which should, in theory, produce a much more euphonic sound than the original clip.

There are a few restrictions, however. For a real-valued input like an audio clip, we have the relation $\hat f_{-m} = \overline{\hat f_m}$, which means that the coefficient with the maximum amplitude will appear twice. Thus, we need only analyze the coefficients $\hat f_0$ to $\hat f_{N/2}$, which will give us the correct difference between the actual frequency corresponding to the coefficient and the goal musical note. Furthermore, there isn't enough frequency resolution to separate the lower frequencies present in the signal. With a window size of 1024 and a sampling rate of 44.1 kHz, the bins are spaced approximately 43 Hz apart, so the information of the entire lowest musical octave, with values below 43 Hz, is all contained in the first Fourier coefficient, and it is impossible to resolve which notes are dominant in the signal. Thus, it is best just to avoid this difficulty and begin analyzing the dominant tones starting with the second coefficient.

This all sounds easy enough in theory, but the actual implementation was a bit more difficult. After a great deal of time put into making the code work, I wasn't able to get a functional auto-tune script up and running.
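The bin-to-Hertz conversion and the search for the nearest musical note can be sketched as follows, in Python with the standard library only. The paper does not specify the tuning system, so twelve-tone equal temperament with A4 = 440 Hz is an assumption here, and the function names are hypothetical.

```python
import math

def bin_to_hz(k, fs, N):
    # Center frequency of bin k (counted from 0) for sampling rate fs and window length N.
    return k * fs / N

def nearest_note_hz(freq_hz, a4=440.0):
    # Nearest pitch on an equal-tempered scale (assumed tuning: 12 steps per octave, A4 = a4).
    semitones = round(12.0 * math.log2(freq_hz / a4))
    return a4 * 2.0 ** (semitones / 12.0)

fs, N = 44100, 1024
spacing = bin_to_hz(1, fs, N)          # distance between neighboring bins, about 43 Hz
target = nearest_note_hz(450.0)        # a slightly sharp A4 is pulled back to 440 Hz
middle_c = nearest_note_hz(260.0)      # 260 Hz is closest to C4, about 261.63 Hz
```

Because neighboring bins are about 43 Hz apart, several low notes can fall into the same bin, which is the resolution limit described above; at higher frequencies the notes are spaced further apart than the bins.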
Though the computational aspect of this project was not completely fulfilled, the theory behind pitch correction programs is very interesting and creates a basis for other neat effects to be implemented on audio signals.

References

[1] Boulanger, Richard; Lazzarini, Victor. The Audio Programming Book. Massachusetts Institute of Technology, 2011.