Nonlinear Blind Source Separation and Independent Component Analysis

Nonlinear Blind Source Separation and Independent Component Analysis

Prof. Juha Karhunen
Helsinki University of Technology, Neural Networks Research Centre, Espoo, Finland

Part I: Linear Independent Component Analysis and Blind Source Separation

Motivation for independent component analysis (ICA) and blind source separation (BSS)

Let us start with an example: three people are speaking simultaneously in a room that has three microphones. Denote the microphone signals by x_1(t), x_2(t), and x_3(t). Each is a weighted sum of the speech signals, which we denote by s_1(t), s_2(t), and s_3(t):

x_1(t) = a_{11} s_1(t) + a_{12} s_2(t) + a_{13} s_3(t)   (1)
x_2(t) = a_{21} s_1(t) + a_{22} s_2(t) + a_{23} s_3(t)   (2)
x_3(t) = a_{31} s_1(t) + a_{32} s_2(t) + a_{33} s_3(t)   (3)

Cocktail-party problem: estimate the original speech signals using only the recorded signals.
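For concreteness, here is a minimal numerical sketch of how mixtures of the form (1)-(3) arise; the synthetic stand-in sources and the mixing matrix A below are illustrative assumptions, not taken from the slides:

    import numpy as np

    rng = np.random.default_rng(0)
    t = np.arange(3500) / 8000.0                      # 3500 samples, as in the figures

    # Three synthetic stand-ins for the speech sources s_1(t), s_2(t), s_3(t)
    s = np.vstack([
        np.sin(2 * np.pi * 300 * t),                  # tone
        np.sign(np.sin(2 * np.pi * 120 * t)),         # square wave
        rng.laplace(size=t.size),                     # impulsive noise
    ])

    # Arbitrary invertible mixing matrix holding the weights a_ij of (1)-(3)
    A = np.array([[0.8, 0.3, 0.5],
                  [0.4, 0.9, 0.2],
                  [0.3, 0.6, 0.7]])

    x = A @ s        # row x[i] is the microphone signal x_{i+1}(t)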

Figure 1: The original speech waveforms.

Figure 2: The observed microphone signals.

The problem: find the sources s_1(t), s_2(t), and s_3(t) from the observed signals x_1(t), x_2(t), and x_3(t).

As the weights a_{ij} are different, we may assume that the matrix A = (a_{ij}), although unknown, is invertible. Thus there exists another set of weights w_{ij} such that

s_1(t) = w_{11} x_1(t) + w_{12} x_2(t) + w_{13} x_3(t)   (4)
s_2(t) = w_{21} x_1(t) + w_{22} x_2(t) + w_{23} x_3(t)
s_3(t) = w_{31} x_1(t) + w_{32} x_2(t) + w_{33} x_3(t)

It turns out that this blind source separation (BSS) problem can be solved using independent component analysis (ICA). In ICA, it suffices to assume that the sources s_j(t) are nongaussian and statistically independent.
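A tiny numerical illustration of why the weights w_{ij} in (4) exist (the mixing matrix and placeholder sources are arbitrary, not from the slides): when A is invertible, the w_{ij} are simply the entries of A^{-1}; the point of BSS is that ICA must estimate them without knowing A.

    import numpy as np

    rng = np.random.default_rng(0)
    A = np.array([[0.8, 0.3, 0.5],
                  [0.4, 0.9, 0.2],
                  [0.3, 0.6, 0.7]])          # arbitrary invertible mixing matrix
    s = rng.laplace(size=(3, 1000))          # placeholder source signals
    x = A @ s                                # observed mixtures, as in (1)-(3)

    W = np.linalg.inv(A)                     # the weights w_ij of eq. (4)
    print(np.allclose(W @ x, s))             # True: sources recovered exactly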

Figure 3: The estimates of the speech waveforms obtained by ICA.

Definition of Independent Component Analysis

The ICA model is a statistical latent variable model

x_i = a_{i1} s_1 + a_{i2} s_2 + ... + a_{in} s_n,   for all i = 1, ..., n   (5)

where the a_{ij}, i, j = 1, ..., n, are some real coefficients. This is the basic linear ICA model, which can be extended in many ways.

In the basic ICA model, we assume that each mixture x_i as well as each independent component s_j is a random variable.

Using vector-matrix formulation, let

x = (x_1, ..., x_n)^T,   s = (s_1, ..., s_n)^T,   A = (a_{ij})   (6)

Then the basic ICA model is

x = As   (7)

If the columns of A are denoted a_j, the model can also be written as

x = Σ_{j=1}^{n} a_j s_j   (8)

There are some basic assumptions or restrictions in the model:

1. The independent components are assumed statistically independent.
2. The independent components must have nongaussian distributions.
   - In the basic ICA, we need not know these distributions.
3. In the basic ICA, the unknown mixing matrix A is square.
   - In other words, the number of independent components is equal to the number of observed mixtures.
   - This assumption can be relaxed by allowing more or fewer mixtures than independent components.
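As a small illustration of assumption 2 (excess kurtosis is not mentioned in the slides; it is just one simple nongaussianity measure), the following sketch shows that excess kurtosis is near zero for a Gaussian variable and clearly nonzero for Laplacian or uniform variables:

    import numpy as np

    rng = np.random.default_rng(0)
    N = 100_000

    def excess_kurtosis(v):
        """Fourth-moment ratio minus 3: approximately zero for a Gaussian."""
        v = v - v.mean()
        return np.mean(v**4) / np.mean(v**2)**2 - 3.0

    samples = {
        "gaussian":  rng.normal(size=N),
        "laplacian": rng.laplace(size=N),
        "uniform":   rng.uniform(-1, 1, size=N),
    }
    for name, v in samples.items():
        print(name, round(excess_kurtosis(v), 2))   # approx. 0.0, 3.0, -1.2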

Indeterminacies in the basic ICA model: scaling, sign, and order of the independent components. That is, only the waveforms of the independent components can be recovered without further information.

Methods for linear ICA

Independent components are usually estimated by trying to find a separating matrix B such that the components of the output vector

y = Bx   (9)

are statistically independent. Ideally, B = A^{-1}.

Even though the ICA model x = As is linear and simple, the problem is difficult because of its blind nature. Higher-order statistics are needed for ICA. Using second-order statistics (covariances) provides uncorrelated components only. There exist infinitely many such uncorrelated solutions; most of them are quite different from ICA.

However, prewhitening the data vectors x so that their components become uncorrelated is a useful preprocessing step. After prewhitening, the separating matrix B becomes orthogonal.

Many methods for linear ICA now exist; the most popular of them are:

The natural gradient algorithm

∆B = µ [I − g(y) y^T] B   (10)

- Here g(y) is a suitable nonlinearity applied to the components of the output vector y.
- A simple adaptive neural algorithm, well justified theoretically.

Fixed-point (FastICA) algorithms
- Fast batch algorithms applicable to large-scale problems.

For more information, see our new 500-page textbook/monograph: A. Hyvärinen, J. Karhunen, and E. Oja, Independent Component Analysis, Wiley, 2001.

Linear blind source separation (BSS)

In linear blind source separation (BSS), one tries to separate the original source signals from their linear mixtures. Assuming that the sources are independent and the mixing model is linear, x = As, one can apply linear ICA methods directly to BSS.
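A minimal sketch of applying the natural gradient rule (10) to such a linear BSS problem: the data are prewhitened as discussed above, and B is then updated in batch form with g(y) = tanh(y). The tanh nonlinearity (suited to super-Gaussian sources such as speech), the step size, and the iteration count are illustrative assumptions, not prescribed by the slides.

    import numpy as np

    rng = np.random.default_rng(0)
    n, T = 3, 5000

    # Independent super-Gaussian sources mixed by an arbitrary matrix A
    s = rng.laplace(size=(n, T))
    A = rng.normal(size=(n, n))
    x = A @ s

    # Prewhitening: after this, the remaining separating transform is a rotation
    x = x - x.mean(axis=1, keepdims=True)
    d, E = np.linalg.eigh(np.cov(x))
    V = E @ np.diag(1.0 / np.sqrt(d)) @ E.T      # whitening matrix
    z = V @ x                                     # cov(z) is approximately I

    # Natural gradient updates, eq. (10), averaged over the whole batch
    B = np.eye(n)
    mu = 0.05
    for _ in range(1000):
        y = B @ z
        B = B + mu * (np.eye(n) - np.tanh(y) @ y.T / T) @ B

    y = B @ z                        # estimated components, up to scale, sign, order
    print(np.round(B @ V @ A, 2))    # should be close to a scaled permutation matrix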

Another major group of linear BSS methods utilizes the time structure of the sources. Second-order temporal statistics are then sufficient for achieving blind separation. The sources can even be Gaussian, provided that they have different autocorrelation sequences.

ICA neglects the possible temporal structure of the sources or independent components, treating them as random variables. On the other hand, it also works for temporally uncorrelated sources.

Ideally, both spatial independence and temporal structure should be taken into account in estimation.
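The slides do not name a particular second-order method; as one common instance (an assumption here, an AMUSE-type procedure), the data are whitened and a single time-lagged covariance matrix is then diagonalized. A minimal sketch with Gaussian AR(1) sources whose autocorrelation sequences differ:

    import numpy as np

    rng = np.random.default_rng(1)
    T = 10_000

    def ar1(phi, T, rng):
        """Gaussian AR(1) process with autoregressive coefficient phi."""
        e = rng.normal(size=T)
        s = np.zeros(T)
        for k in range(1, T):
            s[k] = phi * s[k - 1] + e[k]
        return s

    # Two Gaussian sources with clearly different autocorrelations
    s = np.vstack([ar1(0.9, T, rng), ar1(0.3, T, rng)])
    A = rng.normal(size=(2, 2))
    x = A @ s

    # Whitening
    x = x - x.mean(axis=1, keepdims=True)
    d, E = np.linalg.eigh(np.cov(x))
    V = E @ np.diag(1.0 / np.sqrt(d)) @ E.T
    z = V @ x

    # Symmetrized lag-1 covariance of the whitened data; its eigenvectors give
    # the remaining orthogonal rotation when its eigenvalues are distinct
    tau = 1
    C = z[:, :-tau] @ z[:, tau:].T / (T - tau)
    C = (C + C.T) / 2
    _, U = np.linalg.eigh(C)
    y = U.T @ z          # separated sources, up to scale, sign, and order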

Practical applications of ICA

- The cocktail-party problem: separation of voices, music, or sounds.
- Sensor array processing, e.g. radar.
- Biomedical signal processing with multiple sensors: EEG, ECG, MEG, fMRI.
- Telecommunications: e.g. multiuser detection in CDMA.
- Financial and other time series.
- Noise removal from signals and images.
- Feature extraction for images and signals.
- Projection pursuit: finding interesting projections of the data for visualizing it in two dimensions.

Figure 4: Basis functions in ICA of natural images. These basis functions can be considered as the independent features of images. Every image window is a linear sum of these windows.

Figure 5: 12 magnetic brain (MEG) signals containing various artifacts: ocular and muscle activity, the cardiac cycle, and magnetic disturbances. (Reference EOG and ECG channels and periods of saccades, blinking, and biting are also shown; time scale 10 s.)

Figure 6: An example of multipath propagation in an urban environment (magnitude as a function of time delay).

Extensions of basic linear ICA

- Noisy ICA: estimation of the mixing matrix and the independent components requires more sophisticated methods.
- Overcomplete bases: the number of independent components is larger than the number of mixtures.
- Taking into account the temporal structure in the data.
- ICA and BSS for nonlinear mixture models.
- Separation of convolutive mixtures containing time delays.
- Separation of correlated or non-independent sources.
- Nonstationary sources, time-dependent mixing matrices.
- Semi-blind problems: some prior information on the source signals and/or mixtures is available.