Digital Video Systems ECE 634

Similar documents
Feature Tracking and Optical Flow

To determine vertical angular frequency, we need to express vertical viewing angle in terms of and. 2tan. (degree). (1 pt)

EECS 556 Image Processing W 09. Interpolation. Interpolation techniques B splines

Video Coding Basics. Yao Wang Polytechnic University, Brooklyn, NY11201

Analyzing Facial Expressions for Virtual Conferencing

How To Improve Performance Of The H264 Video Codec On A Video Card With A Motion Estimation Algorithm

Video Coding Standards. Yao Wang Polytechnic University, Brooklyn, NY11201

Two-Frame Motion Estimation Based on Polynomial Expansion

An Iterative Image Registration Technique with an Application to Stereo Vision

OBJECT TRACKING USING LOG-POLAR TRANSFORMATION

Performance Analysis and Comparison of JM 15.1 and Intel IPP H.264 Encoder and Decoder

ROBUST VEHICLE TRACKING IN VIDEO IMAGES BEING TAKEN FROM A HELICOPTER


Automatic Labeling of Lane Markings for Autonomous Vehicles

How To Analyze Ball Blur On A Ball Image

Latest Results on High-Resolution Reconstruction from Video Sequences

A Study on SURF Algorithm and Real-Time Tracking Objects Using Optical Flow

Monash University Clayton s School of Information Technology CSE3313 Computer Graphics Sample Exam Questions 2007

Introduction to image coding

INTRODUCTION TO RENDERING TECHNIQUES

Constrained curve and surface fitting

Finite Elements for 2 D Problems

Computational Optical Imaging - Optique Numerique. -- Deconvolution --

Accurate and robust image superresolution by neural processing of local image representations

Bandwidth Adaptation for MPEG-4 Video Streaming over the Internet

Building an Advanced Invariant Real-Time Human Tracking System

Object tracking & Motion detection in video sequences

Face Model Fitting on Low Resolution Images

Parametric Comparison of H.264 with Existing Video Standards

A QUICK GUIDE TO THE FORMULAS OF MULTIVARIABLE CALCULUS

Edge tracking for motion segmentation and depth ordering

The Scientific Data Mining Process

EdExcel Decision Mathematics 1

Efficient Motion Estimation by Fast Three Step Search Algorithms

Quality Estimation for Scalable Video Codec. Presented by Ann Ukhanova (DTU Fotonik, Denmark) Kashaf Mazhar (KTH, Sweden)

A Learning Based Method for Super-Resolution of Low Resolution Images

Efficient Coding Unit and Prediction Unit Decision Algorithm for Multiview Video Coding

Subspace Analysis and Optimization for AAM Based Face Alignment

Complexity-bounded Power Control in Video Transmission over a CDMA Wireless Network

Clustering Very Large Data Sets with Principal Direction Divisive Partitioning

Image Registration and Fusion. Professor Michael Brady FRS FREng Department of Engineering Science Oxford University

Multi-Block Gridding Technique for FLOW-3D Flow Science, Inc. July 2004

View Sequence Coding using Warping-based Image Alignment for Multi-view Video

Linear Threshold Units

Summary: Transformations. Lecture 14 Parameter Estimation Readings T&V Sec Parameter Estimation: Fitting Geometric Models

Vision based Vehicle Tracking using a high angle camera

jorge s. marques image processing

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION

OpenFOAM Optimization Tools

CONVERGE Features, Capabilities and Applications

Image Analysis for Volumetric Industrial Inspection and Interaction

The Effects of Start Prices on the Performance of the Certainty Equivalent Pricing Policy

Computer Graphics. Geometric Modeling. Page 1. Copyright Gotsman, Elber, Barequet, Karni, Sheffer Computer Science - Technion. An Example.

H 261. Video Compression 1: H 261 Multimedia Systems (Module 4 Lesson 2) H 261 Coding Basics. Sources: Summary:

Automatic 3D Mapping for Infrared Image Analysis

Single Depth Image Super Resolution and Denoising Using Coupled Dictionary Learning with Local Constraints and Shock Filtering

Template-based Eye and Mouth Detection for 3D Video Conferencing

A Short Introduction to Computer Graphics

Scientific Visualization with ParaView

Metrics on SO(3) and Inverse Kinematics

Low-resolution Character Recognition by Video-based Super-resolution

Video stabilization for high resolution images reconstruction

Colour Image Segmentation Technique for Screen Printing

Real-Time BC6H Compression on GPU. Krzysztof Narkowicz Lead Engine Programmer Flying Wild Hog

Introduction to Engineering System Dynamics

Traffic Monitoring Systems. Technology and sensors

Image Segmentation and Registration

Performance Verification of Super-Resolution Image Reconstruction

Introduction to the Finite Element Method (FEM)

Lecture 5: Variants of the LMS algorithm

Integer Computation of Image Orthorectification for High Speed Throughput

Introduction to acoustic imaging

The Essence of Image and Video Compression 1E8: Introduction to Engineering Introduction to Image and Video Processing

Video Coding Standards and Scalable Coding

Line and Polygon Clipping. Foley & Van Dam, Chapter 3

Visualization and Feature Extraction, FLOW Spring School 2016 Prof. Dr. Tino Weinkauf. Flow Visualization. Image-Based Methods (integration-based)

animation animation shape specification as a function of time

Høgskolen i Narvik Sivilingeniørutdanningen STE6237 ELEMENTMETODER. Oppgaver

Transform-domain Wyner-Ziv Codec for Video

DYNAMIC RANGE IMPROVEMENT THROUGH MULTIPLE EXPOSURES. Mark A. Robertson, Sean Borman, and Robert L. Stevenson

Super-Resolution Methods for Digital Image and Video Processing

Interpolation of RGB components in Bayer CFA images

Image Compression through DCT and Huffman Coding Technique

Adaptation of General Purpose CFD Code for Fusion MHD Applications*

Fast Hybrid Simulation for Accurate Decoded Video Quality Assessment on MPSoC Platforms with Resource Constraints

Wavelet analysis. Wavelet requirements. Example signals. Stationary signal 2 Hz + 10 Hz + 20Hz. Zero mean, oscillatory (wave) Fast decay (let)

1 Error in Euler s Method

Object tracking in video scenes

Protein Protein Interaction Networks

Geometric Camera Parameters

Joint MAP Registration and High Resolution Image Estimation Using a Sequence of Undersampled Images 1

3D Scanner using Line Laser. 1. Introduction. 2. Theory

2.5 Physically-based Animation

Performance Optimization of I-4 I 4 Gasoline Engine with Variable Valve Timing Using WAVE/iSIGHT

Linear Programming for Optimization. Mark A. Schulze, Ph.D. Perceptive Scientific Instruments, Inc.

Transcription:

Digital Video Systems ECE 634 engineering.purdue.edu/~reibman/ece634/index.html Professor Amy Reibman MSEE 356 reibman@purdue.edu

Outline 1/27/15 MoKon eskmakon A computer assignment

Summary (last class) 3D Motion Rigid vs. non-rigid motion Camera model: 3D à 2D projection Perspective projection vs. orthographic projection What causes 2D motion? Object motion projected to 2D Camera motion Models corresponding to typical camera motion and object motion Piece-wise projective mapping is a good model for projected rigid object motion Can be approximated by affine or bilinear functions Affine functions can also characterize some global camera motions Ways to represent motion: Pixel-based, block-based, region-based, global, etc. Yao Wang, 2003 3

MoKon Field Corresponding to Different 2- D MoKon Models Translation Affine Bilinear Projective Yao Wang, 2003 4

Region of support for representakon of mokon Global: Entire motion field is represented by a few global parameters Pixel-based: One MV at each pixel, with some smoothness constraint between adjacent MVs. Block-based: Entire frame is divided into blocks, and motion in each block is characterized by a few parameters. Region-based: Entire frame is divided into regions, each region corresponding to an object or sub-object with consistent motion, represented by a few parameters. Yao Wang, 2003 5

MulKple objects: Uncovered background and occlusion Motion is undefined in occluded regions Yao Wang, 2003 6

MoKon eskmakon algorithms MoKon representakon Pixel- based, block- based, mesh, global mokon,.. OpKmizaKon criteria Minimize displaced frame difference, opkcal flow, while subject to constraints.. OpKmizaKon strategies Gradient descent, exhauskve search,..

MoKon eskmakon outline 2D motion and optical flow Motion estimation General methodologies Pixel-based Block-based

Apparent mokon, or opkcal flow 2D velocity (object mokon projected onto the image plane) is not the same as opkcal flow Case 1: constantly lit sphere rotakng Case 2: skll sphere with changing lighkng Yao Wang, 2003

OpKcal flow derivakon (1) Constant intensity assumption (ambient and diffuse lighting) ψ ( x + d, y + d, t + d ) = ψ ( x, y, t) Taylor s expansion So clearly x y ψ(x + d x, y + d y,t + d t ) = ψ(x, y,t)+ ψ x d x + ψ y d y + ψ From Yao Wang, 2003 10 t ψ x d x + ψ y d y + ψ t d t = 0 ψ x v + ψ x y v + ψ y t = 0 y d t

OpKcal flow derivakon (2) Constant intensity assumption Apply chain rule dψ(x, y,t) dt = 0 ψ x dx dt + ψ dy y dt + ψ t = ψ x v x + ψ y v y + ψ t = 0 From Yao Wang, 2003 11

Flow equakon ψ x v x + ψ y v y + ψ t = 0 or ψ T v + ψ t = 0 Can only eskmate mokon in direckon orthogonal to spakal gradient Applies to a single point only

AmbiguiKes in MoKon EsKmaKon Optical flow equation only constrains the flow vector in the gradient direction The flow vector in the tangent direction ( v t ) is under-determined In regions with constant brightness ( ψ = 0), the flow is indeterminate à Motion estimation is unreliable in regions with flat texture, more reliable near edges v n v = v e v n n n + v e ψ ψ + = 0 t t t Yao Wang, 2003 13

Inaccuracies Object boundaries MoKon eskmakon more reliable around strong edges, but strong edges are likely to be where two objects move differently Occlusion No correspondence exists for (un)covered background The aperture problem This is an underconstrained problem; one equakon, two unknowns. Tangent direckon is undetermined Aperature must contain at least 2 different gradient direckons

Flow equakon ψ x v x + ψ y v y + ψ t = 0 or ψ T v + ψ t = 0 May not hold exactly for real images Noise, aliasing, illuminakon variakons,.. Instead, minimize some funckon of ψ x v + ψ x y v + ψ y t

Two categories of approaches for MoKon EsKmaKon Feature based (more often used in object tracking, 3D reconstruction from 2D) Find motion only for sparse points Impose a motion model to estimate a dense field Intensity based (based on constant intensity assumption) (more often used for motion compensated prediction, required in video coding, frame interpolation) Yao Wang, 2003 16

General ConsideraKons for MoKon EsKmaKon Three important questions How to represent the motion field? (ex: dense or sparse? Region or ) What optimization criteria to use to estimate motion parameters? Depends on the application; compression minimize average prediction error; motion-compensated interpolation minimize maximum interpolation error How to search for the best motion parameters? Yao Wang, 2003 17

Mix- and- match Motion representation Optimization criteria Optimization strategies Examples: Pixel-based representation, DFD, gradient descent Pixel-based representation, OF, least-square Block-based representation, DFD, exhaustive search Block-based representation, DFD, hierarchical search Mesh representation, DFD, iterative search Global motion

NotaKons Motion parameters: a Motion field: d ( x; a), x Λ ψ ( ) 2 x ψ ( ) ψ 2 x ( ) 1 x Mapping function: w ( x; a) = x + d( x; a), x Λ Yao Wang, 2003 19

MoKon EsKmaKon Criteria Minimize the displaced frame difference (DFD) E DFD( 1 x Λ a) = ψ 2( x + d( x; a)) ψ ( x) p = 1: MAD; P = 2:MSE p min Satisfy the optical flow (OF) equation E OF T ( ψ ( x) ) d( x; a) + ψ ( x) ψ ( x) min ( a) = 1 2 1 x Λ Impose additional smoothness constraint using regularization technique (Important in pixel- and block-based representation) E w s ( a) DFD E = x Λ y N DFD x ( a) + w E d( x; a) d( y; a) s s ( a) min Yao Wang, 2003 20 2 p

OpKmizaKon Strategies to find Min. or Max. Exhaustive search Typically used for the DFD criterion with p=1 (MAD) Guarantees reaching the global optimal Computation required may be unacceptable when there are many parameters to search simultaneously! Fast search algorithms reach sub-optimal solution in shorter time Gradient-based search Typically used for the DFD or OF criterion with p=2 (MSE) the gradient can often be calculated analytically When used with the OF criterion, closed-form solution may be obtained Reaches the local optimal point closest to the initial solution Multi-resolution search Search from coarse to fine resolution, faster than exhaustive search Less likely to be trapped into a local minimum Yao Wang, 2003 21

High- level Framework Motion representation Optimization criteria Optimization strategies Mix and match Pixel-based representation, DFD, gradient descent Pixel-based representation, OF, least-squares Block-based representation, DFD, exhaustive search Block-based representation, DFD, hierarchical search Mesh representation, DFD, iterative search Global motion ARReibman 2011 22

Block Matching Algorithm Overview: Assume all pixels in a block undergo a translation, denoted by a single MV Estimate the MV for each block independently, by minimizing the DFD error over this block Minimizing function: E DFD ( dm) = ψ 2( x + dm) ψ1( x) x B m min Optimization method: Exhaustive search (feasible as one only needs to search one MV for all pixels in the block), using MAD criterion (p=1) Fast search algorithms Integer vs. fractional pel accuracy search p Yao Wang, 2003 23

ExhausKve Block Matching Algorithm (EBMA) Yao Wang, 2003 24

Complexity of Integer- Pel EBMA Assumption Image size: MxM Block size: NxN Search range: (-R,R) in each dimension Search stepsize: 1 pixel (assuming integer MV) Operation counts (1 operation=1 -, 1 +, 1 * ): Each candidate position: N^2 Each block going through all candidates: (2R+1)^2 N^2 Entire frame: (M/N)^2 (2R+1)^2 N^2=M^2 (2R+1)^2 Independent of block size! Example: M=512, N=16, R=16, 30 fps Total operation count = 2.85x10^8/frame =8.55x10^9/second Regular structure suitable for VLSI implementation Software-only implementation slow Yao Wang, 2003 25

Pseudo- code/matlab Script for Integer- pel EBMA %f1: anchor frame; f2: target frame, fp: predicted image; %mvx,mvy: store the MV image %widthxheight: image size; N: block size, R: search range for ii=1:n:height-n, for jj=1:n:width-n %for every block in the anchor frame MAD_min=256*N*N;mvx=0;mvy=0; for kk=-r:1:r, for ll=-r:1:r %for every search candidate MAD=sum(sum(abs(f1(ii:ii+N-1,jj:jj+N-1)-f2(ii+kk:ii+kk+N-1,jj+l:jj+ll+N-1)))); % calculate MAD for this candidate if MAD<MAX_min MAD_min=MAD,dy=kk,dx=ll; end; end;end; fp(ii:ii+n-1,jj:jj+n-1)= f2(ii+dy:ii+dy+n-1,jj+dx:jj+dx+n-1); %put the best matching block in the predicted image iblk=(floor)(ii-1)/n+1; jblk=(floor)(jj-1)/n+1; %block index mvx(iblk,jblk)=dx; mvy(iblk,jblk)=dy; %record the estimated MV end;end; Note: A real working program needs to check whether a pixel in the candidate matching block falls outside the image boundary and such pixel should not count in MAD. This program is meant to illustrate the main operations involved. Not the actual working matlab script. 26 Yao Wang, 2003

FracKonal Accuracy EBMA Real MV may not always be multiples of pixels. To allow sub-pixel MV, the search stepsize must be less than 1 pixel Half-pel EBMA: stepsize=1/2 pixel in both dimension Difficulty: Target frame only have integer pels Solution: Interpolate the target frame by factor of two before searching Bilinear interpolation is typically used Complexity: 4 times of integer-pel, plus additional operations for interpolation. Fast algorithms: Search in integer precisions first, then refine in a small search region in half-pel accuracy. Yao Wang, 2003 27

Half- Pel Accuracy EBMA Yao Wang, 2003 28

Bilinear InterpolaKon (x,y) (x+1,y) (2x,2y) (2x+1,2y) (2x,2y+1) (2x+1,2y+1) (x,y+!) (x+1,y+1) O[2x,2y]=I[x,y] O[2x+1,2y]=(I[x,y]+I[x+1,y])/2 O[2x,2y+1]=(I[x,y]+I[x+1,y])/2 O[2x+1,2y+1]=(I[x,y]+I[x+1,y]+I[x,y+1]+I[x+1,y+1])/4 Yao Wang, 2003 29

Motion field target frame Predicted anchor frame (29.86dB) anchor frame Example: Half-pel EBMA

Problems with EBMA MoKon field is chaokc Each block s mokon vector is computed independently Many possible matches, especially in smooth regions DFD is not uniformly small within block Poor mokon model: Block may contain mulkple mokons Block does not undergo translakon IlluminaKon changes DFD is not uniformly small across block boundaries Poor mokon model: Adjacent pixels can have very different mokons Slow ARReibman 2011 31

Minimizing problems with EBMA MoKon field is chaokc Use hierarchical search Impose smoothness constraints (including mesh- based model) DFD is not uniformly small within block Improve mokon model (Deformable and mesh- based models; region- based eskmakon; compensate for variable illuminakon) DFD is not uniformly small across block boundaries Mesh- based mokon models; compute pixel- based mokon Slow Use fast algorithms and hierarchical search ARReibman 2011 32

Fast Algorithms for BMA Two key ideas to reduce the computakon in EBMA Reduce # of search candidates: Only search for those that are likely to produce small errors. Predict possible remaining candidates, based on previous search result Reduce the computakon for each candidate by simplifying the DFD error measure Subsample and don t compute DFD on all possible pixels Many many fast algorithms Three- step 2D- log Conjugate direckon Yao Wang, 2003 33

Summary (so far this class) Constraints for 2D motion Optical flow equation Derived from constant intensity and small motion assumption Ambiguity in motion estimation Estimation criterion: DFD (constant intensity) OF (constant intensity+small motion) Search method: Exhaustive search, gradient-descent, multi-resolution Pixel-based motion estimation Most accurate representation, but also most costly to estimate Block-based motion estimation Good trade-off between accuracy and speed EBMA and its fast but suboptimal variants are widely used in video coding for motion-compensated temporal prediction. Yao Wang, 2003 34

Some reference material Wang, Ostermann, Zhang, Video Processing and Communications, Prentice Hall, 2002 Chap 5: Sec. 5.1, 5.5 Chap 6: Sec. 6.1-6.4, Sec. 6.4.5,6.4.6 not required, Apx. A, B. J. Konrad, Motion detection and estimation, Chapter 3 in A. C. Bovik (ed.), The Essen(al Guide to Video Processing, Elsevier 2009 A. M. Tekalp, Digital video processing, Prentice Hall, 1995 Chapter 5: 5.1,5.2 Chapter 6: 6.1, 6.3, 6.4 35

Computer assignments Due Tuesday 2/3/15 at 1:30pm in class Note: you can download sample video frames from the course webpage. When applying your motion estimation algorithm, you should choose two frames that have sufficient motion in between so that it is easy to observe effect of motion estimation inaccuracy. If necessary, choose two frames that are several frames apart. For example, foreman: frame 100 and frame 103. 36

Computer assignment (1) Write a C or Matlab program for implemenkng an exhauskve search block matching algorithm with integer- pel accuracy. Program inputs: video sequence, block size, search range. Program outputs: eskmated mokon field. Assignment deliverables: Program source code Plot of eskmated mokon field for two images from video sequence. (In Matlab, use funckon quiver ) The predicted image, the predickon error image The PSNR of the predicted frame relakve to its original ExploraKons Impact of different search range Impact of different predicted- block size (16*16 as starkng point)

Computer assignment (2) Write a C or Matlab program for implemenkng an exhauskve search block matching algorithm with half- pel accuracy. Program inputs: video sequence, block size, search range. Program outputs: eskmated mokon field. Assignment deliverables: Program source code Plot of eskmated mokon field for two images from video sequence. (In Matlab, use funckon quiver ) The predicted image, the predickon error image The PSNR of the predicted frame relakve to its original Compare PSNR of the predicted image with results from integer- pel accuracy Compare execukon Kme relakve to results from integer- pel accuracy ExploraKons Impact of different search range Impact of different predicted- block size