Daala: Tools and Techniques IETF 93 (Prague)


Daala Inter Encoder
[Block diagram. Labels: Reference Frames, Input Frame, OBMC, Prediction Frame, Forward Transform (x2), Input Coefficients, Prediction Coefficients, PVQ, Entropy Coding, Bitstream, DCs and PVQ data, Dequant + Inverse Transform, Decoded Image]

Daala Intra Encoder
[Block diagram. Labels: Input Frame, Forward Transform, Input Coefficients, H+V Intra Prediction, Chroma from Luma, Prediction Coefficients, Haar DC, PVQ, Entropy Coding, Bitstream, DCs and PVQ data, Dequant + Inverse Transform, Decoded Image]

Transform
- Lapped transform
  - N-point pre-filter removes correlation between blocks
  - N-point DCT within blocks
  - Decoder applies inverse (non-adaptive post-filter)
[Figure: chain of prefilters (P) and DCTs on the encoder side, IDCTs and postfilters (P^-1) on the decoder side]

Transforms (draft-egge-netvc-tdlt-00)
- Sizes: 4x4, 8x8, 16x16, 32x32 (64x64 in progress)
- Reversible: ilt(flt(x)) == x for all x
- Biorthogonal (not orthogonal)
  - Not all basis functions have the same magnitude
  - Slight correlation between coefficients
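The reversibility property can be illustrated with integer lifting steps. The sketch below is a toy 2-point transform, not the Daala lapped transform; the names `flt2`, `ilt2`, and `roundtrip_ok` are hypothetical. It only shows why a transform built from lifting steps inverts exactly even though each step rounds.

```python
def flt2(a, b):
    """Forward 2-point integer lifting transform (toy filter, not Daala's)."""
    d = b - a           # lifting step 1: difference
    s = a + (d >> 1)    # lifting step 2: floor-rounded average
    return s, d


def ilt2(s, d):
    """Inverse: undo the lifting steps in reverse order, same rounding."""
    a = s - (d >> 1)    # undo step 2
    b = d + a           # undo step 1
    return a, b


def roundtrip_ok(lo=-64, hi=64):
    """Check ilt2(flt2(x)) == x over a grid of integer inputs."""
    return all(ilt2(*flt2(a, b)) == (a, b)
               for a in range(lo, hi) for b in range(lo, hi))
```

Because each lifting step only adds a function of the other channel, the inverse can subtract exactly the same rounded value, so no information is lost.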

Overlapped Block Motion Compensation (draft-terriberry-netvc-obmc)
- Goal: MC prediction with no blocking artifacts
- Variable block size (unrelated to transform size)
  - Unlike Dirac, larger blocks use a larger blend window
  - Currently restricted so adjacent blocks differ by at most one size, to maintain continuity
  - Possible to remove this restriction
- Subpel: separable 6-tap filters
  - Windowed sinc
  - 7-bit coefficients, no truncation/rounding between horizontal/vertical stages
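The blending idea can be sketched in one dimension. This is a simplified illustration under stated assumptions: fixed block size, integer motion vectors, and a linear ramp as the blend window. The actual Daala windows, variable block sizes, and subpel filters are not reproduced, and `obmc_1d` is a hypothetical name.

```python
import math


def obmc_1d(ref, mvs, block):
    """Blend per-block motion-compensated predictions with linear ramps.

    ref   : reference samples (1-D list)
    mvs   : one integer motion vector per block
    block : block size in samples
    """
    last = len(ref) - 1
    out = []
    for x in range(len(mvs) * block):
        pos = (x + 0.5) / block - 0.5        # position in units of block centers
        left = math.floor(pos)
        w = pos - left                       # weight of the right-hand block
        i0 = min(max(left, 0), len(mvs) - 1)       # clamp at frame edges
        i1 = min(max(left + 1, 0), len(mvs) - 1)
        p0 = ref[min(max(x + mvs[i0], 0), last)]   # prediction using left MV
        p1 = ref[min(max(x + mvs[i1], 0), last)]   # prediction using right MV
        out.append((1.0 - w) * p0 + w * p1)        # weights sum to 1
    return out
```

With identical motion vectors the blend reduces to plain motion compensation; with differing vectors the prediction moves gradually from one block's vector to the next instead of jumping at the block edge.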

OBMC Prediction
- OBMC produces a prediction image for the whole frame
- PVQ requires a prediction in the frequency domain
  - Just apply forward transforms to the prediction image
- Currently no explicit intra mode
  - PVQ can choose to discard the prediction (noref)
  - Our encoder still spends bits trying to improve it during motion search

Displaced Frame Difference
- Motion compensation
  - Copy blocks from an already encoded frame (offset by a motion vector)
  - Subtract from the current frame
  - Code the residual
[Figure: Residual = Input - Reference frame]
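A minimal sketch of the copy, subtract, and code-the-residual steps for one block, assuming integer motion vectors and edge clamping. The function names are hypothetical, and frames are plain lists of rows.

```python
def predict_block(ref, bx, by, size, mv):
    """Copy a size x size block from ref, displaced by mv = (dx, dy)."""
    dx, dy = mv
    h, w = len(ref), len(ref[0])
    return [[ref[min(max(by + y + dy, 0), h - 1)][min(max(bx + x + dx, 0), w - 1)]
             for x in range(size)] for y in range(size)]


def block_residual(cur, ref, bx, by, size, mv):
    """Residual = current block minus motion-compensated prediction."""
    pred = predict_block(ref, bx, by, size, mv)
    return [[cur[by + y][bx + x] - pred[y][x] for x in range(size)]
            for y in range(size)]


def reconstruct_block(ref, res, bx, by, size, mv):
    """Decoder side: prediction plus decoded residual."""
    pred = predict_block(ref, bx, by, size, mv)
    return [[pred[y][x] + res[y][x] for x in range(size)] for y in range(size)]
```

When the motion vector matches the true displacement the residual is all zeros, which is what makes it cheap to code.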


Perceptual Vector Quantization
- Separate gain (contrast) from shape (spectrum)
  - Vector = Magnitude × Unit Vector (point on sphere)
- Potential advantages
  - Better contrast preservation
  - Better representation of coefficients
  - Free activity masking
    - Can throw away more information in regions of high contrast (relative error is smaller)
    - The gain is what we need to know to do this!
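The gain/shape split can be sketched as follows. The shape quantizer here is a simple greedy search over integer vectors whose absolute values sum to K (a pyramid codebook); it is an illustrative assumption, not Daala's actual search, and all function names are hypothetical. Prediction is ignored.

```python
import math


def gain_shape(x):
    """Split x into gain (L2 norm) and shape (unit vector)."""
    gain = math.sqrt(sum(v * v for v in x))
    if gain == 0.0:
        return 0.0, [0.0] * len(x)
    return gain, [v / gain for v in x]


def pvq_search(shape, k):
    """Greedy search: place k unit pulses to maximize the normalized
    correlation with the shape (illustrative, not an optimal search)."""
    mags = [abs(s) for s in shape]
    y = [0] * len(shape)
    for _ in range(k):
        best_i, best_score = 0, -1.0
        for i in range(len(y)):
            y[i] += 1                                   # trial pulse
            corr = sum(m * p for m, p in zip(mags, y))
            score = corr / math.sqrt(sum(p * p for p in y))
            y[i] -= 1
            if score > best_score:
                best_i, best_score = i, score
        y[best_i] += 1
    return [p if s >= 0 else -p for p, s in zip(y, shape)]  # restore signs


def reconstruct(gain, y):
    """Decoder side: normalize the pulse vector and scale by the gain."""
    norm = math.sqrt(sum(p * p for p in y))
    return [gain * p / norm for p in y] if norm else [0.0] * len(y)
```

Whatever error the shape quantizer makes, the reconstructed vector has exactly the coded gain, which is the contrast-preservation property the slide refers to.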

PVQ with a Prediction
- Subtracting and coding a residual loses energy preservation
  - The gain no longer represents the contrast
- But we still want to use predictors
  - They do a really good job of reducing what we need to code
  - Hard to use prediction on the shape (on the surface of a hyper-sphere)
- Solution: transform the space to make it easier

2-D Projection Example
- Input + Prediction
- Compute Householder reflection
- Apply reflection
- Compute & code angle θ
- Code other dimensions
[Figure: 2-D plot of the Input and Prediction vectors and the angle θ]

What does this accomplish?
- Creates another intuitive parameter, θ
  - How much like the predictor are we?
  - θ = 0 → use predictor exactly
- Remaining N-1 dimensions are coded with VQ
  - We know their magnitude is gain*sin(θ)
- Instead of subtraction (translation), we're scaling and reflecting
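The reflection steps can be sketched without any quantization. The version below assumes a non-zero predictor `r` and reflects it onto the coordinate axis of its largest component; the helper names are hypothetical, and the real codec quantizes the gain, θ, and the remaining N-1 dimensions rather than passing them through as floats.

```python
import math


def householder_vector(r):
    """Return (v, m, s): reflecting across the plane normal to v maps r
    to -s*|r| along axis m, where m is r's largest component."""
    m = max(range(len(r)), key=lambda i: abs(r[i]))
    s = 1.0 if r[m] >= 0 else -1.0
    v = list(r)
    v[m] += s * math.sqrt(sum(c * c for c in r))
    return v, m, s


def reflect(v, x):
    """Apply the Householder reflection x - 2*v*(v.x)/(v.v)."""
    scale = 2.0 * sum(a * b for a, b in zip(v, x)) / sum(a * a for a in v)
    return [b - scale * a for a, b in zip(v, x)]


def encode(x, r):
    """Return (gain, theta, unit vector of the other N-1 dimensions)."""
    v, m, s = householder_vector(r)
    z = reflect(v, x)
    gain = math.sqrt(sum(c * c for c in z))
    theta = math.acos(max(-1.0, min(1.0, -s * z[m] / gain)))
    rest = [c for i, c in enumerate(z) if i != m]
    rn = math.sqrt(sum(c * c for c in rest))      # equals gain*sin(theta)
    u = [c / rn for c in rest] if rn > 0 else [0.0] * len(rest)
    return gain, theta, u


def decode(gain, theta, u, r):
    """Rebuild the reflected vector, then reflect back (self-inverse)."""
    v, m, s = householder_vector(r)
    z = [gain * math.sin(theta) * c for c in u]
    z.insert(m, -s * gain * math.cos(theta))
    return reflect(v, z)
```

Since the reflection is orthogonal, θ is exactly the angle between the input and the predictor, and the gain still measures the energy of the input itself rather than of a residual.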

Activity Masking
- Noise is more visible in low contrast areas
- Sensitivity modeled as a power of the gain g (exponent α)

Activity Masking
- Better resolution in low-contrast (gain) areas
- Compand gain with exponent β
[Figure: two panels, β = 1 and β = 1.5]
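One way to picture companding with exponent β is a power-law gain quantizer. The mapping below (reconstruction level = q * index^β) is an assumption chosen for illustration; the function names and the step-size parameter `q` are hypothetical.

```python
def quantize_gain(g, q, beta):
    """Map a gain to an integer index in the companded domain."""
    return int(round((g / q) ** (1.0 / beta)))


def dequantize_gain(index, q, beta):
    """Reconstruct the gain from its companded index."""
    return q * index ** beta


def level_spacing(q, beta, count):
    """Distances between consecutive reconstruction levels."""
    levels = [dequantize_gain(i, q, beta) for i in range(count)]
    return [b - a for a, b in zip(levels, levels[1:])]
```

With β = 1 the levels are uniformly spaced. With β = 1.5 the spacing grows with the gain, so low-gain (low-contrast) regions get finer resolution than high-gain ones.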

PVQ Bands
- DC coded separately with scalar quantization
  - Intra uses Haar DC to get better resolution over large areas
- AC coefficients grouped into bands
  - Gain, theta, etc., signaled separately for each band
  - Layout ad hoc for now
  - Scan order in each band optimized for decreasing variance

Band Structure
[Figure: band structure diagram]

To Predict or Not to Predict
- θ > π/2 → prediction not helping
  - Could code large θ's, but doesn't seem that useful
  - Need to handle zero predictors anyway
- Current approach: code a noref flag
  - Jointly coded with small gain and theta values

H+V Intra Prediction (luma)
- Copy first row/column of neighbor to corresponding bands
  - Only if the sizes of the corresponding blocks match
- Use noref to decide whether to use that as a predictor
  - For the first band (first 15 AC coeffs), have to choose (only one noref flag)
  - Copy from neighbor with highest energy
- No explicit intra mode signaled

Chroma from Luma
- Use luma coeffs. as PVQ predictor for chroma
  - For 4:2:0 4x4 chroma blocks, TF-merge 4x4 luma blocks and take the low quadrant
- No longer building a model from neighbors
  - PVQ gains signal scaling
  - noref flag can disable prediction
  - Additional flip flag can reverse the whole predictor (coded on first non-noref band)
- No longer predicting chroma DC from luma

Quantization
- Per-coefficient quantizers
  - Interpolated up/down from 8x8 matrix
  - Compensation for LT basis magnitudes in separate step
- Built-in activity masking
  - Goal: better resolution in flat areas
  - Low contrast → low energy (gain)
  - Compand gain, choose resolution for θ and K based on quantized gain

Entropy Coding (draft-terriberry-netvc-codingtools)
- Non-binary arithmetic coding
- Theory: many overheads are per-symbol
  - Reducing the number of symbols improves throughput/cycle
  - Much simpler than encoding multiple symbols in parallel
  - Decoder search in non-binary alphabet can still be parallelized (with SIMD or in hardware)

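Entropy Coding
- Multiply-free partition function
  - Modified from Stuiver & Moffat 1998 design
  - Partition of range R: c → c + min(c, R - total)
  - Requires total ≤ R < 2*total (shift up if not)
- Example: f0 = 8, f1 = 2, f2 = 1, f3 = 1, total = 12 (scale by 2 to get 24 ≤ 32 < 48), R = 32
  - Partition boundaries: 0, 24, 28, 30, 32
[Figure: the range R = 32 split at these boundaries, with parts labeled over-estimated and under-estimated]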
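The slide's numbers can be reproduced with a few lines. This sketch only computes the partition boundaries; it is not a range coder, and the function names are hypothetical.

```python
def scaled_cdf(freqs, R):
    """Cumulative frequencies, shifted up until total <= R < 2*total.

    Assumes sum(freqs) <= R to begin with.
    """
    total = sum(freqs)
    shift = 0
    while (total << shift) * 2 <= R:     # still have R >= 2*total: shift up
        shift += 1
    cum = [0]
    for f in freqs:
        cum.append(cum[-1] + (f << shift))
    return cum, total << shift


def partition(cum, total, R):
    """Multiply-free mapping of cumulative frequencies onto the range R."""
    assert total <= R < 2 * total
    d = R - total
    return [c + min(c, d) for c in cum]


cum, total = scaled_cdf([8, 2, 1, 1], 32)
bounds = partition(cum, total, 32)       # [0, 24, 28, 30, 32]
```

An exact partition would give the first symbol 32 * 8/12 ≈ 21.3 of the range; here it gets 24, while the later symbols get slightly less than their exact share. That is the price of avoiding the multiply and divide.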

Entropy Coding: Properties
- 15-bit probabilities
- Alphabet sizes up to 16
  - Want to keep as close to this as possible for maximum efficiency
- Several different adaptation strategies
  - Traditional frequency counts
  - Laplace encoder parameterized by expected value
  - Generic Coder combines frequency counts at several scales with exponential tails

Entropy Coder: Raw Bits
- Appended to the end of the frame
  - Coded from back to front
- Much simpler than CABAC bypass mode
- Same technique used in Opus
- Encoder writes to separate buffer, merged during Dirac-style carry propagation step

TODO
- PVQ needs a fixed-point implementation
- No B-frames at all (in progress)
- Need 64x64 blocks for OBMC
- Need 64x64 transforms
- Lots of potential improvements in motion search
- Lots of ways to exploit PVQ we haven't tried
- Generic coder should be replaced by more targeted probability modeling
- Deringing filter (have prototype, too slow)
- Etc.