MPEG & H.26L OVERVIEW Nuno Vasconcelos (with thanks to Truong Nguyen)
Video Compression
- Codec Characteristics
- Temporal & Spatial Compression
- Codec Settings
- Compression Standards
- MPEG-7
Codec Characteristics
- Lossy v. Lossless
  - Can't use lossy compression for data or programs
- Spatial v. Temporal Compression
  - Intraframe: Discrete Cosine Transform (DCT)
  - Interframe: Keyframe + Difference frame
- Symmetric v. Asymmetric
  - Usually decoding needs to be faster, except for capture
- Software v. Hardware
  - Real-time capture needs hardware
  - MPEG-2 usually needs hardware
  - Web video should never need special hardware
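The "keyframe + difference frame" idea behind interframe compression can be sketched as follows. This is a minimal illustration, assuming each frame is a flat list of pixel values and that differences are stored losslessly; real codecs use motion-compensated, quantized differences. The function names are hypothetical.

```python
def encode_differences(frames):
    """Store the first frame as a keyframe and every later frame as a
    per-pixel difference from the frame before it."""
    keyframe = list(frames[0])
    differences = [
        [cur - prev for cur, prev in zip(current, previous)]
        for previous, current in zip(frames, frames[1:])
    ]
    return keyframe, differences


def decode_differences(keyframe, differences):
    """Rebuild the frames by adding each difference to the previous frame."""
    frames = [list(keyframe)]
    for difference in differences:
        frames.append([p + d for p, d in zip(frames[-1], difference)])
    return frames


frames = [[10, 10, 10, 10], [10, 12, 10, 10], [10, 12, 15, 10]]
keyframe, differences = encode_differences(frames)
print(differences)  # mostly zeros when little changes between frames
```

When consecutive frames are similar, the difference frames are mostly zeros, which is what makes them cheap to code.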
Codec Characteristics (contd.)
- Hardware Requirements
  - Fast Disk Access for storage/retrieval of compressed file
  - Powerful Processor for real-time compression/decompression
- Change makes Compression Difficult
  - Fast motion
  - Dramatic lighting changes
  - Low light level introduces noise
- Artifacts
  - Blockiness - low DCT values only
  - Blurriness - loss of high frequency DCT coefficients
Low Bit Rate Video Coding
- Why?: Increasing demand for video conferencing and telephony applications; limited bandwidth in PSTN and wireless networks
- Video coding algorithms:
  - Waveform-based coding: MC+DCT/wavelets, 3D subband, etc.
  - Object- and model-based coding: shape coding, wireframes, etc.
- Video coding standards:
  - ITU-T: H.261 (1990), H.262 (1994), H.263 (1995), H.263+ (1998)
  - ISO/IEC: MPEG1 (1992), MPEG2 (1994), MPEG4 (1999)
- H.263 version 2 (H.263+): Higher coding efficiency, more flexibility, scalability support, error resilience support
The H.263 Standard
[Figure: encoder block diagram. Forward path: Video in, DCT, Q, VLC, MUX, Bit Stream. Feedback loop: IQ, IDCT, PRED, ME, with an Inter/Intra switch.]
Video Compression Standards
[Figure: chart of standards by application area.]
- Communications (real-time encode & decode, low delay): H.261, H.263 (low bit-rate)
- Information/Entertainment (real-time decode, delay not critical): MPEG-1, MPEG-2
- MPEG-4 is also shown on the chart
Video Stream Data Hierarchy
Types of Pictures (1)
- I (Intra) Picture
- P (Predicted) Picture
- B (Bidirectional) Picture
Types of Pictures (2)
[Figure: forward prediction and bidirectional prediction arrows drawn over the sequence below.]
Display order:      1 2 3 4 5 6 7 8 9
Picture type:       I B B B P B B B I
Transmission order: 1 5 2 3 4 9 6 7 8
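The transmission order on this slide follows from one rule: a B picture cannot be decoded until the later I or P picture it references has arrived, so each anchor picture is sent ahead of the B pictures that precede it in display order. A minimal sketch of that reordering, with a hypothetical function name:

```python
def transmission_order(display_types):
    """Return 1-based display indices in the order the pictures are sent.

    B pictures are held back until the next I or P picture (their
    backward reference) has been emitted.
    """
    order = []
    pending_b = []
    for index, picture_type in enumerate(display_types, start=1):
        if picture_type == "B":
            pending_b.append(index)
        else:  # I or P picture: send it, then release the waiting B pictures
            order.append(index)
            order.extend(pending_b)
            pending_b = []
    order.extend(pending_b)  # trailing B pictures with no later anchor
    return order


print(transmission_order("IBBBPBBBI"))  # [1, 5, 2, 3, 4, 9, 6, 7, 8]
```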
Spatial Compression Process Flow
Chrominance Subsampling
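The slide does not say which subsampling scheme is meant. As one common case, the sketch below assumes 4:2:0-style subsampling done by averaging each 2x2 block of a chroma plane, which halves the chroma resolution in both directions. The function name is hypothetical and the plane is assumed to have even width and height.

```python
def subsample_2x2(chroma):
    """Average each 2x2 block of a chroma plane given as a list of rows.

    Assumes even width and height; uses integer division for the average.
    """
    height, width = len(chroma), len(chroma[0])
    subsampled = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            total = (chroma[y][x] + chroma[y][x + 1]
                     + chroma[y + 1][x] + chroma[y + 1][x + 1])
            row.append(total // 4)
        subsampled.append(row)
    return subsampled


plane = [[10, 20, 30, 40],
         [10, 20, 30, 40]]
print(subsample_2x2(plane))  # [[15, 35]]
```

Each chroma plane shrinks to a quarter of its samples, while the luminance plane is left at full resolution.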
Perceptual Sensitivity
Discrete Cosine Transform
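As an illustration of the transform named on this slide, here is a direct (slow) orthonormal 2-D DCT-II of an 8x8 block in pure Python. It is a sketch for understanding only; practical codecs use fast factorizations, and the scaling convention chosen here is an assumption.

```python
import math

N = 8  # block size used by the DCT-based standards discussed here


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of rows."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency
        for v in range(N):      # horizontal frequency
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


flat_block = [[100] * N for _ in range(N)]
flat_coeffs = dct_2d(flat_block)
print(round(flat_coeffs[0][0], 6))  # 800.0: all energy lands in the DC term
```

A flat block produces only a DC coefficient, which is why smooth image regions compress so well after the transform.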
Quantization
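Quantization is the lossy step: each DCT coefficient is divided by a step size and rounded, so small coefficients become zero. The sketch below assumes a single uniform step size for every coefficient (the value 16 is arbitrary); the standards also allow per-coefficient step sizes, which are not shown here.

```python
def quantize(coeffs, step):
    """Divide each coefficient by `step` and round half away from zero."""
    return [[int(c / step + (0.5 if c >= 0 else -0.5)) for c in row]
            for row in coeffs]


def dequantize(levels, step):
    """Reconstruct approximate coefficients from quantized levels."""
    return [[level * step for level in row] for row in levels]


coeffs = [[800.0, 25.0, -7.0, 3.0]]
levels = quantize(coeffs, 16)
print(levels)                  # [[50, 2, 0, 0]]
print(dequantize(levels, 16))  # [[800, 32, 0, 0]]
```

The small coefficients are lost entirely and the others are only approximated, which is where artifacts such as blurriness come from.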
Scan Types
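The scan converts the 2-D block of quantized coefficients into a 1-D sequence so that the zeros cluster at the end. The sketch below generates the classic zigzag order; other scan types that this slide may have shown, such as the alternate scan in MPEG-2, are not covered. Function names are hypothetical.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in zigzag order.

    Positions are grouped by anti-diagonal (row + col); the direction of
    travel alternates from one diagonal to the next.
    """
    positions = [(r, c) for r in range(n) for c in range(n)]

    def key(position):
        r, c = position
        diagonal = r + c
        return (diagonal, r if diagonal % 2 else -r)

    return sorted(positions, key=key)


def zigzag_scan(block):
    """Flatten a square block into a list following the zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


print(zigzag_order(4)[:6])
# [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
```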
Block Matching Algorithm
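A minimal sketch of block matching, assuming an exhaustive (full) search with the sum of absolute differences (SAD) as the matching cost. The slide does not state which search strategy or cost it illustrated, so both are assumptions, as are the function names, block size and search range.

```python
def sad(current, reference, bx, by, dx, dy, size):
    """Sum of absolute differences between a block of the current frame at
    (bx, by) and the reference block displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(current[by + y][bx + x]
                         - reference[by + dy + y][bx + dx + x])
    return total


def best_motion_vector(current, reference, bx, by, size=4, search=2):
    """Try every displacement within +/- search pixels and return
    (dx, dy, cost) for the lowest-cost match inside the reference frame."""
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + size <= height
                      and 0 <= bx + dx and bx + dx + size <= width)
            if not inside:
                continue
            cost = sad(current, reference, bx, by, dx, dy, size)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


def frame_with_square(left, top, frame_size=8):
    """Black frame containing one bright 2x2 square (test data only)."""
    frame = [[0] * frame_size for _ in range(frame_size)]
    for y in range(top, top + 2):
        for x in range(left, left + 2):
            frame[y][x] = 200
    return frame


reference = frame_with_square(left=3, top=1)
current = frame_with_square(left=2, top=2)
print(best_motion_vector(current, reference, bx=2, by=2))  # (1, -1, 0)
```

The vector (1, -1) says the best match for the current block lies one pixel to the right and one pixel up in the reference frame.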
Example Motion Vector Field
Backward Prediction
GOP (Group Of Pictures) - Order of Arrival
- I frames (intra-frame): spatially compressed only
- P frames (predicted frames): predicted from I frames or other P frames
- B frames (bidirectional frames): interpolated between I and P frames (must be buffered)
GOP Quality Tradeoff
MPEG Streams
Scalability
- Data partitioning: Only lower order DCT coefficients are transmitted
- SNR: Standard quality picture + low-noise helper signal
- Spatial: Standard size + additional High Definition (HD) layer
- Temporal: e.g. MPEG-2, where B pictures are in a separate layer
Level & Profile MP@ML 15Mb/sec
Problem video compression scenarios for MPEG-2
- Quick changes in luminosity, e.g. leaves, water, flashbulbs
- Circular motion, because motion prediction assumes objects move in a straight line
- Alternating wavy lines, a variation of circular motion
- Sharp, high-contrast edges, as for fonts or graphics
- Multiple motions, where a single image splits into two or more, which confuses motion prediction
Codec Settings
- Quality
  - Don't use less than 50
  - Law of Diminishing Returns for higher values
- Frames/Sec
  - Use submultiple of source frame rate if possible
- Keyframe every N frames
  - Depends on the codec; Sorenson default is 1 every 10 secs
  - Transitions and cuts require keyframes
Codec Settings (contd.)
- Automatic Key Frames
  - Can specify a difference threshold
- Limit data rate
  - Limits the size for variable rate codecs
- Data Rate Tracking
  - Combination of fixed and variable rates
  - 0% => fixed only
  - 100% => data rate depends entirely on content
- Temporal Scalability (very simple)
  - 2:1 drop every second frame
  - 3:2:1 first drop every third frame, then every second
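The simple temporal scalability schemes on this slide amount to dropping frames at regular intervals. A minimal sketch, assuming frames are numbered from 1 and that "every n-th frame" means frames n, 2n, 3n, and so on (the function name is hypothetical):

```python
def drop_every(frames, n):
    """Drop every n-th frame (the n-th, 2n-th, ...) and keep the rest."""
    return [frame for position, frame in enumerate(frames, start=1)
            if position % n != 0]


frames = list(range(1, 13))  # twelve frames, numbered 1 to 12

# 2:1 - drop every second frame
print(drop_every(frames, 2))                  # [1, 3, 5, 7, 9, 11]

# 3:2:1 - first drop every third frame, then every second of what remains
print(drop_every(drop_every(frames, 3), 2))   # [1, 4, 7, 10]
```

The 3:2:1 scheme gives three possible rates from one source: all frames, two thirds of them, or one third of them.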
Compression Standards
- Motion JPEG (MJPEG)
  - Sequence of JPEGs
  - Often used for video capture
  - MJPEG A: 3 Mbytes/sec, 7:1 compression
- H.261
  - Videoconferencing over ISDN - 64 to 1920 kbits/sec
  - Part of H.32X series of standards
- MPEG-1
  - Structure: block, macroblock, slice, picture, GOP, sequence
  - Uses prediction or motion estimation - I, P & B pictures
  - No interlacing
Compression Standards (contd.)
- MPEG-2
  - Wide range of bit rates, resolutions and frame sizes
  - Interlacing
  - Scalability: receiver can decode a subset of the full bitstream
- MPEG-4
  - Designed for low bitrate multimedia applications
  - Video Object Planes (VOPs): similar to a sprite or a Photoshop layer
  - Segmentation of picture into irregular shapes
  - Texture Coding: Discrete Wavelet Transform (DWT) used instead of DCT
Compression Standards (contd.)
- H.263
  - Incorporates MPEG 1 & 2 technology into videoconferencing
- MPEG-7
  - Multimedia content description interface: indexing & query
  - Features extracted from keyframes and stored as metadata
MPEG-7: Multimedia Content Description Interface
- MPEG-1 & MPEG-2 are for compression
- MPEG-4 is for content objects such as sprites
- MPEG-7: How to identify and manage audio-visual content
  - Can be used independently of other MPEG standards
  - Similar to XML
  - Specifies a standard set of descriptors which can be used to describe various types of multimedia information
  - Offers different levels of granularity for feature descriptions
    - Low
      - Visual: shape, size, texture, color, motion descriptors
      - Audio: key, tempo, timbre/spectral composition
    - High
      - "Roy Keane scores winning goal in Ireland v. Cameroon match"
  - Features may be extracted manually or automatically
Additional multimedia descriptive information
- Format
  - The coding scheme used (e.g. JPEG, MPEG-2, MP3), or the overall data size
- Conditions for accessing
  - This could include intellectual property rights information, and price
- Classification
  - This could include parental rating, and content classification into a number of pre-defined categories
- Links to other relevant material
  - The information may help the user in speeding up the search
- In the case of recorded non-fiction content, it is very important to know the occasion of the recording
MPEG-7 Architecture
Possible Application Areas
- Architecture, real estate, and interior design (e.g., searching for ideas)
- Broadcast media selection (e.g., radio channel, TV channel)
- Cultural services (history museums, art galleries, etc.)
- Digital libraries (e.g., image catalogue, musical dictionary, bio-medical imaging catalogues, film, video and radio archives)
- E-Commerce (e.g., personalised advertising, on-line catalogues, directories of e-shops)
- Education (e.g., repositories of multimedia courses, multimedia search for support material)
- Home Entertainment (e.g., systems for the management of personal multimedia collections, including manipulation of content, e.g. home video editing, searching a game, karaoke)
- Investigation services (e.g., human characteristics recognition, forensics)
- Journalism (e.g., searching speeches of a certain politician using his name, his voice or his face)
- Multimedia directory services (e.g., yellow pages, tourist information, geographical information systems)
- Multimedia editing (e.g., personalised electronic news service, media authoring)
- Remote sensing (e.g., cartography, ecology, natural resources management)
- Shopping (e.g., searching for clothes that you like)
- Social (e.g., dating services)
- Surveillance (e.g., traffic control, surface transportation, non-destructive testing in hostile environments)
Taken From: http://mpeg.telecomitalialab.com/standards/mpeg-7/mpeg-7.htm