CM0340/CMT502 Solutions. Do not turn this page over until instructed to do so by the Senior Invigilator.

CARDIFF UNIVERSITY EXAMINATION PAPER

Academic Year: 2012/2013
Examination Period: Spring
Examination Paper Number: CM0340/CMT502 Solutions
Examination Paper Title: Multimedia
Duration: 2 hours

Do not turn this page over until instructed to do so by the Senior Invigilator.

Structure of Examination Paper:
There are 14 pages.
There are 4 questions in total.
There are no appendices.
The maximum mark for the examination paper is 81 and the mark obtainable for a question or part of a question is shown in brackets alongside the question.

Students to be provided with:
The following items of stationery are to be provided: ONE answer book.

Instructions to Students:
Answer 3 questions.
The use of calculators is permitted in this examination.
The use of translation dictionaries between English or Welsh and a foreign language bearing an appropriate departmental stamp is permitted in this examination.

PLEASE TURN OVER

Q1. (a) How does the human eye sense colour? What characteristics of the human visual system can be exploited for the compression of colour images and video?

The eye is basically sensitive to colour and intensity.
The retina of the eye has neurons on which light is focused. Each neuron is either a rod or a cone. [1]
Rods are not sensitive to colour; they sense intensity (monochrome). [1]
Cones come in 3 types: the first responds most to light of long wavelengths (red/yellowish colours); the second responds most to light of medium wavelength, peaking at a green colour; the third responds most to short-wavelength light, of a bluish colour. [1]
Each type responds differently to the various frequencies of light: non-linearly, and not equally for R, G and B. [1]
Compression of images and video uses the fact that intensity (monochrome) can be modelled at high resolution, while colour can be modelled at lower resolution and non-linearly with respect to colour sensitivity. [1]
5 Marks - Bookwork

(b) Different colour models are often used in different applications. What is the CMYK colour model? Give an application in which this colour model is mostly used and explain the reason.

The CMYK colour model uses Cyan, Magenta, Yellow and Black as primaries (components). [1]
The CMYK colour model is mostly used in printing because the colour pigments on the paper absorb certain colours, so a subtractive model is suitable; black is used to produce a darker black than simply mixing CMY. [2]
3 Marks - Bookwork

Given a colour represented in RGB colour space as R = 0.2, G = 0.6, B = 0.3, what is its representation in the CMYK colour model?

First convert to CMY as

\[
\begin{pmatrix} \bar{C} \\ \bar{M} \\ \bar{Y} \end{pmatrix}
= \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} - \begin{pmatrix} R \\ G \\ B \end{pmatrix}
= \begin{pmatrix} 0.8 \\ 0.4 \\ 0.7 \end{pmatrix}
\]

Then

\[
K = \min(\bar{C}, \bar{M}, \bar{Y}) = 0.4, \quad
C = \bar{C} - K = 0.4, \quad
M = \bar{M} - K = 0, \quad
Y = \bar{Y} - K = 0.3. \quad [2]
\]

2 Marks - Unseen problem
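The RGB to CMYK conversion above can be checked with a few lines of Python. This is only a sketch of the simple rule used in the answer (CMY = 1 - RGB, then K = min and subtract K); the function name `rgb_to_cmyk` is an illustrative choice, and real printing workflows use colour profiles rather than this formula.

```python
def rgb_to_cmyk(r, g, b):
    """Convert RGB values in [0, 1] to (C, M, Y, K) with the simple rule
    CMY = 1 - RGB, K = min(C, M, Y), then subtract K from each of C, M, Y."""
    c_bar, m_bar, y_bar = 1.0 - r, 1.0 - g, 1.0 - b
    k = min(c_bar, m_bar, y_bar)
    return (c_bar - k, m_bar - k, y_bar - k, k)


# The colour from the question: R = 0.2, G = 0.6, B = 0.3
c, m, y, k = rgb_to_cmyk(0.2, 0.6, 0.3)
# Rounding hides floating-point noise such as 0.29999999999999993
print(round(c, 3), round(m, 3), round(y, 3), round(k, 3))  # 0.4 0.0 0.3 0.4
```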

(c) What is a colour look-up table and how is it used to represent colour?

Colour Look-Up Tables (LUTs):
Store only the index into the colour LUT for each pixel. [1]
Look up the table to find the colour (RGB) for the index. [1]
[3]
5 Marks - Bookwork

Give an advantage and a disadvantage of this representation with respect to true (24-bit) colour.

Advantage: Uses significantly less memory than full 24-bit colour. [1]
Disadvantage: Restricted number of colours available. [1]
2 Marks - Bookwork

How do you convert from 24-bit colour to an 8-bit colour look-up table representation?

The LUT needs to be built when converting 24-bit colour images to 8-bit, by grouping similar colours (each group is assigned a colour entry). [1]
1 Mark - Bookwork
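A minimal sketch of the look-up table idea follows. The four-entry palette, the tiny 2x2 image and the function names are all hypothetical, chosen only for illustration; mapping a 24-bit pixel to its nearest palette entry by squared RGB distance is one simple way to "group similar colours" and is not claimed to be the method from the course notes.

```python
# Hypothetical 4-entry palette (a real 8-bit LUT would hold up to 256 entries)
PALETTE = [(0, 0, 0), (255, 0, 0), (0, 255, 0), (0, 0, 255)]


def expand_indexed(indexed, palette):
    """Turn an image of palette indices back into RGB triples."""
    return [[palette[i] for i in row] for row in indexed]


def nearest_index(rgb, palette):
    """Index of the palette entry closest to rgb (squared Euclidean distance)."""
    return min(
        range(len(palette)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(rgb, palette[i])),
    )


indexed_image = [[0, 1], [2, 3]]          # one small index per pixel
rgb_image = expand_indexed(indexed_image, PALETTE)

# Memory for a 640x480 image: 1 byte per pixel plus a 256-entry RGB table,
# compared with 3 bytes per pixel for 24-bit colour.
lut_bytes = 640 * 480 * 1 + 256 * 3       # 307968
true_colour_bytes = 640 * 480 * 3         # 921600
```

Here `nearest_index((250, 10, 5), PALETTE)` picks the red entry (index 1), which shows the disadvantage as well: every colour in the image collapses onto one of the few palette entries.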

(d) What is chroma subsampling? Why is chroma subsampling meaningful? What is the benefit of doing chroma subsampling?

Chroma subsampling is a method that stores colour information at lower resolution than intensity information. [1]
Chroma subsampling is meaningful because the human visual system is less sensitive to variations in colour than in brightness. [1]
Chroma subsampling can reduce the bandwidth for colour detail with almost no perceivable visual difference. [1]
3 Marks - Bookwork

For the following array of colour values, give chroma subsampling results with 4:2:2, 4:1:1 and 4:2:0 schemes. Note: Listing the formulae to obtain the entries without calculating the final numbers is acceptable.

90  100  96  42
80   18  82  78
44   62  52  38
28   23  48  22

Chroma subsampling result for 4:2:2 scheme: [2]

90  96
80  82
44  52
28  48

Chroma subsampling result for 4:1:1 scheme: [2]

90
80
44
28

Chroma subsampling result for 4:2:0 scheme: [2]

(90 + 100 + 80 + 18)/4 = 72        (96 + 42 + 82 + 78)/4 = 74.5 ≈ 75
(44 + 62 + 28 + 23)/4 = 39.25 ≈ 39   (52 + 38 + 48 + 22)/4 = 40

6 Marks - Unseen problem

Question 1 Total Marks 27
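The three subsampling schemes in the worked answer can be reproduced with a short Python sketch. It follows the convention used in the answer (4:2:2 keeps every second sample in a row, 4:1:1 keeps every fourth, 4:2:0 averages each 2x2 block); the function names are illustrative, and other conventions for where samples are sited exist.

```python
def subsample_422(block):
    """Keep every second chroma sample horizontally."""
    return [row[0::2] for row in block]


def subsample_411(block):
    """Keep every fourth chroma sample horizontally."""
    return [row[0::4] for row in block]


def subsample_420(block):
    """Average each non-overlapping 2x2 group of chroma samples."""
    out = []
    for r in range(0, len(block), 2):
        out_row = []
        for c in range(0, len(block[r]), 2):
            total = (block[r][c] + block[r][c + 1]
                     + block[r + 1][c] + block[r + 1][c + 1])
            out_row.append(total / 4)
        out.append(out_row)
    return out


chroma = [
    [90, 100, 96, 42],
    [80, 18, 82, 78],
    [44, 62, 52, 38],
    [28, 23, 48, 22],
]

print(subsample_422(chroma))  # [[90, 96], [80, 82], [44, 52], [28, 48]]
print(subsample_411(chroma))  # [[90], [80], [44], [28]]
print(subsample_420(chroma))  # [[72.0, 74.5], [39.25, 40.0]]
```

The 4:2:0 averages are left unrounded on purpose: Python's built-in `round` uses round-half-to-even, so `round(74.5)` would give 74 rather than the 75 shown in the answer.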

Q2. (a) GIF and JPEG are two commonly used image representations. Do they usually use lossless or lossy compression? State the major compression algorithm (if lossless) or the lossy steps of the algorithm (if lossy) for each representation.

Lossless or lossy:
GIF: Lossless. [1]
JPEG: Lossy. [1]

Key algorithms:
GIF: Key algorithm is LZW (lossless). [1]
JPEG: Lossy steps involve quantisation and chroma subsampling. [1]
4 Marks - Bookwork

(b) Briefly describe the four basic types of data redundancy that data compression algorithms can apply to audio, image and video signals.

4 types of redundancy:
Temporal: in 1D data, 1D signals (audio), 3D temporal frames in video. [2]
Spatial: correlation between neighbouring pixels or data items. [2]
Spectral: correlation between colour or luminance components. This uses the frequency domain to exploit relationships between frequency of change in data. [2]
Psycho-visual, psycho-acoustic: exploit perceptual properties of the human visual system or aural system to compress data. [2]
8 Marks - Bookwork

(c) Given the following string as input, /TAN/HAN/HAN/AN/, with the initial dictionary below, encode the sequence with the LZW algorithm, showing the intermediate steps.

Index  Entry
1      /
2      H
3      A
4      N
5      T

RECAP: (Not explicitly required for solution) The LZW Compression Algorithm:

    w = NIL;
    while ( read a character k ) {
        if wk exists in the dictionary
            w = wk;
        else {
            add wk to the dictionary;
            output the code for w;
            w = k;
        }
    }

The steps to encode the above string are given as follows:

wk is: /, EXISTS, w = wk: /
wk is: /T, NEW, add to table, w is k: T. Output is: 1 (/). New Table Entry, 6 : /T
wk is: TA, NEW, add to table, w is k: A. Output is: 5 (T). New Table Entry, 7 : TA
wk is: AN, NEW, add to table, w is k: N. Output is: 3 (A). New Table Entry, 8 : AN
wk is: N/, NEW, add to table, w is k: /. Output is: 4 (N). New Table Entry, 9 : N/
wk is: /H, NEW, add to table, w is k: H. Output is: 1 (/). New Table Entry, 10 : /H
wk is: HA, NEW, add to table, w is k: A. Output is: 2 (H). New Table Entry, 11 : HA
wk is: AN, EXISTS, w = wk: AN
wk is: AN/, NEW, add to table, w is k: /. Output is: 8 (AN). New Table Entry, 12 : AN/
wk is: /H, EXISTS, w = wk: /H
wk is: /HA, NEW, add to table, w is k: A. Output is: 10 (/H). New Table Entry, 13 : /HA
wk is: AN, EXISTS, w = wk: AN

wk is: AN/, EXISTS, w = wk: AN/
wk is: AN/A, NEW, add to table, w is k: A. Output is: 12 (AN/). New Table Entry, 14 : AN/A
wk is: AN, EXISTS, w = wk: AN
wk is: AN/, EXISTS, w = wk: AN/
Output final token, which is 12.

To summarise, the output table (new elements) is:

6  : /T
7  : TA
8  : AN
9  : N/
10 : /H
11 : HA
12 : AN/
13 : /HA
14 : AN/A

So the output will be 1 5 3 4 1 2 8 10 12 12

10 Marks - Unseen problem applying algorithms covered in lectures. 3 marks for keeping w, 2 marks for appropriate allocation of index, 3 marks for symbol table and 3 marks for output.

(d) Briefly describe the LZW decoding process, and illustrate your answer with the above string sequence.

RECAP: (Not explicitly required for solution) The LZW Decompression Algorithm:

    read a character k;
    output k;
    w = k;
    while ( read a character k ) /* k could be a character or a code. */
    {
        entry = dictionary entry for k;
        output entry;
        add w + entry[0] to dictionary;
        w = entry;
    }

Decoding: We have the sequence 1 5 3 4 1 2 8 10 12 12 and the code book:

Index  Entry
1      /
2      H
3      A
4      N
5      T

So we get:

Input (w = k): 1. Output (k table entry): /
Input k: 5.  Output (k table entry): T.   New Table Entry, 6 : /T
Input k: 3.  Output (k table entry): A.   New Table Entry, 7 : TA
Input k: 4.  Output (k table entry): N.   New Table Entry, 8 : AN
Input k: 1.  Output (k table entry): /.   New Table Entry, 9 : N/
Input k: 2.  Output (k table entry): H.   New Table Entry, 10 : /H
Input k: 8.  Output (k table entry): AN.  New Table Entry, 11 : HA
Input k: 10. Output (k table entry): /H.  New Table Entry, 12 : AN/
Input k: 12. Output (k table entry): AN/. New Table Entry, 13 : /HA
Input k: 12. Output (k table entry): AN/. New Table Entry, 14 : AN/A

Decoded stream is (as expected): /TAN/HAN/HAN/AN/

Note that the output table (new elements) is as before:

6  : /T
7  : TA
8  : AN
9  : N/
10 : /H
11 : HA
12 : AN/
13 : /HA
14 : AN/A

5 Marks - Unseen problem

Question 2 Total Marks 27
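The encoding and decoding traces for Q2(c) and (d) can be reproduced with the following Python sketch of the two recap algorithms. The function names and the dictionary representation are illustrative choices. One addition that is not in the recap pseudocode: the decoder handles a code that is not yet in its table (the case where the encoder uses an entry immediately after creating it); that case does not arise for this input but is needed in general. The decoder also assumes a non-empty code sequence.

```python
INITIAL = {"/": 1, "H": 2, "A": 3, "N": 4, "T": 5}  # dictionary from the question


def lzw_encode(text, initial):
    """Return (list of output codes, final string->code table)."""
    table = dict(initial)
    next_code = max(table.values()) + 1
    w = ""
    codes = []
    for k in text:
        wk = w + k
        if wk in table:
            w = wk                      # keep extending the current match
        else:
            codes.append(table[w])      # output the code for w
            table[wk] = next_code       # add wk to the dictionary
            next_code += 1
            w = k
    if w:
        codes.append(table[w])          # output the final token
    return codes, table


def lzw_decode(codes, initial):
    """Rebuild the text from a non-empty code list and the initial dictionary."""
    table = {code: s for s, code in initial.items()}
    next_code = max(table) + 1
    w = table[codes[0]]
    out = [w]
    for k in codes[1:]:
        if k in table:
            entry = table[k]
        elif k == next_code:
            entry = w + w[0]            # assumption: standard special case, not in the recap
        else:
            raise ValueError(f"invalid LZW code {k}")
        out.append(entry)
        table[next_code] = w + entry[0]
        next_code += 1
        w = entry
    return "".join(out)


codes, table = lzw_encode("/TAN/HAN/HAN/AN/", INITIAL)
print(codes)                       # [1, 5, 3, 4, 1, 2, 8, 10, 12, 12]
print(lzw_decode(codes, INITIAL))  # /TAN/HAN/HAN/AN/
```

Note that the decoder never receives the encoder's table: it rebuilds entries 6 to 14 one step behind the encoder, which is why only the initial dictionary has to be shared.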

Q3. (a) Briefly outline, with the aid of suitable diagrams, the JPEG/MPEG I-Frame compression pipeline and list the constituent compression algorithms employed at each stage in the pipeline.

[Diagrams: JPEG pipeline; MPEG pipeline]

The major steps in JPEG/MPEG coding involve:
Colour space transform and subsampling
DCT (Discrete Cosine Transformation)
Quantisation
Zig-zag scan
Differential Pulse Code Modulation (DPCM) on the DC component (in JPEG); run length encoding (RLE) on the AC components (JPEG) or on all of the zig-zag scan (MPEG)
Entropy coding: Huffman or arithmetic
[2] [7]
9 Marks - Bookwork

What are the key differences between the JPEG and MPEG I-Frame compression pipelines?

Four main differences:
JPEG uses YIQ whilst MPEG uses YUV (YCrCb) colour space. [1]
MPEG uses larger block size DCT windows, 16 or even 32, as opposed to JPEG's 8. [1]
Different quantisation: MPEG usually uses a constant quantisation value. [1]
Differential Pulse Code Modulation (DPCM) is applied only to the DC component, and only in JPEG. RLE is applied to the AC components in JPEG and to the complete zig-zag scan in MPEG. [1]
4 Marks - Applied Bookwork: Some lateral thinking to compare JPEG and MPEG, not directly compared in course notes at least.

(b) Motion JPEG (or M-JPEG) is a video format that uses JPEG picture compression for each frame of the video. Why is M-JPEG not widely used as a video compression standard?

Compressing each frame on its own does not yield the compression ratio required for general video needs. The temporal aspect of video can be exploited to get better compression. [2]
2 Marks - Bookwork

Briefly state what additional approaches are used by MPEG video compression algorithms to improve on M-JPEG.

Adopt some form of temporal compression. Use P-frames and B-frames to do differencing between frames, and also motion estimation. [2]
2 Marks - Bookwork

(c) What processes above give rise to the lossy nature of JPEG/MPEG video compression?

Lossy steps:
Colour space subsampling in the IQ or UV components. [2]
Quantisation reduces the bits needed for the DCT components. [2]
4 Marks - Bookwork

(d) Given the following portion from a block (assumed to be 4x4 pixels to simplify the problem) from an image after the Discrete Cosine Transform stage of the compression pipeline has been applied:

118  42  54  150
 42  32  30   34
100  60  43   98
 44  39  40   31

i. What is the result of the quantisation step of the MPEG video compression method assuming that a constant quantisation value of 32 is used?

The trick to remember from the notes is that we divide the matrix by the quantisation table, or in this case by a constant. So in this case divide all values by 32 and round down (integer division):

3  1  1  4
1  1  0  1
3  1  1  3
1  1  1  0

[3]

ii. What is the output of the following zig-zag step being applied to the resulting quantised block?

The trick to remember from the notes is that the zig-zag scan reads off values from the DCT block in order of increasing frequency, lowest first (better than row by row). It creates a vector rather than a matrix. So we get the following vector from the matrix above:

3 1 1 3 1 1 4 0 1 1 1 1 1 3 1 0

[3]
6 Marks - Unseen problem

Question 3 Total Marks 27
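Both steps of Q3(d) can be checked with a short Python sketch. It assumes the usual JPEG-style zig-zag order (start at the top-left, move right first, then traverse the anti-diagonals in alternating directions); the function names are illustrative. Floor division matches "round down" here because all the values are non-negative.

```python
def quantise(block, q):
    """Divide every coefficient by the constant q, rounding down."""
    return [[value // q for value in row] for row in block]


def zigzag(block):
    """Read an n x n block along its anti-diagonals in zig-zag order."""
    n = len(block)
    out = []
    for s in range(2 * n - 1):              # s = row + column of each diagonal
        cells = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:
            cells.reverse()                 # even diagonals run bottom-left to top-right
        out.extend(block[i][j] for i, j in cells)
    return out


dct_block = [
    [118, 42, 54, 150],
    [42, 32, 30, 34],
    [100, 60, 43, 98],
    [44, 39, 40, 31],
]

quantised = quantise(dct_block, 32)
print(quantised)          # [[3, 1, 1, 4], [1, 1, 0, 1], [3, 1, 1, 3], [1, 1, 1, 0]]
print(zigzag(quantised))  # [3, 1, 1, 3, 1, 1, 4, 0, 1, 1, 1, 1, 1, 3, 1, 0]
```

The two zeros of the quantised block land at positions 8 and 16 of the vector. In a realistic block, where high-frequency coefficients are mostly zero, this ordering groups the zeros into long runs at the end, which is what makes the following run length encoding effective.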

Q4. (a) In MPEG audio compression, what is

i. frequency masking?

When an audio signal consists of multiple frequencies, the sensitivity of the ear changes with the relative amplitude of the signals. If two frequencies are close and the amplitude of one is less than that of the other, then the quieter frequency may not be heard. [2]
2 Marks: Bookwork

ii. temporal masking?

After the ear hears a loud sound, consisting of multiple frequencies, it takes a further short while before it can hear a quieter sound close in frequency. [2]
2 Marks: Bookwork

Briefly describe the cause of each kind of masking in the human auditory system.

Frequency Masking:
Stereocilia in the inner ear get excited as fluid pressure waves flow over them. [1]
Stereocilia are of different length and tightness on the basilar membrane, so they resonate in sympathy to different frequencies of fluid waves (banks of stereocilia at each frequency band). [1]
Stereocilia already excited by a frequency cannot be further excited by a lower-amplitude wave of nearby frequency. [1]
3 Marks: Bookwork

Temporal Masking:
(Like frequency masking) Stereocilia in the inner ear get excited as fluid pressure waves flow over them and respond to different frequencies. [1]
Stereocilia already excited by a certain frequency will take a while to return to their rest state, as the inner ear is a closed fluid chamber and pressure waves will eventually dampen down. [1]
Similar to frequency masking, stereocilia in a dampening state may not respond to a lower-amplitude wave of nearby frequency. [1]
3 Marks: Bookwork

6 Marks: subtotal
10 Marks: Q4(a) Total

(b) Briefly describe, using a suitable diagram if necessary, the MPEG-1 audio compression algorithm, outlining how frequency masking and temporal masking are encoded.

MPEG audio compression basically works by:
Dividing the audio signal up into a set of frequency subbands (filtering). [1] [2]
Using filter banks to achieve this. [1]
Sub-bands approximate critical bands. [1]
Each band is quantised according to the audibility of quantisation noise. [1]

Frequency masking and temporal masking are encoded by:

Frequency Masking: MPEG Audio encodes this by quantising each filter bank with adaptive values derived from the energy of neighbouring bands, defined by a look-up table. [2]

Temporal Masking: Not so easy to model as frequency masking. MP3 achieves this with a 50% overlap between successive transform windows, giving window sizes of 36 or 12, and applies basic frequency masking as above. [2]

10 Marks: Bookwork

(c) In MPEG-4 Audio an alternative synthesis-based approach may be adopted to achieve compression. Briefly discuss how the following may be compressed with MPEG-4 Audio: Musical Audio Signals. Spoken Word Audio. What are advantages and disadvantages of such approaches?

Musical Audio Signals: Use the MIDI-type Structured Audio facilities in MPEG-4. Compose music from scratch using software tools, or use pitch-to-MIDI or some transcription tools. [1]
Spoken Word Audio: Use the Text-to-Speech (TTS) facilities in MPEG-4. Again, one could transcribe the audio or use some text-to-speech analysis tools. [1]
Advantages: Very low bitrate streams/compression. [1]
Disadvantages: Lose control of the true nature of the sounds, so the audio won't sound like the given speaker or the source music. [1]
4 Marks: Applied Bookwork, Text-to-Speech UNSEEN

(d) Assume that after analysis, the critical band filters of MPEG-1 Audio have output the levels of 3 consecutive critical bands as:

Band        1   2   3
Level (dB)  20  90  55

Assume that, for signals above 80 dB in band 2, the signal-to-mask ratios give a masking of 30 dB in band 1 and 40 dB in band 3. Show how temporal masking is implemented in MPEG audio compression. What is the saving in bits to transmit the masked value in each masked band?

Relies on simple thresholding above or below given values (look-up table).
In band 1, 20 dB < 30 dB, so ignore it: don't send any bits; the saving is clearly 4 bits. [1]
In band 3, 55 dB > 40 dB, so it cannot be ignored: send the difference above the masking value, 15 dB (suitably coded). 4 bits instead of 6 bits: saving of 2 bits (= 12 dB). [2]
3 Marks: Unseen problem

Question 4 Total Marks 27

END OF EXAMINATION