Comparative study of Intra Frame Coding efficiency in HEVC and VP9. Dr.K.R.Rao

Comparative study of Intra Frame Coding efficiency in HEVC and VP9

EE5359 Multimedia Processing Interim Report
Under the guidance of Dr. K.R. Rao
University of Texas at Arlington, Dept. of Electrical Engineering

Shwetha Chandrakant Kodpadi
1001051972
Shwetha.chandrakantkodpadi@mavs.uta.edu
Spring 2014

List of Acronyms and Abbreviations

ADST - Asymmetric Discrete Sine Transform
AVC - Advanced Video Coding
BD-BR - Bjøntegaard-Delta Bit-Rate
BD-PSNR - Bjøntegaard-Delta Peak Signal to Noise Ratio
CU - Coding Unit
CTU - Coding Tree Unit
DBF - Deblocking Filter
DFT - Discrete Fourier Transform
DCT - Discrete Cosine Transform
DST - Discrete Sine Transform
DPB - Decoded Picture Buffer
DC - Direct Current
HD - High Definition
HEVC - High Efficiency Video Coding
ITU-T - International Telecommunication Union (Telecommunication Standardization Sector)
JPEG - Joint Photographic Experts Group
JCT-VC - Joint Collaborative Team on Video Coding
MSE - Mean Square Error
MPEG - Moving Picture Experts Group
NGOV - Next Generation Open Video
PU - Prediction Unit
PSNR - Peak Signal to Noise Ratio
RD - Rate Distortion
SAO - Sample Adaptive Offset
SSIM - Structural Similarity Index
TM - True Motion
TU - Transform Unit
VCEG - Video Coding Experts Group

1. Objective

The objective of this project is to study, implement and compare the video coding standards HEVC and VP9 [1][3]. Intra frame coding efficiency will be analyzed using performance metrics such as computational time, PSNR, BD-BR [14] and SSIM [10], and video quality will be evaluated for high-resolution videos. The HM Test Model 13.0 [12] and the VPX encoder from The WebM Project [13] will be used for HEVC and VP9, respectively.

2. General compression dataflow

Both HEVC and VP9 are hybrid block-based codecs relying on spatial transformations [9]. The general compression dataflow of hybrid block-based encoders is illustrated in Figure 1. The input video frame is first partitioned into equally sized blocks called macroblocks. Compression and decoding operate within each macroblock. A macroblock is subpartitioned into smaller blocks for prediction. There are two basic types of prediction: intra and inter. Intra-prediction works within the current video frame and is based on the compressed and decoded data available for the block being predicted. Inter-prediction is used for motion compensation: a similar region in previously coded frames, close to the current block, is used for prediction. The aim of prediction is to reduce data redundancy and thus avoid storing excess information in the coded bitstream.

Figure 1: Hybrid block-based codec dataflow [9]

Once the prediction is formed, it is subtracted from the original data to obtain residuals, which are then compressed. The residuals are subjected to a forward block transform (an integer approximation of the Discrete Cosine Transform, DCT, in both codecs), which translates the spatial residual information into the frequency domain. Quantization is applied to the transformed matrix to discard insignificant information; the significance threshold is determined by the encoder configuration. The remaining data and the description of the steps applied are then entropy coded to produce the compressed bitstream.
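The residual, transform and quantization steps described above can be illustrated with a minimal Python sketch. It assumes a floating-point orthonormal DCT-II as a stand-in for the integer transforms that real encoders use, and a single uniform quantization step (the `step` parameter). All function names are hypothetical and entropy coding is left out. The `reconstruct_block` function simply inverts the steps to rebuild the pixel values.

```python
import math


def dct_1d(v):
    """Orthonormal DCT-II of a 1-D sequence (floating point, illustrative only)."""
    n = len(v)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                               for i in range(n)))
    return out


def idct_1d(c):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        total = 0.0
        for k in range(n):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        out.append(total)
    return out


def transform_2d(block, transform_1d):
    """Apply a separable 1-D transform to the rows, then to the columns."""
    rows = [transform_1d(r) for r in block]
    cols = [transform_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def encode_block(original, prediction, step):
    """Residual -> forward transform -> uniform quantization."""
    residual = [[o - p for o, p in zip(ro, rp)]
                for ro, rp in zip(original, prediction)]
    coeffs = transform_2d(residual, dct_1d)
    return [[int(round(c / step)) for c in row] for row in coeffs]


def reconstruct_block(levels, prediction, step):
    """Dequantization -> inverse transform -> add the prediction back."""
    coeffs = [[lv * step for lv in row] for row in levels]
    residual = transform_2d(coeffs, idct_1d)
    return [[int(round(p + r)) for p, r in zip(rp, rr)]
            for rp, rr in zip(prediction, residual)]


# Toy 4x4 block: the prediction is off by a constant, so only the
# lowest-frequency (DC) coefficient survives quantization.
original = [[110] * 4 for _ in range(4)]
prediction = [[100] * 4 for _ in range(4)]
levels = encode_block(original, prediction, step=8)
reconstructed = reconstruct_block(levels, prediction, step=8)
```

A larger `step` zeroes more coefficients and lowers the bit rate at the cost of a larger reconstruction error, which is the trade-off the quantization parameter controls.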
For inter-prediction and intra-prediction purposes the compressed data must be restored in the encoder. Dequantization and the inverse transform are performed to restore the residuals. The restored residuals and the predicted values are then summed to obtain restored pixel values identical to those obtained in the decoder. These restored values are used for intra-prediction within the current video frame. An additional frame post-processing stage is optionally applied to remove blocking artifacts introduced by the transform and quantization. The final restored and post-processed video frame is stored in the Decoded Picture Buffer (DPB) for inter-prediction of subsequent frames. VP9 and HEVC both follow this general compression dataflow but differ in the details [9].

3. High Efficiency Video Coding

3.1 Introduction

High Efficiency Video Coding (HEVC) is the latest video coding format [4]. It challenges the state-of-the-art H.264/AVC [20] video coding standard currently used in industry by reducing the bit rate by 50% while retaining the same video quality. The Joint Collaborative Team on Video Coding (JCT-VC), a group of video coding experts from ITU-T Study Group 16 (VCEG) and ISO/IEC JTC 1/SC 29/WG 11 (MPEG), was formed in January 2010 to develop HEVC, and development has continued ever since. HEVC came into existence in early 2012, and on 13 April 2013 [5] the HEVC standard, also called H.265, was approved by ITU-T.

3.2 HEVC Encoder and Decoder

The HEVC standard is designed to achieve multiple goals, including coding efficiency, ease of transport system integration and data loss resilience, as well as implementability using parallel processing architectures [4]. Figures 2 and 3 show block diagrams of the HEVC encoder and decoder, respectively.

Figure 2: Encoder block diagram for HEVC [4]

Figure 3: Decoder block diagram for HEVC [17]

3.3 HEVC Coding Tools

3.3.1 Macroblock concept and Prediction block sizes

The concept of a macroblock is represented in HEVC [9] by the Coding Tree Unit (CTU). The CTU size can be 16x16, 32x32 or 64x64, whereas the AVC macroblock size is 16x16. A larger CTU size aims to improve the efficiency of block partitioning on high-resolution video sequences. Larger blocks call for quad-tree partitioning (Figure 4) of a CTU into smaller coding units (CUs). A coding unit is a bottom-level quad-tree syntax element of CTU splitting. The CU contains a prediction unit (PU) and a transform unit (TU).

Figure 4: CTU splitting example with solid lines for CU split: a) with PU splitting depicted as dotted lines; b) with TU splitting depicted as dotted lines [9]

The TU is a syntax element responsible for storing transform data. Allowed TU sizes are 32x32, 16x16, 8x8 and 4x4. The PU is a syntax element that stores prediction data such as the intra-prediction angle or the inter-prediction motion vector. A CU can contain up to four prediction units. A CU can be split into PUs as 2Nx2N, 2NxN, Nx2N, NxN, 2NxnU, 2NxnD, nLx2N and nRx2N (Figure 5), where 2N is the size of the CU being split. In intra-prediction mode only 2Nx2N PU splitting is allowed, except that an NxN PU split is also possible for a bottom-level CU that cannot be further split into sub-CUs.

Figure 5: PU splitting [9]

3.3.2 Prediction Modes

3.3.2.1 Intra Prediction Modes

There are 35 intra-prediction modes in HEVC in total: planar (mode 0), DC (mode 1) and 33 angular modes (modes 2-34 in Figure 6). DC intra-prediction is the simplest mode in HEVC: all PU pixels are set equal to the mean value of all available neighboring pixels. Planar intra-prediction is the most computationally expensive; it is a two-dimensional linear interpolation. Angular intra-prediction modes 2-34 are linear interpolations of pixel values in the corresponding directions. Vertical intra-prediction (modes 18-34) is an up-down interpolation of neighboring pixel values. Intra prediction can be performed at different block sizes, ranging from 4x4 to 64x64 (whatever size the PU has) (Figure 7).

Figure 6: Modes and directional orientations for intra picture prediction for HEVC [1]

Figure 7: Luma intra prediction modes for different PU sizes in HEVC [8]

3.3.2.2 Inter Prediction

Each PU is predicted from image data in one or two reference pictures (before or after the current picture in display order), using motion-compensated prediction.

3.3.3 Transform and Quantization

Any residual data remaining after prediction is transformed using a block transform based on the integer Discrete Cosine Transform (DCT) [22]. Only for 4x4 intra luma blocks, a transform based on the Discrete Sine Transform (DST) is used instead. One or more block transforms of size 32x32, 16x16, 8x8 and 4x4 are applied to the residual data in each CU. The transformed data is then quantized.
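The DC and planar modes of Section 3.3.2.1 can be sketched in a few lines of Python. This is a simplified illustration, assuming a square N x N block (N a power of two) whose neighboring reference samples are all available. The integer rounding is written in the form commonly given for H.265, while the boundary smoothing and reference-sample substitution steps of the standard are omitted. Function and parameter names are hypothetical.

```python
def dc_predict(top, left):
    """DC mode: fill the block with the rounded mean of the top and left neighbors."""
    n = len(top)
    shift = n.bit_length()          # log2(n) + 1 when n is a power of two
    dc = (sum(top) + sum(left) + n) >> shift
    return [[dc] * n for _ in range(n)]


def planar_predict(top, left, top_right, bottom_left):
    """Planar mode: average of a horizontal and a vertical linear interpolation."""
    n = len(top)
    shift = n.bit_length()
    pred = [[0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            # Interpolate between the left neighbor and the top-right sample.
            horiz = (n - 1 - x) * left[y] + (x + 1) * top_right
            # Interpolate between the top neighbor and the bottom-left sample.
            vert = (n - 1 - y) * top[x] + (y + 1) * bottom_left
            pred[y][x] = (horiz + vert + n) >> shift
    return pred


# Toy 4x4 example with made-up neighbor values.
top = [10, 20, 30, 40]
left = [50, 60, 70, 80]
dc_block = dc_predict(top, left)
planar_block = planar_predict(top, left, top_right=50, bottom_left=90)
```

The DC block is constant, whereas the planar block varies smoothly between the neighbors, which is why planar tends to suit gently shaded regions.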

Figure 8: CTU showing range of transform (TU) sizes [18]

3.3.4 Entropy Coding

Context adaptive binary arithmetic coding (CABAC) is used for entropy coding. This is similar to the CABAC scheme in H.264/MPEG-4 AVC [20], but it has undergone several changes to improve its throughput (especially for parallel-processing architectures) and its compression performance, and to reduce its context memory requirements.

3.3.5 Post Processing

One or two filtering stages can optionally be applied (within the inter-picture prediction loop) before the reconstructed picture is written into the decoded picture buffer. A deblocking filter (DBF) similar to the one in AVC is used; however, its decision-making and filtering processes have been simplified and made friendlier to parallel processing. The second stage, called the sample adaptive offset (SAO) filter, is a non-linear amplitude mapping. The goal of SAO is to improve the reconstruction of the signal amplitude by adding an offset based on a look-up table mapping that is controlled by the encoder. Two types of SAO operation can be selected for each CTB: the band offset and edge offset modes, where, depending on additional criteria (amplitude or local directional amplitude constellation), an offset value is added to the reconstructed sample amplitude.

4. VP9

4.1 Introduction

VP9 is an open and royalty-free video compression standard being developed by Google [2][3]. VP9 had the earlier development names Next Generation Open Video (NGOV) and VP-Next. VP9 is the successor to VP8. Development of VP9 started in Q3 2011. One of the goals of VP9 is to reduce the bit rate by 50% compared to VP8 at the same video quality [7]. VP9 also aims for better compression efficiency than High Efficiency Video Coding. VP9 extends techniques used in H.264/AVC and VP8 and is very likely to replace AVC, at least in the YouTube video service [9].

4.2 VP9 Encoder and Decoder

A large part of the advances made by VP9 over its predecessors is a natural progression from current-generation video codecs to the next. Figures 9 and 10 show block diagrams of the VP9 encoder and decoder, respectively.

Figure 9: Encoder block diagram for VP9 [19] (blocks: DCT, scan ordering, uniform quantization, entropy encoding, inverse quantization, scan reordering, inverse DCT, motion estimation, motion compensation, prediction, loop filter, previous frame buffer, golden frame buffer)

Figure 10: Decoder block diagram for VP9 [19] (blocks: entropy decoding, inverse quantization, scan reordering, IDCT, motion compensation, prediction, loop filter, previous frame buffer, golden frame buffer)

4.3 VP9 Coding Tools

4.3.1 Prediction Block Sizes

A large part of the coding efficiency improvement achieved in VP9 can be attributed to the incorporation of larger prediction block sizes [9][3]. VP9 introduces super-blocks (SBs) of size up to 64x64 and allows recursive decomposition all the way down to 4x4. Unlike in HEVC, any sub-block can be split into prediction blocks in intra mode. Furthermore, rectangular intra-prediction blocks are possible, as demonstrated in Figure 11. Each sub-block may be further split into prediction blocks and transform blocks, as shown in Figure 12.a and Figure 12.b, respectively. Intra-prediction in VP9 is still performed on square regions, so a rectangular prediction block represents two square prediction blocks with the same prediction mode. By analogy with HEVC, the prediction splittings 2Nx2N, NxN, 2NxN and Nx2N are available (Figure 12.a), where 2Nx2N is the size of the block being split. It is worth mentioning that 4x4 prediction blocks are determined as a group within the corresponding 8x8 block, whereas for other prediction sizes the prediction data is stored per prediction block. As in HEVC, a sub-block can be split into transform blocks in a quad-tree structure down to the smallest 4x4 block. The allowed sizes are 32x32, 16x16, 8x8 and 4x4 (Figure 12.b).

Figure 11: Example partitioning of a 64x64 Super-block

Figure 12: Superblock splitting example with solid lines for block split: a) with prediction splitting depicted as dotted lines; b) with transform splitting depicted as dotted lines [9]

4.3.2 Prediction Modes

4.3.2.1 Intra-prediction Modes

VP9 supports a set of 10 intra-prediction modes [9] for block sizes ranging from 4x4 up to 32x32: DC_PRED (DC prediction), TM_PRED (True Motion prediction), H_PRED (horizontal prediction), V_PRED (vertical prediction), and 6 oblique directional prediction modes: D27, D153, D135, D117, D63 and D45, corresponding approximately to angles of 27, 153, 135, 117, 63 and 45 degrees (measured counter-clockwise against the horizontal axis). The horizontal, vertical and oblique directional prediction modes involve copying (or estimating) pixel values from surrounding blocks into the current block along the angle specified by the prediction mode. Figure 13 shows the angular intra-prediction modes in VP9.

Figure 13: VP9 angular intra-prediction modes [9]
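The non-directional VP9 modes listed above are simple enough to sketch. The snippet below assumes the usual formulation of True Motion prediction inherited from VP8 (each pixel is left + above - above-left, clipped to the pixel range) and 8-bit samples. The source text does not spell out this formula, and the function names are hypothetical. The oblique modes, which work on filtered reference samples, are not shown.

```python
def clip_pixel(value, bit_depth=8):
    """Clamp a value to the valid sample range for the given bit depth."""
    return max(0, min((1 << bit_depth) - 1, value))


def v_pred(above, left):
    """V_PRED: every row copies the row of pixels above the block."""
    return [list(above) for _ in left]


def h_pred(above, left):
    """H_PRED: every column copies the column of pixels left of the block."""
    return [[l] * len(above) for l in left]


def dc_pred(above, left):
    """DC_PRED: fill the block with the rounded mean of the above and left pixels."""
    count = len(above) + len(left)
    dc = (sum(above) + sum(left) + count // 2) // count
    return [[dc] * len(above) for _ in left]


def tm_pred(above, left, above_left):
    """TM_PRED: propagate the gradient left[r] + above[c] - above_left, clipped."""
    return [[clip_pixel(l + a - above_left) for a in above] for l in left]


# Toy 4x4 example with made-up neighbor values.
above = [100, 110, 120, 130]
left = [90, 95, 100, 105]
tm_block = tm_pred(above, left, above_left=100)
dc_block = dc_pred(above, left)
```

TM_PRED reproduces a block exactly whenever the content is the sum of a horizontal and a vertical gradient, which neither DC nor a single directional mode can do.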

4.3.2.2 Inter Prediction Modes

VP9 supports a set of 4 inter prediction modes for block sizes ranging from 4x4 up to 64x64 pixels: NEARESTMV, NEARMV, ZEROMV and NEWMV [3].

4.3.3 Transform and quantization

The residuals left after subtraction of the predicted pixel values are subjected to transformation and quantization [9]. Transform blocks can be 32x32, 16x16, 8x8 or 4x4 pixels. As in most other coding standards, these transforms are an integer approximation of the DCT. For intra coded blocks, either or both of the vertical and horizontal transform passes can be a DST (discrete sine transform) instead, which reflects the specific characteristics of the residual signal of intra blocks. In addition, VP9 introduces support for a new transform type, the Asymmetric Discrete Sine Transform (ADST), which can be used in combination with specific intra-prediction modes. Intra-prediction modes that predict from the left edge can use the 1-D ADST in the horizontal direction, combined with a 1-D DCT in the vertical direction. Similarly, the residual signal resulting from intra-prediction modes that predict from the top edge can employ a vertical 1-D ADST combined with a horizontal 1-D DCT. Intra-prediction modes that predict from both edges, such as the True Motion mode and some diagonal intra-prediction modes, use the 1-D ADST in both the horizontal and vertical directions.

4.3.4 Entropy coding

VP9 uses the 8-bit arithmetic coding engine from VP8 known as the bool-coder [9]. Unlike in AVC or HEVC, the probabilities of the VP9 bool-coder do not change adaptively within a frame. VP9 makes use of forward context updates through flags in the frame header that signal modifications of the coding contexts at the start of each frame. These probabilities are stored in what is known as a frame context. The decoder maintains four of these contexts, and each frame specifies in the bitstream which one to use.

4.3.5 Post-processing

There is only one possible post-processing stage in VP9: the deblocking filter [9]. It aims to reduce blocking artifacts on superblocks, filtering vertical edges first and horizontal edges second. VP9 has 16-, 8-, 4- and 2-pixel wide filters, with half of the filter size on each side of a boundary. VP9 also incorporates a flatness detector in the loop filter that detects flat regions and varies the filter strength and size accordingly.

5. Performance comparison metrics

5.1 MSE and PSNR

MSE and PSNR [17] for an NxM pixel image are defined in equations 1 and 2, where O is the original image and R is the reconstructed image, M and N are the width and height of the image, and L is the maximum pixel value in the NxM pixel image.

$$\mathrm{MSE} = \frac{1}{MN}\sum_{i=1}^{N}\sum_{j=1}^{M}\left[O(i,j) - R(i,j)\right]^{2} \qquad (1)$$

$$\mathrm{PSNR} = 10\log_{10}\left(\frac{L^{2}}{\mathrm{MSE}}\right) \qquad (2)$$

5.2 Structural Similarity Index

The structural similarity (SSIM) [10] index is a method for measuring the similarity between two images. SSIM builds on the observation that the human visual system is highly adapted to extract structural information from visual scenes, so a structural similarity measurement should provide a good approximation to perceptual image quality. SSIM is designed to improve on methods like peak signal-to-noise ratio (PSNR) and mean squared error (MSE), which have proved to be inconsistent with human visual perception. SSIM treats image degradation as a perceived change in structural information, where structural information is the idea that pixels have strong inter-dependencies, especially when they are spatially close.

$$\mathrm{SSIM}(x,y) = \frac{(2\mu_x\mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^{2} + \mu_y^{2} + c_1)(\sigma_x^{2} + \sigma_y^{2} + c_2)}$$

where x and y correspond to the two signals that are compared for similarity, i.e. two different blocks in two separate images, \mu_x and \mu_y are their means, \sigma_x^2 and \sigma_y^2 their variances, \sigma_{xy} their covariance, and c_1 and c_2 are small constants that stabilize the division [10].

5.3 Bjøntegaard-Delta Bit-Rate Measurements

For rate-distortion (R-D) performance assessment [14], the Bjøntegaard-Delta bit-rate (BD-BR) measurement method is used to calculate the average bit-rate difference between two R-D curves at the same objective quality (e.g., for the same PSNR_YUV values). Negative BD-BR values indicate actual bit-rate savings.
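A BD-BR figure of the kind described above can be computed with a short routine. The sketch below is an illustration under stated assumptions rather than the reference procedure of [14]: it requires exactly four rate/PSNR points per codec, passes a cubic through log10(bitrate) as a function of PSNR by Lagrange interpolation, and averages the gap between the two curves over their common PSNR range (Simpson's rule is exact for cubics). The sample numbers are synthetic and are not taken from this project's measurements; all names are hypothetical.

```python
import math


def cubic_through(xs, ys, x):
    """Evaluate at x the cubic polynomial interpolating four (xs, ys) points."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def mean_over(xs, ys, lo, hi):
    """Mean value of the interpolating cubic on [lo, hi] via Simpson's rule."""
    mid = 0.5 * (lo + hi)
    integral = (hi - lo) / 6.0 * (cubic_through(xs, ys, lo)
                                  + 4.0 * cubic_through(xs, ys, mid)
                                  + cubic_through(xs, ys, hi))
    return integral / (hi - lo)


def bd_rate(anchor_rates, anchor_psnr, test_rates, test_psnr):
    """Average bit-rate difference (%) of the test codec relative to the anchor.

    Negative values mean the test codec needs fewer bits for the same PSNR.
    """
    series = (anchor_rates, anchor_psnr, test_rates, test_psnr)
    if any(len(s) != 4 for s in series):
        raise ValueError("this sketch expects exactly four points per curve")
    log_anchor = [math.log10(r) for r in anchor_rates]
    log_test = [math.log10(r) for r in test_rates]
    # Compare the curves only where both cover the same quality range.
    lo = max(min(anchor_psnr), min(test_psnr))
    hi = min(max(anchor_psnr), max(test_psnr))
    if hi <= lo:
        raise ValueError("the PSNR ranges of the two curves do not overlap")
    diff = (mean_over(test_psnr, log_test, lo, hi)
            - mean_over(anchor_psnr, log_anchor, lo, hi))
    return (10.0 ** diff - 1.0) * 100.0


# Synthetic example: the test codec reaches each PSNR with 20% fewer bits.
anchor_rates = [1000.0, 2000.0, 4000.0, 8000.0]
anchor_psnr = [32.0, 35.0, 38.0, 41.0]
test_rates = [0.8 * r for r in anchor_rates]
saving = bd_rate(anchor_rates, anchor_psnr, test_rates, anchor_psnr)
```

Working in the logarithm of the bit rate keeps the high-rate points from dominating the average, which is the main reason the method integrates log-rate rather than rate.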

As part of this project, the BD-BR performance metric will be used to determine bit-rate savings.

6. Implementation

For comparison purposes, open-source implementations of the reviewed codecs will be used. HEVC compression efficiency will be measured with the HM Test Model [12]. VP9 compression performance will be evaluated with the VPX encoder from The WebM Project [13]. Since HEVC offers more intra-prediction modes than VP9 and the two codecs differ in several other features, both codecs are configured so as to establish a fair comparison. Encoding time is used to compare implementation complexity.

7. Test Sequences

The implementation will be carried out on the .yuv video sequences listed in Table 1. They have different resolutions and frame rates, covering as many use cases as possible.

Table 1: Test sequences [9]

Figures 14, 15, 16 and 17 are frames of the test sequences RaceHorses, BasketballDrill, Kimono1 and PeopleOnStreet, respectively.

Figure 14: RaceHorses (416x240)

Figure 15: BasketballDrill (832x480)

Figure 16: Kimono (1920x1080)

Figure 17: PeopleOnStreet (2560x1600)

8. Implementation results

To compare intra compression efficiency between HEVC and VP9, the All Intra Main configuration is used in HEVC. In VP9, the key frame parameter is adjusted to make the VPX encoder behave in All Intra (AI) mode. PSNR and bitrate values are recorded for different quantization parameters (22, 27, 32 and 37) for each test sequence. Tables 2, 3, 4 and 5 give the implementation results for the test sequences RaceHorses, BasketballDrill, Kimono1 and PeopleOnStreet, respectively.

RaceHorses_416x240_30.yuv (AI mode)

     HEVC                                            VP9
QP   PSNR(dB)   Bitrate(Kbit/s)   Encoding Time(s)   PSNR(dB)   Bitrate(Kbit/s)   Encoding Time(s)
22   42.5720    5057.16           63.726             41.143     6507.95           52.426
27   38.6146    3152.1000         57.084             40.191     4683.97           50.185
32   34.9992    1814.8920         48.975             36.230     2380.11           42.470
37   31.9718    979.5840          41.869             35.913     2262.14           40.828

Table 2: Implementation results for RaceHorses_416x240_30.yuv sequence

BasketballDrill_832x480_50.yuv (AI mode)

     HEVC                                            VP9
QP   PSNR(dB)   Bitrate(Kbit/s)   Encoding Time(s)   PSNR(dB)   Bitrate(Kbit/s)   Encoding Time(s)
22   42.3716    20407.88          278.849            42.3070    22101.34          197.060
27   39.0915    11014.04          220.022            39.553     13454.87          229.792
32   36.2986    5847.02           196.536            38.224     10621.66          178.128
37   33.9144    3200.72           162.733            35.831     7410.842          163.153

Table 3: Implementation results for BasketballDrill_832x480_50.yuv sequence

Kimono1_1920x1080_24.yuv (AI mode)

     HEVC                                            VP9
QP   PSNR(dB)   Bitrate(Kbit/s)   Encoding Time(s)   PSNR(dB)   Bitrate(Kbit/s)   Encoding Time(s)
22   43.3716    18738.4512        1054.518           43.164     20764.76          940.820
27   42.0109    10746.1920        886.952            42.053     13404.4746        749.586
32   39.7045    6408.2112         830.751            40.955     9977.10           634.479
37   38.0951    3776.95           785.519            39.868     7508.94           659.983

Table 4: Implementation results for Kimono1_1920x1080_24.yuv sequence

PeopleOnStreet_2560x1600_30_crop.yuv (AI mode)

     HEVC                                            VP9
QP   PSNR(dB)   Bitrate(Kbit/s)   Encoding Time(s)   PSNR(dB)   Bitrate(Kbit/s)   Encoding Time(s)
22   43.8084    104202.5640       2109.036           43.219     101898.87         1714.655
27   40.6998    60435.6720        1773.904           40.387     62989.20          1577.737
32   37.8947    34338.7800        1661.502           38.794     47220.45          1481.304
37   35.4149    19983.8040        1571.038           36.611     35706.77          1347.487

Table 5: Implementation results for PeopleOnStreet_2560x1600_30_crop.yuv sequence

Figures 18, 19, 20 and 21 show the bitrate-PSNR plots for the test sequences RaceHorses, BasketballDrill, Kimono1 and PeopleOnStreet, respectively.

Figure 18: R-D plot for RaceHorses_416x240_30.yuv (PSNR in dB versus bitrate in kbps; curves for HEVC and VP9)

Figure 19: R-D plot for BasketballDrill_832x480_50.yuv (PSNR in dB versus bitrate in kbps; curves for HEVC and VP9)

Figure 20: R-D plot for Kimono1_1920x1080_24.yuv (PSNR in dB versus bitrate in kbps; curves for HEVC and VP9)

Figure 21: R-D plot for PeopleOnStreet_2560x1600_30_crop.yuv (PSNR in dB versus bitrate in kbps; curves for HEVC and VP9)

Figures 22 and 23 illustrate the encoding time taken by HEVC and VP9, and BD-BR for HEVC and VP9, respectively.

Figure 22: Average encoding time of HEVC and VP9 (time in seconds)

Sequence                             HEVC      VP9
1 RaceHorses_416x240_30              84.66     74.36
2 BasketballDrill_832x480_50         214.53    192.03
3 Kimono1_1920x1080_24               889.43    746.21
4 PeopleOnStreet_2560x1600_30_crop   1778.87   1530.29

Figure 23: Bitrate savings of HEVC over VP9 (BD-Bitrate in %)

Sequence                             BD-BR
RaceHorses_416x240_30                -10.38
BasketballDrill_832x480_50           -14.07
Kimono1_1920x1080_24                 -17.17
PeopleOnStreet_2560x1600_30_crop     -12.6
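The values read off Figures 22 and 23 can be condensed into two summary numbers with a few lines of Python. The sketch assumes that, in each pair of average encoding times, the first value belongs to HEVC and the second to VP9; variable names are hypothetical.

```python
# BD-BR of HEVC relative to VP9, in percent (Figure 23).
bd_br_percent = {
    "RaceHorses": -10.38,
    "BasketballDrill": -14.07,
    "Kimono1": -17.17,
    "PeopleOnStreet": -12.6,
}

# Average encoding time in seconds as (HEVC, VP9) pairs (Figure 22).
avg_time_s = {
    "RaceHorses": (84.66, 74.36),
    "BasketballDrill": (214.53, 192.03),
    "Kimono1": (889.43, 746.21),
    "PeopleOnStreet": (1778.87, 1530.29),
}

# Mean bit-rate difference across the four sequences.
mean_bd_br = sum(bd_br_percent.values()) / len(bd_br_percent)

# Relative encoding-time reduction of VP9 with respect to HEVC, per sequence.
time_saving_percent = {
    name: 100.0 * (hevc - vp9) / hevc
    for name, (hevc, vp9) in avg_time_s.items()
}

print(f"Mean BD-BR of HEVC vs VP9: {mean_bd_br:.2f}%")
for name, saving in time_saving_percent.items():
    print(f"{name}: VP9 encodes {saving:.1f}% faster on average")
```

With these inputs the mean BD-BR is about -13.6%, and the average encoding time of VP9 is roughly 10 to 16% below that of HEVC.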

9. Conclusions

HEVC provides better compression than VP9, but VP9 is patent-free and can be used without licensing expenses. For intra frame coding, HEVC achieves bit-rate savings of about 13% on average over VP9 (BD-BR between -10.38% and -17.17% across the four test sequences). The encoding time taken by VP9 is somewhat lower than that of HEVC (roughly 10 to 16% lower on average per sequence in Figure 22).

10. References

[1] G.J. Sullivan et al, "Overview of the high efficiency video coding (HEVC) standard", IEEE Trans. Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1649-1668, Dec. 2012.
[2] D. Grois et al, "Performance Comparison of H.265/MPEG-HEVC, VP9, and H.264/MPEG-AVC Encoders", IEEE PCS 2013, pp. 394-397, San José, CA, USA, Dec. 8-11, 2013.
[3] D. Mukherjee et al, "The latest open-source video codec VP9 - An overview and preliminary results", Google Inc., United States.
[4] G.J. Sullivan et al, "Standardized Extensions of High Efficiency Video Coding (HEVC)", IEEE Journal of Selected Topics in Signal Processing, vol. 7, no. 6, pp. 1001-1016, Dec. 2013.
[5] Article on HEVC - http://en.wikipedia.org/wiki/high_efficiency_video_coding
[6] Q. Cai et al, "Lossy and lossless intra coding performance evaluation: HEVC, H.264/AVC, JPEG 2000 and JPEG LS", Signal & Information Processing Association Annual Summit and Conference (APSIPA ASC), 2012 Asia-Pacific, vol. 9, no. 12, pp. 1-9, Dec. 2012.
[7] "VP-Next Overview and Progress Update" (PDF), WebM Project (Google). Retrieved 2012-12-29. Available on: http://downloads.webmproject.org/ngov2012/pdf/04-ngovproject-update.pdf
[8] M.T. Pourazad et al, "HEVC: The new gold standard for video compression", IEEE Consumer Electronics Magazine, vol. 1, no. 7, pp. 36-46, July 2012.
[9] M.P. Sharabayko et al, "Intra Compression Efficiency in VP9 and HEVC", Applied Mathematical Sciences, vol. 7, no. 137, pp. 6803-6824, Hikari Ltd, 2013.
[10] Z. Wang et al, "Image quality assessment: From error visibility to structural similarity", IEEE Trans. on Image Processing, vol. 13, no. 4, pp. 600-612, Apr. 2004.
[11] H. Jain, "Comparative performance analysis of HEVC and H.264 Intra frame coding and JPEG2000", EE5359, UTA, Spring 2013. http://www-ee.uta.edu/dip/courses/ee5359/index.html
[12] HM Reference Software - https://hevc.hhi.fraunhofer.de/hm-doc/
[13] Chromium open-source browser project, VP9 source code, Online: http://git.chromium.org/gitweb/?p=webm/libvpx.git;a=tree;f=vp9;hb=aaf61dfbcab414bfacc3171501be17d191ff8506
[14] G. Bjøntegaard, "Calculation of average PSNR differences between RD-curves", ITU-T Q.6/SG16 VCEG 13th Meeting, Document VCEG-M33, Austin, USA, Apr. 2001.
[15] S. Jeong et al, "High efficiency video coding for entertainment quality", ETRI Journal, vol. 33, pp. 145-154, 2011.
[16] JVT Draft ITU-T recommendation and final draft international standard of joint video specification (ITU-T Rec. H.264-ISO/IEC 14496-10 AVC), March 2003, JVT-G050 - http://ip.hhi.de/imagecom_g1/assets/pdfs/jvt-g050.pdf
[17] White paper on PSNR-NI - http://www.ni.com/white-paper/13306/en/
[18] HEVC tutorial by I.E.G. Richardson: http://www.vcodex.com/h265.html

[19] J. Padia, "Complexity reduction for VP6 to H.264 transcoder using motion vector reuse", M.S. Thesis, EE Dept., UTA, Arlington, TX, 2010. Available on: http://wwwee.uta.edu/dip/courses/ee5359/index.html
[20] T. Wiegand et al, "Overview of the H.264/AVC Video Coding Standard", IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 560-576, Jul. 2003.
[21] G. Bjøntegaard, "Improvements of the BD-PSNR model", ITU-T SG16 Q.6, Doc. VCEG-AI11, Berlin, Germany, July 16-18, 2008.
[22] N. Ahmed, T. Natarajan, K.R. Rao, "Discrete Cosine Transform", IEEE Transactions on Computers, vol. C-23, pp. 90-93, Jan. 1974.
[23] I.E.G. Richardson, "The H.264 advanced video compression standard", 2nd Edition, Hoboken, NJ, Wiley, 2010.
[24] I.E.G. Richardson, "Video Codec Design: Developing Image and Video Compression Systems", Wiley, 2002.
[25] K.R. Rao, D.N. Kim and J.J. Hwang, "Video Coding Standards: AVS China, H.264/MPEG-4 Part 10, HEVC, VP6, DIRAC and VC-1", Springer, 2014.
[26] B. Bross et al, "High Efficiency Video Coding (HEVC) Text Specification Draft 10", Document JCTVC-L1003, ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC), Mar. 2013. Available on: http://phenix.itsudparis.eu/jct/doc_end_user/current_document.php?id=7243