H.263 Video Encoder
Introduction to the topic
Topic of the work
- A simplified H.263 video encoder on the DE2 FPGA Education and Development board
- The exercise work consists of several phases and sub-tasks
  o Receiving and understanding the system requirements
  o Writing a system specification
  o Software implementation of the encoder on a desktop PC
  o Functional verification on the desktop PC
  o Creating the SoC platform on the FPGA
  o Writing a communication driver for the Nios II processor
  o Porting the SW implementation onto the SoC platform
  o Verification and performance profiling of the pure SW implementation
  o HW/SW partitioning and hardware acceleration
  o Verification and performance profiling of the accelerated implementation
  o Documentation
H.263
- The basics of H.263 video encoding are explained during the following exercises
- Students are encouraged to get familiar with video encoding algorithms in general before they start the project
  o H.263 has a lot in common with algorithms like JPEG and MPEG-2
- A very simplified version of the H.263 video encoder (resembling Motion JPEG) is used
  o Only INTRA coding (i.e. no prediction between frames is applied)
  o The key algorithms are DCT (Discrete Cosine Transform), quantization, RLE (Run-Length Encoding), and VLC (Variable Length Coding)
Software
- Kactus2
  o System development
- Altera Quartus II v12.1
  o FPGA synthesis
  o Qsys for building Avalon/Nios II based systems
  o Integrated logic analyzer for HW debugging
- Nios II EDS
  o Software development environment for the Nios II processor
- nios2-terminal
  o Terminal software for the Nios II standard stream interface via JTAG UART
- Mentor Graphics ModelSim
  o Simulating own VHDL blocks/designs
- VLC video player
Hardware
- Desktop PC w/ Windows OS
  o Platform for the first encoder implementation
  o Utilized to verify the encoded video bitstream
- Altera DE2 Development and Education Board
  o Platform for the created Nios II based SoC
H.263 Video Encoder
Introduction to algorithms
Requirements for Video Transmission
- Communication delay (latency)
  o More important in video conferencing applications than in file-based streaming applications
  o Should be as low as possible (< 250 ms, or even 150 ms)
  o Should be kept as constant as possible
    - Avoid a burst of frames followed by a still image
    - Buffering
- Frame rate
  o Affects the perceived smoothness of motion
  o A video stream under 10 fps is perceived as a fast slide show
- Image resolution
  o The data size of a raw image is directly proportional to it
  o Depends on the application
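The relation between resolution, frame rate, and raw data size can be sketched with a few lines of arithmetic. This is only an illustration: the 12 bits per pixel figure assumes YUV 4:2:0 sampling, and the function names and the CIF comparison are hypothetical additions, not part of the exercise material.

```python
def raw_frame_bytes(width, height, bits_per_pixel=12):
    """Size of one uncompressed frame in bytes.

    12 bits per pixel is an assumption (YUV 4:2:0 sampling).
    """
    return width * height * bits_per_pixel // 8


def raw_bitrate_bps(width, height, fps, bits_per_pixel=12):
    """Bit rate of an uncompressed video stream in bits per second."""
    return width * height * bits_per_pixel * fps


qcif_bytes = raw_frame_bytes(176, 144)       # QCIF
cif_bytes = raw_frame_bytes(352, 288)        # CIF: 4x the pixels of QCIF
qcif_10fps = raw_bitrate_bps(176, 144, 10)   # raw QCIF stream at 10 fps

print(qcif_bytes, cif_bytes, qcif_10fps)
```

Doubling both dimensions quadruples the raw frame size, and the raw bit rate grows linearly with the frame rate.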
Introduction to H.263 Standard
- May 1996, ITU-T recommendation v1
- Block-based (macroblock size is 16 pixels by 16 lines)
- Motion estimation for temporal redundancy reduction
  o The same objects are likely to be present in adjacent frames
  o Half-pixel accurate motion vectors
- DCT for spatial redundancy reduction
  o 8 x 8 blocks
  o Adjacent pixel values differ only a little
- Quantization (lossy)
  o Control of compression ratio
- RLE and Huffman as entropy coding algorithms
  o Lossless compression
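How quantization controls the compression ratio can be illustrated with a minimal sketch. The uniform step size of 2 * qp and the truncation toward zero are simplifying assumptions for illustration, not the exact H.263 quantization rules; the coefficient values and function names are made up.

```python
def quantize(coeffs, qp):
    """Uniform quantizer with step 2 * qp, truncating toward zero (assumed rule)."""
    step = 2 * qp
    return [int(c / step) for c in coeffs]


def dequantize(levels, qp):
    """Inverse of quantize(); the truncated remainder is lost for good."""
    step = 2 * qp
    return [level * step for level in levels]


def count_nonzero(levels):
    return sum(1 for level in levels if level != 0)


# Hypothetical transform coefficients, ordered from low to high frequency.
coeffs = [620, -85, 40, -12, 6, -3, 1, 0]

for qp in (1, 4, 16):
    levels = quantize(coeffs, qp)
    print(qp, levels, count_nonzero(levels))
```

A larger qp leaves fewer non-zero levels to encode (higher compression), while dequantize() shows that the original values are no longer recoverable exactly.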
Block Diagram of H.263 Encoder
[Figure: block diagram of the H.263 encoder]
- Forward path: pre-processing -> prediction error computation (+/-) -> DCT -> Q -> entropy coding -> bits out (Huffman, VLC)
- Reconstruction loop: Q^-1 -> IDCT -> previous reconstructed pictures (same image as the decoder observes)
- Motion estimation produces the motion vector v(u,v); motion compensation is 1/2 pixel accurate (interpolation)
- In Intra mode, MBs are coded directly
- The figure shows an example of a quantized 8x8 block that is mostly zeros: no need to send the zeros in the 8x8 block to the decoder
Discrete Cosine Transform (DCT)
- Assumption: adjacent pixels differ only a little from each other
  o Thus, data in the frequency domain is easier to compress
- Spatial domain compression
  o Pixels are grouped into blocks and the blocks are then transformed into the frequency domain
  o Essential information is then in a more compact form
    - Important DCT coefficients are in the upper-left corner, that is, at low frequencies
- Compression is achieved by discarding the less important information of the transformed block
  o Quantization of coefficients
- DCT itself is a lossless transform
  o Limited accuracy of the coefficients, however, leads to some loss of information
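The energy compaction described above can be tried out with a direct (unoptimized) 8 x 8 DCT. This is a minimal sketch using the orthonormal DCT-II definition; the function names and the smooth test block are illustrative assumptions, and a real encoder would use a fast, typically fixed-point, implementation.

```python
import math

N = 8


def scale(k):
    """Normalization factor of the orthonormal DCT-II."""
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def basis(x, u):
    """Cosine basis function of frequency u sampled at position x."""
    return math.cos((2 * x + 1) * u * math.pi / (2 * N))


def dct_2d(block):
    """Forward 2-D DCT of an N x N block (direct O(N^4) form)."""
    return [[scale(u) * scale(v) * sum(block[x][y] * basis(x, u) * basis(y, v)
                                       for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct_2d(coeffs):
    """Inverse 2-D DCT, reconstructing the N x N pixel block."""
    return [[sum(scale(u) * scale(v) * coeffs[u][v] * basis(x, u) * basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


# Smooth test block: adjacent pixels differ only a little (hypothetical data).
block = [[100 + 2 * x + 3 * y for y in range(N)] for x in range(N)]
coeffs = dct_2d(block)
restored = idct_2d(coeffs)

print(round(coeffs[0][0], 3))  # DC coefficient, 8 times the block mean
```

For this smooth block the significant coefficients sit in the first row and first column (the upper-left region), and the inverse transform restores the pixels up to floating-point rounding, which matches the statement that the DCT itself is lossless.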
Entropy Encoding
- Next, the quantized coefficients are compressed in a lossless manner using entropy encoding
- Run-length coding
  o Lower-amplitude coefficients are likely to be zero after quantization
  o Arrange the successive quantized non-zero coefficients into combinations of (LAST, RUN, LEVEL)
    - LAST = whether this is the final non-zero coefficient in the block
    - RUN = number of preceding zeros
    - LEVEL = sign and magnitude of the non-zero coefficient
  o Coefficients are processed in zig-zag order
    - Because runs of zeros are most likely located at higher frequencies
- Huffman coding (variable length coding)
  o After RLE, the symbols are encoded based on their statistical characteristics
    - Shorter codewords for symbols which occur with high probability
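The zig-zag scan and the (LAST, RUN, LEVEL) combinations can be sketched as follows. This is an illustrative sketch, not the exact H.263 procedure: it run-length codes every coefficient of the block including the DC term, LEVEL is kept as a signed integer, and the function names and example block are assumptions.

```python
def zigzag_order(n=8):
    """(row, col) positions of an n x n block in zig-zag scan order."""
    def key(rc):
        r, c = rc
        d = r + c                      # index of the anti-diagonal
        return (d, r if d % 2 else -r)  # alternate the direction per diagonal
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def run_length_encode(block):
    """Return a list of (LAST, RUN, LEVEL) tuples for a quantized block."""
    events = []
    run = 0
    for r, c in zigzag_order(len(block)):
        value = block[r][c]
        if value == 0:
            run += 1                   # count zeros preceding the next level
        else:
            events.append([0, run, value])
            run = 0
    if events:
        events[-1][0] = 1              # mark the final non-zero coefficient
    return [tuple(e) for e in events]


def run_length_decode(events, n=8):
    """Rebuild the n x n block; trailing zeros are implied by LAST."""
    block = [[0] * n for _ in range(n)]
    order = zigzag_order(n)
    pos = 0
    for _last, run, level in events:
        pos += run
        r, c = order[pos]
        block[r][c] = level
        pos += 1
    return block


# Hypothetical quantized block: a few low-frequency values, zeros elsewhere.
quantized = [[0] * 8 for _ in range(8)]
quantized[0][0], quantized[0][1], quantized[2][0], quantized[1][2] = 19, 3, 2, -1

events = run_length_encode(quantized)
print(events)
```

Four tuples describe all 64 coefficients of this block, since the zeros after the last non-zero coefficient are never transmitted.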
H.263 Project work
- A simplified version of the H.263 video encoder is created
  o Only INTRA coding (i.e. no motion estimation/compensation)
  o Key algorithms: DCT, quantization, RLE and VLC
  o Supported image resolution is QCIF (176 x 144)
- Encoder: pre-processing -> DCT -> Q -> Entropy coding -> 011001011
- Decoder: 011001011 -> Entropy decoding -> Q^-1 -> IDCT -> Reconstructed pictures
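Before the DCT stage, a frame has to be divided into 8 x 8 blocks. The sketch below shows one way to do this for the luminance plane of a QCIF frame; the function name, the synthetic test image, and the list-of-lists representation are assumptions for illustration only.

```python
WIDTH, HEIGHT = 176, 144  # QCIF


def split_into_blocks(plane, size=8):
    """Yield (top, left, block) for each size x size block of a 2-D list."""
    height, width = len(plane), len(plane[0])
    if height % size or width % size:
        raise ValueError("plane dimensions must be multiples of the block size")
    for top in range(0, height, size):
        for left in range(0, width, size):
            yield top, left, [row[left:left + size]
                              for row in plane[top:top + size]]


# Synthetic luminance plane standing in for a captured frame.
luma = [[(x + y) % 256 for x in range(WIDTH)] for y in range(HEIGHT)]

blocks = list(split_into_blocks(luma))
macroblocks = (WIDTH // 16) * (HEIGHT // 16)  # 16 x 16 macroblocks

print(len(blocks), macroblocks)
```

A QCIF luminance plane yields 22 x 18 = 396 blocks of 8 x 8 pixels, grouped into 11 x 9 = 99 macroblocks; each block would then pass through DCT, Q, and entropy coding in turn.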
Design flow
- Requirements
- Specification
  o Performance analysis
  o Documentation
- SW Implementation
  o Performance analysis
  o Verification
- HW/SW partitioning
  o Performance analysis
- Final Implementation