A Realtime 1080P30 H.264 Encoder System on a Zynq Device

Similar documents
Zynq SATA Storage Extension (Zynq SSE) - NAS. Technical Brief from Missing Link Electronics:

Easy H.264 video streaming with Freescale's i.mx27 and Linux

Getting Started with the Xilinx Zynq All Programmable SoC Mini-ITX Development Kit

System Performance Analysis of an All Programmable SoC

Open Flow Controller and Switch Datasheet

Building an Embedded Processor System on a Xilinx Zync FPGA (Profiling): A Tutorial

Nutaq. PicoDigitizer 125-Series 16 or 32 Channels, 125 MSPS, FPGA-Based DAQ Solution PRODUCT SHEET. nutaq.com MONTREAL QUEBEC

MicroBlaze Debug Module (MDM) v3.2

7a. System-on-chip design and prototyping platforms

Von der Hardware zur Software in FPGAs mit Embedded Prozessoren. Alexander Hahn Senior Field Application Engineer Lattice Semiconductor

Video Conference System

SABRE Lite Development Kit

Eli Levi Eli Levi holds B.Sc.EE from the Technion.Working as field application engineer for Systematics, Specializing in HDL design with MATLAB and

LogiCORE IP AXI Performance Monitor v2.00.a

1000Mbps Ethernet Performance Test Report

Eureka Technology. Understanding SD, SDIO and MMC Interface. by Eureka Technology Inc. May 26th, Copyright (C) All Rights Reserved

NORTHEASTERN UNIVERSITY Graduate School of Engineering. Thesis Title: CRASH: Cognitive Radio Accelerated with Software and Hardware

Seeking Opportunities for Hardware Acceleration in Big Data Analytics

Accelerate Cloud Computing with the Xilinx Zynq SoC

Digitale Signalverarbeitung mit FPGA (DSF) Soft Core Prozessor NIOS II Stand Mai Jens Onno Krah

FPGA Manager PCIe, USB 3.0 and Ethernet

Architekturen und Einsatz von FPGAs mit integrierten Prozessor Kernen. Hans-Joachim Gelke Institute of Embedded Systems Professur für Mikroelektronik

AGIPD Interface Electronic Prototyping

Chapter 13. PIC Family Microcontroller

VPX Implementation Serves Shipboard Search and Track Needs

BDTI Solution Certification TM : Benchmarking H.264 Video Decoder Hardware/Software Solutions

Switch Fabric Implementation Using Shared Memory

A Scalable Large Format Display Based on Zero Client Processor

ALL-AIO-2321P ZERO CLIENT

HANIC 100G: Hardware accelerator for 100 Gbps network traffic monitoring

Reconfigurable System-on-Chip Design

Gigabit Ethernet Design

WiSER: Dynamic Spectrum Access Platform and Infrastructure

Getting the most TCP/IP from your Embedded Processor

All Programmable Logic. Hans-Joachim Gelke Institute of Embedded Systems. Zürcher Fachhochschule

Nios II-Based Intellectual Property Camera Design

Getting Started with RemoteFX in Windows Embedded Compact 7

SPI I2C LIN Ethernet. u Today: Wired embedded networks. u Next lecture: CAN bus u Then: wireless embedded network

Figure 1.Block diagram of inventory management system using Proximity sensors.

Architectures and Platforms

Avoiding pitfalls in PROFINET RT and IRT Node Implementation

A DIY Hardware Packet Sniffer

IOVU-571N ARM-based Panel PC

Ways to Use USB in Embedded Systems

Attention. restricted to Avnet s X-Fest program and Avnet employees. Any use

Hi3520D H.264 Codec Processor. Brief Data Sheet. Issue 02. Date

Embedded Systems on ARM Cortex-M3 (4weeks/45hrs)

AND8336. Design Examples of On Board Dual Supply Voltage Logic Translators. Prepared by: Jim Lepkowski ON Semiconductor.

10-/100-Mbps Ethernet Media Access Controller (MAC) Core

1000-Channel IP System Architecture for DSS

What is LOG Storm and what is it useful for?

White paper. Latency in live network video surveillance

Embedded Display Module EDM6070

USB - FPGA MODULE (PRELIMINARY)

HARDWARE MANUAL. BrightSign HD120, HD220, HD1020. BrightSign, LLC Lark Ave., Suite 200 Los Gatos, CA

A Design of Video Acquisition and Transmission Based on ARM. Ziqiang Hao a, Hongzuo Li b

Design of Remote Security System Using Embedded Linux Based Video Streaming

OpenSPARC T1 Processor

Computer Systems Structure Input/Output

Types Of Operating Systems

Qsys and IP Core Integration

An Embedded Based Web Server Using ARM 9 with SMS Alert System

Computer Organization & Architecture Lecture #19

Getting Started with Embedded System Development using MicroBlaze processor & Spartan-3A FPGAs. MicroBlaze

H.264 Based Video Conferencing Solution

AXI Performance Monitor v5.0

VTR-1000 Evaluation and Product Development Platform. User Guide SOC Technologies Inc.

System on Chip Platform Based on OpenCores for Telecommunication Applications

ARM Cortex -A8 SBC with MIPI CSI Camera and Spartan -6 FPGA SBC1654

System-on-a-Chip with Security Modules for Network Home Electric Appliances

Virtualized Execution and Management of Hardware Tasks on a Hybrid ARM-FPGA Platform

Model-based system-on-chip design on Altera and Xilinx platforms

HDMI 2.0 Implementation on Kintex UltraScale FPGA GTH Transceivers

Networking Remote-Controlled Moving Image Monitoring System

Pre-tested System-on-Chip Design. Accelerates PLD Development

Sockets vs. RDMA Interface over 10-Gigabit Networks: An In-depth Analysis of the Memory Traffic Bottleneck

AXIS Video Capture Driver. AXIS Video Capture Driver. User s Manual

MBP_MSTR: Modbus Plus Master 12

What is a System on a Chip?

ADM5120 HOME GATEWAY CONTROLLER. Product Notes

ALL-ZC-2140P-DVI PCoIP Zero Client Overview

COMPUTER HARDWARE. Input- Output and Communication Memory Systems

Implementation of Wireless Gateway for Smart Home

PCI Express Overview. And, by the way, they need to do it in less time.

10/100/1000Mbps Ethernet MAC with Protocol Acceleration MAC-NET Core with Avalon Interface

ARM Microprocessor and ARM-Based Microcontrollers

4. H.323 Components. VOIP, Version 1.6e T.O.P. BusinessInteractive GmbH Page 1 of 19

Latency on a Switched Ethernet Network

USB 3.0 Connectivity using the Cypress EZ-USB FX3 Controller

XMC Modules. XMC-6260-CC 10-Gigabit Ethernet Interface Module with Dual XAUI Ports. Description. Key Features & Benefits

Design of a remote image monitoring system based on GPRS

ARM Processors for Computer-On-Modules. Christian Eder Marketing Manager congatec AG

Video Monitoring and Log System

Getting Started Guide. Version 7.0

40G MACsec Encryption in an FPGA

The Dusk of FireWire - The Dawn of USB 3.0

SDLC Controller. Documentation. Design File Formats. Verification

Using FPGAs to Design Gigabit Serial Backplanes. April 17, 2002

SBC6245 Single Board Computer

WAVES. MultiRack SETUP GUIDE V9.80

Transcription:

A Realtime 1080P30 H.264 Encoder System on a Zynq Device Introduction The Zynq all programmable System On a Chip is a recently introduced device from Xilinx which incorporates two ARM A9 CPU cores, I/O peripherals, memory controller, and programmable logic. This paper describes the implementation of a 1080P30 realtime H.264 encoder system on the device. System Overview Figure 1 shows the block diagram of the system. HDMI Raster Video Input hdmi_raster_write_axi macroblock_read_axi Zynq Processinng System h264_enc_wrapper_axi Gigabit ethernet PHY UDP Packets Containing H.264 encoded data DDR3 Memory Flash Memory Figure 1 : System block diagram Ocean Logic Pty Ltd August 2014 Page 1

An HDMI 1080P60 raster video source provides the input video for the encoder. Every second frame of raster video is written to a frame buffer in the DDR3 memory to achieve the 2:1 frame rate conversion. This step is performed as 1080P60 is a standard HDMI video resolution and frame rate while 1080P30 is not. 16x16 macroblocks are then read from the frame buffer and fed into the H.264 encoder. The encoded output is encapsulated into UDP Ethernet packets and streamed out through the Ethernet interface for decoding by a remote device running the VLC multimedia software. The Zedboard with its ZC7020 device was selected as the platform for this exercise as it is a cost effective evaluation platform with all the required features for the system. The HDMI input interface is provided by the AES-FMC-IMAGEON-G FMC card from Avnet. System Implementation The system was implemented using the Vivado design software from Xilinx. This software provides an integrated environment to build a system using IP blocks. Apart from the standard IP blocks in Vivado, the system also used IP blocks created by Ocean Logic for the video input raster to macroblock conversion, and H.264 encoding. The system uses the High-Performance as well as the General Purpose AXI buss. The High- Performance buss configured for 64-bit data is used for connecting the video input logic and also the H.264 encoder to the DDR3 memory controller via the High-Performance slave ports of the Processing System. The 32-bit General Purpose AXI buss is used by the embedded processor to configure the IP blocks as well as by the DMA engine to collect the encoded H.264 data. The Zedboard provides 512Mbytes of 32-bit DDR3 memory clocked at 533MHz which is controlled by the hard-ip DDR3 controller on the Zynq device. This memory is shared by the embedded processor (program and data storage), the H.264 encoder (6.3Mbytes reference frame memory), and the video input raster to macroblock conversion logic (3.13Mbytes frame buffer). While the amount of memory used by the H.264 encoder is only a small proportion of the available memory, the bandwidth required is significant. The H.264 encoder requires 460 64- bit read/write accesses every 1024 encoder clock cycles. The encoder is in this case is clocked at 126MHz to achieve the encoding throughput for 1080P30 video so the 1024 encoder clocks equates to 8.13us. While there is theoretically 4.26Gbytes/s (4327 64-bit accesses in 8.13us) of available bandwidth, it is important to ensure that the embedded processor does not lock out the H.264 encoder s access to the DDR3 memory. In addition, the raster to macroblock conversion logic requires 384 64-bit accesses every 8.13us. There are also DDR3 memory page activations and deactivation cycles which will erode the available bandwidth. Ocean Logic Pty Ltd August 2014 Page 2

With the current system there are no issues with available DDR3 memory bandwidth. If the available bandwidth in a system were to be exceeded, a separate soft IP memory controller could be used to support the H.264 encoder and the raster to macroblock logic. Table 1 shows the resource utilization of the system. Resource Used Available Slice LUTs 35362 53200 Slice Registers 22754 106400 Memory 138 140 DSP 11 220 IO 31 202 Clocking 7 32 Slice Logic Utilization 11195 13300 Table 1 : ZC7020 resource utilization As mentioned above, the encoder clock frequency is 126MHz. This is not achievable in the ZC7020-1 device on the Zedboard for worst case conditions. A ZC7020-2 device will be required to achieve the encoder clock frequency for worst case conditions. System Software The embedded processor on the Zynq configures the H.264 encoder and also services the DMA engine interrupts for transferring the encoded data packets from the H.264 encoder to the Ethernet MAC. The embedded processor does not implement any of the H.264 encoding algorithm. The system software flow diagram is shown in Figure 2. After configuring the peripherals on the Zynq device and the HDMI interface, the status of the HDMI is read to see of a 1080P60 source is present. The operation is aborted if it is not present. The flash memory on the Zedboard is then checked to see if the CRC is valid for the set of encoding parameters. If it is not, the serial terminal connected to the USB UART can be used to enter a set of encoding parameters. Once entered, the parameters can be saved to the flash memory with a valid CRC. If SWITCH7 on the Zedboard is set to 1, the encoding parameters can also be input using the same method. With a valid set of encoding parameters, the H.264 encoder is configured and then encoding operation started. When the buffer full interrupt from the encoder is received, the data is transferred from the buffer via DMA to the Ethernet MAC on the Zynq processing system. Ocean Logic Pty Ltd August 2014 Page 3

Configure Peripherals Read HDMI input status 1080P60? Abort Flash memory CRC OK? Switch7 ON? Configure and start encoder Get encoding parameter s Encoded data buffer full? Transmit packet Figure 2 : Software flow diagram Ocean Logic Pty Ltd August 2014 Page 4

This type of software is referred to by Xilinx as a Bare Metal application as there is no underlying operation system. It was developed in the Xilinx SDK environment using C, with drivers and examples from Xilinx and also the AES-FMC-IMAGEON-G FMC card reference design. Summary The Zynq device incorporates a lot of built-in functionally coupled with the programmable fabric which makes it a very capable platform for implementing a large variety of embedded systems. The system shown here only uses a fraction of what the device is capable of. The Vivado design software provides an environment to quickly develop a system such as this. Ocean Logic Pty Ltd PO BOX 768 - Manly NSW 1655 - Australia E-Mail: contact@ocean-logic.com URL : http://www.ocean-logic.com/ Ocean Logic Pty Ltd August 2014 Page 5