KeyStone Training. Multicore Navigator Overview. Overview Agenda



Similar documents
KeyStone Architecture Security Accelerator (SA) User Guide

KeyStone Multicore. Ecosystem

Scaling Networking Applications to Multiple Cores

A Generic Network Interface Architecture for a Networked Processor Array (NePA)

Serial Communications

Chapter 1 Computer System Overview

Making Multicore Work and Measuring its Benefits. Markus Levy, president EEMBC and Multicore Association

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING Question Bank Subject Name: EC Microprocessor & Microcontroller Year/Sem : II/IV

Texas Instruments TMS320C6678 (Shannon) DSP Training

WiLink 8 Solutions. Coexistence Solution Highlights. Oct 2013

Wireshark in a Multi-Core Environment Using Hardware Acceleration Presenter: Pete Sanders, Napatech Inc. Sharkfest 2009 Stanford University

Software Defined Networking (SDN) - Open Flow

Open Flow Controller and Switch Datasheet

Development Kit (MCSDK) Training

Understanding OpenFlow

KeyStone Architecture Gigabit Ethernet (GbE) Switch Subsystem. User Guide

AFDX networks. Computers and Real-Time Group, University of Cantabria

Vi DANTE TM Card User & Setup Guide

Achieving Low-Latency Security

Transport Layer Protocols

Network administrators must be aware that delay exists, and then design their network to bring end-to-end delay within acceptable limits.

Multichannel Voice over Internet Protocol Applications on the CARMEL DSP

Chapter 11 I/O Management and Disk Scheduling

Top 10 Tips for z/os Network Performance Monitoring with OMEGAMON. Ernie Gilman IBM. August 10, 2011: 1:30 PM-2:30 PM.

QoS issues in Voice over IP

C-GEP 100 Monitoring application user manual

System Considerations

Chapter 02: Computer Organization. Lesson 04: Functional units and components in a computer organization Part 3 Bus Structures

Design and Verification of Nine port Network Router

Scheduling. Scheduling. Scheduling levels. Decision to switch the running process can take place under the following circumstances:

The Load Balancing System Design of Service Based on IXP2400 Yi Shijun 1, a, Jing Xiaoping 1,b

Implementation of a Wimedia UWB Media Access Controller

AN ANALYSIS OF DELAY OF SMALL IP PACKETS IN CELLULAR DATA NETWORKS

USER GUIDE EDBG. Description

Analyzing Full-Duplex Networks

Network Instruments white paper

Hello, and welcome to this presentation of the STM32 SDMMC controller module. It covers the main features of the controller which is used to connect

AMD Opteron Quad-Core

Performance of Software Switching

SoC IP Interfaces and Infrastructure A Hybrid Approach

A Scalable Large Format Display Based on Zero Client Processor

PEX 8748, PCI Express Gen 3 Switch, 48 Lanes, 12 Ports

PMC-XM-DIFF & EADIN/MODBUS Virtex Design

Implementing Multi-channel DMA with the GN412x IP

Operating Systems. Module Descriptor

Contents. Connection Guide. What is Dante? Connections Network Set Up System Examples Copyright 2015 ROLAND CORPORATION

The Lagopus SDN Software Switch. 3.1 SDN and OpenFlow. 3. Cloud Computing Technology

Understanding Latency in IP Telephony

Going Linux on Massive Multicore

Programming Audio Applications in the i.mx21 MC9328MX21

OSBRiDGE 5XLi. Configuration Manual. Firmware 3.10R

12K Support Training 2001, Cisc 200 o S 1, Cisc 2, Cisc y tems, Inc. tems, In A l rights reserv s rese ed.

Optimizing Converged Cisco Networks (ONT)

Digitale Signalverarbeitung mit FPGA (DSF) Soft Core Prozessor NIOS II Stand Mai Jens Onno Krah

Intel Ethernet Switch Load Balancing System Design Using Advanced Features in Intel Ethernet Switch Family

Achieving Nanosecond Latency Between Applications with IPC Shared Memory Messaging

Getting the most TCP/IP from your Embedded Processor

AN4646 Application note

OpenFlow with Intel Voravit Tanyingyong, Markus Hidell, Peter Sjödin

MOBILE ARCHITECTURE BEST PRACTICES: BEST PRACTICES FOR MOBILE APPLICATION DESIGN AND DEVELOPMENT. by John Sprunger

Von der Hardware zur Software in FPGAs mit Embedded Prozessoren. Alexander Hahn Senior Field Application Engineer Lattice Semiconductor

Intel Data Direct I/O Technology (Intel DDIO): A Primer >

Overview of Asynchronous Transfer Mode (ATM) and MPC860SAR. For More Information On This Product, Go to:

OpenSPARC T1 Processor

Easy H.264 video streaming with Freescale's i.mx27 and Linux

Pharos Control User Guide

Hardware Assisted Virtualization

Operating Systems Design 16. Networking: Sockets

HANIC 100G: Hardware accelerator for 100 Gbps network traffic monitoring

EBERSPÄCHER ELECTRONICS automotive bus systems. solutions for network analysis

Technical Bulletin. Enabling Arista Advanced Monitoring. Overview

Setup and Programming of the. Master Module R72-11Z-SLSASG-028

Outline. Institute of Computer and Communication Network Engineering. Institute of Computer and Communication Network Engineering

MP3 Player CSEE 4840 SPRING 2010 PROJECT DESIGN.

PCI Express Impact on Storage Architectures and Future Data Centers. Ron Emerick, Oracle Corporation

MCA Standards For Closely Distributed Multicore

Buffer Management 5. Buffer Management

Top 10 Tips for z/os Network Performance Monitoring with OMEGAMON Ernie Gilman

Computer Organization & Architecture Lecture #19

Enabling Cloud Architecture for Globally Distributed Applications

ARM Ltd 110 Fulbourn Road, Cambridge, CB1 9NJ, UK.

Computer Systems Structure Input/Output

DAC Digital To Analog Converter

White Paper Abstract Disclaimer

A DIY Hardware Packet Sniffer

LogiCORE IP AXI Performance Monitor v2.00.a

Last Class: OS and Computer Architecture. Last Class: OS and Computer Architecture

Design and Implementation of an On-Chip timing based Permutation Network for Multiprocessor system on Chip

Course 12 Synchronous transmission multiplexing systems used in digital telephone networks

Design and Implementation of the Heterogeneous Multikernel Operating System

Dante: Know It, Use It, Troubleshoot It.

10/100/1000Mbps Ethernet MAC with Protocol Acceleration MAC-NET Core with Avalon Interface

System Design Issues in Embedded Processing

An Oracle Technical White Paper November Oracle Solaris 11 Network Virtualization and Network Resource Management

A Comparative Study on Vega-HTTP & Popular Open-source Web-servers

FOXBORO. I/A Series SOFTWARE Product Specifications. I/A Series Intelligent SCADA SCADA Platform PSS 21S-2M1 B3 OVERVIEW

theguard! ApplicationManager System Windows Data Collector

Avoiding pitfalls in PROFINET RT and IRT Node Implementation

Note! The problem set consists of two parts: Part I: The problem specifications pages Part II: The answer pages

Chapter 13. PIC Family Microcontroller

Transcription:

KeyStone Training Multicore Navigator Overview What is Navigator? Overview Agenda Definition Architecture Queue Manager Sub System (QMSS) Packet DMA (PKTDMA) Descriptors and Queuing What can Navigator do? Data movement InterProcessor Communication Job management

What is Navigator? What is Navigator? Definition Architecture Queue Manager Sub System (QMSS) Packet DMA (PKTDMA) Descriptors and Queuing What can Navigator do? Data movement InterProcessor Communication Job management Definition Multicore Navigator is a collection of hardware components that facilitate data movement and multi core control. The major components within the Navigator domain are: A hardware Queue Manager (QM). Specialized packet DMAs, called PKTDMA. Data structures to describe packets, called descriptors. A consistent API to manipulate descriptors and hardware. Navigator is the primary data movement engine in Nyquist, Turbo Nyquist and Shannon devices. Designed to be a fire and forget system load the data and the system handles the rest, without CPU intervention.

Navigator Architecture Navigator Architecture: PKTDMA

Navigator Architecture: QMSS Navigator Architecture: Host

Queue Manager Subsystem (QMSS) Features: 8192 total hardware queues, some dedicated to qpend signals. HW signals route to Tx DMA channels and chip level CPINTC. 20 Memory regions for descriptor storage (LL2, MSMC, DDR) 2 Linking RAMs for queue linking/management Up to 16K descriptors can be handled by internal Link RAM. Second Link RAM can be placed in L2 or DDR. Up to 512K descriptors supported in total. Major hardware components: Queue Manager PKTDMA (Infrastructure DMA) 2 PDSPs (Packed Data Structure Processors) for: Descriptor Accumulation / Queue Monitoring Load Balancing and Traffic Shaping Interrupt Distributor (INTD) module Queue Mapping This table shows the mapping of queue number to functionality. Queues associated with queue pend signals should not be used for general use, such as free descriptor queues (FDQs). Others can be used for any purpose. Queue Range Count Hardware Purpose 0 to 511 512 Type pdsp/firmware Low Priority Accumulation queues 512 to 639 128 queue pend AIF2 Tx queues 640 to 651 12 queue pend PA Tx queues (PA PKTDMA uses the first 9 only) 652 to 671 20 queue pend CPintC0/intC1 auto-notification queues 672 to 687 16 queue pend SRIO Tx queues 688 to 695 8 queue pend FFTC_A and FFTC_B Tx queues (688..691 for FFTC_A) 696 to 703 8 General purpose 704 to 735 32 pdsp/firmware High Priority Accumulation queues 736 to 799 64 Starvation counter queues 800 to 831 32 queue pend QMSS Tx queues 832 to 863 32 Queues for traffic shaping (supported by specific firmware) 864 to 895 32 queue pend vusr queues for external chip connections 896 to 8191 7296 General Purpose

Packet DMA Topology FFTC PKTDMA Multicore Navigator Queue Manager 0 1 2 3 4 5. 8192 SRIO PKTDMA AIF PKTDMA PKTDMA Network Coprocessor PKTDMA Multiple Packet DMA instances in KeyStone devices: PA and SRIO instances for all KeyStone devices. AIF2 and FFTC (A and B) instances are only in KeyStone devices for wireless applications. Packet DMA (PKTDMA) Features Independent Rx and Tx cores: Tx Core: Tx channel triggering via hardware qpend signals from QM. Tx core control is programmed via descriptors. 4 level priority (round robin) Tx Scheduler Additional Tx Scheduler Interface for AIF2 (wireless applications only) Rx Core: Rx channel triggering via Rx Streaming I/F. Rx core control is programmed via an Rx Flow (more later) 2x128 bit symmetrical Streaming I/F for Tx output and Rx input These are wired together for loopback in QMSS PKTDMA. Connects to peripheral s matching streaming I/F (Tx >Rx, Rx >Tx) Packet based, so neither Rx or Tx cores care about payload format.

Two descriptor types are used within Navigator: Host type provide flexibility, but are more difficult to use Contains a header with a pointer to the payload. Can be linked together (packet length is the sum of payload () sizes). Monolithic type are less flexible, but easier to use Descriptor contains the header and payload. Cannot be linked together. All payload s are equally sized (per region). Descriptor Types Descriptor Queuing This diagram shows several descriptors queued together. Things to note: Only the Host Packet is queued in a linked Host descriptor. A Host Packet is always used at SOP, followed by zero or more Host Buffer types. Multiple descriptor types may be queued together, though not commonly done in practice.

What is Navigator? What is Navigator? Definition Architecture Queue Manager Sub System (QMSS) Packet DMA (PKTDMA) Descriptors and Queuing What Can Navigator Do? Data Movement InterProcessor Communication Job Management Navigator Functionality Three major areas: Data Movement Peripheral input and output Infrastructure, or core to core transfers Chaining of transfers (output of PKTDMA A triggers PKTDMA B) Inter Processor Communication (IPC) Task/Core synchronization Task/Core notification Job Management Resource Sharing Load Balancing

Data Movement: Normal Peripheral input and output: Queue Manager Peripheral push Drive data through IP block using QM and PKTDMA Simple transmit is shown tx free queue tx queue pop Tx DMA read stream i/f control Rx DMA Switch Infrastructure or core tocore transfers: Transfer payload from L2 to L2, or DDR Memory read write Data Movement: Chaining Chaining of IP transfers (output of PKTDMA 1 triggers PKTDMA 2). write read write read write Chaining is accomplished by peripheral 1 pushing to a queue that is a Tx queue for peripheral 2. read

IPC Using QM without a PKTDMA Synchronization Queues are used by tasks or cores as a sync resource. Multiple receivers (slaves) can sync on the same trigger. Can also be done using the QMSS Interrupt Distributor, or the CpIntC queue pend queues. pop Queue Manager push Notification Zero copy messaging using shared memory Producers and consumers may be on different cores or the same core. Notification can be interrupt or polling. producer Switch push free queue tx/rx queue Memory pop consumer Switch Job Management Two main variations: Resource Sharing Multiple job queues are scheduled for a single shared resource. A resource can be a peripheral (e.g. FFTC) or a DSP task/core. Centralized Scheduler Load Balancing Single job queue is fanned out to several resources. The goal is to send a job to the least busy core. Distributed (multi) schedulers are another variation.

For More Information For more information, refer to the to Multicore Navigator User Guide http://www.ti.com/lit/sprugr9 For questions regarding topics covered in this training, visit the support forums at the TI E2E Community website.