Advanced Microcontrollers Grzegorz Budzyń Lecture. 11: Processors



Advanced Microcontrollers Grzegorz Budzyń Lecture 11: Digital Signal Controllers & Digital Signal Processors

Plan
- Digital Signal Controllers
- Introduction
- Digital Signal Controllers vs Microcontrollers
- Digital Signal Controllers vs Digital Signal Processors
- DSC by Texas Instruments
- DSC by Freescale
- Digital Signal Processors

Digital Signal Controllers

Introduction
A Digital Signal Controller (DSC) is a combination of a microcontroller and a Digital Signal Processor (DSP).

Introduction
Like microcontrollers, DSCs have:
- Fast interrupt response
- Control-oriented peripherals (PWM, watchdog, etc.)
- Usually programmed in C (although assembler programming is possible)

Introduction
Like digital signal processors, DSCs have:
- Single-cycle multiply-and-accumulate (MAC) instructions
- Barrel shifters
- Large accumulators

Introduction
Main applications of DSCs:
- Motor control
- Power conversion
- Sensor processing applications

Introduction
Main DSC vendors:
- Texas Instruments
- NXP
- Microchip
- Infineon
- Renesas

DSC by Microchip

dsPIC application areas

Source: [1]

dsPIC architecture
- dsPIC: a family of 16-bit RISC controllers with DSP features
- Two subfamilies:
  - dsPIC30F: smaller, slower
  - dsPIC33F: highest performance
- The only digital signal controller on the market available in QFN-28 packages, at prices down to $3

dsPIC architecture
Main features:
- Modified Harvard architecture
- Optimized for C compilers
- Two 40-bit accumulators with rounding
- Memory options as in the PIC24
- Many single-cycle MAC operations
- Packages with 18 to 110 pins

DSC by Texas Instruments

DSC - TI portfolio
Four main subfamilies:
- 24x 16-bit Series
- 28x Fixed-point Series
- 28x Piccolo Series
- 28x Delfino Floating-point Series

DSC - TI portfolio Source: [2]

Piccolo Series
TMS320F2802x:
- fixed-point microcontrollers
- 40-60 MHz performance
- up to 64 KB of on-chip flash
- small 38-pin package options
- feature-rich peripherals:
  - 150-ps high-resolution enhanced pulse width modulators (ePWMs)
  - 4.6 MSPS 12-bit ADC
  - high-precision on-chip oscillators, analog comparators
  - high-speed 12-bit ADC
  - support for I2C, SPI, and SCI
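To put the 150-ps figure in perspective, the sketch below estimates effective PWM resolution in bits as log2 of the number of distinct edge positions per PWM period. The 100 kHz switching frequency and the comparison against a counter clocked at 60 MHz are illustrative assumptions, not values taken from the TI documentation.

```python
import math

def pwm_resolution_bits(pwm_freq_hz, step_seconds):
    """Effective resolution: log2(number of edge positions per PWM period)."""
    steps_per_period = 1.0 / (pwm_freq_hz * step_seconds)
    return math.log2(steps_per_period)

PWM_FREQ_HZ = 100e3           # assumed switching frequency (illustrative)
COUNTER_STEP_S = 1.0 / 60e6   # edge step when limited by a 60 MHz counter clock
HIRES_STEP_S = 150e-12        # 150-ps high-resolution edge step

conventional_bits = pwm_resolution_bits(PWM_FREQ_HZ, COUNTER_STEP_S)
hires_bits = pwm_resolution_bits(PWM_FREQ_HZ, HIRES_STEP_S)
print(f"counter-limited: {conventional_bits:.1f} bits")
print(f"high-resolution: {hires_bits:.1f} bits")
```

Under these assumptions the counter-limited PWM resolves roughly 9 bits of duty cycle, while the 150-ps step gives roughly 16 bits at the same switching frequency.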

Piccolo Series
TMS320F2803x:
- fixed-point 32-bit microcontrollers
- 60 MHz speed
- up to 128 KB flash memory
- 64- or 80-pin packages
- peripherals and features of the 2802x devices plus:
  - control law accelerator (CLA) for high-efficiency control loops
  - QEP module
  - CAN and LIN interfaces

Delfino Series
TMS320F2833x:
- Integrated floating-point unit simplifies development and speeds up control applications by an average of 50%
- F2833x devices run at up to 150 MHz (300 MFLOPS), with two package offerings that are pin-for-pin compatible across all F2833x and F2823x controllers
- Features up to 512 KB of on-chip flash and a DMA for high-speed memory access

Delfino Series
TMS320C2834x:
- delivers up to 600 MFLOPS of floating-point performance
- up to 516 KB of single-access RAM
- PWMs with 65-ps resolution
- Direct Memory Access and a low-latency core make the C2834x an excellent solution for performance-hungry real-time control applications

28x Fixed-Point Series
TMS320F2823x:
- The F2823x generation of controllers is a fixed-point version of the F2833x devices
- Pin-to-pin compatible with the F2833x series; all of the peripherals and features remain the same except for the floating-point unit

28x Fixed-Point Series
TMS320F280x:
- The devices offer 60-100 MHz performance and were the first generation to feature:
  - the on-chip 12.5 MSPS 12-bit ADC
  - multiple high-resolution PWM peripherals
  - QEP (quadrature encoder pulse)
- F280xx devices have up to 256 KB of flash memory

28x Fixed-Point Series
TMS320F281x:
The F281x device generation features:
- 150 MHz core
- flexible Event Managers that provide access to timers, compare/PWM units, captures, and quadrature encoder units

C2000 architecture Source: [3]

C2000 core features
Main features:
- Efficient C engine with hardware that allows a C compiler to generate compact code, resulting in industry-leading code density
- Single-cycle read-modify-write instructions, single-cycle 32-bit multiply
- Fast interrupt service time (down to 9 cycles) with automatic zero-cycle context save
- 96 dedicated interrupt vectors that require no software decision making

C2000 core features
Main features:
- 32-bit floating-point unit on Delfino controllers
- On select Piccolo devices, an independent Control Law Accelerator (CLA) processes floating-point control loops to free the CPU for other purposes
- Three 32-bit general-purpose CPU timers bring accuracy and flexibility to any application
- Code Security Module prevents reverse engineering and protects valuable intellectual property

DSC by Freescale

Freescale microcontrollers portfolio Source: [4]

Freescale DSC portfolio Source: [4]

56F8000 block diagram Source: [5]

56F8000 details

56F8000 application Source: [5]

56F8XXX comparison Source: [5]

C2000 core features Source: [5]

MC56F8357
Main features:
- On-chip memory includes high-speed volatile and nonvolatile components:
  - 512 KB of Program Flash
  - 4 KB of Program RAM (836X devices)
  - 32 KB of Data RAM
  - 32 KB of Data Flash (836X devices)
  - 32 KB of Boot Flash
- Access to up to 4 MB of off-chip program memory and 32 MB of data memory
- Up to 60 MIPS at 60 MHz execution frequency

MC56F8357
Main features:
- Four 12-bit Analog-to-Digital Converters
- Temperature sensor
- Up to two FlexCAN modules (CAN Version 2.0 B-compliant)
- Two Serial Communication Interfaces (SCIs)
- Up to two Serial Peripheral Interfaces (SPIs)
- Two dedicated external interrupt pins
- Software-programmable Phase-Locked Loop

MC56F8357

MC56F8357 - Core
Main features 1/2:
- Efficient 16-bit 56800E family controller engine with dual Harvard architecture
- Single-cycle 16 x 16-bit parallel Multiplier-Accumulator (MAC)
- Four 36-bit accumulators, including extension bits
- Arithmetic and logic multi-bit shifter
- Parallel instruction set with unique DSP addressing modes
- Hardware DO and REP loops

MC56F8357 - Core
Main features 2/2:
- Three internal address buses and one external address bus
- Four internal data buses and one external data bus
- Instruction set supports both DSP and controller functions
- Controller-style addressing modes and instructions for compact code
- Efficient C compiler and local variable support
- Software subroutine and interrupt stack with depth limited only by memory

MC56F8357 Memory

MC56F8357 core architecture

MC56F8357 core pipeline

Digital Signal Processors

DSP Introduction
- Digital Signal Processing: the application of mathematical operations to digitally represented signals
- Signals are represented digitally as sequences of samples
- Digital signals are obtained from physical signals via transducers (e.g., microphones) and analog-to-digital converters (ADC)

DSP Introduction
- Digital signals are converted back to physical signals via digital-to-analog converters (DAC)
- Digital Signal Processor (DSP): an electronic system that processes digital signals

DSP Introduction
Most DSP tasks require:
- Repetitive numeric computations
- Attention to numeric fidelity
- High memory bandwidth, mostly via array accesses
- Real-time processing

DSP Introduction
DSPs must perform these tasks efficiently while minimizing:
- Cost
- Power
- Memory use
- Development time

Common DSP applications
Applications:
- Instrumentation and measurement
- Communications
- Audio and video processing
- Graphics, image enhancement, 3-D rendering
- Navigation, radar, GPS
- Control: robotics, machine vision, guidance

Common DSP algorithms
Algorithms:
- Frequency-domain filtering: FIR and IIR
- Frequency-time transformations: FFT
- Correlation

FIR algorithm
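The slide's figure is not available in this transcript, so as a stand-in here is a minimal sketch of a direct-form FIR filter, y[n] = sum over k of h[k] * x[n - k]. The function name and the moving-average coefficients are illustrative choices, not taken from the lecture.

```python
def fir_filter(samples, coeffs):
    """Direct-form FIR: y[n] = sum_k coeffs[k] * x[n - k]."""
    history = [0.0] * len(coeffs)  # delay line, newest sample first
    output = []
    for x in samples:
        # Shift the delay line and insert the newest sample at the front.
        history = [x] + history[:-1]
        # One multiply-accumulate (MAC) per filter tap.
        acc = 0.0
        for h, past in zip(coeffs, history):
            acc += h * past
        output.append(acc)
    return output

# 4-tap moving average applied to a unit step (illustrative input).
step_response = fir_filter([1.0] * 6, [0.25] * 4)
print(step_response)
```

Each output sample costs one MAC per tap, which is why the MAC unit and fast access to the delay line dominate DSP architecture.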

Common DSP architecture

Requirements <-> Realisations

Fast data access
- Need to transfer data to and from memory or DSP peripherals
- Need to retrieve instructions from memory
- Three main implementations:
  - high-bandwidth memory architectures
  - specialized addressing modes
  - direct memory access

High-bandwidth memory architectures

High-bandwidth memory architectures
- Only the Harvard (b) and Super-Harvard (c) architectures are used in DSPs
- The Super-Harvard modification adds a small bank of fast memory, called the instruction cache, to the DSP core
- Data are also allowed to be stored in the program memory
- The most recently executed program instructions are placed in the instruction cache at run time

High-bandwidth memory architectures
A data cache for fast access to data is also sometimes present.


High-bandwidth memory architectures
Cache drawbacks:
- Problems are caused by the lack of full predictability of cache hits
- A cache miss happens when the data or instructions needed by the DSP are not stored in cache memory, so they have to be fetched from a slower memory, with an execution speed penalty
- One situation that causes a cache miss is, for instance, a change of program flow due to branch instructions
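The unpredictability can be illustrated with a toy direct-mapped instruction cache. This is a deliberately simplified, hypothetical model (one instruction per cache line, 16 lines, made-up addresses), not the cache of any particular DSP.

```python
def count_misses(addresses, num_lines):
    """Count misses in a direct-mapped cache holding one address per line."""
    cache = [None] * num_lines
    misses = 0
    for addr in addresses:
        line = addr % num_lines      # direct mapping: address selects the line
        if cache[line] != addr:      # line holds something else: miss
            misses += 1
            cache[line] = addr       # fetch from slow memory and replace
    return misses

NUM_LINES = 16
ITERATIONS = 10

# Tight 8-instruction loop: only the first pass misses.
tight_loop = list(range(8)) * ITERATIONS

# Same loop, but each pass also branches to code at addresses 16..23,
# which map to the same cache lines as addresses 0..7 and evict them.
loop_with_branch = (list(range(8)) + list(range(16, 24))) * ITERATIONS

print(count_misses(tight_loop, NUM_LINES))
print(count_misses(loop_with_branch, NUM_LINES))
```

In this model the tight loop misses only while the cache warms up, whereas the branching variant misses on every fetch, so execution time depends on where the branch target happens to sit in memory.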

Specialized addressing modes
Address generator blocks control address generation for specialized addressing modes such as indexed addressing, circular buffers, and bit-reversal addressing.

Specialized addressing modes
Circular buffers: used, for example, in the implementation of digital filters.
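A software sketch of what circular addressing provides: the write index wraps around with a modulo operation, so the filter's delay line never has to be shifted. On a DSP the address generator performs the wrap in hardware; the class and function names here are illustrative.

```python
class CircularBuffer:
    """Fixed-size delay line with a wrapping write index."""

    def __init__(self, size):
        self.data = [0.0] * size
        self.head = 0  # index of the next write position

    def push(self, sample):
        self.data[self.head] = sample
        self.head = (self.head + 1) % len(self.data)  # wrap around

    def newest_first(self):
        n = len(self.data)
        return [self.data[(self.head - 1 - k) % n] for k in range(n)]

def fir_step(buf, coeffs, sample):
    """Push one sample and compute one FIR output from the delay line."""
    buf.push(sample)
    return sum(h * x for h, x in zip(coeffs, buf.newest_first()))

buf = CircularBuffer(3)
for s in (1.0, 2.0, 3.0, 4.0):
    buf.push(s)
print(buf.newest_first())  # the oldest sample (1.0) has been overwritten
```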

Specialized addressing modes
Bit-reversal addressing: necessary for the FFT (butterfly).
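As a small illustration, a radix-2 FFT reads or writes its data in bit-reversed index order, which the address generator produces in hardware. The sketch below computes the same permutation in software; the function names are illustrative.

```python
def bit_reverse(index, num_bits):
    """Reverse the lowest num_bits bits of index."""
    result = 0
    for _ in range(num_bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

def bit_reversed_order(n):
    """Index permutation for an n-point radix-2 FFT (n must be a power of two)."""
    num_bits = n.bit_length() - 1
    return [bit_reverse(i, num_bits) for i in range(n)]

print(bit_reversed_order(8))
```

For an 8-point transform, index 1 (binary 001) maps to 4 (binary 100), index 3 (011) maps to 6 (110), and so on.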

Direct memory access
- The DMA controller is a second processor working in parallel with the DSP core
- It is dedicated to transferring information between two memory areas or between peripherals and memory
- The DMA controller frees the DSP core for other processing tasks

Direct memory access

Fast computation: MAC-centered
- The MAC operation is used by many digital processing algorithms
- The basic DSP arithmetic processing blocks are:
  a) many registers
  b) one or more multipliers
  c) one or more Arithmetic Logic Units (ALUs)
  d) one or more shifters
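A minimal sketch of a MAC loop with a wide accumulator: 16-bit fractional (Q15) operands produce Q30 products that are summed in an accumulator with guard bits, then rounded and saturated back to 16 bits. The 40-bit accumulator width is an assumption chosen for illustration, and Python integers merely emulate the hardware registers.

```python
ACC_BITS = 40  # assumed accumulator width (illustrative)
ACC_MAX = (1 << (ACC_BITS - 1)) - 1
ACC_MIN = -(1 << (ACC_BITS - 1))

def mac_q15(a, b):
    """Sum of products of Q15 integers, kept in a saturating wide accumulator."""
    acc = 0
    for x, y in zip(a, b):
        acc += x * y                           # Q15 * Q15 = Q30 product
        acc = max(ACC_MIN, min(ACC_MAX, acc))  # saturate instead of wrapping
    return acc

def acc_to_q15(acc):
    """Round the Q30 accumulator to Q15 and saturate to 16 bits."""
    rounded = (acc + (1 << 14)) >> 15          # add half an LSB, then shift
    return max(-32768, min(32767, rounded))

# 0.5 * 0.25 accumulated four times gives 0.5 (16384 in Q15).
half, quarter = 16384, 8192
result = acc_to_q15(mac_q15([half] * 4, [quarter] * 4))
print(result)
```

The guard bits let intermediate sums exceed the 16-bit range without overflow; saturation is applied only once, when the result is written back.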

Fast computation: MAC-centered

Instruction pipelining
Instruction pipelining consists of:
- dividing the execution of instructions into different stages
- executing the different instructions in parallel stages
The net result is an increased throughput of instruction execution.
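The throughput gain can be estimated with a simple cycle count. This sketch assumes an idealized pipeline (every stage takes one cycle, no stalls or hazards), which real processors only approximate; the 4-stage depth and 1000-instruction workload are arbitrary example values.

```python
def cycles_sequential(num_instructions, num_stages):
    """Each instruction passes through all stages before the next one starts."""
    return num_instructions * num_stages

def cycles_pipelined(num_instructions, num_stages):
    """Fill the pipeline once, then complete one instruction per cycle."""
    return num_stages + (num_instructions - 1)

n, k = 1000, 4  # example workload and pipeline depth
speedup = cycles_sequential(n, k) / cycles_pipelined(n, k)
print(cycles_sequential(n, k), cycles_pipelined(n, k), round(speedup, 2))
```

For long instruction streams the speedup approaches the number of stages.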

Parallel architectures
Parallel-enhanced DSP architectures started to appear on the market in the mid-1990s and were based on:
- instruction-level parallelism (VLIW),
- data-level parallelism (SIMD),
- a combination of both

Parallel architectures - VLIW

Parallel architectures - VLIW
- In VLIW, many instructions are issued at the same time and are executed in parallel by multiple execution units
- Characteristics of VLIW architectures include simple and regular instruction sets
- Instruction scheduling is done at compile time and not at run time
- Writing assembly code for a VLIW architecture is very complex, and the optimization is often better left to the compiler

Parallel architectures - SIMD

Parallel architectures - SIMD
- SIMD architectures are based on data-level parallelism
- Only one instruction is issued at a time
- The same operation specified by the instruction is performed on multiple data sets
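The idea can be emulated in software with a packed add: two 16-bit lanes share one 32-bit word, and a single operation updates both lanes without letting a carry cross the lane boundary. This is only a sketch of the principle; the helper names are illustrative, and a real SIMD unit does this in one instruction.

```python
LANE_BITS = 16
LANE_MASK = 0xFFFF

def pack2(lo, hi):
    """Pack two 16-bit values into one 32-bit word."""
    return ((hi & LANE_MASK) << LANE_BITS) | (lo & LANE_MASK)

def unpack2(word):
    """Split a 32-bit word into its (low, high) 16-bit lanes."""
    return word & LANE_MASK, (word >> LANE_BITS) & LANE_MASK

def add2(a, b):
    """Lane-wise 16-bit add with wraparound; no carry crosses lanes."""
    lo = ((a & LANE_MASK) + (b & LANE_MASK)) & LANE_MASK
    hi = ((a >> LANE_BITS) + (b >> LANE_BITS)) & LANE_MASK
    return (hi << LANE_BITS) | lo

packed_sum = add2(pack2(1, 2), pack2(10, 20))
print(unpack2(packed_sum))
```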

Numerical fidelity
- It is essential that numerical fidelity be maximized
- The errors due to the finite number of bits used in the number representation and in the arithmetic operations should be minimized
- Numerical fidelity can be improved by changing the numeric representation or by dedicated hardware features

Numerical fidelity
DSPs can be categorized into:
- Fixed point (up to 64-bit, fractional arithmetic)
- Floating point (32- or 64-bit)
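To make fractional fixed-point arithmetic concrete, the sketch below uses the 16-bit Q15 format (1 sign bit, 15 fractional bits), one common choice among several. It shows the quantization error introduced by the finite word length and a rounded Q15 multiply; the helper names are illustrative.

```python
Q15_SCALE = 1 << 15  # 32768: one unit in Q15

def float_to_q15(x):
    """Quantize a real value in [-1, 1) to Q15, saturating at the limits."""
    return max(-Q15_SCALE, min(Q15_SCALE - 1, round(x * Q15_SCALE)))

def q15_to_float(q):
    return q / Q15_SCALE

def q15_mul(a, b):
    """Q15 x Q15 gives Q30; add half an LSB and shift back to Q15."""
    product = (a * b + (1 << 14)) >> 15
    return max(-Q15_SCALE, min(Q15_SCALE - 1, product))

q = float_to_q15(0.1)
error = abs(q15_to_float(q) - 0.1)
print(q, error)
print(q15_to_float(q15_mul(float_to_q15(0.5), float_to_q15(0.5))))
```

The quantization error is bounded by half of the least significant bit, that is 2^-16 for Q15, and values at or above 1.0 saturate to the largest representable value.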

Fast execution control
- It is important that the program in the DSP is executed in a deterministic way
- Interrupts have to be serviced with minimal latency
- An important DSP feature is the hardware implementation of looping constructs, referred to as the zero-overhead hardware loop, e.g. RPT #2 NOP

Digital Signal Processor example: TI Multicore DSP+ARM KeyStone II System-on-Chip (SoC)

TI 66AK2H12
- Up to 5.6 GHz of ARM and 9.6 GHz of DSP processing, coupled with security, packet processing, and Ethernet
- The raw computational performance is 38.4 GMACS/core and 19.2 GFLOPS/core (at 1.2 GHz operating frequency)

TI 66AK2H12
- Eight TMS320C66x DSP core subsystems, each with:
  - Up to 1.2 GHz C66x fixed/floating-point DSP cores
  - 38.4 GMACS/core for fixed point at 1.2 GHz
  - 19.2 GFLOPS/core for floating point at 1.2 GHz
- Memory:
  - 32 KB L1P per core
  - 32 KB L1D per core
  - 1024 KB local L2 per core

TI 66AK2H12
- ARM Cortex-A15 MPCore processors containing four ARM Cortex-A15 cores
- Up to 1.4 GHz Cortex-A15 processor core speed
- 4 MB L2 cache memory shared by all ARM cores
- Full implementation of the ARMv7-A architecture instruction set
- 32 KB L1 instruction cache and data cache per Cortex-A15 processor core
- AMBA 4.0 AXI Coherency Extension (ACE) master port, connected to the MSMC (Multicore Shared Memory Controller) for low-latency access to shared MSMC SRAM
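The headline figures quoted for the 66AK2H12 are aggregates and can be cross-checked with simple arithmetic. The sketch below only recombines the numbers given on the slides (core counts, clock rates, per-core throughput); the per-cycle and whole-chip values are derived here and describe peak rates, not measured performance.

```python
DSP_CORES, DSP_CLOCK_MHZ = 8, 1200
ARM_CORES, ARM_CLOCK_MHZ = 4, 1400
MMACS_PER_CORE = 38400    # 38.4 GMACS per DSP core
MFLOPS_PER_CORE = 19200   # 19.2 GFLOPS per DSP core

# Operations each DSP core must complete per clock cycle at peak rate.
macs_per_cycle = MMACS_PER_CORE // DSP_CLOCK_MHZ
flops_per_cycle = MFLOPS_PER_CORE // DSP_CLOCK_MHZ

# "GHz of processing" is the sum of the core clocks.
dsp_aggregate_mhz = DSP_CORES * DSP_CLOCK_MHZ
arm_aggregate_mhz = ARM_CORES * ARM_CLOCK_MHZ

# Peak throughput of all eight DSP cores together.
chip_gmacs = DSP_CORES * MMACS_PER_CORE / 1000
chip_gflops = DSP_CORES * MFLOPS_PER_CORE / 1000

print(macs_per_cycle, flops_per_cycle)
print(dsp_aggregate_mhz, arm_aggregate_mhz)
print(chip_gmacs, chip_gflops)
```

The sums reproduce the 9.6 GHz of DSP and 5.6 GHz of ARM processing, and the per-core figures imply 32 fixed-point MACs and 16 floating-point operations per cycle.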

TI 66AK2H12
Network Coprocessor:
- Packet Accelerator enables support for:
  - Transport plane IPsec, GTP-U, SCTP, PDCP
  - L2 user plane PDCP (RoHC, air ciphering)
  - 1 Gbps wire-speed throughput at 1.5 MPackets per second
- Security Accelerator Engine enables support for:
  - IPSec, SRTP, 3GPP and WiMAX air interface, and SSL/TLS security
  - ECB, CBC, CTR, F8, A5/3, CCM, GCM, HMAC, CMAC, GMAC, AES, DES, 3DES, Kasumi, SNOW 3G, SHA-1, SHA-2 (256-bit hash), MD5
  - Up to 6.4 Gbps IPSec and 3 Gbps air ciphering
- Ethernet subsystem:
  - Five SGMII Port Switch
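One plausible reading of "1 Gbps wire speed at 1.5 MPackets per second" is that it corresponds to minimum-size Ethernet frames; the slide does not say so, so treat this as an assumption. A 64-byte frame occupies 84 bytes on the wire once the 8-byte preamble and 12-byte inter-frame gap are counted.

```python
def max_packet_rate(link_bps, frame_bytes, overhead_bytes=20):
    """Packets per second on a saturated link.

    overhead_bytes covers the 8-byte preamble/start delimiter and the
    12-byte inter-frame gap that accompany every Ethernet frame.
    """
    bits_on_wire = (frame_bytes + overhead_bytes) * 8
    return link_bps / bits_on_wire

rate_min_frames = max_packet_rate(1_000_000_000, 64)
rate_max_frames = max_packet_rate(1_000_000_000, 1518)
print(round(rate_min_frames), round(rate_max_frames))
```

With 64-byte frames a 1 Gbps link carries about 1.49 million packets per second, consistent with the quoted 1.5 MPackets per second; larger frames need far fewer packets to fill the link.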

TI 66AK2H12
Peripherals:
- Four lanes of SRIO 2.1
  - 5 Gbps operation per lane
  - Supports direct I/O, message passing
- Two lanes of PCIe Gen2
  - Supports up to 5 GBaud
- Two HyperLink
  - Supports connections to other KeyStone architecture devices
  - Supports up to 50 GBaud
- Five Enhanced Direct Memory Access (EDMA) modules

TI 66AK2H12
Peripherals:
- Two 72-bit DDR3 interfaces with speeds up to 1600 MHz
- USB 3.0
- Two UART interfaces
- Three I2C interfaces
- 32 GPIO pins
- Three SPI interfaces
- Semaphore module
- Twenty 64-bit timers
- Five on-chip PLLs

KeyStone architecture
- High-performance structure for integrating RISC and DSP cores with application-specific coprocessors and I/O
- Four main hardware elements:
  - Multicore Navigator
  - TeraNet
  - Multicore Shared Memory Controller
  - HyperLink

KeyStone architecture
- Multicore Navigator:
  - A packet-based manager that controls 16k queues
  - When tasks are allocated to the queues, Multicore Navigator provides hardware-accelerated dispatch that directs tasks to the appropriate available hardware
- TeraNet:
  - A central resource for moving packets, with 2 Tbps capacity

KeyStone architecture
- Multicore Shared Memory Controller:
  - Enables processing cores to access shared memory directly, without drawing on TeraNet's capacity
  - Memory access does not block packet movement
- HyperLink:
  - Provides a 50-GBaud chip-level interconnect
  - Working with Multicore Navigator, HyperLink dispatches tasks to tandem devices transparently and executes tasks as if they were running on local resources

Thank you for your attention

Interesting parameters


References
[1] dsPIC family documentation; www.microchip.com
[2] www.ti.com
[3] C2000 family documentation; www.ti.com
[4] www.freescale.com
[5] 56F8000 family documentation; www.freescale.com
[6] http://www.coe.pku.edu.cn/tpic/2010913102418831.pdf
[7] http://www.dspguide.com/ch28.pdf
[8] http://www.cs.berkeley.edu/~pattrsn/252s98/lec08-dsp.pdf
[9] http://www.ti.com/lit/ds/symlink/66ak2h12.pdf