CASTNESS 2007 Workshop
MPSoC Virtual Platforms
Rainer Leupers
Software for Systems on Silicon (SSS)
RWTH Aachen University
Institute for Integrated Signal Processing Systems

Why focus on virtual platforms?
- MPSoC HW has become reality
  - TI: OMAP, DaVinci
  - STM: Nomadik
  - IBM: Cell
  - Intel: IXP, CoreDuo
  - Philips: Nexperia
  - Atmel: Diopsis
  - ARM: MPCore
  - ... and many others
- Still many open issues in MPSoC design methodology
  - HW/SW partitioning
  - Processor IP/NoC optimization
  - MPSoC programming
  - MPSoC prototyping
[Images: Intel, Nokia]

MPSoC design flow
[Figure: design flow from the application (Tasks 1 to 5) through the MPSoC specification and the MPSoC virtual platform (SW prototype) to the MPSoC HW platform prototype. Each MPSoC stage shows processors and HW blocks attached to a network-on-chip.]

What is a virtual platform?
- A SW model of a HW SoC platform
- Enables...
  - HW platform architecture exploration and optimization
  - SW development, debugging, and optimization
  - Concurrent HW/SW design ("HW/SW codesign")
- Requirements
  - High simulation speed
  - Speed/accuracy trade-off
  - Flexibility
  - Usability for non-HW experts

Some virtual platform history
- MPSoC is processor (i.e. software) dominated
- Instruction-set simulator (ISS) plays key role for simulation speed
- Early ISSs (e.g. for DSPs) used interpretive technology, therefore very slow
- Introduction of compiled simulation achieved speedup of 10-100x
[Figure: interpretive simulation (application in memory; instruction decode and execute, all at run time) versus compiled simulation (application passed through a simulation compiler at compile time; instruction behaviors in program memory executed at run time).]
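
The difference between interpretive and compiled simulation can be illustrated with a minimal sketch. The toy instruction set, its encoding, and all function names below are assumptions made for illustration only and do not come from any real ISS. A production compiled simulator generates host code; here a list of pre-bound behaviors stands in for that, so the sketch shows only the key idea: decode once ahead of time instead of on every executed instruction.

```python
from functools import partial

# Hypothetical toy ISA: opcode in bits 31..24, fields a, b, c in the lower bytes.
LI, ADD, ADDI, BNEZ, HALT = range(5)

def encode(op, a=0, b=0, c=0):
    """Pack one instruction into a 32-bit word (c may be a signed 8-bit value)."""
    return (op << 24) | (a << 16) | (b << 8) | (c & 0xFF)

def decode(word):
    """Unpack a word into (op, a, b, c), sign-extending the c field."""
    c = word & 0xFF
    if c >= 0x80:
        c -= 0x100
    return word >> 24, (word >> 16) & 0xFF, (word >> 8) & 0xFF, c

def execute(op, a, b, c, regs, pc):
    """Instruction behavior; returns the next pc, or -1 on HALT."""
    if op == LI:
        regs[a] = c
    elif op == ADD:
        regs[a] = regs[b] + regs[c]
    elif op == ADDI:
        regs[a] = regs[b] + c
    elif op == BNEZ and regs[a] != 0:
        return c
    elif op == HALT:
        return -1
    return pc + 1

def run_interpretive(program):
    """Interpretive simulation: every executed instruction is decoded again."""
    regs, pc = [0] * 4, 0
    while pc >= 0:
        pc = execute(*decode(program[pc]), regs, pc)
    return regs

def compile_program(program):
    """'Compile time': decode each word once and bind its behavior."""
    return [partial(execute, *decode(word)) for word in program]

def run_compiled(compiled):
    """'Run time': only the pre-bound behaviors are executed, no decoding."""
    regs, pc = [0] * 4, 0
    while pc >= 0:
        pc = compiled[pc](regs, pc)
    return regs

# Sums 5 + 4 + 3 + 2 + 1 into r0.
PROGRAM = [
    encode(LI, 0, 0, 0),     # r0 = 0
    encode(LI, 1, 0, 5),     # r1 = 5
    encode(ADD, 0, 0, 1),    # r0 = r0 + r1
    encode(ADDI, 1, 1, -1),  # r1 = r1 - 1
    encode(BNEZ, 1, 0, 2),   # if r1 != 0: jump to instruction 2
    encode(HALT),
]
```

Both `run_interpretive(PROGRAM)` and `run_compiled(compile_program(PROGRAM))` produce the same register state; the compiled variant simply moves the decode cost out of the simulation loop.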

Some virtual platform history
[Figure: cache-based JIT compiled simulation. Instructions of the application in program memory are decoded at run time, the resulting instruction behaviors are stored in a compiled simulation cache, and execution proceeds from that cache.]
- Added flexibility via cache-based JIT compiled simulation
- Dynamic binary translation results in another 10x speedup
- Further developments
  - Automatic ISS retargeting from processor models
  - Multi-core debugging facilities via synchronized ISSs
  - Transaction-level modeling (TLM) for fast bus/NoC simulation
  - Adoption of SystemC as de facto standard for ESL modeling
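
As a rough illustration of the cache-based approach, the sketch below translates an instruction the first time it is executed and reuses the result afterwards. The toy encoding, the `CachedSimulator` class, and the stale-entry check are all hypothetical; they only show why a run-time cache keeps most of the speed of compiled simulation while still tolerating program memory that changes during simulation.

```python
# Hypothetical toy ISA: opcode in bits 31..24, register in 23..16, 16-bit immediate.
LI, ADDI, BNEZ, HALT = range(4)

def word(op, reg=0, imm=0):
    """Pack one instruction into a 32-bit word."""
    return (op << 24) | (reg << 16) | (imm & 0xFFFF)

def translate(w):
    """Decode a word once and return a closure implementing its behavior."""
    op, r, imm = w >> 24, (w >> 16) & 0xFF, w & 0xFFFF
    if imm >= 0x8000:
        imm -= 0x10000  # sign-extend the immediate
    if op == LI:
        def behavior(regs, pc):
            regs[r] = imm
            return pc + 1
    elif op == ADDI:
        def behavior(regs, pc):
            regs[r] += imm
            return pc + 1
    elif op == BNEZ:
        def behavior(regs, pc):
            return imm if regs[r] != 0 else pc + 1
    else:  # HALT
        def behavior(regs, pc):
            return -1
    return behavior

class CachedSimulator:
    def __init__(self, memory):
        self.memory = memory    # program memory; may be modified between runs
        self.cache = {}         # pc -> (word, translated behavior)
        self.translations = 0   # counts how often translate() was needed

    def run(self):
        regs, pc = [0] * 4, 0
        while pc >= 0:
            w = self.memory[pc]
            entry = self.cache.get(pc)
            if entry is None or entry[0] != w:  # cache miss or stale entry
                entry = (w, translate(w))
                self.cache[pc] = entry
                self.translations += 1
            pc = entry[1](regs, pc)
        return regs

# Counts r0 down from 1000 to 0.
PROGRAM = [word(LI, 0, 1000), word(ADDI, 0, -1), word(BNEZ, 0, 1), word(HALT)]
```

Running `CachedSimulator(list(PROGRAM)).run()` executes about two thousand instructions but translates only four. If one word in `memory` is overwritten before a second `run()`, only that word is translated again.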

Virtual platforms today
- Different successful commercial offerings
  - CoWare, Vast, Virtio, Virtutech,...
- Fast and accurate enough, e.g. to
  - boot Linux and decode H.264 in (almost) real time
  - make virtual GSM phone calls
- In use by major semiconductor and system houses
[Image: Vast]

Virtual SHAPES platform (VSP)
[Figure: SHAPES HW architecture. Tile types shown: RISC DSP Tile, RISC Elementary Tile, DSP Elementary Tile, built from RISC, VLIW, MEM, BUS, and DNP blocks. Further labels: NoC, Multi-Tile Chip, Multi-Chip System.]

VSP (derived from Atmel Diopsis 940)
[Figure: block diagram of the VSP. Labeled blocks: ARM926EJS RISC (ICE, instruction cache, data cache, MMU, BIU, PSP, RDM IF, DXM interface (AHB EBI), ARM JTAG); SRAM; ROM; mAgicV DSP (JTAG, DPM, DDM with 6 accesses/cycle, 2-port/16-port 256x40 data registers, 10 float ops/cycle, 4 addresses/cycle multiple DSP address generation, DSP AHB master and slave); multi-layer AMBA bus matrix; DNP (AHB master and slave; links X+, X-, Y+, Y-, Z+, Z-, C+); PDMA; APB bridge; NoC (NI). Annotations: LISA Model; AMBA Bus Library; Modeled by INFN; SystemC Functional Model.]

Virtual SHAPES platform (VSP)
- Modeling accuracy levels
  - Cycle accurate (CA): partially existing, out of SHAPES scope
  - Instruction accurate (IA): current focus
  - Fast performance estimation: future work
- IA modeling based on CoWare VP/LISATek technology
  - Platform modeling with Virtual Platform Designer (VPD)
  - Embedded SW development with Virtual Platform Architect (VPA)
[Figure: VPD (HW view) -> package & ship to user -> VPA (SW view)]

Future work: simulation on multi-core hosts
- Growing SW and HW complexity: simulation speed will remain an important issue for virtual platforms
- Moore's law treats embedded MPSoC and general-purpose computers equally
- OS-based load balancing of multi-core host could be optimized for application domain, i.e. MPSoC simulation
[Figure: an MPSoC virtual platform (ASIC, CPU, ASIP, and memory blocks) mapped onto a simulation host (CPUs and memory).]

Future work: ultra-fast MPSoC simulation
- Simulation of large-scale multi- MPSoC systems (e.g. SHAPES) will require novel concepts
- Need to move to higher abstraction levels: CA -> IA -> ??? -> native code execution
- Earlier work: Virtual Processing Unit (VPU) for high-level architecture and OS exploration [Kogel/Kempf@ISS]
  - Timing-annotated native execution
- Goal: hybrid simulation, i.e. seamless handover between VPU/native and IA simulation
  - Fast-forwarding concept
  - Speed/accuracy trade-off
[Figure: timelines of task A and task B (init, get, busy1, request, put, response, busy2), each annotated with delays Δt (A1, A2, B1, B2), with a toggle between VPU and ISS execution.]
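
The hybrid idea can be sketched very roughly as follows. Everything here is an assumption for illustration: the `Segment` record, the `run_hybrid` function, the delay values, and especially the IA mode, which is reduced to an instruction count multiplied by a cycles-per-instruction factor instead of a real ISS. The sketch shows only the handover: segments before a chosen point advance simulated time by their annotated Δt, and segments from that point on are timed by the (stand-in) IA model.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Segment:
    name: str
    native: Callable[[dict], None]  # functional behavior, run directly on the host
    annotated_delay: int            # assumed Δt estimate in cycles
    instructions: int               # assumed instruction count seen by the IA model

def run_hybrid(segments, state, switch_at, cycles_per_instr=1):
    """Fast-forward natively until segment `switch_at`, then use IA timing."""
    time, mode, log = 0, "native", []
    for seg in segments:
        if seg.name == switch_at:
            mode = "ia"             # handover from native/VPU to IA simulation
        seg.native(state)           # functional effect is identical in both modes
        if mode == "native":
            time += seg.annotated_delay
        else:
            # Stand-in for stepping an instruction-accurate ISS.
            time += seg.instructions * cycles_per_instr
        log.append((seg.name, mode, time))
    return time, log

def init(state):
    state["x"] = 0

def busy(state):
    state["x"] += 1

# Hypothetical task with three segments and made-up timing figures.
TASK_A = [
    Segment("init", init, annotated_delay=100, instructions=120),
    Segment("busy1", busy, annotated_delay=500, instructions=480),
    Segment("busy2", busy, annotated_delay=300, instructions=350),
]
```

Calling `run_hybrid(TASK_A, {}, switch_at="busy2")` fast-forwards through `init` and `busy1` using the annotated delays and times only `busy2` with the IA stand-in, which is where the speed/accuracy trade-off becomes a per-segment choice.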

Future work: checkpoint/restart technology
- Problem with virtual platforms:
  - Reaching a point of interest in the simulation may take considerable time
  - E.g. OS booting before entering application code
  - Makes debugging tedious
- Solution:
  - Checkpointing facility, i.e. dump entire simulation state on hard disk
  - Restarting facility, i.e. resume simulation from previous checkpoint
- Technically very challenging for SystemC simulation
  - Dependence on host OS details, SystemC simulation kernel, external debuggers, etc.
[Figure: simulation timeline with periodic checkpoints, from "Linux booted" to "Program crash".]
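
For a toy simulator whose entire state is a plain Python object, checkpoint/restart can be sketched in a few lines. The `Simulation` class, the file naming, and the helper functions are hypothetical. This deliberately sidesteps the hard part named above: a SystemC simulation also carries kernel, host OS, and debugger state that cannot simply be serialized like this.

```python
import os
import pickle

class Simulation:
    """Toy simulator whose complete state is two plain attributes."""
    def __init__(self):
        self.cycle = 0
        self.regs = [0] * 4

    def step(self):
        self.cycle += 1
        self.regs[0] += self.cycle

def save_checkpoint(sim, path):
    """Dump the entire simulation state to disk."""
    with open(path, "wb") as f:
        pickle.dump(sim.__dict__, f)

def load_checkpoint(path):
    """Create a fresh simulator and restore its state from a checkpoint."""
    sim = Simulation()
    with open(path, "rb") as f:
        sim.__dict__.update(pickle.load(f))
    return sim

def run_with_checkpoints(sim, until, interval, directory):
    """Run to cycle `until`, writing a checkpoint every `interval` cycles."""
    paths = []
    while sim.cycle < until:
        sim.step()
        if sim.cycle % interval == 0:
            path = os.path.join(directory, f"ckpt_{sim.cycle}.pkl")
            save_checkpoint(sim, path)
            paths.append(path)
    return paths
```

After a run with periodic checkpoints, `load_checkpoint(paths[-1])` resumes from the most recent checkpoint before a point of interest, so a long start-up phase does not have to be simulated again.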

Summary
- Virtual platforms...
  - can replace HW prototypes for architecture exploration and eSW development
  - enable truly concurrent embedded HW/SW design
- Advances in simulation technology have made virtual platforms applicable to today's MPSoC complexity levels
- Virtual SHAPES Platform under development
- Future topics
  - Higher simulation speed (e.g. multi-core hosts, higher abstraction levels)
  - Improved usability (e.g. hybrid simulation, checkpointing)
[Figure: project timeline (T1 to T6) with the phases application, VP creation, HW design, SW design, and integration.]

Thank you!
See also:
- CASTNESS presentation by Torsten Kempf
- www.iss.rwth-aachen.de
Institute for Integrated Signal Processing Systems