Multi-/Many-core Modeling at Freescale




Multi-/Many-core Modeling at Freescale David Murrell, Jim Holt, Michele Reese

Perspective: Virtual Platform ROI
- Schedules: SW and HW available on day 1 becoming an industry expectation
  - FSL P4080: Linux kernel boots on 1st silicon, within 2 days of receiving DS boards
  - MOT P4080: software development enabled 1 year before silicon
  - XBOX360 and PS3: OS kernels boot on 1st silicon within 2 weeks of receiving first eval boards (IBM's Mambo software VP)
  - Growing demand from customers
  - Evidenced by recent trends in industry strategic acquisitions
- Quality: VP is the proving ground for complex HW/SW interactions
  - Both HW and SW are improved while still in formative stages (it's not just that SW has a running start): functionality, usability and performance
  - Gain insight into pathological effects, and intercept them before solutions become costly
  - Typically 20-30 studies conducted per critical path IP block
  - Verified bit and cycle-level accuracy is essential

Virtual Platform Consumers
[Diagram of virtual platform consumers and their primary use cases. Labels: Ecosystem development; Debuggers; Tools/Partners; Verification & Validation; Pre-Si test bringup; Reference; Architectural Design/Exploration; μ-arch analysis; Marketing/FAEs; NPI; Next Generation Definition; Customer SW bringup; Demos; Performance assurance; Bake-offs; Tradeoff analysis; Proof/bench development; Performance Analysis; PRL assessment; Pre-Si BSP/SDK bringup; μcode development; Virtual DS; Software Enablement. Primary use case categories: Functional, Performance, Hybrid.]

Concept thru NPI Execution
[Timeline diagram. Labels, in extraction order: ALE; ALF; Planning; TO; Production; Func α; Functional β; Functional; Cycle; CornerStone; Exploration; α Performance; β Performance; Plat; Trace; Linux; Cornerstone Platform; Bare Metal Platform; Linux System Platform.]

Technology Improvement Strategy
- Scale UP
  - Continue to optimize single-thread performance
  - Continue to leverage JIT (DBT) technologies
  - OK for partial core cluster configurations
- Scale OUT
  - Migrate to distributed simulation platforms: multi-cores and multi-machine clusters
  - Explore transition to COREMU (or similar)
  - Assess the potential of the MIT Graphite platform
  - Possibilities to facilitate native execution for dedicated sim farms
    - FSL's VortiQa U
    - Inexpensive (low $100s per compute node)
    - 32-bit e500v2 (with SPE) configuration (P2020U)
    - 64-bit e5500 configuration (P5020U)
    - Combine with Graphite for multi-u simulation cluster
    - Port PIN or DynamoRIO to support Power ISA

What we've done with Graphite so far
- We've been using it in the context of the Angstrom project
- Ported Graphite to Red Hat Linux
- Learned the overall system architecture and how to add syscalls
- Lesson learned: syscalls
  - When you bring new application code into Graphite, it will sometimes include unsupported syscalls
  - If you see the runtime message from Graphite "Unhandled syscall number ###", you can typically identify the offending syscall by looking up the reported number here: http://asm.sourceforge.net/syscall.html
- Currently working on a cycle and power accurate Angstrom tile model using Freescale e200 cores; the goal is to integrate it with Graphite for Angstrom related research
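The syscall lookup described above can be automated when a run produces many such messages. The following is a minimal sketch, not part of Graphite: the function name `unhandled_syscalls` and the table `I386_SYSCALLS` are hypothetical, and the table is only a short excerpt that assumes the 32-bit x86 Linux numbering. Substitute the table that matches the ABI of the simulated application.

```python
import re

# Excerpt of the 32-bit x86 Linux syscall numbering (assumption: the
# reported numbers use this ABI; extend or replace for other targets).
I386_SYSCALLS = {
    1: "exit", 2: "fork", 3: "read", 4: "write", 5: "open", 6: "close",
    20: "getpid", 45: "brk", 90: "mmap", 91: "munmap",
}

# Message text as quoted on the slide; the number follows the phrase.
UNHANDLED = re.compile(r"Unhandled syscall number\s+(\d+)")


def unhandled_syscalls(log_lines, table=I386_SYSCALLS):
    """Return sorted, de-duplicated (number, name) pairs found in a log.

    Numbers missing from the table are reported with the name "unknown".
    """
    seen = set()
    for line in log_lines:
        match = UNHANDLED.search(line)
        if match:
            seen.add(int(match.group(1)))
    return [(number, table.get(number, "unknown")) for number in sorted(seen)]
```

Calling `unhandled_syscalls(open("sim.log"))` on a captured log (the file name is illustrative) gives a list of the syscalls that still need handlers, with each number reported once.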

Porting to Red Hat Linux
- In common/makefile.common change KERNEL to ETCH
- Remove the -Werror flag from all Makefiles (there are several warnings about classes with virtual functions that do not have virtual destructors)
- There are a couple of instances of "invalid use of sizeof operator" in pin/handle_syscalls.cc; these are easy to fix
- In common/misc/moving_average.h the call of overloaded 'pow' is ambiguous
  - line 120: changed UInt32 curr_window_size to Int curr_window_sizexxx
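Because the -Werror flag has to be removed from several Makefiles, the step can be scripted. The sketch below is one possible way to do it and is not taken from the Graphite sources: the function name `strip_werror` is hypothetical, and it assumes that every relevant file has a name starting with "Makefile" or "makefile". It removes only the standalone `-Werror` token and leaves forms such as `-Werror=format` untouched.

```python
import re
from pathlib import Path

# Match -Werror only as a whole whitespace-delimited token, together
# with any spaces or tabs that immediately follow it.
WERROR = re.compile(r"(?<!\S)-Werror(?!\S)[ \t]*")


def strip_werror(root):
    """Remove standalone -Werror flags from Makefiles under root.

    Returns the list of files that were modified, in sorted order.
    """
    changed = []
    for path in sorted(Path(root).rglob("*")):
        # Assumption: build files are named Makefile*, in any letter case.
        if not path.is_file() or not path.name.lower().startswith("makefile"):
            continue
        text = path.read_text()
        new_text = WERROR.sub("", text)
        if new_text != text:
            path.write_text(new_text)
            changed.append(path)
    return changed
```

Running it on a copy of the source tree first, and reviewing the returned file list, keeps the change easy to audit.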

BACKUP

Customer Expectations: Functional Virtual Platform
- High speed functional execution (programmer's view of the system)
  - Model the behavior to arrive at the correct result
  - 10s of MIPS per core at a minimum
- Evaluation board replacement
  - Functional fidelity: firmware, OS, drivers, and applications run without modification
  - Linux console
- Enhanced debug environment
  - Source-level: CodeWarrior, GDB, etc.
  - Low-level: core/device registers, TLBs, disassembly, memory
  - Rich breakpoint/watchpoint support
  - Event callbacks (triggers)
  - Full system checkpoint/restore
  - Python command shell
- Linux/Windows-32/64 hosts

Customer Expectations: Performance Virtual Platform
- System-level cycle accurate models
  - Model the behavior and the number of cycles consumed by the operation
  - Cores, caches, memory controllers, interconnects, accelerators, data path, devices
  - Executes unmodified binaries and traces
  - Fast-forward to points of interest via checkpoints and/or gear-shifting
- Verified cycle-timing fidelity
  - Accurate to within 10% of hardware logic
  - Models micro-architectural structures and policies
- Comprehensive set of system performance metric data
- Rich data visualization
  - Source-code traceability (multicore and hypervisor aware)
  - Custom plots
  - System activity heat maps
  - Live display and data replay
- Flexible instrumentation and control points
  - Event breakpoints
  - Event callbacks (triggers)
  - Application control of the simulator via specially-encoded instructions
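The "accurate to within 10% of hardware" target can be read as a bound on the relative error between the cycle count reported by the model and the count measured on hardware. The sketch below illustrates that reading only. The function names are hypothetical, and treating the target as a symmetric relative-error bound on total cycles is an assumption, since the slide does not say how fidelity is measured.

```python
def relative_cycle_error(model_cycles, hardware_cycles):
    """Relative difference between modeled and measured cycle counts."""
    if hardware_cycles <= 0:
        raise ValueError("hardware_cycles must be positive")
    return abs(model_cycles - hardware_cycles) / hardware_cycles


def within_fidelity(model_cycles, hardware_cycles, tolerance=0.10):
    """True if the model is within the tolerance (default 10%) of hardware."""
    return relative_cycle_error(model_cycles, hardware_cycles) <= tolerance
```

With illustrative numbers, a model reporting 1,080,000 cycles against a hardware measurement of 1,000,000 cycles has a relative error of 8% and meets the target, while a model reporting 1,150,000 cycles (15%) does not.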

Performance Analysis and Design Exploration Flow
[Flow diagram. Steps: Analysis Plan; Select Workloads; Experimental Design Alternatives; Port, Test, and Debug; Pre-ALF Measurements; Post-ALF Measurements; Post-Si Measurements; Analyze Data; Investigate Anomalies; Design Improvements; Summary Analysis Report. Models and platforms: Func Model; Perf Model; Perf Model Traces; Emulator; Eval Boards; Cornerstone, Bare Metal and Linux Platforms. Other labels: PRL; Linux; EGFE. Legend: S&A: SoC Systems; S&A + PcP: Core+MSS.]

Verification Flow
[Flow diagram: HW Verification Flow. Steps: Specs; Verification Plan; Develop Test Cases; Debug Test Cases; Test Suites; Execute Test Cases; Compare Function; Verify Bit Accuracy; Timing Calibration; Model Fixes; IP Model Verification Addendum. Models and platforms: RTL Simulator; Func Model; Emulator; Bare Metal Platform; Eval Boards. Other labels: Verif; S&A.]

Software Development Flow
[Flow diagram. Steps: PRL; Develop; Debug and Test Function; Debug and Test Timing; Race Conditions & Optimize. Models and platforms: Func Model; Bare Metal Platform; Linux Platform. Software: Linux SW; FM Microcode; Hypervisor; Drivers; USDPAA; Linux; VortiQa. Legend: IP SW; IP: Drivers & Microcode.]

Embedded Reference Flow
[Flow diagram. Labels: Petra; Stampede; Func Model; Test and Debug; Reference; Petra Platform; Bare Metal Platform; Eval Board.]