System Software and TinyAUTOSAR Florian Kluge University of Augsburg, Germany parmerasa Dissemination Event, Barcelona, 2014-09-23
Overview parmerasa System Architecture Library RTE Implementations TinyIMA pbios4cm TinyAUTOSAR Parallelisation techniques for AUTOSAR Parallelised Interrupt Handler Synchronisation Buffers Performance Evaluation of Remote Service Calls 2
RTE Requirements Scheduling Com. & Sync. Automotive Avionic Construction Pre-emptive (Earliest Deadline First) Resources, Events, buffered/unbuffered Messages Fixed cyclic + Preemptive Messages; Events, Buffers, Blackboards, Semaphores Round-robin Events, Semaphores, Spin-Locks I/O Low latency Predictability Low latency Protection OS-Application, (Task) Partition Task Blocking sync./com. techniques need interaction with system scheduler different scheduling algorithms 3
System Architecture Application Layer Application Specific Domain Specific Interface User Mode Non-Critical RTE Services Domain Specific Protection Boundary Critical RTE Services Mode Scheduling Protection Communication & Synchronisation I/O Domain Specific Context Management Memory Management Synchronisation Mechanisms Services Interrupt Handling Common Library Simulated Hardware 4
Library Abstraction of underlying hardware Provides RTE-independent services WCET-analysable primitives Synchronisation functions for parallel applications Common base for RTE implementations OS consists of Library RTE-specific isolation-critical services Non-critical RTE services: no influence on other partitions 5
Avionic System Architecture Partition I Partition II APEX APEX User Mode ARINC 653 O/S (Non-Critical Services) Protection Boundary ARINC 653 O/S (Non-Critical Services) ARINC 653 O/S (Critical Services) Mode Library Simulated Hardware 6
TinyIMA Implementation Minimal subset of ARINC 653 API required by Honeywell application to guarantee time isolation Partition and process management Communication and synchronisation primitives Buffers, Queueing Ports, Semaphores ARINC XML system definition replaced with C functions Extend API to support cluster-based many-cores Partitions Clusters CONNECT_SRCDST_QUEUING_PORTS Processes Cores BIND_PROCESS_TO_CORE No parallelisation at TinyIMA level has been provided 7
Construction Machinery System Architecture Parallelized BAUER Application Middleware pbios4cm - Interrupt handling - Thread divided to cores Library Simulated Hardware 8
pbios4cm - Parallel BIOS for Construction Machinery Based on BIOS API of original ECU vendor Single-core TriCore system (ESX-3XL from Sensortechnik Wiedemann) Basic functions for thread scheduling, I/O, timing Modifications: Thread-safe re-design and implementation (sources not available) Simplified thread management: One thread per core Supported I/O functionalities CAN Generic I/O for PWM, digital and analogue input Interrupt and DMA system 9
Automotive System Architecture Application Layer User Mode AUTOSAR Runtime Environment (RTE) Protection Boundary Critical RTE Services (Tiny AUTOSAR) OS System Services Abstraction Layer I/O Hardware Abstraction Microcontroller Abstraction Layer Complex Drivers Mode Library Simulated Hardware 10
TinyAUTOSAR Properties: Evaluation platform for parallelised automotive applications executed in parallel Provide minimum subset of AUTOSAR software architecture Comprise basic functionalities required on any core Provide basic communication and synchronisation features for multi-/many-core processors Implementations: Multicore Implementation according to AUTOSAR 4.x Implementation with optimisations for parallelised applications 11
Optimisations for Parallelised Applications Support for execution of parallelised tasks and runnables Alternative scheduling: time synchronous execution model Synchronisation buffers Extended to support parallelised interrupt handler Communication Comparison of implementation approaches of cross core service calls 12
Time synchronous execution model Time-triggered execution of tasks/runnables Exchange of data only at time synchronisation points no need for explicit synchronisation! For regular tasks: defined in PharOS Sequential Task R2 R3 R5 R2 R3 R2 R5 R2 R3 R2 Parallelised Task R2 R2 R2 R2 R2 R3 R5 R3 R5 R3 0 1 2 3 4 5 6 8 13
Synchronisation Buffer Aim: each runnable execution has time-consistent view of input data buffer data, access relative to TSP Latest possible TSP TSP starting point of R3 TSP R1 R1 R1 2 R1 3 R1 4 R1 R1 R1 R1 R3 0 R3 1 R3 1 R3 2 0 1 2 3 4 5 6 Buffering duration of data from R1 2 7 8 14
Synchronous Execution of a Parallelised Interrupt Handler Exploit time synchronous execution model also for ISRs! TSPs ADV(3) ADV(4) Time Core 1 (TT) A1 A2 A3 A1 IRQ i at t IRQ(i) 0 4 8 ADV abs (t) ADV(2) System Real- Time Clock H Core 2 (ET) IH PH PH Core 3 (ET) ADV abs (t) spawn PH 0 4 ADV(2) PH PH Real-Time Clock H IRQ(i) t = t IRQ(i) + t WCTT = 4 CRP TSP Ax IH PH Agent x (TT domain) Initial Handler (ET domain) Processing Handler (ET domain) CRP ADV abs (t) Clock Reference Point time base for future ADV instructions on corresponding cores Advance Absolute Instruction advances to an absolute point of system real-time clock t IRQ(i) t WCTT Interrupt event i at time t Worst Case Traversal Time 15
Results Actual start time of Processing Handler: 150-200 cycles after elapse of counter Deviation between Processing handlers: <50 cycles ADV(3) ADV(4) Time A1 A2 A3 A1 0 4 8 ADV abs (t) ADV(2) IH PH PH spawn PH 0 4 ADV abs (t) PH ADV(2) PH t = t IRQ(i) + t WCTT = 4 CRP TSP 16
Remote Service Calls General Structure of Service call: Service Call Check passed yes Action Return no Some service calls can affect other cores e.g. ActivateTask Questions: Where should the single parts be executed? How is communication organised? 17
Cross-Core Service Calls: Approaches Message based OS data access across core Lock based direct OS data access across core OS data OS data OS data OS data Core Core Core Core Everything on Target Core Check & Action on Source core, Only Rescheduling on Target Core 18
Experimental Results Advantage for lock based approach Less remote core blocking times Faster execution Message based OS data access across core Lock based direct OS data access across core OS data OS data OS data OS data Core Core Core Core 19
Summary RTE bases for parmerasa applications Avionic: TinyIMA Construction Machinery: pbios4cm Automotive: TinyAUTOSAR Adaption for application parallelisation Automotive RTE: TinyAUTOSAR Minimum evaluation platform Investigations using TinyAUTOSAR: Time-synchronous execution Synchronisation buffers Synchronous execution of parallelised interrupt handler Performance of remote service calls: messages vs. locks 20