Architectures for Distributed Real-time Systems
Michael W. Masters, NSWCDD
SDP Workshop, Nashville, TN, 13 Dec 2001

Building Systems for the Real World

What is the Problem?
  Capability sustainment
  Affordable life cycle
  Low upgrade cycle time
  Technology refreshable despite obsolescence

Multi-dimensional Trade Space!
  Real-time
  Complex
  Long life
  Evolving
  Distributed
  Mission critical
  Cost conscious
  High performance

High Performance Distributed Computing

DARPA Goal: Transition Computing Technology to Military
HiPer-D Premise: New Computer Program & System Architecture Required to Fully Exploit COTS Technology
Navy Goal: Provide Increased Capacity & Scalability

[Diagram labels: HiPer-D Quorum Navy Team Architecture Industry Navy Real-time Systems]

DARPA Technologies
  Advanced computers
  Operating systems
  Advanced networks
  Low latency protocols
  Quality-of-service middleware management

Architecture Concepts
  Distributed processing
  Open systems
  Portability
  Scalability
  Fault tolerance
  Shared resource mgt.
  Self-instrumented

Navy Benefits
  Load-invariant tactical performance
  Information access
  Mission flexibility
  Continuous availability
  Rapid upgrades
  Low ownership cost

Michael W. Masters RTAS 2001 - Architectures for Distributed Real-time Systems Slide 3

SYSTEM ARCHITECTURE VIEWS

[Figure: four views of the system architecture; labels recovered from the diagram]

FUNCTIONAL VIEW
  COMPANION SYSTEMS, OPERATOR CONTROL, READINESS, SENSORS, SYNTHESIS, DECISIONS, EFFECTORS, SEMI-AUTONOMOUS EFFECTORS
  Flows: Events; To / From All

PHYSICAL VIEW
  satcom, network link to other testbeds, data base, MPP, JTIDS, DSP2, algorithm accelerator, VHF taccom, Embedded MPP, fire cntl, data base, DSP1, Surv radar, fire cntl, SAR

TECHNOLOGY VIEW
  App, App, App, App
  Common Services
  Distribution Frameworks Middleware
  Adaptation Middleware
  Real-Time Operating System
  Computing Equipment: Switches, Drivers, Processors
  Cable Plant, Cabinets
  Resource Management (vertical label)

SOFTWARE VIEW
  load sharing, ordered multicast, fault tolerance
  primary server, shadow server, Replicated servers
  client, Replicated clients, Process group

Unplaced labels: visualization, mission control
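
The software view names a primary server, a shadow, and a process group that receives updates by ordered multicast. One way to picture how these pieces fit together is the minimal single-process sketch below. It is an illustration under stated assumptions, not the HiPer-D implementation: the class names `Replica` and `ProcessGroup`, the replica names, and the "first live member is primary" rule are all hypothetical, and the loop in `multicast` only stands in for a real ordered multicast protocol.

```python
from typing import List, Optional


class Replica:
    """One server replica holding its own copy of the state (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.alive = True
        self.log: List[str] = []  # updates applied so far, in delivery order

    def apply(self, update: str) -> None:
        self.log.append(update)


class ProcessGroup:
    """Primary/shadow group: every live member sees updates in the same order."""

    def __init__(self, replicas: List[Replica]) -> None:
        self.replicas = replicas

    def primary(self) -> Optional[Replica]:
        # Assumption: the first live replica acts as primary, the rest as shadows.
        for replica in self.replicas:
            if replica.alive:
                return replica
        return None

    def multicast(self, update: str) -> None:
        # Stand-in for ordered multicast: one sender, one fixed delivery order.
        for replica in self.replicas:
            if replica.alive:
                replica.apply(update)


group = ProcessGroup([Replica("server-a"), Replica("server-b")])
group.multicast("track 17 updated")
group.replicas[0].alive = False      # simulate failure of the primary
group.multicast("track 18 created")
new_primary = group.primary()        # the shadow takes over with full state
```

Because the shadow applied every update the primary did, it can take over without a state transfer; that is the property the replicated-server arrangement is meant to provide.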

QoS REFERENCE ARCHITECTURE

[Figure: QoS reference architecture spanning client and server computers; labels recovered from the diagram, in extraction order]
  Computer; Client Application; Security Mgt.; User Requirements; connectors A and B
  Replication Services; Application; QoS Broker; QoS Specs.; Enterprise Mgt.
  Publish Subscribe; Distributed Objects; Group Ordered
  Appl. Ctrl Agent; Control; Allocation; Security Agent; Name Service
  Monitor; Failure Monitor; QoS Broker; Appl. QoS Mgt. & Neg.
  O/S Adaptation Layer; Operating System; Mid-level Protocols
  Auto-Config.; Process Failure; Process Startup; Time Service
  Low-level I/O; QoS; Security Services; Utilization
  QoS Broker; Server Application; Physical Media; Computer / Hardware

GUIDANCE DOCUMENT

Computer Program Design
  Component partitioning
  Portability
  Location transparency
  Client-server
  Data distribution
  State data coherency
  Computational flow
  Fault tolerance
  Scalability
  Real-time performance
  Process, thread & memory mgt.
  Data flow management
  Track data distribution
  Legacy capture

Computing Technology Base
  Cabling and cabinets
  Information transfer
  Computing resources
  Peripherals
  Middleware management
  Instrumentation
  Failure management
  Information assurance
  Time services
  Programming/language support facilities
  Requirements and design tools, methodologies and processes

CHALLENGES FOR THE FUTURE

Fault Tolerance
  Faster fault detection and isolation (<< 1 sec), e.g. via hardware support for fault detection and reconfiguration
  Integrated failure management across technology base

Middleware
  Faster, scalable performance during join, leave & recovery events
  Integrated products with full range of middleware functionality
  Middleware for higher performance domains

Management
  Optimal, stable system-wide dynamic allocation algorithms
  Run-time schedulability and stability analysis for mixed real-time systems (hard, soft, event)
  Incorporation of network QoS and routing management

Security
  Intrusion detection, authentication, mgt. of security domains, etc.
  Integration with other technologies, e.g. Management

System
  Support for system end-to-end performance requirements
  Certification methods for dynamically allocated systems
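
The fault tolerance challenge asks for detection well under one second. As a point of reference, a software-only detector is commonly built on heartbeats and a timeout, and its detection latency is bounded below by that timeout, which is one reason the slide points toward hardware support. The sketch below is a minimal illustration of that baseline, not a method from the presentation: `HeartbeatMonitor`, the node names, and the 0.1 s timeout are all assumed for the example, and time is passed in explicitly so the behaviour is deterministic.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HeartbeatMonitor:
    """Timeout-based failure detector (hypothetical illustration)."""

    timeout_s: float  # assumed detection bound in seconds, e.g. 0.1
    last_seen: Dict[str, float] = field(default_factory=dict)

    def heartbeat(self, node: str, now: float) -> None:
        """Record that `node` reported in at time `now` (seconds)."""
        self.last_seen[node] = now

    def failed_nodes(self, now: float) -> List[str]:
        """Return, sorted by name, nodes silent for longer than the timeout."""
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


monitor = HeartbeatMonitor(timeout_s=0.1)
monitor.heartbeat("radar-1", now=0.00)
monitor.heartbeat("fire-ctl", now=0.00)
monitor.heartbeat("radar-1", now=0.08)      # radar-1 keeps reporting
suspects = monitor.failed_nodes(now=0.15)   # fire-ctl has been silent for 0.15 s
```

In a running system the `now` argument would come from a monotonic clock such as `time.monotonic()`. Shortening `timeout_s` speeds up detection but raises the risk of falsely suspecting a node that is merely slow.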