Design of High Availability Systems & Software




HighAv - Version: 2 (21 June 2016)

Duration: 2 days

Course Description: This course examines the high-level design of embedded systems and software that must provide their services at near-continuous availability. High availability systems must tolerate both expected and unexpected faults. Their design is based on redundant hardware and software combined in ways that achieve five-nines (99.999%) or greater availability, equivalent to less than 1 second of downtime per day. Basic hardware N-plexing and voting issues are discussed, followed by an in-depth study of a number of backward error recovery fault tolerance techniques including static N-version programming, Checkpoint-Rollback, Process Pairs, and Recovery Blocks. The class continues with several forward error recovery techniques. Technical issues such as failover management, data replication, and software design defects are addressed in depth. Many real-world examples are presented. This is not a general course on system or software design theory; it focuses specifically on the design of embedded systems and software that must make their services available at all times, with roughly 5 minutes of downtime per year.

Intended audience: This course is intended for practicing real-time and embedded systems software architects, project managers and technical consultants who are responsible for designing, structuring and implementing the software for real-time and embedded computer systems that are required to continue providing service despite the occurrence of internal and external faults.

Prerequisites: Many (but not all) high-availability systems are also safety-critical systems, which can threaten human safety or even human life in situations where the system fails and remains unavailable for significant periods of time. For those high-availability systems that also have safety-critical requirements, we recommend taking the course "Design of Safety-Critical Systems and Software" at the same time as this course. The two courses have little overlap in content, and offer complementary approaches and perspectives. It is possible to combine these two courses into a unified three- or four-day course for presentation at customer sites.

Objectives: The primary goal of this course is to give participants the skills necessary to design software for real-time and embedded computer systems that must relentlessly provide service despite the occurrence of internal and external faults. This is a very practical, results-oriented course that will provide knowledge and skills that can be applied immediately.

Topics:

Definitions and Background
  High Availability
  Fault -> Error -> Failure
  Single Points of Failure
  Fault Tree Analysis
  Exercise: Probabilistic Fault Tree Analysis
Underlying Principles

  Fault Avoidance vs. Tolerance
  Redundancy
  Failure Curves
  Replication vs. Functional Redundancy vs. Analytic Redundancy
  Dynamic vs. Static Redundancy
  Extended Example: Space Shuttle Software
Fundamental System-Level Design Patterns
  Static Hardware Fault Tolerance
  N-Plex Design
  Exercise: MTBF, MTTF Calculations in Triple Modular Redundancy
  Dynamic System Fault Tolerance
  Redundant Pairs
  Clusters
  Cluster Failover Strategy Choices
  Examples: Redundant Cluster Design
Concepts for Backward Error Recovery
  Design Diversity
  Dynamic System Redundancy
  Backward Error Recovery
  Transactions
  Checkpointing
System and Software Design Patterns for High Availability
  Checkpoint-Rollback
  Process Pairs
  Recovery Blocks
  Limitations of Backward Error Recovery Patterns
  Forward Error Recovery Design Patterns
Technical Issues in High Availability Design
  RAID: Redundant Arrays of Inexpensive Disks
  Exercise: Hamming Codes
  Failover Management
  Data Replication
  Dealing with Software Design Faults
  Extended Example: Airbus A330/340 Fly-by-Wire
  C Language in Critical Systems
  Software Robustness: MISRA-C, LINT, Static Code Analyzers
  Exercise: C-Language Shenanigans
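
Note on the availability figures: the five-nines downtime budgets quoted in the course description can be checked with simple arithmetic. The following is a minimal sketch, not part of the course material; the function name `downtime_seconds` and the 365-day year are assumptions made for illustration.

```python
# Sketch: convert an availability percentage into an allowed-downtime budget.
# Assumption: a 365-day year; the function name is hypothetical.

SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365


def downtime_seconds(availability_percent: float, period_seconds: float) -> float:
    """Return the maximum downtime (in seconds) allowed within the period."""
    unavailability = 1.0 - availability_percent / 100.0
    return unavailability * period_seconds


per_day = downtime_seconds(99.999, SECONDS_PER_DAY)
per_year = downtime_seconds(99.999, SECONDS_PER_DAY * DAYS_PER_YEAR)

print(f"per day:  {per_day:.3f} s")          # per day:  0.864 s
print(f"per year: {per_year / 60:.2f} min")  # per year: 5.26 min
```

At exactly 99.999% the daily budget is under 1 second, while the yearly budget comes to slightly more than 5 minutes, so a strict 5-minute annual limit requires marginally better than five-nines availability.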