Software Health Management An Introduction. Gabor Karsai Vanderbilt University/ISIS
|
|
|
- Bertram Hall
- 10 years ago
- Views:
Transcription
1 Software Health Management An Introduction Gabor Karsai Vanderbilt University/ISIS Tutorial at PHM 2009
2 Outline Definitions Backgrounds Approaches Summary
3 Definitions Software Health Management: A branch of System Health Management that applies health management techniques to the controlling software of a system. SHM goes beyond classical fault tolerance Software Fault Tolerance Fault detected Functionality restored Software Health Management Anomaly detected Fault source isolated Fault mitigated Fault prognosticated When the fault happens, SFT reacts Software and system health is managed
4 Software Health Management Goals: To prevent a (software) fault from becoming a (system) failure Manage health of the software Sense, analyze, and act upon health indicators Provide (relevant) information to operator, maintainer, designer Assumption: Software health is a measurable, non-binary property
5 Software Health Management Characteristics Performed at run-time, on the running system Includes all phases of health management: Detection: detect anomalous behavior Isolation: isolate source of fault (component, failure mode) Mitigation: take action to reduce/eliminate impact of fault Prognostics: predict impending faults and failures Can be highly mode- and mission/goal-dependent
6 Backgrounds: Basics of Software Fault Tolerance Definition: Software Fault Tolerance: Methods and techniques to implement software that can tolerate faults in itself, in the platform it is running on, in the hardware system it is connected to, in the environment
7 Backgrounds: Basics of Software Fault Tolerance Why? Serves as a foundation for SHM See Fault-Tolerance vs. System Health Management What? Follows the (HW) Fault Tolerance principles in SW Literature: Wilfredo Torres-Pomales: Software Fault Tolerance: A Tutorial, NASA/TM , Langley Research Center, 2000 CREDIT Software Fault Tolerance, Edited by Michael R. Lyu, Published by John Wiley & Sons Ltd. Google: Software Fault Tolerance
8 Basics of Software Fault Tolerance Single version Definition: FT for a software component (module, application, service, ) one version of the component (code) is used Architectural issues Foundation for SFT: the architecture Component-oriented architecture Modularization horizontal partitioning Layering vertical partitioning Common thread: prevent propagation of failures (H + V)
9 Basics of Software Fault Tolerance Single version Detection Requires: Self protection: component protects itself from outside effects Self checking: component detects its own faults and prevents their propagation Concepts / Techniques: Replication checks: components replicated and results compared Timing checks: deadlines, response times, Reversal checks: inverse function: output input Coding checks: use redundancy in representations, e.g. CRC Reasonableness checks: value/range/rate/sequence of data Structural checks : verify data structure integrity
10 Basics of Software Fault Tolerance Single version Exceptions and their management Language-based mechanisms C++, Java, Ada, not in C! Hierarchical nesting (per control flow) Incorrect requirements/design can lead to major problems (Ariane 5) Categories: Interface exceptions: self-protecting component raises it Local exceptions: generated and contained w/in component Failure exceptions: local management failed, global actions is needed
11 Basics of Software Fault Tolerance Single version Checkpoints and restarts Detect and restart Categories: Static : reset to an initial state Dynamic : checkpoint state, restore previous one upon failure Problems: non-invertible actions Process pairs Identical versions Separate processors State checkpointed On fault, backup takes over Component Checkpoint Checkpointed state Primary State copy Backup Restart Selector Error detector Error detector
12 Basics of Software Fault Tolerance Multi version Definition: FT for software system multiple versions of component/s (code) are used Multiple versions: Issues Same spec Diversity: in design, implementation, language, compiler, processor, etc. + independent teams Specification errors (e.g. omissions) could be a common source of faults Experimental result: faults are not really independently distributed over the input space underlying similarities in design/implementation/etc. and faults?
13 Basics of Software Fault Tolerance Multi version Recovery blocks Create checkpoint before start If version fails, try another one (use checkpointed state) Alternatives can provide graceful degradation Checkpoint state State copy Primary Alt 1 Alt 2 Alt n Selector Acceptance test N-version programming Independent alternatives Generic voter selects Alt 1 Alt 2 Voter Alt n
14 Basics of Software Fault Tolerance Multi version N self-checking Alt 1 Acceptance Test Each alternative is selfchecking Selection logic selects best Alt 2 Acceptance Test Selection Logic Alt n Consensus-based If the selection algorithm fails to find a correct output then an output is chosen that has passed the acceptance test Alt 1 Alt 2 Alt n Acceptance Test Selection Algorithm Error Switch Failure Acceptance Test Switch
15 Basics of Software Fault Tolerance Multi version Output selection issues Acceptance tests are hard to build Voters may have to work with inexact comparisons Two-step process: Filtering via acceptance tests Arbitration step to choose output Generalized voters: Majority, median, plurality, weighted averaging, Choice must be based on system level issues Reliability, safety, availability, etc.
16 Software Fault Tolerance vs. Software Health Management Complexity of systems necessitates an additional layer above SFT that manages the Software Health Why? Software is a crucial ingredient in aerospace systems Software as a method for implementing functionality Software as the universal system integrator Software could exhibit faults that lead to system failures Software complexity has progressed to the point that zero-defect systems (containing both hardware and software) are very difficult to build Systems Health Management is an emerging field that addresses precisely this problem: How to manage systems health in case of faults?
17 Software Health Management and System Health Management What is System Health Management? The on-line view: Detection of anomalies in system or component behavior Identification and isolation of the fault source/s Prognostication of impending faults that could lead to system failures Mitigation of current or impending fault effects while preserving mission objective/s Reports Observations Isolation Corrections Detection Mitigation Prognostics
18 Design for Software Health Management Component-oriented software architecture Systems are built by composing components via welldefined interfaces and composition principles There is a (highly robust and reliable) component framework that mitigates all component interactions Component framework is built to higher integrity/quality standards than application software (e.g. RTOS vs. app) Beyond classical architecture-based SFT: No single fault assumption multiple faults are possible Cascading fault effects are also possible Software Health Management is a system-level function it must be integrated with System Health Management
19 Example: Component Model Parameter Subscribe (Event) Provided (Interface) Component Publish (Event) Required (Interface) State Resource Trigger A component is a unit (containing potentially many objects). The component is parameterized, has state, it consumes resources, publishes and subscribes to events, provides interfaces and requires interfaces from other components. Publish/Subscribe: Event-driven, asynchronous communication Required/Provided: Synchronous communication using call/return semantics. Triggering can be periodic or sporadic.
20 Example: Component Interactions Sampler Component GPS Component Display Component P S S Components can interact via asynchronous/event-triggered and synchronous/call-driven connections. Example: The Sampler component is triggered periodically and it publishes an event upon each activation. The GPS component subscribes to this event and is triggered sporadically to obtain GPS data from the receiver, and when ready it publishes its own output event. The Display component is triggered sporadically via this event and it uses a required interface to retrieve the position data from the GPS component.
21 Design for Software Health Management Component-level health management Very localized limited capability, yet needed for higher levels Monitor component detect anomalies What to monitor Input and output: pre- and post-conditions on incoming and outgoing synchronous calls and asynchronous events State: invariants over the component state Timing: component operation execution time Execution (response) time Frequency of invocation Resource usage: component resource consumption patterns Memory, resource lock/unlock, etc. How to monitor Momentary values Rates History/trends
22 Component Monitoring Monitor arriving events Monitor published events Component Monitor incoming calls Monitor outgoing calls Observe state Monitor resource usage Monitor control flow/ triggering
23 Component-level Health Management A Component Level Health Manager reacts to detected events and takes mitigation actions. It also reports events to higher-level manager/s. Events: detected by monitoring Actions: Basic mitigation: reset, init, shutdown, destroy, checkpoint/restore Intercept related: allow/block call Specialized mitigation: inject event, call method, deallocate memory, release resource, Event or time-triggered activation Reporting Report events/actions to other managers Component Monitor Actions Events Component Framework Manager s behavioral model: - Finite-state machine - Triggers: monitored events, time - Actions: mitigation activities Manager Events Manager is local to component container (for efficiency) but must be protected from the faults of functional components.
24 Component-level Health Management Manager behavior: Track component state changes via detected events and progression of time Take mitigation actions as needed Design issues: Co-location with component Fault containment Efficiency Local detection may implicate another component Mitigation action may include blocking the call, overriding data Complexity of mitigation actions Verification of mitigation logic Safety conditions Performance issues Manager encapsulates all HM Logic Idle start timeout Exec finish /init /reset Component Monitor inva_violation Actions Events Component Framework WCET InvA Events Manager
25 Design for Software Health Management System-level health management Multiple components can fail, independently Fault effects cascade through components Anomalies (with cascading effects) and faults propagating through components and assemblies must be correlated and managed Diagnosis: Isolate the fault source component Mitigation: Take (component-)local or global action to mitigate effect of fault/s
26 Design for Software Health Management A system-level fault model: Timed Failure Propagation Graph Failure Mode FM1 FM2 Unmonitored Discrepancy (OR) 2,3 a,b 1,4 a,c 1,6 a,b,c 1,3 b,c D1 SD1 D2 Monitored Discrepancy (AND) 1,3 b,c 1,4 a,c 1,3 b,c t=3 t=6 2,5 a Current Time = 10 Op. Modes = {a,b,c}, Current Mode = b Alarm sequence: {(3,D2), (6,D3), (8,D4)} Propagation interval t=8 D4 1,6 a,b,c D3 F: set of failure modes D: set of discrepancies Discrepancy attributes: Type: {AND, OR} Condition: {Monitored, unmonitored} M: Set of operating modes E: set of edges Edge attributes: Propagation interval: [t min,t max ] Activation modes Abdelwahed, S., G. Karsai, and G. Biswas, "A Consistency-based Robust Diagnosis Approach for Temporal Causal Systems", 16th International Workshop on Principles of Diagnosis (DX '05), Monterey, CA, June, 2005.
27 Design for Software Health Management System-level health management Model: Faults (failure modes) and discrepancies (observed anomalies) can be located in different components Fault propagation occurs along component communication links / call chains Diagnosis: Correlate observations across multiple components, deduce fault source Features: modal, robust, ranked results, multiple faults
28 Design for Software Health Management System-level health management: Multi-component diagnosis Component Fault Model Component Fault Model D FM D FM FM D FM D D FM D D Diagnoser Manager Managed Component Component CHM Managed Component Component CHM Component Platform
29 Design for Software Health Management System-level health management: Multi-component, hierarchical mitigation Local: reflex reactions Regional: mitigation in an area Global: system-level mitigation Dubey, A., S. Nordstrom, T. Keskinpala, S. Neema, T. Bapty, and G. Karsai, "Towards a verifiable real-time, autonomic, fault mitigation framework for large scale real-time systems", ISSE, vol. 3, pp , 2007.
30 Summary Software Health Management: A branch of System Health Management that applies HM techniques to the controlling software of a larger system. Software Fault Tolerance provides useful techniques for SHM, but SHM reaches beyond SFT as it has a comprehensive approach to anomaly detection, diagnosis, mitigation and prognostics. Initial progress in the area of component-level and system-level software health management shows promise, but it is subject of ongoing research.
Proactive, Resource-Aware, Tunable Real-time Fault-tolerant Middleware
Proactive, Resource-Aware, Tunable Real-time Fault-tolerant Middleware Priya Narasimhan T. Dumitraş, A. Paulos, S. Pertet, C. Reverte, J. Slember, D. Srivastava Carnegie Mellon University Problem Description
The Service Availability Forum Specification for High Availability Middleware
The Availability Forum Specification for High Availability Middleware Timo Jokiaho, Fred Herrmann, Dave Penkler, Manfred Reitenspiess, Louise Moser Availability Forum [email protected], [email protected],
CHAPTER 1: OPERATING SYSTEM FUNDAMENTALS
CHAPTER 1: OPERATING SYSTEM FUNDAMENTALS What is an operating? A collection of software modules to assist programmers in enhancing efficiency, flexibility, and robustness An Extended Machine from the users
The EMSX Platform. A Modular, Scalable, Efficient, Adaptable Platform to Manage Multi-technology Networks. A White Paper.
The EMSX Platform A Modular, Scalable, Efficient, Adaptable Platform to Manage Multi-technology Networks A White Paper November 2002 Abstract: The EMSX Platform is a set of components that together provide
Mixed-Criticality Systems Based on Time- Triggered Ethernet with Multiple Ring Topologies. University of Siegen Mohammed Abuteir, Roman Obermaisser
Mixed-Criticality s Based on Time- Triggered Ethernet with Multiple Ring Topologies University of Siegen Mohammed Abuteir, Roman Obermaisser Mixed-Criticality s Need for mixed-criticality systems due to
Methods and Tools For Embedded Distributed System Scheduling and Schedulability Analysis
Methods and Tools For Embedded Distributed System Scheduling and Schedulability Analysis Steve Vestal Honeywell Labs [email protected] 18 October 2005 Outline Background Binding and Routing Scheduling
Embedded Systems Lecture 9: Reliability & Fault Tolerance. Björn Franke University of Edinburgh
Embedded Systems Lecture 9: Reliability & Fault Tolerance Björn Franke University of Edinburgh Overview Definitions System Reliability Fault Tolerance Sources and Detection of Errors Stage Error Sources
Real-Time Component Software. slide credits: H. Kopetz, P. Puschner
Real-Time Component Software slide credits: H. Kopetz, P. Puschner Overview OS services Task Structure Task Interaction Input/Output Error Detection 2 Operating System and Middleware Applica3on So5ware
FAULT TOLERANCE FOR MULTIPROCESSOR SYSTEMS VIA TIME REDUNDANT TASK SCHEDULING
FAULT TOLERANCE FOR MULTIPROCESSOR SYSTEMS VIA TIME REDUNDANT TASK SCHEDULING Hussain Al-Asaad and Alireza Sarvi Department of Electrical & Computer Engineering University of California Davis, CA, U.S.A.
The Temporal Firewall--A Standardized Interface in the Time-Triggered Architecture
1 The Temporal Firewall--A Standardized Interface in the Time-Triggered Architecture H. Kopetz TU Vienna, Austria July 2000 Outline 2 Introduction Temporal Accuracy of RT Information The Time-Triggered
Replication on Virtual Machines
Replication on Virtual Machines Siggi Cherem CS 717 November 23rd, 2004 Outline 1 Introduction The Java Virtual Machine 2 Napper, Alvisi, Vin - DSN 2003 Introduction JVM as state machine Addressing non-determinism
HRG Assessment: Stratus everrun Enterprise
HRG Assessment: Stratus everrun Enterprise Today IT executive decision makers and their technology recommenders are faced with escalating demands for more effective technology based solutions while at
Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware
Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware Created by Doug Cutting and Mike Carafella in 2005. Cutting named the program after
MultiPARTES. Virtualization on Heterogeneous Multicore Platforms. 2012/7/18 Slides by TU Wien, UPV, fentiss, UPM
MultiPARTES Virtualization on Heterogeneous Multicore Platforms 2012/7/18 Slides by TU Wien, UPV, fentiss, UPM Contents Analysis of scheduling approaches Virtualization of devices Dealing with heterogeneous
Trends in Embedded Software Engineering
Trends in Embedded Software Engineering Prof. Dr. Wolfgang Pree Department of Computer Science Universität Salzburg cs.uni-salzburg.at MoDECS.cc PREEtec.com Contents Why focus on embedded software? Better
EWeb: Highly Scalable Client Transparent Fault Tolerant System for Cloud based Web Applications
ECE6102 Dependable Distribute Systems, Fall2010 EWeb: Highly Scalable Client Transparent Fault Tolerant System for Cloud based Web Applications Deepal Jayasinghe, Hyojun Kim, Mohammad M. Hossain, Ali Payani
Modular Communication Infrastructure Design with Quality of Service
Modular Communication Infrastructure Design with Quality of Service Pawel Wojciechowski and Péter Urbán Distributed Systems Laboratory School of Computer and Communication Sciences Swiss Federal Institute
Safety Issues in Automotive Software
Safety Issues in Automotive Software Paolo Panaroni, Giovanni Sartori INTECS S.p.A. SAFEWARE 1 INTECS & Safety A very large number of safety software development, V&V activities and research project on
CHAPTER 2 MODELLING FOR DISTRIBUTED NETWORK SYSTEMS: THE CLIENT- SERVER MODEL
CHAPTER 2 MODELLING FOR DISTRIBUTED NETWORK SYSTEMS: THE CLIENT- SERVER MODEL This chapter is to introduce the client-server model and its role in the development of distributed network systems. The chapter
Verifying Semantic of System Composition for an Aspect-Oriented Approach
2012 International Conference on System Engineering and Modeling (ICSEM 2012) IPCSIT vol. 34 (2012) (2012) IACSIT Press, Singapore Verifying Semantic of System Composition for an Aspect-Oriented Approach
High Availability Storage
High Availability Storage High Availability Extensions Goldwyn Rodrigues High Availability Storage Engineer SUSE High Availability Extensions Highly available services for mission critical systems Integrated
Designing Real-Time and Embedded Systems with the COMET/UML method
By Hassan Gomaa, Department of Information and Software Engineering, George Mason University. Designing Real-Time and Embedded Systems with the COMET/UML method Most object-oriented analysis and design
How To Understand The Concept Of A Distributed System
Distributed Operating Systems Introduction Ewa Niewiadomska-Szynkiewicz and Adam Kozakiewicz [email protected], [email protected] Institute of Control and Computation Engineering Warsaw University of
Multiagent Reputation Management to Achieve Robust Software Using Redundancy
Multiagent Reputation Management to Achieve Robust Software Using Redundancy Rajesh Turlapati and Michael N. Huhns Center for Information Technology, University of South Carolina Columbia, SC 29208 {turlapat,huhns}@engr.sc.edu
Real Time Programming: Concepts
Real Time Programming: Concepts Radek Pelánek Plan at first we will study basic concepts related to real time programming then we will have a look at specific programming languages and study how they realize
Fault Tolerance & Reliability CDA 5140. Chapter 3 RAID & Sample Commercial FT Systems
Fault Tolerance & Reliability CDA 5140 Chapter 3 RAID & Sample Commercial FT Systems - basic concept in these, as with codes, is redundancy to allow system to continue operation even if some components
The MILS Component Integration Approach To Secure Information Sharing
The MILS Component Integration Approach To Secure Information Sharing Carolyn Boettcher, Raytheon, El Segundo CA Rance DeLong, LynuxWorks, San Jose CA John Rushby, SRI International, Menlo Park CA Wilmar
Diagnosis and Fault-Tolerant Control
Mogens Blanke Michel Kinnaert Jan Lunze Marcel Staroswiecki Diagnosis and Fault-Tolerant Control With contributions by Jochen Schroder With 228 Figures Springer 1. Introduction to diagnosis and fault-tolerant
Aperiodic Task Scheduling
Aperiodic Task Scheduling Jian-Jia Chen (slides are based on Peter Marwedel) TU Dortmund, Informatik 12 Germany Springer, 2010 2014 年 11 月 19 日 These slides use Microsoft clip arts. Microsoft copyright
10.04.2008. Thomas Fahrig Senior Developer Hypervisor Team. Hypervisor Architecture Terminology Goals Basics Details
Thomas Fahrig Senior Developer Hypervisor Team Hypervisor Architecture Terminology Goals Basics Details Scheduling Interval External Interrupt Handling Reserves, Weights and Caps Context Switch Waiting
Using UML Part Two Behavioral Modeling Diagrams
UML Tutorials Using UML Part Two Behavioral Modeling Diagrams by Sparx Systems All material Sparx Systems 2007 Sparx Systems 2007 Page 1 Trademarks Object Management Group, OMG, Unified Modeling Language,
Unified Batch & Stream Processing Platform
Unified Batch & Stream Processing Platform Himanshu Bari Director Product Management Most Big Data Use Cases Are About Improving/Re-write EXISTING solutions To KNOWN problems Current Solutions Were Built
FactoryTalk View Site Edition V5.0 (CPR9) Server Redundancy Guidelines
FactoryTalk View Site Edition V5.0 (CPR9) Server Redundancy Guidelines This page left intentionally blank. FTView SE 5.0 (CPR9) Server Redundancy Guidelines.doc 8/19/2008 Page 2 of 27 Table of Contents
EVALUATION. WA1844 WebSphere Process Server 7.0 Programming Using WebSphere Integration COPY. Developer
WA1844 WebSphere Process Server 7.0 Programming Using WebSphere Integration Developer Web Age Solutions Inc. USA: 1-877-517-6540 Canada: 1-866-206-4644 Web: http://www.webagesolutions.com Chapter 6 - Introduction
Architecture Design & Sequence Diagram. Week 7
Architecture Design & Sequence Diagram Week 7 Announcement Reminder Midterm I: 1:00 1:50 pm Wednesday 23 rd March Ch. 1, 2, 3 and 26.5 Hour 1, 6, 7 and 19 (pp.331 335) Multiple choice Agenda (Lecture)
BigData. An Overview of Several Approaches. David Mera 16/12/2013. Masaryk University Brno, Czech Republic
BigData An Overview of Several Approaches David Mera Masaryk University Brno, Czech Republic 16/12/2013 Table of Contents 1 Introduction 2 Terminology 3 Approaches focused on batch data processing MapReduce-Hadoop
The ConTract Model. Helmut Wächter, Andreas Reuter. November 9, 1999
The ConTract Model Helmut Wächter, Andreas Reuter November 9, 1999 Overview In Ahmed K. Elmagarmid: Database Transaction Models for Advanced Applications First in Andreas Reuter: ConTracts: A Means for
How To Secure Cloud Computing
Resilient Cloud Services By Hemayamini Kurra, Glynis Dsouza, Youssif Al Nasshif, Salim Hariri University of Arizona First Franco-American Workshop on Cybersecurity 18 th October, 2013 Presentation Outline
Software Defined Security Mechanisms for Critical Infrastructure Management
Software Defined Security Mechanisms for Critical Infrastructure Management SESSION: CRITICAL INFRASTRUCTURE PROTECTION Dr. Anastasios Zafeiropoulos, Senior R&D Architect, Contact: [email protected]
System Aware Cyber Security
System Aware Cyber Security Application of Dynamic System Models and State Estimation Technology to the Cyber Security of Physical Systems Barry M. Horowitz, Kate Pierce University of Virginia April, 2012
Principles and characteristics of distributed systems and environments
Principles and characteristics of distributed systems and environments Definition of a distributed system Distributed system is a collection of independent computers that appears to its users as a single
Providing Load Balancing and Fault Tolerance in the OSGi Service Platform
Providing Load Balancing and Fault Tolerance in the OSGi Service Platform ARNE KOSCHEL, THOLE SCHNEIDER, JOHANNES WESTHUIS, JÜRGEN WESTERKAMP Faculty IV, Department of Computer Science University of Applied
PART IV Performance oriented design, Performance testing, Performance tuning & Performance solutions. Outline. Performance oriented design
PART IV Performance oriented design, Performance testing, Performance tuning & Performance solutions Slide 1 Outline Principles for performance oriented design Performance testing Performance tuning General
STUDY AND SIMULATION OF A DISTRIBUTED REAL-TIME FAULT-TOLERANCE WEB MONITORING SYSTEM
STUDY AND SIMULATION OF A DISTRIBUTED REAL-TIME FAULT-TOLERANCE WEB MONITORING SYSTEM Albert M. K. Cheng, Shaohong Fang Department of Computer Science University of Houston Houston, TX, 77204, USA http://www.cs.uh.edu
Embedded Systems. 6. Real-Time Operating Systems
Embedded Systems 6. Real-Time Operating Systems Lothar Thiele 6-1 Contents of Course 1. Embedded Systems Introduction 2. Software Introduction 7. System Components 10. Models 3. Real-Time Models 4. Periodic/Aperiodic
A Data Centric Approach for Modular Assurance. Workshop on Real-time, Embedded and Enterprise-Scale Time-Critical Systems 23 March 2011
A Data Centric Approach for Modular Assurance The Real-Time Middleware Experts Workshop on Real-time, Embedded and Enterprise-Scale Time-Critical Systems 23 March 2011 Gabriela F. Ciocarlie Heidi Schubert
Quotes from Object-Oriented Software Construction
Quotes from Object-Oriented Software Construction Bertrand Meyer Prentice-Hall, 1988 Preface, p. xiv We study the object-oriented approach as a set of principles, methods and tools which can be instrumental
How To Test Automatically
Automated Model-Based Testing of Embedded Real-Time Systems Jan Peleska [email protected] University of Bremen Bieleschweig Workshop 7 2006-05-05 Outline Technologie-Zentrum Informatik Objectives Basic concepts
Expansion of a Framework for a Data- Intensive Wide-Area Application to the Java Language
Expansion of a Framework for a Data- Intensive Wide-Area Application to the Java Sergey Koren Table of Contents 1 INTRODUCTION... 3 2 PREVIOUS RESEARCH... 3 3 ALGORITHM...4 3.1 APPROACH... 4 3.2 C++ INFRASTRUCTURE
Real Time Network Server Monitoring using Smartphone with Dynamic Load Balancing
www.ijcsi.org 227 Real Time Network Server Monitoring using Smartphone with Dynamic Load Balancing Dhuha Basheer Abdullah 1, Zeena Abdulgafar Thanoon 2, 1 Computer Science Department, Mosul University,
Contents. System Development Models and Methods. Design Abstraction and Views. Synthesis. Control/Data-Flow Models. System Synthesis Models
System Development Models and Methods Dipl.-Inf. Mirko Caspar Version: 10.02.L.r-1.0-100929 Contents HW/SW Codesign Process Design Abstraction and Views Synthesis Control/Data-Flow Models System Synthesis
Verification of Triple Modular Redundancy (TMR) Insertion for Reliable and Trusted Systems
Verification of Triple Modular Redundancy (TMR) Insertion for Reliable and Trusted Systems Melanie Berg 1, Kenneth LaBel 2 1.AS&D in support of NASA/GSFC [email protected] 2. NASA/GSFC [email protected]
Chapter 3 Operating-System Structures
Contents 1. Introduction 2. Computer-System Structures 3. Operating-System Structures 4. Processes 5. Threads 6. CPU Scheduling 7. Process Synchronization 8. Deadlocks 9. Memory Management 10. Virtual
DeviceNet Communication Manual
DeviceNet Communication Manual Soft-Starter Series: SSW-07/SSW-08 Language: English Document: 10000046963 / 00 03/2008 Summary ABOUT THIS MANUAL... 5 ABBREVIATIONS AND DEFINITIONS... 5 NUMERICAL REPRESENTATION...
Cloud Resilient Architecture (CRA) -Design and Analysis. Hamid Alipour Salim Hariri Youssif-Al-Nashif
Cloud Resilient Architecture (CRA) -Design and Analysis Glynis Dsouza Hamid Alipour Salim Hariri Youssif-Al-Nashif NSF Center for Autonomic Computing University of Arizona Mohamed Eltoweissy Pacific National
MODEL DRIVEN DEVELOPMENT OF BUSINESS PROCESS MONITORING AND CONTROL SYSTEMS
MODEL DRIVEN DEVELOPMENT OF BUSINESS PROCESS MONITORING AND CONTROL SYSTEMS Tao Yu Department of Computer Science, University of California at Irvine, USA Email: [email protected] Jun-Jang Jeng IBM T.J. Watson
Stretching A Wolfpack Cluster Of Servers For Disaster Tolerance. Dick Wilkins Program Manager Hewlett-Packard Co. Redmond, WA dick_wilkins@hp.
Stretching A Wolfpack Cluster Of Servers For Disaster Tolerance Dick Wilkins Program Manager Hewlett-Packard Co. Redmond, WA [email protected] Motivation WWW access has made many businesses 24 by 7 operations.
From Control Loops to Software
CNRS-VERIMAG Grenoble, France October 2006 Executive Summary Embedded systems realization of control systems by computers Computers are the major medium for realizing controllers There is a gap between
Chapter 1 - Web Server Management and Cluster Topology
Objectives At the end of this chapter, participants will be able to understand: Web server management options provided by Network Deployment Clustered Application Servers Cluster creation and management
Elastic Application Platform for Market Data Real-Time Analytics. for E-Commerce
Elastic Application Platform for Market Data Real-Time Analytics Can you deliver real-time pricing, on high-speed market data, for real-time critical for E-Commerce decisions? Market Data Analytics applications
SCALABILITY AND AVAILABILITY
SCALABILITY AND AVAILABILITY Real Systems must be Scalable fast enough to handle the expected load and grow easily when the load grows Available available enough of the time Scalable Scale-up increase
Distributed Databases
Distributed Databases Chapter 1: Introduction Johann Gamper Syllabus Data Independence and Distributed Data Processing Definition of Distributed databases Promises of Distributed Databases Technical Problems
Safety Requirements Specification Guideline
Safety Requirements Specification Comments on this report are gratefully received by Johan Hedberg at SP Swedish National Testing and Research Institute mailto:[email protected] -1- Summary Safety Requirement
Availability Digest. Stratus Avance Brings Availability to the Edge February 2009
the Availability Digest Stratus Avance Brings Availability to the Edge February 2009 Business continuity has not yet been extended to the Edge. What is the Edge? It is everything outside of the corporate
FIPA agent based network distributed control system
FIPA agent based network distributed control system V.Gyurjyan, D. Abbott, G. Heyes, E. Jastrzembski, C. Timmer, E. Wolin TJNAF, Newport News, VA 23606, USA A control system with the capabilities to combine
Remote Copy Technology of ETERNUS6000 and ETERNUS3000 Disk Arrays
Remote Copy Technology of ETERNUS6000 and ETERNUS3000 Disk Arrays V Tsutomu Akasaka (Manuscript received July 5, 2005) This paper gives an overview of a storage-system remote copy function and the implementation
secure intelligence collection and assessment system Your business technologists. Powering progress
secure intelligence collection and assessment system Your business technologists. Powering progress The decisive advantage for intelligence services The rising mass of data items from multiple sources
Cloud Computing. Up until now
Cloud Computing Lecture 11 Virtualization 2011-2012 Up until now Introduction. Definition of Cloud Computing Grid Computing Content Distribution Networks Map Reduce Cycle-Sharing 1 Process Virtual Machines
Oracle WebLogic Server 11g Administration
Oracle WebLogic Server 11g Administration This course is designed to provide instruction and hands-on practice in installing and configuring Oracle WebLogic Server 11g. These tasks include starting and
Software Architecture Case Study. Air Traffic Control - Designing for High Availability
Software Architecture Case Study Air Traffic Control - Designing for High Availability Air Traffic Control (ATC) Air Traffic Control (ATC) Readings Chapter 6 The problem is to control a very large number
Objectives. Distributed Databases and Client/Server Architecture. Distributed Database. Data Fragmentation
Objectives Distributed Databases and Client/Server Architecture IT354 @ Peter Lo 2005 1 Understand the advantages and disadvantages of distributed databases Know the design issues involved in distributed
EonStor DS remote replication feature guide
EonStor DS remote replication feature guide White paper Version: 1.0 Updated: Abstract: Remote replication on select EonStor DS storage systems offers strong defense against major disruption to IT continuity,
AndroLIFT: A Tool for Android Application Life Cycles
AndroLIFT: A Tool for Android Application Life Cycles Dominik Franke, Tobias Royé, and Stefan Kowalewski Embedded Software Laboratory Ahornstraße 55, 52074 Aachen, Germany { franke, roye, kowalewski}@embedded.rwth-aachen.de
Augmented Search for Web Applications. New frontier in big log data analysis and application intelligence
Augmented Search for Web Applications New frontier in big log data analysis and application intelligence Business white paper May 2015 Web applications are the most common business applications today.
Disaster Recovery, Business Continuity, and Hosting Solutions for IFS Applications IFS CUSTOMER SUMMIT 2011, CHICAGO
Disaster Recovery, Business Continuity, and Hosting Solutions for IFS Applications IFS CUSTOMER SUMMIT 2011, CHICAGO Richard Francart DIRECTOR INFORMATION TECHNOLOGY SYSTEMS & SERVICES [email protected]
Software Testing Strategies and Techniques
Software Testing Strategies and Techniques Sheetal Thakare 1, Savita Chavan 2, Prof. P. M. Chawan 3 1,2 MTech, Computer Engineering VJTI, Mumbai 3 Associate Professor, Computer Technology Department, VJTI,
Budapest University of Technology and Economics Department of Measurement and Information Systems. Business Process Modeling
Budapest University of Technology and Economics Department of Measurement and Information Systems Business Process Modeling Process, business process Workflow: sequence of given steps executed in order
Managing a Fibre Channel Storage Area Network
Managing a Fibre Channel Storage Area Network Storage Network Management Working Group for Fibre Channel (SNMWG-FC) November 20, 1998 Editor: Steven Wilson Abstract This white paper describes the typical
Applying 4+1 View Architecture with UML 2. White Paper
Applying 4+1 View Architecture with UML 2 White Paper Copyright 2007 FCGSS, all rights reserved. www.fcgss.com Introduction Unified Modeling Language (UML) has been available since 1997, and UML 2 was
A Framework for Highly Available Services Based on Group Communication
A Framework for Highly Available Services Based on Group Communication Alan Fekete [email protected] http://www.cs.usyd.edu.au/ fekete Department of Computer Science F09 University of Sydney 2006,
Value Paper Author: Edgar C. Ramirez. Diverse redundancy used in SIS technology to achieve higher safety integrity
Value Paper Author: Edgar C. Ramirez Diverse redundancy used in SIS technology to achieve higher safety integrity Diverse redundancy used in SIS technology to achieve higher safety integrity Abstract SIS
Review from last time. CS 537 Lecture 3 OS Structure. OS structure. What you should learn from this lecture
Review from last time CS 537 Lecture 3 OS Structure What HW structures are used by the OS? What is a system call? Michael Swift Remzi Arpaci-Dussea, Michael Swift 1 Remzi Arpaci-Dussea, Michael Swift 2
Architectures for Distributed Real-time Systems
SDP Workshop Nashville TN 13 Dec 2001 Architectures for Distributed Real-time Systems Michael W. Masters NSWCDD Building Systems for the Real World What is the Problem? Capability sustainment Affordable
Introduction to Embedded Systems. Software Update Problem
Introduction to Embedded Systems CS/ECE 6780/5780 Al Davis logistics minor Today s topics: more software development issues 1 CS 5780 Software Update Problem Lab machines work let us know if they don t
Cisco Active Network Abstraction Gateway High Availability Solution
. Cisco Active Network Abstraction Gateway High Availability Solution White Paper This white paper describes the Cisco Active Network Abstraction (ANA) Gateway High Availability solution developed and
Managing and Maintaining a Windows Server 2003 Network Environment
Managing and maintaining a Windows Server 2003 Network Environment. AIM This course provides students with knowledge and skills needed to Manage and Maintain a Windows Server 2003 Network Environment.
Suppor&ng the Design of Safety Cri&cal Systems Using AADL
Suppor&ng the Design of Safety Cri&cal Systems Using AADL T. Correa, L. B. Becker, J.- M. Farines, J.- P. Bodeveix, M. Filali, F. Vernadat IRIT LAAS UFSC Agenda Introduc&on Proposed Approach Verifica&on
