BUILDING HIGH-AVAILABILITY SERVICES IN JAVA

MATTHIAS BRÄGER, CERN GS-ASE
Matthias.Braeger@cern.ch

AGENDA
- Measuring service availability
- Java Messaging
- Shared memory solutions
- Deployment Examples
- Summary

WHAT IS HIGH AVAILABILITY?

AVAILABILITY
- Failures happen!
- How do you build reliable systems regardless?
- How do you provide continuous, uninterrupted service?

THE USS YORKTOWN BUG

HUGE NEEDS FOR HA SYSTEMS

MEASURING SERVICE AVAILABILITY

CALCULATING AVAILABILITY
- Availability is usually expressed as a percentage of uptime in a given year
- Uptime and availability are not synonymous! Example: a system can be up but not available, as in the case of a network outage.
- The impact of unavailability varies with its time of occurrence

SCHEDULED AND UNSCHEDULED DOWNTIME (1/2)
Scheduled downtime: the result of some logical, management-initiated event
Examples:
- Patches to the system software that require a reboot
- System configuration changes that require a reboot

SCHEDULED AND UNSCHEDULED DOWNTIME (2/2)
Unscheduled downtime: usually arises from some physical event
Examples:
- Hardware failure (power outages, failed CPU or RAM components, etc.)
- Software failure (application, middleware and operating system failures)
- Environmental anomaly (over-temperature related shutdown, logically or physically severed network connections, catastrophic security breaches)

CLASS OF NINES

| Availability % | Downtime per year | Downtime per month | Downtime per week |
|---|---|---|---|
| 90% ("one nine") | 36.5 days | 72 hours | 16.8 hours |
| 99% ("two nines") | 3.65 days | 7.20 hours | 1.68 hours |
| 99.5% | 1.83 days | 3.60 hours | 50.4 minutes |
| 99.9% ("three nines") | 8.76 hours | 43.8 minutes | 10.1 minutes |
| 99.95% | 4.38 hours | 21.56 minutes | 5.04 minutes |
| 99.99% ("four nines") | 52.56 minutes | 4.32 minutes | 1.01 minutes |
| 99.999% ("five nines") | 5.26 minutes | 25.9 seconds | 6.05 seconds |
| 99.9999% ("six nines") | 31.5 seconds | 2.59 seconds | 0.605 seconds |
| 99.99999% ("seven nines") | 3.15 seconds | 0.259 seconds | 0.0605 seconds |
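The yearly figures in the table follow from multiplying the length of the period by the unavailable fraction. A minimal Java sketch of that arithmetic is given here; the class and method names are illustrative, and a 365-day year is assumed.

```java
import java.time.Duration;

/** Converts an availability percentage into the downtime it permits. */
final class DowntimeBudget {

    /**
     * Returns the downtime allowed within {@code period} at the given
     * availability, e.g. 99.9 for "three nines".
     */
    static Duration allowedDowntime(double availabilityPercent, Duration period) {
        if (availabilityPercent < 0.0 || availabilityPercent > 100.0) {
            throw new IllegalArgumentException("availability must be within 0..100");
        }
        double unavailableFraction = 1.0 - availabilityPercent / 100.0;
        return Duration.ofMillis(Math.round(period.toMillis() * unavailableFraction));
    }

    public static void main(String[] args) {
        Duration year = Duration.ofDays(365); // assumption: non-leap year
        Duration day = Duration.ofDays(1);
        for (double a : new double[] {99.0, 99.9, 99.99, 99.999}) {
            System.out.printf("%.3f%% -> %d s per year, %d ms per day%n",
                    a,
                    allowedDowntime(a, year).getSeconds(),
                    allowedDowntime(a, day).toMillis());
        }
    }
}
```

For "five nines" this gives roughly 315 seconds per year, which matches the 5.26 minutes listed in the table.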

AVAILABILITY ENVIRONMENT CLASSIFICATION (AEC)

| HRG* Class | Indication | Availability | Description |
|---|---|---|---|
| AEC-0 | Conventional | | Service can be interrupted, data integrity is not essential |
| AEC-1 | Highly Reliable | 99% | Service can be interrupted, data integrity must be assured |
| AEC-2 | High Availability | 99.9% | Service may only be interrupted within scheduled time windows, or minimally during main runtime |
| AEC-3 | Fault Resilient | 99.99% | Service must be assured without any downtime within well-defined time windows or during main runtime |
| AEC-4 | Fault Tolerant | 99.999% | Service must be guaranteed without interruption, 24/7 service must be assured |
| AEC-5 | Disaster Tolerant | 99.9999% | Service must be available under all circumstances |

* Introduced by the Harvard Research Group (HRG)

REASONS FOR UNAVAILABILITY OF ENTERPRISE IT SYSTEMS
Lack of best practice:
1. change control
2. monitoring of the relevant components
3. requirements and procurement
4. operations
5. avoidance of network failures
6. avoidance of internal application failures
7. avoidance of external services that fail
8. physical environment
9. network redundancy
10. technical solution of backup
11. process solution of backup
12. physical location
13. infrastructure redundancy
14. storage architecture redundancy
(From a survey among academic availability experts in 2010)

REACHING HIGH-AVAILABILITY
- High availability implies no human intervention to restore operation in complex systems.
- Example: an availability limit of 99.999% allows about one second of downtime per day (roughly 5.26 minutes per year). Any human intervention for maintenance actions in a large system will exceed this limit.

REACHING HIGH-AVAILABILITY
- Avoid single-point-of-failure risks
- Redundancy of system-critical components
  - Passive redundancy, e.g. a boat with two separate engines
  - Active redundancy, e.g. Internet routing
- Fault tolerance and robustness of the overall system
- Exhaustive testing before going into operation!
- Quickly reachable experts
- Good error messages and a quick communication system
- Enough hardware spare parts
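The benefit of redundancy can be estimated with standard reliability arithmetic, which the slides do not spell out. The sketch below assumes that components fail independently of each other, which real systems only approximate. With that assumption, n redundant components are unavailable only when all of them are down, while a chain of components is available only when every link is up. Class and method names are illustrative.

```java
/** Availability of combined components, assuming independent failures. */
final class RedundancyMath {

    /** Availability of n identical redundant components (one survivor is enough). */
    static double parallel(double single, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("need at least one component");
        }
        return 1.0 - Math.pow(1.0 - single, n);
    }

    /** Availability of a chain in which every component must be up. */
    static double serial(double... components) {
        double result = 1.0;
        for (double availability : components) {
            result *= availability;
        }
        return result;
    }

    public static void main(String[] args) {
        // Two redundant servers, each 99% available on its own.
        System.out.println("redundant pair: " + parallel(0.99, 2));
        // Two 99% components that both have to work (a single point of failure each).
        System.out.println("serial chain:   " + serial(0.99, 0.99));
    }
}
```

Under these assumptions a redundant pair of 99% components reaches about 99.99%, whereas chaining the same two components without redundancy drops to about 98%.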

SERVICE LEVEL AGREEMENTS (SLA)
- SLAs are used to define the availability of a given service.
- Many systems have to be available 24/7, but some need high availability only within certain time windows.
- Example: the trading system of a stock market does not need to be available on weekends or bank holidays.

JAVA MESSAGING

WHAT IS MESSAGING?
- Method of communication between software components or applications
- Messaging enables distributed communication that is loosely coupled
- Anonymous communication
- Sender and receiver do not have to be available at the same time

WHAT IS THE JMS API?
The Java Message Service is a Java API that allows applications to create, send, receive, and read messages
- Loosely coupled
- Asynchronous
- Reliable

WHEN CAN YOU USE JMS?
- The provider wants the components not to depend on information about other components' interfaces.
- The provider wants the application to run whether or not all components are up and running simultaneously.
- The application business model allows a component to send information to another and to continue to operate without receiving an immediate response.

JMS TECHNICAL TERMS
- Brokers: A JMS broker provides clients with connectivity and with message storage/delivery functions.
- Messages: A message is an object that contains the required header fields, optional properties, and the data payload being transferred between JMS clients.
- Destinations: Destinations are maintained by the message broker. They can be either queues or topics.

MESSAGING MODELS (1/2)
Point-to-Point Messaging
- Each message has only one consumer
- A sender and a receiver of a message have no timing dependencies
- The receiver acknowledges the successful processing of a message
- JMS allows messages to expire
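The first two properties of the point-to-point model can be illustrated without a broker. The following sketch is an in-process analogy built on java.util.concurrent, not the JMS API: the class PointToPointSketch is hypothetical, and acknowledgement and message expiry are left out.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** In-process stand-in for a queue destination (not the JMS API). */
final class PointToPointSketch {

    private final Queue<String> pending = new ConcurrentLinkedQueue<>();

    /** The sender returns immediately; no receiver has to be running yet. */
    void send(String message) {
        pending.add(message);
    }

    /** Removes and returns the oldest message, or null if none is waiting. */
    String receive() {
        return pending.poll();
    }

    public static void main(String[] args) {
        PointToPointSketch queue = new PointToPointSketch();
        queue.send("reading-1");
        queue.send("reading-2");
        // Two competing receivers: each message is handed to exactly one of them.
        System.out.println("receiver A got " + queue.receive());
        System.out.println("receiver B got " + queue.receive());
        System.out.println("queue empty: " + (queue.receive() == null));
    }
}
```

With a real JMS provider the queue lives in the broker, so the messages also survive while no receiver process is running.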

MESSAGING MODELS (2/2)
Publish/Subscribe Messaging
- Supports publishing messages to a particular message topic
- Publisher and subscriber do not know about each other
- Each message can have multiple consumers
- A client that subscribes to a topic can consume only messages published after the client has created a subscription
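The publish/subscribe properties listed above can be shown with a similar in-process analogy. Again this is not the JMS API: TopicSketch is a hypothetical class that delivers synchronously and keeps no history, so a late subscriber misses earlier messages.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** In-process stand-in for a topic destination (not the JMS API). */
final class TopicSketch {

    private final List<Consumer<String>> subscribers = new CopyOnWriteArrayList<>();

    /** Registers a subscriber; it only sees messages published from now on. */
    void subscribe(Consumer<String> subscriber) {
        subscribers.add(subscriber);
    }

    /** Delivers the message to every current subscriber. */
    void publish(String message) {
        for (Consumer<String> subscriber : subscribers) {
            subscriber.accept(message);
        }
    }

    public static void main(String[] args) {
        TopicSketch topic = new TopicSketch();
        List<String> early = new ArrayList<>();
        List<String> late = new ArrayList<>();

        topic.subscribe(early::add);
        topic.publish("alarm-1");
        topic.subscribe(late::add);   // subscribes after alarm-1 was published
        topic.publish("alarm-2");

        System.out.println("early subscriber: " + early);
        System.out.println("late subscriber:  " + late);
    }
}
```

The publisher only knows the topic, never the individual subscribers, which is the anonymity the slide refers to.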

METHODS FOR DECREASING COUPLING
- Communication objects need to be serialized before sending and deserialized after receiving
- How do I avoid unnecessary client restarts when the communication object changes?
Problem:
- Older versions of an application would throw exceptions when asked to deserialize new versions of the old object type.
- Newer versions of an application would throw exceptions when deserializing older versions of a type with missing data.
Solution:
- Java serialized objects: always define the serialVersionUID
- Or use XML or JSON for messaging: version tolerant and easier to handle
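A minimal sketch of the first solution, using only standard Java serialization. The Measurement class and its fields are invented for illustration. Because the serialVersionUID is declared explicitly, it does not change when a field is added later, so compatible changes of that kind do not make older and newer clients reject each other's objects with an InvalidClassException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

/** Hypothetical communication object with a pinned serial version. */
final class Measurement implements Serializable {

    // Explicit version: stays the same when compatible fields are added later.
    private static final long serialVersionUID = 1L;

    private final String sensor;
    private final double value;

    Measurement(String sensor, double value) {
        this.sensor = sensor;
        this.value = value;
    }

    String sensor() { return sensor; }

    double value() { return value; }

    /** Serializes the object as a sender would before handing it to the broker. */
    static byte[] toBytes(Measurement measurement) {
        try (ByteArrayOutputStream buffer = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(measurement);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Deserializes the object as a receiver would after getting the message. */
    static Measurement fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Measurement) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("class missing on receiver side", e);
        }
    }

    public static void main(String[] args) {
        Measurement copy = fromBytes(toBytes(new Measurement("temperature", 21.5)));
        System.out.println(copy.sensor() + " = " + copy.value());
    }
}
```

Incompatible changes, such as changing the type of an existing field, still break deserialization even with a fixed serialVersionUID, which is one reason the slide also suggests XML or JSON.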

FREE JMS DISTRIBUTIONS
- Apache ActiveMQ
  - Open source, well documented
  - Provides APIs for different languages (Java, C++, Python, ...)
- Apache Apollo: ActiveMQ's next generation of messaging
- OpenJMS
- OpenMQ by Oracle
- StormMQ, cloud solution

SHARED MEMORY (DISTRIBUTED CACHING)

DEFINITION
In computing, shared memory is memory that may be simultaneously accessed by multiple programs with an intent to provide communication among them or avoid redundant copies.
- Shared memory is an efficient way of passing data
- Shared memory can be used to realize load-balanced, redundant systems

IN-MEMORY DATABASES (IMDB)
- An IMDB is a database management system that primarily relies on main memory
- Faster than disk-optimized databases
Use cases:
- Applications where response time is critical
- Independence from the reference database

WHY SHARED MEMORY?
- Share data/state among many servers (e.g. web session sharing)
- Cache data (distributed cache)
- Cluster applications
- Provide secure communication among servers
- Distribute workload onto many servers
- Take advantage of parallel processing
- Provide fail-safe data management

SHARED MEMORY PRODUCTS FOR JAVA
- Hazelcast
  - Peer-to-peer solution
  - Based on java.util.{Queue, Set, List, Map}
  - Community edition available
- Terracotta
  - Scalable array of in-memory cache servers
  - Based on the Java caching standard Ehcache
  - Allows caching beyond JVM memory limits
  - Free version with limited functionality available
- Memcached
  - Free and open source, designed for dynamic web applications
  - Simple solution for read-only use cases, but not designed for parallel read-write access
  - Each Memcached server is atomic and not aware of other servers → no automatic failover
- Other proprietary solutions
  - Oracle Coherence, JCache compliant
  - SAP HANA
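Products that expose the standard java.util collection interfaces let application code stay independent of the cache behind them. The sketch below shows a read-through cache written against ConcurrentMap. It uses a local ConcurrentHashMap as a stand-in, so nothing is actually shared between servers here; a distributed map implementation offering the same interface would be passed to the constructor instead. CachedLookup and its loader are hypothetical names.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Read-through cache coded against the ConcurrentMap interface only. */
final class CachedLookup<K, V> {

    private final ConcurrentMap<K, V> cache;
    private final Function<K, V> loader;

    CachedLookup(ConcurrentMap<K, V> cache, Function<K, V> loader) {
        this.cache = cache;
        this.loader = loader;
    }

    /** Returns the cached value, loading it from the slow source on first access. */
    V get(K key) {
        return cache.computeIfAbsent(key, loader);
    }

    public static void main(String[] args) {
        AtomicInteger slowLoads = new AtomicInteger();
        CachedLookup<String, String> lookup = new CachedLookup<>(
                new ConcurrentHashMap<>(),          // local stand-in for a shared map
                key -> {
                    slowLoads.incrementAndGet();    // e.g. a query to the reference database
                    return "config-for-" + key;
                });

        lookup.get("sensor-42");
        lookup.get("sensor-42");
        System.out.println("slow loads: " + slowLoads.get());
    }
}
```

The second lookup is served from the map, so the slow source is only queried once per key.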

MEMCACHED EXAMPLE

TERRACOTTA ARCHITECTURE

EXAMPLES

SCENARIO 1: SIMPLE MONITORING
- Moderate data size
- High throughput
- Short maintenance stops
- Availability not critical
- Low budget
[Diagram labels, top to bottom: Client (x3); JMS broker; SERVER; JMS broker (same or different brokers); DAQ process (x2)]

SCENARIO 2: HIGH AVAILABILITY MONITORING
- Moderate data size
- Average throughput
- Minimal service interruptions
- High availability
- Low budget
[Diagram labels, top to bottom: Client (x3); JMS broker; Terracotta; standby JMS broker; SERVER 1; Terracotta; SERVER 2; clustered JMS brokers; DAQ process (x4)]
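Scenario 2 lists a JMS broker next to a standby JMS broker. One way a client could use such a pair is to try the brokers in order and fall back to the next one when a send fails. The sketch below only illustrates that idea: each broker is modelled as a plain Consumer that throws when it is unreachable, and FailoverSender is a hypothetical helper, not part of any JMS provider.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

/** Tries a list of brokers in order until one accepts the message. */
final class FailoverSender {

    /**
     * Sends the message to the first broker that does not throw.
     *
     * @return index of the broker that accepted the message
     * @throws IllegalStateException if every broker failed
     */
    static int sendWithFailover(List<Consumer<String>> brokers, String message) {
        RuntimeException lastFailure = null;
        for (int i = 0; i < brokers.size(); i++) {
            try {
                brokers.get(i).accept(message);
                return i;
            } catch (RuntimeException e) {
                lastFailure = e; // remember the cause and try the next broker
            }
        }
        throw new IllegalStateException("all brokers failed", lastFailure);
    }

    public static void main(String[] args) {
        // Assumption for the demo: the primary is down, the standby is healthy.
        Consumer<String> primary = m -> { throw new IllegalStateException("primary down"); };
        Consumer<String> standby = m -> System.out.println("standby accepted: " + m);

        int used = sendWithFailover(Arrays.asList(primary, standby), "heartbeat");
        System.out.println("message delivered via broker index " + used);
    }
}
```

No operator has to intervene when the primary fails, which is the property the earlier slides name as the core of high availability.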

SCENARIO 3: BIG DATA MONITORING
- Large data set
- High throughput
- Minimal service interruptions
- High availability
[Diagram labels, top to bottom: Client (x4); JMS broker 1 to JMS broker n; Terracotta server array; SERVER 1 to SERVER m; JMS broker 1 to JMS broker k; DAQ process (x4)]

SCENARIO 4: DISTRIBUTED STATELESS SYSTEM
- Stateless (mirrored) daemons
- Anonymous, asynchronous communication
[Diagram labels, top to bottom: Client 1 to Client k; JMS broker 1 to JMS broker n; Daemon 1 to Daemon m]

SUMMARY: WHAT DID WE LEARN?

SUMMARY
Service availability
- Needs to be well defined within a Service Level Agreement
- Measuring it is non-trivial and has to take the SLA into account
- High availability implies no human intervention to restore operation in complex systems
JMS
- Provides anonymous, reliable messaging
- Suitable middleware for high-availability services
Shared memory
- Simultaneously accessed by multiple programs (cluster)
- Can be used to realize in-memory databases
- Allows realization of parallel processing

QUESTIONS?
THANK YOU FOR YOUR ATTENTION!
Matthias.Braeger@cern.ch