Eugene Ciurana - geecon@ciurana.eu - pr3d4t0r, ##java, irc.freenode.net
High-Availability, Fault Tolerance, and Resource Oriented Computing
This presentation is available from: http://ciurana.eu/geecon-2010
About Eugene...
15+ years building mission-critical, high-availability systems
14+ years of Java work
Open source evangelist: drove the official adoption of open source/Linux at Walmart worldwide
State-of-the-art, main-line-of-business systems at some of the largest companies in the world - not a web guy!
What You'll Learn...
Decoupled, event-driven, resource-oriented systems are more flexible
Avoid tight, point-to-point integration
Enhance JVM-based apps with better domain-specific languages
How to move away from monolithic app servers and architectures
How to implement event-driven systems by leveraging existing infrastructure and SOA investment
Treat computational resources as addressable entities
Balance open source vs. commercial products
Very Important! Please Ask Questions! (don't be shy)
What is Scalability?
Scalability is the property of a system to handle bigger amounts of work, or to be easily expanded in response to increased demand for network, processing, database, or file resources.
Types of scalability:
Horizontal (out): add more nodes with identical functionality to existing ones and redistribute the load
Vertical (up): expand by adding more cores, main memory, storage, or network interfaces
Horizontal Scalability
Diagram: a load balancer fronts a growing pool of identical nodes; adding nodes scales out - clustering!
Vertical Scalability
Diagram: a single host scales up - a dual-core, single-processor machine with 16 GB RAM hosting virtual machines 0-2 grows into a dual-core, dual-processor machine with 32 GB RAM hosting virtual machines 0-3.
What is Availability?
How well a system provides useful resources over a set period of time.
High availability guarantees an absolute degree of functional continuity within a time window.
Expressed as a relationship between uptime and unplanned downtime:
A = 100 - (100 * D / U), with D (downtime) and U (uptime) expressed in minutes
Beware: uptime != available
The Nines Game

Availability %   Downtime (minutes/year)   Downtime/year   Vendor jargon
90               52,560.00                 36.5 days       one nine
99               5,256.00                  3.7 days        two nines
99.9             525.60                    8.8 hours       three nines
99.99            52.56                     53 minutes      four nines
99.999           5.26                      5.3 minutes     five nines
99.9999          0.53                      32 seconds      six nines
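The downtime column follows directly from the availability formula on the previous slide; a minimal Java sketch that reproduces it (the class and method names are mine, not from the talk):

    // Computes allowed downtime per year for a given availability percentage.
    public class Nines {
        static final double MINUTES_PER_YEAR = 365 * 24 * 60; // 525,600

        static double downtimeMinutesPerYear(double availabilityPercent) {
            // The unavailable fraction of the year, in minutes.
            return MINUTES_PER_YEAR * (100.0 - availabilityPercent) / 100.0;
        }

        public static void main(String[] args) {
            for (double a : new double[] { 90, 99, 99.9, 99.99, 99.999, 99.9999 })
                System.out.printf("%-8s -> %10.2f minutes/year%n",
                                  a, downtimeMinutesPerYear(a));
        }
    }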
Service Level Agreements
SLAs are negotiated terms that outline the obligations of the two parties delivering and using a system:
System type - not all systems require the same SLA
Levels of availability: minimum, target
Uptime: network, power, maintenance windows
Serviceability
Performance and metrics
Billing
SLAs help determine whether you scale up or out
Load Balancers
They work by spreading requests among two or more resources.
Implemented in hardware or in software: multiple machines, multiple processes, multiple threads.
Resources appear as a single device to consumers.
Can be stateless (web services) or stateful (applications that require session management).
Algorithms determine the distribution:
1/n - all systems equally likely to service a request
Special requests (e.g. a music store) - some servers get hit more than others; see the sketch below.
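A minimal sketch of the 1/n (round-robin) case; the class and the backend addresses are illustrative, not from the presentation:

    import java.util.List;
    import java.util.concurrent.atomic.AtomicLong;

    // Round-robin balancer: every backend is equally likely (1/n).
    public class RoundRobinBalancer {
        private final List<String> backends;
        private final AtomicLong counter = new AtomicLong();

        public RoundRobinBalancer(List<String> backends) {
            this.backends = backends;
        }

        public String next() {
            // A monotonic counter modulo pool size spreads requests evenly.
            int index = (int) (counter.getAndIncrement() % backends.size());
            return backends.get(index);
        }
    }

A weighted variant (picking some backends proportionally more often) covers the "special requests" case.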
Load Balancers
Diagram: a consumer sends requests R1, R2, R3, ... (Rn: request, sequence number n) to the load balancer at 74.0.125.28, which distributes them across 192.168.202.55, .66, .67, and .69.
Persistent Load Balancers
Diagram: several consumers connect through a sticky load balancer at 74.0.125.28; each consumer's requests are pinned to the same backend (192.168.202.55, .66, .67, or .69).
Load Balancing and Databases
Diagram: the load balancer at 74.0.125.28 fronts backends 192.168.202.55-.69, which share session data through a common database.
Caching Strategies
Stateful load balancing requires data sharing.
Caching distributes popular, shared read-only data - think of a cache as a giant hash map.
If the data isn't in the cache, fetch it from the database.
Write policies (see the sketch below):
write-through: write to the cache AND the database
write-behind: the cache entry is marked dirty, and the database is updated only when a dirty datum is requested
no-write allocation: only read requests are cached; assumes data never changes
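The first two write policies differ only in when the database sees the write; a minimal sketch, assuming hypothetical cache and database stand-ins:

    import java.util.Map;
    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical store illustrating the two write policies above.
    public class WritePolicyCache {
        private final Map<String, String> cache = new ConcurrentHashMap<>();
        private final Set<String> dirty = ConcurrentHashMap.newKeySet();

        // write-through: the cache AND the database see every write.
        void writeThrough(String key, String value) {
            cache.put(key, value);
            writeToDatabase(key, value);      // synchronous database write
        }

        // write-behind: only the cache is updated; the entry is marked
        // dirty and flushed to the database when the datum is requested.
        void writeBehind(String key, String value) {
            cache.put(key, value);
            dirty.add(key);
        }

        String read(String key) {
            if (dirty.remove(key))            // flush a dirty datum on demand
                writeToDatabase(key, cache.get(key));
            return cache.get(key);
        }

        void writeToDatabase(String key, String value) { /* JDBC call, etc. */ }
    }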
Caching Usage Pattern
Application caching:
little or no programmer participation (e.g. Terracotta)
explicit API calls (memcached, Coherence, etc.)
Web caching - stores full documents or fragments ("particles") on the server or client; invisible to the client
Web accelerators - distribute the load (e.g. a CDN such as S3 or Akamai)
Proxy caches - serve repeated requests for the same resources and may provide filtering/querying (e.g. Squid, Apache, ISA servers)
Caching Usage Pattern
Flow chart:
Query path: fetch the datum from the cache; if the datum is None, query it from the database and add it to the cache; then use the datum in the app.
Update path: update the datum in the database, then either invalidate the cache entry or add/update the datum in the cache.
A sketch of this flow follows.
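The same flow in code - a cache-aside sketch with the database calls stubbed out (names are illustrative):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Cache-aside flow from the slide; database access is stubbed out.
    public class CacheAside {
        private final Map<String, String> cache = new ConcurrentHashMap<>();

        // Query path: cache first, fall back to the database, populate the cache.
        String query(String key) {
            String datum = cache.get(key);
            if (datum == null) {                    // "datum is None"
                datum = queryDatabase(key);
                cache.put(key, datum);              // add datum to cache
            }
            return datum;                           // use datum in app
        }

        // Update path: write the database, then refresh (or invalidate) the cache.
        void update(String key, String value) {
            updateDatabase(key, value);
            cache.put(key, value);                  // or cache.remove(key) to invalidate
        }

        String queryDatabase(String key) { return "..."; }   // stub
        void updateDatabase(String key, String value) { }    // stub
    }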
Distributed Caching
Diagram: backends 192.168.202.55-.69 behind the load balancer at 74.0.125.28 reach caches 0-3 (in a load-balanced configuration or via datagram), which front the database.
Clustering
Cluster - two or more systems that appear to users as a single system.
A clustered (horizontally scalable) system is more cost-effective than a monolithic, single (vertically scalable) system with the same performance characteristics.
Systems in the cluster are connected over high-speed LANs like Gb Ethernet, FDDI, InfiniBand, Myrinet, etc.
A/A Clustering
A/A == Active/Active
Distribute the load evenly among multiple nodes.
All nodes offer the same capabilities.
All nodes are active at the same time.
Diagram: consumer -> load balancer (74.0.125.28) -> active nodes 192.168.202.55, .66, .67, .69.
High-Availability A/P Cluster
A/P == Active/Passive
Provides uninterrupted service through redundant nodes.
Eliminates single points of failure.
Two nodes minimum, with heartbeat detection (sketched below).
Automatic traffic switch for fail-over.
Diagram: consumer -> router (74.0.125.28) -> active node 192.168.202.55, with a heartbeat to the fail-over node 192.168.202.69; state data lives in a cache, and the database replicates (or is clustered) to a fail-over database.
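Heartbeat detection can be as simple as a periodic probe with a timeout; a hedged sketch - the port, interval, and threshold are illustrative, not from the talk:

    import java.net.InetSocketAddress;
    import java.net.Socket;

    // The passive node probes the active node and triggers fail-over
    // after MAX_MISSED consecutive missed beats. All values illustrative.
    public class HeartbeatMonitor {
        static final int MAX_MISSED = 3;
        static final int TIMEOUT_MS = 2000;

        public static void main(String[] args) throws InterruptedException {
            int missed = 0;
            while (true) {
                try (Socket probe = new Socket()) {
                    probe.connect(new InetSocketAddress("192.168.202.55", 7), TIMEOUT_MS);
                    missed = 0;                  // active node answered
                } catch (Exception e) {
                    if (++missed >= MAX_MISSED) {
                        System.out.println("Active node down - promoting passive node");
                        break;                   // hand off to the fail-over logic
                    }
                }
                Thread.sleep(TIMEOUT_MS);        // wait for the next beat
            }
        }
    }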
Grid
Diagram: consumer -> master -> load balancers -> worker nodes.
Processes loads as independent jobs; jobs don't require data sharing.
Storage and network may be shared by all nodes.
Intermediate results have no bearing on other jobs' progress.
Each node is independent.
Map/Reduce (Hadoop) - see the sketch below.
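Because the jobs are independent, a grid maps naturally onto scatter/gather; a minimal Java sketch with word counting standing in for a real job:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.*;

    // Independent jobs submitted to a pool; intermediate results never
    // interact until the final reduce, mirroring the grid model above.
    public class GridSketch {
        public static void main(String[] args) throws Exception {
            String[] chunks = { "alpha beta", "gamma delta epsilon", "zeta" };
            ExecutorService grid = Executors.newFixedThreadPool(4);

            List<Callable<Integer>> jobs = new ArrayList<Callable<Integer>>();
            for (final String chunk : chunks)
                jobs.add(new Callable<Integer>() {           // map: one job per chunk
                    public Integer call() { return chunk.split("\\s+").length; }
                });

            int total = 0;
            for (Future<Integer> partial : grid.invokeAll(jobs)) // jobs run independently
                total += partial.get();                          // reduce: combine results
            System.out.println("Total words: " + total);
            grid.shutdown();
        }
    }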
Computational Cluster
Used for operations that require raw computational power.
Not good for transactional operations (web, database).
Tightly coupled, homogeneous nodes in close proximity.
Meant to replace supercomputers.
Diagram: consumer -> master node -> compute nodes.
Redundancy and Fault Tolerance
Redundancy - the expectation that any system component failure is independent of failures in other components.
Fault tolerance - the system continues to operate in the event of component failure, possibly with decreased throughput.
Fault tolerance requirements derive from SLAs.
Fault Tolerance SLA Requirements
No single point of failure - redundant components ensure continuous operation.
Allow repairs without disruption of service.
Fault isolation - problem detection must pinpoint the specific faulty component.
Fault propagation containment - problems in one component must not cascade to others (one common containment technique is sketched below).
Reversion mode - the system can be set back to a known state on command.
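One common way to contain fault propagation - not something these slides prescribe - is a circuit breaker; a minimal sketch (real implementations add a timed half-open retry):

    import java.util.concurrent.Callable;

    // After MAX_FAILURES consecutive failures the breaker opens and calls
    // fail fast, containing the fault instead of letting it cascade.
    // The threshold and the wrapped operation are illustrative.
    public class CircuitBreaker {
        private static final int MAX_FAILURES = 5;
        private int consecutiveFailures = 0;

        public <T> T call(Callable<T> operation) throws Exception {
            if (consecutiveFailures >= MAX_FAILURES)
                throw new IllegalStateException("circuit open - failing fast");
            try {
                T result = operation.call();
                consecutiveFailures = 0;     // success closes the breaker
                return result;
            } catch (Exception e) {
                consecutiveFailures++;       // count the fault before rethrowing
                throw e;
            }
        }
    }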
A/A Cluster Fault Tolerance
Diagram: consumer -> load balancer (74.0.125.28) -> nodes 192.168.202.55-.69, with a replacement node (192.168.202.53) ready to join.
Uninterruptible, scalable service (stateless, web services).
Failure transparency - though service may be degraded.
Ideal for event-based web services (SOAP, REST, JMS, etc.).
No dependencies between nodes.
A/P Cluster Fault Tolerance
Diagram: consumer -> router (74.0.125.28) -> active node 192.168.202.55, heartbeat to fail-over node 192.168.202.69; state data in a cache; database with a fail-over database.
High availability through redundancy and failure detection.
Higher cost - used for stateful systems.
May require active sysadmin or netadmin participation.
More moving parts - more things to coordinate.
Putting It All Together
ROC Architecture
ROC = Resource-Oriented Computing: everything is a resource (computational, data, or other).
Diagram: web browsers, GUI apps, and web apps reach the business logic over the Internet (HTTP/XML, JMS, SOAP, etc.); a Mule ESB with transformers mediates between service providers (UPS, FedEx), a Remedy system behind a dedicated API, a CRM, product catalogue and product support pages (SOAP, JDBC), TCP pass-through services, and single sign-on (LDAP, SOAP) against a mainframe/RACF, Active Directory, and a legacy auth system.
SOA and Computational Network
Real-Life Example - LeapFrog
Diagram: end-user systems (Mac, Windows) run LeapFrog Connect over USB with connected products and a web browser; third-party partner sites and www.leapfrog.com (LearningPath) reach, through the firewall, the Mule ESB "backbone" (HTTP, SOAP via CXF, REST; routing, filtering, and dispatching; ActiveMQ JMS broker; dedicated LeapFrog services). A Mule ESB "tailbone" exposes SOAP/REST web services for connected products, and a Mule ESB "funnybone" handles device log upload and processing in a servlet container. Behind them: an S3 content repository, a content management system (REST, JCR; content authoring), Crowd SSO (user credentials), customer data, game-play data, servlets/app logic, and device logs.
Real-Life Example - LeapFrog
Diagram: Internet traffic hits a load balancer in front of Tomcat 6 application servers and a services proxy; a backbone load balancer fronts four Mule ESB 1.6.2 instances (message filtering, routing, dispatching, queuing, events); separate load balancers front the tailbone (Mule ESB, SOAP/REST), the funnybone (Mule ESB, servlet, MTOM), and the message broker (an ActiveMQ pair); shared state lives in a database and NFS shares.
Mule SOA Applied Clustering
* Two or more Mule instances can provide services, for scalability when demand is high
* A load-balanced configuration has built-in fail-over
* External apps see a single point of entry: the service endpoint name
* The load balancer or proxy sends the request to any available Mule server
* Increased demand - add another Mule server without interrupting the existing ones
* Decreased demand - remove Mule servers without interrupting the other servers
* This is an active/active configuration - any server can handle a request at any time
* Assumes that the service application components are stateless (a client-side sketch follows)
Diagram: external applications call http://server.mycompany.com/service_call; the load balancer forwards to http://mule_server_1/service_call or http://mule_server_2/service_call, each a Mule ESB application container hosting services 1-3.
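From the external application's point of view there is only the one endpoint; a hedged client sketch - the retry count and timeout are illustrative, the URL is the one from the diagram:

    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    // External client view: one endpoint, retried on failure; the load
    // balancer picks whichever Mule server is available behind it.
    public class ServiceClient {
        public static InputStream callService() throws Exception {
            Exception last = null;
            for (int attempt = 0; attempt < 3; attempt++) {
                try {
                    URL endpoint = new URL("http://server.mycompany.com/service_call");
                    HttpURLConnection conn = (HttpURLConnection) endpoint.openConnection();
                    conn.setConnectTimeout(2000);
                    if (conn.getResponseCode() == 200)
                        return conn.getInputStream();   // served by either Mule container
                } catch (Exception e) {
                    last = e;    // the balancer should route the retry elsewhere
                }
            }
            throw last != null ? last : new IllegalStateException("service unavailable");
        }
    }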
Mule SOA - ESB App Failover
* The A/A configuration uses the load balancer to dispatch service calls
* The load balancer takes a failing service out of rotation automatically (a health-check sketch follows)
* Failure reason no. 1: network connectivity
* Failure reason no. 2: the Mule container
* Failure reason no. 3: a service application bug
Diagram: same topology as the previous slide - external applications -> http://server.mycompany.com/service_call -> load balancer -> mule_server_1 / mule_server_2, each hosting services 1-3.
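The out-of-rotation decision is typically driven by a periodic health check; a hedged sketch of such a probe (paths, timeouts, and backend names are illustrative):

    import java.net.HttpURLConnection;
    import java.net.URL;

    // A probe a load balancer might run against each Mule container.
    // A non-200 answer (container down, app bug) or a connect failure
    // (network) takes the backend out of rotation.
    public class HealthCheck {
        static boolean isHealthy(String backend) {
            try {
                HttpURLConnection conn = (HttpURLConnection)
                    new URL(backend + "/service_call").openConnection();
                conn.setConnectTimeout(1000);
                conn.setReadTimeout(1000);
                return conn.getResponseCode() == 200;
            } catch (Exception e) {
                return false;    // covers all three failure reasons above
            }
        }

        public static void main(String[] args) {
            for (String backend : new String[] { "http://mule_server_1", "http://mule_server_2" })
                System.out.println(backend + " in rotation: " + isHealthy(backend));
        }
    }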
Uninterrupted Application Updates
* Allow stopping and deploying new application functionality without stopping services
* Allow upgrades to a country's configuration without affecting other countries or stopping services
Diagram (rolling upgrade over time): both Mule ESB application containers start at version 1.4; one is upgraded to version 2.0 while the other keeps serving at 1.4 behind the load balancer; finally both run version 2.0.
Database Replication
Diagram: a primary cluster (nodes 0 and 1) runs the ESB as the app services provider over partitions 0 and 1; each partition's database (DB 0, DB 1) replicates to a backup (DB 0b, DB 1b).
Application Deployment
Diagram: two load balancers front Mule servers 1-5; JMS queuing runs as two active instances with fail-over between them.
Application Deployment
This architecture has a lower cost of operation and reduces power consumption and administration overhead.
Simplify the architecture by having a common platform for all systems; this platform can be replicated across multiple data centers.
Stack diagram: Application 1, Application 2, Web Service 1, and Web Service 2 run on JBoss, a Mule ESB container, and MQ, each on Java 6 / Linux / a virtual machine, over multi-core Intel or AMD processors.
* Virtual machine: VMware or Xen hosted on Windows; consider Amazon EC2 as a viable, low-cost alternative
* Linux: Ubuntu Server
* PowerBuilder (end-user) applications migrate to JBoss + Wicket or a similar configuration
* All web services are hosted by Mule ESB
* The Mule ESB and JBoss servers are separate from one another
* MQ clusters have a similar architecture; JBoss Messaging and WebSphere MQ
* Java 6 as a minimum
Application Deployment
App and service requests may come from the open Internet.
Each data center will have a cluster of two or more physical systems.
Each system will virtually host two or more applications/environments deployed as described in the previous diagram.
The system is designed for horizontal scalability (more traffic, more virtual or physical servers).
The system has inherent fail-over built in.
Use physical load balancers; these can be Linux systems or dedicated F5 balancers, separate from the cluster.
Diagram: an app balancer and a services balancer front two virtual hosts (Intel, AMD), each running an active application, active web services, a distributed cache, and an MQ instance (master on one host, slave on the other); each host's disk sits on a SAN.
Application Deployment
Diagram: data centers in Europe, Japan, and the US, each with app clusters reachable from the Internet; the US data center also hosts the Claims Management hub, backed by Informix and legacy systems.
Each data center has an application cluster.
The app clusters have identical configurations; only the app itself may vary by locale.
A designated data center also functions as the global services processing hub; all applications talk to this cluster (e.g. Claims Management) regardless of where the calling app runs.
The global services clusters are physically and logically separate from the application clusters, which may include locale-specific web services and data stores.
Application Deployment
Diagram: a primary cluster and a secondary cluster (nodes 0 and 1 each) run the ESB as the app services provider over partitions 0 and 1, with databases DB 0/DB 1 replicating to DB 0b/DB 1b in each cluster; a queue links the two clusters through the Enterprise Service Bus (routing, queuing, transformation, transactions, dispatching).
Eugene Ciurana - geecon@ciurana.eu - pr3d4t0r, ##java, irc.freenode.net
http://ciurana.eu/scalablesystems
Q&A - Comments? Anything else?
This presentation is available from: http://ciurana.eu/geecon-2010
Twitter: ciurana