StreamBase High Availability

Similar documents
TIBCO StreamBase High Availability Deploy Mission-Critical TIBCO StreamBase Applications in a Fault Tolerant Configuration

TIBCO StreamBase High Availability Deploy Mission-Critical TIBCO StreamBase Applications in a Fault Tolerant Configuration

Performance & Scalability Characterization. By Richard Tibbetts Co-Founder and Chief Architect, StreamBase Systems, Inc.

StreamBase White Paper StreamBase versus Native Threading. By Richard Tibbetts Co-Founder and Chief Technology Officer, StreamBase Systems, Inc.

Case Study: BlueCrest Capital Management Ltd.

Pervasive PSQL Meets Critical Business Requirements

Symantec and VMware: Virtualizing Business Critical Applications with Confidence WHITE PAPER

High Availability with Postgres Plus Advanced Server. An EnterpriseDB White Paper

Non-Native Options for High Availability

Confidently Virtualize Business-critical Applications in Microsoft Hyper-V with Symantec ApplicationHA

Oracle Databases on VMware High Availability

Confidently Virtualize Business-Critical Applications in Microsoft

A SURVEY OF POPULAR CLUSTERING TECHNOLOGIES

Veritas Cluster Server from Symantec

SOLUTION BRIEF. TIBCO StreamBase for Algorithmic Trading

Database Resilience at ISPs. High-Availability. White Paper

Oracle Database Solutions on VMware High Availability. Business Continuance of SAP Solutions on Vmware vsphere

High Availability and Disaster Recovery Solutions for Perforce

Designing, Optimizing and Maintaining a Database Administrative Solution for Microsoft SQL Server 2008

TABLE OF CONTENTS THE SHAREPOINT MVP GUIDE TO ACHIEVING HIGH AVAILABILITY FOR SHAREPOINT DATA. Introduction. Examining Third-Party Replication Models

FioranoMQ 9. High Availability Guide

Veritas Cluster Server by Symantec

Course Syllabus. Maintaining a Microsoft SQL Server 2005 Database. At Course Completion

High Availability And Disaster Recovery

Recover a CounterPoint Database

The Aspect Unified IP Five 9s Environment

Veritas InfoScale Availability

MS Design, Optimize and Maintain Database for Microsoft SQL Server 2008

DeltaV Virtualization High Availability and Disaster Recovery

Consolidation of Disaster Recovery Replica Servers

Windows Geo-Clustering: SQL Server

EMC VPLEX FAMILY. Continuous Availability and data Mobility Within and Across Data Centers

SQL SERVER ADVANCED PROTECTION AND FAST RECOVERY WITH EQUALLOGIC AUTO-SNAPSHOT MANAGER

Comparing MySQL and Postgres 9.0 Replication

Windows Server Failover Clustering April 2010

John D. Bonam Disaster Recovery Architecture Session # 2841

Creating A Highly Available Database Solution

Availability Guide for Deploying SQL Server on VMware vsphere. August 2009

Whitepaper Continuous Availability Suite: Neverfail Solution Architecture

Module 14: Scalability and High Availability

Fax Server Cluster Configuration

Scaling Healthcare Applications to Meet Rising Challenges of Healthcare IT

The functionality and advantages of a high-availability file server system

Westek Technology Snapshot and HA iscsi Replication Suite

Five Secrets to SQL Server Availability

Cisco Active Network Abstraction Gateway High Availability Solution

Protect SAP HANA Based on SUSE Linux Enterprise Server with SEP sesam

Maximum Availability Architecture. Oracle Best Practices For High Availability. Backup and Recovery Scenarios for Oracle WebLogic Server: 10.

SanDisk ION Accelerator High Availability

WHAT IS GLOBAL OPERATION? EBOOK

VERITAS Business Solutions. for DB2

High Availability Database Solutions. for PostgreSQL & Postgres Plus

Whitepaper - Disaster Recovery with StepWise

TIBCO ActiveSpaces Use Cases How in-memory computing supercharges your infrastructure

Automatic Service Migration in WebLogic Server An Oracle White Paper July 2008

Informatica MDM High Availability Solution

Executive Brief Infor Cloverleaf High Availability. Downtime is not an option

Explain how to prepare the hardware and other resources necessary to install SQL Server. Install SQL Server. Manage and configure SQL Server.

Contingency Planning and Disaster Recovery

Designing a Cloud Storage System

Symantec Cluster Server powered by Veritas

Whitepaper: Back Up SAP HANA and SUSE Linux Enterprise Server with SEP sesam. Copyright 2014 SEP

An Oracle White Paper November Oracle Real Application Clusters One Node: The Always On Single-Instance Database

EMC Backup and Recovery for Microsoft SQL Server 2008 Enabled by EMC Celerra Unified Storage

Executive Summary WHAT IS DRIVING THE PUSH FOR HIGH AVAILABILITY?

Building Reliable, Scalable AR System Solutions. High-Availability. White Paper

SQL Server AlwaysOn Deep Dive for SharePoint Administrators

Maintaining a Microsoft SQL Server 2008 Database

Comparing TCO for Mission Critical Linux and NonStop

DATABASES AND ERP SELECTION: ORACLE VS SQL SERVER

CASE STUDY: Oracle TimesTen In-Memory Database and Shared Disk HA Implementation at Instance level. -ORACLE TIMESTEN 11gR1

Double-Take Replication in the VMware Environment: Building DR solutions using Double-Take and VMware Infrastructure and VMware Server

Testing Big data is one of the biggest

Three Ways Enterprises are Protecting SQL Server in the Cloud

ORACLE DATABASE 10G ENTERPRISE EDITION

Microsoft SQL Server Always On Technologies

Online Firm Improves Performance, Customer Service with Mission-Critical Storage Solution

Oracle Exam 1z0-599 Oracle WebLogic Server 12c Essentials Version: 6.4 [ Total Questions: 91 ]

Course Syllabus. At Course Completion

Rackspace Cloud Databases and Container-based Virtualization

Microsoft SQL Server 2008 R2 Enterprise Edition and Microsoft SharePoint Server 2010

Tushar Joshi Turtle Networks Ltd

StreamBase White Paper Real-Time Profit & Loss

Administering and Managing Log Shipping

Designing a Data Solution with Microsoft SQL Server 2014

Reducing the Cost and Complexity of Business Continuity and Disaster Recovery for

Assuring High Availability in Healthcare Interfacing Considerations and Approach

EMC DATA PROTECTION FOR SAP HANA

Designing Database Solutions for Microsoft SQL Server 2012 MOC 20465

Eliminating End User and Application Downtime:

VARONIS CASE STUDY. Arnold Worldwide

Implementing efficient system i data integration within your SOA. The Right Time for Real-Time

Active-Active and High Availability

Application Continuity with BMC Control-M Workload Automation: Disaster Recovery and High Availability Primer

Neverfail Solutions for VMware: Continuous Availability for Mission-Critical Applications throughout the Virtual Lifecycle

Syslog Analyzer ABOUT US. Member of the TeleManagement Forum

High Availability and Clustering

SQL SERVER ADVANCED PROTECTION AND FAST RECOVERY WITH DELL EQUALLOGIC AUTO SNAPSHOT MANAGER

be architected pool of servers reliability and

Microsoft SharePoint 2010 on VMware Availability and Recovery Options. Microsoft SharePoint 2010 on VMware Availability and Recovery Options

Transcription:

StreamBase High Availability Deploy Mission-Critical StreamBase Applications in a Fault Tolerant Configuration By Richard Tibbetts Chief Technology Officer, StreamBase Systems

StreamBase High Availability The StreamBase Event Processing Platform provides the ability for customers to deploy mission-critical StreamBase applications in a fault tolerant configuration. High Availability is a broad term, and is used here to mean that the system will continue to operate correctly in conditions such as: Hardware failure of a server running StreamBase Unexpected software failure in the StreamBase Event Processing Platform Administrative error on a server, such as accidental shutdown Failure of network links Removing a server from production for maintenance Correct operation means that the system will continue to correctly process new events and that downstream systems will be insulated from a gap in processing. Overview The principal technique for fault tolerance is to deploy a pair of servers, both executing the same application, one running as the primary server and one as the backup. (see Figure A) Data from upstream sources is delivered to both servers. The application running on the primary server is responsible for actual decisions, and for producing the results that are delivered to downstream systems. The application running on the backup executes all calculations and maintains state, but under normal operation does not deliver results to the downstream system. If the application is nondeterministic, such as making trades based on market timings, then application state can optionally be synchronized from the primary to the backup server. State synchronization assures that the backup server state correctly reflects the processing on the primary and the results already delivered to downstream systems. Figure A StreamBase High Availability In the event of failure on the primary, the backup server will take over as primary. The backup server uses a 2

system of heartbeats to monitor the health of the primary. When the primary stops responding to heartbeats, the backup server determines that a failure has occurred. At this point, the backup server stops listening to state updates from the primary, and begins delivering messages to the downstream system, optionally filling in messages that the downstream system missed. To return to a highly available state, a new backup server can be introduced and synchronized with the nowprimary server. Once synchronization is complete, the system is returned to safety. Components Several components of the StreamBase Event Processing Platform work together to create fault tolerant deployments. (see Figure B) To characterize StreamBase High Availability capabilities in more detail, we describe some of the components below. Primary StreamBase Server Application Container Input Input Input Application Logic Output Input Input HA Container HA Events Heartbeat System Container stats control Heartbeats Secondary StreamBase Server Monitoring HA Container HA Events Heartbeat System Container stats control Application Container Input Input Input Application Logic Output Input Input Figure B Components of StreamBase High Availability Containers Each running StreamBase Server can run multiple StreamBase applications, each one of which runs in its own container. A container provides the system services needed by its application, and affords management 3

functionality, such as starting and stopping applications. The ability to manage containers independently is a key part of StreamBase High Availability. Leadership Leadership is a flag that will be set on exactly one of the nodes in the pair. Downstream clients and backup nodes will respect the leader s output messages and state updates over those of other nodes. The leadership flag is maintained by the logic in the HA container. HA Container Within a server participating in a highly available server pair, a special container called the HA container will exist. This container executes an HA application, which is responsible for the event processing logic that implements the high availability strategy. This application is specified in StreamBase Event Flow and provided as part of the StreamBase platform. When failure of the primary is detected by the backup, the HA container in the backup decides on the new leader and triggers HA Events which tell applications and clients that a leadership change has occurred. The HA container also publishes events that can be read by clients that want visibility into this recovery process. Heartbeats Heartbeats are passed between the HA containers of the two nodes in the pair. These heartbeats come at a configurable interval (generally 10 times per second) and are used to identify a failure in the primary server. When a configurable number of heartbeats are missed (generally 10) the backup server will initiate failover. Third-Party Components StreamBase High Availability can be deployed in a pure StreamBase environment, or it can be integrated with additional third-party components. Highly Available RDBMS Many StreamBase applications interact with a traditional relational database management system (RDBMS). These systems often provide high availability options, such as Oracle RAC. In a highly available deployment, StreamBase will integrate with the failover strategy of these systems through standard or proprietary APIs, to assure that data is not lost and processing continues. Storage Area Network (SAN) Many sites make heavy use of SAN in their high availability or disaster recovery strategies. StreamBase can persist table data to a SAN, and use that data when a backup server takes over. StreamBase can also log to a SAN using an efficient binary representation for recovery or auditing. Multicast Messaging Messaging systems such as RMDS and Wombat can use multicast to deliver the same messages to multiple systems simultaneously. A StreamBase primary and backup server can both listen to these messages as a way of assuring that the backup server sees all the messages. 4

Reliable Messaging Third-party reliable messaging products, such as Tibco EMS or JMS can be used in place of StreamBase network connections through the use of adapters. StreamBase integrates with the failover strategy of these message queuing products, and will acknowledge the messages at the right stage of event processing. Hardware Virtualization StreamBase can easily be deployed on virtualized hardware. In multi-tenant situations sufficient processing power must be available in order to achieve real-time goals, but with StreamBase this is easily accomplished. Disaster Recovery Disaster recovery is Primary Site separate from fault tolerance, because the time scales are different and the faults are larger. In a disaster recovery scenario, an entire operational site is Market Data Order Flow Primary Secondary OMS disrupted by a natural disaster or infrastructure failure. Because the upstream and downstream systems feeding StreamBase are likely to be Shared Storage (SAN) RDBMS disrupted similarly, and this SAN Replication RDBMS Replication event is very infrequent, immediate and automatic failover is not generally desirable. Disaster Recovery Site Instead, administrative procedures and operator intervention come into play. (See Figure C) The StreamBase server is generally restarted as part of a larger procedure. Persistent storage, such as a SAN, is used to recover critical application state, such as executed orders. For some applications, the disaster recovery procedure is asymmetric, where the Market Data Order Flow Primary Shared Storage (SAN) Figure C Components of StreamBase Disaster Recovery Secondary RDBMS OMS 5

application running at the recovery site closes out open positions, and then ceases operation. But for many applications, the disaster recovery site achieves full operability and will continue as the primary site for days or weeks. In either case, StreamBase easily fits into existing business continuity planning. Additional Information Additional information is available in the following StreamBase documentation topics: Primer on StreamBase Clustering and HA This webpage is the primary starting point for understanding clustering and HA in StreamBase. This page provides an overview of the collection of features that can be combined in various ways to support clustered and highly available servers in pursuance of different design goals. Visit: http://streambase.com/developers/docs/latest/admin/ha-overview.html Designing Highly Available Applications This webpage describes the application design patterns that HA applications have in common. Visit: http://streambase.com/developers/docs/latest/admin/ha-designing.html 6

About Richard Tibbetts Richard Tibbetts is co-founder and Chief Technology Officer at StreamBase Systems. Throughout his career, Tibbetts has concentrated on the design and development of database technologies, while also offering specialization in program analysis and distributed computing. Tibbetts created the Linear Road Benchmark, the database management industry s guide for comparing the performance of real-time streaming databases with alternative systems, like relational databases. In addition, he has been heavily involved with a number of academic database projects, including Aurora at Brown University, and Medusa at the Massachusetts Institute of Technology (MIT). In 2003, the research from these two projects later led to the founding and commercialization of StreamBase s Stream Processing Engine. As a co-founder and Chief Technology Officer at StreamBase, Tibbetts is directly involved in architecting the foundation for the next-generation textual language based on Structured Query Language (SQL). The objective of this work is to apply the benefits of SQL for stored data to real-time, transitory data. Tibbetts is also charged with global responsibilities in furthering new StreamBase capabilities through platform and technical innovation. Tibbetts earned both his bachelor of science in computer science and engineering and his master s of engineering degree at MIT. He completed his graduate research under the direction of Dr. Mike Stonebraker. About StreamBase StreamBase Systems, Inc, a leader in high-performance Complex Event Processing (CEP), provides software for rapidly building systems that analyze and act on real-time streaming data for instantaneous decision-making. StreamBase s Event Processing Platform combines a rapid application development environment, an ultra low-latency high-throughput event server, and the broadest connectivity to real-time and historical data and leading EMS/ OMS software platforms. Six of the top ten Wall Street investment banks and three of the top five hedge funds use StreamBase to power mission-critical applications to increase revenue, lower costs, and reduce risk. It is also used by government agencies for highly specialized intelligence work. The company is headquartered in Lexington, Massachusetts with European offices in London. For more information, visit www.streambase.com. Corporate Headquarters 181 Spring Street Lexington, MA 02421 +1 (866) 787-6227 New York Office 845 Third Avenue, 6th Floor New York, NY 10022 +1 (866) 787-6227 Virginia Office 11921 Freedom Drive, Suite 550 Reston, VA 20190 +1 (443) 472-0767 European Headquarters 60 Cannon Street London EC4N 6NP +44 (0) 20 7002 1095 www.streambase.com 2009 StreamBase Systems, Inc. All rights reserved. StreamBase, StreamBase Studio, and When Now Means Right Now are trademarks of StreamBase Systems, Inc. All other trademarks are the property of their respective owners. 7