Clustering/HA. Introduction, Helium Capabilities, and Potential Lithium Enhancements. www.opendaylight.org



Similar documents
Non-Stop for Apache HBase: Active-active region server clusters TECHNICAL BRIEF

Implementing and Managing Windows Server 2008 Clustering

Understanding The Brocade SDN Controller Architecture

Towards Smart and Intelligent SDN Controller

On- Prem MongoDB- as- a- Service Powered by the CumuLogic DBaaS Platform

Facebook: Cassandra. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India. Overview Design Evaluation

Technical Overview Simple, Scalable, Object Storage Software

A SURVEY OF POPULAR CLUSTERING TECHNOLOGIES

Highly available, scalable and secure data with Cassandra and DataStax Enterprise. GOTO Berlin 27 th February 2014

In Memory Accelerator for MongoDB

BASHO DATA PLATFORM SIMPLIFIES BIG DATA, IOT, AND HYBRID CLOUD APPS

Module 14: Scalability and High Availability

Designing, Optimizing and Maintaining a Database Administrative Solution for Microsoft SQL Server 2008

Building Reliable, Scalable AR System Solutions. High-Availability. White Paper

MS Design, Optimize and Maintain Database for Microsoft SQL Server 2008

Distributed Systems. Tutorial 12 Cassandra

Red Hat Cluster Suite

Windows Server Failover Clustering April 2010

CDAT Overview. Remote Managing and Monitoring of SESM Applications. Remote Managing CHAPTER

Appendix A Core Concepts in SQL Server High Availability and Replication

Deploying BDR. Simon Riggs CTO, 2ndQuadrant & Major Developer, PostgreSQL. February 2015

Feature Comparison. Windows Server 2008 R2 Hyper-V and Windows Server 2012 Hyper-V

You need to recommend a monitoring solution to ensure that an administrator can review the availability information of Service1. What should you do?

Above the Clouds: Introducing Akka

VMware vsphere: [V5.5] Admin Training

Wisdom from Crowds of Machines

Deployment Topologies

Non-Stop Hadoop Paul Scott-Murphy VP Field Techincal Service, APJ. Cloudera World Japan November 2014

Oracle WebLogic Server 11g Administration

High Availability Design Patterns

Scaling Out With Apache Spark. DTL Meeting Slides based on


XtreemFS Extreme cloud file system?! Udo Seidel

Citrix XenDesktop Backups with Xen & Now by SEP

SQL Server 2012/2014 AlwaysOn Availability Group

Oracle Database 11g: New Features for Administrators DBA Release 2

Design and Evolution of the Apache Hadoop File System(HDFS)

Application Centric Infrastructure Object-Oriented Data Model: Gain Advanced Network Control and Programmability

Configuring the BIG-IP and Check Point VPN-1 /FireWall-1

ONOS Open Network Operating System

Objectif. Participant. Prérequis. Pédagogie. Oracle Database 11g - New Features for Administrators Release 2. 5 Jours [35 Heures]

HDFS Users Guide. Table of contents

CitusDB Architecture for Real-Time Big Data

CONSUL AS A MONITORING SERVICE

Oracle 11g New Features - OCP Upgrade Exam

OpenAM. 1 open source 1 community experience distilled. Single Sign-On (SSO) tool for securing your web. applications in a fast and easy way

Windows Server. Introduction to Windows Server 2008 and Windows Server 2008 R2

Citrix XenServer Backups with Xen & Now by SEP

Astaro Deployment Guide High Availability Options Clustering and Hot Standby

In-Memory BigData. Summer 2012, Technology Overview

High Availability Solutions for the MariaDB and MySQL Database

Multiple Public IPs (virtual service IPs) are supported either to cover multiple network segments or to increase network performance.

TECHNOLOGY WHITE PAPER Jun 2012

Building your Server for High Availability and Disaster Recovery. Witt Mathot Danny Krouk

YouTube Vitess. Cloud-Native MySQL. Oracle OpenWorld Conference October 26, Anthony Yeh, Software Engineer, YouTube.

PROTECTING DATA IN TRANSIT WITH ENCRYPTION IN M-FILES

MS SQL Server 2014 New Features and Database Administration

The Design and Implementation of the Zetta Storage Service. October 27, 2009

Mesos: A Platform for Fine- Grained Resource Sharing in Data Centers (II)

Design for Failure High Availability Architectures using AWS

A Performance Analysis of Distributed Indexing using Terrier

Deploying complex applications to Google Cloud. Olia Kerzhner

CHAPTER 1 - JAVA EE OVERVIEW FOR ADMINISTRATORS

Planning and Administering Windows Server 2008 Servers

Configure AlwaysOn Failover Cluster Instances (SQL Server) using InfoSphere Data Replication Change Data Capture (CDC) on Windows Server 2012

Mail-SeCure Load Balancing

VMware vsphere: Install, Configure, Manage [V5.0]

DSS. High performance storage pools for LHC. Data & Storage Services. Łukasz Janyst. on behalf of the CERN IT-DSS group

Clustering ExtremeZ-IP 4.1

Savanna Hadoop on. OpenStack. Savanna Technical Lead

NoSQL in der Cloud Why? Andreas Hartmann

nwstor Storage Security Solution 1. Executive Summary 2. Need for Data Security 3. Solution: nwstor isav Storage Security Appliances 4.

The Service Availability Forum Specification for High Availability Middleware

ArcGIS for Server Deployment Scenarios An ArcGIS Server s architecture tour

Bigdata High Availability (HA) Architecture

The CF Brooklyn Service Broker and Plugin

SCALABILITY AND AVAILABILITY

Time series IoT data ingestion into Cassandra using Kaa

ZooKeeper. Table of contents

How To Understand The History Of The Network And Network (Networking) In A Network (Network) (Netnet) (Network And Network) (Dns) (Wired) (Lannet) And (Network Network)

MongoDB Developer and Administrator Certification Course Agenda

Robert Honeyman Honeyman IT Consulting.

Disaster Recovery White Paper

<Insert Picture Here> Oracle NoSQL Database A Distributed Key-Value Store

High Performance Cluster Support for NLB on Window

VAULT MODERN SECRETS MANAGEMENT

Distributed File Systems

Megastore: Providing Scalable, Highly Available Storage for Interactive Services

Big Data. White Paper. Big Data Executive Overview WP-BD Jafar Shunnar & Dan Raver. Page 1 Last Updated

HDFS Architecture Guide

XLink ClusterReplica SQL 3.0 For Windows 2000/2003/XP

HDFS Federation. Sanjay Radia Founder and Hortonworks. Page 1

BigData. An Overview of Several Approaches. David Mera 16/12/2013. Masaryk University Brno, Czech Republic

Hadoop: A Framework for Data- Intensive Distributed Computing. CS561-Spring 2012 WPI, Mohamed Y. Eltabakh

Chapter 1 - Web Server Management and Cluster Topology

Cluster Computing. ! Fault tolerance. ! Stateless. ! Throughput. ! Stateful. ! Response time. Architectures. Stateless vs. Stateful.

Windows Server 2008 R2 Hyper-V Server and Windows Server 8 Beta Hyper-V

Cloud Computing Disaster Recovery (DR)

Implementing Intercluster Lookup Service

Transcription:

Clustering/HA Introduction, Helium Capabilities, and Potential Lithium Enhancements

Overview Introduction to Clustering Goals Terminology Functionality Helium Clustering Capabilities Potential Lithium Enhancements APP HA Use Cases Design With Clustering in Mind. 2

Why Cluster? 1) Fault Tolerance Replicate Data to Allow Recovery in the Event of a Failure 2) Scale Distribute Work Across Set of Available Controllers (Ideally) Keep Data and Execution of Operations on that Data on Same Controller 3

What is Needed For Clustering? Form and Maintain a Cluster of Controllers Divide and Replicate Data Among Members Fast & Safe Datastore API Remote Execution Cluster Status and Change Notifications Defined Network Partition Behavior Security 4

Discussion Outline 1) Cluster Configuration 2) Datastore Clustering & Sharding 3) Autonomous Data Replication 4) Distributed Execution Akka - Actor System (Framework) - Use Akka-Provided Services and Custom Actors Focus: Clustering Concepts Over Code. 5

1) Cluster Configuration Create/Delete Cluster Automatic Member Discovery and Management Track Member Reachability and Availability Cluster-Event Notifications Transport Config & Security Ports Authenticate Members Encrypt Data Exchange Define Cluster Leader For Simplified Team Interaction Other Configuration (Failure Detection Interval, etc.) Akka-Clustering - Define Members + Discovery - Tracks Membership (Failure Detector) - Transport Config (Netty) 6

1) Cluster Configuration Helium Static Team Configuration on Startup (akka.conf) Akka-Specific Cluster-Member Event Notifications Local APP Deployment Odd # Node Clusters Lithium (Potential Changes) Programmatic Team Configuration Dynamic Team Formation Generalize Cluster Events Team-Wide APP Deployment (Apache Karaf Clustering?) Even # Node Clusters Team Leader (IP, Role, etc.) Security Not Enabled Member Authentication + Encryption 7

LevelDB 2) Datastore Clustering & Sharding Datastore: Define Shard Define Sharding Strategy Transparent Replacement for In-Memory Datastore (Distributed) Transactions Data Change Notifications Persistence (Optional) Scalability Config APP 1 Controller X APP 1 Akka-Persistence Operational - Journaling / Snapshotting - Configurable Plugins (Flat File, LevelDB) 8

2) Datastore Clustering & Sharding Helium Static Shard Configuration (module.conf, module-shards.conf) Top-Level Module Sharding Strategy Shard Persistence (enabled/disabled) Single-Shard Transactions Lithium (Potential Changes) Programmatic Shard Configuration Finer-Grain Sharding (APP Defined) Finer-Grain Persistence Control (Node, Subtree, other?) Multi-Shard Transactions 9

LevelDB 3) Autonomous Data Replication Automatic Replication of Shards Re-replication on Failures Dynamic Configuration: Replication Level Shard Placement Consistency (R/W) Defined Network Partition Behavior (Quorum vs. Non-Quorum) Config APP 1 Controller Y (Replica) Controller X (Shard Primary) APP 1 Remote RPC Registry Operational Controller Z (Replica) RAFT Consensus - Actor-Based Implementation on Akka 10

RAFT Consensus in 60sec Goal: All Members of Cluster Agree on a Variable s Value. (e.g. Content of Operational Datastore Node(s)) 1) Leader Election 2) Data Replication RAFT Elects and Maintains a Leader (Shard) Leader Handles R/W Request and Replicates Data to Other Members ALL Reads/Writes Go Through Shard Leader (Strong Consistency) On Write a Quorum of Nodes Must be Written Before Write Completes Minority (Non-Quorum) Partition Could Continue Operating Reads Not Consistent With Majority Writes Lost When Partition Healed 11

3) Autonomous Data Replication Helium Static Shard Placement / Replication Levels (module-shards.conf) Don t Maintain Replication Level on Failure Lithium (Potential Changes) Programmatic Shard Placement and Replication Levels Re-replicate to Maintain Configuration (config & team-size permitting) Special Replication Levels (Full, Quorum) Strong Consistency (R/W Through Leader Only) Any Shard Can Be Elected Leader Suspend on Non-Quorum Network Partition Adjustable R/W Consistency (Per Query?, Shard?) Influence Shard Leader Election (Cluster Load Balancing) Support (R, W, R/W) on Non-Quorum & Merge on Network Heal? 12

4) Distributed Execution Location Transparency For RPCs Automatic Routing Remote RPC Registry Updates Execute Operation on Data Owner Akka-Remoting - Location Independence - Remote RPC Registry (Custom Actors) 13

4) Distributed Execution Helium Static Remote RPC Registry Update Interval (Gossip) Lithium (Potential Changes) Programmable Update Interval 14

APP HA Use Cases High-Availability Modes of Operation: 1) Active-Active 2) Active-Passive (aka Active-Standby) 15

Discussion / Q&A 16

Resources ODL Clustering DOCs: General Design Detailed Design Feature Summary RAFT Consensus Akka Contacts: Moiz Raja (moraja@cisco.com) Raghurama Bhat (ragbhat@cisco.com) Harman Singh (harmasin@cisco.com) Mark Mozolewski (mbm@hp.com) 17