Multisite Clustering and Search Affinity

Copyright 2014 Splunk Inc.
Multisite Clustering and Search Affinity
Mustafa Ahamed, Director, Product Management
Da Xu, Senior Software Engineer

Disclaimer
During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only, and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release.

Agenda
- What is Clustering?
- Business Benefits of Multisite Clustering
- Multisite Configuration
- Search Affinity
- Tips and Tricks: Migration
- Q&A

Why Do We Need Clustering?
Workarounds used without clustering, and their drawbacks:
1. Index and forward: additional licensing costs
2. Simultaneous forwarding from the forwarders: data sync issues

Four Ideal Data Availability Product Requirements
- Data Recovery: automatic failure detection and recovery
- Data Fidelity: correctness of data at all times
- Data Redundancy: automatic backup of data
- Management Console: single graphical interface to manage the cluster
Overall goal: Ease of Use + Lower TCO

Major Components of Clustering
- Cluster Master
  - One master per cluster
  - Holds metadata only
- Cluster Peer(s)
  - Store and index the actual data
  - Replicate data to other peers
- Search Head(s)
  - Coordinate the searches

Clustering: The Search Process Flow
1. The Search Head gets the peer list from the Cluster Master
2. The Search Head sends the search queries to the peers
3. Redundant copies of raw data are available

Enterprise Readiness in Splunk
- High Availability: Indexer Tier Index Replication
  - Commodity hardware based
  - Recommended for single site
  - Flexible replication policies
  - Available since Splunk 5.0
- Disaster Recovery: Multisite Clustering
  - Can withstand entire site failure
  - Support for active-active configuration
  - Search affinity
Together these target the mission-critical enterprise.

Multisite Clustering

Multisite Clustering
- Released in Splunk 6.1
- Previous Splunk clusters gave us data redundancy
  - RF-1 indexers can fail without data loss (RF being the replication factor)
- Multisite allows for an extra layer of partitioning
  - Indexers are grouped into sites
  - Now an entire site of indexers (or multiple sites) can fail without data loss
  - This partitioning allows for better real-world redundancy, e.g. failures in a rack or office location (one site) will not result in data loss from redundant sites
Example sites: Los Angeles (site1), San Jose (site2)

Multisite Configuration
Multisite configurations are in splunk.conf.

Cluster Master splunk.conf:

    [general]
    site = site1

    [clustering]
    mode = master
    multisite = 1
    available_sites = site1,site2,site3
    site_replication_factor = origin:2,total:3
    site_search_factor = origin:1,total:2

Cluster Indexer splunk.conf:

    [general]
    site = site1

    [clustering]
    mode = slave
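As a rough illustration of how the cluster master stanzas above fit together, the following sketch reads them with Python's standard configparser. The MASTER_CONF string and the read_multisite_settings helper are hypothetical names introduced here; this is not how Splunk itself loads its configuration.

```python
import configparser

# Hypothetical file contents, copied from the cluster master example.
MASTER_CONF = """
[general]
site = site1

[clustering]
mode = master
multisite = 1
available_sites = site1,site2,site3
site_replication_factor = origin:2,total:3
site_search_factor = origin:1,total:2
"""

def read_multisite_settings(conf_text):
    """Return the multisite-related settings as a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    clustering = parser["clustering"]
    return {
        "site": parser["general"]["site"],
        "mode": clustering["mode"],
        # getboolean accepts 1/0 as well as true/false
        "multisite": clustering.getboolean("multisite"),
        "available_sites": [s.strip() for s in clustering["available_sites"].split(",")],
        "site_replication_factor": clustering["site_replication_factor"],
        "site_search_factor": clustering["site_search_factor"],
    }

settings = read_multisite_settings(MASTER_CONF)
print(settings["multisite"], settings["available_sites"])
```

The two factor settings are left as strings here because their origin/siteN/total syntax needs its own parsing.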

Multisite Configuration

Configuration: Explanation/Rules

multisite
  - Turns multisite on or off [0/1]

site
  - Which site this Splunk instance belongs to
  - Master, peers, and search heads all require a site if multisite is enabled
  - Valid sites are site1 through site63

available_sites
  - List of all sites that will be part of this cluster
  - Splunk instances with a site not listed here will not be able to join the cluster

site_replication_factor
  - Multisite replication policy: specifies how many copies of a bucket to keep per site
  - [required] origin refers to the number of replicated copies for the original site
  - sitex refers to the number of replicated copies for a specific site
  - [required] total refers to the total number of replicated copies for each bucket

site_search_factor
  - Multisite replication policy for searchable copies, similar to site_replication_factor
  - None of the values can be larger than their corresponding site_replication_factor values
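The rules in this table can be sketched as a small checker. This is only an illustration of the rules as stated on the slide, not Splunk's own validation; parse_policy and check_policies are hypothetical helper names, and the handling of a search-factor key with no replication-factor counterpart (it is skipped) is an assumption.

```python
import re

# site1 through site63, as stated on the slide
SITE_NAME = re.compile(r"^site([1-9]|[1-5][0-9]|6[0-3])$")

def parse_policy(text):
    """Parse 'origin:2,site1:1,total:3' into a dict of ints."""
    policy = {}
    for part in text.split(","):
        key, _, value = part.strip().partition(":")
        policy[key] = int(value)
    return policy

def check_policies(available_sites, replication, search):
    """Return a list of rule violations (empty if none were found)."""
    problems = []
    rep, sea = parse_policy(replication), parse_policy(search)
    for site in available_sites:
        if not SITE_NAME.match(site):
            problems.append(f"{site}: valid sites are site1 through site63")
    for name, policy in (("site_replication_factor", rep),
                         ("site_search_factor", sea)):
        for required in ("origin", "total"):
            if required not in policy:
                problems.append(f"{name}: missing required '{required}'")
    # No search value may exceed the corresponding replication value.
    for key, value in sea.items():
        if key in rep and value > rep[key]:
            problems.append(
                f"site_search_factor {key}:{value} exceeds "
                f"site_replication_factor {key}:{rep[key]}")
    return problems

print(check_policies(["site1", "site2", "site3"],
                     "origin:2,total:3", "origin:1,total:2"))
```

With the values from the cluster master example the checker reports no problems; raising the search factor's origin above the replication factor's origin produces one violation.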

site_replication_factor and site_search_factor

Configuration: Explanation/Rules

origin:2, total:3
  - Default value. The origin site has 2 copies, with 3 copies cluster-wide. Splunk will put the extra copy in a site that doesn't have a copy.

origin:2, total:4
  - Similar to the above, but Splunk will try to put a single copy into any site that doesn't have one.

origin:2, site1:2, total:4
  - Both site1 and the origin site will require a minimum of 2 copies.
  - If origin==sitex, then we require a minimum of the max of the 2 values (in our case, still 2).

origin:2, site1:1, total:4
  - If origin==site1, Splunk will put 2 copies of the bucket in site1.

origin:2, site1:2, site2:2, total:3
  - Invalid: the individual sites add up to more than total!
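The rows of this table can be worked through with a short sketch that computes the guaranteed minimum copies per site for a given origin site. The function names are hypothetical, and treating a policy as invalid when those minimums exceed total is an assumption based on the last row, not Splunk's documented validation logic.

```python
def parse_policy(text):
    """Parse 'origin:2,site1:1,total:4' into a dict of ints."""
    policy = {}
    for part in text.split(","):
        key, _, value = part.strip().partition(":")
        policy[key] = int(value)
    return policy

def minimum_copies(policy_text, origin_site):
    """Return (per-site minimum copies, copies left for Splunk to place)."""
    policy = parse_policy(policy_text)
    total = policy.pop("total")
    origin = policy.pop("origin")
    minimums = dict(policy)  # the explicit siteN entries
    # If the origin site also has an explicit entry, the larger value wins.
    minimums[origin_site] = max(minimums.get(origin_site, 0), origin)
    unspecified = total - sum(minimums.values())
    if unspecified < 0:
        raise ValueError("individual sites add up to more than total")
    return minimums, unspecified

print(minimum_copies("origin:2,site1:1,total:4", "site1"))
print(minimum_copies("origin:2,site1:2,total:4", "site2"))
```

For the fourth row with site1 as the origin, site1 needs 2 copies and 2 more copies are left unspecified; the last row raises ValueError for an origin of site1.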

Multisite Configuration
- Sample configuration for a cluster with Site1 indexers and Site2 indexers. These are two identical configurations:

    available_sites = site1,site2
    site_replication_factor = origin:1,total:2

    available_sites = site1,site2
    site_replication_factor = origin:1,site1:1,site2:1,total:2

Search Affinity
- Before multisite clustering, each bucket had a single primary searchable copy that would respond to searches
- With multisite, each bucket now has a primary per site
  - An individual copy can be primary for multiple sites
  - If a searchable copy of a bucket exists on a site, Splunk will make that copy the primary for that site
- Search heads also have a site (search affinity)
  - Searches will get as many events as possible from indexers that share the same site

Search Affinity
[Diagram: Site1 search head with Site1 indexers; Site2 search head with Site2 indexers; bucket copies numbered 1 and 2]
- When a searchable copy becomes available on a site, Splunk will move the primary for that site to its local copy
- Buckets on a site will return events to a search head with the same site
- If a peer goes down, the master will move the primaries that peer had to another copy
- If the entire site goes down, copies on the other site(s) will become primaries
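The per-site primary rules above can be sketched for a single bucket as follows. This is a simplified model, not the cluster master's actual algorithm: assign_primaries and the peer names are hypothetical, and choosing the first candidate in sorted order is an arbitrary tie-break assumed here.

```python
def assign_primaries(searchable_copies, sites):
    """Pick a primary copy of one bucket for each site.

    searchable_copies maps peer name -> site for every peer that is up
    and holds a searchable copy of the bucket.
    Returns a dict mapping site -> peer acting as primary for that site.
    """
    primaries = {}
    for site in sites:
        local = sorted(p for p, s in searchable_copies.items() if s == site)
        if local:
            # A local searchable copy exists, so it serves its own site.
            primaries[site] = local[0]
        elif searchable_copies:
            # No local copy (for example the site is down): a copy on
            # another site becomes primary for this site as well.
            primaries[site] = sorted(searchable_copies)[0]
    return primaries

copies = {"idx-la-1": "site1", "idx-sj-1": "site2"}
print(assign_primaries(copies, ["site1", "site2"]))

# Site2 goes down: only the site1 copy remains searchable.
print(assign_primaries({"idx-la-1": "site1"}, ["site1", "site2"]))
```

In the second call the single remaining copy is primary for both sites, which matches the point that an individual copy can be primary for multiple sites.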

Where Do the Buckets Go?
When a new hot bucket is created, Splunk will choose replication peers as follows:
1. For specific site counts (origin:n and sitex:n), randomly choose peers from that site to be targets.
   - Example: with site2:2, Splunk will find 2 random peers in site2 to be the replication partners.
2. For the remaining unspecified counts (total minus the specific counts):
   - We target at least 1 copy into sites that have no copies yet.
     Example: (origin:2, total:4) and (available_sites=site1,site2,site3,site4). There are two unspecified counts here and 3 sites that have no copies yet, so Splunk will randomly target a copy into two of those sites.
   - If every unspecified site has at least 1 copy, Splunk will then choose sites with the lowest number of copies (which leads to an even distribution, number of peers permitting).
     Example: (origin:2, total:6) and (available_sites=site1,site2,site3,site4). There must be 2 copies in the origin site; Splunk will then distribute the remaining 4 copies over all sites.
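The two placement steps can be sketched at the site level as follows. This is a simplified model under stated assumptions: it only counts copies per site, ignores how many peers each site actually has, and does not pick individual peers; target_site_counts is a hypothetical name, not a Splunk API.

```python
import random

def parse_policy(text):
    """Parse 'origin:2,site2:2,total:6' into a dict of ints."""
    policy = {}
    for part in text.split(","):
        key, _, value = part.strip().partition(":")
        policy[key] = int(value)
    return policy

def target_site_counts(policy_text, origin_site, available_sites, rng=None):
    """Return how many copies of a new bucket each site should receive."""
    rng = rng or random.Random()
    policy = parse_policy(policy_text)
    total = policy.pop("total")
    origin = policy.pop("origin")

    # Step 1: honor the specific counts (siteN:n and origin:n).
    counts = {site: 0 for site in available_sites}
    for site, copies in policy.items():
        counts[site] = copies
    counts[origin_site] = max(counts[origin_site], origin)

    # Step 2: hand out the unspecified copies one at a time to a random
    # site among those with the fewest copies. Sites with no copy yet
    # have the lowest count, so they are served first.
    remaining = total - sum(counts.values())
    if remaining < 0:
        raise ValueError("individual sites add up to more than total")
    while remaining > 0:
        fewest = min(counts.values())
        candidates = [s for s in available_sites if counts[s] == fewest]
        counts[rng.choice(candidates)] += 1
        remaining -= 1
    return counts

sites = ["site1", "site2", "site3", "site4"]
print(target_site_counts("origin:2,total:4", "site1", sites))
print(target_site_counts("origin:2,total:6", "site1", sites))
```

Because the choice is random, the exact sites vary between runs, but the origin site always keeps its 2 copies and the remaining copies are spread as evenly as the counts allow.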

Some Things to Note
- The Cluster Master coordinates all primary changes
  - If it fails, primaries will no longer change, and thus we may lose site affinity if a site goes down
  - The solution is to bring up another cluster master (it can be in a separate site)
- Buckets created before multisite is turned on follow slightly different rules:
  1. They follow the old replication_factor and search_factor rules instead of the multisite rules
  2. These buckets will also not replicate across sites. Splunk will try to keep these old buckets on their origin site and perform replications between peers of that site

Migration

Migration
- 6.0 (non-clustering) to 6.1 (multisite)
  - Multisite policies will be applied to new data
  - Pre-multisite buckets will have a single copy of the data until they age out
- 6.0 (clustering) to 6.1 (multisite)
  - Multisite policies will be applied to new data
  - Pre-multisite buckets will follow the legacy rep_factor / search_factor policies until they age out

Scaling & UI Enhancements

Multisite Clustering Scaling
- Indexers test: 1000-node cluster
- Indexes test: 100+ indexes, 200,000 buckets
- Sites test: 200-node cluster
  - 63 sites, 3 nodes in each site

Largest cluster tested in-house:
  Splunk 5.0    10 nodes
  Splunk 6.0    150 nodes
  Splunk 6.1    450 nodes
  Splunk 6.2    1000 nodes

Clustering UI: New Bucket Status Page

Clustering UI: Fixup Tasks
- In-progress fixup activities
- Pending fixup tasks

Clustering UI: Manage Excess Buckets

Summary

Key Benefits of Multisite Clustering
1. Faster Recovery from Disastrous Events
2. Intelligent Search Routing through Search Affinity
3. Continuous Data Availability

Q & A

THANK YOU