Architec;ng Splunk for High Availability and Disaster Recovery

Similar documents

Architec;ng Splunk for High Availability and Disaster Recovery

Incident Response Using Splunk for State and Local Governments

Gain Insight into Your Cloud Usage with the Splunk App for AWS

Splunk for Networking and SDN

Hunk & Elas=c MapReduce: Big Data Analy=cs on AWS

Splunk Apps for Monitoring Microso< Based Infrastructure

Deployment Best PracHces for Splunk Apps Monitoring MicrosoK- based Infrastructure

MulGsite Clustering and Search Aﬃnity

The Shift Cloud Computing Brings to Disaster Recovery

EMC DOCUMENTUM xplore 1.1 DISASTER RECOVERY USING EMC NETWORKER

More Comprehensive Digital Intelligence - CorrelaFng Client and Server- side Data

How To Use Splunk For Android (Windows) With A Mobile App On A Microsoft Tablet (Windows 8) For Free (Windows 7) For A Limited Time (Windows 10) For $99.99) For Two Years (Windows 9

Preface Introduction... 1 High Availability... 2 Users... 4 Other Resources... 5 Conventions... 5

Workflow ProducCvity in Splunk Enterprise

Windows Inputs and MicrosoC Apps Strategy

EISOO AnyBackup 5.1. Detailed Features

Splunk Enterprise in the Cloud Vision and Roadmap

CA ARCserve and CA XOsoft r12.5 Best Practices for protecting Microsoft Exchange

How to Leverage Splunk s Security Intelligence PlaKorm for Security OperaNons Environments

Westek Technology Snapshot and HA iscsi Replication Suite

Stream Deployments in the Real World: Enhance Opera?onal Intelligence Across Applica?on Delivery, IT Ops, Security, and More

Storage and Disaster Recovery

Copyright 2015 Splunk Inc. Go Big or Go Home. Sean Delaney Specialist SE Mustafa Ahamed Director, Product Management

Analyzing Big Data with Splunk A Cost Effective Storage Architecture and Solution

HADOOP MOCK TEST HADOOP MOCK TEST I

A SURVEY OF POPULAR CLUSTERING TECHNOLOGIES

From the Datacenter to the Dean s office

Configure SQL database mirroring

The Microsoft Large Mailbox Vision

Creating a Cloud Backup Service. Deon George

This article Includes:

Intellicus Enterprise Reporting and BI Platform

Veeam Backup and Replication Architecture and Deployment. Nelson Simao Systems Engineer

Business Process Desktop: Acronis backup & Recovery 11.5 Deployment Guide

The Art of High Availability

Backup, Restore, High Availability, and Disaster Recovery for Microsoft SharePoint Technologies

MTA Course: Windows Operating System Fundamentals Topic: Understand backup and recovery methods File name: 10753_WindowsOS_SA_6.

Disaster Recovery Remote off-site Storage for single server environment

ABSTRACT. February, 2014 EMC WHITE PAPER

SOFTWARE-DEFINED STORAGE IN ACTION

High Availability and Disaster Recovery Solutions for Perforce

Dell PowerVault DL2200 & BE 2010 Power Suite. Owen Que. Channel Systems Consultant Dell

SharePoint Capacity Planning Balancing Organiza,onal Requirements with Performance and Cost

Integrating Data Protection Manager with StorTrends itx

WHITE PAPER: DATA PROTECTION. Veritas NetBackup for Microsoft Exchange Server Solution Guide. Bill Roth January 2008

USER GUIDE CLOUDME FOR WD SENTINEL

Expert. Briefing. \\\\ Best Practices for Managing Storage with Hyper-V

Introduction to Data Protection: Backup to Tape, Disk and Beyond. Michael Fishman, EMC Corporation

Deploy App Orchestration 2.6 for High Availability and Disaster Recovery

Backup and Restore with 3 rd Party Applications

Technical Notes. EMC NetWorker Performing Backup and Recovery of SharePoint Server by using NetWorker Module for Microsoft SQL VDI Solution

NetApp OnCommand Plug-in for VMware Backup and Recovery Administration Guide. For Use with Host Package 1.0

Availability and Disaster Recovery: Basic Principles

Exchange Data Protection: To the DAG and Beyond. Whitepaper by Brien Posey

Acronis Backup & Recovery Backing Up Microsoft Exchange Server Data

EMC VIPR SRM: VAPP BACKUP AND RESTORE USING EMC NETWORKER

Protecting the Microsoft Data Center with NetBackup 7.6

IBM TSM DISASTER RECOVERY BEST PRACTICES WITH EMC DATA DOMAIN DEDUPLICATION STORAGE

Building Storage Service in a Private Cloud

External Data Connector (EMC Networker)

Using Live Sync to Support Disaster Recovery

Stretching A Wolfpack Cluster Of Servers For Disaster Tolerance. Dick Wilkins Program Manager Hewlett-Packard Co. Redmond, WA dick_wilkins@hp.

How to setup NovaBACKUP DataCenter to backup data to Amazon S3 using Amazon s AWS Storage Gateway

Gladinet Cloud Backup V3.0 User Guide

Acronis Backup & Recovery 11.5

CTERA Cloud Onramp for IBM Tivoli Storage Manager

Backup & Disaster Recovery Options

Enhancements of ETERNUS DX / SF

Building your Server for High Availability and Disaster Recovery. Witt Mathot Danny Krouk

Advanced High Availability Architecture. White Paper

Appendix A Core Concepts in SQL Server High Availability and Replication

USER GUIDE CLOUDME FOR WD SENTINEL

Using Drobo for Onsite & Offsite backup with Carbon Copy Cloner

CommVault Simpana Archive 8.0 Integration Guide

Disaster Recovery (DR) Planning with the Cloud Desktop

Universal Backup Device with

WOS Cloud. ddn.com. Personal Storage for the Enterprise. DDN Solution Brief

VMware vcloud Air - Disaster Recovery User's Guide

INSIDE. Preventing Data Loss. > Disaster Recovery Types and Categories. > Disaster Recovery Site Types. > Disaster Recovery Procedure Lists

Creating A Highly Available Database Solution

High Availability with Windows Server 2012 Release Candidate

Actifio Big Data Director. Virtual Data Pipeline for Unstructured Data

High Availability Essentials

VMware vsphere Data Protection Evaluation Guide REVISED APRIL 2015

Acronis Backup & Recovery for Mac. Acronis Backup & Recovery & Acronis ExtremeZ-IP REFERENCE ARCHITECTURE

Deploying Splunk on Amazon Web Services

Cloud Attached Storage

Acronis Backup & Recovery 11

Configuring High Availability for VMware vcenter in RMS Distributed Setup

Intellicus Cluster and Load Balancing- Linux. Version: 7.3

Blackboard Managed Hosting SM Disaster Recovery Planning Document

Mobile App User's Guide

Connectivity. Alliance Access 7.0. Database Recovery. Information Paper

activecho Driving Secure Enterprise File Sharing and Syncing

FileDrawer An Enterprise File Sharing and Synchronization (EFSS) solution.

Transcription:

Copyright 2013 Splunk Inc. Architec;ng Splunk for High Availability and Disaster Recovery Dritan Bi;ncka Professional Services #splunkconf

Legal No;ces During the course of this presenta;on, we may make forward- looking statements regarding future events or the expected performance of the company. We cau;on you that such statements reflect our current expecta;ons and es;mates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward- looking statements, please review our filings with the SEC. The forward- looking statements made in this presenta;on are being made as of the ;me and date of its live presenta;on. If reviewed auer its live presenta;on, this presenta;on may not contain current or accurate informa;on. We do not assume any obliga;on to update any forward- looking statements we may make. In addi;on, any informa;on about our roadmap outlines our general product direc;on and is subject to change at any ;me without no;ce. It is for informa;onal purposes only and shall not, be incorporated into any contract or other commitment. Splunk undertakes no obliga;on either to develop the features or func;onality described or to include any such feature or func;onality in a future release. Splunk, Splunk>, Splunk Storm, Listen to Your Data, SPL and The Engine for Machine Data are trademarks and registered trademarks of Splunk Inc. in the United States and other countries. All other brand names, product names, or trademarks belong to their respeccve owners. 2013 Splunk Inc. All rights reserved. 2

About Me! Member of Professional Services! Several mul;- TB deployments! 50+ projects/customers! Third.conf 3

Agenda Problems! Disaster recovery Recover in the event of a disaster! High availability Data collection Searching Indexing! Top takeaways Maintain an acceptable level of continuous service 4

Disaster Recovery (DR)

What is Disaster Recovery? Set of processes necessary to ensure recovery of service after a disaster 6!

Disaster Recovery Steps 1 Backup necessary data Local backup vs. remote Backup to a medium at least as resilient as source 2 Restore Ensure this works Backup is worthless without restore 7!

Backup 1 a Configurations $SPLUNK_HOME/etc/* b Indexes Buckets: Hot*, warm, cold, frozen 8!

Backup Configurations Splunk instance $SPLUNK_HOME/etc/* 9

Backup: Bucket Lifecycle Events [Hot bucket is full] [Too many warms] [Out of space or bucket is old] $ Home path $ Cold path [Cheaper storage] [Explicit user action] $ Thawed path $ Frozen path or deleted 10!

Backup Indexes Bucket type State Can backup? Read + write No* Read only Yes Read only Yes 11

What if your backup fails? 12

13

Restore Configurations Splunk instance New Splunk instance $SPLUNK_HOME/etc/* $SPLUNK_HOME/etc/* 14

Restore Data Splunk instance New Splunk instance $Indexes_Location ($SPLUNK_HOME/var/lib/splunk) $Indexes_Location ($SPLUNK_HOME/var/lib/splunk) 15

Putting Restore Together a New Splunk instance 2 b Configurations C Indexes 16!

Things to Think About:! Recovery )me vs. complexity and cost Ex. Backup to tape vs. warm standby! Tolerable loss vs. complexity and cost Ex. Backup warm/cold buckets vs. mirror hot bucket writes 17!

High Availability (HA)

What is High Availability? A design methodology whereby a system is continuously operational, bounded by a set of predetermined tolerances Note: high availability!= complete availability 19!

Splunk High Availability 1 Data collection 2 Searching 3 Indexing 20!

Data Collection A Indexers... B Forwarder Forwarder... Forwarder 21!

Data Collection A Indexers... B outputs.conf:!! [tcpout]! defaultgroup = mygroup!! [tcpout:mygroup]! server = A:9997, B:9997! autolb = true!... Forwarder Forwarder Forwarder 22!

Searching 2 a b Multiple, Independent Search Heads Splunk Native Search Head Pooling 23!

Searching Typical search hierarchy Indexer A Indexer B...! Indexer N 24!

Searching: Independent SH Typical search hierarchy Indexer A Indexer B...! Indexer N 25!

Searching: Independent SH Search head A Search head B Sync configs Sync job artifacts Independent search heads Indexer A Indexer B...! Indexer N 26!

Search Head Pooling 27!

Searching: Pooling Use NFS to share - Configurations - Job artifacts Search head A Search head B NFS Indexer A Indexer B...! Indexer N 28!

Searching: Pooling Use NFS to share - Configurations - Job artifacts Search head A Search head B NFS Indexer A Indexer B...! Indexer N 29!

Steps to SHP 1 Configure the shared location (NFS/SMB) 2 3 Enable pooling on each (stopped) search head./splunk pooling enable <path_to_share> Copy $SPLUNK_HOME/etc/apps and $SPLUNK_HOME/etc/users to $path_to_share/etc/ 4 Start all search heads 30!

Indexing a Reliable storage (SAN) 3 b Mirrored indexers C Index replication 31!

Indexing: Reliable Storage Indexer A Use the underlying SAN infrastructure to provide HA SAN 32!

Indexing: Reliable Storage Indexer A When indexer A fails, mount SAN to another indexer instance SAN Indexer B 33!

Indexing: Mirrored Indexers Search head Primary indexers X.0 index (locally) and forward to their respective secondaries X.1 Indexer A.0 Indexer B.0 Indexer A.1 Indexer B.1 34!

Indexing: Mirrored Indexers Search head Point search head to mirrored indexer B.1 Indexer A.0 Indexer B.0 Indexer A.1 Indexer B.1 35!

Index Replication

Indexing: Index Replication! Introduced with Splunk 5! Cluster = a group of search peers (indexers) that replicate each others' data! Mul;ple copies of all data exist throughout the cluster! The process that achieves this is known as index replica;on! Bucket based replica;on i.e. the unit of replication is a bucket 37!

Index Replication Benefits! Data availability. An indexer is always available to handle incoming data, and the indexed data is always available for searching! Disaster tolerance. System can tolerate downed indexers without losing data or losing access to data 38!

Index Replication Trade Offs! Extra storage! To a lesser degree, increased processing load More disaster tolerance (i.e., more copies of data) requires higher storage requirements Governed the replication factor (RF) attribute 39!

3 node Cluster Replication factor of 3 40

Ok guys, pair up in threes! Yogi Berra 41

Cluster Components! Master node Orchestrates replication process Informs the SH where to find data Helps manage peer configurations! Peer nodes Receive and index data (normal peer behavior) Replicate data to/from other peers Peer nodes number RF 42!

Cluster Components! At least one search head Must use one to manage searches across the cluster Can t run searches directly on an indexer, as you can with nonclustered indexers! Auto load- balanced forwarders Capable of indexer acknowledgement ê Ensuring data gets reliably indexed Note that other input methods can be used as well, but no indexer acknowledgement is available 43!

Cluster Concepts: RF! Replica;on factor :: RF defaults to 3 Number of copies of data in the cluster Cluster can tolerate RF-1 node failures! Cluster can tolerate 2 node failures 44!

Cluster Concepts: SF! Searchable factor :: SF defaults to 2 Number of searchable copies the cluster maintains SF RF Requires more storage and more processing than replicated copies Replicated copy vs. searchable copy SF = 1 vs. SF = 2 45!

Clustered Indexing! Origina;ng peer node streams copies of data to other clustered peers They store those copies in their own buckets Some peers may also index the copy! Master determines which peer nodes get replicated data Currently not configurable! Master manages all peer- to- peer interac;ons! Master keeps track of which peers have searchable data Ensures that there are always SF copies of searchable data available When a peer goes down, the master coordinates remedial activities 46!

Clustered Searching! Search head coordinates all searches in the cluster! SH relies on master to tell it who its peers are! Only one replicated bucket is searchable (primary) Searches occur over primary buckets, only! Primary buckets (searchable) may change over ;me Peers know their status and therefore know where to search 47!

Putting HA All Together NFS! Search head pooling HA Index replication or SAN based mirrorring..!..! Forwarders autolb HA 48!

Top Takeways! DR Process of backing-up and restoring service in case of disaster Configuration files Copy of $SPLUNK_HOME/etc/ folder Indexed data Backup and restore buckets ê Hot, warm, cold, frozen ê Can t backup hot but can safely backup warm and cold! HA Continuously operational system bounded by a set of tolerances Data collection ê Autolb from forwarders to multiple indexers ê Use indexer acknowledgement to protect in flight data Searching ê Search head pooling (SHP) Indexing ê Use index replication ê Highly reliable storage, mirrored set of indexers 49!

Questions?

Next Steps 1 2 Download the.conf2013 Mobile App If not iphone, ipad or Android, use the Web App Take the survey & WIN A PASS FOR.CONF2014 Or one of these bags! 3 Go to the sessions listed on the next slide 51

More Info Other Sessions You May Like:! Architec)ng and Sizing Your Splunk Deployments Brera 2&3, Level 3 Today, 3-4pm! Planning and Execu)on for Successful Deployments Brera 2&3, Level 3 October 3rd, 10:15-11:15am 52