Architec:ng and Sizing your Splunk Deployment

Similar documents
Architec;ng Splunk for High Availability and Disaster Recovery

Hunk & Elas=c MapReduce: Big Data Analy=cs on AWS

Architec;ng Splunk for High Availability and Disaster Recovery

Deploying Splunk on Amazon Web Services

How To Use Splunk For Android (Windows) With A Mobile App On A Microsoft Tablet (Windows 8) For Free (Windows 7) For A Limited Time (Windows 10) For $99.99) For Two Years (Windows 9

Stream Deployments in the Real World: Enhance Opera?onal Intelligence Across Applica?on Delivery, IT Ops, Security, and More

Copyright 2015 Splunk Inc. Go Big or Go Home. Sean Delaney Specialist SE Mustafa Ahamed Director, Product Management

Splunk for Networking and SDN

Analyzing Big Data with Splunk A Cost Effective Storage Architecture and Solution

Splunk implementa-on. Our experiences throughout the 3 year journey

Hardware/Software Guidelines

BENCHMARKING V ISUALIZATION TOOL

Best PracBces: Deploying Splunk on Physical, Virtual, and Cloud Infrastructure

Tableau Server Scalability Explained

DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION

Accelera'ng Your Solu'on Development with Splunk Reference Apps

Splunk Enterprise in the Cloud Vision and Roadmap

Quantum StorNext. Product Brief: Distributed LAN Client

SQL Server Virtualization 101. David Klee, Group Principal and Practice Lead. SQL PASS Virtualization VC,

Tips and Tricks for Using Oracle TimesTen In-Memory Database in the Application Tier

CompTIA Cloud+ 9318; 5 Days, Instructor-led

PARALLELS CLOUD STORAGE

CompTIA Cloud+ Course Content. Length: 5 Days. Who Should Attend:

BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB

Cloud Computing through Virtualization and HPC technologies

Incident Response Using Splunk for State and Local Governments

Scaling Database Performance in Azure

Keeping Splunk in Check: Tools to BeGer Manage Your Investment

HP ProLiant BL660c Gen9 and Microsoft SQL Server 2014 technical brief

Pervasive PSQL Vx Server Licensing

IOmark- VDI. HP HP ConvergedSystem 242- HC StoreVirtual Test Report: VDI- HC b Test Report Date: 27, April

Technology Insight Series

Symantec Endpoint Protection 11.0 Architecture, Sizing, and Performance Recommendations

HP Proliant BL460c G7

Tuning Tableau Server for High Performance

WHITE PAPER. Software Defined Storage Hydrates the Cloud

Texas Digital Government Summit. Data Analysis Structured vs. Unstructured Data. Presented By: Dave Larson

Passwords are for Chumps

VMware Virtual SAN Backup Using VMware vsphere Data Protection Advanced SEPTEMBER 2014

SharePoint Capacity Planning Balancing Organiza,onal Requirements with Performance and Cost

Data Center Evolu.on and the Cloud. Paul A. Strassmann George Mason University November 5, 2008, 7:20 to 10:00 PM

Zadara Storage Cloud A

Cloud Storage. Parallels. Performance Benchmark Results. White Paper.

Introduction 1 Performance on Hosted Server 1. Benchmarks 2. System Requirements 7 Load Balancing 7

Boas Betzler. Planet. Globally Distributed IaaS Platform Examples AWS and SoftLayer. November 9, IBM Corporation

Best Practices for Virtualised SharePoint

An Oracle White Paper July Oracle Primavera Contract Management, Business Intelligence Publisher Edition-Sizing Guide

MaxDeploy Ready. Hyper- Converged Virtualization Solution. With SanDisk Fusion iomemory products

How To Create A Desktop Computer From A Computer Or Mouse And Keyboard (For Business)

Milestone Solution Partner IT Infrastructure MTP Certification Report Scality RING Software-Defined Storage

Scalability Tuning vcenter Operations Manager for View 1.0

Oracle Primavera P6 Enterprise Project Portfolio Management Performance and Sizing Guide. An Oracle White Paper October 2010

Server & Application Monitor

Scaling out a SharePoint Farm and Configuring Network Load Balancing on the Web Servers. Steve Smith Combined Knowledge MVP SharePoint Server

STORAGE CENTER WITH NAS STORAGE CENTER DATASHEET

Selling Compellent NAS: File & Block Level in the Same System Chad Thibodeau

Virtualizing Exchange

The last 18 months. AutoScale. IaaS. BizTalk Services Hyper-V Disaster Recovery Support. Multi-Factor Auth. Hyper-V Recovery.

MED 0115 Optimizing Citrix Presentation Server with VMware ESX Server

Stingray Traffic Manager Sizing Guide

my forecasted needs. The constraint of asymmetrical processing was offset two ways. The first was by configuring the SAN and all hosts to utilize

Increasing Storage Performance

Performance characterization report for Microsoft Hyper-V R2 on HP StorageWorks P4500 SAN storage

VDI Without Compromise with SimpliVity OmniStack and VMware Horizon View

Scaling Analysis Services in the Cloud

Planning Domain Controller Capacity

TheraDoc v4.6.1 Hardware and Software Requirements

Protect Data... in the Cloud

Virtual SAN Design and Deployment Guide

Data Management in the Cloud: Limitations and Opportunities. Annies Ductan

FusionHub Virtual Appliance

Initial Hardware Estimation Guidelines. AgilePoint BPMS v5.0 SP1

Virtualizing SQL Server 2008 Using EMC VNX Series and Microsoft Windows Server 2008 R2 Hyper-V. Reference Architecture

On- Prem MongoDB- as- a- Service Powered by the CumuLogic DBaaS Platform

Recommendations for Performance Benchmarking

Dell Compellent Storage Center SAN & VMware View 1,000 Desktop Reference Architecture. Dell Compellent Product Specialist Team

Architecting ColdFusion For Scalability And High Availability. Ryan Stewart Platform Evangelist

Characterize Performance in Horizon 6

Cisco Prime Home 5.0 Minimum System Requirements (Standalone and High Availability)

How To Compare The Economics Of A Database To A Microsoft Database

XenDesktop 7 Database Sizing

Maximum performance, minimal risk for data warehousing

Business Con*nuity with Docker

EXECUTIVE SUMMARY CONTENTS. 1. Summary 2. Objectives 3. Methodology and Approach 4. Results 5. Next Steps 6. Glossary 7. Appendix. 1.

Traditional v/s CONVRGD

VNX HYBRID FLASH BEST PRACTICES FOR PERFORMANCE

Deploying Microsoft Exchange Server 2010 on the Hitachi Adaptable Modular Storage 2500

Overview of I/O Performance and RAID in an RDBMS Environment. By: Edward Whalen Performance Tuning Corporation

Toolbox 4.3. System Requirements

HADOOP ON ORACLE ZFS STORAGE A TECHNICAL OVERVIEW

Transcription:

Copyright 2014 Splunk Inc. Architec:ng and Sizing your Splunk Deployment Simeon Yep Sr. Manager BD Tech Serv, Splunk Karandeep Bains Sr. Sales Engineer, Splunk

Disclaimer During the course of this presenta:on, we may make forward- looking statements regarding future events or the expected performance of the company. We cau:on you that such statements reflect our current expecta:ons and es:mates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward- looking statements, please review our filings with the SEC. The forward- looking statements made in the this presenta:on are being made as of the :me and date of its live presenta:on. If reviewed aser its live presenta:on, this presenta:on may not contain current or accurate informa:on. We do not assume any obliga:on to update any forward- looking statements we may make. In addi:on, any informa:on about our roadmap outlines our general product direc:on and is subject to change at any :me without no:ce. It is for informa:onal purposes only, and shall not be incorporated into any contract or other commitment. Splunk undertakes no obliga:on either to develop the features or func:onality described or to include any such feature or func:onality in a future release. 2

Objec:ve Show you how to build a robust and scalable Splunk deployment 3

Introduc:on

About Simeon! 6+ years @ Splunk! Experience: Suppor:ng, administering, and architec:ng large scale deployments OEM, technical sales Strategic Accounts, technical sales! Based in HQ (San Francisco office)! Currently: Business Development, Technical Synergies 5

About Deep! 7+ years @ Splunk! Experience: Suppor:ng, administering, and architec:ng large scale deployments OEM, technical sales Strategic Accounts, technical sales! Based in HQ (San Francisco office)! Currently Business Development, OEM 6

Agenda! Sizing fundamentals! Architec:ng fundamentals! Deployment topologies 7

Sizing Fundamentals

! Understand the sizing factors! Data volume! Search volume Sizing Fundamentals

! How much data (raw sizes)? Daily volume Peak volume Retained volume (archive size) Future volume?! How much searching? Use cases How many people? How osen? Apps! Jobs Summariza:on, aler:ng, repor:ng Sizing Factors

Data Volumes! Es:mate input volume Verify raw log sizes Leverage _internal metrics to get actual input volumes! Confirm es:mates with actual data Create a baseline with real or simulated data Find compression rates (range from 30%- 120%, typically 50%) Determine reten:on needs Clustering needs! Document use cases Use case determines search needs Plan for expansion as adop:on grows (search and volume)

! Via filesystem Data Sizing Exercise! Use the Splunk log files: metrics.log or license_usage.log! Recommended: Introspec:on data and dashboards in 6.2

Screen Shot 2014-09- 16 at 3.05.42 PM

Search Volumes! Gather use case informa:on How much ad- hoc searching? How much background searching?! Ad- hoc searching Evaluate the data being searched Evaluate the :me dura:on (real- :me vs historic) Real- :me searches are typically less overhead! Background searching Aler:ng and monitoring General reports Summary indexing

Search Volume Exercise! Use the Splunk log files: audit.log! Recommended: Introspec:on data and dashboards in 6.2

! Data capacity Daily and peak! User capacity Concurrent and total! Search capacity Concurrent and total Sizing Fundamentals *Document the use cases!!

Architecture Fundamentals

Architecture Fundamentals! Splunk server roles: distributed/clustered deployments! Reference server! Rules of thumb! Hardware factors

Splunk Distributed Roles LM DS Search Head (Search Head Captain) License Master Deployment Server CM Indexer Cluster Master Forwarders

Recommended Configura:ons Standalone Indexer (Distributed) Search Head (Distributed) Indexer (Clustered) Search Head (Clustered) Cluster Master (Clustered) Forwarder * * * * Searching * Indexing * Deployment Server * * License Master * * Cluster Master Search Head Captain common * uncommon 20

What's a "Search Head Reference" Server?! Sizing based on commodity x86 servers 64bit! 4 x quad- core CPUs at 2.0 GHz! 12 GB of RAM (16 GB is common)! 64- bit OS! 2x10k RPM local SAS drives in RAID 1! Varia:ons cause corresponding changes in performance/ requirements

What's an "Indexer Reference" Server?! Sizing based on commodity x86 servers 64bit! 2 x six- core CPUs at 2.0 GHz! 12 GB of RAM (16 GB is common)! 64- bit OS! Local or avached storage (1200+ IOPs)! Varia:ons cause corresponding changes in performance/ requirements

Rules of Thumb! These all have excep:ons and qualifica:ons! 1 Reference indexer per 250 GB/day! 1 Reference search head per 20-40 jobs! 1 Deployment server per 3,000 polls/min! Replica:on later.

How Many Indexers?! Rule of thumb says: 1 per 250 GB/day! Leaves room for: Daily peaks Light searching and repor:ng for about 5 concurrent users! Need more indexers for: Heavy repor:ng More users Slower disks, slower CPUs, fewer CPUs

How Many Search Heads?! Rule of thumb says: 1 per 20 40 concurrent jobs! Limit is concurrent queries! Search Query may u:lize up to 1 CPU core! Only add first search head if 3 indexers! Don t add search heads - add indexers. Indexers do most work! But you need more if: Running a lot of scheduled jobs on the search head

How Many Deployment Servers?! Rule of thumb says: 1 per 3000 polls/minute! Just use one deployment server, and adjust the polling period! Small deployments can share the same Splunkd! Low requirement for disk performance (good candidate for virtualiza:on)! Windows OS 1 per 500 polls/minute! Or use something other than deployment server puppet, SCCM

More is Bever?! CPUs Search process u:lizes up to 1 CPU core Indexers s:ll need to do the heavy lising (search exists on indexer AND search head) Limited benefit for indexing (up to 4 CPU cores for indexing)! Memory Good for search heads and indexers (16+ GB)! Disks Faster is bever (15k rpm) or SSD More disks in RAID 1+0 = Faster SSDs can provide benefit for rare term searches and many concurrent jobs

Performance and Sizing Tips System Change Search Speed Indexing Speed Faster disks ++ ++ Add an indexer ++ ++ Add a search head + Report accelera:on/summaries ++

Performance and Sizing Tips System change Search Speed Indexing Speed Op:mize searches +++ Op:mize field extrac:on + Op:mize input parsing + Faster CPU + +

Capacity Architecture! Sizing recipe Capacity Rules of thumb determines number of servers! Building blocks for architecture

Architecture Factors! What are my sizing requirements?! Where is the data?! Where are the users?! What is the security policy?! What are the reten:on and compliance policies?! What is the availability requirement?! What about the cloud?

Architecture Factors! What are my sizing requirements? Data capacity Search capacity User capacity! Obtained from the sizing process

Architecture Factors! Where is the data? Local or remote to the indexing machine If remote use forwarders when possible Index in local data center (zone) or index centrally Persist network data to disk as a best prac:ce Use intermediate forwarders to distribute data! Where are the users? User experience affected by search head loca:on ê Time zone tuning ê Distributed search over LAN vs WAN

! What is the security policy? Apply user security policies ê Auth method ê Roles ê Filters Apply physical security policies ê Index loca:on Architecture Factors

Architecture Factors! Reten:on, compliance, governance Where is the data allowed to be? Where is the data not allowed to go? Where must the data go?! Availability Local failover, fault- tolerance, clustering Geographic disaster recovery/fault- tolerance Index replica:on!

! Cloud Considera:ons Authen:ca:on restric:ons Data transfer costs Security SSL Tunnel Zones Architecture Factors

Architecture Topologies! What are my sizing requirements?! Where is the data?! Where are the users?! What is the security policy?! What are the reten:on and compliance policies?! What is the availability requirement?! What about the cloud?

Topologies

Architecture Factors Topology! Topology Examples Centralized Decentralized Hybrid Index replica:on Search head clustering

Centralized Topology Search Head Clustering Indexers Forwarders Intermediate Forwarder Forwarders Syslog Devices

Decentralized Topology DC1 DC2 Search Head Clustering DC3 DC4 HQ

DC1 Hybrid Topology HQ DC2 DC3 DC4 4

Index Replica:on (aka Clustering)! What is it? Indexes are replicated to 1 or more indexers (tunable) Splunk cluster master controlled! Basics Master node (manages indexing and searching loca:on) Distributed deployment NOT = Index and Forward! HA vs DR HA - Data is made available on 1 or more indexers in one loca:on DR - All data exists in mul:ple loca:ons

Clustering! Replica:on factor Determine the number of copies of data to maintains! Search factor Determine the number of searchable copies of the data! Data reten:on equa:on General rule of thumb: ê 15% for each RF; 35% for each SF Example: ê 100 GB of raw = 50 GB on disk. ê RF 2; SH 2; ((.15 * 2 RF * 100GB ) + (.35 * 2 SH * 100GB )) = 100 GB

Index Replica:on Reminders! Logically, mul:ple copies of the data Increase in I/O, CPU, and disk requirement Need more Indexers! Increase in search factor vs replica:on factor (rawdata + tsidx) vs. (only rawdata)! Mul:- site replica:on WAN Load Search head affinity

Search Head Clustering (aka NOT SHP)! What is it? Uses RaS protocol Splunk head captain controlled! Basics Ability to group search heads into a cluster in order to provide highly available search services NOT NFS based Replica:on using local storage! How does it work? Group search heads into a cluster A captain gets elected dynamically User created reports/dashboards automa:cally replicated to other search heads

Scaling and Expansion! Add to your indexer pool for more performance or capacity Mixed pla orm and hardware is not recommended! Use search head clustering for more UI capacity Does not requires NFS! Create new indexes for new data types Follows best prac:ces

Final Thoughts! Sizing is more than data volume it s also search load! Centralized architecture is the baseline! Varia:ons on architecture are driven by Sizing Data loca:on User loca:on Reten:on/Access/Governance Availability requirements

More Informa:on! Contact: syep@splunk.com deep@splunk.com! Documenta:on: hvp://docs.splunk.com! Answers: hvp://answers.splunk.com! Other presenta:ons Mul:site Indexer Clustering with Search Affinity 10/9/2014 @ 9:00 am Splunk Search Accelera:on Technologies 10/9/2014 @ 10:30 am

THANK YOU