Copyright 2014 Splunk Inc.
Deploying Splunk on Amazon Web Services
Simeon Yep, Senior Manager, Business Development Technical Services
Roy Arsan, Senior Software Engineer

Disclaimer
During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only, and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release.

Amazon Web Services vs. Everyone Else

Objective: Integrate your Splunk Enterprise deployment with Amazon Web Services (AWS)

Bios
Simeon Yep
- 6+ years @ Splunk
- Roles in: Support, Consulting, Technical Sales
- Currently focused on Partner Ecosystem (including AWS)
Roy Arsan
- 2+ years @ Splunk
- Roles in: Product Engineering, Cloud Architecture

Agenda
- Infrastructure: AWS Elastic Compute Cloud (EC2)
- Deployment Examples
- How to Deploy: AWS CloudFormation
- Apps + Other

AWS EC2 Infrastructure
What is this Amazon stuff?
- Amazon Elastic Compute Cloud (EC2) is a web service that provides resizable compute capacity in the cloud
- Pay only for capacity that you actually use
- Splunk is easily deployed in Amazon

Splunk and Hardware
- Splunk consumes high I/O due to indexing and searching
- Load != GB/day
- Search drives a large portion of the load
  - Rare vs. Sparse vs. Reporting
  - Real-time vs. Historic
- Reference servers can index up to 500 GB/day with no search load
- Virtualized systems incur some overhead, but work well if tuned correctly

Typical User Scenario
1. Sign up for an AWS account (use AWS IAM: Identity and Access Management)
2. Launch an instance (via a tool of your choice, such as the GUI, the CLI, or an external tool)
3. Use key credentials to access the instance
4. Install software/Splunk

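Steps 2 and 3 can be sketched with the AWS CLI and SSH. The snippet below only assembles the command lines and does not run them. The AMI ID, key pair name, and host name are hypothetical placeholders, and the `ec2-user` login assumes an Amazon Linux AMI.

```python
import shlex


def run_instances_cmd(image_id, instance_type, key_name, count=1):
    """Build an `aws ec2 run-instances` command line (step 2)."""
    return [
        "aws", "ec2", "run-instances",
        "--image-id", image_id,
        "--instance-type", instance_type,
        "--key-name", key_name,
        "--count", str(count),
    ]


def ssh_cmd(key_file, host, user="ec2-user"):
    """Build the SSH command that logs in with the key pair (step 3)."""
    return ["ssh", "-i", key_file, f"{user}@{host}"]


# Hypothetical placeholder values, not taken from the presentation.
launch = run_instances_cmd("ami-xxxxxxxx", "m3.2xlarge", "my-splunk-key")
login = ssh_cmd("my-splunk-key.pem", "ec2-host.example.com")

print(shlex.join(launch))
print(shlex.join(login))
```

Keeping the commands as argument lists makes them easy to pass to `subprocess.run` later without shell quoting problems.
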
Instances
- Availability Zones exist within Regions (8 Regions + Gov)
- Amazon Machine Image (AMI)
  - Amazon Linux based
  - Best Performance
  - Cost Effective (extra $$ for Windows)

Instances
- Instance type
  - Pricing: Spot vs. On-demand vs. Reserved
  - Family: Storage vs. Compute vs. GPU vs. Memory vs. General Purpose
  - Generation: Current vs. Previous
- Instance size
  - Workload size: compute units, memory, storage
  - Micro, Small, Medium, Large, Extra Large (XL)
    - Multiple XL sizes: xlarge, 2xlarge, 4xlarge, 8xlarge
  - 4XL general purpose provides similar performance to a reference server
    - 50-150 GB/day indexing and searching

Instance Storage
- Instances have ephemeral storage (Current Gen has SSDs)
  - General Purpose instances have GBs to TBs
  - Storage Optimized instances have up to 48 TB
- Data is lost when the instance dies
- EBS: Elastic Block Store
  - Persistent block-level storage volumes for use with EC2 instances
  - Cost associated: 1 TB costs $50/month, 5 TB costs $250/month
  - Data is not lost when the instance dies; the volume can be remounted on a new instance
- S3: Simple Storage Service
  - Online cloud storage service (files, data, snapshots, etc.)
  - Need this for backup purposes
  - Can also be used as a data feed for Splunk

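The two EBS price points on this slide imply a flat rate of $50 per TB per month. A minimal sketch of that arithmetic, assuming the rate stays linear for other sizes (the 10 TB figure is an extrapolation, not a quoted price):

```python
# Rate implied by the slide: 1 TB for $50/month and 5 TB for $250/month.
EBS_USD_PER_TB_MONTH = 50.0


def ebs_monthly_cost(size_tb, usd_per_tb_month=EBS_USD_PER_TB_MONTH):
    """Return the monthly EBS cost in USD for a volume of size_tb terabytes."""
    if size_tb < 0:
        raise ValueError("size_tb must be non-negative")
    return size_tb * usd_per_tb_month


for size_tb in (1, 5, 10):
    print(f"{size_tb} TB: ${ebs_monthly_cost(size_tb):.2f}/month")
```
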
Storage Summary
- For single instances or non-replicated distributed deployments:
  - Use EBS volumes in RAID 1+0 for indexes, RAID 0 for OS/software
  - Software RAID will consume CPU
  - Use snapshots to back up the instance (S3)
  - IOPS-optimized volumes can provide some benefits
  - XFS preferred (customer feedback)
- Warming
  - Doesn't have to do with datacenter temperature
  - Reduces the first-write performance hit
  - Noticeable improvements in performance when performed on ephemeral storage
  - EBS volumes created from snapshots also benefit from warming

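The RAID levels named above trade capacity for redundancy: RAID 0 stripes across every volume, while RAID 1+0 mirrors pairs and then stripes, so only half the raw capacity is usable. A small sketch of that calculation; the volume counts and sizes are hypothetical examples.

```python
def usable_tb(level, volumes, volume_tb):
    """Usable capacity in TB of an array built from equal-sized volumes."""
    if level == "0":
        if volumes < 2:
            raise ValueError("RAID 0 needs at least 2 volumes")
        return volumes * volume_tb  # striping only, no redundancy
    if level == "1+0":
        if volumes < 4 or volumes % 2:
            raise ValueError("RAID 1+0 needs an even number of volumes, 4 or more")
        return volumes * volume_tb / 2  # every block is stored twice
    raise ValueError(f"unsupported RAID level: {level}")


# Hypothetical layout: four 1 TB EBS volumes for indexes, two 0.5 TB for OS/software.
print("index array (RAID 1+0):", usable_tb("1+0", 4, 1.0), "TB usable")
print("OS array (RAID 0):", usable_tb("0", 2, 0.5), "TB usable")
```
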
Instance Selection
- How can I make my deployment resilient?
  - Option 1: RAID 1+0 at the storage layer + EBS (was the preferred setup)
  - Option 2: Index Replication
  - Option 3: Data Cloning (Index and Forward, HA license required)
- Instance selection should factor in resiliency, use case, and cost
- Index Replication FTW (?)
  - Factoring in most common retention needs, you may need large EBS volumes and/or double the instances to be resilient (maybe an HA license as well)
  - Replication requires more instances, but does not require EBS
  - IR cost is driven by per-instance cost

Instance Selection
- 1 TB/day deployment example
  - EBS-backed storage for availability
  - No replication

Instance Selection
- 1 TB/day deployment cost comparison
- Overall cost is equivalent when EBS retention is 211 days (vs. 960)
- Index Replication offers immediate search capability with SF/RF

Instance Selection: Distributed Deployments
Using Index Replication (IR)
- Local ephemeral storage (SSDs) may perform better than EBS
- Search/Replication Factor determines availability of data for searching
- IR adds load and requires more servers and storage
Using EBS volumes, no IR
- Typically fewer instances to manage vs. IR
- Search availability is driven by the capability to remount a volume to a new instance (automatically or manually)
- Cost can be largely driven by retention and daily volume

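To see why IR "requires more servers and storage" while the EBS approach is driven by "retention and daily volume", here is a rough sizing sketch. The 15% rawdata and 35% index-file ratios are a commonly quoted rule of thumb and an assumption here, not figures from the slides; the 90-day retention and the RF/SF values are hypothetical.

```python
# Assumed rule-of-thumb ratios (not from the slides): compressed rawdata
# takes about 15% of the original volume and index files about 35%.
RAWDATA_RATIO = 0.15
INDEX_RATIO = 0.35


def storage_no_replication_gb(daily_gb, retention_days):
    """Disk needed when each bucket exists once (EBS-backed, no IR)."""
    return daily_gb * retention_days * (RAWDATA_RATIO + INDEX_RATIO)


def storage_with_replication_gb(daily_gb, retention_days, rf, sf):
    """Cluster-wide disk: rf copies of rawdata, sf copies of index files."""
    if not 1 <= sf <= rf:
        raise ValueError("need 1 <= search factor <= replication factor")
    per_day_gb = daily_gb * (RAWDATA_RATIO * rf + INDEX_RATIO * sf)
    return per_day_gb * retention_days


daily_gb, days = 1000, 90  # 1 TB/day; the 90-day retention is hypothetical
print("no IR:", round(storage_no_replication_gb(daily_gb, days)), "GB")
print("IR, RF=3 SF=2:", round(storage_with_replication_gb(daily_gb, days, rf=3, sf=2)), "GB")
```

The replicated total is spread across all peers, so it sizes the cluster as a whole rather than any single instance.
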
Best Practices
- Custom AMI creation
  - Create your own AMI from a Linux-based or Splunk-provided image
  - Leverage current configuration tooling with the AMI (you don't have to use deployment server, but it can be very helpful)
- Authentication and Authorization
  - Policies will dictate what you can or cannot use
  - LDAP/AD will require an SSL tunnel
  - Other options: scripted input or proxying (SSO), SAML (Okta)
- Security
  - SSL everywhere + private network
  - Install your own certificates

Best Practices
- Search Head Pools
- Deploy to the same Availability Zone
  - Replication and searches across Regions and AZs can be a challenge
- Monitor from outside of the Region/AZ
  - Offers additional resiliency
- Use a Virtual Private Cloud (VPC)

Best Practices (Rewrite)
- Configuration and Software Management
  - Use the tools you are most familiar with
  - Chef and Puppet content publicly available
- Deployment server usage
  - Effective for controlling Splunk configuration (only)
- Use CloudFormation
  - Allows for easy and quick deployment
  - Great starting point for large deployments (See Appendix A)

General Guidelines
Follow Best Practices for Architecting and Sizing: Load = Searching + Indexing
Indexers (50-150 GB/day)
- m3.2xlarge: 8 vCPU, 30 GB RAM
- i2.4xlarge: 16 vCPU, 122 GB RAM
- hs1.8xlarge: 16 vCPU, 117 GB RAM
*These are all starting points. Splunk can index and search more OR less depending on overall load
Search Heads (8+ users)
- c3.2xlarge: 8 vCPU, 15 GB RAM
- c3.4xlarge: 32 vCPU, 60 GB RAM
Cluster Master or Deployment Server
- m3.xlarge: 4 vCPU, 15 GB RAM
- c3.2xlarge: 8 vCPU, 15 GB RAM
License Master
- m3.large: 2 vCPU, 7.5 GB RAM
- m3.xlarge: 4 vCPU, 15 GB RAM

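The 50-150 GB/day per-indexer guideline translates into a range of indexer counts for a given daily volume. A minimal sketch of that arithmetic; as the slide notes, these are starting points, and search load moves the real number within (or outside) the range.

```python
import math

# Per-indexer guideline from the slide, in GB/day.
MIN_GB_PER_INDEXER = 50
MAX_GB_PER_INDEXER = 150


def indexer_count_range(daily_gb):
    """Return (fewest, most) indexers suggested for a daily indexing volume."""
    if daily_gb <= 0:
        raise ValueError("daily_gb must be positive")
    fewest = math.ceil(daily_gb / MAX_GB_PER_INDEXER)  # light search load
    most = math.ceil(daily_gb / MIN_GB_PER_INDEXER)    # heavy search load
    return fewest, most


for volume_gb in (100, 500, 1000):
    fewest, most = indexer_count_range(volume_gb)
    print(f"{volume_gb} GB/day: {fewest} to {most} indexers")
```
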
Architecture & Deployment Examples
Architecture Examples
- Centralized
- Decentralized
- Hybrid
- Centralized with Index Replication

Centralized Topology
[Diagram: Search Pooling, Indexers, Intermediate Forwarder, Forwarders, Syslog Devices]

Decentralized Topology
[Diagram: Search Pooling]

Hybrid Topology

Index Replication with Search Pooling
[Diagram: Cluster Master, Search Pool, Peer Nodes, Forwarders]

Deployment Examples
- Deployment A
- Deployment B

Deployment A
- Use Case: Searching, Reporting and Analytics
- Capable of 1-100+ GB/day indexing
- m3.2xlarge instance
  - High value for CPU (8 vCPU, 30 GB RAM)
  - Previously used c1.xlarge (8 vCPU, 7 GB RAM)
- RAID 1+0 across 4 EBS volumes
- 16 concurrent users

Deployment B
- Use Case: Application Management, Security Forensics
- Capable of 500 GB/day indexing
- Distributed deployment with Index Replication (SF 2, RF 3)
- 3 hs1.8xl instances with 49 TB ephemeral storage (indexers)
- c1.xlarge instance (search head)
- Leveraging AWS API for instance management

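The search factor and replication factor in Deployment B determine how many indexer failures the cluster can absorb. The sketch below follows the usual reading of the two factors (a cluster keeps a full copy of the data through RF - 1 peer failures and a searchable copy through SF - 1); the function and key names are illustrative only.

```python
def cluster_tolerance(peers, replication_factor, search_factor):
    """Summarize peer-failure tolerance for an index replication cluster."""
    if not 1 <= search_factor <= replication_factor <= peers:
        raise ValueError("need 1 <= SF <= RF <= number of peers")
    return {
        # Peers that can fail while at least one copy of every bucket survives.
        "failures_without_data_loss": replication_factor - 1,
        # Peers that can fail while a searchable copy is still immediately available.
        "failures_with_searchable_copy": search_factor - 1,
    }


# Deployment B: 3 indexers, RF 3, SF 2.
print(cluster_tolerance(peers=3, replication_factor=3, search_factor=2))
```
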
Deployment B
[Diagram: Search Head(s), Cluster Master, License Master, three Indexers]

Example Architectures
Use case and requirements influence final setup, but there is no right or wrong way
Using EBS Backed Storage
- 20 GB/day: m3.2xlarge (single instance)
- 100 GB/day: m3.2xlarge (single instance)
- 300 GB/day: m3.2xlarge (3), c3.4xlarge
- 500 GB/day: m3.2xlarge as indexer (5), c3.4xlarge as search head (1)
Using Index Replication
- 100 GB/day: m3.2xlarge as indexer (2), c3.2xlarge as search head (1), c3.xlarge as CM/LM
- 500 GB/day: hs1.8xlarge as indexer (3), c3.8xlarge as search head (1), m3.xlarge as CM/LM

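The example layouts above can be kept as a small lookup table for a first sizing pass. This sketch returns the smallest listed example that covers a given daily volume; the table structure and function name are illustrative, and anything beyond the largest listed tier is deliberately left unanswered.

```python
# Example layouts transcribed from the slide, keyed by storage approach.
# Each entry is (daily volume in GB covered by the example, layout).
EXAMPLES = {
    "ebs": [
        (20, "m3.2xlarge (single instance)"),
        (100, "m3.2xlarge (single instance)"),
        (300, "m3.2xlarge (3), c3.4xlarge"),
        (500, "m3.2xlarge as indexer (5), c3.4xlarge as search head (1)"),
    ],
    "index_replication": [
        (100, "m3.2xlarge as indexer (2), c3.2xlarge as search head (1), "
              "c3.xlarge as CM/LM"),
        (500, "hs1.8xlarge as indexer (3), c3.8xlarge as search head (1), "
              "m3.xlarge as CM/LM"),
    ],
}


def closest_example(storage, daily_gb):
    """Return the smallest listed example covering daily_gb, or None."""
    for covered_gb, layout in EXAMPLES[storage]:
        if daily_gb <= covered_gb:
            return covered_gb, layout
    return None  # larger than any example on the slide


print(closest_example("ebs", 250))
print(closest_example("index_replication", 50))
```
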
How To Provision Deployments
Cloud Provisioning Tools: A Primer
Server Provisioning
- Flexible recipe-based configuration
- Configure machine based on role
Deployment Provisioning
- Fast template-based provisioning
- Provision & connect resources
Tools shown: AWS OpsWorks, AWS CloudFormation, Scalr, Terraform

Splunk AWS CloudFormation
"What used to take days to get all configured properly, now I can do in few minutes with Splunk [AWS] CloudFormation"
Abdallah Mohammed, Data Architect, Intuit

Splunk AWS CloudFormation
- Open-source self-service tool (no cost associated)
- Fast, automated, consistent Splunk deployments on AWS
- Available on GitHub: Templates + Tutorial
  https://github.com/splunk/splunk-aws-cloudformation
- Splunk Blog: Deploy your own Splunk cluster on AWS in minutes!
  http://blogs.splunk.com/2014/05/20/deploy-your-own-splunk-cluster-on-aws-in-minutes/

Splunk AWS CloudFormation
What can Splunk AWS CloudFormation do for you?
- Accelerates deployment time down to minutes
- Incorporates Splunk best practices for operations and administration
- Abstracts away details of configuring distributed Splunk
- Extensible and customizable templates to fit custom needs

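A template-based deployment of this kind typically comes down to one `aws cloudformation create-stack` call. The sketch below assembles such a command without running it. The stack name, template file name, and parameter keys are hypothetical placeholders; the actual template names and parameters are defined by the Splunk AWS CloudFormation project itself.

```python
import shlex


def create_stack_cmd(stack_name, template_path, parameters=None):
    """Build an `aws cloudformation create-stack` command line."""
    cmd = [
        "aws", "cloudformation", "create-stack",
        "--stack-name", stack_name,
        "--template-body", f"file://{template_path}",
    ]
    if parameters:
        cmd.append("--parameters")
        for key, value in parameters.items():
            cmd.append(f"ParameterKey={key},ParameterValue={value}")
    return cmd


# Hypothetical template file and parameter keys, for illustration only.
cmd = create_stack_cmd(
    "splunk-cluster-demo",
    "splunk_cluster_template.json",
    {"KeyName": "my-splunk-key", "IndexerCount": 3},
)
print(shlex.join(cmd))
```
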
Sample Architecture
[Diagram: Search Head(s), Cluster Master, License Master, three Indexers]

Deploy Splunk Cluster in < 30 minutes

Simple User-Friendly Push-Button Form

Demo Time

Questions?

Contact
- Simeon Yep, Business Development: syep@splunk.com
- Roy Arsan, Engineering: rarsan@splunk.com

References
- Splunk App for AWS: http://apps.splunk.com/app/1274/
- Hunk App for AWS ELB: http://apps.splunk.com/app/1731/
- Technical Brief: http://www.splunk.com/web_assets/pdfs/secure/Splunk_and_Amazon_Web_Services_Tech_Brief.pdf

References
- Blogs:
  - http://blogs.splunk.com/2012/03/07/splunk-and-aws-sizing-revisited/
  - http://blogs.splunk.com/2013/06/06/splunkit-v2-0-2-results-ec2-storage-comparisons/
  - http://blogs.splunk.com/2013/07/31/whats-going-on-with-aws-and-splunk/
  - http://blogs.splunk.com/2014/05/20/deploy-your-own-splunk-cluster-on-aws-in-minutes/
- AMIs:
  - Splunk: https://aws.amazon.com/marketplace/pp/b00gizituo?sr=0-4
  - Hunk: https://aws.amazon.com/marketplace/pp/b00gizk2qi?sr=0-2

THANK YOU