The Jiffy Lube Quick Tune-up for your Splunk Environment




Copyright 2014 Splunk Inc.
The Jiffy Lube Quick Tune-up for your Splunk Environment
Sean Delaney, Client Architect, Splunk

Disclaimer
During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release.

Introduction

About Me
- Splunk Client Architect
    - Splunker for 3+ years
    - Using Splunk for 4+ years
    - Large Scale Deployment Specialist
- Previously
    - Splunk Professional Services
    - 10+ years Production Services for a large Internet Security Company

Agenda
- Time Issues
- Indexer Health
- Search Performance
- New Kids on the Block

What's a Splunk Tune-Up?
- Identifying common performance issues in a distributed environment:
    - time issues
    - lagging events
    - monitoring indexer queues
    - slow and resource-intensive searches
- Provide steps to improve performance via configuration changes

Splunk Architecture

Splunk on Splunk
http://apps.splunk.com/app/748/

Time

Time is Important
- Splunk indexes all events by time
- Splunk searches are bound by a time range
- When you run a search, Splunk retrieves events based on the time they were indexed under (the _time field)

Common Time Issues
- Extracting/assigning the correct timestamp for the event
    - Multiple timestamps in an event
    - Time but no date
    - Custom timestamp formats
- Line breaking
    - Line breaking occurring in the middle of multi-line events
    - Orphaned log lines with no timestamps
- Time zones
- Servers not synchronized to an authoritative time source (NTP)

Assigning Time
- Timestamps are determined and set at index time
- Splunk will first attempt to extract the timestamp out of the raw event
- The timestamp is stored in the _time field in UTC time format
    10.10.241.162 - - [29/Mar/2014:17:43:09 -0700] "GET /includes/js/combined-javascript.js HTTP/1.1" 200 5067 0.003
    10.10.241.162 - - [29/Mar/2014:17:43:09 -0700] "GET /favicon.ico HTTP/1.1" 200 2494 0.001
    10.10.241.162 - - [29/Mar/2014:17:43:17 -0700] "POST /login.jsp HTTP/1.1" 302 - 0.028
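As a rough illustration of what "extract the timestamp and store it in UTC" amounts to, the Python sketch below parses the timestamp of one of the sample events and normalises it to UTC. This is not how Splunk implements extraction; the bracket-matching regex and the variable names are assumptions made for the example.

```python
import re
from datetime import datetime, timezone

# Sample access-log event in the shape shown on the slide.
raw_event = (
    '10.10.241.162 - - [29/Mar/2014:17:43:09 -0700] '
    '"GET /favicon.ico HTTP/1.1" 200 2494 0.001'
)

# Hypothetical pattern: take whatever sits between the square brackets.
match = re.search(r"\[([^\]]+)\]", raw_event)
local_ts = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")

# Normalise to UTC and derive the epoch value.
utc_ts = local_ts.astimezone(timezone.utc)
epoch = int(utc_ts.timestamp())

print(utc_ts.isoformat())  # 2014-03-30T00:43:09+00:00
print(epoch)
```

Note that the event logged at 17:43 local time on 29 March lands on 30 March once expressed in UTC, which is exactly the kind of shift that matters when a search is bound to a time range.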

The side-effects
- Incorrectly timestamped events can:
    - Exclude key events from the result set
    - Exclude events from the real-time window
    - Miss correlation of events
    - Cause missed alerts
    - Throw off summary indexing results
    - Produce incorrect reports
    - Increase the window of index buckets
    - Cause heartburn at audit time

Finding Time Stamping Issues
- Where are my events? Possibly in the future.
- Run a Real-Time All Time search

TZ issues
- Splunk will use a time zone offset reported in the timestamp of the raw event (_raw):
    29/Jan/2014:17:43:09 -0700
    29/Jan/2014:17:43:09 PST
- Otherwise, if your events come from systems in different time zones, you may need to account for this, as the event will be set with the TZ of the indexer
- Set TZ offsets in props.conf:
    [host::chi*]
    TZ = US/Central
- Default forwarder time zone: the time zone is supplied by the forwarder as its system time zone (both forwarder and indexer must be Splunk 6 or later)
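To see why a missing offset matters, here is a small Python sketch that interprets the same zone-less timestamp twice: once in the source host's zone and once in the indexer's zone. The fixed offsets (UTC-6 for Central, UTC-8 for a hypothetical Pacific indexer, no DST in January) are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets used for illustration only (assumed: no DST in January).
CENTRAL = timezone(timedelta(hours=-6))  # event source, e.g. a host matching chi*
PACIFIC = timezone(timedelta(hours=-8))  # hypothetical indexer location

# Timestamp as written in the log, with no zone information.
naive = datetime.strptime("29/Jan/2014:17:43:09", "%d/%b/%Y:%H:%M:%S")

# Same wall-clock time, interpreted in each zone and converted to UTC.
as_source_tz = naive.replace(tzinfo=CENTRAL).astimezone(timezone.utc)
as_indexer_tz = naive.replace(tzinfo=PACIFIC).astimezone(timezone.utc)

skew = as_indexer_tz - as_source_tz
print(as_source_tz.isoformat())   # 2014-01-29T23:43:09+00:00
print(as_indexer_tz.isoformat())  # 2014-01-30T01:43:09+00:00
print(skew)                       # 2:00:00
```

In this scenario the event would be indexed two hours in the future unless a TZ is set for the source host.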

Other Time Errors
- Check splunkd.log for DateParserVerbose warnings:
    index=_internal source=*splunkd.log WARN DateParserVerbose
    08-10-2014 05:59:14.488 +0000 WARN DateParserVerbose - Failed to parse timestamp. Defaulting to timestamp of previous event. Context="source::/log/applog.log host::appserver appserver_log remoteport::58689" Text="columnCount = 6;...
- These errors are normally due to an incorrect TIME_FORMAT or LINE_BREAKER setting

Use the log file viewer in SoS

TIME_FORMAT in props.conf
    [source::transaction.log]
    SHOULD_LINEMERGE = true
    BREAK_ONLY_BEFORE_DATE = false
    BREAK_ONLY_BEFORE = ^Processing
    TIME_PREFIX = \sat\s
    TIME_FORMAT = %Y-%m-%d %H:%M:%S
    MAX_TIMESTAMP_LOOKAHEAD = 70
- Verify the strptime() format with the date command:
    % date "+%Y-%m-%d %H:%M:%S"
    2014-08-19 11:06:24
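The three time settings above can also be sanity-checked offline. The Python sketch below loosely imitates how TIME_PREFIX, MAX_TIMESTAMP_LOOKAHEAD and TIME_FORMAT work together: find the prefix, look a limited number of characters past it, and try to parse. It is only an approximation of Splunk's behaviour, and the sample event line is invented for the example.

```python
import re
from datetime import datetime

# Values copied from the props.conf stanza on the slide.
TIME_PREFIX = r"\sat\s"
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"
MAX_TIMESTAMP_LOOKAHEAD = 70

def extract_timestamp(line):
    """Return the parsed timestamp, or None if the line does not match."""
    prefix = re.search(TIME_PREFIX, line)
    if prefix is None:
        return None
    # Only look a limited distance past the prefix.
    window = line[prefix.end():prefix.end() + MAX_TIMESTAMP_LOOKAHEAD]
    # This particular format always renders to 19 characters.
    candidate = window[:19]
    try:
        return datetime.strptime(candidate, TIME_FORMAT)
    except ValueError:
        return None

# Hypothetical event line, made up for this example.
sample = "Processing OrderController#show (for 10.0.0.1) at 2014-08-19 11:06:24 [GET]"
print(extract_timestamp(sample))                                # 2014-08-19 11:06:24
print(extract_timestamp("Processing with no timestamp here"))   # None
```

Running a handful of real log lines through a check like this before deploying props.conf is a cheap way to catch a wrong format string.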

Where to apply props.conf

Verify events are matching TIME_FORMAT
    08-07-2014 14:33:09.566 -0700
- Use regex in your search to isolate events which don't have a matching TIME_FORMAT:
    index=x sourcetype=y | regex _raw!="^\d{2}-\d{2}-\d{4}\s\d{2}:\d{2}:\d{2}.\d{3}\s-\d{4}"
- Similar approach using punct:
    index=x sourcetype=y punct!="--_::._-"
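The same "show me what does not match" idea can be tried on a sample file before touching the search head. A minimal Python sketch, using the pattern from the slide (with the literal dot escaped) against a few invented events:

```python
import re

# Same pattern as the search on the slide, with the literal dot escaped.
TS_PATTERN = re.compile(r"^\d{2}-\d{2}-\d{4}\s\d{2}:\d{2}:\d{2}\.\d{3}\s-\d{4}")

# Hypothetical sample events.
events = [
    "08-07-2014 14:33:09.566 -0700 INFO request served",
    "    at com.example.Handler.run(Handler.java:42)",
    "2014-08-07 14:33:10 WARN different format",
]

# Keep only the events whose start does not look like the expected timestamp.
non_matching = [e for e in events if not TS_PATTERN.match(e)]
for e in non_matching:
    print(e)
```

Here the orphaned stack-trace line and the differently formatted line are the ones reported, which are the two usual suspects: a line-breaking problem and a second timestamp format in the same sourcetype.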

Getting data in on time
- Events are slow getting into Splunk
- Look at indexing time lag:
    source=<your delayed source> | eval delay=_indextime-_time | fields delay
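The delay is simply index time minus event time. As a sketch of how the numbers might be summarised per source, the Python below uses invented epoch values; the source names and figures are assumptions, not measurements.

```python
from statistics import median

# Hypothetical (source, _time, _indextime) triples in epoch seconds.
events = [
    ("app.log", 1407412800, 1407412803),
    ("app.log", 1407412860, 1407412862),
    ("batch.log", 1407409200, 1407412900),
    ("batch.log", 1407409260, 1407412905),
]

# delay = _indextime - _time, grouped by source.
delays = {}
for source, event_time, index_time in events:
    delays.setdefault(source, []).append(index_time - event_time)

for source, values in sorted(delays.items()):
    print(source, "median:", median(values), "max:", max(values))
```

A source with a lag of seconds is healthy; one with a lag of about an hour, like the batch source in this made-up data, points to batch loading, throttling, or a timestamp/time zone problem.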

SoS Real-Time Indexing Lag
S.o.S - Splunk on Splunk > Distributed Indexing Performance

Causes for reported indexing lag
- Network throughput issues
- Universal/Lightweight Forwarder throttling (limits.conf):
    [thruput]
    maxKBps = 256
- Blocked indexer queues
- Buffered writing to log files
- Batch log files or historical log file loads
- Incorrect timestamp extraction
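Some quick arithmetic shows how a forwarder throughput cap turns into lag. The Python sketch below assumes the limit is in kilobytes per second and uses 1 GB = 1024 x 1024 KB; the function names are made up for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def max_daily_volume_gb(max_kbps):
    """Upper bound on data forwarded per day, in GB (1 GB = 1024 * 1024 KB)."""
    return max_kbps * SECONDS_PER_DAY / (1024 * 1024)

def catch_up_seconds(backlog_mb, max_kbps):
    """Time needed to ship a backlog of the given size at the throttled rate."""
    return backlog_mb * 1024 / max_kbps

print(round(max_daily_volume_gb(256), 1))  # 21.1
print(catch_up_seconds(1024, 256))         # 4096.0
```

So a forwarder capped at 256 KB/s tops out at roughly 21 GB per day, and a 1 GB backlog takes over an hour to drain even if nothing new is written in the meantime.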

Indexer Health

Indexer Server Health
- Monitor the indexer's server resources
    - Memory usage
    - Processes
    - System load
    - I/O throughput
- System commands
    - top
    - free / swap -l
    - iostat
- *nix/Windows App

SoS Splunk CPU/Memory Usage

Indexer Queues
- Indexer queues provide insight into an indexer's internal state
- Blocked queues indicate a processing bottleneck in the indexing process and will delay the indexing of events
- Default max queue size = 1000; when this limit is reached, the queue is blocked
- It's normal for queues to be temporarily blocked; this indicates that your indexer is doing work
- Persistently blocked queues indicate a performance issue, and will block upstream queues as they fill

Important Indexer Queues
- parsingqueue
    - UTF encoding
    - Line breaking (single line)
    - Header processing
- aggqueue
    - Line merging (identify single, multiline event)
    - Timestamp extraction
- typingqueue
    - Regex replacement (field extraction, routing)
    - Punct extraction
- indexqueue
    - Index and forwarding
    - Block signature
    - Writing to disk

Looking for Bottlenecks
- View the current queue size by searching the metrics log:
    index=_internal source=*metrics.log group=queue | timechart median(current_size) by name
- Find blocked queue events:
    index=_internal source=*metrics.log group=queue blocked
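If you only have a copy of metrics.log to hand, the same question ("which queue is blocked, and how often?") can be answered with a few lines of Python. The sample lines below follow the general key=value shape of queue entries but are invented for this sketch, so treat the field values as assumptions.

```python
import re
from collections import Counter

# Illustrative lines in the general shape of metrics.log queue entries;
# the values are invented for this example.
lines = [
    "08-10-2014 05:59:14.488 +0000 INFO Metrics - group=queue, name=parsingqueue, max_size_kb=6144, current_size_kb=12, current_size=3",
    "08-10-2014 05:59:14.488 +0000 INFO Metrics - group=queue, name=indexqueue, blocked=true, max_size_kb=500, current_size_kb=499, current_size=1000",
    "08-10-2014 05:59:45.490 +0000 INFO Metrics - group=queue, name=indexqueue, blocked=true, max_size_kb=500, current_size_kb=499, current_size=1000",
]

def parse_fields(line):
    """Turn the key=value pairs of a metrics line into a dict."""
    return dict(re.findall(r"(\w+)=([^,\s]+)", line))

# Count blocked events per queue name.
blocked = Counter()
for line in lines:
    fields = parse_fields(line)
    if fields.get("group") == "queue" and fields.get("blocked") == "true":
        blocked[fields["name"]] += 1

print(blocked.most_common())  # [('indexqueue', 2)]
```

The queue that shows up as blocked most persistently, and furthest downstream, is usually the one to tune first, since it backs up the queues in front of it.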

SoS Indexer Queues
S.o.S - Splunk on Splunk > Indexing Performance

SoS Indexer Queues
S.o.S - Splunk on Splunk > Distributed Indexing Performance

Tuning Configs
- Performance can be improved by modifying the configuration file settings related to each queue's operation
1. parsingqueue (props.conf)
    LINE_BREAKER
    TRUNCATE
2. aggqueue (props.conf)
    SHOULD_LINEMERGE = false
    TIME_PREFIX = <something>
    TIME_FORMAT = <something>
    MAX_TIMESTAMP_LOOKAHEAD = X

Tuning Configs
3. typingqueue
    props.conf
    - TRANSFORMS-xxx
    - SEDCMD
    transforms.conf
    - SOURCE_KEY
    - DEST_KEY
    - REGEX
    - FORMAT

Tuning Configs
4. indexqueue (index and forwarding, block signing)
    indexes.conf settings
    - maxHotBuckets = 10
    - maxDataSize = auto_high_volume
- In most cases a blocked indexqueue is due to more data being written to disk than the disk can service.
- If the disk performance cannot be improved, consider adding additional indexers to offload this indexing workload.

Search Performance

Search Performance
- Search performance: identifying slow searches, refactoring searches to take advantage of map-reduce
- Bundle replication white/blacklists
- Where summary indexing/search acceleration helps
- Spreading out scheduled searches using cron syntax to avoid search hot spots at midnight and at the top of the hour
- Using time snaps effectively and reducing the risk of missing delayed events

Search Performance Reporting
- Splunk Server Activity Reports
- SoS Search Activity Reports

SoS Search Performance

SoS - Scheduler Activity

SoS - Skipped Scheduled Searches

Concurrent Searches
- SoS > Search Activity > Search concurrency
- System Activity Report > Search Activity Overview

Concurrent Searches
- The system-wide number of searches is computed as:
    base_max_searches + max_searches_per_cpu x number_of_cpus
- Splunk 5 (limits.conf):
    [search]
    base_max_searches = 6
    max_searches_per_cpu = 1
- number_of_cpus = number of CPUs reported by the OS; with hyperthreading enabled this is 2 x physical cores
- Beware: increasing the number of concurrent searches may increase the CPU load and the time taken for any search to complete!
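The formula is easy to work through for a given host. A minimal Python sketch, using the limits.conf values quoted on the slide as defaults and a hypothetical 8-core server with hyperthreading:

```python
def max_concurrent_searches(number_of_cpus, base_max_searches=6, max_searches_per_cpu=1):
    """System-wide search limit: base + per-CPU allowance x CPUs reported by the OS."""
    return base_max_searches + max_searches_per_cpu * number_of_cpus

# Hypothetical host: 8 physical cores with hyperthreading -> 16 CPUs reported.
physical_cores = 8
print(max_concurrent_searches(2 * physical_cores))  # 22
```

Raising max_searches_per_cpu lifts the ceiling quickly (2 per CPU on this host would give 38), which is why the warning about CPU load applies.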

Scheduled Search Limit
- The ratio of search jobs that the scheduler can use (versus manual/dashboard searches)
- By default, 25% of your quota is for scheduled searches (limits.conf):
    [scheduler]
    max_searches_perc = 25
- If you have a large number of scheduled searches, increasing this percentage may help with skipped searches
- The searches are NOT reserved for the scheduler. This is just a limit, and ad-hoc user searches take priority over the scheduler.
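To get a feel for what the percentage means in practice, the Python sketch below applies it to a hypothetical system-wide limit of 22 concurrent searches. Rounding down to a whole number of searches is an assumption made for this example, not a statement about how Splunk rounds.

```python
def scheduler_search_limit(total_concurrent, max_searches_perc=25):
    """Number of concurrent searches the scheduler may use.

    Integer floor is an assumption made for this sketch.
    """
    return total_concurrent * max_searches_perc // 100

# Hypothetical system-wide limit of 22 concurrent searches.
print(scheduler_search_limit(22))      # 5
print(scheduler_search_limit(22, 50))  # 11
```

With only a handful of slots available to the scheduler, a burst of searches all scheduled for the top of the hour is enough to cause skips.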

Summary Indexing
- Use time modifier offsets to snap-to-time
- For the last hour: -1h@h to @h
- Offset the scheduled time from the top of the hour to 5 mins after the hour: 5 * * * *
- This offset method can be applied to other scheduled searches
    Name     Start Time   Finish Time   Cron Schedule
    sidata   -1h@h        @h            5 * * * *
Scheduled saved search to run every hour at 5 mins past the hour, with the time range of the last whole hour.
[Timeline figure: 1:00, 2:00, 3:00, 4:00]
The results from the summary search will populate the summary index for the last whole hour.
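The snap-to-hour window can be worked out by hand for any run time. A small Python sketch of the equivalent calculation, with a made-up run time of 2:05; the function name is hypothetical:

```python
from datetime import datetime, timedelta

def last_whole_hour(run_time):
    """Return (earliest, latest) for the equivalent of -1h@h to @h."""
    latest = run_time.replace(minute=0, second=0, microsecond=0)  # @h
    earliest = latest - timedelta(hours=1)                        # -1h@h
    return earliest, latest

# A search scheduled with cron "5 * * * *" that fires at 2:05.
run_time = datetime(2014, 8, 19, 2, 5)
earliest, latest = last_whole_hour(run_time)
print(earliest.time(), "to", latest.time())  # 01:00:00 to 02:00:00
```

Because the window ends at 2:00 but the search runs at 2:05, events that arrive up to five minutes late still make it into the summary.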

New kids on the block

_introspection index
- Splunk 6.1 added the _introspection index
- Collects Splunk performance data
    - Per-process resource usage data: splunkd, splunkweb, splunk-optimize
    - Search process data
    - Host resource usage: CPU utilization, paging
    - Disk object information: server-info, indexes, volumes, search-artifacts, fishbucket
- http://docs.splunk.com/documentation/splunk/latest/Troubleshooting/Abouttheplatforminstrumentationframework

Distributed Management Console
- New in 6.2; visualizes data collected in the _introspection index

Wrapping Up

In Review
- It's important to ensure that events are indexed with the correct time, otherwise you risk missing data in your result sets
- Tracking down time zone issues can be difficult; having all your systems run in UTC is Utopia
- Correct setting of TIME_FORMAT can improve indexer throughput
- Monitor Splunk's indexer queues to stay ahead of indexer load/throughput issues
- Tuning settings in props and transforms can increase your indexer throughput
- Monitor search activity, fix expensive searches, and modify scheduled search times to avoid search load hot spots

Answers
- http://answers.splunk.com

Troubleshooting Manual
http://docs.splunk.com/documentation/splunk/latest/

Resources
- Get educated: http://www.splunk.com/view/education/sp-CAAAAH9
- Download Splunk applications: http://apps.splunk.com
- Hire Splunk Professional Services: http://www.splunk.com/view/professional-services/sp-CAAABH9
- Watch some videos: http://www.splunk.com/videos

Splunk on Splunk
http://apps.splunk.com/app/748/

Questions?

THANK YOU