How and When to Use Dynamic Lookups
Nimish Doshi, Principal Systems Engineer, Splunk
#splunkconf
Copyright 2013 Splunk Inc.
Legal Notices
During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release.
Splunk, Splunk>, Splunk Storm, Listen to Your Data, SPL and The Engine for Machine Data are trademarks and registered trademarks of Splunk Inc. in the United States and other countries. All other brand names, product names, or trademarks belong to their respective owners. © 2013 Splunk Inc. All rights reserved.
About Me: Nimish Doshi
- Principal Systems Engineer at Splunk
  - Cover the USA East Coast
- Have been at Splunk since 2008
- Splunk Blogger
- Active App Template and Add-on developer at apps.splunk.com
Agenda
- Lookups in General
- Static Lookups
- Dynamic Lookups
  - Retrieve fields from a web site
  - Retrieve fields from a database
  - Retrieve fields from a persistent cache
Lookups in General
Enrichment
Enrich your events with fields from external sources.
Capturing All This Data Occurs at INDEX Time
- Logfiles, Configs, Messages, Traps, Alerts, Metrics, Scripts, Changes, Tickets
- Windows: Registry, Event logs, File system, sysinternals
- Linux/Unix: Configurations, syslog, File system, ps, iostat, top
- Virtualization: Hypervisor, Guest OS, Guest Apps
- Applications: Web logs, Log4J, JMS, JMX, .NET events, Code, scripts
- Databases: Configurations, Audit/query logs, Tables, Schemas
- Networking: Configurations, syslog, SNMP, netflow
Splunk Architecture, BIG DATA Platform
[Architecture diagram]
- Interfaces: Splunk CLI Interface, Splunk Web Interface, Other Interfaces, SDKs
- Splunk > Engine: REST API; Lookups, Scheduling/Alerting, Reporting, Knowledge; Distributed Search; Deployment Server; Search; Index; Data Routing, Cloning and Load Balancing; Users & Access Controls
- Inputs: Monitor Files, Detect File Changes, Listen to Network Ports, Run Scripts (WMI, Registry, OPSEC LEA, DBI, JMS, VMWare API, other APIs)
Ultimate Knowledge Base
Lookups enrich your ability to act upon your data
* http://sangrea.net/free-cartoons/comp_real-life-search-engine.jpg, Royalty Free Cartoons
Before Lookups
After Lookups
Top Status Description
Integrate External Data
Extend search with lookups to external data sources
- LDAP, AD
- Watch Lists
- CMDB
- CRM/ERP
Correlate IP addresses with locations, accounts with regions
Interesting Things to Look Up
- User's Mailing Address (AD)
- External Host Address
- Error Code Descriptions
- Database Query
- Product Names
- Web Service Call for Status
- Stock Symbol (from CUSIP)
- Geo Location
Other Reasons for Lookup
- Bypass a static developer or vendor that does not enrich logs
- Imaginative correlations, e.g. Web site URL with like or dislike count stored in an external source
- Make your data more interesting
  - Better to see textual descriptions than arcane codes
Static Lookups
Static vs. Dynamic Lookup
- Static: External data comes from a CSV file
- Dynamic: External data comes from the output of an external script. Output resembles a CSV file
Static Lookups Review
- Pick what input fields will be used to get output fields
- Create or locate a CSV file that has all fields in proper order
- Tell Splunk via the Manager about your CSV file and your lookup
  - You can also define lookups manually via props.conf and transforms.conf
  - If you use automatic lookups, they will run every time the source, sourcetype, or associated host stanza is used in a search
  - Non-automatic lookups run only when the lookup command is invoked in the search
Example Static Lookup Conf Files
props.conf
    [access_combined]
    lookup_http = http_status status OUTPUT status_description, status_type
transforms.conf
    [http_status]
    filename = http_status.csv
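To make the input-field-to-output-field idea concrete, here is a minimal Python 3 sketch of what a static lookup such as http_status does logically. It is only an illustration, not how Splunk implements lookups. The sample CSV rows, the event, and the helper names load_lookup and enrich are all assumptions, since the slides do not show the contents of http_status.csv.

```python
import csv
import io

# Hypothetical sample of http_status.csv (the real file is not shown in the slides).
SAMPLE_CSV = """status,status_description,status_type
200,OK,Successful
404,Not Found,Client Error
503,Service Unavailable,Server Error
"""


def load_lookup(csv_text, input_field):
    """Index the CSV rows by the value of the input field."""
    table = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        table[row[input_field]] = row
    return table


def enrich(event, table, input_field, output_fields):
    """Return a copy of the event with output fields added when the input field matches."""
    enriched = dict(event)
    match = table.get(event.get(input_field))
    if match:
        for name in output_fields:
            enriched[name] = match[name]
    return enriched


table = load_lookup(SAMPLE_CSV, "status")
event = {"clientip": "10.1.1.1", "status": "404"}  # hypothetical event
enriched = enrich(event, table, "status", ["status_description", "status_type"])
```

An event whose status value is not in the table is returned unchanged, which mirrors the idea that a lookup only adds fields when the input field matches a row.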
Example Automatic Static Lookup
Permissions
local.meta
    [lookups/http_status.csv]
    access = read : [ * ], write : [ * ]
    export = system

    [transforms/http_status]
    access = read : [ * ], write : [ * ]
    export = system
Lookup Topics Not Covered in This Session
- Field extractions are performed before lookups
- Lookups run on the indexer
  - To ensure lookups do not run on remote peers, use local=true in the lookup command
- You can also use outputlookup to populate a CSV file
- You can also use time-based lookups to find fields that match your event's timestamp with an interval in the lookup CSV
Dynamic Lookups
Dynamic Lookups
- Write the script to simulate access to the external source
- Test the script with one set of inputs
- Create the Splunk version of the lookup script
- Register the script with Splunk via Manager or conf files
- Test the script explicitly before using automatic lookups
Lookups vs. Custom Command
- Use dynamic lookups when returning fields given input fields
  - Standard for users who already know how to use lookup
- Use a custom command when doing more than just lookup
  - Not all use cases involve just returning fields
    - Decrypt event data
    - Translate event data from one format to another with new fields (e.g. FIX Orders)
Write/Test External Field Gathering Script
[Diagram: Your Python Script sends input fields to External Data in Cloud, which returns output fields]
Example Script to Test External Lookup
import socket

# Given a host, find the ip
def mylookup(host):
    try:
        # gethostbyname_ex returns (hostname, aliaslist, ipaddrlist)
        hostname, aliaslist, ipaddrlist = socket.gethostbyname_ex(host)
        return ipaddrlist
    except:
        return []
Test External Field Gathering Script with Splunk
[Diagram: Splunk scripts call Your Python Script, which queries External Data in Cloud and returns output fields]
Script for Splunk Simulates Reading Input CSV
    hostname, ip
    a.b.c.com
    Zorrosty.com
    seemanny.com
Output of Script Returns Logically Complete CSV
    hostname, ip
    a.b.c.com, 1.2.3.4
    Zorrosty.com, 192.168.1.10
    seemanny.com, 10.10.2.10
transforms.conf for Dynamic Lookup
    [NameofLookup]
    external_cmd = <name>.py field1 ... fieldn
    external_type = python
    fields_list = field1, ..., fieldn
Example Dynamic Lookup Conf Files
transforms.conf
    # Note this is an explicit lookup
    [whoislookup]
    external_cmd = whois_lookup.py ip whois
    external_type = python
    fields_list = ip, whois
Dynamic Lookup Python Flow
def lookup(input):
    Perform external lookup based on input. Return result

main()
    Check standard input for CSV headers.
    Write headers to standard output.
    For each line in standard input (input fields):
        Gather input fields into a dictionary (key-value structure)
        ret = lookup(input fields)
        if ret:
            Send to standard output input values and return values from lookup
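The flow above can be sketched as a small, runnable Python 3 script (the slides themselves use Python 2). This is a sketch under assumptions, not the code shipped with any add-on: run_lookup, fake_lookup, demo, and the FAKE_DNS table are hypothetical names, and the fake lookup stands in for a real external source so the example runs without network access.

```python
import csv
import io

# Hypothetical stand-in for an external source (e.g. DNS), so no network is needed.
FAKE_DNS = {"a.b.c.com": "1.2.3.4", "Zorrosty.com": "192.168.1.10"}


def fake_lookup(host):
    """Return the output value for a host, or '' when nothing is found."""
    return FAKE_DNS.get(host, "")


def run_lookup(infile, outfile, in_field, out_field, lookup):
    """Read CSV rows, fill in the missing output field, and write complete rows."""
    reader = csv.DictReader(infile)
    header = reader.fieldnames or []
    if in_field not in header or out_field not in header:
        raise ValueError("input and output fields must exist in the CSV header")
    writer = csv.DictWriter(outfile, fieldnames=header, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Short rows come back with None for missing fields; normalize to ''.
        result = {name: (row.get(name) or "") for name in header}
        if not result[in_field]:
            continue  # nothing to look up for this row
        if not result[out_field]:
            result[out_field] = lookup(result[in_field])
        if result[out_field]:
            writer.writerow(result)  # only emit rows that ended up complete


def demo():
    """Run the flow on a small in-memory CSV and return the output text."""
    infile = io.StringIO("hostname,ip\na.b.c.com\nZorrosty.com,\nunknown.example,\n")
    outfile = io.StringIO()
    run_lookup(infile, outfile, "hostname", "ip", fake_lookup)
    return outfile.getvalue()
```

In a real lookup script, the same function would presumably be called with sys.stdin, sys.stdout, and the field names that external_cmd passes on the command line, with the fake lookup replaced by the real one.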
Whois Lookup
import csv
import sys

def main():
    if len(sys.argv) != 3:
        print "Usage: python whois_lookup.py [ip field] [whois field]"
        sys.exit(0)
    ipf = sys.argv[1]
    whoisf = sys.argv[2]
    r = csv.reader(sys.stdin)
    w = None
    header = []
    first = True
Whois Lookup (cont.) to Read CSV Header
    # First read the CSV header and output the field names, then continue
    for line in r:
        if first:
            header = line
            if whoisf not in header or ipf not in header:
                print "IP and whois fields must exist in CSV data"
                sys.exit(0)
            csv.writer(sys.stdout).writerow(header)
            w = csv.DictWriter(sys.stdout, header)
            first = False
            continue
Whois Lookup (cont.) to Populate Input Fields
        # Read the result and populate the values for the input fields (ip address in our case)
        result = {}
        i = 0
        while i < len(header):
            if i < len(line):
                result[header[i]] = line[i]
            else:
                result[header[i]] = ''
            i += 1
Whois Lookup (cont.) to Populate Output Fields
        # Perform the whois lookup if necessary
        if len(result[ipf]) and len(result[whoisf]):
            w.writerow(result)
        # Else call external website to get whois field from the ip address as the key
        elif len(result[ipf]):
            result[whoisf] = lookup(result[ipf])
            if len(result[whoisf]):
                w.writerow(result)
Whois Lookup Function
import urllib

LOCATION_URL = "http://some.url.com?query="

# Given an ip, return the whois response
def lookup(ip):
    try:
        whois_ret = urllib.urlopen(LOCATION_URL + ip)
        lines = whois_ret.readlines()
        return lines
    except:
        return ''
Database Lookups in General
- Use DB Connect from apps.splunk.com if possible
  - Splunk supported
  - No code needs to be written to do lookups for popular RDBMSs
- Use your own DB lookup when
  - Your database is not supported by DB Connect
  - You want to perform the lookup with custom code to meet a requirement
Database Lookup vs. Database Sent to Index
- Depends
- Use a lookup when
  - Using needle-in-the-haystack searches with a few users
  - Using form searches returning few results
- Index the database table or view when
  - Having lots of users and ad hoc reporting is needed
  - It is OK to have stale data (N minutes old) for a dynamic database
Database Lookup
- Acquire proper modules to connect to the database
- Connect and authenticate to the database
  - Use a connection pool, if possible
- Have the lookup function query the database
  - Return a list ( [ ] ) of results
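A minimal Python 3 sketch of these steps follows. As an assumption made purely so the example is self-contained, it swaps in the standard-library sqlite3 module and an in-memory table for a real database server. The city_country table, its made-up rows, and connect_demo_db are hypothetical. The parts that should carry over are connecting once outside the per-row loop, passing the key as a bound parameter rather than concatenating it into the SQL string, and returning a list of results.

```python
import sqlite3


def connect_demo_db():
    """Create an in-memory stand-in for a real database (hypothetical, made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE city_country (city TEXT, country TEXT)")
    conn.executemany(
        "INSERT INTO city_country VALUES (?, ?)",
        [("Paris", "France"), ("Tokyo", "Japan"),
         ("Springfield", "USA"), ("Springfield", "Canada")],
    )
    conn.commit()
    return conn


def lookup(city, cur):
    """Return a list of matching countries; an empty list means no match or an error."""
    try:
        # The city is bound as a parameter, so quotes in the input cannot alter the query.
        cur.execute(
            "SELECT country FROM city_country WHERE city = ? ORDER BY country",
            (city,),
        )
        return [row[0] for row in cur.fetchall()]
    except sqlite3.Error:
        return []


# Connect once, outside the loop that processes the input rows.
conn = connect_demo_db()
cursor = conn.cursor()
```

Other database modules use their own placeholder style for bound parameters, so the "?" marker here is specific to sqlite3.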
Example Database Lookup Using MySQL
# See http://splunk-base.splunk.com/apps/36664/splunk-mysql-connector
# for a general example using MySQL
# First connect to DB outside of the for loop
conn = MySQLdb.connect(host = "localhost",
                       user = "name of user",
                       passwd = "password",
                       db = "Name of DB")
cursor = conn.cursor()
Example Database Lookup (cont.) Using MySQL
import MySQLdb

# Given a city, find its country
def lookup(city, cur):
    try:
        # Pass the city as a bound parameter instead of concatenating it into the SQL
        selstring = "SELECT country FROM city_country where city=%s"
        cur.execute(selstring, (city,))
        row = cur.fetchone()
        return row[0]
    except:
        return []
Example MongoDB Lookup
import pymongo
from pymongo import Connection

# Given a star name, find its magnitude
def lookup(collection, key):
    try:
        star = collection.find_one({'name': key})
        return star['magnitude']
    except:
        return None
...
connection = Connection()
db = connection.test_database
star_collection = db.stars
...
Web Services Lookup
- Acquire proper modules to connect to the web service
- Connect and authenticate to the web service, if necessary
- Have the lookup function call your web service method
  - Return a String of results or an empty String, if no match occurs
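The "string or empty string" contract can be sketched in Python 3 as a thin wrapper around whatever method the web service client exposes. Everything named here is hypothetical: make_lookup, fake_order_status, and the ORDERS data are stand-ins for a real remote call, chosen so the example runs without a network connection.

```python
def make_lookup(service_call):
    """Wrap a web service method so the lookup always returns a string."""

    def lookup(*args):
        try:
            result = service_call(*args)
        except Exception:
            return ""  # service errors are treated as "no match"
        if result is None or result == "":
            return ""
        return str(result)

    return lookup


# Hypothetical stand-in for a remote method on a web service client.
ORDERS = {"A1": "SHIPPED"}


def fake_order_status(order_id):
    if order_id == "BOOM":
        raise IOError("service unavailable")
    return ORDERS.get(order_id)


lookup = make_lookup(fake_order_status)
```

Keeping the error handling in one wrapper means the code that writes CSV rows only has to check for an empty string, whichever service sits behind the call.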
Example Web Services Lookup Using Suds
from suds.client import Client

# Given a name, height, and weight, return percent body fat
def lookup(client, name, height, weight):
    try:
        result = client.service.getpercentbodyfat(name, height, weight)
        if result != None and result != '':
            return result
        else:
            return ''
    except:
        return ''

client = Client(<url to some Web Service WSDL>)
Lookup Using Key Value Persistent Cache
- Download and install Redis
- Download and install the Redis Python module
- Import the Redis module in Python and populate the key value DB
- Import the Redis module in the lookup function given to Splunk to look up a value given a key
Redis Lookup
import sys

##### CHANGE PATH TO your distribution FIRST ############
sys.path.append("/library/python/2.6/site-packages/redis-2.4.5-py2.6.egg")
import redis

def main():
    ...
    # Connect to redis. CHANGE for your DISTRIBUTION
    pool = redis.ConnectionPool(host='localhost', port=6379, db=0)
    redp = redis.Redis(connection_pool=pool)
Redis Lookup (cont.)
# Note that this returns key value pairs. Redis can also return keys mapped to multiple values
def lookup(redp, mykey):
    try:
        return redp.get(mykey)
    except:
        return ''
Combine Persistent Cache with External Lookup
- For data that is relatively static
  - First see if the data is in the persistent cache
  - If not, look it up in the external source such as a database or web service
  - If results come back, add results to the persistent cache and return results
- For data that changes often, you will need to create your own cache retention policies
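One way to picture both bullets is a Python 3 sketch that checks the cache first and applies a simple time-based retention policy. This is an illustration under assumptions: TTLCache is a tiny in-memory stand-in for a persistent store such as Redis, and cached_lookup, the ttl_seconds setting, and the injectable clock are hypothetical names chosen for the example.

```python
import time


class TTLCache:
    """In-memory stand-in for a persistent key-value store, with expiring entries."""

    def __init__(self, ttl_seconds, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be exercised without waiting
        self.store = {}

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        value, expires = entry
        if self.clock() >= expires:
            del self.store[key]  # retention policy: drop stale entries on read
            return None
        return value

    def set(self, key, value):
        self.store[key] = (value, self.clock() + self.ttl)


def cached_lookup(cache, key, external_lookup):
    """Check the cache first; on a miss, ask the external source and remember the answer."""
    value = cache.get(key)
    if value:
        return value
    value = external_lookup(key)
    if value:
        cache.set(key, value)  # only cache non-empty results
    return value
```

A short time-to-live suits data that changes often, while relatively static data can use a long one. Redis itself supports key expiry, so with a real Redis store the retention policy could be delegated to the server rather than implemented in the script.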
Combining Redis with Whois Lookup
def lookup(redp, ip):
    try:
        ret = redp.get(ip)
        if ret != None and ret != '':
            return ret
        else:
            whois_ret = urllib.urlopen(LOCATION_URL + ip)
            lines = whois_ret.readlines()
            # readlines() returns a list, so test it for emptiness before caching
            if lines:
                redp.set(ip, lines)
            return lines
    except:
        return ''
Where to Get Add-ons Discussed Here
Splunkbase.
Add-On | Download Location | Release
Whois | http://splunk-base.splunk.com/apps/22381/whois-add-on | 4.x
DBLookup | http://splunk-base.splunk.com/apps/22394/example-lookup-using-a-database | 4.x
Redis Lookup | http://splunk-base.splunk.com/apps/27106/redis-lookup | 4.x
Geo IP Lookup (not in these slides) | http://splunk-base.splunk.com/apps/22282/geo-location-lookup-script-powered-by-maxmind | 4.x
So, What?
Enrich BIG DATA with external sources
Conclusion: Lookups are a powerful way to enhance your search experience beyond indexing
THANK YOU