IT Data Visualization
Raffael Marty, GCIA, CISSP
Chief Security Strategist @ Splunk>
SUMIT, Michigan - October 08
Raffael Marty
Chief Security Strategist @ Splunk>
- Looked at logs/IT data for over 10 years
  - IBM Research
  - Conference boards / committees
- Presenting around the world on SecViz
- Passion for Visualization
  - http://secviz.org
  - http://afterglow.sourceforge.net
Applied Security Visualization
Paperback: 552 pages
Publisher: Addison Wesley (August, 2008)
ISBN: 0321510100
Agenda
- IT Data Visualization
  - Security Visualization Dichotomy
  - Research Dichotomy
- IT Data Management
  - A shifted crime landscape
- Perimeter Threat
- Insider Threat
- Security Visualization Community
Visualization is a more effective way to manage and analyze IT data.
Visualization Questions
- Who analyzes logs?
- Who uses visualization for log analysis?
- Who has used DAVIX?
- Have you heard of SecViz.org?
- What tools are you using for log analysis?
IT Data Visualization
Applied Security Visualization, Chapter 3
What is Visualization?
Generate a picture from IT data. A picture is worth a thousand log records.
- Answer a Question
- Pose a New Question
- Explore and Discover
- Communicate Information
- Support Decisions
- Increase Efficiency
- Inspire
Information Visualization Process
Capture -> Process -> Visualize
The 1st Dichotomy: two domains, Security & Visualization
Security
- security data
- networking protocols
- routing protocols (the Internet)
- security impact
- security policy
- jargon
- use-cases
- are the end-users
Visualization
- types of data
- perception
- optics
- color theory
- depth cue theory
- interaction theory
- types of graphs
- human computer interaction
The Failure - New Graphs
The Right Thing - Reuse Graphs
The Failure - The Wrong Graph
The Right Thing - Adequate Graphs
The Failure - The Wrong Integration
- Using proprietary data format
- Provide parsers for various data formats
  - does not scale
  - is probably buggy / incomplete
- Use wrong data access paradigm
  - complex configuration
  - e.g., needs an SSH connection
/usr/share/man/man5/launchd.plist.5
  <?xml version="1.0" encoding="utf-8"?>
  <!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/dtds/propertylist-1.0.dtd">
  <plist version="1.0">
  <dict>
    <key>_name</key>
    <dict>
      <key>_iscolumn</key> <string>yes</string>
      <key>_isoutlinecolumn</key> <string>yes</string>
      <key>_order</key> <string>0</string>
    </dict>
    <key>bsd_name</key>
    <dict> <key>_order</key> <string>62</string> </dict>
    <key>detachable_drive</key>
    <dict> <key>_order</key> <string>59</string> </dict>
    <key>device_manufacturer</key>
    <dict> <key>_order</key> <string>41</string> </dict>
    <key>device_model</key>
    <dict> <key>_order</key> <string>42</string> </dict>
    <key>device_revision</key>
The Right Thing - KISS
Keep It Simple Stupid
- Use CSV input
- Use files as input
- Offload to other tools
  - parsers
  - data conversions
# Using node sizes:
size.source=1;
size.target=200
maxnodesize=0.2
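A minimal sketch of the KISS idea above: one small step turns raw log lines into two-column CSV, and a second, independent step turns that CSV into a DOT graph description that a renderer such as Graphviz could draw. The log format, the regular expression, and the function names `log_to_csv` and `csv_to_dot` are assumptions made for illustration; this is not how AfterGlow or any other tool named in this deck is implemented.

```python
import csv
import io
import re

# Hypothetical firewall-style log lines; the format is an assumption.
SAMPLE_LOG = """\
Oct 13 20:00:05 fw kernel: SRC=10.0.0.5 DST=192.168.1.1 DPT=80
Oct 13 20:00:06 fw kernel: SRC=10.0.0.5 DST=192.168.1.2 DPT=22
Oct 13 20:00:09 fw kernel: SRC=10.0.0.7 DST=192.168.1.1 DPT=80
"""

# Extracts the source and destination address from one log line.
PAIR = re.compile(r"SRC=(\S+) DST=(\S+)")


def log_to_csv(log_text):
    """Parser step: raw log text in, 'source,target' CSV text out."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in log_text.splitlines():
        match = PAIR.search(line)
        if match:  # lines that do not match are skipped
            writer.writerow(match.groups())
    return out.getvalue()


def csv_to_dot(csv_text):
    """Graph step: two-column CSV in, DOT digraph text out."""
    edges = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) >= 2:
            edge = (row[0], row[1])
            if edge not in edges:  # draw each source/target pair once
                edges.append(edge)
    lines = ["digraph logs {"]
    lines += [f'  "{src}" -> "{dst}";' for src, dst in edges]
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(csv_to_dot(log_to_csv(SAMPLE_LOG)))
```

Because the two steps only share plain CSV text, the parser can be swapped for awk, grep, or any other tool without touching the graphing step.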
The Failure - Unnecessary Ink
The Right Thing - Apply Good Visualization Practices
- Don't use graphics to decorate a few numbers
- Maximize the data-ink ratio (reduce non-data ink)
- Visualization principles
The 2nd Dichotomy: two worlds, Industry & Academia
Industry
- don't understand the real impact
- get the 70% solution
- don't think big
- no time/money for real research
- can't scale
- work based off of a few customers' input
Academia
- don't know what's been done in industry
- don't understand the use-cases
- don't understand the environments / data / domain
- work on simulated data
- construct their own problems
- use overly complicated, impractical solutions
- use graphs / visualization where it is not needed
Some comments are based on paper reviews from RAID 2007/08, VizSec 2007/08
The Way Forward
- Building a secviz discipline
- Bridging the gap
- Learning the other discipline
- More academia / industry collaboration
Security Visualization (SecViz)
My Focus Areas
- Use-case oriented visualization
- IT data management
- Perimeter Threat
- Governance Risk Compliance (GRC)
- Insider Threat
- IT data visualization
- SecViz.Org
- DAVIX
IT Data Management
A Shifted Crime Landscape
- Crimes are moving up the stack
- Insider crime
- Large-scale spread of many small attacks
Stack (top to bottom): Application Layer, Transport Layer, Network Layer, Link Layer, Physical Layer
Are you prepared? Are you monitoring enough?
- Questions are not known in advance!
- Have the data when you need it!
What Is IT Data?
- Logs: /var/log/messages, /opt/log/* (multi-line files)
- Configurations: /etc/syslog.conf, /etc/hosts (entire files)
- Traps & Alerts: 1.3.6.1.2.1.25.3.3.1.2.2, i.e. iso.org.dod.internet.mgmt.mib-2.host.hrdevice.hrprocessortable.hrprocessorentry.hrprocessorload (multi-line structures)
- Scripts & Code: ps, netstat (multi-line table format)
- Change Events: file system changes, Windows Registry (hooks into the OS)
Perimeter Threat
Applied Security Visualization, Chapter 6
Sparklines
"Data-intense, design-simple, word-sized graphics."
Edward Tufte (2006). Beautiful Evidence. Graphics Press.
[Figure: sparkline annotated with average and standard deviation]
Examples:
- stock price over a day
- access to port 80 over the last week
JavaScript implementation: http://omnipotent.net/jquery.sparkline/
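To make the average and standard deviation annotation concrete, here is a small text-only sketch. It renders a series as a word-sized row of Unicode block characters and computes the mean plus a one-standard-deviation band. The function names, the use of the population standard deviation, and the sample counts are all assumptions for illustration; the jQuery plugin linked above works differently and draws real graphics.

```python
import statistics

# Eight block heights, lowest to highest.
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Map each value to one block character, scaled to the series' own range."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:  # flat series: draw the lowest bar everywhere
        return BARS[0] * len(values)
    top = len(BARS) - 1
    return "".join(BARS[int((v - lo) / span * top)] for v in values)


def band(values):
    """Return (mean, mean - stdev, mean + stdev) using the population stdev."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    return mean, mean - stdev, mean + stdev


if __name__ == "__main__":
    # Hypothetical daily counts of connections to port 80 over one week.
    counts = [12, 15, 11, 14, 90, 13, 12]
    mean, low, high = band(counts)
    print(sparkline(counts))
    print(f"mean={mean:.1f} band=[{low:.1f}, {high:.1f}]")
    # Values outside the band are the ones worth a second look.
    print([c for c in counts if c < low or c > high])
```

Scaling each sparkline to its own minimum and maximum keeps it word-sized, at the cost of making two sparklines with different ranges not directly comparable.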
Sparklines
[Figure: sparklines for Port, Source IP, and Destination IP]
Insider Threat
Applied Security Visualization, Chapter 8
Three Types of Insider Threats
- Fraud
- Information Leak
- Sabotage
Example - Insider Threat Visualization
- More and other data sources than for the traditional security use-cases
  - Insiders often have legitimate access to machines and data. You need to log more than the exceptions.
  - Insider crimes are often executed on the application layer. You need transaction data and chatty application logs.
- The questions are not known in advance!
  - Visualization provokes questions and helps find answers
- Dynamic nature of fraud
  - Problem for static algorithms
  - Bandits quickly adapt to fixed threshold-based detection systems
  - Looking for any unusual patterns
User Activity
- Color indicates failed logins
- High ratio of failed logins
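As a rough sketch of the data preparation behind a user activity graph like this one, the snippet below computes the ratio of failed logins per user so that it can drive node color. The event records, the function names, and the 0.5 cut-off are hypothetical. The cut-off only chooses which nodes to highlight for a human viewer; it is not meant as a fixed detection threshold.

```python
from collections import defaultdict

# Hypothetical (user, outcome) login records; real logs would need parsing first.
EVENTS = [
    ("alice", "success"),
    ("alice", "failure"),
    ("bob", "failure"),
    ("bob", "failure"),
    ("bob", "failure"),
    ("bob", "success"),
    ("carol", "success"),
]


def failed_login_ratios(events):
    """Return {user: failed logins / total logins} for every user seen."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for user, outcome in events:
        totals[user] += 1
        if outcome == "failure":
            failures[user] += 1
    return {user: failures[user] / totals[user] for user in totals}


def users_to_highlight(ratios, cutoff=0.5):
    """Users whose failure ratio is above the (assumed) highlighting cut-off."""
    return sorted(user for user, ratio in ratios.items() if ratio > cutoff)


if __name__ == "__main__":
    ratios = failed_login_ratios(EVENTS)
    for user in sorted(ratios):
        print(f"{user}: {ratios[user]:.2f}")
    print("highlight:", users_to_highlight(ratios))
```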
Security Visualization Community
SecViz - Security Visualization This is a place to share, discuss, challenge, and learn about security visualization.
DAVIX - Data Analysis and Visualization Linux
davix.secviz.org
Tools
Capture
- Network tools: Argus, Snort, Wireshark
- Logging: syslog-ng
- Fetching data: wget, ftp, scp
Processing
- Shell tools: awk, grep, sed
- Graphic preprocessing: Afterglow, LGL
- Data enrichment: geoiplookup, whois/gwhois
Visualization
- Network Traffic: EtherApe, InetVis, tnv
- Generic: Afterglow, Treemap, Mondrian, R Project
* Non-exhaustive list of tools
Thank You!
raffy@splunk.com