Presents Securing NoSQL Clusters Adrian Lane, CTO alane@securosis.com Twitter: @AdrianLane David Mortman dmortman@securosis.com Twitter: @
Independent analysts with backgrounds on both the user and vendor side. Focused on deep technical and industry expertise. We like pragmatic. We are security guys - that's all we do. About Securosis
How does big data help with security analytics? and How Do I Protect Data in the Cluster? The Research
Encyclopedia Hutton and the Big Data Blues Source: Wikipedia, property of Warner Bros.
More data of more types Need forensics Need to determine risk Need to detect fraud Need to detect intrusions Need to protect this data Need to automate Management: Get it done!
Security analytics not working! My systems won't do the forensics Bolt-ons not working with my SIEM or data management systems Won't collect the data types I need Shock/Denial
Why doesn't my SIEM do this? Isn't that what I already bought?
What SIEM promised
Not really... Most SIEMs can't handle the volume of data Most SIEMs can't process all data types Many based upon RDBMS Many can't do complex analysis
I'll buy a security analytics platform Feed event data in Correlate across my SIEM and data warehouse Use my existing policies and reports! Image source: www.nycgo.com No problem!
Image Source: nithyananda-cult.blogspot.com Anger
SIEM Mashup: Anti-Fraud & 3rd Party Analytics, MSP & 3rd Party Monitoring, Advanced Malware Protection, DIY "Big Data", SIEM, Threat Intelligence, General Purpose Analytics
Security Analytics Platforms Each deals with one use case - customers have several Companies need structured, unstructured and semi-structured data analysis Use different platforms internally, some piggyback on select SIEMs, some are standalone Real time _or_ forensic, not both Vendors offer one or two analysis approaches RESTful APIs not available
Bargaining Image source: larainydays.blogspot.com
The Inevitable Questions: Bunch of previously acquired technologies - how do we fit them together? What is the rest of the industry doing? Where are the enterprise grade analytics tools? Who handles fraud and risk and security intelligence and threat analytics? Where do I go to find people?
Encyclopedia Hutton Asks Friends For Advice
DIY Security Analytics! Use Big Data - it scales It handles many types of data You can customize as you see fit It's designed to support analytics
Image courtesy of pragsis.com Hadoop lets you do all this and more - virtually free analytics tools on commodity hardware!
Image source: Problogger.net Big Data Will Save The Day!
Performance Scalability Data volume Data types Fast lookup or fast analysis Flexibility How does big data help?
Image source: monkeysbadmonkeys.wordpress.com Build everything from scratch? Do you know how much this will cost? All new software All new systems Data architect, statisticians and security pros Depression
Big Data is Supposed to Address My Problems
I don't know what I don't know! What pieces do I need? How do I organize data? How will I manage something this complex? How do I secure this critical data? Getting control is not easy
It's all new Pig? Hive? Flume? What does it mean? What exactly is a data architect? It's not SQL? Can I run queries across databases? How does it scale? Key data on what values? How do I secure it?
NoSQL Cluster Architecture (Diagram. Components: Clients, Resource Manager, Node Managers, each with App and Data blocks. Message labels: Client Job Request, Node Status, M-R Status, Resource Request.)
Hadoop Stack
Early days for big data No in-house data scientist Programmers needed Just figuring out what we can do with NoSQL DIY Analytics Today vendors don't know much more than you http://flic.kr/p/efqfy9 Talent Gap
Integration Issues APIs inconsistent/unavailable Log Management & data collection Peer to peer queries and results
Taking on the task that is security analytics with big data. Realizing that platforms like Hadoop are a first step. Cluster security can be done. With the right skills, that can be leveraged to great effect. Acceptance
Building the machine
Applied Big Data Start with Metrics Build a model (aka have a theory) Test it! Having a data scientist type helps
GQM Goal Question Metric
Example - NIST CSF ID.AM: The data, personnel, devices, systems, and facilities that enable the organization to achieve business purposes are identified and managed consistent with their relative importance to business objectives and the organization's risk strategy.
Example - NIST CSF Are network ingress points documented? Are network egress points mapped? Are data flows mapped?
Example - NIST CSF # of undocumented ingress points # of undocumented egress points # of undocumented data flows % business units/business processes/etc. without data flow diagrams % business units/business processes/etc. with data flow diagrams
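The metrics on this slide could be computed from an asset inventory roughly as follows. This is a minimal sketch, not a NIST CSF or SIRA tool: the record types `NetworkPoint` and `BusinessUnit`, their fields, and the sample inventory are all hypothetical and exist only to illustrate the counting.

```python
from dataclasses import dataclass


@dataclass
class NetworkPoint:
    name: str
    direction: str      # "ingress" or "egress"
    documented: bool


@dataclass
class BusinessUnit:
    name: str
    has_data_flow_diagram: bool


def count_undocumented(points, direction):
    """Count points of the given direction that lack documentation."""
    return sum(1 for p in points if p.direction == direction and not p.documented)


def percent_without_diagrams(units):
    """Percentage of business units with no data flow diagram."""
    if not units:
        return 0.0
    missing = sum(1 for u in units if not u.has_data_flow_diagram)
    return 100.0 * missing / len(units)


# Hypothetical inventory, for illustration only.
points = [
    NetworkPoint("vpn-gateway", "ingress", True),
    NetworkPoint("partner-link", "ingress", False),
    NetworkPoint("web-proxy", "egress", True),
    NetworkPoint("legacy-ftp", "egress", False),
]
units = [
    BusinessUnit("finance", True),
    BusinessUnit("hr", False),
    BusinessUnit("engineering", False),
    BusinessUnit("sales", True),
]

undocumented_ingress = count_undocumented(points, "ingress")
undocumented_egress = count_undocumented(points, "egress")
pct_without = percent_without_diagrams(units)
pct_with = 100.0 - pct_without
```

Tracking the same numbers over time is what turns the goal and questions into something measurable.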
SIRA - NIST CSF http://nistcsf.societyinforisk.org
Different Flavors of NoSQL Hadoop - Universal M-R for huge data sets. Great for search, log analysis, ad-hoc queries. Cassandra - Columnar store. Indexed. Best for writing lots of data quickly, few lookups. Highly distributable. CouchDB - General purpose analytics database. Fast insert/few changes. Pre-defined queries. Riak - Super-fast data lookup - like Dynamo - but with data management and scalability. Control system logs and fast devices. Redis - Fast changing data. In memory.
Operational Issues Node & App Validation Admin Access Data at Rest Monitoring Config. Management
Big Data Security Architectures
Model 1: Walled Garden
So if I put a firewall around it
Model 1: Walled Garden Think Mainframe security silo Basically hide the cluster behind firewall User passwords Network segmentation, SSL
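The network segmentation idea can be sketched as a simple allow-list check: only clients inside trusted segments get to talk to the cluster. This is an illustration using Python's standard `ipaddress` module, not a firewall configuration. The segment ranges and the function name `is_inside_garden` are hypothetical.

```python
import ipaddress

# Hypothetical trusted segments; real values come from your network design.
TRUSTED_SEGMENTS = [
    ipaddress.ip_network("10.20.0.0/24"),   # cluster nodes
    ipaddress.ip_network("10.30.5.0/28"),   # admin jump hosts
]


def is_inside_garden(client_ip):
    """Return True if the client address falls inside a trusted segment."""
    address = ipaddress.ip_address(client_ip)
    return any(address in segment for segment in TRUSTED_SEGMENTS)
```

Note what this model does not do: once a client is inside the wall, nothing here limits what it can read.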
Beyond the Status Quo http://www.despair.com/tradition.html
Model 2: App Protected
Model 2: App Protected Authenticate Applications Authenticate Users Authorize data access (roles) Filter API requests Audit Activity
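The five controls on this slide can be pictured as a small gateway sitting between clients and the cluster. The sketch below is an assumption-laden illustration, not a Hadoop or NoSQL API: the tables `APP_KEYS`, `USER_ROLES`, `ROLE_PERMISSIONS`, and the function `handle_request` are hypothetical, and the user lookup is only a stand-in for real user authentication.

```python
import hmac
import time

# Hypothetical credential, role, and permission tables; in practice these
# would come from a directory service or the application's own user store.
APP_KEYS = {"reporting-app": "s3cret-app-key"}
USER_ROLES = {"alice": "analyst", "bob": "intern"}
ROLE_PERMISSIONS = {"analyst": {"read"}, "intern": set()}
ALLOWED_OPERATIONS = {"read", "write"}   # API filter: anything else is dropped

audit_log = []


def handle_request(app_id, app_key, user, operation, dataset):
    """Run one request through the checks and return (allowed, reason)."""
    expected_key = APP_KEYS.get(app_id)
    if expected_key is None or not hmac.compare_digest(expected_key, app_key):
        decision = (False, "application not authenticated")
    elif user not in USER_ROLES:
        # Stand-in for real user authentication (password, Kerberos, etc.).
        decision = (False, "user not authenticated")
    elif operation not in ALLOWED_OPERATIONS:
        decision = (False, "operation filtered")
    elif operation not in ROLE_PERMISSIONS[USER_ROLES[user]]:
        decision = (False, "role not authorized")
    else:
        decision = (True, "ok")

    # Audit every request, allowed or denied.
    audit_log.append({
        "time": time.time(),
        "app": app_id,
        "user": user,
        "operation": operation,
        "dataset": dataset,
        "allowed": decision[0],
        "reason": decision[1],
    })
    return decision
```

Denied requests are logged too, since failed attempts are often the more interesting audit records.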
Model 3: Data Centric Approach Tokenization Encryption Masking
Securosis Data Breach Triangle Exploit Egress Data
Tokenization, FPE & Masking
Model 3: Data Centric Approach Protect data before it's put into cluster Can't steal what's not there Removal: Masking Removal: Tokenization Protection: Encryption
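A rough sketch of the two "removal" options, applied to a record before it is loaded into the cluster. Everything here is illustrative: `TokenVault` is a toy in-memory stand-in for a real token server, and the field names `ssn` and `card` are hypothetical. Encryption is left out because it would need a vetted cryptography library rather than hand-rolled code.

```python
import secrets


class TokenVault:
    """Toy in-memory token vault; a real vault lives outside the cluster."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value):
        # Reuse the existing token so the same value always maps to one token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(8)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token):
        # Only callers with access to the vault can recover the original.
        return self._token_to_value[token]


def mask_card_number(card_number):
    """Irreversibly mask all but the last four characters."""
    if len(card_number) <= 4:
        return "*" * len(card_number)
    return "*" * (len(card_number) - 4) + card_number[-4:]


def protect_record(record, vault):
    """Return a copy of the record that is safe to load into the cluster."""
    protected = dict(record)
    protected["ssn"] = vault.tokenize(record["ssn"])
    protected["card"] = mask_card_number(record["card"])
    return protected
```

The cluster only ever sees the token and the masked value, so analytics that join on the token still work while the original data stays out of reach.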
Model 4: Deploy in The Cloud
Given general knowledge of Cloud & NoSQL security, some of you are thinking this does not end well
Reality is different - Security Zones - Data Encryption - Built-in SSL - Authentication - Hyper-segregation - Logging, monitoring - Automated Config Management
Model 4: Leverage Cloud Security Data encryption (SSL, encrypted storage) Key management services Security zones Authentication services Server management (config, patch) Logging & monitoring services
Big Data Security is not easy - Complex environments - No clear definition - Lots of new research - Pragmatic approach - Many more issues - Ongoing research project Easy? No.
Adrian Lane Securosis, L.L.C. David Mortman Dell, Inc.