Applying Machine Learning to Network Security Monitoring. Alex Pinto, Chief Data Scientist, MLSec Project. @alexcpsec @MLSecProject





whoami
Almost 15 years in Information Security, done a little bit of everything.
Most of them leading security consultancy and monitoring teams in Brazil, London and the US.
If there is any way a SIEM can hurt you, it did to me.
Researching machine learning and data science in general for the last 2 years or so, and presenting about its intersection with Infosec for more than a year now.
Created MLSec Project in July 2013.

Agenda
Definitions
Network Security Monitoring
PoC GTFO
Feature Intuition
MLSec Project

Big Data + Machine Learning + Data Science


Big Data

(Security) Data Scientist
"Data Scientist (n.): Person who is better at statistics than any software engineer and better at software engineering than any statistician." -- Josh Wills, Cloudera
Data Science Venn Diagram by Drew Conway

Kinds of Machine Learning
"Machine learning systems automatically learn programs from data" (CACM 55(10), Domingos 2012)
Supervised Learning:
  Classification (NN, SVM, Naïve Bayes)
  Regression (linear, logistic)
Unsupervised Learning:
  Clustering (k-means)
  Decomposition (PCA, SVD)
Source: scikit-learn.github.io/scikit-learn-tutorial/general_concepts.html

Classification Example VS

Considerations on Data Gathering
Models will (generally) get better with more data
Always have to consider bias and variance as we select our data points
Am I selecting the correct features to describe the entities?
Have I got a representative sample of labeled data I can use?
"I've got 99 problems, but data ain't one!"
Domingos, 2012; Abu-Mostafa, Caltech, 2012

Security Applications of ML
Fraud detection systems (not security):
  Is what he just did consistent with past behavior?
SPAM filters
Network anomaly detection:
  Good luck finding baselines
  ML is a bit more than rolling averages
User behavior anomaly detection:
  My personal favorite, 2 new companies/day
  Does fraud detection follow the CLT?

Considerations on Data Gathering (2)
Adversaries - Exploiting the learning process
Understand the model, understand the machine, and you can circumvent it
Any predictive model on InfoSec will be pushed to the limit
Again, think back on the way SPAM engines evolved
Posit: "Intrinsic features of malicious actors cannot be masked as easily as behavioral features"

Network Security Monitoring

Kinds of Network Security Monitoring
Alert-based:
  Traditional log management, SIEM
  Using Threat Intelligence (i.e. blacklists) for about a year or so
  Lack of context
  Low effectiveness
  You get the results handed over to you
Exploration-based:
  Network Forensics tools (2/3 years ago)
  ELK stacks
  High effectiveness
  Lots of people necessary
  Lots of HIGHLY trained people
  Much more promising
Big Data Security Analytics (BDSA):
  Basically exploration-based monitoring on Hadoop and friends
  Sounds kind of painful for the analysts involved

Alert-based + Exploration-based

Using robots to catch bad guys

PoC GTFO
We developed a set of algorithms to detect malicious behavior from log entries of firewall blocks
Over 6 months of data from SANS DShield (thanks, guys!)
After a lot of statistical-based math (true positive ratio, true negative ratio, odds likelihood), it could pinpoint actors that would be 13x-18x more likely to attack you.
Today: reducing the amount of clutter in log files to less than 0.5% of actors worth investigating, with less than 20% false positives in participant deployments.
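The slide names the quantities involved (true positive ratio, true negative ratio, odds likelihood) but not the formula. A minimal sketch, assuming the standard positive likelihood ratio TPR / (1 - TNR) is the kind of figure behind the "13x-18x more likely" claim. The function name and the counts are made up for illustration and are not from the talk.

```python
def likelihood_ratio(tp, fn, tn, fp):
    """Positive likelihood ratio: TPR / (1 - TNR).

    Assumption: this standard definition stands in for the talk's
    "odds likelihood"; the slides do not give the exact formula.
    """
    tpr = tp / (tp + fn)  # true positive ratio (sensitivity)
    fpr = fp / (fp + tn)  # 1 - true negative ratio (specificity)
    if fpr == 0:
        return float("inf")  # flagged actors were never benign
    return tpr / fpr


# Hypothetical counts: 100 actors that did attack, 1000 that did not.
lr = likelihood_ratio(tp=60, fn=40, tn=960, fp=40)
print(f"flagged actors are about {lr:.1f}x more likely to be attackers")
```

Read this way, a ratio in the 13-18 range means a flagged actor is that many times more likely to appear among attackers than among non-attackers.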

Feature Intuition: IP Proximity
Assumptions to aggregate the data
Correlation / proximity / similarity BY BEHAVIOR
"Bad Neighborhoods" concept:
  Spamhaus x CyberBunker
  Google Report (June 2013)
  Moura 2013
Group by Geolocation
Group by Netblock (/16, /24)
Group by BGP prefix
Group by ASN information
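Grouping by netblock can be sketched with Python's standard ipaddress module. This is only an illustration of the aggregation idea, not the project's code: the helper names and the sample addresses (taken from documentation ranges) are assumptions. Grouping by geolocation, BGP prefix or ASN would need external data sources and is not shown.

```python
import ipaddress
from collections import Counter


def netblock(ip, prefix_len):
    """Return the enclosing network of the given prefix length, as a string."""
    return str(ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False))


def count_by_netblock(blocked_ips, prefix_len=24):
    """Count blocked source IPs per netblock (hypothetical helper)."""
    return Counter(netblock(ip, prefix_len) for ip in blocked_ips)


# Made-up firewall block sources.
blocked = ["198.51.100.7", "198.51.100.99", "198.51.101.3", "203.0.113.10"]
by_24 = count_by_netblock(blocked, 24)
by_16 = count_by_netblock(blocked, 16)
print(by_24.most_common(1))
print(by_16.most_common(1))
```

Counts like these become per-neighborhood features: an address with no history of its own can still inherit some suspicion from a busy /24 or /16.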

[Figure: Map of the Internet (Hilbert curve) showing blocks on port 22, 2013-07-20. Labeled regions include "Multicast and friends", CN, BR, TH and RU.]

Feature Intuition: Temporal Decay
Even bad neighborhoods renovate:
  Attackers may change ISPs/proxies
  Botnets may be shut down / relocate
A little paranoia is OK, but not EVERYONE is out to get you (at least not all at once)
As days pass, let's forget, bit by bit, who attacked
Last time I saw this actor, and how often did I see them
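One way to "forget, bit by bit" is an exponentially decayed count of sightings. The slide does not say which decay function is used, so the exponential form, the 7-day half-life and the function name below are all assumptions for illustration.

```python
import math


def decayed_score(sighting_ages_days, half_life_days=7.0):
    """Sum of exponentially decayed weights, one per past sighting.

    Assumption: exponential decay with a configurable half-life; the
    talk only says that older sightings should count for less.
    """
    decay = math.log(2) / half_life_days
    return sum(math.exp(-decay * age) for age in sighting_ages_days)


# Ages, in days, of each time an actor was seen (made-up values).
recent = decayed_score([0, 1, 2])
stale = decayed_score([30, 31, 32])
print(recent > stale)
```

The score combines both questions on the slide: recency, because each sighting's weight shrinks with age, and frequency, because every sighting adds to the sum.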

Feature Intuition: DNS features
Who resolves to this IP address: pDNS data + WHOIS
  Number of domains that resolve to the IP address
  Distribution of their lifetime
  Entropy, size, ccTLDs
  Registrar information
  Reverse DNS information
  History of DNS registration
(Thanks, Farsight Security!)
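A few of these features can be sketched from nothing more than the list of domain names that resolve to an IP address. The sketch assumes "entropy" means the Shannon entropy of the characters in the first label and "size" means its length; the slide does not define either. The function names and sample domains are made up, and no passive DNS or WHOIS lookups are performed.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy of a string, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def domain_features(domains):
    """Aggregate simple name-based features for one IP (hypothetical helper)."""
    labels = [d.split(".")[0] for d in domains]  # first label only
    n = len(labels)
    return {
        "num_domains": n,
        "mean_entropy": sum(shannon_entropy(l) for l in labels) / n if n else 0.0,
        "mean_length": sum(len(l) for l in labels) / n if n else 0.0,
        # Counts every TLD; picking out ccTLDs would need a country-code list.
        "tlds": Counter(d.rsplit(".", 1)[-1] for d in domains),
    }


feats = domain_features(["example.com", "example.net", "xj4k2q9vbz.example"])
print(feats["num_domains"], round(feats["mean_length"], 1))
```

Random-looking names tend to score higher on character entropy than dictionary words, which is one reason this kind of feature gets used to spot machine-generated domains.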

Training the Model
YAY! We have a bunch of numbers per IP address/domain!
How do you define what is malicious or not?
  Curated indicator feeds
  OSINT indicator feeds with some help from statistical-based curating
  Top X lists of visited sites
  Feedback from security tools (if you trust them)
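As a rough sketch of turning those sources into training labels: treat indicator feeds as the malicious class and a top-sites list as the benign class. Leaving conflicting or unseen entities unlabeled is an assumption made here, not something the slides state, and every name and value below is hypothetical.

```python
def label_entities(entities, malicious_feed, benign_list):
    """Assign 1 (malicious), 0 (benign) or None (unlabeled) to each entity.

    Assumption: anything on both lists, or on neither, stays unlabeled
    instead of being forced into a class.
    """
    labels = {}
    for entity in entities:
        bad = entity in malicious_feed
        good = entity in benign_list
        if bad and not good:
            labels[entity] = 1
        elif good and not bad:
            labels[entity] = 0
        else:
            labels[entity] = None
    return labels


# Made-up inputs for illustration.
feed = {"bad.example", "shared.example"}
top_sites = {"popular.example", "shared.example"}
seen = ["bad.example", "popular.example", "shared.example", "new.example"]
labels = label_entities(seen, feed, top_sites)
print(labels)
```

Only the entities labeled 0 or 1 would feed a supervised model; the unlabeled remainder is what the trained model is then asked to score.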

MLSec Project
Working with several companies on tuning these models on their environment with their data
Looking for participants and data sharing agreements
Visit https://www.mlsecproject.org, message @MLSecProject or just e-mail me.

MLSec Project - Current Research
Inbound attacks on exposed services (BlackHat 2013):
  Information from inbound connections on firewalls, IPS, WAFs
  Feature extraction and supervised learning
Malware Distribution and Botnets (hopefully BlackHat 2014):
  Information from outbound connections on firewalls, DNS and Web Proxy
  Initial labeling provided by intelligence feeds and AV/anti-malware
  Some semi-supervised learning involved
User Impersonation in Web Applications (early days):
  Inputs: logs describing authentication attempts (both failed and successful), click stream data
  Segmentation of users by risk level

Thanks!
Q&A at the end of the webinar
Alex Pinto @alexcpsec @MLSecProject https://www.mlsecproject.org/
"Essentially, all models are wrong, but some are useful." - George E. P. Box