Reputation based Security. Vijay Seshadri Zulfikar Ramzan Carey Nachenberg

Similar documents

Insight. Security Response. Deployment Best Practices

Securing the endpoint and your data

Big Data in Action: Behind the Scenes at Symantec with the World s Largest Threat Intelligence Data

Product Roadmap Symantec Endpoint Protection Suzanne Konvicka & Paul Murgatroyd

End to End Security do Endpoint ao Datacenter

MACHINE LEARNING & INTRUSION DETECTION: HYPE OR REALITY?

Integrating MSS, SEP and NGFW to catch targeted APTs

Endpoint Threat Detection without the Pain

Fighting Advanced Threats

Symantec Endpoint Security Management Solutions Presentation and Demo for:

F-Secure Internet Security 2014 Data Transfer Declaration

Firewall Testing Methodology W H I T E P A P E R

Next Generation IPS and Reputation Services

Analyzing HTTP/HTTPS Traffic Logs

An Delivery Report for 2012: Yahoo, Gmail, Hotmail & AOL

Whose IP Is It Anyways: Tales of IP Reputation Failures

Confidence in a Connected World. MEEC Symantec Product Availability. John Lally MD Education Account Executive John_Lally@symantec.

How To Protect Your Data From Being Hacked On Security Cloud

Applying machine learning techniques to achieve resilient, accurate, high-speed malware detection

SPEAR PHISHING AN ENTRY POINT FOR APTS

Protecting the Infrastructure: Symantec Web Gateway

Proactive Rootkit Protection Comparison Test

Symantec's Secret Sauce for Mobile Threat Protection. Jon Dreyfus, Ellen Linardi, Matthew Yeo

KASPERSKY PRIVATE SECURITY NETWORK: REAL-TIME THREAT INTELLIGENCE INSIDE THE CORPORATE INFRASTRUCTURE

McAfee Global Threat Intelligence File Reputation Service. Best Practices Guide for McAfee VirusScan Enterprise Software

The Hillstone and Trend Micro Joint Solution

SECURITY ANALYTICS MOVES TO REAL-TIME PROTECTION

Trend Micro OfficeScan Best Practice Guide for Malware

Mobile App Reputation

Running A Fully Controlled Windows Desktop Environment with Application Whitelisting

Cyber Essentials Questionnaire

CDM Software Asset Management (SWAM) Capability

Getting Started with the iscan Online Data Breach Risk Intelligence Platform

CMX: IEEE Clean File Metadata Exchange. Introduction. Mark Kennedy, Symantec Dr. Igor Muttik, McAfee

How McAfee Endpoint Security Intelligently Collaborates to Protect and Perform

Utilizing Pervasive Application Monitoring and File Origin Tracking in IT Security

Symantec Endpoint Protection

Norton Mobile Privacy Notice

How To Set Up A Shared Insight Cache Server On A Pc Or Macbook With A Virtual Environment On A Virtual Computer (For A Virtual) (For Pc Or Ipa) ( For Macbook) (Or Macbook). (For Macbook

WHITE PAPER: BEST PRACTICES. Sizing and Scalability Recommendations for Symantec Endpoint Protection. Symantec Enterprise Security Solutions Group

Kaspersky Whitelisting Database Test

Secure Your Mobile Workplace

Putting Web Threat Protection and Content Filtering in the Cloud

Best Practices for Running Symantec Endpoint Protection 12.1 on the Microsoft Azure Platform

Symantec Endpoint Protection

Decoding DNS data. Using DNS traffic analysis to identify cyber security threats, server misconfigurations and software bugs

Trend Micro Hosted Security Stop Spam. Save Time.

Zscaler Internet Security Frequently Asked Questions

Comprehensive Filtering. Whitepaper

Zscaler Cloud Web Gateway Test

5 Steps to Advanced Threat Protection

Endpoint Security 2.0: The Emerging Role of Application Whitelisting Solutions. Todd Schell

LASTLINE WHITEPAPER. The Holy Grail: Automatically Identifying Command and Control Connections from Bot Traffic

Rise of the Machines: An Internet-Wide Analysis of Web Bots in 2014

Secure Because Math: Understanding ML- based Security Products (#SecureBecauseMath)

Domain Hygiene as a Predictor of Badness

EXTENDING NETWORK SECURITY: TAKING A THREAT CENTRIC APPROACH TO SECURITY

ClearSkies SIEM Security-as-a-Service (SecaaS) Infocom Security Athens April 2014

W H I T E P A P E R : T E C H N I C AL

REVOLUTIONIZING ADVANCED THREAT PROTECTION

On and off premises technologies Which is best for you?

ThreatSpike Dome: A New Approach To Security Monitoring

Real World and Vulnerability Protection, Performance and Remediation Report

Big Data Big Deal? Salford Systems

Cloud App Security. Tiberio Molino Sales Engineer

GFI Product Comparison. GFI MailEssentials vs Barracuda Spam Firewall

Cybersecurity: An Innovative Approach to Advanced Persistent Threats

Recurrent Patterns Detection Technology. White Paper

HTTPS Inspection with Cisco CWS

Unified Security, ATP and more

STPIC/Admin/002/ / Date: Sub: Quotation for purchase/renewal of Anti Virus Software Reg.

Fight fire with fire when protecting sensitive data

Presentation Title: When Anti-virus Doesn t Cut it: Catching Malware with SIEM

The Latest Internet Threats to Affect Your Organisation. Tom Gillis SVP Worldwide Marketing IronPort Systems, Inc.

Installation and usage of SSL certificates: Your guide to getting it right

Endpoint Security: Moving Beyond AV

COMBATING SPAM. Best Practices OVERVIEW. White Paper. March 2007

Removing Web Spam Links from Search Engine Results

Symantec Endpoint Protection 12.1 Symantec Protection Center 2.0

Endpoint Business Products Testing Report. Performed by AV-Test GmbH

Anti Spam Best Practices

WEB APPLICATION FIREWALLS: DO WE NEED THEM?

Introducing IBM s Advanced Threat Protection Platform

INCREASINGLY, ORGANIZATIONS ARE ASKING WHAT CAN T GO TO THE CLOUD, RATHER THAN WHAT CAN. Albin Penič Technical Team Leader Eastern Europe

Host-based Intrusion Prevention System (HIPS)

Modern Cyber Threats. how yesterday s mind set gets in the way of securing tomorrow s critical infrastructure. Axel Wirth

SSL BEST PRACTICES OVERVIEW

Transcription:

Reputation based Security Vijay Seshadri Zulfikar Ramzan Carey Nachenberg

Agenda Reputation Based Security The Problem Reputation Concept Implementing Reputation Deploying Reputation Conclusion 2

The Problem 3

Mismatched odds Malware authors have switched from mass distribution of a few threats to micro distribution of millions of distinct threats They re targeting each user with a distinct new threat Each has different instructions and a distinct fingerprint Stats: 20k 40k new threats discovered per day 120M distinct threats detected by Symantec over last 12 months The average Vundo variant is distributed to 18 Symantec users! The average Harakit variant is distributed to 1.6 Symantec users! With such micro distribution, what are the odds a security vendor will discover most of these threats? And if the vendor doesn t have a sample, how do they protect the customer? 4

No Existing Protection Addresses the Long Tail Today, both good and bad software obey a long tail distribution. Bad Files Good Files Unfortunately neither technique works well for the tens of millions of files with low prevalence. Prevalen nce Blacklisting works well here. For this long tail a new technique is needed. Whitelisting works well here. 5

Is the Cloud the Answer? Clouds only contain what the AV industry sees Most tthreats t today exist itin very small numbers Remember that long tail! What are the odds that a security vendor is going to discover a threat that targets just one or two PCs? Low! And if they know nothing about such a threat, neither will their cloud! So The Cloud by itself doesn t add much It s just the old signature based model, but very slightly sped up Most cloud based systems stillfail to address thelong tail 6

Ah-Ha Moment (Part 1) Three years ago we started thinking There s got to be a better way! Could there possibly be a third option? What about using a reputation based approach? Companies like ebay, Amazon and Zagat have had great success with this approach! Symantec has > 130M active users Couldn t we somehow leverage this massive installed user base to compute file reputations? 7

Reputation based Security Idea: Use a reputation based approach to rate applications Classic reputation approach (e.g., Amazon.com) Customers rate items they purchase Over time, products build a reputation I only buy items with at least 4 stars. One potential reputation approach: Symantec users rate apps they download Over time, applications build a reputation Symantec Endpoint Protection only allows users to run programs with at least a 4 stars. Challenge for security vendors: Unlike Amazon.com we can t ask our users How do you rate this file? Most users don t know if software they download is good or bad! So, how do we derive reputation scores for applications without prompting the user? Through our research efforts, we believe we ve found a way to anonymously harvest useful reputation information i about websites and applications from our tens of millions of opt in users without requiring them to explicitly tell us anything or make any judgments

How Can Reputation Help with Security? Traditional AV delivers ratings on only the subset of files that trigger signatures or heuristics In contrast, reputation gives us useful data about every file known to the reputation system And the data is not just black and white but nuanced This file is trending bad This file is trending good This data can be used to drive policy based lockdown, supplement existing detection algorithms and inform the user 9

Reputation is Complementary to Fingerprint AV Traditional fingerprints are totally fooled by polymorphism Attackers can simply change malware logic until existing fingerprints are missed Other the other hand, polymorphism sticks out like a sore thumb in a reputation system Most good apps have at least a medium number of users Most good apps have some longevity Polymorphic bad apps have exactly the opposite demographics Benefit: If the attackers polymorph their threats, they yield a lower reputation Ifthe attackers don t polymorph their threats: traditional blacklisting works great We can compute a more accurate reputation based on the larger number of users Reputation is derived by a statistical inference engine analyzing data from 35 million+ users, NOT by deriving it just from signatures. 10

The Impact of Reputation Traditional Products Our Next-gen Products 100% confidence 100% confidence 100% confidence Do not scan Norton Trusted Norton Trusted Do not scan Allow Norton Reputation Safe Scan and allow Potentially miss many malicious files Norton Reputation Warning Warn against (few users, recently released, not from a major publisher URL) (Behavior blocking and IPS may catch these) Norton Reputation Malicious Block Block Exact match signatures Exact match signatures and Heuristics Exact match signatures and Heuristics and Heuristics 11 100% confidence 100% confidence 100% confidence

Implementing Reputation 12

How Reputation Works Symantec Submission Server 1 Collect data 2 Calculate Reputation ti Score Symantec Reputation Server File hash Good/bad Confidence Prevalence Date first seen 13 3 Deliver Reputation Score

Calculating Reputation Our reputation algorithm is in many ways similar to Google s PageRank algorithm Here s a flavor for the approach Inputs: prevalence, age, provenance, demographics of software adopters All files identified by SHA2 hashes for security Reputation computed notbased onthe contents of each file but rather who s using each file! Scale: 35+M millions submitting users Data structure: huge, sparse bipartite graph of machines and files Outputs: Disposition (good/bad), confidence level, time to live M1 M2 M3 F1 F2 F3 F4 F5 1

Data Import & Computation Challenges: Potentially billions of files across hundreds of millions of users need to be tracked Raw data that we receive from customers on the order of hundreds d of gigabytes/day compressed! Need to perform complex mathematical computations on raw data for the statistical learning engine. Need a Highly scalable and available computation system! Solutions: Follow iterative design of the compute and storage clusters to improve mathematical calculations. Optimized computation framework that can work produce/update incremental features for files. We adapted our reputation algorithms to a distributed compute model (map reduce like system). Break a large bi partite graph (files and machines) into smaller file hash based partitions Multiple partitions are mapped to physical nodes We scale the system by adding more nodes 15

Results 16

Initial Results BAD GOOD 17

Initial Detection Results All files are classified as either Good, Bad or Unknown. Results on Malware Results on Goodware 4% unknown 0.1% unknown 0.3% FP 20% missed (FN) 76% malicious (TP) 99.6% clean 208,291 queries on malicious files 52,563,538 queries on clean files These results are above and beyond any protection provided by classical fingerprinting, heuristics and behavior blocking.

ROC Curve X Axis FP rate Y Axis TP rate 1.00 0.90 0.80 0.70 0.60 050 0.50 ROC Curve 0.40 0.30 0.20 0.10 0.00 0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80 0.90 1.00

Reputation Challenges All cloud based systems are subject to DDoS attacks Reputation based systems are hosted in the cloud Subject to the same attacks as fingerprint based clouds DDoS attacks on the cloud affects non reputation systems too! Gaming is the largest concern for reputation Attacker submits under multiple IDs (Sybil Attacks) Attackers suppress info submitted to us Attackers submit forged info to us Attackers use a botnet to scale their attacks Solutions Use cyptography to identify users and prevent invalid submissions Perform back end analytics lti to detect dt tt strange behavior bh & inconsistencies Use encryption to ensure secrecy and authentication Statistical machine learning framework tolerates some noise in the data 20

Deploying Reputation 21

Reputation Use Cases Five distinct use cases: Scan Avoidance Aggressive Heuristics False Positive Mitigation Download Protection Enterprise Policy based Lockdown Checks Reputation when: SEP identifies high reputation software. Our client can now permanently exclude good programs from further scanning. Drastically reduces AV overhead (60 90% of actively used files skipped in NAV 09) A local heuristic suspects malware. Client blocks when the file also has a bad reputation. Heuristics and Reputation are orthogonal so we combine them to catch more malware without higher FPs. A generic signature or behavior blocking detects a new threat. Reputation data is used to reduce the risk of a highly prevalent False Positive detection. Block low reputation files automatically and provide context to users for other files The Client blocks when the file has a bd bad reputation. The user is provided with actionable data to help them decide if a download is legit. The administrator can create custom blocking policies based on their unique risk profile. The Client filters all downloads according to the administrator s Policy The differentiator is not the cloud itself but the information that the cloud delivers Reactive Cloud Scanning Our reputation cloud is also used to deliver our traditional signatures. This helps to address the periods in between traditional infield signature updates.

How useful is a warning? We ve all seen dialog boxes like this that lead to user confusion user confusion and frustration So dialog like this doesn t give a user real information to make an informed decision 23

Symantec Reputation Actionable data for all files When Symantec can t predict a definitive good or bad result, Reputation still provides the user with actionable data How many other Norton users have this file? When was the first time a Norton user saw the file? 4 hours 24

Enterprise Policy based Lockdown You will have the ability to specify your own blocking policy for all downloaded files My finance department can only install high reputation software used by at least 10,000 000 other users and available for at least 2 weeks Our help desk employees can install good reputation software with at least 100 other users. Our CEO can install anything she likes. Given that most malware strains are distributed to less than 20 users, such policies would virtually eliminate traditional malware in the enterprise!

Conclusion 26

Reputation Changes the Game Reputation changes the rules of the malware /anti malware game We no longer need to rely solely on traditional signatures We now use metadata from tens of millions of users to identify malware Attackers can no longer evade us b t ki th i l by tweaking their malware 27

Questions 28

THANK YOU