Hardware Malware Detectors. John Demme, Matthew Maycock, Jared Schmitz, Adrian Tang, Adam Waksman, Simha Sethumadhavan, Salvatore Stolfo. Computer Science Department, Columbia University.
Worms, Trojan Horses, Botnets, Adware, Spyware
Malware in the Wild. [Figure: unique malware database entries per year, 2000 to 2012; vertical axis from 0M to 125M. Source: AV-Test (av-test.org).]
Why are we losing the battle? [Figure: lines of code (log scale, 100 to 10,000,000) by year, 1985 to 2015. Growth of security code: DEC Seal, Milky Way, Stalker, Network Flight Recorder, Snort, Unified Threat Management, ClamAV (2005), ClamAV (2012). Malware: 125 lines. Also marked: expected number of bugs.] Sources: "An analytical framework for cyber security", DARPA's Peter "Mudge" Zatko, 2011, and Ohloh.net.
State of Practice. [Figure: system stack, top to bottom: Applications and Antivirus; Operating System; Hypervisor; Hardware.] Software-based antivirus runs above the O/S, hypervisor and hardware. Operating systems and hypervisors have bugs that get exploited. Software antivirus is vulnerable and can't help it!
Proposed Solution. [Figure: system stack, top to bottom: Applications; Operating System; Hypervisor; Hardware, with Hardware Antivirus at the hardware level.] Hardware-based antivirus is not susceptible to O/S bugs. Hardware tends to have fewer bugs & exploits.
Outline: (1) Detecting malware with performance counters... (2) ...using machine learning. (3) Experimental setup. (4) Results for Android. (5) An architecture for hardware antivirus systems.
How do we build hardware A/V? Software A/V: primitives are files, downloads, file systems, registry entries and system calls; it works by scanning downloads for static signatures. Hardware A/V: primitives are memory, dynamic instructions, µarch events and system calls; how it works: ?
Programs have Unique µarch Signatures. [Figure: L1 exclusive hits and arithmetic µops executed over time for bzip2 (inputs i1, i2, i3), mcf, hmmer, sjeng, libquantum and h264.] Can µarch footprints uniquely identify malware during execution? Could we build a database of malicious µarch signatures?
Machine Learning: Classifiers. [Figure: Training (at A/V vendor): feature vectors labeled "Malware" or "Not Malware" are used to train a classifier. Production (on consumer device): the trained classifier takes a new feature vector and outputs p(malware).]
Machine Learning on µarch Events. [Figure: traces of L1 exclusive hits and arithmetic µops executed over time (bzip2, mcf, hmmer, sjeng, libquantum, omnetpp, h264, astar) are fed to a classifier, which outputs p(malware) over time; one example trace is labeled "Not malware" and another "Malware".] Microarchitectural event counts yield performance vectors over time. Feed each vector into the classifier, which results in p(malware) over time. Average over time, then decide if malware or not with a threshold.
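The three steps on this slide (score each performance vector, average over time, apply a threshold) can be sketched as follows. This is only an illustration under stated assumptions: the k-nearest-neighbors scorer, the two-counter feature vectors, the threshold of 0.5 and all sample values are hypothetical and are not taken from the talk.

```python
from math import dist
from statistics import mean

def knn_p_malware(vector, training, k=3):
    """Return the fraction of the k nearest training vectors labeled malware (1)."""
    nearest = sorted(training, key=lambda item: dist(vector, item[0]))[:k]
    return sum(label for _, label in nearest) / len(nearest)

def classify_trace(trace, training, threshold=0.5, k=3):
    """Score each performance vector, average the scores over time, then threshold."""
    scores = [knn_p_malware(vector, training, k) for vector in trace]
    average = mean(scores)
    return average, average >= threshold

# Hypothetical training set: ((L1 exclusive hits, arithmetic uops executed), label),
# where label 1 means malware and 0 means not malware.
TRAINING = [
    ((10.0, 90.0), 0), ((12.0, 85.0), 0), ((11.0, 95.0), 0),
    ((80.0, 20.0), 1), ((85.0, 25.0), 1), ((78.0, 15.0), 1),
]

# Hypothetical traces: one performance vector per sampling interval.
benign_trace = [(11.0, 88.0), (13.0, 92.0), (9.0, 87.0)]
malware_trace = [(82.0, 22.0), (79.0, 18.0), (12.0, 90.0), (84.0, 24.0)]

if __name__ == "__main__":
    print(classify_trace(benign_trace, TRAINING))
    print(classify_trace(malware_trace, TRAINING))
```

Averaging before thresholding means a single benign-looking interval (the third vector of `malware_trace`) does not flip the overall decision.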
Android Malware Detection. 17x increase in malware detections in 2012 ("Trends for 2013", ESET Latin America's Lab). Test platform: Android 4.1 mobile O/S on an ARM/TI PandaBoard with 6 performance counters.
Malware Families & Variants. [Figure: several variants of a "GPS Logger" malware family.] A family is a set of variants which all do similar things. Variants are usually packaged with different host software. Expectation: similar malicious code, different host code.
Can we detect new variants after seeing only old ones? [Figure: variants of the "GPS Logger" family divided into two groups.] Train the classifier on some of the malware apps in a family, then evaluate it with different variants in the same family.
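A minimal sketch of such a split, assuming each app is described by a (family, variant) pair. The function name, the 50% test fraction and the variant identifiers are hypothetical; the talk does not state how the variants were divided.

```python
import random
from collections import defaultdict

def split_variants_by_family(apps, test_fraction=0.5, seed=0):
    """Divide each family's variants between train and test, with no variant in both."""
    by_family = defaultdict(list)
    for family, variant in apps:
        by_family[family].append(variant)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for family, variants in sorted(by_family.items()):
        variants = sorted(variants)
        rng.shuffle(variants)
        # A family with a single variant cannot be split; keep it for training.
        n_test = max(1, int(len(variants) * test_fraction)) if len(variants) > 1 else 0
        test += [(family, v) for v in variants[:n_test]]
        train += [(family, v) for v in variants[n_test:]]
    return train, test

# Hypothetical variant identifiers for three families.
APPS = [
    ("GoldDream", "variant-a"), ("GoldDream", "variant-b"),
    ("GoldDream", "variant-c"), ("GoldDream", "variant-d"),
    ("Tapsnake", "variant-a"), ("Tapsnake", "variant-b"),
    ("DogWars", "variant-a"),
]

if __name__ == "__main__":
    train, test = split_variants_by_family(APPS)
    print("train:", train)
    print("test:", test)
```

Every family with at least two variants contributes to both sets, so the evaluation only ever sees variants the classifier was not trained on.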
Experimental Setup. 503 malware apps from 37 families, taken from an internet repository [1] and previous work [2]: Tapsnake, Endofday, Zitmo, Loozfon-android, Android.Steek, Android.Trojan.Qicsomos, CruseWin, Jifake, AnserverBot, Gone60, YZHC, FakePlayer, LoveTrap, Bgserv, KMIN, DroidDreamLight, HippoSMS, Dropdialerab, Zsone, AngryBirds-Lena.C, jsmshider, Plankton, PJAPPS, Android.Sumzand, RogueSPPush, FakeNetflix, GEINIMI, SndApps, GoldDream, CoinPirate, BASEBRIDGE, DougaLeaker.A, Newzitmo, BeanBot, GGTracker, FakeAngry, DogWars. 210 non-malware apps: the most popular apps on Google Play, plus system applications (ls, bash, com.android.*). 3.68e8 data points total.
[1] http://contagiominidump.blogspot.com/
[2] Y. Zhou and X. Jiang, "Dissecting Android malware: Characterization and evolution," in Security and Privacy (SP), 2012 IEEE Symposium on, pp. 95-109, May 2012.
Very Realistic, Noisy Data. Network connectivity allowed: malware could phone home, and additional noise was introduced. Input bias allowed: multiple users conducted data collection. Contamination between training & testing data prevented: non-volatile storage was wiped, eliminating sticky malware, and the training/testing split was made before data collection. Environmental noise allowed: malware ran with system applications, not isolated. This makes our task harder and the feasibility study better.
Experiments in Paper: (1) Detection of malicious packages on Android. (2) Detection of malicious threads on Android. (3) Linux rootkit detection. (4) Cache side-channel attack detection.
Detection Rates. [Figure: percentage of malware identified vs. false positive rate, with "Ideal" and "Random" reference curves.]
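Curves of this kind plot detection rate against false positive rate as the decision threshold on p(malware) is swept. The sketch below shows one way to compute such points; the function name and the sample scores and labels are hypothetical and are not the talk's data.

```python
def roc_points(scores, labels):
    """Return (false positive rate, detection rate) pairs in percent, one per threshold.

    scores: p(malware) per app; labels: 1 for malware, 0 for non-malware.
    Assumes at least one malware and one non-malware sample.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing is flagged
    for threshold in sorted(set(scores), reverse=True):
        flagged = [label for score, label in zip(scores, labels) if score >= threshold]
        true_pos = sum(flagged)
        false_pos = len(flagged) - true_pos
        points.append((100.0 * false_pos / negatives, 100.0 * true_pos / positives))
    return points

# Hypothetical classifier outputs for four apps.
SCORES = [0.9, 0.7, 0.6, 0.2]
LABELS = [1, 0, 1, 0]

if __name__ == "__main__":
    for fpr, tpr in roc_points(SCORES, LABELS):
        print(f"false positives {fpr:5.1f}%  malware identified {tpr:5.1f}%")
```

An ideal detector passes through (0, 100), identifying all malware with no false positives, while a random one stays near the diagonal.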
Malware Family Detection. [Figure: percentage of malware identified vs. false positive rate.]
Rootkit Detection. [Figure: percentage of malware identified (0 to 100) vs. false positive rate (0 to 100).]
Result Summary. Detection of malicious packages on Android: 90% accuracy. Detection of malicious threads on Android: 80% accuracy. Linux rootkit detection: about 60% accuracy (a difficult problem; rootkits are tiny slices of execution). Cache side-channel attack detection: 100% accuracy, no false positives.
One Way to Improve Results. [Figure: an Android software package containing both non-malware (e.g. Angry Birds, Pandora) and malware.] Malware writers package malware with non-malware. Problem: what's actually malware? Our (bad) solution: label all of it as malware, which raises false positives.
Recommendations for Hardware A/V System Design: 1. Provide strong isolation mechanisms to enable anti-virus software to execute without interference. 2. Investigate both on-chip and off-chip solutions for the A/V implementations. 3. Allow performance counters to be read without interrupting the executing process. 4. Ensure that the A/V engine can access physical memory safely. 5. Investigate domain-specific optimizations for the A/V engine. 6. Increase performance counter coverage and the number of counters available. 7. The A/V engine should be flexible enough to enforce a wide range of security policies. 8. Create mechanisms to allow the A/V engine to run in the highest privilege mode. 9. Provide support in the A/V engine for secure updates.
Strong isolation mechanisms enable anti-virus software to execute without interference. [Figure: a strongly isolated A/V core attached to the NOC and to an isolated secure bus.] NOC: Rx for performance counter data, Tx for security exceptions. Non-interruptible: the A/V requires data from cores, so starvation == exploit. Isolated secure bus: the A/V uses off-die memory, so starvation == exploit.
Allow Secure System Updates. Update package (encrypted & signed): trained classifier, action program, revision number. Update protocol: decrypt, verify signature, check revision number. Disallows access to classifiers & action programs.
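The update protocol's three steps can be sketched as below. Several things here are assumptions made only for illustration: an HMAC stands in for the signature, the cipher is a caller-supplied function (the demo uses an identity placeholder, which is not encryption), the revision check is taken to mean rejecting anything not newer than the installed revision, and the package layout and function names are hypothetical.

```python
import hashlib
import hmac
import json

def make_package(payload, key, encrypt):
    """Build a hypothetical update package: ciphertext plus an authentication tag."""
    plaintext = json.dumps(payload).encode()
    tag = hmac.new(key, plaintext, hashlib.sha256).digest()
    return {"ciphertext": encrypt(plaintext), "tag": tag}

def apply_update(package, key, installed_revision, decrypt):
    """Decrypt, verify, and revision-check an update; return its payload or raise ValueError."""
    plaintext = decrypt(package["ciphertext"])                # step 1: decrypt
    expected = hmac.new(key, plaintext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, package["tag"]):     # step 2: verify (HMAC stand-in)
        raise ValueError("signature check failed")
    payload = json.loads(plaintext)
    if payload["revision"] <= installed_revision:             # step 3: check revision number
        raise ValueError("stale revision")
    return payload

def identity(data):
    """Placeholder cipher for the demo only; it performs no encryption."""
    return data

if __name__ == "__main__":
    key = b"hypothetical-device-key"
    update = {"revision": 7, "classifier": "weights-v7", "action_program": "quarantine"}
    package = make_package(update, key, identity)
    print(apply_update(package, key, installed_revision=6, decrypt=identity))
```

Checking the revision only after the tag verifies means an attacker cannot feed the engine an unauthenticated revision number, and rejecting older revisions blocks replay of a previously valid package.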
Contributions. (1) First hardware-based antivirus detection. Promising results, with reasons to believe results will improve: the first branch predictors started at 80% accuracy... (2) Dataset available: http://castl.cs.columbia.edu/colmalset. Much to follow on: 0-day exploit detection, attacks, counterattack malware detection, better machine learning, precise training labels, ML accelerators, prototypes, etc.