Challenges in Cyber Security Experiments: Our Experience
Annarita Giani, UC Berkeley; George Cybenko, Dartmouth College; Vincent Berk, Dartmouth College; Eric Renouf, Skaion
Outline 1. The Situational Awareness system to be tested (Dartmouth) 2. The blind test 3. Ground truth (Skaion) 4. Performance measures (AFRL) 5. Results and conclusion
Process Query System
Observable events coming from sensors feed the PQS engine, which matches them against process models using tracking algorithms to produce hypotheses.
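The loop above can be sketched in a few lines. This is illustrative only: the class and function names are assumptions, not the actual Dartmouth PQS API, and real PQS models are far richer than the exact-sequence matching shown here.

```python
# Minimal sketch of a PQS-style correlation loop (names are assumptions).

class Model:
    """A process model: an ordered list of event types to observe."""
    def __init__(self, name, steps):
        self.name = name
        self.steps = steps            # e.g. ["SCAN", "INFECT", "EXFIL"]

class Track:
    """A hypothesis: the events matched so far against one model."""
    def __init__(self, model):
        self.model = model
        self.events = []

    def complete(self):
        return len(self.events) == len(self.model.steps)

    def accepts(self, event):
        return event["type"] == self.model.steps[len(self.events)]

def pqs_engine(models, event_stream):
    """Correlate a stream of sensor events into model hypotheses."""
    tracks = []
    for event in event_stream:
        # a new hypothesis may begin at any event, for any model
        tracks.extend(Track(m) for m in models)
        for track in tracks:
            if not track.complete() and track.accepts(event):
                track.events.append(event)
    return [t for t in tracks if t.complete()]
```

Feeding this engine a two-step "worm" model and a matching event stream yields one complete track, i.e. one confirmed hypothesis.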
Hierarchical PQS Architecture
Tier 1 PQS instances consume raw sensor events:
- Scanning PQS: events from Snort and IPTables
- Infection PQS: events from Snort and Tripwire
- Data Access PQS: events from Samba
- Exfiltration PQS: events from the flow and covert channel sensor
Tier 1 hypotheses become Tier 2 observations, which are matched against more complex Tier 2 models to produce the final results.
PQS in Computer Security
Today, security analysts look at the data and make hypotheses by hand: Experience + Education + Expertise = Expensive. In our deployment, sensors across the network (DIB:s, BGP, IPTables, Snort, Tripwire, Samba), placed on the bridge, the DMZ, and internal hosts (WWW, mail, WinXP and Linux workstations), feed their observations to the PQS engine instead.
Sensors and Models
Sensors:
1. DIB:s (Dartmouth ICMP-T3 Bcc: System)
2. Snort, Dragon (signature-matching IDS)
3. IPtables (Linux Netfilter firewall, log based)
4. Samba (SMB server, file access reporting)
5. Flow sensor (network analysis)
6. ClamAV (virus scanner)
7. Tripwire (host filesystem integrity checker)
Tier 2 models:
1. Noisy Internet Worm Propagation (fast scanning)
2. Email Virus Propagation (hosts aggressively send emails)
3. Low & Slow (stealthy scans of our entire network)
4. Unauthorized Insider Document Access (insider information theft)
5. Multistage Attack (several penetrations inside our network)
6. Data movement
Example PQS model: macro in a Word document for exfiltration
A Word virus opens an FTP connection to a server and uploads documents. The Tier 1 model tracks states 1 (VIRUS), 2 (Balanced Flow), 3 (Data Exfiltration), and 4 (RECON), with transitions firing on Balanced Flow or Data Exfiltration events.
Phishing Attack
The act of sending an e-mail to a user, falsely claiming to be an established legitimate enterprise, in an attempt to scam the user into surrendering private information. The e-mail directs the user to visit a web site where they are asked to update personal information.
1. Visit http://www.cit1zensbank.com
2. The bogus web site asks for first name, last name, account number, SSN
3. The first name, last name, account number, and SSN are captured
Complex Phishing Attack: Steps
1. The victim (100.10.20.9) visits a web page ("Madame X", 165.17.8.126) and inserts a username and password (the same used to access his machine).
2. The web page records the username and password.
3. The attacker (51.251.22.183) accesses the user's machine using the username and password.
4. The attacker uploads some code.
5. The victim browses the web as usual.
6. The attacker downloads some data through a stepping stone (100.20.3.127).
Complex Phishing Attack: Observables
1. RECON (Sept 29 11:17:09): SNORT KICKASS_PORN and DRAGON PORN HARDCORE alerts; source is the victim (100.10.20.9), destination the web server used ("Madame X", 165.17.8.126), where the username and password are captured.
2. SNORT SSH (Policy Violation) NON-STANDARD-PROTOCOL (Sept 29 11:23:56): the attacker (51.251.22.183).
3. DATA UPLOAD, flow sensor (Sept 29 11:23:56).
5. DATA DOWNLOAD, flow sensor (Sept 29 11:24:07), through the stepping stone (100.20.3.127).
Phishing Attack Model 1 (very specific): a seven-state machine (states 1-7) whose transitions fire on specific RECON, UPLOAD, and DOWNLOAD events.
Phishing Attack Model 2 (less specific): the same seven states, but transitions also accept RECON or COMPROMISE events, with source/destination constraints (src, dst, !src) on each edge.
Phishing Attack Model 3 (more general): RECON or COMPROMISE events are accepted at every state, again with source/destination constraints (src, dst, !src) on each edge.
Phishing Attack Model 4 (most general): four states, with RECON, UPLOAD, or DOWNLOAD events accepted at each transition.
Stricter models reduce false positives, but less strict models can detect unknown attack sequences.
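The strict-vs-loose tradeoff stated above can be sketched with two toy matchers. The event names (RECON, COMPROMISE, UPLOAD, DOWNLOAD) come from the slides; the matchers themselves are simplified assumptions, not the actual PQS models.

```python
# Two toy phishing matchers: strict (model 1 style) vs loose (model 4 style).
STRICT_SEQUENCE = ["RECON", "UPLOAD", "UPLOAD", "DOWNLOAD"]
LOOSE_EVENTS = {"RECON", "COMPROMISE", "UPLOAD", "DOWNLOAD"}

def strict_match(events):
    """Fires only on one exact event sequence: few false positives."""
    return [e["type"] for e in events] == STRICT_SEQUENCE

def loose_match(events):
    """Fires whenever all events are attack-related: catches variants."""
    return len(events) > 0 and all(e["type"] in LOOSE_EVENTS for e in events)

known   = [{"type": t} for t in ["RECON", "UPLOAD", "UPLOAD", "DOWNLOAD"]]
variant = [{"type": t} for t in ["COMPROMISE", "UPLOAD", "DOWNLOAD"]]
# strict_match fires on `known` but misses `variant`; loose_match
# catches both, at the cost of more false positives on benign traffic.
```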
Outline 1. The Situational Awareness system to be tested (Dartmouth) 2. The blind test 3. Ground truth (Skaion) 4. Performance measures (AFRL) 5. Results and conclusion
Blind Test
December 12-14, 2005. Who was there: the system developer team, the reviewers, and the ground truth developer team.
The SA system developer team was given four hard drives full of network and host data and had to provide the list of the attacks that generated that data. The responses were used to evaluate the systems and guide development.
The collected data is an anonymized stream of network traffic, captured using tcpdump; it amounted to hundreds of gigabytes of raw network traffic.
Outline 1. The Situational Awareness system to be tested (Dartmouth) 2. The blind test 3. Ground truth (Skaion) 4. Performance measures (AFRL) 5. Results and conclusion
Ground truth (1/2)
Skaion provides offline data sets consisting of a variety of sensor data streams captured during simulated scenarios. During those scenarios, normal background traffic is provided by the Skaion Traffic Generation System (TGS). An attack is defined as an attempt to gain privileges or information in excess of those granted.
A ground truth system should ideally be able to:
1. handle network-based and host-based alerts.
2. uniquely distinguish every event.
3. categorize all captured behavior, not just malicious behavior.
4. be used by humans or programs to understand the documented scenario.
5. provide multiple flexible perspectives.
Ground truth (2/2)
Skaion created malicious-outsider data sets that included:
1. captured network traffic
2. IDS alerts
3. application log files
Ground truth was represented using four methods:
1. a narrative description of the attacks
2. a formatted list of attack steps
3. a per-sensor breakdown of false and true positives and negatives
4. a relational database of all captured events
Sam Gordon, Eric Renouf, "Role-Based Ground Truth for Generated Attack Scenarios," Skaion Corporation.
Outline 1. The Situational Awareness system to be tested (Dartmouth) 2. The blind test 3. Ground truth (Skaion) 4. Performance measures (AFRL) 5. Results and conclusion
Definitions
The raw data or input streams for a Cyber SA system are the events generated by network sensors. These events are considered the evidence (observations) of a Cyber SA system. When the evidence from multiple data streams is fused together in such a way as to identify a potential attack, we call this collection of evidence an attack track. A situation is the set of all attack tracks at a given point in time.
From Tadda, "Measuring Performance of Cyber Situation Awareness Systems," 11th International Conference on Information Fusion, 2008; John Salerno, "Measuring Situation Assessment Performance through the Activities of Interest Score," 11th International Conference on Information Fusion, 2008.
These definitions were not yet available at the time of the test.
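One way to make these definitions concrete is to render them as data structures. The field names below are assumptions for illustration; they are not taken from Tadda's paper.

```python
# Evidence, attack tracks, and situations as simple data structures
# (field names are illustrative assumptions).
from dataclasses import dataclass, field
from typing import List

@dataclass
class Event:
    """Raw sensor output: the evidence (observations)."""
    sensor: str
    event_type: str

@dataclass
class AttackTrack:
    """Evidence from multiple streams, fused into one potential attack."""
    evidence: List[Event] = field(default_factory=list)

@dataclass
class Situation:
    """The set of all attack tracks at a given point in time."""
    time: str
    tracks: List[AttackTrack] = field(default_factory=list)
```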
Metrics of Performance The metrics fall into four dimensions: 1. confidence 2. purity 3. cost utility 4. timeliness Tadda: Measuring performance of cyber situation awareness systems 11th International Conference on Information Fusion, 2008
Confidence
How well the system detects the true attack tracks:
(1) recall
(2) precision
(3) fragmentation
(4) mis-association
Low recall corresponds to false negatives; low precision corresponds to false positives.
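The first three metrics can be sketched as follows. Recall and precision follow their standard definitions; the fragmentation formula here is an assumption, since Tadda's exact variant is not given in the slides.

```python
# Hedged sketch of the confidence metrics (fragmentation is assumed).

def recall(true_detected, true_total):
    """Fraction of the true attack tracks the system detected."""
    return true_detected / true_total

def precision(true_detected, reported_total):
    """Fraction of reported attack tracks that were true attacks."""
    return true_detected / reported_total

def fragmentation(tracks_per_attack):
    """Extra tracks beyond one per detected attack (0 is ideal)."""
    return sum(n - 1 for n in tracks_per_attack)
```

For example, detecting 4 of 5 true attacks while reporting 8 tracks gives a recall of 0.8 and a precision of 0.5.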
Purity
The quality of the correctly detected tracks: how well the evidence is being correlated and aggregated into an attack track. Purity measures whether the system assigned irrelevant evidence to a track or used only directly useful evidence, and how much of the available observation was truly used.
Cost Utility
The ability of a system to identify the important or key attack tracks with respect to the concept of cost. In [8], two cost utility metrics were described.
Timeliness
The ability of the system to respond within the time requirements of a particular domain.
Assessment Process (figure from the 11th International Conference on Information Fusion, 2008)
Outline 1. The Situational Awareness system to be tested (Dartmouth) 2. The blind test 3. Ground truth (Skaion) 4. Performance measures (AFRL) 5. Results and conclusion
Complex Phishing Attack: Results
Without observations from the Dragon and flow sensors:
- Attack steps: 0 of 5
- Background attackers: 9 of 15
- Background scanners: 25 of 55
- Stepping stones: 0 of 1
Using Dragon and flow observations:
- Attack steps: 5 of 5
- Background attackers: 10 of 15
- Background scanners: 23 of 55
- Stepping stones: 1 of 1
- False alarms: 1
Summary of Results
Bar charts (0-100) compare the system across scenarios 4s1, 4s3, 4s4, 4s13, 4s14, 4s5, 4s6, 4s8, 4s16, and 4s17 at threshold values 0.0, 0.5, and 0.75:
- Precision (goal: above average)
- Fragmentation (goal: below average)
- Mis-associations (goal: below average)
Scenario 4s14 is the phishing attack.
Putting Everything Together
- Valuable feedback on performance and design.
- (Animated) disagreement on the definition of what an attack is, which can lead to different degrees of measured performance.
- Sleepless nights getting the results into the format the reviewers wanted so that they could run their assessment software.
- Are the precision measurements proposed by the reviewers good? How do we assess that? This leads to the problem of choosing the right measures of performance.