Using big data analytics to identify malicious content: a case study on spam emails




Mamoun Alazab & Roderic Broadhurst
Mamoun.alazab@anu.edu.au
http://cybercrime.anu.edu.au

Outline
- Background
- Cybercrime and SPAM?
- Importance of Big Data Analytics
- Data description & Analysis
- Summary
- Q&A

Background (ANU Cybercrime Observatory [1])
Team research interests:
- Criminology/Sociology
- Organised Crime
- Law & Regulation
- Information Security
- Malware Analysis
- Phishing attacks
- Police and media cases
- Computer Forensics
[1] http://cybercrime.anu.edu.au

[Figure labels: Spammed Messages; Social Networking Websites; Worm; Malicious Websites; Removable Devices; Install Malware; Become Zombie]
Spam, as a form of social engineering, enables malware to reach high-volume, low-value targets, which makes it one of the most popular means of spreading and injecting malware onto computers.

Cybercrime definition from legislation
Definition of high tech crime (Australian Federal Police) [1]: High tech crime offences are defined in Commonwealth legislation in Part 10.7 - Computer Offences of the Criminal Code Act 1995 and include:
- computer intrusions (for example, malicious hacking)
- unauthorised modification of data, including destruction of data
- denial-of-service (DoS) attacks
- distributed denial of service (DDoS) attacks using botnets
- the creation and distribution of malicious software (for example, viruses, worms, trojans).
[1] AFP, link: http://www.afp.gov.au/policing/cybercrime/hightech-crime.aspx

Spam (Def.)
It is hard to define the term spam accurately. Some hold that spam is an issue of consent, not content, while others believe it is an issue of content, not consent. Still others believe it is about quantity or scale. In general, the word spam is commonly used to describe unsolicited e-mails that are sent in bulk [1]. Certain definitions also stress the commercial nature of spam [2].
[1] Commission communication on unsolicited commercial communications or "spam", p. 5
[2] For example, the US CAN-SPAM Act of 2003 establishes requirements for those who send commercial e-mail.

Spam (Def.), cont.
In Australia, ACMA defines spam as unsolicited commercial electronic messages; even a single electronic message can be considered spam under Australian law. Spamhaus, on the other hand, defines spam differently and considers an email to be spam only if it is both unsolicited and sent in bulk. The terms "bulk", "commercial" and "unsolicited" are in themselves problematic, as they do not provide enough flexibility to deal with the variety of content distributed using modern digital means of communication.

Why Big Data when fighting spam
Dealing with spam introduces a number of Big Data challenges.
- The total size and scale of the data is enormous. In the 1990s, the average PC user received one or two spam messages a day. Spam volume has since been estimated at 200 billion messages sent per day (circa August 2010; see Josh Halliday, 2011; Symantec, MAAWG, 2013).
- Suppressing spam requires an understanding of complex patterns of behaviour and the capacity to identify new types of spam.
- Around 96% of all email messages are estimated to be spam.

Why Big Data when fighting spam (cont.)
- Of all spam emails sent on any one day, an average of 3.3% contained malicious attachments; the share was higher for suspect web pages (perhaps 1 in 5, ephemeral).
- Spammers collect gross worldwide revenue on the order of $200 million per year (Google, Microsoft, Yahoo, 2012).
- Spam is now associated with recent crime toolkits (e.g. Blackhole, Zeus).

Focus
Emails containing malicious content: they attempt to compromise the security of a computer by luring the recipient to click on a fake or infected URL that links to a malicious web site ("landing page") or downloads a malicious attachment with a zero-day exploit. This holds regardless of the source, i.e. phishing or spear phishing.

Data Set
- Data were provided by the Australian Communications and Media Authority's (ACMA) Spam Intelligence Database (SID) and the Computer Emergency Response Team (CERT) Australia.
- SID: three sources, received in anonymised form. Only two data sources have been processed thus far (2012).
- The raw spam data run to millions of messages.
- Our analysis looked only at messages that appear to have been relayed through Australia, for example those whose last-hop IP address was located in Australia.
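The last-hop filter described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: it assumes the last hop is the first bracketed IPv4 address in the topmost Received header, and `AU_NETWORKS` is a hypothetical placeholder (a documentation-only range) standing in for a real geolocation lookup.

```python
import ipaddress
import re
from email import message_from_string

# Hypothetical placeholder: a documentation-only range standing in for
# Australian address blocks. A real study would use a geolocation database.
AU_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]

# Matches an IPv4 address written in square brackets, e.g. [203.0.113.7].
BRACKETED_IPV4 = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def last_hop_ip(raw_message):
    """Return the first bracketed IPv4 address in the topmost Received header.

    Each relay prepends its own Received header, so the topmost one is
    assumed here to describe the last hop. Returns None if nothing usable
    is found.
    """
    msg = message_from_string(raw_message)
    received = msg.get_all("Received") or []
    if not received:
        return None
    match = BRACKETED_IPV4.search(received[0])
    if match is None:
        return None
    try:
        return ipaddress.ip_address(match.group(1))
    except ValueError:  # e.g. an octet above 255
        return None


def relayed_through_australia(raw_message):
    """True if the assumed last-hop address falls in one of AU_NETWORKS."""
    ip = last_hop_ip(raw_message)
    return ip is not None and any(ip in net for net in AU_NETWORKS)
```

Received headers can be forged by the sender, so a filter like this is best read as "appears to have been relayed through Australia", which matches the wording of the slide.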

Month   Habul data set     Botnet data set
        (# spam emails)    (# spam emails)
Jan                  67             31,991
Feb                 104             49,085
Mar                  75             45,413
Apr                  65             33,311
May                  83             28,415
Jun                  94             11,587
Jul                  72             16,251
Aug                  85             21,970
Sep                 363             27,819
Oct                  73             13,426
Nov                 193             17,145
Dec                  95             20,696
Total             1,369            317,109

The 3 V's of Big Data
Machine learning and data mining are well-established techniques in the world of IT, especially among web companies and startups. Spam detection is made possible by mining the huge amount of data available. However, big data is not only about Volume, but also about Velocity and Variety (the 3 V's of big data).
- Volume: data quantity
- Velocity: data speed
- Variety: data type

Email Attacks
Malware and phishing are becoming combined:
- Poisoned attachments (e.g. custom PDF exploits)
- Links to web sites with malware (web browser exploits)
- Install Trojan or remote access software
Attackers use:
- Fake domains: PayPal vs. PayPaI (the last letter is a capital I, not a lowercase L)
- Compromised sites: hosting malicious software
- URL shortening services: hide the real URL
- Droppers: malicious code on sites that drops malware when a webpage is visited
- Spear-phishing: targets specific groups or individuals, usually attractive targets with limited guardianship
- Socially engineered, deceptive or tailored email content (e.g. advance-fee frauds)
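The fake-domain trick (PayPal vs. PayPaI) can be illustrated with a small check that folds look-alike characters together before comparing a domain against a watch list. This is a minimal sketch, not a method from the slides: `CONFUSABLES` and `KNOWN_BRANDS` are hypothetical and deliberately tiny.

```python
# Hypothetical, deliberately small map of characters that are easily
# mistaken for another character in common fonts.
CONFUSABLES = str.maketrans({"I": "l", "1": "l", "0": "o"})

# Hypothetical watch list of domains worth protecting.
KNOWN_BRANDS = {"paypal.com"}


def skeleton(domain):
    """Fold look-alike characters together, then lowercase.

    The translation runs before lowercasing on purpose: a capital I
    resembles a lowercase l, but a lowercase i does not.
    """
    return domain.translate(CONFUSABLES).lower()


def lookalike_of(domain):
    """Return the brand that `domain` visually imitates, or None."""
    if domain.lower() in KNOWN_BRANDS:
        return None  # the genuine domain, not an imitation
    skel = skeleton(domain)
    for brand in KNOWN_BRANDS:
        if skel == skeleton(brand):
            return brand
    return None
```

With this sketch, `lookalike_of("PayPaI.com")` reports `paypal.com`, while the genuine `PayPal.com` is left alone. A fuller treatment would also have to cover non-ASCII look-alike characters.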

Trends (1/2)
Spam and spam campaigns are often sent:
- In large quantities
- At certain times or time frames
Other observations:
- Seemingly harmless URLs that can redirect to compromised web sites.
- Inconspicuous file names and extensions, e.g. tracking_instructions.pdf.zip
- Social engineering tactics: attackers use common business terms in the file names as spear-phishing bait, e.g. ups, amazon, HP_document, etc.
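The file-name patterns above lend themselves to simple checks. The sketch below is illustrative only: the extension sets are assumptions chosen for the example, and the bait terms are just the three named on the slide.

```python
from pathlib import PurePosixPath

# Assumed extension sets, chosen for illustration only.
DOCUMENT_EXTS = {".pdf", ".doc", ".docx", ".xls", ".jpg", ".txt"}
CARRIER_EXTS = {".zip", ".exe", ".scr", ".rar"}

# Bait terms taken from the slide (matched case-insensitively).
BAIT_TERMS = ("ups", "amazon", "hp_document")


def has_double_extension(filename):
    """True if a document-like extension is followed by a carrier extension,
    as in tracking_instructions.pdf.zip."""
    suffixes = [s.lower() for s in PurePosixPath(filename).suffixes]
    return (
        len(suffixes) >= 2
        and suffixes[-2] in DOCUMENT_EXTS
        and suffixes[-1] in CARRIER_EXTS
    )


def bait_terms(filename):
    """Return the bait terms that occur anywhere in the file name."""
    name = filename.lower()
    return [term for term in BAIT_TERMS if term in name]
```

Plain substring matching is crude (for instance, "ups" also occurs inside "groups"), so a check like this is better suited to flagging messages for further analysis than to deciding on its own.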

Trends (2/2)
- Same attachments with different email bodies.
- Same email body with different attachments.
- ZIP files remain the preferred file format for malware delivery over email (potentially delivering a high payload). Malware is delivered in ZIP format in an estimated 91% of identified cases in our data.
- Malware authors (spammers) focus on evasion (e.g. double extensions, obfuscation, code changes).
- URLs that appear not to work: evidence of a so-called "waterhole" (watering hole) attack.

Zeus Virus
[Figure labels: Zeus code injection; Legitimate webpage]

Ransomware

Identifying malicious spam emails
We parsed the raw data into a database, then extracted the attachments and URLs and uploaded them to VirusTotal [1].
[1] A free online virus checker that offers support for academic researchers, used to scan for viruses and suspicious content. VirusTotal uses over 40 different virus scanners; we consider an attachment or URL to be malicious if at least one scanner shows a positive result. https://www.virustotal.com
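The extraction step and the "at least one scanner" rule can be sketched as below. This is an assumption-laden illustration rather than the authors' code: it does not call VirusTotal at all, and `scanner_verdicts` is a hypothetical mapping from scanner name to a boolean, not VirusTotal's actual response format.

```python
import re
from email import message_from_string

# Simple URL pattern for illustration; real-world URL extraction is messier.
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def extract_artifacts(raw_message):
    """Return (urls, attachments) found in a raw email message.

    `attachments` is a list of (filename, bytes) pairs; `urls` are taken
    from the text parts that are not attachments.
    """
    msg = message_from_string(raw_message)
    urls, attachments = [], []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        filename = part.get_filename()
        payload = part.get_payload(decode=True) or b""
        if filename:
            attachments.append((filename, payload))
        elif part.get_content_maintype() == "text":
            charset = part.get_content_charset() or "utf-8"
            text = payload.decode(charset, errors="replace")
            urls.extend(URL_RE.findall(text))
    return urls, attachments


def is_malicious(scanner_verdicts):
    """Rule from the slide: malicious if at least one scanner is positive.

    `scanner_verdicts` is a hypothetical dict such as
    {"scanner_a": False, "scanner_b": True}.
    """
    return any(scanner_verdicts.values())
```

The one-positive rule favours recall: a single scanner's false positive is enough to label an item malicious, which is worth keeping in mind when reading the detection figures.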

Attachment

URLs
URLs that appear not to work: evidence of a "waterhole" (watering hole) attack.

Waterhole attack - Blackhole Exploit Kit (finding)
[Figure labels: http://comromised.com/../index.com; Malicious website]

Conclusion
- Predicting spam messages that contain malicious content is not possible without systematic analysis of big data, prior knowledge of current threats, and an understanding of likely developments in modus operandi.
- We propose using only the spam email text to predict malicious attachments and URLs (ask for the paper):
  - Novel features to capture text patterns
  - Self-contained (no external resources)
- We show we can predict malicious attachments up to 95.2%, and up to 68.1% for URLs.
- Machine learning and data analytics based on Big Data will improve the discovery of targeted attacks and persistent threats (ask for the 2nd paper).

Thank you and visit http://cybercrime.anu.edu.au