Email Correlation and Phishing



Similar documents
43% Figure 1: Targeted Attack Campaign Diagram

Everyone s online, but not everyone s secure. It s up to you to make sure that your family is.

A number of factors contribute to the diminished regard for security:

A number of factors contribute to the diminished regard for security:

Cloud Security Primer MALICIOUS NETWORK COMMUNICATIONS: WHAT ARE YOU OVERLOOKING?

When attackers have reached this stage, it is not a big issue for them to transfer data out. Spencer Hsieh Trend Micro threat researcher

As threat actors target various types of networks, companies with improperly configured network infrastructures risk the following repercussions:

Learn about each tool in parental controls and find out how you can use them to secure you and your family.

RESEARCHBRIEF. Beyond Online Gaming Cybercrime: Revisiting the Chinese Underground Market

This document has been provided by the International Center for Not-for-Profit Law (ICNL).

Web. Paul Pajares and Max Goncharov. Connection. Edition. ios platform are also at risk, as. numbers via browser-based social.

How Do Threat Actors Move Deeper Into Your Network?

DIGITAL LIFE E-GUIDE. Keeping Your Cloud Data in Check

How To Integrate Hosted Security With Office 365 And Microsoft Mail Flow Security With Microsoft Security (Hes)

Have you ever seen an online ad with a product or brand that you searched just ten minutes ago? That s the result of customized advertising.

From Russia with Love

This guide aims to get you started on decluttering the most important aspects of your digital life.

Getting a new computer or smartphone is always exciting but do you know what to do with your old one?

FastPOS: Quick and Easy Credit Card Theft

AntiSpam. Administrator Guide and Spam Manager Deployment Guide

Latest Business Compromise Malware Found: Olympic Vision

SPEAR PHISHING AN ENTRY POINT FOR APTS

WHITE PAPER SPON. A Cloud-Client Architecture Provides Increased Security at Lower Cost. Published January 2012 SPONSORED BY

Cloud-Client Enterprise Security Impact Report Increased Protection at a Lower Cost

Trend Micro Incorporated Research Paper Adding Android and Mac OS X Malware to the APT Toolbox

Network Detection Evasion Methods

Achieve Deeper Network Security

Eco and Ego Apps in Japan

T E C H N I C A L S A L E S S O L U T I O N

Technology Blueprint. Protect Your . Get strong security despite increasing volumes, threats, and green requirements

Types of cyber-attacks. And how to prevent them

SPAM, VIRUSES AND PHISHING, OH MY! Michael Starks, CISSP, CISA ISSA Fellow 10/08/2015

How To Secure A Website With A Password Protected Login Process (

Achieve Deeper Network Security and Application Control

Stop Spam. Save Time.

Privacy 101. A Brief Guide

Big Data in Action: Behind the Scenes at Symantec with the World s Largest Threat Intelligence Data

WHAT S NEW IN WEBSENSE TRITON RELEASE 7.8

e-shot Unique Deliverability

eprism Security Appliance 6.0 Intercept Anti-Spam Quick Start Guide

Symantec Protection Suite Add-On for Hosted and Web Security

WEBTHREATS. Constantly Evolving Web Threats Require Revolutionary Security. Securing Your Web World

10 Things Every Web Application Firewall Should Provide Share this ebook

How To Prevent Hacker Attacks With Network Behavior Analysis

Quick Reference. Administrator Guide

GFI Product Comparison. GFI MailEssentials vs Barracuda Spam Firewall

How To Ensure Your Is Delivered

Addressing Big Data Security Challenges: The Right Tools for Smart Protection

Streamlining Web and Security

THE SOUTH KOREAN FAKE BANKING APP SCAM

Follow the Data: Analyzing Breaches by Industry

The Mobile Cybercriminal Underground Market in China

Defending Against. Phishing Attacks

BYOD Guidance: Architectural Approaches

SPEAR PHISHING UNDERSTANDING THE THREAT

Intercept Anti-Spam Quick Start Guide

How To Protect Your Cloud From Attack

Anti-Phishing Best Practices for ISPs and Mailbox Providers

Migration Project Plan for Cisco Cloud Security

Boundary Encryption.cloud Deployment Process Overview

Introducing IBM s Advanced Threat Protection Platform

deliverability

defending against advanced persistent threats: strategies for a new era of attacks agility made possible

PRECIS ANTI-SPAM GATEWAY AUTHENTICATION CASE STUDIES

Symantec Cyber Threat Analysis Program Program Overview. Symantec Cyber Threat Analysis Program Team

This document has been provided by the International Center for Not-for-Profit Law (ICNL).

Trend Micro Hosted Security Stop Spam. Save Time.

The Hillstone and Trend Micro Joint Solution

HTTPS Inspection with Cisco CWS

Protect your brand from phishing s by implementing DMARC 1

The Growing Problem of Outbound Spam

BlackShield Authentication Service

Targeted Phishing. Trends and Solutions. The Growth and Payoff of Targeted Phishing

The term Broadway Pet Stores refers we to the owner of the website whose registered office is 6-8 Muswell Hill Broadway, London, N10 3RT.

Enterprise Cybersecurity Best Practices Part Number MAN Revision 006

IBM X-Force 2012 Cyber Security Threat Landscape

How To Sell Security Products To A Network Security Company

AVG AntiVirus. How does this benefit you?

Home Automation and Cybercrime

Malicious Mitigation Strategy Guide

Trend Micro Hosted Security Stop Spam. Save Time.

McAfee Global Threat Intelligence File Reputation Service. Best Practices Guide for McAfee VirusScan Enterprise Software

WildFire. Preparing for Modern Network Attacks

Breaking the Cyber Attack Lifecycle

SECURITY ANALYTICS MOVES TO REAL-TIME PROTECTION

Evaluating DMARC Effectiveness for the Financial Services Industry

Symantec Protection Center Enterprise 3.0. Release Notes

OIS. Update on the anti spam system at CERN. Pawel Grzywaczewski, CERN IT/OIS HEPIX fall 2010

Improving Business Outcomes: Plug in to Security As A Service Adrian Covich

Breach Found. Did It Hurt?

Putting Web Threat Protection and Content Filtering in the Cloud

GFI Product Manual. Administration and Configuration Manual

V1.4. Spambrella Continuity SaaS. August 2

COMBATING SPAM. Best Practices OVERVIEW. White Paper. March 2007

GFI Product Comparison. GFI MailEssentials vs Symantec Mail Security for Microsoft Exchange 7.0

Application Note. Intelligent Application Gateway with SA server using AD password and OTP

GFI Product Comparison. GFI MailEssentials vs. Trend Micro ScanMail Suite for Microsoft Exchange

Network Security Report:

SYMANTEC MANAGED SECURITY SERVICES. Superior information security delivered with exceptional value.

WHITE PAPER. Understanding How File Size Affects Malware Detection

Transcription:

A Trend Micro Research Paper Email Correlation and Phishing How Big Data Analytics Identifies Malicious Messages RungChi Chen

Contents Introduction... 3 Phishing in 2013... 3 The State of Email Authentication... 5 Mail Network Basics... 5 Actual Results... 8 Experiment Results... 9 Possible False Positive Scenario...10 Conclusion...11 References...11 TREND MICRO LEGAL DISCLAIMER The information provided herein is for general information and educational purposes only. It is not intended and should not be construed to constitute legal advice. The information contained herein may not be applicable to all situations and may not reflect the most current situation. Nothing contained herein should be relied on or acted upon without the benefit of legal advice based on the particular facts and circumstances presented and nothing herein should be construed otherwise. Trend Micro reserves the right to modify the contents of this document at any time without prior notice. Translations of any material into other languages are intended solely as a convenience. Translation accuracy is not guaranteed nor implied. If any questions arise related to the accuracy of a translation, please refer to the original language official version of the document. Any discrepancies or differences created in the translation are not binding and have no legal effect for compliance or enforcement purposes. Although Trend Micro uses reasonable efforts to include accurate and up-to-date information herein, Trend Micro makes no warranties or representations of any kind as to its accuracy, currency, or completeness. You agree that access to and use of and reliance on this document and the content thereof is at your own risk. Trend Micro disclaims all warranties of any kind, express or implied. Neither Trend Micro nor any party involved in creating, producing, or delivering this document shall be liable for any consequence, loss, or damage, including direct, indirect, special, consequential, loss of business profits, or special damages, whatsoever arising out of access to, use of, or inability to use, or in connection with the use of this document, or any errors or omissions in the content thereof. Use of this information constitutes acceptance for use in an as is condition. 2

Introduction Phishing is a long-running problem that has taken a turn for the worse. Phishing emails now so closely resemble legitimate ones, making it very difficult both for users and automated systems alike to tell them apart. As such, users end up clicking links embedded in phishing messages that take them to malicious sites, which directly or indirectly steal their personal information. This research paper describes a newly developed Trend Micro methodology that correlates the format of emails with sending agents to detect phishing messages. Using actual examples, we demonstrate how we use big data analytics through the Trend Micro Smart Protection Network infrastructure to proactively identify phishing messages so we can protect our customers from today s more sophisticated email threats. Phishing in 2013 Phishing in 2013 is more advanced and sophisticated than it has ever been. Phishing messages are becoming harder and harder to distinguish from legitimate ones. Figure 1: Sample phishing message recently obtained from LinkedIn 3

Figure 2: Sample phishing message recently obtained from Facebook It is possible that attackers are taking legitimate messages and just replicating them with slight modifications to links. These links lead to malicious sites with exploit packages designed to compromise users systems. Similarities in content make it difficult for security vendors to detect and filter messages based strictly on content. Filtering such messages could result in potential false positives, as legitimate emails could be wrongly classified as spam. Classifying messages based on embedded URLs has also become difficult as their average life span has also become extremely short. 1 New techniques are necessary to detect more carefully prepared malicious messages. The technique discussed in this paper uses big data analytics, which correlates large sets of spam data with one another to determine their sources. 2 1 2 Rod Rasmussen and Greg Aaron. (October 2012). Global Phishing Survey: Trends and Domain Name Use in 1H2012. Last accessed June 20, 2013, http://docs.apwg.org/reports/apwg_globalphishingsurvey_1h2012.pdf. TechTarget. (January 10, 2012). SearchBusinessAnalytics. Big Data Analytics. Last accessed August 5, 2013, http://searchbusinessanalytics.techtarget.com/definition/big-data-analytics. 4

The State of Email Authentication Several protocols, including DomainKeys Identified Mail (DKIM) and Sender Policy Framework (SPF) have been designed and implemented to confirm the senders and integrity of emails. 3 The two mechanisms named above are sender-authentication technologies that assist in controlling spam and improve the deliverability of legitimate messages. DKIM applies public key cryptography to emails. Senders use private keys to sign their messages and publish public keys via Domain Name System (DNS). When recipients get the emails claimed by a specific company, they will retrieve the public keys from the DNS to check if the emails were really sent by the identified company or not. Applying DKIM can help solve the phishing problem. The mechanisms above, however, do not solve all problems related to spam and phishing due to three reasons. First, the global DKIM (35%) and SPF (63%) adoption rates are not high enough. 4 Second, DKIM is insensitive to reply emails, which means some phishing emails can be sent with valid DKIM signatures. 5 Finally, forwarded emails have reasonably high signature failure rates reaching as high as 4%. Mail Network Basics Our methodology is designed to correlate emails with the IP addresses responsible for sending them out. In some ways, it uses the similarities between legitimate and phishing messages in our favor. We developed methods to identify commonly sent emails. For these methods, however, we tried to generate an identified signature for each email. The elements considered for this signature include the domain of the sender s address, the format structure, the content of email body, and whether it was authenticated or not. Emails that have been classified are then correlated with the IP addresses that sent them out. 3 4 5 DKIM.org. DomainKeys Identified Mail (DKIM). Last accessed August 5, 2013, http://www.dkim.org/; Openspf.org. Sender Policy Framework: Project Overview. Last accessed August 5, 2013, http://www.openspf.org/. Lars Eggert. (Last updated May 30, 2012). DKIM Deployment Trends. Last accessed August 5, 2013, http://eggert.org/meter/dkim; Lars Eggert. (Last updated May 30, 2012). SPF Deployment Trends. Last accessed August 5, 2013, http://eggert.org/meter/spf. Douglas Otis. (June 14, 2011). TrendLabs Security Intelligence Blog. Possible Phishing with DKIM. Last accessed August 5, 2013, http://blog.trendmicro.com/trendlabs-security-intelligence/possible-phishing-with-dkim/. 5

Figure 3: IP address-email correlation graph The diagram above presents a relatively simple representation of the correlation that occurs with a very limited number of messages and IP addresses. In any actual situation, this could quickly become complicated. Another issue of interest has to with which IP addresses were used to send out similar messages. Figure 3 is a simplified diagram with email nodes that become links between IP addresses. Figure 4: Simplified version of Figure 3 The distance between any two IP addresses in Figure 4 becomes shorter if they send out more similar emails. Conversely, the fewer similar emails they send out, the further apart they are. 6

Two concepts related to graph theory were used to create the diagram above graph density and modularity. 6 Graph density in the diagram looks at how the nodes are connected to one another. This can be used to determine if a particular portion of the network or the network as a whole is well connected. 7 The closer a graph s density is to 1 the maximum value the better organized and more well connected a particular community of nodes is. 8 Modularity, on the other hand, determines how well a network can be divided into distinct groups. 9 Ideally, the nodes within a group or community should be well connected to one another but not necessarily to nodes in other groups. How do we use these to detect phishing messages? Conceptually, the nodes (i.e., IP addresses) in a well-connected and tight community send out very similar messages. In essence, if an IP address outside this community starts sending out similar messages, these are very likely phishing messages. 6 7 8 9 Wikimedia Foundation, Inc. (Last updated June 7, 2013). Wikipedia. Graph Theory. Last accessed August 5, 2013, http://en.wikipedia.org/wiki/graph_theory. Wikimedia Foundation, Inc. (Last updated June 6, 2013). Wikipedia. Dense Graph. Last accessed August 5, 2013, https://en.wikipedia.org/wiki/dense_graph. Wikimedia Foundation, Inc. (Last updated June 12, 2013). Wikipedia. Community Structure. Last accessed August 5, 2013, http://en.wikipedia.org/wiki/community_structure. Wikimedia Foundation, Inc. (Last updated May 27, 2013). Wikipedia. Modularity (Networks). Last accessed August 5, 2013, http://en.wikipedia.org/wiki/modularity_(networks). 7

Actual Results To demonstrate how effective this technique is, we took a look at some actual data Trend Micro gathered. Figure 5: Part of the graph generated from Trend Micro real-world data This technique, however, has various capabilities and limitations. For instance, it makes it easy to spot which organizational networks send out large numbers of email with very specific formats. We were able to identify four distinct subcommunities that sent out emails with very specific appearances. To truly maximize the usefulness of such data, it should be divided into as many distinct, tight communities as possible. The end result would be a far larger version of the graph below. 8

Figure 6: Simplified version of Figure 5, divided into distinct communities Experiment Results To gauge how effective the technique is, we chose to analyze messages from two companies LinkedIn and QQ to see if all of these actually originated from them. First, we identified more than 7.8 million emails that matched LinkedIn s signature. Approximately 99% of these were sent by IP addresses associated with LinkedIn. The remaining 1%, meanwhile, may be phishing messages. The figures were worse for QQ. Among the almost half a million messages we analyzed, more than two-thirds did not originate from QQ. 9

Figure 7: Percentages of identified phishing emails from LinkedIn and QQ; incorrect passes correspond to suspicious emails while correct passes correspond to legitimate messages This newly developed technique makes it possible for us to quickly identify phishing messages, particularly more sophisticated ones, which very closely resemble legitimate emails. It allows us to turn the tables on spammers, as their attempts to closely mirror legitimate messages only make their malicious mails stand out. It is a powerful technique that is exceptionally useful against today s more sophisticated attacks. Possible False Positive Scenario If the mechanism proposed in this paper is adopted, false positives may arise when highvolume sites start using a new message transfer agent (MTA). An MTA, also known as a mail transfer agent, refers to software that transfers emails from one computer to another using a client a server application architecture. 10 Security researchers and analyze need some time to observe how the new MTA behaves. Only after receiving enough emails from the new MTA can it be added to the existing community associated with a particular highvolume site. 10 Wikimedia Foundation, Inc. (Last updated July 17, 2013). Wikipedia. Message Transfer Agent. Last accessed August 5, 2013 https://en.wikipedia.org/wiki/message_transfer_agent. 10

Two methods can help reduce the number of false-positive detections. The first relies on the relationship between the email signature and the base domain instead of the original mapping between the email signature and IP address. The base domain of the new MTA usually stays the same as that of an already-existing MTA. Sandboxing technology can also be used. This ensures that any suspicious email already detected by the mechanism is analyzed in a sandbox to detect malware or malicious or deceptive content, regardless of MTA. Conclusion Using big data analytics to detect phishing messages was developed in response to the increased threat posed by malicious emails that closely resemble legitimate ones. This methodology not only helps detect phishing messages but also makes it easier for security companies like Trend Micro to detect such phishing messages even if they more closely mimic legitimate ones. This would, however, not be possible without knowledge of big data and previous knowledge of current threats. Trend Micro already uses big data analytics, which makes it possible for us to better protect our customers from various email attacks. References DKIM.org. DomainKeys Identified Mail (DKIM). Last accessed August 5, 2013, http://www.dkim.org/; Openspf.org. Sender Policy Framework: Project Overview. Last accessed August 5, 2013, http://www.openspf.org/. Douglas Otis. (June 14, 2011). TrendLabs Security Intelligence Blog. Possible Phishing with DKIM. Last accessed August 5, 2013, http://blog.trendmicro.com/trendlabssecurity-intelligence/possible-phishing-with-dkim/. Lars Eggert. (Last updated May 30, 2012). DKIM Deployment Trends. Last accessed August 5, 2013, http://eggert.org/meter/dkim; Lars Eggert. (Last updated May 30, 2012). SPF Deployment Trends. Last accessed August 5, 2013, http://eggert.org/meter/spf. Rod Rasmussen and Greg Aaron. (October 2012). Global Phishing Survey: Trends and Domain Name Use in 1H2012. Last accessed June 20, 2013, http://docs.apwg.org/reports/apwg_globalphishingsurvey_1h2012.pdf. 11

TechTarget. (January 10, 2012). SearchBusinessAnalytics. Big Data Analytics. Last accessed August 5, 2013, http://searchbusinessanalytics.techtarget.com/definition/big-data-analytics. Wikimedia Foundation, Inc. (Last updated May 27, 2013). Wikipedia. Modularity (Networks). Last accessed August 5, 2013, http://en.wikipedia.org/wiki/modularity_(networks). Wikimedia Foundation, Inc. (Last updated June 6, 2013). Wikipedia. Dense Graph. Last accessed August 5, 2013, https://en.wikipedia.org/wiki/dense_graph. Wikimedia Foundation, Inc. (Last updated June 7, 2013). Wikipedia. Graph Theory. Last accessed August 5, 2013, http://en.wikipedia.org/wiki/graph_theory. Wikimedia Foundation, Inc. (Last updated June 12, 2013). Wikipedia. Community Structure. Last accessed August 5, 2013, http://en.wikipedia.org/wiki/community_structure. Wikimedia Foundation, Inc. (Last updated July 17, 2013). Wikipedia. Message Transfer Agent. Last accessed August 5, 2013 https://en.wikipedia.org/wiki/message_transfer_agent. 12

Trend Micro Incorporated, a global leader in security software, strives to make the world safe for exchanging digital information. Our innovative solutions for consumers, businesses and governments provide layered content security to protect information on mobile devices, endpoints, gateways, servers and the cloud. All of our solutions are powered by cloud-based global threat intelligence, the Trend Micro Smart Protection Network, and are supported by over 1,200 threat experts around the globe. For more information, visit www.trendmicro.com. 2013 by Trend Micro, Incorporated. All rights reserved. Trend Micro and the Trend Micro t-ball logo are trademarks or registered trademarks of Trend Micro, Incorporated. All other product or company names may be trademarks or registered trademarks of their owners. 10101 N. De Anza Blvd. Cupertino, CA 95014 U.S. toll free: 1 +800.228.5651 Phone: 1 +408.257.1500 Fax: 1 +408.257.2003