CAS : A FRAMEWORK OF ONLINE DETECTING ADVANCE MALWARE FAMILIES FOR CLOUD-BASED SECURITY ABHILASH SREERAMANENI DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING SEOUL NATIONAL UNIVERSITY OF SCIENCE AND TECHNOLOGY 2014
CONTENTS
i. MAIN IDEA
ii. INTRODUCTION
iii. CLOUD BASED SECURITY SERVICE
iv. CAS : THREAT INTELLIGENCE AS A SERVICE
v. CONCLUSION
vi. REFERENCES
5/26/2014
MAIN IDEA Preventing networks from being attacked has become a critical issue for network administrators and researchers. With the popularity and variety of large-scale zero-day threats on the Internet, security companies have to keep inserting new virus signatures into their databases. However, the growing size of the virus signature file slows computers to a crawl during a virus scan. To handle the scale and magnitude of new malware variants effectively, antivirus functionality is being moved from the user desktop into the cloud. The large volume of advanced malware has created a need for an automatic framework that can discover inter-family correlations for online detection. In this paper, we propose a fast and efficient technique to extract correlation signatures from advanced malware families for cloud-based security systems. At the core of our work is CAS, a framework for large-scale, cross-family malware analysis. CAS uses a novel method for Advanced Persistent Threat (APT) correlation. Our large-scale testing shows that CAS can detect millions of malware samples efficiently with malware correlation signatures at inline speed. This advanced malware includes packers, PE malware, mobile malware, scripts, and non-PE malware.
INTRODUCTION (1/3) Over the past years, the size of the Internet has increased dramatically and more network facilities have been connected to it, so protecting networks from attacks has become a critical issue. The past few years have witnessed a significant increase in the number of malware threats. Today's anti-virus (AV) industry devotes much effort to combating Advanced Persistent Threats (APTs), also known as advanced malware. Security engineers face the serious problem of defeating both the complexity and the quantity of advanced malware.
INTRODUCTION (2/3) Hackers are launching unknown APT malware, which most AV software can't detect. Security researchers face a great challenge in overcoming advanced malware's complexity. Behavior-based detection approaches have been used to detect malware in sandboxes such as CWSandbox or in virtual images. However, these approaches have slow scan speeds and some interface issues, so they cannot be used on next-generation high-speed network devices. To handle the scale and magnitude of new malware variants effectively, anti-virus functionality is being moved from the user desktop into the cloud.
INTRODUCTION (3/3) For a suspicious file, the AV desktop agent fetches the fingerprint or calculates the hash value of the file and sends it to the remote cloud server, which compares that fingerprint or value against the continuously updated signature database on the Internet. If the value exists in the database, the client is asked which specific action the user wants the desktop agent to take on the infected file. As the core of threat intelligence as-a-service, CAS can support a very broad range of malware types, from PE and non-PE formats to scripts and even mobile threats.
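The agent-to-cloud lookup described on this slide can be sketched as follows. This is a minimal illustration, not the deck's implementation: the class `CloudSignatureService`, the function `agent_check`, the choice of SHA-256, and the in-memory dictionary standing in for the remote database are all assumptions made for the example.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Hash value the desktop agent would send instead of the file itself."""
    return hashlib.sha256(data).hexdigest()


class CloudSignatureService:
    """Hypothetical stand-in for the remote, continuously updated database."""

    def __init__(self):
        self._known = {}  # digest -> malware family label

    def add_signature(self, digest: str, family: str) -> None:
        self._known[digest] = family

    def lookup(self, digest: str):
        # Returns the family label, or None when the digest is unknown.
        return self._known.get(digest)


def agent_check(data: bytes, service: CloudSignatureService) -> dict:
    """Desktop-agent side: hash locally, ask the cloud, report the verdict."""
    digest = fingerprint(data)
    family = service.lookup(digest)
    return {"digest": digest, "infected": family is not None, "family": family}
```

When `infected` is true, the agent would then ask the user which action to take on the file; that interaction is left out of the sketch.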
CLOUD BASED SECURITY SERVICE New malware variants challenge the traditional AV protection model, which demands frequent signature updates, large signature databases, and resource-guzzling security products. As the next-generation security infrastructure, the AV in-the-cloud service moves virus-scanning functionality from the desktop to the Internet. A. TRADITIONAL AV SOLUTIONS Most malware consists of executable files that operating systems can understand and execute, e.g., files in the Portable Executable (PE) format. For any suspicious file, a traditional AV scanner deployed on the desktop searches for the file's signature or hash value in the signature database.
CLOUD BASED SECURITY SERVICE A traditional signature database usually relies on prior knowledge of malware signatures, which are generated by security engineers. Such a database detects known malware efficiently, but it often cannot detect unknown viruses and polymorphic variants. Polymorphic malware can mutate its signature via unpredictable compression or encryption transformations and easily bypass AV scanners. Generating signatures for zero-day threats becomes a tedious, reactive security function. Security vendors face great challenges in overcoming the complexity of malware, and fighting the malware backlog is nothing new.
CLOUD BASED SECURITY SERVICE B. AV CLOUD INFRASTRUCTURE An on-access scanner is deployed on the desktop. It automatically examines the local machine's memory and file system whenever these resources are accessed by an application. By distributing a set of trusted anonymous hops, the infrastructure offers a location-hidden service without revealing the cloud server's network identity. The cloud agent is a lightweight hybrid desktop solution to the problem of resource-intensive AV. It acts like a file filter, inspecting suspicious file loading and storing activities. The agent collects hash values or fingerprints of suspicious files from users. These users can be either individually distributed or locally networked.
CLOUD BASED SECURITY SERVICE [Figure: Cloud-based Anti-virus Service]
CLOUD BASED SECURITY SERVICE Nowadays, to evade malicious content detection, virus writers use binary tools to obfuscate code and bypass security products. It is vital for AV products to deploy an emulator to inspect hidden payloads. An emulator includes programs that execute or emulate suspicious encrypted executables until they are fully decrypted in memory. There are two ways to deploy the emulation functionality: the emulator can be embedded inside the desktop agent, or deployed in the cloud. An agent without the emulator relieves users of the resource cost of desktop virus scanning and sends the full obfuscated samples to the cloud servers. However, this consumes bandwidth, so it is not suitable for customers with bandwidth limitations.
CLOUD BASED SECURITY SERVICE Embedding the emulator into the desktop allows the agent to inspect the hidden payloads of obfuscated programs. Bandwidth is saved because the hash value of the dumped data, rather than the file itself, is sent to the cloud. Cloud-based security solutions also face several challenges in defending against advanced malware, which this paper attempts to address with threat intelligence as-a-service:
1. Increasing speed of APTs.
2. Traditional AV is largely useless against APTs.
3. Stream-based AV is one of the latest techniques used by network-based products for scanning.
4. Non-PE formats (e.g., PDF) fall outside the domain of traditional signature-based AV engines.
CAS : THREAT INTELLIGENCE AS A SERVICE As Advanced Persistent Threats become more targeted, traditional malware detection is no longer sufficient to cope with advanced malware's obfuscation techniques, and a new breed of defense strategy is required. Threat intelligence as-a-service allows users to protect against APTs via an automaton that analyzes advanced malware's malicious content. In this paper, we describe CAS, a framework to detect advanced malware. The system combines the prior knowledge of known viruses held in traditional AV signature databases with the ability of threat intelligence to detect new, unknown advanced malware variants. CAS delivers accurate detection of APTs, reducing the impact of zero-day malware by providing early detection and near-real-time alerts for monitored systems.
CAS : THREAT INTELLIGENCE AS A SERVICE [Figure: Intelligence as-a-service]
CAS : THREAT INTELLIGENCE AS A SERVICE A. Framework Stream-based antivirus is one of the latest approaches used by network vendors for high-speed next-generation gateway products deployed in the cloud. The growing size of virus signature databases consumes large amounts of memory and other resources, which makes on-the-fly malware scanning on networking devices very difficult to implement. To keep the workload balanced during online scanning, the industry requires a lightweight signature database.
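One way to picture a lightweight database is a single signature that matches several variants of a family instead of one signature per sample. The slides do not describe how CAS derives its correlation signatures, so the sketch below is only an illustrative stand-in, not the CAS method: it collects byte n-grams shared by every sample in a group. The function name `shared_ngrams` and the default n-gram length are assumptions.

```python
def shared_ngrams(samples, n=8):
    """Return the set of byte n-grams that occur in every sample.

    Illustrative only: candidates for a signature that covers a whole
    group of related samples rather than a single file.
    """
    common = None
    for sample in samples:
        grams = {sample[i:i + n] for i in range(len(sample) - n + 1)}
        common = grams if common is None else common & grams
    return common or set()
```

A real system would also have to discard n-grams that appear in benign files; that filtering step is not shown.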
CAS : THREAT INTELLIGENCE AS A SERVICE [Figure: CAS framework]
CAS : THREAT INTELLIGENCE AS A SERVICE B. Malware types supported To support heterogeneous malware types, the intelligent parser in CAS recognizes the file type of the input malware. CAS currently supports PE files, packers, and non-PE files (such as PDF, images, Microsoft Office documents, and web scripts), and even mobile malware. The Portable Executable (PE) format is the most popular format for executables, libraries, and drivers on Windows. A PE file comprises various sections and headers that describe the section data, import table, export table, resources, and so on. A PE file starts with the DOS executable header, followed by the PE header, which begins with the signature bytes "PE".
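File-type recognition of the kind this slide attributes to the intelligent parser is commonly done by checking the leading "magic" bytes of a file. The following is a minimal sketch under that assumption; the slides do not say how the CAS parser works internally, and the table `MAGIC` and function `sniff_type` are hypothetical names.

```python
# Well-known leading byte sequences for a few formats named on the slide.
MAGIC = [
    (b"MZ", "pe"),                                   # DOS/PE executable
    (b"%PDF-", "pdf"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole2"),   # legacy Office documents
    (b"PK\x03\x04", "zip"),                          # also OOXML Office files and APKs
]


def sniff_type(data: bytes) -> str:
    """Return a coarse file-type label based on the first bytes of the file."""
    for magic, label in MAGIC:
        if data.startswith(magic):
            return label
    return "unknown"
```

A dispatcher could then route each label to a format-specific parser; text-based scripts have no magic bytes and would need a different heuristic.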
CAS : THREAT INTELLIGENCE AS A SERVICE The PE header also includes some general file properties, such as the number of sections, machine type, and time stamp. The optional header is followed by the section table headers, which give each section's raw size, virtual size, and name. At the end of the PE file is the section data, which contains the file's original entry point (OEP), where file execution begins. To search a PE file for malware, a scanner typically scans the segments for known signatures at certain offsets from the OEP. Packers work on PE executable files and dynamic link libraries (DLLs).
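The header fields named on this slide can be read with a few fixed-offset lookups. The sketch below uses only Python's `struct` module and the documented PE/COFF layout; the function name `parse_pe_header` and the shape of the returned dictionary are choices made for this example, and error handling for truncated files is deliberately minimal.

```python
import struct


def parse_pe_header(data: bytes) -> dict:
    """Extract machine type, time stamp, entry point, and section table."""
    if data[:2] != b"MZ":
        raise ValueError("missing DOS 'MZ' header")
    # e_lfanew at offset 0x3C of the DOS header points to the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    coff = pe_offset + 4
    machine, n_sections, timestamp, _, _, opt_size, _ = struct.unpack_from(
        "<HHIIIHH", data, coff)
    opt = coff + 20  # the COFF file header is 20 bytes long
    entry_rva = None
    if opt_size >= 20:
        # AddressOfEntryPoint sits 16 bytes into the optional header.
        (entry_rva,) = struct.unpack_from("<I", data, opt + 16)
    sections = []
    table = opt + opt_size  # section table follows the optional header
    for i in range(n_sections):
        name, vsize, vaddr, raw_size, raw_ptr = struct.unpack_from(
            "<8sIIII", data, table + 40 * i)
        sections.append({
            "name": name.rstrip(b"\x00").decode("ascii", "replace"),
            "virtual_size": vsize,
            "virtual_address": vaddr,
            "raw_size": raw_size,
            "raw_pointer": raw_ptr,
        })
    return {"machine": machine, "timestamp": timestamp,
            "entry_point_rva": entry_rva, "sections": sections}
```

The entry point is a relative virtual address; mapping it to a file offset requires the `virtual_address` and `raw_pointer` of the section that contains it.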
CAS : THREAT INTELLIGENCE AS A SERVICE [Figure: PE format overview]
CAS : THREAT INTELLIGENCE AS A SERVICE [Figure: PE-header-based detection approach overview]
CAS : THREAT INTELLIGENCE AS A SERVICE To perform packing, a packer first parses the PE internal structures. It then reorganizes the PE headers, sections, import tables, and export tables into new structures and attaches a code segment that the malware will invoke before the OEP. This code is called the stub; it decompresses the original data and locates the OEP. During packing, the packer compresses and encrypts the code and resource sections using compression and encryption libraries. With randomization, the packer can also generate a different variant of a single file every time the file is packed. In some powerful packers, the polymorphism engine also adds a protection layer against reverse engineering (RE) and debugging.
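Why randomized packing defeats hash and byte signatures can be seen with a toy model. The sketch below is not a real packer: it does not touch PE structures or attach a native stub, and a one-byte XOR key stands in for the encryption step. The functions `pack` and `unpack` are hypothetical; `unpack` plays the role of the stub.

```python
import os
import zlib


def pack(payload: bytes, key=None) -> bytes:
    """Compress, then XOR with a one-byte key stored in front of the body."""
    if key is None:
        key = os.urandom(1)[0] | 1  # random, guaranteed non-zero key
    compressed = zlib.compress(payload)
    return bytes([key]) + bytes(b ^ key for b in compressed)


def unpack(packed: bytes) -> bytes:
    """Stub equivalent: undo the XOR, then decompress the original data."""
    key, body = packed[0], packed[1:]
    return zlib.decompress(bytes(b ^ key for b in body))
```

Packing the same payload with two different keys yields two different byte strings, so their hashes differ, while both unpack to identical content. This is the reason an emulator or unpacker has to run before signatures are applied.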
CAS : THREAT INTELLIGENCE AS A SERVICE Mobile malware has now reached a new level of maturity. Threats targeting smartphones and tablets are beginning to pose meaningful challenges to clients. In 2011, mobile malware increased by almost 200% across all mobile platforms. Because it is based on a generic framework, CAS can also analyze mobile malware and detect mobile malware families on different OS platforms, such as Symbian, Android, and BlackBerry. The figure shows the internal SIS format of Symbian malware. SIS stands for Software Installation Script, the archive format for Symbian OS.
CAS : THREAT INTELLIGENCE AS A SERVICE [Figure: Mobile Symbian file format]
CAS : THREAT INTELLIGENCE AS A SERVICE C. Stream-based on-the-fly scanning To be effective, high-speed (10 Gigabit) networking devices have to scan on the fly against increasingly complicated new malware. To keep the workload balanced, such devices require a lightweight signature database about half the size of a traditional AV database. To generate a lightweight malware signature database and handle the large quantity of new unknown samples, it is important to develop intelligent threat response systems that support automatic and generic signature generation. CAS uses a novel method for million-scale malware correlation and detects millions of samples using malware correlation signatures.
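The essential difficulty of on-the-fly scanning is that data arrives in chunks and a signature may straddle a chunk boundary. A minimal sketch of one common way to handle this, assuming plain byte-string signatures and a naive search (a production gateway would use a multi-pattern automaton instead), is shown below. The function `scan_stream` and the signature dictionary format are assumptions for the example.

```python
def scan_stream(chunks, signatures):
    """Yield (absolute_offset, name) for each signature hit in a chunked stream.

    `signatures` maps a name to a non-empty byte pattern. Only the last
    (longest pattern - 1) bytes are kept between chunks, so memory use
    does not grow with the stream length.
    """
    if not signatures:
        return
    keep_len = max(len(p) for p in signatures.values()) - 1
    tail = b""   # bytes carried over from the previous chunk
    base = 0     # absolute stream offset of window[0]
    for chunk in chunks:
        window = tail + chunk
        hits = []
        for name, pattern in signatures.items():
            i = window.find(pattern)
            while i != -1:
                # Skip matches lying entirely in the carried-over tail:
                # they were already reported for the previous chunk.
                if i + len(pattern) > len(tail):
                    hits.append((base + i, name))
                i = window.find(pattern, i + 1)
        yield from sorted(hits)
        tail = window[-keep_len:] if keep_len > 0 else b""
        base += len(window) - len(tail)
```

For example, `list(scan_stream([b"aaEV", b"ILbbX", b"Ycc"], {"sigA": b"EVIL", "sigB": b"XY"}))` reports both patterns even though each one is split across two chunks.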
CAS : THREAT INTELLIGENCE AS A SERVICE [Figure: Stream-based antivirus]
CAS : THREAT INTELLIGENCE AS A SERVICE [Figure: Generic framework]
CAS : THREAT INTELLIGENCE AS A SERVICE Non-PE malware, also known as embedded malware, hides malicious code inside a benign file, such as a JPG, GIF, or PDF file. Because the payloads are self-encoded, embedded malware is very difficult to detect. CAS uses non-PE parsers to find the hidden malicious payloads and applies signatures to detect the malware. [Figure: Malicious code hidden in JPG format]
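As a simplified picture of what a non-PE parser looks for, the sketch below searches any byte buffer for an embedded PE image by checking each "MZ" marker against the PE signature that its header points to. It only finds payloads stored in the clear; the self-encoded payloads mentioned on the slide would first need decoding, which is not shown. The function name `find_embedded_pe` is hypothetical and this is not presented as the CAS parser.

```python
import struct


def find_embedded_pe(data: bytes) -> list:
    """Return offsets at which a plausible PE image starts inside `data`."""
    offsets = []
    start = data.find(b"MZ")
    while start != -1:
        # A DOS header is at least 0x40 bytes; e_lfanew lives at +0x3C.
        if start + 0x40 <= len(data):
            (e_lfanew,) = struct.unpack_from("<I", data, start + 0x3C)
            pe = start + e_lfanew
            if data[pe:pe + 4] == b"PE\x00\x00":
                offsets.append(start)
        start = data.find(b"MZ", start + 1)
    return offsets
```

A hit inside a file whose type is an image or document is a strong hint that an executable has been hidden in it, and the bytes from that offset onward can be handed to the PE scanner.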
CONCLUSION As more incoming malware samples become available, a powerful large-scale threat response system is required to support proactive detection and protection. This paper introduces CAS to identify features shared across malware families that are written in similar ways. This could lead to quick identification of zero-day malware as well as fingerprinting of these features. We are still in the early stages, and several major issues in protecting AV cloud services remain to be addressed.
REFERENCES
[1] W. Yan, Z. Zhang, and N. Ansari, Revealing Packed Malware, IEEE Security and Privacy, vol. 6, no. 5, pp. 65-69, Sep/Oct, 200
[2] L. Xie, X. Zhang, J.-P. Seifert, and S. Zhu, pBMDS: A Behavior-based Malware Detection System for Cellphone Devices, WISEC 2010, pp. 37-48.
[3] A. Lanzi, D. Balzarotti, C. Kruegel, M. Christodorescu, and E. Kirda, AccessMiner: Using System-Centric Models for Malware Protection, in Proceedings of the 17th ACM Conference on Computer and Communications Security (CCS), October 2010, Chicago, Illinois, USA.
[4] M. Pietrek, Peering Inside the PE: A Tour of the Win32 Portable Executable File Format, Microsoft Systems J., Mar. 1994, pp. 15-34.
[5] A. Pranata, Symbian Executable File Format.
Q & A
THANK YOU