WHITE PAPER Big Data Analytics How Big Data Fights Back Against APTs and Malware
Table of Contents
Introduction
The Importance of Machine Learning to Big Data
Addressing the Long-Tail Nature of Internet Data
Ensuring Big Data Analytics Doesn't Slow Down the Network
Introduction

As the malware threat landscape grows more complex and dangerous, talk of Big Data is growing with it. Yet what is less clear, particularly to non-technical audiences, is how Big Data analytics works and why it is an indispensable tool for countering these threats. This paper explains both in a simple and memorable way.

Imagine that you are the successful owner of a large chain of clothing stores. One day, you notice that, despite an array of security measures such as security tags, cameras, and personnel, your stock of shirts is starting to disappear. Furthermore, the inventory is vanishing from a number of stores, at different times, and on different days.

How can you catch the culprit? Easy: you can have your security experts watch every single minute of video, identify the individual who visited the shirt department of every affected store, and then pass that information on to the authorities. In this way, you not only help apprehend the criminal, but also see clearly where your security must be stronger to prevent another such theft.

Now, think of your IT network. It, too, is protected by security measures, such as anti-virus software and firewalls. However, despite your efforts to stay safe, imagine that an adversary is penetrating your security and stealing valuable information: everything from proprietary data to employee passwords, confidential correspondence, and more.

What can you do? In the past, your options were severely limited by the complexity of even a modest IT infrastructure. Keeping an eye on every door of your network was neither cost-effective nor practical. Now, however, you can identify malicious activity by applying massive amounts of processing power to analyze the history of communications entering and leaving your organization. The name for this approach is Big Data analytics, and it is a core part of Seculert's platform.
The Importance of Machine Learning to Big Data

The concept of machine learning is certainly not new. It is a well-developed field that has been the subject of a great deal of academic research. So why is it suddenly so significant now? Again, the answer is Big Data analytics.

Machine learning algorithms can do many useful things, provided that they have good training sets. For example, we can teach an artificial brain to recognize handwriting by showing it many handwriting samples paired with the corresponding text. Domain expertise, multiplied by good data sets, is the key to applying machine learning successfully.

On the other hand, many Big Data sets are virtually useless without machine learning. Consider billions of web pages that contain information about your competitors, your target markets, and traces of security threats. Without a robot capable of rapidly reading all the data, you might as well delete everything and save the storage space: the information will be out of date before you have checked even 0.01% of it.

Ultimately, this illustrates that an electronic brain (a.k.a. machine learning) is the only way to get value from Big Data analytics, and that we humans need to decide exactly what task we want these brains to accomplish.

Addressing the Long-Tail Nature of Internet Data

Machine learning works very well on clean data sets where we need to differentiate between a few classes. The situation is quite different, however, when we must recognize one class among thousands. For example, which is easier: deciding whether you are looking at a mammal or an insect, or determining exactly which species of mammal or insect it is?

This is analogous to the challenge posed by the long-tailed nature of Internet data, which is highly fragmented. Indeed, there are millions of sites and applications with thousands of different behaviors, and most of them are completely legitimate, with only a small fraction actually malicious.

To address this, Seculert classifies and analyzes web traffic behavior to identify properties that correlate with malicious activity. By combining several such properties, Seculert applies machine learning algorithms to differentiate between malicious and non-malicious traffic in real time.

For example, we know that malware periodically connects to its Command and Control server in order to receive instructions. We also know that malware-related traffic tends to be more periodic than traffic from a random website. Based on these facts, Seculert applies statistical techniques to detect and measure the periodicity of such traffic. While this is not enough to conclusively determine the existence of malware, it is nevertheless a crucial clue. From there, Seculert draws on substantial research and machine learning algorithms to extract more information and, ultimately, to separate malware-related activity from normal activity.

Ensuring Big Data Analytics Doesn't Slow Down the Network

As you can imagine, big organizations have a lot of traffic passing in and out of their networks. To give you an idea, imagine how much Internet traffic there is in a small town: that is about the same amount of traffic as in a large organization. And while there are many tools capable of applying machine learning algorithms to small data sets that fit in a computer's RAM, huge data sets require an immense amount of CPU power to process.

Seculert solves this problem in two critical ways. First, its data processing infrastructure backbone uses Hadoop to rebuild sessions, parallelize computations, and tolerate failures. Second, it automatically performs data processing in the cloud; as a result, it is cost-effective and places no drain on local network resources.
This is especially important for organizations that cannot anticipate how quickly they will grow, or how much data they will need to process even a month into the future. Ultimately, the combination of Hadoop and cloud technologies enables Seculert to focus on in-depth analysis instead of searching for workarounds to local network limitations.
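
To make the periodicity clue from the long-tail discussion more concrete, here is a minimal Python sketch. It is an illustration only and does not represent Seculert's actual method. It assumes a hypothetical log layout of (host, destination, timestamp in seconds) and uses the coefficient of variation of the gaps between connections as one simple way to measure regularity. The function names (group_by_pair, periodicity_score, score_traffic) and the sample hosts are invented for this example. The group-then-score structure loosely mirrors the kind of work a Hadoop job would parallelize, but here it runs as plain single-process Python.

```python
from collections import defaultdict
from statistics import mean, pstdev


def group_by_pair(records):
    """Collect connection timestamps per (host, destination) pair.

    Assumed record layout (hypothetical): (host, destination, timestamp_seconds).
    """
    groups = defaultdict(list)
    for host, destination, timestamp in records:
        groups[(host, destination)].append(timestamp)
    return groups


def periodicity_score(timestamps, min_events=4):
    """Return a score in (0, 1]; higher means more regular connection gaps.

    Uses the coefficient of variation (standard deviation / mean) of the gaps
    between consecutive connections. Returns None when there are too few
    events, or when all events share one timestamp.
    """
    if len(timestamps) < min_events:
        return None
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average_gap = mean(gaps)
    if average_gap == 0:
        return None
    variation = pstdev(gaps) / average_gap
    return 1.0 / (1.0 + variation)


def score_traffic(records):
    """Score every (host, destination) pair found in the records."""
    return {
        pair: periodicity_score(timestamps)
        for pair, timestamps in group_by_pair(records).items()
    }


# Invented sample data: one host contacts a server every 300 seconds,
# another browses a site at irregular times.
records = [("10.0.0.5", "c2.example.net", t) for t in (0, 300, 600, 900, 1200)]
records += [("10.0.0.7", "news.example.com", t) for t in (0, 5, 7, 400, 2000)]

scores = score_traffic(records)
for pair, score in scores.items():
    print(pair, score)
```

A high score marks a pair only as a candidate for further analysis. Legitimate software such as update checkers also connects on a schedule, which is why the paper treats periodicity as a clue rather than as proof of malware.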
Cloud-based, Automated Breach Detection

Seculert fills the gaps in existing advanced threat defenses by focusing on the blind spots found in breach prevention systems. In an era when infection is inevitable and adequate resources to find and remediate threats are limited, the Seculert Platform identifies new threats with unprecedented speed and precision. Leveraging its Big Data analytics as a service, botnet interception, and elastic sandbox functionality, Seculert provides superior detection while driving down the cost and time it takes to remediate.

For more information on Seculert, visit www.seculert.com.

Contact Us
Toll Free (US/Canada): +1-855-732-8537
Tel (UK): +44-203-355-6444
Tel (other): +972-3-919-3366
Email: info@seculet.com
www.seculert.com

COPYRIGHT SECULERT 2014