The Lookout Security Platform: Advanced Mobile Threat Protection Through Predictive Cybersecurity
Table of Contents
I The Road to Predictive Security
  a. Cyberattack Economics
  b. Signature and Behavioral Analysis Limitations
  c. Toward Predictive Security
II The Lookout Security Platform
III App Analysis Architecture
  a. Acquisition
  b. Enrichment
  c. Analysis
  d. Protection
IV Device Analysis Architecture
V Predictive Security in Action
  a. FireTalk
  b. BadNews
VI Conclusion
lookout.com
I. The Road to Predictive Security

Cyberattack Economics

Given the recent spate of cyberattacks, one might conclude these attacks are the unavoidable consequence of living in a highly digital, connected world. At Lookout, however, we reject this notion. We believe that these events reflect a fundamental imbalance in the economics of cyberattacks that currently favors attackers. The path toward a better future lies in disrupting this asymmetry by dramatically raising the cost of attacks through better predictive security.

Currently, it takes enormous effort to reverse engineer and remediate a cyberattack and only minimal effort for attackers to modify their code and infrastructure to successfully evade detection. A 2014 study found that the average cyberattack costs organizations $12.7 million. [1] While attacker costs are difficult to quantify, it's clear that attackers invest a pittance compared to the billions of dollars spent on digital security and the countless hours organizations spend investigating and remediating breaches.

What explains this relatively low cost of attack? An industry overreliance on signatures and behavioral analysis detection models has much to do with the problem. While both security approaches remain important to a multi-layered security defense, recent cyberattacks have exposed their limitations and the ease with which skilled attackers can evade these defense mechanisms.

Limitations of Signatures & Behavioral Analysis

Gartner estimates that organizations globally spent $71.1 billion on information security in 2014 [2], and a significant portion of that spend goes toward threat detection technologies. Today, most threat detection systems rely on signatures and/or virtualized behavioral analyses, and both approaches have notable blind spots and limitations.

SIGNATURES CONS
Can't scale; overly reliant on humans
Brittle and easily evadable

Signatures can effectively block simplistic, unchanging attacks, but can't scale with the pace of malicious software development and routinely miss advanced attacks. Typically, security researchers spend hours dissecting new malicious code to understand its identifying characteristics and then create signatures to flag these characteristics in future threats. Unfortunately, humans can't scale at the rate of software development, and the increasing sophistication and volume of malware mean signature-based models will increasingly miss advanced threats. In 2014 Lookout observed an overall increase in threat sophistication, including evidence that attackers may have compromised mobile supply chains and pre-loaded malware on some factory-shipped devices. [3]

[1] 2014 Cost of Cyber Crime Study: United States. The Ponemon Institute. Oct. 2014.
[2] Gartner Says Worldwide Information Security Spending Will Grow Almost 8 Percent in 2014 as Organizations Become More Threat-Aware. Gartner. Aug. 22, 2014.
[3] 2014 Mobile Threat Report. Lookout. Jan. 2014.
Additionally, because of their code-level specificity and dependence on 1:1 matches, signatures are fairly easy for attackers to break. Small modifications to malicious code will alter a signature pattern or cryptographic hash, rendering it useless. Consider the ease with which an attacker can break the following sequence-based signatures:

Table 1: Example of Signature Limitations

              SIGNATURE                                                                  POST ATTACKER MODIFICATION
Status        Effective                                                                  Broken
Signature 1   \x00>apkfile and apkfile1 Already rooted or already have ==> return\x00    \x00>apkfile and apkfile1 Already rooted or already have _ ==> return\x00
Signature 2   \x00\x00\x00androidrtservice.apk\x00                                       \x00\x00\x00androidrtsxervice.apk\x00

With the simple addition of the character x and a space, literally two keystrokes, an attacker can recycle their code and evade these signatures that may have resulted from hours of human research and code analysis. Of course, knowing which specific code sequences to change can prove challenging, but attackers can automate this evasion process with code obfuscation algorithms that reorder, rename, and/or insert garbage (filler) sequences to throw off signatures, and can also leverage tools to automatically test their evasive code against existing signatures.

One recent cyberattack in particular illustrated the limitations of signature-based detection models. When the New York Times computer systems came under attack from hackers reportedly from China, subsequent investigation revealed that among the 45 instances of custom malware installed by the attacker, the Times' signature-based detection technology caught and quarantined only one. [4]

BEHAVIORAL ANALYSIS CONS
Lacks context; false positive prone
Misses advanced, latent threats

Behavioral analysis detection models tend to fare better than signatures against advanced attacks given the increased difficulty of obscuring malicious behavior. This detection approach, however, also has limitations. Namely, it tends to produce more false positives, creating excessive noise that can cause organizations to lose or overlook important signals surfaced by the detection model.

[4] Hackers in China Attacked The Times for Last 4 Months. New York Times. Jan. 30, 2014.
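The brittleness of exact-match signatures described above can be illustrated with a minimal sketch. It assumes a hypothetical scanner that looks for a fixed byte sequence (Signature 2 from Table 1) inside a binary and also records a SHA-256 hash; the surrounding `HEADER` and `PAYLOAD` bytes are placeholders invented for the example, not real malware content.

```python
import hashlib

# Signature 2 from Table 1, expressed as a byte sequence.
SIGNATURE = b"\x00\x00\x00androidrtservice.apk\x00"


def matches_signature(binary: bytes, signature: bytes = SIGNATURE) -> bool:
    """Exact 1:1 sequence match, as a signature-based scanner would perform."""
    return signature in binary


# Placeholder "binary": invented bytes wrapped around the flagged sequence.
original = b"HEADER" + SIGNATURE + b"PAYLOAD"

# Attacker modification: a single extra character in the file name.
modified = original.replace(b"androidrtservice", b"androidrtsxervice")

original_hit = matches_signature(original)   # the signature fires
modified_hit = matches_signature(modified)   # the signature no longer fires

# A cryptographic hash changes completely after the one-byte edit as well.
original_hash = hashlib.sha256(original).hexdigest()
modified_hash = hashlib.sha256(modified).hexdigest()
```

A one-byte edit defeats both the sequence match and the hash comparison, which is why neither check alone raises the attacker's cost very much.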
While behaviors can signal malicious activity, most behavioral analysis models lack the context to consistently differentiate between malicious and non-malicious intent behind behaviors. Consider the table below showing the permissions and corresponding contact-exfiltration behaviors of two different Android applications:

Table 2: Example Behavioral Analysis Limitations

                    APP 1                                      APP 2
Flagged Behavior    Yes                                        Yes
Sample Permissions  android.permission.READ_CONTACTS           android.permission.READ_CONTACTS
                    android.permission.ACCESS_NETWORK_STATE    android.permission.WRITE_CONTACTS
                    android.permission.ACCESS_FINE_LOCATION    android.permission.ACCESS_NETWORK_STATE
                    android.permission.READ_CALENDAR           android.permission.ACCESS_FINE_LOCATION
Behavior            Sends device contacts to server            Sends device contacts to server

Both apps, executed in a virtual environment, would access device contacts, network state and GPS location, and a behavioral analysis model that classifies device contact access and exfiltration as bad behavior would alert on both apps. But do both apps represent threats? Does it matter that App 1 accesses device calendar data and App 2 does not? It's difficult for automated systems to make these calls without an understanding of the context of each app's behavior. App 1 in this example, however, is a benign social networking application and App 2 is MalApp.D, malware disguised as a VoIP app first detected by Lookout.

This example illustrates how pure behavioral analysis approaches to security often lack the context to accurately assess behaviors. Like an overly sensitive smoke alarm, the lack of precision in these systems means they run the risk of failing to highlight the true signal amidst the noise they create. Some security experts, for example, have posited that although Target's credit card breach triggered security alerts in its systems, the importance of those alerts was not recognized amidst possibly hundreds of other security alerts generated on a daily basis. [5]
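The ambiguity in the two-app comparison above can be made concrete with a short sketch. It assumes a deliberately naive, hypothetical rule that flags any app holding the contacts permission and observed sending contacts to a server. The dictionary layout and the behavior label `sends_contacts_to_server` are invented for illustration, and the permission strings use the standard Android constant names.

```python
# Hypothetical records for the two apps in Table 2.
APP_1 = {
    "name": "App 1",  # benign social networking app
    "permissions": {
        "android.permission.READ_CONTACTS",
        "android.permission.ACCESS_NETWORK_STATE",
        "android.permission.ACCESS_FINE_LOCATION",
        "android.permission.READ_CALENDAR",
    },
    "behaviors": {"sends_contacts_to_server"},
}
APP_2 = {
    "name": "App 2",  # MalApp.D, malware disguised as a VoIP app
    "permissions": {
        "android.permission.READ_CONTACTS",
        "android.permission.WRITE_CONTACTS",
        "android.permission.ACCESS_NETWORK_STATE",
        "android.permission.ACCESS_FINE_LOCATION",
    },
    "behaviors": {"sends_contacts_to_server"},
}


def flags_contact_exfiltration(app: dict) -> bool:
    """Naive behavioral rule: contact access plus upload is always 'bad'."""
    return (
        "android.permission.READ_CONTACTS" in app["permissions"]
        and "sends_contacts_to_server" in app["behaviors"]
    )


# The rule cannot tell the benign app from the malicious one.
flagged = [app["name"] for app in (APP_1, APP_2) if flags_contact_exfiltration(app)]
```

Both apps end up in `flagged`, so the rule yields one true positive and one false positive with nothing in its inputs to separate them.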
Lastly, behavioral analysis detection models only provide a snapshot of behavior at a specific point in time, and this creates blind spots. Sophisticated attackers can evade detection by temporarily suppressing malicious behavior or creating multi-stage threats that bypass analysis and then download malicious payloads. Lookout, for instance, detected BadNews, a mobile threat that successfully bypassed a major app store's security analysis by posing as an ad network, only to later use its capabilities to prompt users to download malware disguised as updates. [6] Other mobile threats have demonstrated an ability to suppress malicious behavior for up to 30 days to evade behavioral detection. [7] Researchers continually uncover additional ways for clever attackers to evade behavioral detection: detecting the virtual environment itself, cueing their attack on behavior that a user would perform but that an analysis environment does not emulate (e.g. scrolling down on a document), or lying dormant on a particular targeted system.

Toward Predictive Security

Threat detection is fundamentally an exercise in prediction. Security systems detect threats by taking available information (inputs) and returning an assessment of risk (outputs) according to an analysis model. Signature and behavioral analysis models, however, fall short of true predictive security. Signatures require threat encounters before they can predict (identify) threats, and behavioral analysis predictions lack precision and can also fail to predict more advanced threats that obscure or suppress their behavior.

In short, organizations face a basic tradeoff when adopting these security models:

Signature models reduce false positives at the expense of false negatives
Behavioral models reduce false negatives at the expense of false positives

These tradeoffs come from these models' use of limited datasets and their corresponding inability to assess a potential threat's relation to the world of known code beyond signatures and behaviors. No matter how sophisticated the algorithms used, these security models will continue to suffer from this tradeoff on account of their limited data inputs. True predictive security requires real-time security telemetry from a global population of devices and the use of machines to sift through this dataset to identify complex risk correlations that would otherwise evade human analysis and basic 1:1 pattern matching. The real promise of a predictive security model is that it can detect threats where no prior signatures exist and before threats exhibit malicious behavior. With this promise in mind, Lookout has designed and built the Lookout Security Platform.

II The Lookout Security Platform

Introduction

The Lookout Security Platform is a cloud-based platform that detects and stops both mainstream and advanced mobile threats. The platform uses a predictive security model that enables threat detection even in cases where no prior signatures exist and before threats exhibit malicious behavior. It protects mobile endpoints and infrastructures from app and device-based threats, enables deep threat investigation, and ultimately powers a wide range of Lookout product offerings.

[5] Target says it declined to act on early alert of cyber breach. Reuters. Mar. 13, 2014.
[6] The Bearer of Bad News. Lookout. Apr. 19, 2013.
[7] Apps on Google Play Pose As Games and Infect Millions of Users with Adware. Avast. Feb. 3, 2015.
Figure 1: The Lookout Security Platform and Product Architecture
[Diagram labels: Consumer Product; Enterprise Product; Lookout Mobile Security (LMS) iOS; Lookout Mobile Security (LMS) Android; Mobile Threat Protection (MTP) iOS; Mobile Threat Protection (MTP) Android; App Vetting API; Mobile Intelligence Center (MIC); Lookout Security Platform]

To be clear, Lookout's platform incorporates signatures and behavioral analyses into its security stack to achieve defense-in-depth capabilities. It goes beyond these traditional detection techniques, however, in its use of real-time security telemetry and machine intelligence to automatically correlate the security signals from every device and app it encounters across multiple dimensions to track existing threats and predict novel threats.

III App Analysis Architecture

The diagram on the following page depicts the architecture of the platform's app-based threat detection capabilities. This architecture follows a four-step process:

1. Data Acquisition
2. Data Enrichment
3. Data Analysis
4. Protection
Figure 2: The Lookout Security Platform App Analysis Architecture
i. Acquisition

The platform collects real-time security telemetry on mobile applications from a variety of sources:

Mobile Sensor Network
More than 60 million registered mobile devices worldwide provide Lookout with a comprehensive, real-time view into threats on just one device or millions. Lookout's app binary acquisition process spreads the load among multiple devices to limit battery and data impact, reassembling the app fragments in the cloud and preserving end-user privacy by only collecting application binaries, not user personal data (e.g. photos, messages) generated in the course of using these applications.

Crawling
Lookout continually monitors the major and minor app stores of the world, including app stores in countries such as China, Russia, and India. Lookout's crawling technology also enables app acquisition from ad hoc web sources.

App Vetting API
By serving as the exclusive security layer for some of the world's largest app stores, the Lookout Security Platform has privileged access to malware submitted to these stores that never sees the light of day.

AT A GLANCE
Registered mobile sensors: 60+ million worldwide
App Vetting API partners: Many, including some of the world's largest app stores
Unique app binaries detected [8]: 67,500,000
Unique app binaries acquired: 11,000,000
Unique app binaries detected on only one device worldwide: 875,000
Apps acquired daily: 10,000+

[8] Lookout's platform is aware of the presence of 67,500,000 unique app binaries in the world, counted by cryptographic hash. This includes both system apps (apps that are part of the operating system) and user-downloaded apps, and counts each version of an app as a unique app instance.
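The idea of spreading binary acquisition across several devices and reassembling the fragments in the cloud can be sketched roughly as follows. The whitepaper does not describe the actual protocol, so this is only an assumed scheme: each device uploads one contiguous slice, and the cloud checks the reassembled bytes against the cryptographic hash the devices reported. The function names and the equal-slice strategy are hypothetical.

```python
import hashlib


def split_into_fragments(binary: bytes, n_devices: int) -> dict[int, bytes]:
    """Assign one contiguous slice of the binary to each of n_devices uploaders."""
    size = -(-len(binary) // n_devices)  # ceiling division
    return {i: binary[i * size:(i + 1) * size] for i in range(n_devices)}


def reassemble(fragments: dict[int, bytes], expected_sha256: str) -> bytes:
    """Join fragments in index order and verify them against the reported hash."""
    binary = b"".join(fragments[i] for i in sorted(fragments))
    if hashlib.sha256(binary).hexdigest() != expected_sha256:
        raise ValueError("reassembled binary does not match the reported hash")
    return binary


# Stand-in bytes for an app binary; only the binary itself is handled here.
app_binary = bytes(range(256)) * 10
reported_hash = hashlib.sha256(app_binary).hexdigest()

fragments = split_into_fragments(app_binary, n_devices=4)
rebuilt = reassemble(fragments, reported_hash)
```

Verifying against the hash also matches how the platform counts binaries: two uploads with the same hash are the same unique app instance.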
The following table highlights the types of data collected from mobile sensors in this acquisition funnel:

Table 3: Mobile Sensor Data Collection

TYPE                                      ANDROID/iOS     SCOPE
Application cryptographic hash            Android + iOS   All device apps.
Package name                              Android + iOS   All device apps.
APK [9] file                              Android         Only apps not recognized by Lookout's platform.
.ipa file metadata (Bundle ID, Team ID)   iOS             Only non-Apple App Store or enterprise-signed apps not recognized by Lookout's platform.

With respect to the collection of data directly from endpoint mobile devices, the Lookout Security Platform takes precautions to ensure it protects user privacy. For its consumer application, Lookout obtains consent before collecting security telemetry and offers users the right to opt out of this data collection. For Lookout's enterprise client, use of the product is conditional on sharing this security telemetry, which is required by Lookout to protect organizations. To reiterate, Lookout never collects personal data generated by users on their devices, such as images, audio, video, or text, and also never uses collected security telemetry to identify individual users unless a user specifically requests contact regarding a security issue.

[9] APK = Android Application Package, the package file format used to distribute and install app software onto Android devices.
ii. Enrichment

Each app acquired by Lookout's platform undergoes a unique enrichment process that characterizes how it works and accurately relates it to the world of known applications:

Metadata
Lookout appends data that includes app name, digital signature, app store description, and developer name.

Examples:
Package name: com.android.service
Signer: bb626d3b8406e7fc330d0f4b304cbfc5f610721f CN=Dragon, L=SZ, ST=GZ, C=CN
Packaged date: 2012-09-20 18:36:44 UTC
Signed date: 2012-09-20 18:36:42 UTC

Behavior
The platform generates app behavior data through dynamic and symbolic execution technologies that run the app in a simulated environment and analyze the capabilities of its code.

Examples:
BEHAVIORAL ANALYSIS RESULTS:
write_file (Osiris[0.1.217])
read_contacts (Static Behavior Extraction[3.1.469])
write_contacts (Static Behavior Extraction[3.1.469])
read_sms (Static Behavior Extraction[3.1.469])
read_imsi (Static Behavior Extraction[3.1.469])

Reputation
Lookout incorporates data related to the authorship, origin, and geo-historical distribution of an app, such as the duration and location of its popularity.

Examples:
REPUTATION RESULTS: 95% of known APKs that use this signer are malware
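A reputation result such as "95% of known APKs that use this signer are malware" amounts to a ratio over previously classified apps. The sketch below shows one way such a figure could be computed; the record layout (`signer`, `is_malware`) and the 20-app dataset are invented for illustration and do not reflect Lookout's actual data model.

```python
from typing import Optional


def signer_malware_rate(known_apps: list[dict], signer: str) -> Optional[float]:
    """Fraction of previously classified apps from this signer that are malware.

    Returns None when the signer has never been seen, because "no history"
    should not be confused with "0% malicious".
    """
    verdicts = [app["is_malware"] for app in known_apps if app["signer"] == signer]
    if not verdicts:
        return None
    return sum(verdicts) / len(verdicts)


SIGNER = "bb626d3b8406e7fc330d0f4b304cbfc5f610721f"  # signer from the metadata example

# Invented history: 19 malicious and 1 benign app signed with the same key,
# plus one unrelated benign app from a different, made-up signer.
known_apps = (
    [{"signer": SIGNER, "is_malware": True} for _ in range(19)]
    + [{"signer": SIGNER, "is_malware": False}]
    + [{"signer": "0000000000000000000000000000000000000000", "is_malware": False}]
)

rate = signer_malware_rate(known_apps, SIGNER)
unknown = signer_malware_rate(known_apps, "never-seen-signer")
```

With this invented history the rate for the example signer works out to 19 of 20, and an unseen signer returns `None` rather than a misleading zero.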
App Genome Sequencing Analysis
The platform automatically assesses the fuzzy code similarity an app shares with all known code in Lookout's mobile intelligence dataset. It reveals where that app's code (or its relatives) appears in the world by analyzing approximate similarity between individual code classes and then computing an aggregate similarity score.

Examples:
INDEX CLASS:                    SCORE:
Lorg/linphone/MapAPP$1$1;       0.9433
Lorg/linphone/MapAPP;           0.9846
Lorg/linphone/util/Constant;    1.0000
Index match:                    0.9923

Lookout holds patents related to its App Genome Sequencing technology, which is one of the key differentiating technologies that powers Lookout's predictive security model. Whereas attackers can evade signatures by changing a single line of code, App Genome Sequencing technology does not depend on precise 1:1 matches and can instead assess approximate match scores at both a granular (class or code block) and holistic (app) level. This dramatically raises the cost of attack because it requires attackers to essentially start from scratch and overhaul their entire code base to evade detection.

Even some of the less powerful enrichment technologies can play a key role in identifying and tracking malicious code by adding relevant data points to feed Lookout's Helix security engine and enable it to find more complex, multidimensional correlations.
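The contrast between exact matching and approximate, class-level matching can be sketched with a generic technique. The example below is not Lookout's patented App Genome Sequencing method, whose details the whitepaper does not disclose. It substitutes Jaccard similarity over instruction trigrams for the per-class score and a plain average of best per-class matches for the aggregate score. The class names and instruction lists are invented.

```python
def shingles(tokens: list[str], n: int = 3) -> set[tuple[str, ...]]:
    """Set of overlapping n-token windows drawn from an instruction sequence."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def class_similarity(a: list[str], b: list[str], n: int = 3) -> float:
    """Jaccard similarity between the shingle sets of two classes."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def app_similarity(app: dict[str, list[str]], known: dict[str, list[str]]) -> float:
    """Average, over the app's classes, of the best match among known classes."""
    if not app or not known:
        return 0.0
    best = [max(class_similarity(c, k) for k in known.values()) for c in app.values()]
    return sum(best) / len(best)


# Invented "known malware": class name -> simplified instruction sequence.
known_app = {
    "Lcom/example/Main;": ["const", "invoke", "move", "if", "invoke", "return"],
    "Lcom/example/Util;": ["new", "invoke", "aput", "return"],
}
# Variant: classes renamed and one filler instruction inserted.
variant_app = {
    "La/a;": ["const", "invoke", "move", "if", "invoke", "nop", "return"],
    "La/b;": ["new", "invoke", "aput", "return"],
}

exact_match = variant_app == known_app          # a 1:1 comparison fails
score = app_similarity(variant_app, known_app)  # fuzzy comparison still relates them
```

Renaming classes and inserting a filler instruction breaks the exact comparison, while the fuzzy score stays well above zero, so the attacker has to rewrite far more code to look unrelated.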
iii. Analysis

Lookout's Helix security engine ingests the data generated by the platform's acquisition and enrichment processes and then automatically compares these data points to the hundreds of millions of data points in Lookout's mobile intelligence dataset. Multidimensional threat correlation makes the platform substantially harder to evade because it requires attackers to re-implement their entire platform and command and control infrastructure, instead of simply changing the few components that match a signature or obscuring the malicious activity that would trigger an alert. In the event that the Lookout Security Platform finds no correlations, the platform relies on a risk-scoring model, taking inputs from the enrichment and analysis processes to predict zero-day threats.

The stunning breadth and complexity of the multidimensional correlations generated by the Helix security engine far outpace the capacities of human analysts and behavioral analysis models alone. Consider the diagrams on the following pages that visualize these correlations for two distinct malware families, Mouabad and NotInstalledYo.
Figure 3: Multidimensional Threat Correlation Analysis of Mouabad Malware Family

This diagram shows samples of the Mouabad mobile malware family, correlated by shared signer, IP communications, and binary similarity as calculated by the platform's App Genome Sequencing technology. Mouabad is a family of trojans that enable third-party control over a compromised device, allowing remote attackers to send premium rate SMS messages and engage in remote dialing activities.
Figure 4: Multidimensional Threat Correlation Analysis of NotInstalledYo Malware Family

This diagram shows samples of the NotInstalledYo mobile malware family, correlated by shared signers and binary similarity as calculated by the platform's App Genome Sequencing technology. The node at the center of this galaxy represents a widely shared signer that uses a compromised signing key. NotInstalledYo is a family of spyware that intercepts SMS messages on victimized devices and forwards them to attackers.

Figure 4.1: Red Zone Enlarged

Samples that share a high degree of binary similarity are grouped by color, and nodes to which multiple colored nodes connect signify a shared signer amongst those samples.
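The kind of correlation visualized in Figures 3 and 4 can be approximated as a graph clustering problem: samples become linked when they share a signer, contact the same IP address, or have high binary similarity. The sketch below uses a small union-find structure to group linked samples. It is a simplified stand-in and not the Helix engine itself; the sample records, signer labels, and documentation-range IP addresses are invented.

```python
from collections import defaultdict


def correlate(samples: list[dict], similar_pairs=()) -> list[list[str]]:
    """Group samples linked by shared signer, shared IP, or binary similarity."""
    parent = {s["id"]: s["id"] for s in samples}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    def union(a: str, b: str) -> None:
        parent[find(a)] = find(b)

    first_seen = {}  # (dimension, value) -> first sample id carrying that value
    for s in samples:
        keys = [("signer", s["signer"])] + [("ip", ip) for ip in s["ips"]]
        for key in keys:
            if key in first_seen:
                union(s["id"], first_seen[key])
            else:
                first_seen[key] = s["id"]

    for a, b in similar_pairs:  # pairs already judged similar at the binary level
        union(a, b)

    clusters = defaultdict(set)
    for s in samples:
        clusters[find(s["id"])].add(s["id"])
    return sorted(sorted(c) for c in clusters.values())


samples = [
    {"id": "A", "signer": "S1", "ips": ["192.0.2.1"]},
    {"id": "B", "signer": "S2", "ips": ["192.0.2.1"]},  # shares an IP with A
    {"id": "C", "signer": "S2", "ips": ["192.0.2.2"]},  # shares a signer with B
    {"id": "D", "signer": "S3", "ips": ["192.0.2.9"]},  # similar binary to C
    {"id": "E", "signer": "S4", "ips": ["192.0.2.5"]},  # unrelated
]
clusters = correlate(samples, similar_pairs=[("C", "D")])
```

Samples A through D end up in one family even though no single attribute is common to all four, which is why changing only a signer or only a server is not enough to escape the cluster.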
iv. Protection

The output of Lookout's platform is a dynamic security decision that identifies evolving known threats as well as unique, targeted attacks. When the platform detects novel threats it automatically initiates an investigative process, alerting Lookout's Research and Response team to further investigate the operation and motivation of attackers, take remedial action such as issuing server takedown requests, and ensure that relevant partners, customers and organizations take remedial action if needed.
IV Device Analysis Architecture

Figure 5: The Lookout Security Platform Device Analysis Architecture

To protect the underlying security of mobile devices from threats such as malicious rooting and jailbreaking, the Lookout Security Platform collects a range of device security telemetry to form a digital fingerprint of each device. This security telemetry includes:

a. OS/Firmware data - OS file metadata, such as the file name and hash
b. Configuration data - system properties of the OS configuration
c. Device data - device identifier information, for device remediation purposes

After collecting this data, the platform reassembles it in the cloud to form a device fingerprint. It correlates the various data points of this fingerprint against Lookout's mobile intelligence dataset to identify when a device is vulnerable or has been compromised, and can also predict device risk based on anomalies or correlations to known signals of compromise. When the platform detects a compromised device it executes remedial action through an integrated Mobile Device Management (MDM) client.

Today, most device compromise detection models rely on a handful of point tests, hard coded on the mobile client. Attackers have identified and successfully deconstructed these point tests and devised countermeasures to easily evade them. Lookout's detection model, however, differs substantially from these approaches in that it collects a holistic fingerprint of the device profile and sends it up to the cloud to analyze on the server side. Lookout's security model offers two key advantages. First, instead of reverse-engineering a few client-side point tests, attackers need to mimic the entire device state and its corresponding signals to evade Lookout, which significantly raises the cost of attack. Second, the server-side analysis inhibits attackers from easily reverse-engineering Lookout's detection methodology.

V Predictive Security in Action

The following threat detections demonstrate how the Lookout Security Platform has delivered on the promise of predictive security: it can detect threats for which no prior signatures exist and can even detect threats before they exhibit malicious behavior.

Case Study 1: BadNews

Consider the case of BadNews, a malicious mobile ad network. Lookout found BadNews embedded in 32 different apps that were live in Google Play and had received millions of downloads. BadNews enabled the installation of additional APKs and could open URLs in the browser, although it exhibited neither of these behaviors at the time of discovery. The Lookout Security Platform, however, detected that BadNews contained code that shared statistically significant correlations to known Russian malware and, in a pre-crime maneuver, proactively protected Lookout-enabled devices.

Post protection, Lookout continued to monitor BadNews in the wild and later observed it distributing new zero-day trojans via the APK installation functionality. Notably, BadNews only engaged in this malicious activity for five minutes a day, effectively disguising its activity from sandboxed security environments where isolated, point-in-time behavioral analyses would not detect the activity.

To read more about BadNews, please visit our blog: blog.lookout.com/blog/2013/04/19/the-bearer-of-badnews-malware-google-play

Case Study 2: MalApp.D

The power of a predictive security model is evident in Lookout's detection of MalApp.D, a mobile threat that neither matched a prior signature nor engaged in overtly malicious behavior, but nonetheless put enterprise contact data and voice communications at risk.

MalApp.D was embedded in a seemingly benign VoIP app that was live in the Google Play Store at the time of Lookout's detection. With a handful of positive reviews and a 4.2 star rating, the app appeared legitimate. Through multidimensional correlation, however, Lookout's platform revealed that this VoIP app was likely developed by a known author of mobile malware and that it therefore posed an unacceptable risk to enterprises given its access to device contacts and potential call recording capabilities.

To read more about MalApp.D, please visit our website: www.lookout.com/resources/reports/malapp
VI Conclusion

The Lookout Security Platform analyzes potential mobile threats not in the context of a single server, a single device, or a single application, but in the context of global mobile devices and code. Lookout's predictive security model enables more reliable tracking of existing threats and more precise predictions of zero-day threats.

Yet predictive security models only work if they can draw on global context. The continued failure of signatures and behavioral analysis alone to consistently identify threats without oceans of false positives or false negatives reveals the critical importance of having large, contextual datasets. Lookout's platform excels at finding the signal amid the noise because it has unprecedented insight into the code, both apps and firmware, running on tens of millions of devices around the planet. This massive dataset produces hundreds of millions of data points that the platform can use to correlate and predict security threats and risks.

Predictive security models require machine intelligence to identify exceedingly complex correlations and risk signals that humans cannot possibly identify at scale. Today, most detection systems excel only at identifying the bank robber who has already hit the vault. We should instead use the deluge of data available to us to predict the next bank robber based on their correlations across multiple dimensions to known bad actors.