Cybersecurity Analytics for a Smarter Planet



Similar documents
IBM Software Integrated Service Management: Visibility. Control. Automation.

Delivering information you can trust. IBM InfoSphere Master Data Management Server 9.0. Producing better business outcomes with trusted data

IBM Security Intrusion Prevention Solutions

IBM Security QRadar Risk Manager

IBM Cognos Analysis for Microsoft Excel

IBM Security QRadar Risk Manager

IBM Tivoli Netcool Configuration Manager

Real-time asset location visibility improves operational efficiencies

How To Create An Insight Analysis For Cyber Security

IBM Tivoli Endpoint Manager for Security and Compliance

IBM Tivoli Storage Manager for Virtual Environments

IBM Security X-Force Threat Intelligence

IBM Maximo Asset Management solutions for the oil and gas industry

Scorecarding with IBM Cognos TM1

are you helping your customers achieve their expectations for IT based service quality and availability?

The IBM Cognos family

Empowering intelligent utility networks with visibility and control

IBM InfoSphere Streams

Simplify security management in the cloud

The IBM Cognos family

Win the race against time to stay ahead of cybercriminals

IBM Content Analytics adds value to Cognos BI

For healthcare, change is in the air and in the cloud

Strengthen security with intelligent identity and access management

Stay ahead of insiderthreats with predictive,intelligent security

IBM Security Network Protection

IBM SmartCloud Monitoring

IBM Business Analytics: Finance and Integrated Risk Management (FIRM) solution

Datalogix. Using IBM Netezza data warehouse appliances to drive online sales with offline data. Overview. IBM Software Information Management

Smarter wireless networks

IBM Policy Assessment and Compliance

IBM Unstructured Data Identification and Management

Safeguarding the cloud with IBM Dynamic Cloud Security

IBM Security QRadar Vulnerability Manager

Driving workload automation across the enterprise

Information Management Configuring External Storage for Archive and Backup

IBM InfoSphere Information Server Ready to Launch for SAP Applications

IBM Endpoint Manager for Core Protection

Big Data Analytics with IBM Cognos BI Dynamic Query IBM Redbooks Solution Guide

IBM Cognos Insight. Independently explore, visualize, model and share insights without IT assistance. Highlights. IBM Software Business Analytics

Anderson University increases campus safety with better emergency communications

Beyond the Hype: Advanced Persistent Threats

IBM System x reference architecture solutions for big data

White paper December Addressing single sign-on inside, outside, and between organizations

IBM SmartCloud Workload Automation

Reporting and Incident Management for Firewalls

IBM Tivoli Endpoint Manager for Lifecycle Management

IBM Cognos Enterprise: Powerful and scalable business intelligence and performance management

Beyond passwords: Protect the mobile enterprise with smarter security solutions

Improve business agility with WebSphere Message Broker

IBM SmartCloud for Service Providers

Premier. Helping healthcare providers deliver the best possible care to their patients. Smart is...

Optimizing government and insurance claims management with IBM Case Manager

IBM ediscovery Identification and Collection

IBM Protocol Analysis Module

TIBCO Live Datamart: Push-Based Real-Time Analytics

IBM Tivoli Endpoint Manager for Lifecycle Management

Optimize workloads to achieve success with cloud and big data

Addressing government challenges with big data analytics

Breaking down silos of protection: An integrated approach to managing application security

IBM Tivoli Endpoint Manager for Security and Compliance

Jabil builds momentum for business analytics

White paper September Realizing business value with mainframe security management

Predictive analytics with System z

IBM Cognos TM1. Enterprise planning, budgeting and analysis. Highlights. IBM Software Data Sheet

Streamlining Web and Security

Rational Reporting. Module 3: IBM Rational Insight and IBM Cognos Data Manager

Beyond Watson: The Business Implications of Big Data

IBM Software Enabling business agility through real-time process visibility

Securing and protecting the organization s most sensitive data

IBM Software IBM Business Process Management Suite. Increase business agility with the IBM Business Process Management Suite

IBM Software Information Management Creating an Integrated, Optimized, and Secure Enterprise Data Platform:

Boosting enterprise security with integrated log management

z/os V1R11 Communications Server system management and monitoring

IBM Security Network Intrusion Prevention System

IBM Cognos Analysis for Microsoft Excel

Introducing IBM s Advanced Threat Protection Platform

Six Days in the Network Security Trenches at SC14. A Cray Graph Analytics Case Study

Making confident decisions with the full spectrum of analysis capabilities

White paper December IBM Tivoli Access Manager for Enterprise Single Sign-On: An overview

IBM Software Wrangling big data: Fundamentals of data lifecycle management

Low-latency market data delivery to seize competitive advantage. WebSphere Front Office for Financial Markets: Fast, scalable access to market data.

Addressing IT governance, risk and compliance (GRC) to meet regulatory requirements and reduce operational risk in financial services organizations

Gaining insight into critical enterprise applications with IBM CICS and business events

Build an effective data integration strategy to drive innovation

IBM SPSS Direct Marketing

Better decision making under uncertain conditions using Monte Carlo Simulation

ADY-1727: IBM Watson Analytics and Cognos Business Intelligence for Line of Business Smart Data Discovery

Big data management with IBM General Parallel File System

Oracle Internet of Things Cloud Service

IBM Software Cloud service delivery and management

Fiserv. Saving USD8 million in five years and helping banks improve business outcomes using IBM technology. Overview. IBM Software Smarter Computing

IBM Cognos Controller

IBM Endpoint Manager for Server Automation

IBM Information Archive for , Files and ediscovery

Four keys to effectively monitor and control secure file transfer

Transcription:

IBM Institute for Advanced Security December 2010 White Paper Cybersecurity Analytics for a Smarter Planet Enabling complex analytics with ultra-low latencies on cybersecurity data in motion

2 Cybersecurity Analytics for a Smarter Planet Contents 2 Executive Summary 2 Introduction 3 Stream Computing 3 Architectural Overview 7 InfoSphere Streams Technology for Cybersecurity 9 Summary Executive Summary As the world s critical infrastructures become more instrumented, interconnected, and intelligent, the amount of cybersecurity-related data about those infrastructures has grown at a rate far exceeding the capacity of traditional cybersecurity management systems. This security data about network traffic, access control exceptions, and other unsafe or unsecure conditions was already stressing our systems ability to use that data. Facing exploding cybersecurity data volumes, organizations are struggling to make real-time decisions and continue to maintain the integrity and security of their systems. The conventional approach first requires data to be stored in a database and then queries are run after the fact to detect or determine the cause of problems. They are fast realizing that the time lost in this process results in a reactive security stance that is often far too late. IBM InfoSphere Streams can address this challenge by providing a futuristic technology that can extract knowledge from cybersecurity data streams still in motion, allowing prompt protective actions. Introduction The goal of IBM s research into stream processing is to provide breakthrough technologies that enable aggressive production and management of information from relevant data, which must be extracted from enormous volumes of potentially unimportant data. Specifically, stream processing can radically extend the state of the art in information processing by simultaneously addressing several technical challenges: Respond quickly to events and changing requirements Continuously analyze data at rates that are greater than existing systems Adapt rapidly to changing data forms and types Manage high availability, heterogeneity, and distribution for the new stream paradigm Provide security and information confidentiality for shared information While some research and commercial initiatives try to address those technical challenges in isolation, IBM is seeking to address all of them together. The primary goal of stream processing is to break through a number of fundamental barriers to create a system designed to meet these challenges. The project, which began in IBM Research in 2003 as System S, was demonstrated in various application environments and is now available as an IBM offering. The technology is currently installed in dozens of sites across three continents.

IBM Security 3 Stream Computing Stream computing is a new standard. In traditional processing, one can think of running queries against relatively static data; for instance, list all personnel residing within 50 miles of New Orleans, which results in a single result set. With stream computing, one can execute a continuous query that identifies personnel who are currently within 50 miles of New Orleans, but get continuous, updated results as location information from GPS data is refreshed over time. In the first case, questions are asked of static data; in the second case, data is continuously evaluated by static questions. Stream processing goes further by allowing the continuous queries to be modified over time. A simple view of this distinction is as follows: Figure 1: A conceptual overview of static data versus streaming data While there are other systems that embrace the stream computing model, IBM takes a different approach for continuous processing and differentiates with its distributed runtime platform, programming model, and tools for developing continuous processing applications. These applications rapidly analyze information as it streams from thousands of real-time sources, including sensors, cameras, news feeds, network devices (firewalls or web servers), cybersecurity monitors, or various other sources, including traditional databases. Architectural Overview The streams architecture represents a significant change in computing system organization and capability. Even though it has some similarity to Complex Event Processing (CEP) systems, it is built to support higher data rates and a broader spectrum of input data modalities. It also provides infrastructure support to address the needs for scalability and dynamic adaptability, such as scheduling, load balancing, and high availability. In stream processing, continuous applications are composed of individual operators, which interconnect and operate on multiple data streams. Data streams can come from outside the system or can be produced internally as part of an application. The following flow diagram shows an example of how multiple types of streaming data can be filtered, classified, transformed, correlated, and used to make an equities trade decision by using dynamic earnings calculations, adjusted according to news analyses, and realtime risk assessments such as the impact of impending hurricane damage.

4 Cybersecurity Analytics for a Smarter Planet Figure 2: An example of making a trade decision based on continuous data streams For the purposes of this overview, it is not necessary to understand the specifics of Figure 2; rather, its purpose is to demonstrate how diverse streaming data sources can make their way into the core of the system, be analyzed in different fashions by different pieces of the application, flow through the system, and produce results. These results can be used in different ways, including display within a dashboard, driving business actions, or storage in enterprise databases for further offline analysis.

IBM Security 5 Figure 3 illustrates the complete prototype infrastructure. Data from input data streams representing a myriad of data types and modalities flow into the system. The layout of the operations performed on that streaming data is determined by high-level system components that translate user requirements into running applications. Figure 3: System overview

6 Cybersecurity Analytics for a Smarter Planet The system offers three methods for users to operate on streaming data: The Stream Processing Application Declarative Engine (SPADE) provides a language and runtime framework to support streaming applications. Users can create applications without having to understand the lower-level stream-specific operations. SPADE offers the ability to bring streams from outside the system and export results, and a facility to extend the underlying system with user-defined operators. Users can pose inquiries to the system to express their information needs and interests. These inquiries are translated by a Semantic Solver into a specification of how potentially available raw data and existing information can be transformed to satisfy user objectives. The runtime environment accepts these specifications, considers the library of available application components, and assembles a job specification to run the required set of components. Users can develop applications through an Eclipsebased workflow development tool environment, which includes an Integrated Development Environment (IDE). These users can program lowlevel application components that can be interconnected by streams, and specify the nature of those connections. Each component is typed so that other components can later reuse or create a particular stream. This development model will evolve over time to operate directly on SPADE operators rather than the base, low-level applications components, but will still let new operators be developed. All three of these methods are supported by the underlying runtime system. As new jobs are submitted, the scheduler determines how it might reorganize the system in order to meet the requirements of both newly submitted and already executing specifications, and the Job Manager automatically affects the changes required. The runtime continually monitors and adapts to the state and utilization of its computing resources, as well as the information needs expressed by the users and the availability of data to meet those needs. Results that come from the running applications are acted upon by processes (such as web servers) that run external to the system. For example, an application might use TCP connections to receive an ongoing stream of data to visualize on a map, or it might alert an administrator to anomalous or interesting events.

IBM Security 7 InfoSphere Streams Technology for Cybersecurity In early 2010, IBM made this streams technology generally available under the name InfoSphere Streams. Many applications have been pursued, including financial services, energy trading services, health monitoring, and manufacturing. Most recently, the technology has been used to address the deluge of security data in critical cyber infrastructure systems. As the systems that operate the critical infrastructures of the world become more intelligent and interconnected, they become more reliable and efficient. However, their exposure to cybersecurity vulnerabilities can increase, as these systems were not previously interconnected to each other or the Internet. Furthermore, the importance of these systems to everyday life has driven the incorporation of security monitoring and reporting features. While well intentioned, this has led to a deluge of real-time cybersecurity data that is so large that it is sent directly to storage (when possible) or ignored. With the increase of user devices added to the growth of sensors and actuators in these critical cyber infrastructure systems, the nature of the security data is changing: Data volumes and network bandwidth are expected to grow tenfold in the next three years. With the expansion of information, come large variances in the complexion of available data. Often the data is noisy with many errors and there is no opportunity to cleanse it in a world of real-time decision-making. One of the driving factors for the cyber infrastructure is the increased speed at which decisions have to be made. The market demands that businesses optimize decisions, take action based on good information, and use advanced predictive capabilities all with speed and efficiency. The powerful analytic capability provided by InfoSphere Streams is a good match for this challenge. Figure 4 shows how the cybersecurity decision processes can be enhanced with InfoSphere Streams.

8 Cybersecurity Analytics for a Smarter Planet Figure 4: Realizing smarter security decisions using InfoSphere Streams As a specific example of how InfoSphere Streams can be used for cybersecurity, consider the problem of botnets. Botnets are a network of compromised computers (called zombies), which are controlled by single host or group called the botmaster. The size of these botnets can range from hundreds to millions of hosts scattered around the world. Typical uses of a botnet include denial of service attacks, spam delivery, stealing banking credentials, and data theft, to name a few. The real owner of these zombie machines is typically unaware that a computer has been taken over. The botmaster uses a highly distributed command and control structure, often with standard functions, such as IRC, HTML, or IM to tell the zombie computers what to do. The effective detection of botnet activity requires the aggregation of large data streams of differing formats and semantics from distributed sites across companies, jurisdictions, and countries. This real-time data can come from several sources at differing speeds and qualities: Net flow data from enterprise networks, containing source/destination address/port, protocol, data volume, and router-specific summaries of point-topoint connections Intrusion Prevention System (IPS) logs Firewall logs DHCP logs DNS query logs Web server logs

IBM Security 9 The data rates can vary widely, from a few records per hour to more than 5,000 per second. This data is analyzed and key features that indicate botnet-like behavior are extracted. Finally, these features must be correlated across the different streams to confirm the botnet s presence. For example, in one experiment, the collected live network flow data was analyzed and 32 hosts were found to be accessing known botnet command and control (C&C) servers. In another instance, the flow of DNS queries was analyzed and 12 hosts were found to be actively querying the known botnet C&C servers. Neither of these analyses was possible before due to the vast amount of real-time data that was required. InfoSphere technology provided the ability to quickly identify potential zombie systems in an ongoing, realtime manner. Summary The research that began in 2003 and eventually became IBM InfoSphere Streams has demonstrated many successes in a variety of commercial and scientific applications. It provides an infrastructure to support mission-critical data analysis with exceptional performance and interoperability. Research around stream computing continues and is expected to further increase the scale and diversity of the IBM InfoSphere Streams infrastructure, tools, support, and potential applications. For more information about InfoSphere Streams, please contact your IBM Marketing Representative or Authorized IBM Business Partner.

10 Cybersecurity Analytics for a Smarter Planet Copyright IBM Corporation 2010 IBM Corporation Route 100 Somers, NY 10589 U.S.A. Produced in the United States of America December 2010 All Rights Reserved IBM, the IBM logo, ibm.com, and InfoSphere are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol ( or TM ), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the web at Copyright and trademark information at ibm.com/legal/copytrade.shtml Other company, product and service names may be trademarks or service marks of others. References in this publication to IBM products and services do not imply that IBM intends to make them available in all countries in which IBM operates. No part of this document may be reproduced or transmitted in any form without written permission from IBM Corporation. Product data has been reviewed for accuracy as of the date of initial publication. Product data is subject to change without notice. Any statements regarding IBM s future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. THE INFORMATION PROVIDED IN THIS DOCUMENT IS DISTRIBUTED AS IS WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IBM EXPRESSLY DISCLAIMS ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT. IBM products are warranted according to the terms and conditions of the agreements (e.g. IBM Customer Agreement, Statement of Limited Warranty, International Program License Agreement, etc.) under which they are provided.