The WebJob Framework: A Generic, Extensible, and Scalable Endpoint Security Solution



Similar documents
Guideline on Auditing and Log Management

FISMA / NIST REVISION 3 COMPLIANCE

Host/Platform Security. Module 11

Information Technology Policy

GE Measurement & Control. Top 10 Cyber Vulnerabilities for Control Systems

For Businesses with more than 25 seats.

Description of Actual State Sensor Types for the Software Asset Management (SWAM) Capability. 7 Jul 2014

Host-based Protection for ATM's

ensure prompt restart of critical applications and business activities in a timely manner following an emergency or disaster

Mobile security and your EMR. Presented by: Shawn Tester & Allen Cornwall

DMZ Gateways: Secret Weapons for Data Security

SolarWinds Security Information Management in the Payment Card Industry: Using SolarWinds Log & Event Manager (LEM) to Meet PCI Requirements

Honeywell Industrial Cyber Security Overview and Managed Industrial Cyber Security Services Honeywell Process Solutions (HPS) June 4, 2014

Solution Brief for ISO 27002: 2013 Audit Standard ISO Publication Date: Feb 6, EventTracker 8815 Centre Park Drive, Columbia MD 21045

SANS Top 20 Critical Controls for Effective Cyber Defense

Guardium Change Auditing System (CAS)

PCI Data Security Standards (DSS)

IBM Data Security Services for endpoint data protection endpoint data loss prevention solution

IBM QRadar Security Intelligence April 2013

CimTrak Technical Summary. DETECT All changes across your IT environment. NOTIFY Receive instant notification that a change has occurred

Nessus Agents. October 2015

Dooblo SurveyToGo: Security Overview

SANS Institute First Five Quick Wins

Protect Your IT Infrastructure from Zero-Day Attacks and New Vulnerabilities

DMZ Virtualization Using VMware vsphere 4 and the Cisco Nexus 1000V Virtual Switch

MIGRATIONWIZ SECURITY OVERVIEW

Agent vs. Agent-less auditing

Larry Wilson Version 1.0 November, University Cyber-security Program Critical Asset Mapping

What Do You Mean My Cloud Data Isn t Secure?

Data Management Policies. Sage ERP Online

Features Business Perspective.

Monitoring Windows Workstations Seven Important Events

Concierge SIEM Reporting Overview

Rule 4-004M Payment Card Industry (PCI) Monitoring, Logging and Audit (proposed)

Information Technology Solutions

Comprehensive Malware Detection with SecurityCenter Continuous View and Nessus. February 3, 2015 (Revision 4)

End-user Security Analytics Strengthens Protection with ArcSight

Overview of Network Security The need for network security Desirable security properties Common vulnerabilities Security policy designs

EXTENSIVE FEATURE DESCRIPTION SECUNIA CORPORATE SOFTWARE INSPECTOR. Non-intrusive, authenticated scanning for OT & IT environments. secunia.

Endpoint Threat Detection without the Pain

Securing Privileges in the Cloud. A Clear View of Challenges, Solutions and Business Benefits

defending against advanced persistent threats: strategies for a new era of attacks agility made possible

FormFire Application and IT Security. White Paper

Managing for the Long Term: Keys to Securing, Troubleshooting and Monitoring a Private Cloud

Business white paper. Missioncritical. defense. Creating a coordinated response to application security attacks

High End Information Security Services

INCIDENT RESPONSE CHECKLIST

Improving the Security of the User Environment through Standards

Running the SANS Top 5 Essential Log Reports with Activeworx Security Center

Lifecycle Solutions & Services. Managed Industrial Cyber Security Services

Security Event Management. February 7, 2007 (Revision 5)

Skoot Secure File Transfer

With Great Power comes Great Responsibility: Managing Privileged Users

Introduction to Cyber Security / Information Security

Smarter Security for Smarter Local Government. Craig Sargent, Solutions Specialist

Flow-based detection of RDP brute-force attacks

How To Secure Your Store Data With Fortinet

Global Partner Management Notice

IBM Tivoli Endpoint Manager for Security and Compliance

Securing Endpoints without a Security Expert

2. From a control perspective, the PRIMARY objective of classifying information assets is to:

IQware's Approach to Software and IT security Issues

Ovation Security Center Data Sheet

ARS v2.0. Solution Brief. ARS v2.0. EventTracker Enterprise v7.x. Publication Date: July 22, 2014

APPLICATION OF MULTI-AGENT SYSTEMS FOR NETWORK AND INFORMATION PROTECTION

Oracle Maps Cloud Service Enterprise Hosting and Delivery Policies Effective Date: October 1, 2015 Version 1.0

February 22, (Revision 2)

MySQL Security: Best Practices

Securely maintaining sensitive financial and

Driving Company Security is Challenging. Centralized Management Makes it Simple.

McAfee Server Security

Tivoli Endpoint Manager. Increasing the Business Value of IT, One Endpoint at a Time

IBM Tivoli Endpoint Manager for Lifecycle Management

Log Management for the University of California: Issues and Recommendations

National Cyber Security Framework and Protocol. for securing digital information in networked critical infrastructures and communications

Managed Security Services for Data

Network Security Policy

Analyzing HTTP/HTTPS Traffic Logs

System Security Policy Management: Advanced Audit Tasks

KEY STEPS FOLLOWING A DATA BREACH

Preempting Business Risk with RSA SIEM and CORE Security Predictive Security Intelligence Solutions

NIST CYBERSECURITY FRAMEWORK COMPLIANCE WITH OBSERVEIT

SECURELINK.COM REMOTE SUPPORT NETWORK

The Business Case for Security Information Management

How To Secure An Rsa Authentication Agent

How To Protect Your Cloud From Attack

IBM InfoSphere Guardium Data Activity Monitor for Hadoop-based systems

The Comprehensive Guide to PCI Security Standards Compliance

KASPERSKY SECURITY INTELLIGENCE SERVICES. EXPERT SERVICES.

White Paper Integrating The CorreLog Security Correlation Server with BMC Software

SECURITY BEGINS AT THE ENDPOINT

Protecting Critical Infrastructure

Beyond Remote Control Features that Take Remote Control Capabilities to the Next Level of Network Management

IBM Tivoli Endpoint Manager for Lifecycle Management

WebEx Security Overview Security Documentation

Threat Center. Real-time multi-level threat detection, analysis, and automated remediation

AlienVault for Regulatory Compliance

Security Overview Introduction Application Firewall Compatibility

Secure Your Mobile Workplace

CorreLog Alignment to PCI Security Standards Compliance

Transcription:

The WebJob Framework: A Generic, Extensible, and Scalable Endpoint Security Solution Andy Bair and Klayton Monroe KoreLogic Security June 2015 Abstract The WebJob framework is a next generation endpoint security solution that, from a centralized management location, can execute virtually any program on any number of client systems at any time. This framework was designed to be generic in nature, operating system agnostic, minimal in footprint, minimal in resource consumption, and highly secure. The framework has been deployed in a number of production environments including the Federal government and Fortune 500 businesses to perform a myriad of functions on a wide array of end systems. There is virtually no limit to the functions the framework can execute on end systems. For example, the framework has been used to perform evidence collection, enterprise searching, incident response, live forensics, system management and monitoring, and grid computing. The framework is well-suited to be the next generation endpoint system security solution. The Problem There are many agent-based solutions in the market today that provide important value to organizations. While the individual solutions are useful, as a whole they can complicate security and interoperability. Many solutions are narrowly focused on a single problem domain (so called point solutions ). By design, this limits their effectiveness in other domains and often makes it difficult to perform custom or ad hoc tasks. Point solutions result in the need for multiple agents on each endpoint. This can cause operating conflicts and become problematic in older disconnected, intermittent, or low-bandwidth (DIL) environments, where adding and maintaining agents is difficult. Many solution frameworks lack sufficient security on the server and the endpoints. This increases the attack surface of each endpoint and exposes additional attack vectors, which could leave organizations open to prolonged and/or large-scale attacks. Finally, a number solutions are proprietary in nature and therefore are difficult to adapt to other problem sets or integrate into other systems or solutions. The Solution: WebJob Framework The WebJob framework 1 is a next generation endpoint security solution that, from a centralized management location, can execute virtually any program on any number of client systems at any time. This framework was designed to be generic in nature, operating system agnostic, minimal in footprint and resource consumption, and highly secure. The framework centralizes data collection across all clients enabling personnel to analyze and respond to the results effectively and efficiently. The framework allows execution of regular canned jobs and arbitrary ad hoc jobs allowing organizations to adapt quickly to the 1 http://webjob.sourceforge.net/webjob/index.shtml 1/5

changing needs of their environment or the situation at hand (e.g., emergency change requests, incident response requests, etc.). And because the framework was designed this way, there is virtually no limit to what it can do on end systems. Jobs Server Results The framework is an open source client-server system that enables engineers to run arbitrary programs on end systems and aggregate the results on designated collection servers (usually the management server itself). The framework allows execution of any program on a wide array of operating systems (e.g., Linux, Mac OS, Windows, Android, etc.). It provides the automation and security to support a variety of activities including system/integrity monitoring, enterprise search, configuration management, compliance verification, automated analysis, and other activities normally requiring discrete solutions. Figure 1 shows the key components of the framework and are described below. Server The Server functions as the command and control or nerve center of the framework. All tasks are managed, scheduled, and processed from this system (or cluster of systems). Client The Client is a lightweight program (typically less than a few megabytes) that is deployed to each participating end system and executes jobs assigned to it by the Server. Job Jobs are programs or scripts responsible for completing a given task. A job can perform a simple action like providing a heartbeat indicating that the end system is alive and well, or it can perform more complicated actions such as replacing system files that have drifted from a defined standard, become corrupt/infected, or just need to be updated. Figure 1: The WebJob Framework The Framework in Action The framework has been deployed in many production environments including the Federal government and Fortune 500 businesses to perform a myriad of functions on a wide array of end systems. The following are just a few examples of the framework in action. The Swiss Army Knife The framework has been deployed at a Federal agency for several years in both classified and unclassified environments. Over a seven year period, the framework supported hundreds of, and at one point, the framework supported 7,000 jobs per hour (or about 5 million jobs per month). The framework has performed many functions on end systems including execution of the DISA Security Readiness Review (SRR) scripts, end system compliance checking and auditing, file system searching, software management, user and group management, log management, file system integrity monitoring, process monitoring, network monitoring, disk utilization monitoring, and system health monitoring. 2/5

Incident Response / Live Forensics The framework was used in a Federally Funded Research and Development Company (FFRDC) to collect forensic information on Windows systems suspected of having Microsoft Word documents infected with malware designed to install a back door. A custom Windows installer program was created to harvest and upload forensic data from end systems then clean-up after execution resulting in a temporary framework of. The quick response combined with data aggregation allowed security personnel to triage machines and address legitimate system infections quickly. Evidence Collection / Enterprise Search The framework has been used in a quick response action to a data breach and extortion case for Fortune 500 companies. The framework was used to search for Personally Identifiable Information (PII) and harvest files on hundreds of production UNIX servers and thousands of Windows desktops/laptops where COTS tools were not equipped to conduct searches or perform collection and ad hoc tasking in a manner that met the customer's strict requirements. Compliance The framework was used at a Fortune 500 company to search approximately 225 million files across hundreds of production UNIX servers for sensitive unprotected data to address regulatory compliance violations. The framework allowed managers to locate the violations quickly, with pinpoint accuracy, and determine the magnitude of the problem. The rapid response (roughly three weeks) coupled with supporting data showing the issue had been resolved enabled auditors to grant compliance before the end of the calendar year. System Monitoring and Management The framework was used to manage enterprise Intrusion Detection Systems (IDS) for a Fortune 500 critical infrastructure telecommunications firm. The framework performed a number of ad hoc tasks above and beyond its primary focus, which was IDS management. Additionally, it was used to collect, manage, and process NetFlow data. Grid Computing The framework is currently used by KoreLogic to provide a distributed computing grid in support of activities such as malware analysis and password cracking. In the case of password cracking, for instance, the framework routinely manages multiple concurrent work orders, consisting of thousands of individual tasks, across tens to hundreds of GPU/CPU cores. This effectively makes it possible, in some cases, to condense decades of computing work into a few hours or a single day. The framework was also used to run and manage a distributed malware sandnet for the purpose of testing the performance of various Anti-Virus (AV) software products against a large set of malware samples. Next Generation Endpoint Security The framework can execute virtually any job on an arbitrary number of end systems with a high degree of confidentiality, integrity, and attribution. As the demand for capacity grows (either because each end system is running more jobs or because more end systems must be supported), the framework can be scaled horizontally, vertically, or both as needed. Interoperability Since the Client is relatively small and written in C, the framework supports a broad range of platforms (e.g., mobile devices, laptops, desktops, servers, etc.) and operating systems (e.g., Linux, Mac OS, Windows, *BSD, Solaris, HP-UX, AIX, Android, 3/5

etc.). Additionally, the framework supports thick, thin, and virtual endpoint systems as well as real/virtual servers. The framework can be configured to process job data in a number of ways ranging from basic aggregation to normalization to lightweight analysis to heavy post-processing and data correlation. By extension, this implies that the framework would be an ideal feed for larger Security Information and Event Management (SIEM) 2 systems or centralized log management systems. Scalability The framework is designed to scale horizontally and vertically as shown in Figure 2. A single server can support hundreds or thousands of disparate end systems depending on their polling frequencies and the total number of jobs that must be executed per unit time. Servers can also be configured as clients to other servers thereby creating a multitiered framework. This provides a number of unique benefits including independent or autonomous operation, alignment with chain of command, data compartmentalization, and better resource utilization. If deployed in a mesh configuration where each endpoint is multihomed, single points of failure within the framework can be eliminated thereby improving overall survivability. Server 2 Server 1 Scales Horizontally Server 3 Scales Vertically Figure 2: Framework Scalability 2 http://en.wikipedia.org/wiki/security_information_and_ event_management Utilize OS security solutions The framework fully supports utilization of the underlying OS security solutions available. For example, the framework can be used to manage and monitor existing firewall configurations on client systems. It can also be used to enable various host-based security controls such as file permissions, groups, logging policies, access lists, etc. Enhance OS security solutions The framework enhances the underlying OS security solutions. For example, the framework can run anything on client systems including existing security tools to monitor critical system components such as processes, file system integrity, registry hive integrity, and general security health. The data collected from end systems can be processed on the server and alerts can be sent in response to anomalous and/or malicious activity. Alternately, the data can be minimally processed on the server and forwarded on to an existing SIEM or log analysis system for further analysis. Reduce attack surface The framework was designed with a security perspective and built from day one to minimize the available amount of attack surface in a number of ways. First, the standard Client does not have a listening daemon (i.e., it does not listen and respond to network traffic). Instead, the Client employs a pull-based model to poll the Server on a regular or ad hoc basis to receive tasks. This is significant because it eliminates an entire class of attacks. More specifically, it means that a standard Client could not be overrun by a network-based attack directed at the end system on which it runs. To thwart sophisticated man-in-the-middle (MITM) attacks, the standard Client supports mutual certificate authentication and digitally signed jobs. This has the added benefit of providing confidentiality, integrity, and attribution for Client/Server communications, and it ensures that the 4/5

Client cannot be instructed to run unauthorized jobs. If configured properly, even an attacker with system privileges on the Server would have difficulty coercing a Client to execute a job that has been maliciously altered without also having access to the private job-signing key and/or system privileges on the end systems. To protect the confidentiality and integrity of job data, either the Server or the jobs running through the Client on the end system can be configured to encrypt job output using public/private encryption (e.g., Pretty Good Privacy) such that the Server simply becomes a holding pen for data. This technique could be used to prevent an attacker who has gained privileged access to the Server from viewing potentially sensitive data. Digitally Signed Jobs Server Hardened Gentoo Linux Encrypted Drive Encrypted Job Results SSL Mutual Certificate Authentication Validate signature and execute Job N Client Machines (Windows, MAC, UNIX, etc.) Figure 3: Framework Security Changing connectivity The framework supports both low- and high-agility environments, and disconnected, intermittent, or low Bandwidth/High Latency (DIL) environments. Since the Client periodically polls the Server for tasks to perform, the connection can be severed for an extended period of time with little or no ill effect. In other words the Client will simply reestablish its connection as soon as network conditions allow it, and any pending tasks can then be requested and performed. This allows client systems to temporarily "drop-off" and return at a later time depending on connectivity. Also, the framework is capable of supporting high- and low-bandwidth environments. For example, a simple heartbeat job can consume less than 1 kilobyte of bandwidth. In the case where bandwidth is limited, other utilities can be used in conjunction with the Client to stream data back to the server in a controlled fashion. Minimize operational overhead The framework is intentionally designed to be openended and flexible to reduce management and training costs. Because of this, there are any number of capabilities not yet developed that the framework could support. While it would require some development effort to engineer and integrate a new capability into the framework, this effort is typically a one-time cost. More importantly, the Return on Investment (ROI) for this effort is often realized in as few as 100 jobs run on a single end system, a single job run on 100 end systems, or some combination thereof. This implies that time and money will be saved for each and every end system serviced by the new capability beyond that break-even threshold. For environments with thousands of end systems, the savings would be significant. Conclusion The framework is a proven technology that has been used in numerous deployment scenarios. It has an extremely small footprint, low overhead, and supports multiple platforms and DIL environments. WebJob has been deployed in a number of US Government and Fortune 500 environments to perform a wide variety of tasks. It is both flexible and extensible, providing a single-agent solution in situations where existing COTS tools would be inadequate or impractical. 5/5