How To Manage Event Data With Rocana Ops




ROCANA WHITEPAPER Improving Event Data Management and Legacy Systems

WHAT IS EVENT DATA?

There are myriad terms and definitions for data that is the by-product of operational systems. Curt Monash is credited with coining the term "machine data" [2]. "Log data" is a commonly used term, but it refers to only a subset of the data. ExtraHop [3] has defined four classes of data: wire, machine, metric, and synthetic. In this paper we use the term "event data" to mean the union of all these terms.

INTRODUCTION

As IT infrastructure evolves into a more dynamic and elastic infrastructure, IT operators are encountering a new set of challenges in controlling their modern environments. Most obvious among these challenges is managing the massive amount of event data being generated in global-scale enterprises: estimates are that machine data is growing at a rate of 40% annually [1]. Many businesses are already generating terabytes of event data per day, a figure rapidly growing to tens, if not hundreds, of terabytes per day in the next few years. Managing event data is the first step of the journey to utility, and analyzing event data is a second big challenge.

STATE OF AFFAIRS

Event data has myriad uses, from monitoring and managing IT systems to security to business intelligence. There have been many attempts at providing a software solution for monitoring and analyzing event data, the most well-known being Splunk. As with many other legacy solutions, the industry has started to recognize some serious liabilities, including: cost; licensing policies; complexity; and closed architecture.

COST

Increasingly, businesses are looking for alternatives to increasing their Splunk spend because of the cost structure alone. A recent study by Blue Hill Research [4] determined that the one-year TCO for a 1 TB/day implementation of Splunk Enterprise was nearly $1,000,000, with the three-year TCO exceeding $2M. Because of Splunk's cost structure, most businesses are very selective about which data is ingested in order to save money. This decision to sample data often causes problems down the line when the data needed is not available.

Another reason businesses are looking for alternatives to Splunk is the "hostage clause" in the license agreement for Splunk Enterprise. The agreement sets limits on daily ingest volumes. When these limits are exceeded, access to data is blocked until the license violation is rectified. Often this means purchasing an unanticipated, unbudgeted license upgrade. Worse yet, event data volumes typically surge right before or while problems are occurring, so this licensing policy often means that Splunk becomes unavailable just when it is needed most.

COMPLEXITY

Managing and using Splunk is so complicated that the community of Splunk experts has become known as "ninjas" for their advanced skillset. The complexity is twofold: (1) configuring the actual hardware infrastructure and planning capacity, and (2) implementing the brute-force user model to find problems and build solutions.

2015 Rocana, Inc.

Splunk provides a reference architecture that involves indexers, search heads, and load balancers. The ratio of search heads to indexers is highly dependent on the queries users execute, making capacity planning a very challenging proposition. Worse, Splunk throughput is highly affected by both disk utilization rates and query class.

The brute-force approach employed by Splunk is a second axis of complexity. Users must navigate through mountains of data to find records of real value. This is an artifact of Splunk's user model, in which everything is a search query. Even integrations with other systems are query-based: save a query, execute the saved query, and have an external tool pick up the results.

CLOSED ARCHITECTURE

Like a roach motel, Splunk is a one-way street. Splunk is a completely proprietary product; there is no way to access data other than through Splunk-provided tools. The data and indexes are closed. The only APIs are those provided by Splunk. All data access is query-based, either directly or through the API. This hurts businesses in several ways.

The biggest limitation of the closed architecture is sharing data with other systems. Splunk provides methods for sharing data, but they are query-based. This makes Splunk effectively a batch-oriented system. Real-time or streaming analysis with external tools is not possible. The workaround of rapidly re-issuing queries can address this problem to a large degree, but at the cost of excessive system overhead, which in turn means an increase in required system resources and a higher TCO.

Additionally, the proprietary nature of Splunk has limited the pool of so-called "ninjas". The scarcity of Splunk professionals makes them expensive to hire. Also, because these professionals are in great demand, they are frequently targeted by recruiters, resulting in a high risk of turnover.

A BETTER APPROACH

Rocana Ops was created from the ground up using a modern approach to event data management, relying on proven Big Data software, open source standards, machine learning, and purpose-built visualizations to bring a superior solution to market. Rocana Ops was designed to: encourage data collection and retention; provide an open-standards-based approach; support multiple integration and analysis techniques; and augment IT operations.

ENCOURAGE DATA COLLECTION AND RETENTION

Rocana's licensing policies are tied not to the amount of data being indexed but to the number of users accessing the system. Licensing is easy, predictable, and cost-efficient, without the threat of being held hostage. These licensing policies were developed so that businesses can collect data from all of their systems and integrate it into an event data warehouse. This addresses critical business needs such as: eliminating data silos and the resulting miscommunication about the state of systems; providing a single source of truth for all IT monitoring and management activities; and assuring that all necessary data is available when it is needed.

Having a single source of all necessary data is especially important in security use cases, where issues are often discovered months after the fact and retroactive analysis is commonplace.

USE OPEN STANDARDS

By building Rocana Ops on top of open source software and following open standards, Rocana greatly reduces the TCO and risk to businesses. Data in Rocana Ops is managed using popular open source products and can be accessed using standard software that businesses are already using. This means businesses have a large pool of candidates to choose from when building out their teams, and these team members have skills that are transferable to other projects. Not only are such people easier to find than "ninjas", they are also less expensive, and community resources are available for training and troubleshooting.

SUPPORT MULTIPLE INTEGRATION & ANALYSIS TECHNIQUES

Unlike legacy systems, which employ a brute-force search interface, Rocana Ops provides a much more flexible implementation for integration and analysis. First and foremost is the publish-subscribe architecture supported by Rocana Ops.
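
A publish-subscribe design of this kind can be pictured with a small in-memory sketch: events are published once, each channel applies its own filter and optional enrichment, and the unaltered original is kept so it can be replayed. This is not Rocana Ops code or its API; `Channel`, `EventBus`, `add_region`, and the sample event fields are hypothetical names invented for the illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Event = Dict[str, str]


@dataclass
class Channel:
    """A named stream with its own filter and optional enrichment (hypothetical)."""
    name: str
    accepts: Callable[[Event], bool]
    enrich: Optional[Callable[[Event], Event]] = None
    subscribers: List[Callable[[Event], None]] = field(default_factory=list)


class EventBus:
    """Minimal in-memory publish-subscribe bus; illustrative only."""

    def __init__(self) -> None:
        self.channels: List[Channel] = []
        self.log: List[Event] = []  # unaltered originals, kept for replay

    def add_channel(self, channel: Channel) -> None:
        self.channels.append(channel)

    def publish(self, event: Event) -> None:
        self.log.append(dict(event))  # retain the event exactly as received
        self._dispatch(event)

    def replay(self) -> None:
        """Re-deliver every stored event without modifying the stored copies."""
        for event in self.log:
            self._dispatch(event)

    def _dispatch(self, event: Event) -> None:
        for channel in self.channels:
            if not channel.accepts(event):
                continue
            copy = dict(event)  # enrichment only ever touches a copy
            if channel.enrich is not None:
                copy = channel.enrich(copy)
            for deliver in channel.subscribers:
                deliver(copy)


def add_region(event: Event) -> Event:
    # Stand-in for a Geo IP lookup; the mapping is made up for the example.
    event["region"] = {"10.0.0.1": "us-west"}.get(event.get("ip", ""), "unknown")
    return event


bus = EventBus()
received: List[Event] = []
errors = Channel("errors", accepts=lambda e: e.get("level") == "ERROR", enrich=add_region)
errors.subscribers.append(received.append)
bus.add_channel(errors)

bus.publish({"level": "INFO", "ip": "10.0.0.1", "msg": "started"})
bus.publish({"level": "ERROR", "ip": "10.0.0.1", "msg": "disk full"})
bus.replay()  # the subscriber sees the ERROR event a second time
```

Because enrichment is applied to a copy at dispatch time, the stored events stay unchanged, so a downstream consumer can be re-run against the same history as often as needed.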

The Rocana Ops architecture provides real-time streaming of events to external applications. Rocana Ops can establish many different channels, each with its own filtered set of data for downstream applications. The data published to these channels can be both modified and augmented (for example, with a Geo IP lookup or a database merge) before being published, and Rocana Ops always allows users to retain the original unaltered event. This design gives users a critical capability: replaying data multiple times without affecting the source data. This allows Rocana Ops to serve as a deep data store for third-party tools and aids in testing and tuning machine learning models until the exact outputs desired are obtained.

Rocana Ops also supports query-based interfaces to data. Developers can choose the best implementation option for their application: streaming or query-based. Since Rocana Ops is built using open source software, application developers can use their favorite SQL tools or other Hadoop-compatible solutions to query the data stored by Rocana Ops, avoiding the challenges and liabilities that come with proprietary solutions.

AUGMENT IT OPERATIONS

Rocana Ops does much more than just collect and manage event data. Rocana Ops includes out-of-the-box functionality to make it easier for IT administrators to monitor and manage systems. Unlike brute-force solutions, Rocana Ops automatically organizes and presents multiple visualizations of data, organized by location, service, and host. Users can drill down into system metrics, utilization, and detailed event data. Rocana Ops supports multiple visualization views out of the box, including changes in event volumes, annotated event data, and custom metrics and dashboards.

Rocana Ops employs cutting-edge machine-learning features like anomaly detection, which can find aberrant activity such as CPU thrashing.
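
Anomaly detection on a cyclical metric can go beyond a fixed threshold. The minimal sketch below is not Rocana's algorithm, which this paper does not describe; it assumes an hourly metric with a daily cycle and flags a sample only when it deviates from earlier samples taken at the same hour. The function name, the 24-sample period, the z-score cutoff, and the synthetic CPU series are all assumptions made for the example.

```python
from statistics import mean, stdev
from typing import List, Sequence


def seasonal_anomalies(values: Sequence[float], period: int = 24,
                       cutoff: float = 3.0, min_history: int = 3) -> List[int]:
    """Return indices whose value deviates from earlier samples at the same phase.

    Assumed approach (not from the source): compare each sample with the mean
    and standard deviation of prior samples at the same position in the cycle.
    """
    flagged = []
    for i, value in enumerate(values):
        history = list(values[i % period:i:period])  # same phase, earlier cycles
        if len(history) < max(min_history, 2):
            continue  # not enough data to form a baseline
        baseline = mean(history)
        spread = stdev(history) or 1e-9  # avoid division by zero on flat history
        if abs(value - baseline) / spread > cutoff:
            flagged.append(i)
    return flagged


# Synthetic CPU utilization: five days of hourly samples with a regular noon peak.
series = [(90.0 if hour == 12 else 20.0) + (day % 2)
          for day in range(5) for hour in range(24)]
series[99] = 90.0  # unexpected spike at 03:00 on the fifth day

anomalies = seasonal_anomalies(series)
over_fixed_threshold = [i for i, v in enumerate(series) if v > 80]
```

On this synthetic series the seasonal check flags only index 99, while a fixed threshold of 80 also flags the five regular noon peaks.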
Rocana's anomaly detection is not a simple threshold evaluation; it accounts for naturally occurring differences that might be caused by periodicity.

CONCLUSION

Legacy event data management and analysis solutions such as Splunk are greatly limited by implementation and design choices made a decade ago, when proprietary solutions were still in vogue and the volume, variety, and velocity of machine-generated data were much different. Rocana provides a modern alternative that is better in several regards: simpler and greater scalability; open data access and formats; out-of-the-box functionality for augmented IT ops; open integration and rich analytics; and significantly lower TCO.

Rocana combines the assurance of predictable pricing and the scalability of a modern Big Data architecture with the collaborative nature and reliability of open source components, all to provide the next generation of event data management for controlling modern IT infrastructure.

ABOUT ROCANA

Rocana is creating the next generation of IT operations analytics software in a world in which IT complexity is growing exponentially as a result of virtualization, containerization, and shared services. Rocana's mission is to provide guided root cause analysis of event-oriented machine data in order to streamline IT operations and boost profitability. Founded by veterans from Cloudera, Vertica, and Experian, the Rocana team has directly experienced the challenges of today's IT infrastructures, and has set out to address them using modern technology that leverages the Hadoop ecosystem.

[1] http://www.dbms2.com/2015/01/30/growth-in-machine-generated-data/
[2] https://en.wikipedia.org/wiki/machine-generated_data
[3] http://www.extrahop.com/post/blog/the-four-data-sets-essential-for-it-operations-analytics-itoa/
[4] http://www.tibco.com/assets/blt6d15d0ef383138d2/research-report-estimating-the-cost-of-machine-data-managementsplunk-and-tibco-loglogic.pdf

Rocana, Inc.
548 Market St #22538, San Francisco, CA 94104
+1 (877) ROCANA1
info@rocana.com
www.rocana.com

2015 Rocana, Inc. All rights reserved. Rocana and the Rocana logo are trademarks or registered trademarks of Rocana, Inc. in the United States and/or other countries.

WP-EDM-0715