This ESG White Paper was commissioned by Zettaset and is distributed under license from ESG.

White Paper

Closing the Big Data Management and Security Gap

By Nik Rouda, Senior Analyst

October 2014

Contents

Big Data Is Gaining Momentum, but Increasing Concerns, Too
Big Data Projects Still Rely Heavily on Professional Services
Security Still a Top Concern for Big Data Platforms
How Organizations Should Automate and Secure Big Data Deployments
Zettaset Delivers a Safer, More Automated and Secure Solution
The Bigger Truth

All trademark names are property of their respective companies. Information contained in this publication has been obtained by sources The Enterprise Strategy Group (ESG) considers to be reliable but is not warranted by ESG. This publication may contain opinions of ESG, which are subject to change from time to time. This publication is copyrighted by The Enterprise Strategy Group, Inc. Any reproduction or redistribution of this publication, in whole or in part, whether in hard-copy format, electronically, or otherwise to persons not authorized to receive it, without the express consent of The Enterprise Strategy Group, Inc., is in violation of U.S. copyright law and will be subject to an action for civil damages and, if applicable, criminal prosecution. Should you have any questions, please contact ESG Client Relations at 508.482.0188.

Big Data Is Gaining Momentum, but Increasing Concerns, Too

More and more companies are exploring new opportunities offered by big data and advanced analytics, across a broad range of industries and functional lines of business. Data-driven decision making is being seen not as a luxury, a management fad, or an area for future innovation, but as an essential need in order to compete successfully in the modern world. In parallel with, or even driving, this interest, emerging technologies like Hadoop and NoSQL databases are finding a ready market and are increasingly being chosen as the primary platforms for accommodating the intense demands of big data. The appetite and applications are virtually endless, applicable to nearly any business process or activity, and limited more often by managerial creativity and institutional resistance to change than by technology today.

IT budgets are reflecting this fundamental shift as well: recent ESG research found that 56% of companies surveyed are increasing their investments in big data and analytics by more than 10% in 2014, as compared with the previous year.[1] This rapid increase further indicates that most organizations are now moving beyond small pilots and proof-of-concept stages into enterprise-wide production deployments.

However, as big data projects migrate from pilot to production deployment and extend beyond the exclusive realm of IT and into the business unit, new factors come into play. How will the enterprise efficiently scale a technology that is still relatively immature and overly dependent on manual installation and configuration processes? How will the enterprise lock down sensitive data in Hadoop and NoSQL environments, which are built on technologies that were never conceived with security in mind?

Big Data Projects Still Rely Heavily on Professional Services

Development of a big data solution is still a complex, highly interdisciplinary undertaking that requires specialized personnel to provide operational support. There may not be enough in-house expertise to understand all the requirements of the new big data platforms, making users more reliant on professional services. Persistent skills gaps in various IT disciplines impact projects, and these include shortages in security (25% surveyed), architecture planning (24%), BI and analytics (20%), and database administration (17%), as shown in Figure 1.[2] If unaddressed, these staff gaps will often lead to unforeseen delays and risks in new initiatives.

Hadoop and NoSQL technology is rapidly evolving, but has not yet reached the level of maturity and sophistication that traditional relational databases offer. As a result, users expecting lower operational costs by using Hadoop software and infrastructure can sometimes find they must spend significant sums for software support and maintenance in the form of recurring subscription fees to vendors of branded Hadoop and NoSQL distributions. It could be argued that since professional services represent a substantial revenue source for some distribution vendors, they have less incentive to incorporate more process automation into their respective offerings.

While this model may have worked during the early phases of Hadoop deployment in pilot environments, it often becomes a resource issue for organizations wishing to scale their deployments in an efficient and cost-effective manner. More automation of management tasks could help organizations avoid spending inordinate sums on outside support and maintenance of a technology that has been touted as cost-saving.

[1] Source: ESG Research Report, Enterprise Data Analytics Trends, May 2014.
[2] Ibid.

Figure 1. Top Ten Skills Shortages Impacting Initiative Success

Survey question: In which of the following areas do you believe your IT organization currently has a problematic shortage of existing skills? (Percent of respondents, N=545, multiple responses accepted)

Information security: 25%
IT architecture/planning: 24%
Mobile application development: 21%
Business intelligence/data analytics: 20%
Server virtualization/private cloud infrastructure: 20%
Mobile device management: 19%
Application development: 18%
Database administration: 17%
Data protection (i.e., backup and recovery): 17%

Source: Enterprise Strategy Group, 2014.

Security Still a Top Concern for Big Data Platforms

As the number of distinct data sources and total data volumes grow exponentially, correspondingly more strategic planning and tactical administration is required, and this basic talent problem is magnified to potentially deleterious effect. This problem can manifest in different ways, but when asked about it by ESG, 38% of respondents cited security requirements as a top challenge due to unchecked size growth and proliferation of databases.[3] So not only is there more data, in more places, and too few people to steer projects, but the stakes are also raised for protecting this sensitive information in the age of malicious hackers, advanced persistent threats, and occasional internal malfeasance.

One implication is that these new big data projects can't be led solely by the data scientists, analysts, and database administrators. While they may possess the know-how to design in new functionality and support new applications, they may not have the detailed understanding and skill set required to manage the security nuances.
A copy of privileged data in a test and development environment is still a copy susceptible to breach, and more worryingly, the end goal of consolidating as much information as possible into a central data lake or hub can further compound the exposure if not handled appropriately. As such, ESG research found that 84% of respondents in a recent enterprise data survey say it is important or crucial that security teams are actively involved in development of new big data and analytics initiatives.[4]

This is borne out in customers' lists of technology evaluation criteria for selecting an enterprise data management platform, shown in Figure 2. Security is tied for first place as the most important factor according to survey respondents when defining requirements for new initiatives in big data, analytics, or business intelligence.[5] With these various challenges in mind, most customers are looking for already proven approaches to achieving better security in the face of pressure to deliver new deployments in the most efficient and cost-effective way.

[3] Source: ESG Research Report, Enterprise Database Trends in a Big Data World, July 2014.
[4] Source: ESG Research Report, Enterprise Data Analytics Trends, May 2014.
[5] Ibid.

Figure 2. Top Five Most Important Criteria in Evaluating a Big Data Solution

Survey question: Which of the following attributes are most important to your organization when considering technology solutions in the area of business intelligence, analytics, and big data? (Percent of respondents, N=375, three responses accepted)

Security: 26%
Cost, ROI and/or TCO: 26%
Reliability: 22%
Performance: 21%
Ease of integration with other applications, APIs: 20%

Source: Enterprise Strategy Group, 2014.

How Organizations Should Automate and Secure Big Data Deployments

The good news is that as adoption has accelerated and more production deployments are being settled into enterprise environments, there are now some emerging best practices to follow to automate and secure a Hadoop environment. The bad news is that the requisite functionality is by no means yet a standardized part of any particular distribution, and many customers will need to look carefully at vendors' glib promises to determine for themselves which are most up to the deployment and security challenge.

A typical CISO will be interested in establishing sound methodologies for security efficacy, operational efficiency, and enabling the business to conduct activities in a safe manner without undue burden. Both IT and line-of-business leaders should take an interest and demand the best-of-breed capabilities outlined in Table 1 from any production solution.

Table 1. Four Primary Considerations in Selecting a Secure Big Data Platform

Common Enterprise Requirement: Deployment (incl. automation and integration of tested configurations)
Impact / Benefit: Faster time to production and reduced risk of security gaps

Common Enterprise Requirement: Encryption (both at rest and in motion) and/or data masking as appropriate
Impact / Benefit: Safer ETL and storage of everything in data lake/hub

Common Enterprise Requirement: Key management (incl. policies, HA, and key management interoperability protocol - KMIP)
Impact / Benefit: Simplified key admin and more reliable access

Common Enterprise Requirement: User authentication and access control by role for users and administrators
Impact / Benefit: Only approved people can see only appropriate data

Source: Enterprise Strategy Group, 2014.
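
As an illustration of the data masking requirement in Table 1, the following is a minimal sketch of deterministic masking with a keyed hash (HMAC-SHA256), which keeps a field identifiably unique for analytics without exposing its contents. It is not drawn from any vendor's product: the key value, the `mask_field` helper, and the sample records are all hypothetical, and in practice the key would be issued by a key management system, not stored in code.

```python
import hashlib
import hmac

# Hypothetical masking key for illustration only; a real deployment would
# fetch this from a key management system instead of hard-coding it.
MASKING_KEY = b"example-only-masking-key"


def mask_field(value: str, key: bytes = MASKING_KEY) -> str:
    """Return a deterministic pseudonym for `value` using HMAC-SHA256."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # Truncated for readability; keep the full digest to minimize collisions.
    return digest[:16]


# Hypothetical sample records with a sensitive identifier.
records = [
    {"customer_id": "alice@example.com", "spend": 120},
    {"customer_id": "bob@example.com", "spend": 75},
    {"customer_id": "alice@example.com", "spend": 30},
]

# Replace the identifier with its pseudonym; other fields are untouched.
masked = [{**r, "customer_id": mask_field(r["customer_id"])} for r in records]
```

Because the same input always yields the same pseudonym, analysts can still group or join on `customer_id`, while recovering the original value would require the key.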

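The access control requirement in Table 1 (only approved people can see only appropriate data) can be sketched in the same spirit. The role names, data sets, and `is_allowed` helper below are assumptions made for illustration; a production system would resolve roles from a directory service such as AD or LDAP, which this sketch does not attempt to model.

```python
from dataclasses import dataclass

# Hypothetical mapping of roles to (dataset, action) permissions.
ROLE_PERMISSIONS = {
    "analyst": {("sales", "read")},
    "data_engineer": {("sales", "read"), ("sales", "write")},
    "security_admin": {("audit_log", "read")},
}


@dataclass(frozen=True)
class User:
    name: str
    roles: frozenset
    authenticated: bool = False


def is_allowed(user: User, dataset: str, action: str) -> bool:
    """Deny by default: allow only authenticated users whose roles grant the action."""
    if not user.authenticated:
        return False
    return any(
        (dataset, action) in ROLE_PERMISSIONS.get(role, set())
        for role in user.roles
    )


# Hypothetical users for illustration.
alice = User("alice", frozenset({"analyst"}), authenticated=True)
mallory = User("mallory", frozenset({"data_engineer"}), authenticated=False)
```

Here `is_allowed(alice, "sales", "read")` is true, while a write by `alice` or any request from the unauthenticated `mallory` is refused. Denying by default means an unknown role or data set grants nothing.
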
While setup and configuration of a few management and data nodes in a Hadoop cluster may be touted as relatively easy to do, the manual effort introduces chances of errors, which are increased for each additional instance. Having an automated system for deployment simplifies this process, making for both a more scalable and more reliably protected environment.

Encryption may seem like a common tick-box option on many Hadoop distributions, but not all follow the same conventions or coverage model. Ensure that all data on disk is covered with strong encryption, and take steps to also guard against network attacks for data being transferred between nodes; during extract, transform, and load activities; and when exporting information. Data masking can also be useful if certain fields need to be identifiably unique for analytics without exposing their actual contents.

Though encryption itself may seem quite simple to turn on, key management is often the weak point of solutions, particularly in larger, more varied, or more dynamic environments. Unique keys should be generated and controlled via customizable policies, kept and provided in a highly available source, and compliant with KMIP definitions. Key management should also have role-based administration and auditing capabilities.

Even if the whole environment is defended from external attacks using these mechanisms, steps should be taken to limit access to particular data sets for only authenticated users. This should be fine-grained, role-based, automatically tied into AD and LDAP protocols, and carry over permissions as specified from these proven access control systems.

From a broader perspective, additional steps should be explored as best practices, including establishing a security zone for the analytics servers, deploying these servers in a hardened configuration, frequent scanning and timely patching, and traffic monitoring.
These approaches are not necessarily different for Hadoop environments, however, and should be considered as a standard part of a larger IT security framework. Although this is a non-trivial undertaking, IT technology decision makers should build these into their must-have evaluation criteria, and select products that have functionality to match.

Zettaset Delivers a Safer, More Automated and Secure Solution

While many companies, young and old, are rushing to capitalize on the new opportunities afforded by big data, many vendors are seeking to provide them with the technology to do so. Of these, some focus on performance, some on connectivity, and some on vertical-specific applications. Zettaset is differentiating with a focus on building rock-solid, enterprise-ready management and security applications that augment and improve the branded open-source distribution frameworks. In doing so, Zettaset enables other vendors' big data solutions to also better meet enterprise operational requirements. As already noted, these requirements may not be top of mind for the DBA or data scientist, but they will be critical steps before IT infrastructure and operations teams can adopt the new solutions and begin enterprise-wide production deployments.

Zettaset's Orchestrator provides a more mature, more comprehensive approach to managing big data environments, automating and standardizing common activities like cluster configuration, node deployment, setup of interfaces to applications, general administration, and not least, securing Hadoop environments. With the recent Fast-PATH addition, Orchestrator process automation reduces reliance on manual efforts and accelerates database cluster deployment. In the company's internal benchmark testing, Zettaset found Fast-PATH was able to fully install a 50-node Hadoop cluster in 140 minutes, which would almost certainly be quicker and less error-prone than a manual effort.
The benchmark time includes installation of the Hadoop distribution, as well as installation of Kerberos, HBase, Hive, encryption, key management, and Zettaset's patented High-Availability framework on all nodes. Orchestrator Fast-PATH dramatically lowers operational costs, reduces the IT resource requirements necessary to implement Hadoop, and cuts time to value from weeks to hours.

Now Zettaset is going a step further and modularizing key components, like Hadoop security and its patented multi-service high availability and automated failover, to more easily complement and integrate with popular Hadoop distributions from Cloudera and Hortonworks. This enterprise-class add-on functionality enhances the management and security mechanisms of most branded distributions, and will help address the considerations outlined in Table 1. Specific modularized big data management and security capabilities include:

- Data-at-rest Encryption: Zettaset offers a standards-based, low-overhead approach linking up AES-256 disk partition encryption with existing frameworks, and smoothly interoperates with KMIP-compliant key management, PKCS hardware security modules, and a wide range of leading Hadoop distributions and NoSQL databases. This complements open source encryption approaches for data in motion in Hadoop clusters, and also ensures the Orchestrator console communications are safe.

- Multi-Service High Availability: Hadoop cluster environments are complex, and require multiple services to function productively. Zettaset Orchestrator uniquely delivers enterprise-class high availability with automated failover for all Hadoop services running in a cluster, eliminating single points of failure that exist in open source Hadoop, and delivering the robust security and compliance capabilities that enterprises expect and need.

- Fine-Grained, Role-based Access Control: Because Hadoop may often contain a wide range of information, both management tools and data itself must be restricted to those who need to know. Fine-grained controls ensure that roles and permissions can be easily customized, and that only appropriate administrators and users can make changes or access sensitive information.

Zettaset has a bigger vision, too, including smoother deployments, better reliability, improved performance, and easier support and administration for broader big data environments. Centralizing and certifying management of all required functions to meet enterprise operational standards will go a long way to facilitating the adoption of technologies that are still evolving and maturing.
Modularizing the Zettaset offerings opens them up to the wider community with a flexible a la carte menu to suit specific enterprise requirements, while also paving the way for an expanded, more comprehensive, and fully integrated solution for big data management and security.

The Bigger Truth

Big data is rapidly entering the mainstream, and new data platforms like Hadoop and NoSQL databases are becoming increasingly popular tools to capture and serve up more enterprise data than ever before, spanning sensitive personal profile, health, financial, and sometimes R&D information. Not only is more data being collected and compiled into a single repository, but also more people are being given access to this data across multiple lines of business for application development and for analysis and reporting. Yet these emerging technologies are not yet fully mature in their security capabilities, increasing the risk of a super breach. The financial repercussions and brand damage of an incident are well documented, as are the limitations of simple perimeter-based security products.

While many are leaping into the big data opportunity with enthusiasm, the need to build a robust, manageable, and safe solution is paramount. Many vendors are paying lip service to these issues, but few have really understood the scope of the problem or yet endeavored to design and implement a truly protected product. Zettaset has focused on building more comprehensive security and management functionality, and offers a great complementary solution that addresses the inherent risks of Hadoop distribution frameworks.

20 Asylum Street, Milford, MA 01757. Tel: 508.482.0188. Fax: 508.482.0218. www.esg-global.com