EMC Isilon: Data Lake 2.0




ESG Solution Showcase

EMC Isilon: Data Lake 2.0

Date: November 2015
Author: Scott Sinclair, Analyst

Abstract: With the rise of new workloads such as big data analytics and the Internet of Things, data scales not only in the data center, but also at enterprise edge locations and in the cloud. With the release of IsilonSD Edge and Isilon CloudPools, EMC is extending data awareness and understanding outside of the data center to the next-generation data lake.

Introduction

When discussing the challenges of IT storage environments, it is easy to oversimplify the underlying culprit by focusing solely on the rapid rate of data growth. With both the amount of data created and the length of time organizations wish to store it increasing, data growth is a very real challenge that can extend well beyond the simple cost of storing and managing additional capacity. Higher levels of data growth can impact backup and protection schemes, and create power and cooling challenges.

While these challenges have created and will likely continue to create concerns for IT storage leaders, many IT organizations are also grappling with an added layer of data storage complexity resulting from the advent of new-generation workloads such as business intelligence (or big data) analytics and the Internet of Things (IoT). Digital repositories for business intelligence analytics are often referred to as data lakes. While these architectures may provide the scale to store the added influx of content, a greater level of flexibility and manageability may be required to make data lake architecture truly effective.

In many cases, these newer workloads extend the acts of data creation and access well beyond the somewhat predictable confines of the centralized data center. As businesses integrate IoT workloads, sensor data may be created at the edge (i.e., a remote site or system) just as often as it is created within the data center. Additionally, as more departments in the business look to leverage business intelligence analytics, broad access to digital content will likely be desired from a wider range of locations. As the viability of traditional storage silos looks to be coming to an end, global organizations appear to require the next generation of the data lake architecture.

EMC, a market leader in storage, understands the evolving infrastructure demands of IT organizations and has augmented its Isilon storage technology to enable the next generation of the data lake. With the release of IsilonSD Edge and Isilon CloudPools, the capabilities of Isilon's OneFS file system are extended well beyond the data center. IsilonSD Edge delivers Isilon's OneFS through a software-deployment model, providing a software-defined storage solution that can leverage new or existing commodity hardware and help simplify storage manageability at the edge. CloudPools extends OneFS to public cloud deployments as well. The resulting solution allows a content repository to take advantage of the benefits of a public cloud infrastructure while offering the seamless accessibility of on-premises data. With these two additions, Isilon delivers a next-generation data lake offering with an expanded level of flexibility to serve a new generation of workloads.

This ESG Solution Showcase was commissioned by EMC and is distributed under license from ESG.

Solution Showcase: EMC Isilon: The Next-generation Data Lake

The Need for a New Storage Architecture

In recent years, IT organizations have often looked toward scale-out storage architectures as a means to keep pace with the challenges of data growth. With the advent of big data analytics, these scale-out architectures added new capabilities such as broader protocol support to serve a wider variety of applications. The goal was to deliver what some in the industry refer to as a data lake: a single, scalable storage repository of digital content that can be leveraged for business intelligence and big data analytics.

Recently, however, new innovations are driving organizations to seek a more flexible and capable storage infrastructure layer. For example, the rise of IoT workloads and the collection of sensor data have expanded the breadth of locations where data may be created. The emergence of public cloud storage has enticed IT organizations to migrate data off-premises to free up on-premises resources. The net result is an increased desire to extend the data lake concept to a more flexible and more capable storage solution.

In an attempt to quantify some of these trends, ESG recently surveyed IT decision makers responsible for their organizations' data storage environments. The survey revealed a number of insights, including:

- The rapid rate of data growth continues to be a top storage challenge.
- The application/workload most widely identified as driving this data growth, and the subsequent spending on storage capacity, over the next 24 months was business intelligence and analytics.
- There is an early awareness of and focus on IoT and its potential impact on data storage infrastructure and strategy.[1]

In other words, the demand for data lake storage infrastructure will likely continue to increase. However, as mentioned previously, the storage silo architecture, regardless of its scalability, will likely be sub-optimal for IT organizations as they seek to extend their business intelligence capabilities. These organizations will likely require a next-generation data lake.

To address growing data storage demands, next-generation data lake architectures must continue to be resilient and highly available, and at global scale, geo-dispersed protection capabilities are ideal. Planned or unplanned downtime of the data lake can have a critical impact on the business. Additionally, the next-generation data lake cannot simply be isolated to the data center; it should integrate data from the edge and leverage public cloud resources as well.

Software-defined Storage: IsilonSD Edge Software

EMC's Isilon storage is a market leader in scale-out file storage and offers a robust set of capabilities designed for growing unstructured data environments. In addition to providing a scale-out storage architecture, Isilon supports a variety of storage protocols including NFS, SMB/CIFS, HDFS, and OpenStack Swift, along with automated data migration across tiers and a solid complement of data protection capabilities including snapshots and replication.

With the advent of IsilonSD Edge, EMC is able to offer Isilon OneFS storage technology as a software-only option. As a software-defined storage component, IsilonSD Edge extends the benefits of OneFS to remote office environments with a simple and flexible software deployment model. It can be deployed on new or even existing commodity hardware to simplify deployment and help reduce the cost of storage equipment, power, and cooling. IsilonSD Edge nonetheless offers the same capabilities as Isilon OneFS and uses the same management tools, interfaces, and VMware integration, which increases management simplicity.
ESG storage research conducted earlier in 2015 found an emphatic level of interest in software-defined storage (SDS) across the IT industry. When IT decision makers were asked to identify their organization's perception of software-defined storage, 94% reported that their organizations are committed to SDS as a long-term strategy (68%) or at least conceptually interested in SDS (26%).[2] For additional detail on the rationale driving this high level of interest, the data in Figure 1 offers perspective on the factors responsible for the consideration of software-defined storage.[3] Although many respondents identified cost-related benefits, such as reduced operational or capital expenditures, as drivers, the most often cited response across all factors was simplified storage management. This data lends credence to the potential impact that the flexibility enabled by SDS environments can have on simplifying storage management. It is this simplicity that accounts for a large portion of the benefit behind IsilonSD Edge and Isilon CloudPools.

[1] Source: ESG Research Report, 2015 Data Storage Market Trends, October 2015.
[2] Source: ESG Brief, Software-defined Storage Trends, September 2015.
[3] Source: ibid.

Figure 1. Factors Responsible for Organization's Consideration of Software-defined Storage

Survey question: To the best of your knowledge, which of the following factors are responsible for your organization's consideration of software-defined storage? (Percent of respondents, N=307)

[Bar chart with two series: "Most important factor driving consideration of software-defined storage" and "All factors driving consideration of software-defined storage". Responses charted: simplified storage management (17% most important, 55% all factors); reduction in operational expenditures; reduction in capital expenditures; total cost of ownership (TCO); greater agility to better align with evolving and fluid needs of the business; support server virtualization workload consolidation; support virtual desktop infrastructure (VDI) deployment; don't know.]

Source: Enterprise Strategy Group, 2015

IsilonSD Edge and Isilon CloudPools: Delivering the Next-generation Data Lake

As mentioned previously, the data lake concept potentially represents only the first step in delivering an architecture to serve not only the rapid rate of data growth, but also the new types of workloads being deployed by IT organizations.

The Next-generation Data Lake

The promise of big data or business intelligence can be quite alluring: Take the data you are storing already and run some additional analysis to glean business insights, then use those insights to help your business run more efficiently and effectively. The effectiveness of these analytics applications, however, can be limited by the storage infrastructure. If the underlying storage foundation does not scale enough or does not offer the right performance, the completeness of the results could suffer. As a result, the concept of a data lake emerged, offering a storage foundation designed to present the benefits of storage consolidation, namely simplified management and reduced infrastructure costs. Those benefits, however, are often limited to the data center.

To deliver a next-generation data lake, EMC has introduced a new OneFS operating system for Isilon that provides increased reliability and availability at the core of the data center. It includes support for non-disruptive operations, non-disruptive upgrades, and rollback of upgrades. As data lakes grow in size and become critical repositories of massive-scale business data, resiliency is key. OneFS now also includes support for Microsoft's SMB3 Continuous Availability protocol, which enables newer Windows clients to seamlessly fail over in case of any outage.

EMC has also introduced IsilonSD Edge and Isilon CloudPools to extend the Isilon ecosystem well beyond the data center, delivering a next-generation data lake architecture that can extend the aggregation and accessibility benefits to both the edge (e.g., remote offices or sites) and the cloud. The net result significantly increases deployment and infrastructure flexibility, helping the IT organization to design the optimal storage ecosystem for its specific workload needs.

IsilonSD Edge Benefit Overview

- Extends enterprise data lake from data center to enterprise edge locations.
- Simple, software-only deployment model.
- Ability to leverage (new or existing and unused) commodity hardware storage infrastructure, and reduce power and cooling.
- Improved data protection at the edge with the capabilities of Isilon.
- Support for a number of emerging use cases including IoT, analysis at the edge, health care, video surveillance, and content collaboration.

Addressing the Challenges of the Edge

The management of data at remote sites can create a challenge for IT administrators. Lack of direct access to storage hardware can often add an extra layer of management complexity, slowing both planned and unplanned maintenance tasks. ESG recently conducted a research study into the challenges associated with managing remote office environments. When IT decision makers were asked to identify their top IT priorities with respect to supporting ROBO locations, four of the five most-cited priorities involved the protection, storage, and accessibility of data: improving information security measures (45%), managing data growth (37%), improving backup and recovery processes (37%), and improving employees' abilities to share files/collaborate with other employees (36%).[4]

When considered in aggregate, these priorities can represent a myriad of specific IT ecosystem concerns. For example, multiple challenges such as the need for greater efficiency, management simplicity, reduced power and cooling, and superior data protection and security can fall under managing data growth. These priorities further support the rising interest in and demand for next-generation data lake environments that can consolidate data from the edge into a central data lake ecosystem, simplifying the management of data at the edge.

The Promise of Cloud Infrastructure

While managing data at the edge with traditional storage can create challenges, the emergence of public cloud storage tiers introduces opportunities. Often, IT organizations look to off-premises cloud storage as a potential low-cost bastion for unused, cold, or frozen data. In ESG's aforementioned storage research, more than one-third (37%) of IT decision makers identified leveraging public cloud-based storage as an initiative expected to impact storage spending over the next 12 to 18 months. This data is understandable given the cost savings often associated with leveraging public cloud storage tiers. These savings stem from benefits such as reduced infrastructure, simpler manageability, and reduced power and cooling, to name a few.

[4] Source: ESG Research Report, Remote Office/Branch Office Technology Trends, May 2015.

For some years, Isilon has offered policy-based, automated storage tiering on an Isilon cluster with SmartPools software to provide the most appropriate storage resources for specific data sets. EMC's Isilon CloudPools leverages the SmartPools policy engine to extend storage tiering to cloud storage resources as part of a larger Isilon storage ecosystem. With CloudPools, Isilon provides automated and policy-based data migration to the cloud as a new storage tier for less active data sets. To secure this data, all data that is moved to the cloud with CloudPools is sharded (divided up and separated) and then encrypted.

In addition to automatically migrating data to the cloud, Isilon's CloudPools keeps that data accessible as part of the enterprise's Isilon data lake. This capability lets organizations more effectively leverage public cloud resources by allowing local on-premises workloads to retain access to data even when it has been migrated off-premises. The net result can allow for more efficient utilization of both on- and off-premises resources, reducing the cost and complexity of data management while being transparent to users and applications.

Isilon CloudPools Benefit Overview

- Integrates data center with cloud storage resources.
- Simple solution management.
- Seamless visibility to content on the cloud.
- Access to low-cost public and private cloud resources for cold, unused data.
- Data stored on cloud resources remains accessible for analysis performed in the data center on the entire data lake.

The Bigger Truth

Ultimately, an organization's data and its technology should enable the business to do more, be more competitive, and be more successful. Achieving these goals requires a storage architecture similar to that which Isilon is delivering with IsilonSD Edge and CloudPools, where data can reside at the right location for the business (whether that is in the data center, at the edge, or in the cloud) while providing the management and simplicity of one single pool. This design has become increasingly important as organizations continue to increase their usage of analytics and extend the collection and analysis of digital content to a wider variety of locations. The new data lake extends beyond the data center to the edge and to the cloud, which simplifies management and reduces storage costs and complexity.

When looking to deploy a foundation for the next generation of digital workloads, organizations should ensure that the storage foundation can provide the resiliency and flexibility to extend to all the locations where data may be created, analyzed, and retained. EMC understands that organizations require a storage solution that can evolve to meet the specific needs of their environments, and that those requirements will continue to evolve with the organization's demands. As such, EMC Isilon is delivering the next-generation data lake that supports traditional and next-generation workloads.

All trademark names are property of their respective companies. Information contained in this publication has been obtained by sources The Enterprise Strategy Group (ESG) considers to be reliable but is not warranted by ESG. This publication may contain opinions of ESG, which are subject to change from time to time. This publication is copyrighted by The Enterprise Strategy Group, Inc. Any reproduction or redistribution of this publication, in whole or in part, whether in hard-copy format, electronically, or otherwise to persons not authorized to receive it, without the express consent of The Enterprise Strategy Group, Inc., is in violation of U.S. copyright law and will be subject to an action for civil damages and, if applicable, criminal prosecution. Should you have any questions, please contact ESG Client Relations at 508.482.0188.