VIRTUALIZING HADOOP IN LARGE-SCALE INFRASTRUCTURES


How Adobe Systems achieved breakthrough results in Big Data analytics with Hadoop-as-a-Service

ABSTRACT

Large-scale Apache Hadoop analytics have long eluded the industry, especially in virtualized environments. In a groundbreaking proof of concept (POC), Adobe Systems demonstrated that Hadoop-as-a-Service (HDaaS) running on a virtualized and centralized infrastructure can handle large-scale data analytics workloads. This white paper documents the POC's infrastructure design, initial obstacles, and successful completion, along with sizing and configuration details and best practices. Importantly, the paper also underscores how HDaaS built on an integrated and virtualized infrastructure delivers outstanding performance, scalability, and efficiency, paving the path toward larger-scale Big Data analytics in Hadoop environments.

December 2014

FEDERATION WHITE PAPER

To learn more about how EMC products, services, and solutions can help solve your business and IT challenges, contact your local representative or authorized reseller, or explore and compare products in the EMC Store.

Copyright © 2014 EMC Corporation. All rights reserved. EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice. The information in this publication is provided "as is." EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com. VMware and vSphere are registered trademarks or trademarks of VMware, Inc. in the United States and/or other jurisdictions. All other trademarks used herein are the property of their respective owners.

Part Number H13856

TABLE OF CONTENTS

EXECUTIVE SUMMARY ... 4
INTRODUCTION ... 5
BOLD ARCHITECTURE FOR HDAAS ... 7
NAVIGATING TOWARD LARGE-SCALE HDAAS ... 8
    A Few Surprises ... 8
    Diving in Deeper ... 8
    Relooking at Memory Settings ... 8
    Modifying Settings Properly with BDE ... 9
    Bigger is Not Always Better ... 9
    Storage Sizing Proved Successful ... 9
BREAKTHROUGH IN HADOOP ANALYTICS ... 10
    Impressive Performance Results ... 10
    Breaking with Tradition Adds Efficiency ... 11
    Stronger Data Protection ... 11
    Freeing the Infrastructure ... 11
BEST PRACTICE RECOMMENDATIONS ... 12
    Memory Settings Are Key ... 12
    Understand Sizing and Configuration ... 12
    Acquire or Develop Hadoop Expertise ... 12
NEXT STEPS: LIVE WITH HDAAS ... 12

H13856 Page 3 of 12

EXECUTIVE SUMMARY

Apache Hadoop has become a prime tool for analyzing Big Data and achieving greater insights that help organizations improve strategic decision making. Traditional Hadoop clusters, however, have proved inefficient at handling large-scale analytics jobs sized at hundreds of terabytes or even petabytes. Adobe's Digital Marketing organization, which operates data analytics jobs on this scale, was encountering increasing internal demand to use Hadoop for analysis of the company's existing eight-petabyte data repository.

To address this need, Adobe explored an innovative approach to Hadoop. Rather than running traditional Hadoop clusters on commodity servers with locally attached storage, Adobe virtualized the Hadoop computing environment and used its existing EMC Isilon storage, where the eight-petabyte data repository resides, as a central location for Hadoop data. Adobe enlisted the resources, technologies, and expertise of EMC, VMware, and Cisco to build a reference architecture for virtualized Hadoop-as-a-Service (HDaaS) and perform a comprehensive proof of concept.

While the five-month POC encountered some challenges, the project also yielded a wealth of insight into how Hadoop operates and what it requires of the infrastructure. After meticulous configuring, refining, and testing, Adobe successfully ran a 65-terabyte Hadoop job, one of the industry's largest to date in a virtualized environment.

This white paper details the process Adobe and the POC team followed to reach this accomplishment. It includes the specific configurations of the virtual HDaaS environment used in the POC, covers the initial obstacles and how the team overcame them, and documents how the team adjusted settings, sized systems, and reconfigured the environment to support large-scale Hadoop analytics in a virtual environment with centralized storage.
Most importantly, the paper presents the POC's results, along with valuable best practices for other organizations interested in pursuing similar projects. The last section describes Adobe's plans to bring virtual HDaaS to production for its business users and data scientists.

INTRODUCTION

Organizations across the world increasingly view Big Data as a prime source of competitive differentiation, and analytics as the means to tap this source. Specifically, Hadoop enables data scientists to perform sophisticated queries against massive volumes of data to gain insights, discover trends, and predict outcomes. In fact, a GE and Accenture study reported that 84 percent of survey respondents believe that using Big Data analytics "has the power to shift the competitive landscape for my industry" in the next year. 1

Apache Hadoop, an increasingly popular environment for running analytics jobs, is an open source framework for storing and processing large data sets. Traditionally run on clusters of commodity servers with local storage, Hadoop comprises multiple components, primarily the Hadoop Distributed File System (HDFS) for data storage, Yet Another Resource Negotiator (YARN) for managing system resources such as memory and CPUs, and MapReduce for processing massive jobs by splitting input data into small subtasks and collating the results.

At Adobe, a global leader in digital marketing and digital media solutions, the Technical Operations team uses traditional Hadoop clusters to deliver Hadoop as a Service (HDaaS) in a private cloud for several application teams. These teams run Big Data jobs such as log and statistical analysis of application layers to uncover trends that help guide product enhancements. Elsewhere, Adobe's Digital Marketing organization tracks and analyzes customers' website statistics, which are stored in an eight-petabyte data repository on EMC Isilon storage. Adobe Digital Marketing would like to use HDaaS for more in-depth analysis that would help its clients improve website effectiveness, correlate site visits to revenue, and guide strategic business decisions.
Rather than moving data from a large data repository to the Hadoop clusters, a time-consuming task, Technical Operations determined it would be most efficient to simply use Hadoop to access data sets on the existing Isilon-based data repository. Adobe has a goal of running analytics jobs against data sets that are hundreds of terabytes in size. Simply adding commodity servers to Hadoop clusters would become highly inefficient, especially since traditional Hadoop clusters require three copies of the data to ensure availability. Adobe was also concerned that current Hadoop versions lack high-availability features. For example, Hadoop has only two NameNodes, which track where data resides in the Hadoop environment. If both NameNodes fail, the entire Hadoop cluster collapses.

Technical Operations proposed separating the Hadoop elements and placing them where they could scale more efficiently and reliably. This meant using Isilon, where Adobe's file-based data repository is stored, for centralized Hadoop storage and virtualizing the Hadoop cluster nodes to enable more flexible scalability and lower compute costs. (Figures 1 and 2)

Figure 1. Traditional Hadoop Architecture

1 "Industrial Internet Insights Report for 2015." GE, Accenture

Figure 2. Virtual Hadoop Architecture with Isilon

Despite internal skepticism about a virtualized infrastructure handling Hadoop's complexity, Technical Operations recognized a compelling upside: improving efficiency and increasing scalability to a level that had not been achieved for single-job data sets in a virtualized Hadoop environment with Isilon. This is enticing, especially as data analytics jobs continue to grow in size across all environments.

"People think that by virtualizing Hadoop, you're going to take a performance hit. But we showed that's not the case. Instead you get added flexibility that actually unencumbers your infrastructure." Chris Mutchler, Compute Platform Engineer, Adobe Systems

To explore the possibilities, Adobe Technical Operations embarked on a virtual HDaaS POC for Adobe Systems Digital Marketing. The infrastructure comprised EMC, VMware, and Cisco solutions and was designed to test the outer limits of Big Data analytics on Isilon and VMware using Hadoop. Key objectives of the POC included:

- Building a virtualized HDaaS environment to deliver analytics through a self-service catalog to internal Adobe customers
- Decoupling storage from compute by using EMC Isilon to provide HDFS, ultimately enabling access to the entire data repository for analytics
- Understanding sizing and security requirements of the integrated EMC Isilon, EMC VNX, VMware, and Cisco UCS infrastructure to support larger-scale HDaaS
- Proving an attractive return on investment and total cost of ownership in virtualized HDaaS environments compared to physical in-house solutions or public cloud services such as Amazon Web Services
- Documenting key learnings and best practices

The results were impressive. While the POC uncovered some surprises, Adobe gained valuable knowledge for future HDaaS projects. Ultimately, Adobe ran some of the largest Hadoop data analytics jobs to date in a virtualized HDaaS environment.
It was a groundbreaking achievement, and it heralds a new era of scale and efficiency for Big Data analytics.

BOLD ARCHITECTURE FOR HDAAS

The POC's physical topology is built on Cisco Unified Computing System (UCS), Cisco Nexus networking, EMC VNX block storage, and EMC Isilon scale-out storage. (Figure 3)

Figure 3. HDaaS Hardware Topology

At the compute layer, Adobe was particularly interested in Cisco UCS for its firmware management and centralized configuration capabilities. Plus, UCS provides a converged compute and network environment when deployed with Nexus. VNX provides block storage for the VMware ESX hosts and virtual machines (VMs) that comprise the Hadoop cluster. Adobe's focus was learning the VNX sizing and performance requirements to support virtualized HDaaS. An existing Isilon customer, Adobe especially liked Isilon's data lake concept, which enables access to one source of data through multiple protocols, such as NFS, FTP, Object, and HDFS. In the POC, data was loaded onto Isilon via NFS and accessed via HDFS by virtual machines in the Hadoop compute cluster. The goal was to prove that Isilon delivered sufficient performance to support large Hadoop workloads.

To deploy, run, and manage Hadoop on a common virtual infrastructure, Adobe relied on VMware Big Data Extensions (BDE), an essential software component of the overall environment. Adobe already used BDE in its private cloud HDaaS deployment and wanted to apply it to the new infrastructure. BDE enabled Adobe to automate and simplify deployment of hundreds of virtualized Hadoop compute nodes that were tied directly to Isilon for HDFS. During testing, Adobe also used BDE to deploy, reclaim, and redeploy the Hadoop cluster more than 30 times to evaluate different cluster configurations. Without the automation and flexibility of BDE, Adobe could not have conducted such a wide range and high volume of tests in so short a timeframe. In this POC, Adobe used Pivotal HD as an enhanced Hadoop distribution framework but designed the infrastructure to run any Hadoop distribution.
The following tools assisted Adobe with monitoring, collecting, and reporting on metrics generated by the POC:

- VNX Monitoring and Reporting Suite (M&R)
- Isilon InsightIQ (IIQ)
- VMware vCenter Operations Manager (vCOps)
- Cisco UCS Director (UCSD)

NAVIGATING TOWARD LARGE-SCALE HDAAS

The POC spanned five months from hardware delivery through final testing. Adobe expected the infrastructure components to integrate well, provide a stable environment, and perform satisfactorily. In fact, the POC team implemented the infrastructure in about a week and a half. It then put Isilon to the test as the HDFS data store and evaluated how well Hadoop ran in a virtualized environment.

A FEW SURPRISES

Adobe ran its first Hadoop MapReduce job in the virtual HDaaS environment within three days of initial setup. Smaller data sets of 60 to 450 gigabytes performed well, but the team hit a wall beyond 450 gigabytes. The team focused on the job definition of the Hadoop configuration to determine whether it was written correctly or using memory efficiently. In researching the industry at large, Adobe learned that most enterprise Hadoop environments were testing data on a small scale. In fact, Adobe did not find another Hadoop POC or implementation that exceeded 10 terabytes for single-job data sets in a virtualized Hadoop environment with Isilon.

"When we talked to other people in the industry, we realized we were on the forefront of scaling Hadoop at levels possibly never seen before." Jason Farnsworth, Senior Storage Engineer, Adobe Systems

After four weeks of tweaking the Hadoop job definition and adjusting memory settings, the team successfully ran a six-terabyte job. Pushing beyond six terabytes, the team sought to run larger data sets of upwards of 60 terabytes. The larger jobs again proved difficult to complete successfully.

DIVING IN DEEPER

The next phase involved Adobe Technical Operations enlisting help from storage services, compute platforms, research scientists, data center operations, and network engineering. Technical Operations also reached out to the POC's key partners: EMC (including Isilon and Pivotal), VMware, Cisco, and Trace3, an EMC value-added reseller and IT systems integrator.
The team, which included several Hadoop experts, dissected nearly every element of the HDaaS environment, including Hadoop job definitions, memory settings, Java memory allocations, command line options, physical and virtual infrastructure configurations, and HDFS options.

"We had several excellent meetings with Hadoop experts from EMC and VMware. We learned an enormous amount that helped us solve our initial problems and tweak the infrastructure to scale the way we wanted." Jason Farnsworth, Senior Storage Engineer, Adobe Systems

Relooking at Memory Settings

Close inspection of Hadoop revealed a lack of maturity for virtualized environments. For example, some operations launched through VMware BDE did not function properly on Hadoop, requiring significant tweaking. Complicating matters, the team learned that Hadoop error messages did not clearly describe a problem or indicate its origin. Most notably, the team discovered that Hadoop lacked sufficient intelligence to analyze memory requirements for large analytics jobs, which necessitated manually adjusting memory settings. The POC team recommends the following memory settings as a good starting point for organizations to diagnose scaling and job-related issues when testing Hadoop in larger-scale environments:

YARN Settings

- yarn.nodemanager.resource.memory-mb: amount of physical memory, in megabytes, that can be allocated for containers. BDE calculates a base value for this setting according to how much RAM is allocated to the workers at deployment.
- yarn.scheduler.minimum-allocation-mb: minimum container memory for YARN; the minimum allocation for every container request at the ResourceManager, in megabytes. Default value is 1024.
- yarn.app.mapreduce.am.resource.mb: Application Master memory, in megabytes. Default value is 1536.
- yarn.app.mapreduce.am.command-opts: Java options for the Application Master (JVM heap size), passed as a Java option (e.g., -Xmx7000m). Default value is -Xmx1024m.
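To make the shape of these settings concrete, here is a minimal yarn-site.xml sketch. The property names are the standard Hadoop ones listed above; the values are purely illustrative assumptions, not recommendations, and in a BDE-managed cluster they should be applied through BDE rather than edited on individual nodes:

```xml
<configuration>
  <!-- Total memory, in MB, the NodeManager may hand out to containers;
       BDE derives a base value from the RAM given to each worker -->
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>57344</value>
  </property>
  <!-- Smallest container allocation the ResourceManager will grant -->
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>2048</value>
  </property>
  <!-- Container size for the MapReduce Application Master -->
  <property>
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>8192</value>
  </property>
  <!-- JVM heap for the Application Master, kept below its container size -->
  <property>
    <name>yarn.app.mapreduce.am.command-opts</name>
    <value>-Xmx7000m</value>
  </property>
</configuration>
```

A useful rule of thumb is to set each JVM heap to roughly 75 to 85 percent of its container's memory so that the container is not killed for exceeding its allocation.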

Mapred Settings

- mapreduce.map.memory.mb: Mapper memory, in megabytes. Default value is 1024.
- mapreduce.reduce.memory.mb: Reducer memory, in megabytes. Default value is 3072.
- mapreduce.map.java.opts: Mapper Java options (JVM heap size for child JVMs of maps), passed as a Java option (e.g., -Xmx2000m). Default value is -Xmx1024m.
- mapreduce.reduce.java.opts: Reducer Java options (JVM heap size for child JVMs of reduces), passed as a Java option (e.g., -Xmx4000m). Default value is -Xmx2560m.
- mapreduce.jobtracker.split.metainfo.maxsize: maximum size of the split metainfo file. The POC team set this to -1, which disables the limit and allows any size.

For guidance on baseline values to use in these memory settings, the POC team recommends the following documents:

Modifying Settings Properly with BDE

Both the virtual and physical infrastructure required configuration adjustments. Since VMware BDE acts as a management service layer on top of Hadoop, the team relied on BDE to modify Hadoop settings, ensuring they were properly applied to the virtual clusters and remained persistent. Changing the settings directly on the servers would not apply modifications consistently across all the virtual clusters. The team also kept in mind that stopping, restarting, or redeploying a cluster through BDE automatically resets all the node settings to their default values.

Bigger is Not Always Better

The POC revealed that the configuration of physical servers (hosts) and virtual servers (Hadoop workers, or guests) affected Hadoop performance and cost efficiency. For example, a greater number of physical cores (CPUs) at a lower clock speed delivered better performance than fewer cores at a higher clock speed. At a higher cost, the same greater number of physical cores at a higher clock speed delivered even better performance.
In a virtual environment, fewer virtual CPUs (vCPUs) spread across a greater number of Hadoop workers performed and scaled better than more vCPUs supporting fewer workers. The team also learned to keep all physical hosts in the VMware cluster configured identically and to ensure there were no variations in host configurations. This way, VMware distributed resource scheduling is not invoked to spend time and resources balancing the cluster, and resources instead are made immediately available to Hadoop. BDE also was especially valuable in ensuring that memory settings and the alignment between cores and VMs were consistent.

Storage Sizing Proved Successful

Both VNX and Isilon performed flawlessly in the POC. The team sized VNX to hold both the VMware environment and the Hadoop intermediate space, the temporary space used by Hadoop jobs such as MapReduce. Intermediate space can also be configured to be stored directly on the Isilon cluster, but this setting was not tested during the POC. Technical Operations also tested various HDFS block sizes, resulting in performance optimizations. Depending on job and workload, the team found that block sizes of 64 megabytes to 1024 megabytes drove optimal throughput. The 12 Isilon X-Series nodes with two-terabyte drives provided more than enough capacity and performance for the tested workloads and could easily scale to support Hadoop workloads hundreds of terabytes in size. While the POC's Isilon did not incorporate flash technology, the team noted that adding flash drives would provide a measurable performance increase.
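As a companion sketch, the Mapred-side properties above might appear in mapred-site.xml as follows. Again, the values are illustrative assumptions; the dfs.blocksize entry (normally an hdfs-site.xml setting, and on Isilon ultimately governed by the OneFS HDFS protocol configuration) simply shows how a block size in the tested 64 MB to 1024 MB range is expressed in bytes:

```xml
<configuration>
  <!-- Container sizes for map and reduce tasks, in MB -->
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>4096</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>8192</value>
  </property>
  <!-- Child JVM heaps, kept below the container sizes above -->
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx3276m</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx6553m</value>
  </property>
  <!-- Lift the split-metainfo cap for very large jobs, as the POC team did -->
  <property>
    <name>mapreduce.jobtracker.split.metainfo.maxsize</name>
    <value>-1</value>
  </property>
  <!-- 256 MB HDFS block size, expressed in bytes -->
  <property>
    <name>dfs.blocksize</name>
    <value>268435456</value>
  </property>
</configuration>
```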

BREAKTHROUGH IN HADOOP ANALYTICS

After eight weeks of fine-tuning the virtual HDaaS infrastructure, Adobe succeeded in running a 65-terabyte Hadoop workload, significantly larger than the largest known virtual Hadoop workloads. In addition, this was the largest workload ever tested by EMC in a virtual Hadoop environment on Isilon. Fundamentally, these results proved that Isilon works as the HDFS layer. In fact, the POC refutes claims by some in the industry that shared storage will cause problems with Hadoop. To the contrary, Isilon had no adverse effects and even contributed superior results in a virtualized HDaaS environment compared to traditional Hadoop clusters. These advantages apply to many aspects of Hadoop, including performance, storage efficiency, data protection, and flexibility.

"Our results proved that having Isilon act as the HDFS layer was not adverse. In fact, we got better results with Isilon than we would have in a traditional cluster." Chris Mutchler, Compute Platform Engineer, Adobe Systems

IMPRESSIVE PERFORMANCE RESULTS

With compute resources allocated in small quantities to a large number of VMs, job run time improved significantly. (Figures 4 and 5) Furthermore, the tests demonstrated that Isilon performed well even without flash drives.

Figure 4. TeraSort Job Run Time by Worker Count
Figure 5. Adobe Pig Job Run Time by Worker Count

The team concluded that Hadoop performs better in a scale-out rather than scale-up configuration. That is, jobs complete more quickly when run on a greater number of compute nodes, so having more cores is more important than having faster processors. In fact, performance improved as the number of workers increased.
Tests were run with the following cluster configurations:

- 256 workers, 1 vCPU, 7.25 GB RAM, 30 GB intermediate space
- 128 workers, 2 vCPUs, 14.5 GB RAM, 90 GB intermediate space
- 64 workers, 4 vCPUs, 29 GB RAM, 210 GB intermediate space
- 32 workers, 8 vCPUs, 58 GB RAM, 450 GB intermediate space
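A quick arithmetic check on the list above shows why these four configurations are comparable: each one carves the same aggregate pool, 256 vCPUs and 1,856 GB of RAM, into different worker sizes (only total intermediate space varies). A small script to verify:

```python
# Cluster configurations from the POC, as
# (workers, vCPUs per worker, GB RAM per worker, GB intermediate space per worker).
configs = [
    (256, 1, 7.25, 30),
    (128, 2, 14.5, 90),
    (64, 4, 29.0, 210),
    (32, 8, 58.0, 450),
]

for workers, vcpus, ram_gb, scratch_gb in configs:
    print(f"{workers:>3} workers: {workers * vcpus} vCPUs, "
          f"{workers * ram_gb:g} GB RAM, "
          f"{workers * scratch_gb} GB intermediate space")
# Every row reports 256 vCPUs and 1856 GB RAM.
```

Holding aggregate compute constant makes the run-time comparison in Figures 4 and 5 a test of worker granularity rather than of raw capacity.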

BREAKING WITH TRADITION ADDS EFFICIENCY

Traditional Hadoop clusters require three copies of the data in case servers fail. Isilon eliminates the need to triple storage capacity thanks to the built-in data protection capabilities of the Isilon OneFS operating system. For example, in a traditional Hadoop cluster running jobs against eight petabytes of data, the infrastructure would require 24 petabytes of raw disk capacity, a 200 percent overhead, to accommodate three copies. The same eight petabytes of Hadoop data stored on Isilon requires only 9.6 petabytes of raw disk capacity, a 60 percent reduction. Not only does Isilon save on storage, it also streamlines storage administration by eliminating the need to oversee numerous islands of storage. Using Adobe's eight-petabyte data set in a traditional environment would require 24 petabytes of local disk capacity, necessitating thousands of Hadoop nodes when hundreds of compute nodes would be adequate.

Enabling a data lake, Isilon OneFS provides enterprises with one central repository of data accessible through multiple protocols. Rather than requiring a separate, purpose-built HDFS device, Isilon supports HDFS along with NFS, FTP, SMB, HTTP, NDMP, and Swift (object). (Figure 6) This allows organizations to bring Hadoop to the data, a more streamlined approach, rather than moving data to Hadoop.

Figure 6. Isilon Data Lake Concept with Multi-protocol Support

STRONGER DATA PROTECTION

Isilon provides secure control over data access by supporting POSIX for granular file access permissions. Isilon stores data in a POSIX-compliant file system with SMB and NFS workflows that users can also access through HDFS for MapReduce. Isilon protects partitioned subsets of data with access zones that prevent unauthorized access. In addition, Isilon offers rich data services that are not available in traditional Hadoop environments.
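The capacity arithmetic above can be checked directly: triple replication needs three bytes of raw disk per byte of data, while the quoted Isilon figure of 9.6 PB for 8 PB of data implies an overhead factor of about 1.2. A short sketch (the 1.2 factor is inferred from the numbers in this paper; actual OneFS overhead depends on the protection level configured):

```python
data_pb = 8.0  # logical data set size, in petabytes

# Traditional HDFS: three full copies of every block
replicated_raw = data_pb * 3                          # 24.0 PB
replication_overhead = replicated_raw / data_pb - 1   # 2.0, i.e. 200 percent

# Isilon OneFS protection, using the overhead implied by this paper's figures
isilon_raw = data_pb * 1.2                            # 9.6 PB

savings = 1 - isilon_raw / replicated_raw             # 0.6, a 60 percent reduction

print(f"Triple replication: {replicated_raw:g} PB raw ({replication_overhead:.0%} overhead)")
print(f"Isilon OneFS:       {isilon_raw:g} PB raw ({savings:.0%} less raw disk)")
```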
For example, Isilon enables users to create snapshots of the Hadoop environment for point-in-time data protection or to create duplicate environments. Isilon replication can also synchronize Hadoop data to a remote site, providing even greater protection. This allows organizations to keep Hadoop data secure on premises, rather than moving data to a public cloud.

FREEING THE INFRASTRUCTURE

Virtualizing HDaaS introduces greater opportunities for flexibility, unencumbering the infrastructure from physical limitations. Instead of traditional bare-metal clusters with rigid configurations, virtualization allows organizations to tailor Hadoop VMs to their individual workloads and even use existing compute infrastructure. This is key to optimizing performance and efficiency. Plus, virtualization facilitates multi-tenancy and offers additional high-availability advantages through fluid movement of VMs from one physical host to another.

BEST PRACTICE RECOMMENDATIONS

Several important lessons learned and best practices were documented from this breakthrough POC, as follows.

MEMORY SETTINGS ARE KEY

It's important to recognize that Hadoop is still a maturing product and does not automatically recognize optimal memory requirements. Memory settings are crucial to achieving sufficient performance to run Hadoop jobs against large data sets. EMC recommends methodically adjusting memory settings and repeatedly testing configurations until the optimal environment is achieved.

UNDERSTAND SIZING AND CONFIGURATION

Operating at Adobe's scale, hundreds of terabytes to tens of petabytes, demands close attention to the sizing and configuration of virtualized infrastructure components. Since no two Hadoop jobs are alike, IT organizations must thoroughly understand the data sets and jobs their customers plan to run. Key sizing and configuration insights from this POC include:

- Devote ample time upfront to sizing storage layers based on workload and scalability requirements. Sizing for Hadoop intermediate space also deserves careful consideration.
- Consider setting large HDFS block sizes of 256 to 1024 megabytes to ensure sufficient performance. On Isilon, HDFS block size is configured as a protocol setting in the OneFS operating system.
- In the compute environment, deploy a large number of hosts using processors with as many cores as possible and align the VMs to those cores. In general, having more cores is more important than having faster processors and results in better performance and scalability.
- Configure all physical hosts in the VMware cluster identically. For example, mixing eight-core and ten-core systems will make CPU alignment challenging when using BDE. Different RAM amounts also will cause unwanted overhead while VMware's distributed resource scheduling moves virtual guests.

ACQUIRE OR DEVELOP HADOOP EXPERTISE

Hadoop is complex, with numerous moving parts that must operate in concert.
For example, MapReduce settings may affect Java, which may in turn impact YARN. EMC recommends that organizations wishing to use Hadoop ramp up gradually and review the many resources available to help simplify Hadoop implementation with Isilon. Hadoop insight also comes through "tribal" sharing of experiences among industry colleagues, as well as formal documentation and training. The POC team recommends these resources as a starting place:

- EMC Isilon Free Hadoop website
- EMC Hadoop Starter Kit
- EMC Isilon Best Practices for Hadoop Data Storage white paper
- EMC Big Data website

When building and configuring the virtual HDaaS infrastructure, companies should select vendors with extensive expertise in Hadoop, especially in large-scale Hadoop environments. EMC, VMware, and solution integrators with Big Data experience can help accelerate a Hadoop deployment and ensure success. Because of the interdependencies among the many components in a virtual HDaaS infrastructure, internal and external team members will need broad knowledge of the technology stack, including compute, storage, virtualization, and networking, with a deep understanding of how each performs separately and together. While IT as a whole is still evolving toward integrated skill sets, EMC has been at the forefront of this trend and can provide insights and guidance.

NEXT STEPS: LIVE WITH HDAAS

With the breakthrough results of this POC, Adobe plans to take the HDaaS reference architecture using Isilon into production and test even larger Hadoop jobs. To generate additional results, Adobe also will run a variety of Hadoop jobs on the virtual HDaaS platform repeatedly, as many as hundreds of times. The goal is to demonstrate that virtual HDaaS can deliver and is ready for large production applications. While the POC pointed one Hadoop cluster at Isilon, additional testing will focus on multiple Hadoop clusters accessing data sets on Isilon to further prove scalability.
This multi-tenancy capability is crucial for supporting multiple analytics teams with separate projects. Adobe Technical Operations plans to run Hadoop jobs through Isilon access zones to ensure isolation is preserved without impacting performance or scalability. In addition, the team plans to move intermediate space from VNX block storage to Isilon and evaluate the impact of the additional I/O on Isilon. Adobe also expects that an all-flash array such as EMC XtremIO would provide an excellent option for block storage in place of VNX.

The additional configuration adjustments and testing are well worth the effort to Adobe and present tremendous opportunities for the analytics community as a whole. Using centralized storage such as Isilon provides a common data source rather than creating numerous storage locations for multiple Hadoop projects. The flexibility and scalability of the virtual HDaaS environment is also of great value as Hadoop jobs continue to grow in size. Most important, moving virtual HDaaS into production will enable Adobe's data scientists to query against the entire data set residing on Isilon. By doing so, they will have a powerful way to gain more insight and intelligence that can be presented to Adobe's customers, providing both Adobe and its customers with a strong competitive advantage.


EMC PERFORMANCE OPTIMIZATION FOR MICROSOFT FAST SEARCH SERVER 2010 FOR SHAREPOINT Reference Architecture EMC PERFORMANCE OPTIMIZATION FOR MICROSOFT FAST SEARCH SERVER 2010 FOR SHAREPOINT Optimize scalability and performance of FAST Search Server 2010 for SharePoint Validate virtualization

More information

White. Paper. EMC Isilon: A Scalable Storage Platform for Big Data. April 2014

White. Paper. EMC Isilon: A Scalable Storage Platform for Big Data. April 2014 White Paper EMC Isilon: A Scalable Storage Platform for Big Data By Nik Rouda, Senior Analyst and Terri McClure, Senior Analyst April 2014 This ESG White Paper was commissioned by EMC Isilon and is distributed

More information

Virtualizing Apache Hadoop. June, 2012

Virtualizing Apache Hadoop. June, 2012 June, 2012 Table of Contents EXECUTIVE SUMMARY... 3 INTRODUCTION... 3 VIRTUALIZING APACHE HADOOP... 4 INTRODUCTION TO VSPHERE TM... 4 USE CASES AND ADVANTAGES OF VIRTUALIZING HADOOP... 4 MYTHS ABOUT RUNNING

More information

VIDEO SURVEILLANCE WITH SURVEILLUS VMS AND EMC ISILON STORAGE ARRAYS

VIDEO SURVEILLANCE WITH SURVEILLUS VMS AND EMC ISILON STORAGE ARRAYS VIDEO SURVEILLANCE WITH SURVEILLUS VMS AND EMC ISILON STORAGE ARRAYS Successfully configure all solution components Use VMS at the required bandwidth for NAS storage Meet the bandwidth demands of a 2,200

More information

Simplified Management With Hitachi Command Suite. By Hitachi Data Systems

Simplified Management With Hitachi Command Suite. By Hitachi Data Systems Simplified Management With Hitachi Command Suite By Hitachi Data Systems April 2015 Contents Executive Summary... 2 Introduction... 3 Hitachi Command Suite v8: Key Highlights... 4 Global Storage Virtualization

More information

EMC BACKUP-AS-A-SERVICE

EMC BACKUP-AS-A-SERVICE Reference Architecture EMC BACKUP-AS-A-SERVICE EMC AVAMAR, EMC DATA PROTECTION ADVISOR, AND EMC HOMEBASE Deliver backup services for cloud and traditional hosted environments Reduce storage space and increase

More information

WHITE PAPER. www.fusionstorm.com. Get Ready for Big Data:

WHITE PAPER. www.fusionstorm.com. Get Ready for Big Data: WHitE PaPER: Easing the Way to the cloud: 1 WHITE PAPER Get Ready for Big Data: How Scale-Out NaS Delivers the Scalability, Performance, Resilience and manageability that Big Data Environments Demand 2

More information

Integrated Grid Solutions. and Greenplum

Integrated Grid Solutions. and Greenplum EMC Perspective Integrated Grid Solutions from SAS, EMC Isilon and Greenplum Introduction Intensifying competitive pressure and vast growth in the capabilities of analytic computing platforms are driving

More information

MANAGEMENT AND ORCHESTRATION WORKFLOW AUTOMATION FOR VBLOCK INFRASTRUCTURE PLATFORMS

MANAGEMENT AND ORCHESTRATION WORKFLOW AUTOMATION FOR VBLOCK INFRASTRUCTURE PLATFORMS VCE Word Template Table of Contents www.vce.com MANAGEMENT AND ORCHESTRATION WORKFLOW AUTOMATION FOR VBLOCK INFRASTRUCTURE PLATFORMS January 2012 VCE Authors: Changbin Gong: Lead Solution Architect Michael

More information

EXPLORATION TECHNOLOGY REQUIRES A RADICAL CHANGE IN DATA ANALYSIS

EXPLORATION TECHNOLOGY REQUIRES A RADICAL CHANGE IN DATA ANALYSIS EXPLORATION TECHNOLOGY REQUIRES A RADICAL CHANGE IN DATA ANALYSIS EMC Isilon solutions for oil and gas EMC PERSPECTIVE TABLE OF CONTENTS INTRODUCTION: THE HUNT FOR MORE RESOURCES... 3 KEEPING PACE WITH

More information

Elasticsearch on Cisco Unified Computing System: Optimizing your UCS infrastructure for Elasticsearch s analytics software stack

Elasticsearch on Cisco Unified Computing System: Optimizing your UCS infrastructure for Elasticsearch s analytics software stack Elasticsearch on Cisco Unified Computing System: Optimizing your UCS infrastructure for Elasticsearch s analytics software stack HIGHLIGHTS Real-Time Results Elasticsearch on Cisco UCS enables a deeper

More information

Managing Application Performance and Availability in a Virtual Environment

Managing Application Performance and Availability in a Virtual Environment The recognized leader in proven and affordable load balancing and application delivery solutions White Paper Managing Application Performance and Availability in a Virtual Environment by James Puchbauer

More information

Agenda. Big Data & Hadoop ViPR HDFS Pivotal Big Data Suite & ViPR HDFS ViON Customer Feedback #EMCVIPR

Agenda. Big Data & Hadoop ViPR HDFS Pivotal Big Data Suite & ViPR HDFS ViON Customer Feedback #EMCVIPR 1 Agenda Big Data & Hadoop ViPR HDFS Pivotal Big Data Suite & ViPR HDFS ViON Customer Feedback 2 A World of Connected Devices Need a new data management architecture for Internet of Things 21% the % of

More information

THE EMC ISILON STORY. Big Data In The Enterprise. Copyright 2012 EMC Corporation. All rights reserved.

THE EMC ISILON STORY. Big Data In The Enterprise. Copyright 2012 EMC Corporation. All rights reserved. THE EMC ISILON STORY Big Data In The Enterprise 2012 1 Big Data In The Enterprise Isilon Overview Isilon Technology Summary 2 What is Big Data? 3 The Big Data Challenge File Shares 90 and Archives 80 Bioinformatics

More information

Virtualizing Exchange

Virtualizing Exchange Virtualizing Exchange Simplifying and Optimizing Management of Microsoft Exchange Server Using Virtualization Technologies By Anil Desai Microsoft MVP September, 2008 An Alternative to Hosted Exchange

More information

PROSPHERE: DEPLOYMENT IN A VITUALIZED ENVIRONMENT

PROSPHERE: DEPLOYMENT IN A VITUALIZED ENVIRONMENT White Paper PROSPHERE: DEPLOYMENT IN A VITUALIZED ENVIRONMENT Abstract This white paper examines the deployment considerations for ProSphere, the next generation of Storage Resource Management (SRM) from

More information

RETHINK STORAGE. Transform the Data Center with EMC ViPR Software-Defined Storage. White Paper

RETHINK STORAGE. Transform the Data Center with EMC ViPR Software-Defined Storage. White Paper White Paper RETHINK STORAGE Transform the Data Center with EMC ViPR Software-Defined Storage Abstract The following paper opens with the evolution of the Software- Defined Data Center and the challenges

More information

CONVERGE APPLICATIONS, ANALYTICS, AND DATA WITH VCE AND PIVOTAL

CONVERGE APPLICATIONS, ANALYTICS, AND DATA WITH VCE AND PIVOTAL CONVERGE APPLICATIONS, ANALYTICS, AND DATA WITH VCE AND PIVOTAL Vision In today s volatile economy, an organization s ability to exploit IT to speed time-to-results, control cost and risk, and drive differentiation

More information

Top 5 Reasons to choose Microsoft Windows Server 2008 R2 SP1 Hyper-V over VMware vsphere 5

Top 5 Reasons to choose Microsoft Windows Server 2008 R2 SP1 Hyper-V over VMware vsphere 5 Top 5 Reasons to choose Microsoft Windows Server 2008 R2 SP1 Hyper-V over VMware Published: April 2012 2012 Microsoft Corporation. All rights reserved. This document is provided "as-is." Information and

More information

Platfora Big Data Analytics

Platfora Big Data Analytics Platfora Big Data Analytics ISV Partner Solution Case Study and Cisco Unified Computing System Platfora, the leading enterprise big data analytics platform built natively on Hadoop and Spark, delivers

More information

VMware vsphere 4.1. Pricing, Packaging and Licensing Overview. E f f e c t i v e A u g u s t 1, 2 0 1 0 W H I T E P A P E R

VMware vsphere 4.1. Pricing, Packaging and Licensing Overview. E f f e c t i v e A u g u s t 1, 2 0 1 0 W H I T E P A P E R VMware vsphere 4.1 Pricing, Packaging and Licensing Overview E f f e c t i v e A u g u s t 1, 2 0 1 0 W H I T E P A P E R Table of Contents Executive Summary...................................................

More information

Building a Scalable Big Data Infrastructure for Dynamic Workflows

Building a Scalable Big Data Infrastructure for Dynamic Workflows Building a Scalable Big Data Infrastructure for Dynamic Workflows INTRODUCTION Organizations of all types and sizes are looking to big data to help them make faster, more intelligent decisions. Many efforts

More information

Solving the Big Data Intention-Deployment Gap

Solving the Big Data Intention-Deployment Gap Whitepaper Solving the Big Data Intention-Deployment Gap Big Data is on virtually every enterprise s to-do list these days. Recognizing both its potential and competitive advantage, companies are aligning

More information

Reference Architecture and Best Practices for Virtualizing Hadoop Workloads Justin Murray VMware

Reference Architecture and Best Practices for Virtualizing Hadoop Workloads Justin Murray VMware Reference Architecture and Best Practices for Virtualizing Hadoop Workloads Justin Murray ware 2 Agenda The Hadoop Journey Why Virtualize Hadoop? Elasticity and Scalability Performance Tests Storage Reference

More information

EMC VPLEX FAMILY. Continuous Availability and Data Mobility Within and Across Data Centers

EMC VPLEX FAMILY. Continuous Availability and Data Mobility Within and Across Data Centers EMC VPLEX FAMILY Continuous Availability and Data Mobility Within and Across Data Centers DELIVERING CONTINUOUS AVAILABILITY AND DATA MOBILITY FOR MISSION CRITICAL APPLICATIONS Storage infrastructure is

More information

What Is Microsoft Private Cloud Fast Track?

What Is Microsoft Private Cloud Fast Track? What Is Microsoft Private Cloud Fast Track? MICROSOFT PRIVATE CLOUD FAST TRACK is a reference architecture for building private clouds that combines Microsoft software, consolidated guidance, and validated

More information

EMC Data Protection Advisor 6.0

EMC Data Protection Advisor 6.0 White Paper EMC Data Protection Advisor 6.0 Abstract EMC Data Protection Advisor provides a comprehensive set of features to reduce the complexity of managing data protection environments, improve compliance

More information

EMC Integrated Infrastructure for VMware

EMC Integrated Infrastructure for VMware EMC Integrated Infrastructure for VMware Enabled by EMC Celerra NS-120 Reference Architecture EMC Global Solutions Centers EMC Corporation Corporate Headquarters Hopkinton MA 01748-9103 1.508.435.1000

More information

Understanding Enterprise NAS

Understanding Enterprise NAS Anjan Dave, Principal Storage Engineer LSI Corporation Author: Anjan Dave, Principal Storage Engineer, LSI Corporation SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA

More information

Evaluation of Enterprise Data Protection using SEP Software

Evaluation of Enterprise Data Protection using SEP Software Test Validation Test Validation - SEP sesam Enterprise Backup Software Evaluation of Enterprise Data Protection using SEP Software Author:... Enabling you to make the best technology decisions Backup &

More information

BEST PRACTICES FOR INTEGRATING TELESTREAM VANTAGE WITH EMC ISILON ONEFS

BEST PRACTICES FOR INTEGRATING TELESTREAM VANTAGE WITH EMC ISILON ONEFS Best Practices Guide BEST PRACTICES FOR INTEGRATING TELESTREAM VANTAGE WITH EMC ISILON ONEFS Abstract This best practices guide contains details for integrating Telestream Vantage workflow design and automation

More information

Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities

Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities Technology Insight Paper Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities By John Webster February 2015 Enabling you to make the best technology decisions Enabling

More information

Product Brochure. Hedvig Distributed Storage Platform Modern Storage for Modern Business. Elastic. Accelerate data to value. Simple.

Product Brochure. Hedvig Distributed Storage Platform Modern Storage for Modern Business. Elastic. Accelerate data to value. Simple. Product Brochure Elastic Scales to petabytes of data Start with as few as two nodes and scale to thousands. Add capacity if and when needed. Embrace the economics of commodity x86 infrastructure to build

More information

Frequently Asked Questions: EMC ViPR Software- Defined Storage Software-Defined Storage

Frequently Asked Questions: EMC ViPR Software- Defined Storage Software-Defined Storage Frequently Asked Questions: EMC ViPR Software- Defined Storage Software-Defined Storage Table of Contents What's New? Platform Questions Customer Benefits Fit with Other EMC Products What's New? What is

More information

MICROSOFT HYPER-V SCALABILITY WITH EMC SYMMETRIX VMAX

MICROSOFT HYPER-V SCALABILITY WITH EMC SYMMETRIX VMAX White Paper MICROSOFT HYPER-V SCALABILITY WITH EMC SYMMETRIX VMAX Abstract This white paper highlights EMC s Hyper-V scalability test in which one of the largest Hyper-V environments in the world was created.

More information

AUTOMATED DATA RETENTION WITH EMC ISILON SMARTLOCK

AUTOMATED DATA RETENTION WITH EMC ISILON SMARTLOCK White Paper AUTOMATED DATA RETENTION WITH EMC ISILON SMARTLOCK Abstract EMC Isilon SmartLock protects critical data against accidental, malicious or premature deletion or alteration. Whether you need to

More information

Simplifying Storage Operations By David Strom (published 3.15 by VMware) Introduction

Simplifying Storage Operations By David Strom (published 3.15 by VMware) Introduction Simplifying Storage Operations By David Strom (published 3.15 by VMware) Introduction There are tectonic changes to storage technology that the IT industry hasn t seen for many years. Storage has been

More information

EMC XTREMIO EXECUTIVE OVERVIEW

EMC XTREMIO EXECUTIVE OVERVIEW EMC XTREMIO EXECUTIVE OVERVIEW COMPANY BACKGROUND XtremIO develops enterprise data storage systems based completely on random access media such as flash solid-state drives (SSDs). By leveraging the underlying

More information

EMC Virtual Infrastructure for Microsoft Applications Data Center Solution

EMC Virtual Infrastructure for Microsoft Applications Data Center Solution EMC Virtual Infrastructure for Microsoft Applications Data Center Solution Enabled by EMC Symmetrix V-Max and Reference Architecture EMC Global Solutions Copyright and Trademark Information Copyright 2009

More information

Maximizing Hadoop Performance and Storage Capacity with AltraHD TM

Maximizing Hadoop Performance and Storage Capacity with AltraHD TM Maximizing Hadoop Performance and Storage Capacity with AltraHD TM Executive Summary The explosion of internet data, driven in large part by the growth of more and more powerful mobile devices, has created

More information

EMC PERSPECTIVE: THE POWER OF WINDOWS SERVER 2012 AND EMC INFRASTRUCTURE FOR MICROSOFT PRIVATE CLOUD ENVIRONMENTS

EMC PERSPECTIVE: THE POWER OF WINDOWS SERVER 2012 AND EMC INFRASTRUCTURE FOR MICROSOFT PRIVATE CLOUD ENVIRONMENTS EMC PERSPECTIVE: THE POWER OF WINDOWS SERVER 2012 AND EMC INFRASTRUCTURE FOR MICROSOFT PRIVATE CLOUD ENVIRONMENTS EXECUTIVE SUMMARY It s no secret that organizations continue to produce overwhelming amounts

More information

Is Hyperconverged Cost-Competitive with the Cloud?

Is Hyperconverged Cost-Competitive with the Cloud? Economic Insight Paper Is Hyperconverged Cost-Competitive with the Cloud? An Evaluator Group TCO Analysis Comparing AWS and SimpliVity By Eric Slack, Sr. Analyst January 2016 Enabling you to make the best

More information

IBM PureApplication System for IBM WebSphere Application Server workloads

IBM PureApplication System for IBM WebSphere Application Server workloads IBM PureApplication System for IBM WebSphere Application Server workloads Use IBM PureApplication System with its built-in IBM WebSphere Application Server to optimally deploy and run critical applications

More information

Architecting for the next generation of Big Data Hortonworks HDP 2.0 on Red Hat Enterprise Linux 6 with OpenJDK 7

Architecting for the next generation of Big Data Hortonworks HDP 2.0 on Red Hat Enterprise Linux 6 with OpenJDK 7 Architecting for the next generation of Big Data Hortonworks HDP 2.0 on Red Hat Enterprise Linux 6 with OpenJDK 7 Yan Fisher Senior Principal Product Marketing Manager, Red Hat Rohit Bakhshi Product Manager,

More information

Big Fast Data Hadoop acceleration with Flash. June 2013

Big Fast Data Hadoop acceleration with Flash. June 2013 Big Fast Data Hadoop acceleration with Flash June 2013 Agenda The Big Data Problem What is Hadoop Hadoop and Flash The Nytro Solution Test Results The Big Data Problem Big Data Output Facebook Traditional

More information

EMC ISILON ONEFS OPERATING SYSTEM

EMC ISILON ONEFS OPERATING SYSTEM EMC ISILON ONEFS OPERATING SYSTEM Powering scale-out storage for the Big Data and Object workloads of today and tomorrow ESSENTIALS Easy-to-use, single volume, single file system architecture Highly scalable

More information

IBM System x reference architecture solutions for big data

IBM System x reference architecture solutions for big data IBM System x reference architecture solutions for big data Easy-to-implement hardware, software and services for analyzing data at rest and data in motion Highlights Accelerates time-to-value with scalable,

More information

EMC VPLEX FAMILY. Continuous Availability and data Mobility Within and Across Data Centers

EMC VPLEX FAMILY. Continuous Availability and data Mobility Within and Across Data Centers EMC VPLEX FAMILY Continuous Availability and data Mobility Within and Across Data Centers DELIVERING CONTINUOUS AVAILABILITY AND DATA MOBILITY FOR MISSION CRITICAL APPLICATIONS Storage infrastructure is

More information

MaxDeploy Ready. Hyper- Converged Virtualization Solution. With SanDisk Fusion iomemory products

MaxDeploy Ready. Hyper- Converged Virtualization Solution. With SanDisk Fusion iomemory products MaxDeploy Ready Hyper- Converged Virtualization Solution With SanDisk Fusion iomemory products MaxDeploy Ready products are configured and tested for support with Maxta software- defined storage and with

More information

Building the Virtual Information Infrastructure

Building the Virtual Information Infrastructure Technology Concepts and Business Considerations Abstract A virtual information infrastructure allows organizations to make the most of their data center environment by sharing computing, network, and storage

More information

EMC XtremSF: Delivering Next Generation Storage Performance for SQL Server

EMC XtremSF: Delivering Next Generation Storage Performance for SQL Server White Paper EMC XtremSF: Delivering Next Generation Storage Performance for SQL Server Abstract This white paper addresses the challenges currently facing business executives to store and process the growing

More information

Cisco Unified Data Center Solutions for MapR: Deliver Automated, High-Performance Hadoop Workloads

Cisco Unified Data Center Solutions for MapR: Deliver Automated, High-Performance Hadoop Workloads Solution Overview Cisco Unified Data Center Solutions for MapR: Deliver Automated, High-Performance Hadoop Workloads What You Will Learn MapR Hadoop clusters on Cisco Unified Computing System (Cisco UCS

More information

Cisco for SAP HANA Scale-Out Solution on Cisco UCS with NetApp Storage

Cisco for SAP HANA Scale-Out Solution on Cisco UCS with NetApp Storage Cisco for SAP HANA Scale-Out Solution Solution Brief December 2014 With Intelligent Intel Xeon Processors Highlights Scale SAP HANA on Demand Scale-out capabilities, combined with high-performance NetApp

More information

Technical Paper. Moving SAS Applications from a Physical to a Virtual VMware Environment

Technical Paper. Moving SAS Applications from a Physical to a Virtual VMware Environment Technical Paper Moving SAS Applications from a Physical to a Virtual VMware Environment Release Information Content Version: April 2015. Trademarks and Patents SAS Institute Inc., SAS Campus Drive, Cary,

More information

White Paper. SAP NetWeaver Landscape Virtualization Management on VCE Vblock System 300 Family

White Paper. SAP NetWeaver Landscape Virtualization Management on VCE Vblock System 300 Family White Paper SAP NetWeaver Landscape Virtualization Management on VCE Vblock System 300 Family Table of Contents 2 Introduction 3 A Best-of-Breed Integrated Operations Architecture 3 SAP NetWeaver Landscape

More information

SECURE, ENTERPRISE FILE SYNC AND SHARE WITH EMC SYNCPLICITY UTILIZING EMC ISILON, EMC ATMOS, AND EMC VNX

SECURE, ENTERPRISE FILE SYNC AND SHARE WITH EMC SYNCPLICITY UTILIZING EMC ISILON, EMC ATMOS, AND EMC VNX White Paper SECURE, ENTERPRISE FILE SYNC AND SHARE WITH EMC SYNCPLICITY UTILIZING EMC ISILON, EMC ATMOS, AND EMC VNX Abstract This white paper explains the benefits to the extended enterprise of the on-

More information

EMC SYNCPLICITY FILE SYNC AND SHARE SOLUTION

EMC SYNCPLICITY FILE SYNC AND SHARE SOLUTION EMC SYNCPLICITY FILE SYNC AND SHARE SOLUTION Automated file synchronization Flexible, cloud-based administration Secure, on-premises storage EMC Solutions January 2015 Copyright 2014 EMC Corporation. All

More information

LEVERAGE VBLOCK SYSTEMS FOR Esri s ArcGIS SYSTEM

LEVERAGE VBLOCK SYSTEMS FOR Esri s ArcGIS SYSTEM Leverage Vblock Systems for Esri's ArcGIS System Table of Contents www.vce.com LEVERAGE VBLOCK SYSTEMS FOR Esri s ArcGIS SYSTEM August 2012 1 Contents Executive summary...3 The challenge...3 The solution...3

More information

Enabling High performance Big Data platform with RDMA

Enabling High performance Big Data platform with RDMA Enabling High performance Big Data platform with RDMA Tong Liu HPC Advisory Council Oct 7 th, 2014 Shortcomings of Hadoop Administration tooling Performance Reliability SQL support Backup and recovery

More information

Security. Reliability. Performance. Flexibility. Scalability

Security. Reliability. Performance. Flexibility. Scalability ESG Lab Review VCE Vblock Systems with EMC Isilon for Enterprise Hadoop Date: November 2014 Author: Tony Palmer, Senior ESG Lab Analyst, and Mike Leone, ESG Lab Analyst Abstract: This ESG Lab review documents

More information

EMC VFCACHE ACCELERATES ORACLE

EMC VFCACHE ACCELERATES ORACLE White Paper EMC VFCACHE ACCELERATES ORACLE VFCache extends Flash to the server FAST Suite automates storage placement in the array VNX protects data EMC Solutions Group Abstract This white paper describes

More information

EMC ISILON AND ELEMENTAL SERVER

EMC ISILON AND ELEMENTAL SERVER Configuration Guide EMC ISILON AND ELEMENTAL SERVER Configuration Guide for EMC Isilon Scale-Out NAS and Elemental Server v1.9 EMC Solutions Group Abstract EMC Isilon and Elemental provide best-in-class,

More information

REDEFINE SIMPLICITY TOP REASONS: EMC VSPEX BLUE FOR VIRTUALIZED ENVIRONMENTS

REDEFINE SIMPLICITY TOP REASONS: EMC VSPEX BLUE FOR VIRTUALIZED ENVIRONMENTS REDEFINE SIMPLICITY AGILE. SCALABLE. TRUSTED. TOP REASONS: EMC VSPEX BLUE FOR VIRTUALIZED ENVIRONMENTS Redefine Simplicity: Agile, Scalable and Trusted. Mid-market and Enterprise customers as well as Managed

More information

EMC Virtual Infrastructure for SAP Enabled by EMC Symmetrix with Auto-provisioning Groups, Symmetrix Management Console, and VMware vcenter Converter

EMC Virtual Infrastructure for SAP Enabled by EMC Symmetrix with Auto-provisioning Groups, Symmetrix Management Console, and VMware vcenter Converter EMC Virtual Infrastructure for SAP Enabled by EMC Symmetrix with Auto-provisioning Groups, VMware vcenter Converter A Detailed Review EMC Information Infrastructure Solutions Abstract This white paper

More information

Changing the Equation on Big Data Spending

Changing the Equation on Big Data Spending White Paper Changing the Equation on Big Data Spending Big Data analytics can deliver new customer insights, provide competitive advantage, and drive business innovation. But complexity is holding back

More information

EMC Integrated Infrastructure for VMware

EMC Integrated Infrastructure for VMware EMC Integrated Infrastructure for VMware Enabled by Celerra Reference Architecture EMC Global Solutions Centers EMC Corporation Corporate Headquarters Hopkinton MA 01748-9103 1.508.435.1000 www.emc.com

More information

Isilon OneFS. Version 7.2. OneFS Migration Tools Guide

Isilon OneFS. Version 7.2. OneFS Migration Tools Guide Isilon OneFS Version 7.2 OneFS Migration Tools Guide Copyright 2014 EMC Corporation. All rights reserved. Published in USA. Published November, 2014 EMC believes the information in this publication is

More information

EMC Business Continuity for VMware View Enabled by EMC SRDF/S and VMware vcenter Site Recovery Manager

EMC Business Continuity for VMware View Enabled by EMC SRDF/S and VMware vcenter Site Recovery Manager EMC Business Continuity for VMware View Enabled by EMC SRDF/S and VMware vcenter Site Recovery Manager A Detailed Review Abstract This white paper demonstrates that business continuity can be enhanced

More information

EMC XtremSF: Delivering Next Generation Performance for Oracle Database

EMC XtremSF: Delivering Next Generation Performance for Oracle Database White Paper EMC XtremSF: Delivering Next Generation Performance for Oracle Database Abstract This white paper addresses the challenges currently facing business executives to store and process the growing

More information

Accelerating Hadoop MapReduce Using an In-Memory Data Grid

Accelerating Hadoop MapReduce Using an In-Memory Data Grid Accelerating Hadoop MapReduce Using an In-Memory Data Grid By David L. Brinker and William L. Bain, ScaleOut Software, Inc. 2013 ScaleOut Software, Inc. 12/27/2012 H adoop has been widely embraced for

More information

VBLOCK SOLUTION FOR SAP: SAP APPLICATION AND DATABASE PERFORMANCE IN PHYSICAL AND VIRTUAL ENVIRONMENTS

VBLOCK SOLUTION FOR SAP: SAP APPLICATION AND DATABASE PERFORMANCE IN PHYSICAL AND VIRTUAL ENVIRONMENTS Vblock Solution for SAP: SAP Application and Database Performance in Physical and Virtual Environments Table of Contents www.vce.com V VBLOCK SOLUTION FOR SAP: SAP APPLICATION AND DATABASE PERFORMANCE

More information

Future Proofing Data Archives with Storage Migration From Legacy to Cloud

Future Proofing Data Archives with Storage Migration From Legacy to Cloud Future Proofing Data Archives with Storage Migration From Legacy to Cloud ABSTRACT This white paper explains how EMC Elastic Cloud Storage (ECS ) Appliance and Seven10 s Storfirst software enable organizations

More information

Make the Most of Big Data to Drive Innovation Through Reseach

Make the Most of Big Data to Drive Innovation Through Reseach White Paper Make the Most of Big Data to Drive Innovation Through Reseach Bob Burwell, NetApp November 2012 WP-7172 Abstract Monumental data growth is a fact of life in research universities. The ability

More information

Big Data and Apache Hadoop Adoption:

Big Data and Apache Hadoop Adoption: Expert Reference Series of White Papers Big Data and Apache Hadoop Adoption: Key Challenges and Rewards 1-800-COURSES www.globalknowledge.com Big Data and Apache Hadoop Adoption: Key Challenges and Rewards

More information

Driving IBM BigInsights Performance Over GPFS Using InfiniBand+RDMA

Driving IBM BigInsights Performance Over GPFS Using InfiniBand+RDMA WHITE PAPER April 2014 Driving IBM BigInsights Performance Over GPFS Using InfiniBand+RDMA Executive Summary...1 Background...2 File Systems Architecture...2 Network Architecture...3 IBM BigInsights...5

More information

Solution Overview VMWARE PROTECTION WITH EMC NETWORKER 8.2. White Paper

Solution Overview VMWARE PROTECTION WITH EMC NETWORKER 8.2. White Paper White Paper VMWARE PROTECTION WITH EMC NETWORKER 8.2 Solution Overview Abstract This white paper describes the integration of EMC NetWorker with VMware vcenter. It also includes details on the NetWorker

More information

Overview executive SUMMArY

Overview executive SUMMArY EMC Isilon TCO Benefits for Large-Scale Home Directories Overview EMC Isilon scale-out network-attached storage (NAS) has rapidly gained popularity over the past several years, successfully moving from

More information

Consolidate and Virtualize Your Windows Environment with NetApp and VMware

Consolidate and Virtualize Your Windows Environment with NetApp and VMware White Paper Consolidate and Virtualize Your Windows Environment with NetApp and VMware Sachin Chheda, NetApp and Gaetan Castelein, VMware October 2009 WP-7086-1009 TABLE OF CONTENTS 1 EXECUTIVE SUMMARY...

More information

EMC IT AUTOMATES ENTERPRISE PLATFORM AS A SERVICE

EMC IT AUTOMATES ENTERPRISE PLATFORM AS A SERVICE EMC IT AUTOMATES ENTERPRISE PLATFORM AS A SERVICE Self-service portal delivers ready-to-use development platform in less than one hour Application developers order from online catalog with just a few clicks

More information

Master Hybrid Cloud Management with VMware vrealize Suite. Increase Business Agility, Efficiency, and Choice While Keeping IT in Control

Master Hybrid Cloud Management with VMware vrealize Suite. Increase Business Agility, Efficiency, and Choice While Keeping IT in Control Master Hybrid Cloud Management with VMware vrealize Suite Increase Business Agility, Efficiency, and Choice While Keeping IT in Control Empower IT to Innovate The time is now for IT organizations to take

More information

Big data management with IBM General Parallel File System

Big data management with IBM General Parallel File System Big data management with IBM General Parallel File System Optimize storage management and boost your return on investment Highlights Handles the explosive growth of structured and unstructured data Offers

More information

ABSTRACT. February, 2014 EMC WHITE PAPER

ABSTRACT. February, 2014 EMC WHITE PAPER EMC APPSYNC SOLUTION FOR MANAGING PROTECTION OF MICROSOFT SQL SERVER SLA-DRIVEN, SELF-SERVICE CAPABILITIES FOR MAXIMIZING AND SIMPLIFYING DATA PROTECTION AND RECOVERABILITY ABSTRACT With Microsoft SQL

More information

Oracle Hyperion Financial Management Virtualization Whitepaper

Oracle Hyperion Financial Management Virtualization Whitepaper Oracle Hyperion Financial Management Virtualization Whitepaper Oracle Hyperion Financial Management Virtualization Whitepaper TABLE OF CONTENTS Overview... 3 Benefits... 4 HFM Virtualization testing...

More information

Big Data - Infrastructure Considerations

Big Data - Infrastructure Considerations April 2014, HAPPIEST MINDS TECHNOLOGIES Big Data - Infrastructure Considerations Author Anand Veeramani / Deepak Shivamurthy SHARING. MINDFUL. INTEGRITY. LEARNING. EXCELLENCE. SOCIAL RESPONSIBILITY. Copyright

More information

can you effectively plan for the migration and management of systems and applications on Vblock Platforms?

can you effectively plan for the migration and management of systems and applications on Vblock Platforms? SOLUTION BRIEF CA Capacity Management and Reporting Suite for Vblock Platforms can you effectively plan for the migration and management of systems and applications on Vblock Platforms? agility made possible

More information

EMC HADOOP AS A SERVICE SOLUTION

EMC HADOOP AS A SERVICE SOLUTION White Paper EMC HADOOP AS A SERVICE SOLUTION EMC Isilon, Pivotal HD, VMware vsphere Big Data Extensions Hadoop for service providers Virtualized and shared infrastructure Global Solutions Sales Abstract

More information

Open source Google-style large scale data analysis with Hadoop

Open source Google-style large scale data analysis with Hadoop Open source Google-style large scale data analysis with Hadoop Ioannis Konstantinou Email: ikons@cslab.ece.ntua.gr Web: http://www.cslab.ntua.gr/~ikons Computing Systems Laboratory School of Electrical

More information

Using SUSE Cloud to Orchestrate Multiple Hypervisors and Storage at ADP

Using SUSE Cloud to Orchestrate Multiple Hypervisors and Storage at ADP Using SUSE Cloud to Orchestrate Multiple Hypervisors and Storage at ADP Agenda ADP Cloud Vision and Requirements Introduction to SUSE Cloud Overview Whats New VMWare intergration HyperV intergration ADP

More information