Hadoop™ Analytics DDN

DDN Solution Brief

Accelerate > Hadoop™ Analytics with the SFA Big Data Platform

Organizations that need to extract value from all of their data can use the award-winning SFA platform to accelerate their Hadoop infrastructure and gain a deeper understanding of their business.

DataDirect Networks. All Rights Reserved.

The Big Data Challenge and Opportunity in Hadoop Analytics

Organizations in a wide range of industries rely on advanced analytics to gain important insights from rapidly growing data sets and to make faster, more informed decisions. The ability to perform detailed and complex analytics on Big Data using Apache Hadoop is integral to success in fields such as Life Sciences, Financial Services, and Government.

Life Sciences. Hadoop-based analytics is being used to detect drug interactions, identify the best courses of treatment, and determine a patient's likelihood of developing a disease.

Financial Services. Visionary hedge funds, proprietary trading firms, and other leading institutions are turning to Hadoop for market monitoring, risk modeling, fraud detection, and compliance reporting.

Government. Federal and local government agencies are turning to Hadoop to satisfy diverse mission goals. The intelligence community works daily to find hidden associations across multiple large data sets. Law enforcement is analyzing all available information to better deploy limited resources and ensure that police are positioned to respond rapidly while providing a visible presence to deter crime.

As Hadoop-based analytics becomes an essential part of operations like these, the performance, scalability, and reliability of the infrastructure that supports Hadoop have become increasingly business critical. Science projects are no longer enough to get the job done. Hadoop infrastructure has to be efficient, reliable, and IT friendly. Shared storage solutions can eliminate bottlenecks to increase Hadoop performance while providing greater reliability and the familiar feature set that IT teams depend on. As a leader in Big Data storage, DDN is the perfect storage partner for your Hadoop infrastructure needs.

What is Analytics and How Does Hadoop Enable It?

Analytics is about taking data and turning it into actionable information. This process includes finding important details, making associations, and working towards a recommendation that you can act on. Hadoop is becoming the preferred platform for efficiently processing the large volumes of data from diverse sources that are needed to drive these decisions. Hadoop provides storage and analysis of large, semi-structured and unstructured data sets, and it offers a rich ecosystem of add-on components such as Apache HCatalog, Mahout, Pig, and Accumulo that allow for integration with other platforms and simplified access to complex data sets.

Hadoop has established itself as a standard for organizations working to solve Big Data challenges, because it:

Scales in both performance and capacity
Has a growing solution ecosystem that increases capability and flexibility
Provides established APIs and interfaces that accelerate development

2012 DataDirect Networks. All Rights Reserved.

The Hadoop software consists of two main components:

MapReduce. A programming model for processing problems against huge data sets. Problems are divided into parallel tasks by a job tracker, and each task is assigned to a task tracker on a Hadoop node for execution. In the map part of the process, queries are processed in parallel on many nodes. During the reduce part of the process, results are gathered, organized, and presented.

Hadoop Distributed File System (HDFS). The distributed file system used for data management by Hadoop. A single name node manages metadata for a Hadoop cluster, while a data node process on each cluster node is responsible for the subset of the total data set that resides there.

A standard Hadoop installation runs on a cluster of compute nodes in which each node contains compute and internal (direct-attached) storage. For data protection and disaster recovery, Hadoop maintains three copies of all data on separate nodes. This operational model is quite different from what most IT teams are accustomed to, and, as data sets grow in size, storing three copies of data consumes a huge amount of storage, not to mention electricity and floor space.

As more mainstream organizations adopt Hadoop, a new set of capabilities is needed to allow Hadoop to integrate better with standard IT practices. Replacing Hadoop's direct-attached storage model with shared storage may be the fastest way to rationalize Hadoop deployment and simplify the integration of Hadoop solutions with your existing IT infrastructure and practices.
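The map and reduce phases described above can be illustrated with a minimal, single-machine sketch. This is not Hadoop's API: the function names `map_task`, `reduce_task`, and `run_job` are hypothetical, a thread pool stands in for the task trackers, and word counting is chosen only because it is a familiar example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_task(chunk):
    """Map step: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in chunk.split()]


def reduce_task(pairs):
    """Reduce step: gather the emitted pairs by key and sum their counts."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def run_job(chunks, workers=4):
    """Run map tasks in parallel (a stand-in for task trackers), then reduce."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_task, chunks))
    all_pairs = [pair for part in mapped for pair in part]
    return reduce_task(all_pairs)


if __name__ == "__main__":
    # Each string plays the role of a data split stored on a different node.
    splits = ["big data big insight", "data drives insight", "big decisions"]
    print(run_job(splits))
```

In a real cluster the splits would live in HDFS blocks on separate nodes, and the framework would also handle scheduling, shuffling, and task retries, none of which this sketch attempts.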
Shared Storage Accelerates Hadoop and Increases Operational Efficiency

As Hadoop becomes an integral part of business processes, IT teams are looking for Hadoop infrastructure solutions that deliver:

Enterprise-class hardware
Enterprise integration
High availability
Efficient CAPEX and OPEX scaling
Resource management, SLAs and QoS

Moving to a shared storage infrastructure for Hadoop can address these concerns and provide significant advantages in terms of performance, scaling, and reliability while creating a more IT-friendly infrastructure. Making the investment in the enterprise-class hardware necessary to support shared infrastructure significantly reduces ongoing operational expenses and allows you to share resources across multiple applications and business units.

Performance: Shared storage can achieve better storage performance with far fewer spinning disks. Each Hadoop node typically includes a few commodity disk drives organized either as a JBOD or in a single RAID group. Solid-state disks are rarely used, in order to keep per-node costs down. The storage performance of any single Hadoop node is low, and high aggregate storage performance is only achieved by having a large number of nodes and a very large number of disks. In recent testing using TestDFSIO, a distributed I/O benchmark tool that writes and reads random data from HDFS, the DDN Storage Fusion Architecture (SFA) demonstrated a 50% to 100% or more improvement in HDFS performance versus commodity servers with local storage.

Scalability: By deploying shared storage for Hadoop, compute and storage capacity scale independently, with greater flexibility to choose the best solution for each. Pairing dense compute infrastructure with dense shared storage can significantly shrink your overall Hadoop footprint. Data growth in Hadoop environments often exceeds growth in computing demand, so scaling out compute and disk in lockstep, as in a standard Hadoop deployment, means paying for CPU capacity in order to get storage. Because a standard Hadoop installation has three data copies, it requires 3X the storage to satisfy a given requirement, making the addition of new capacity an expensive proposition. Shared storage provides storage resiliency with much lower capacity overhead.

Reliability: Placing the storage for Hadoop's Name Node and Job Tracker, which are particularly vulnerable to failures, on reliable, shared storage protects both performance and availability; service can be restored more quickly should one of these services fail. All things being equal, having 3X the disks means 3X the disk failures. With each disk failure, a Hadoop node is compromised. While Hadoop can continue to run, at some point performance suffers and results are delayed. Shared storage provides the same usable capacity from far fewer disk spindles with better overall reliability. When a disk does fail, advanced storage systems like the DDN SFA12K can regenerate missing data from parity without impacting Hadoop performance. When a compute node fails, it's easy to re-assign its storage to a spare.
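To make the capacity overhead argument concrete, the sketch below compares the raw capacity needed for a given usable capacity under three-copy replication and under parity protection. The 8 data + 2 parity layout is an assumption chosen for illustration; the brief does not state which RAID geometry the SFA12K uses, and the function names are hypothetical.

```python
def raw_capacity_replicated(usable_tb, copies=3):
    """Raw capacity needed when every block is stored `copies` times."""
    return usable_tb * copies


def raw_capacity_parity(usable_tb, data_disks=8, parity_disks=2):
    """Raw capacity needed under a parity layout.

    The 8+2 default is an illustrative assumption, not a documented
    SFA12K configuration.
    """
    return usable_tb * (data_disks + parity_disks) / data_disks


if __name__ == "__main__":
    usable = 1000  # TB of usable capacity required
    replicated = raw_capacity_replicated(usable)
    parity = raw_capacity_parity(usable)
    print(f"3-copy replication: {replicated} TB raw")
    print(f"8+2 parity (assumed): {parity} TB raw")
    print(f"Replication needs {replicated / parity:.1f}x the raw capacity")
```

Because the number of drives scales with raw capacity, the same ratio gives a rough idea of how many fewer drives, and therefore how many fewer drive failures, to expect under the assumed parity layout.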
IT Friendliness: Shared storage is familiar and IT friendly. Which would you rather manage: 1,000 disks in a single discrete storage system, or 3,000 disks spread across hundreds of compute nodes? Shared storage eliminates the mismatch between Hadoop and the rest of your IT infrastructure, making it easier to integrate Hadoop with your operations.

Management. Manage all storage from a single interface. Scale capacity quickly.

Data protection. Take advantage of built-in data integrity and data protection functions such as RAID, snapshots, and off-site replication, offloading that work from Hadoop.

Flexibility. Pull compute resources into a Hadoop cluster for intensive jobs and release them when the job is complete.

Multiple workloads. Support other workloads without affecting Hadoop performance.

Cost. Fewer spinning disks, a smaller storage footprint, reduced complexity, and simplified management decrease energy consumption, save datacenter space, and decrease management costs.

Award Winning SFA12K

Innovative, award winning, and proven in the world's largest and most demanding production environments, DDN's Storage Fusion Architecture (SFA) utilizes the most advanced processor technology, busses, and memory with an optimized RAID engine and sophisticated data management algorithms. The SFA12K product family is designed to derive peak performance from Hadoop investments with a massive I/O infrastructure and multi-media disk drives that maximize system performance and lower storage investment costs. The SFA12K product family is purpose-built to simplify and tame Big Data growth, enabling you to architect and scale your Hadoop environment more intelligently, efficiently, and cost effectively.

For architects in global businesses coping with complex big data solutions, the SFA platform for Hadoop infrastructure is an extremely reliable and high-performing platform that will accelerate workflows to enable you to analyze growing amounts of data without increasing costs.

Performance and scalability: Our state-of-the-art SFA12K storage engine is almost eight times faster than legacy enterprise storage. With an SFA12K you can leverage industry-leading SFA storage performance to satisfy Hadoop storage requirements with the fewest storage systems. A single system delivers up to 40GB/second of system bandwidth, and bandwidth scales with each additional SFA12K system. It's possible to achieve an aggregate bandwidth of 1TB/second in just 25 systems. The SFA platform is the fastest platform for Big Data, with the ability to extract the highest performance from all media. Higher performance means that you can deliver exceptional performance from smaller Hadoop clusters.

Density: Reduce your Hadoop footprint, reclaim your datacenter, and resolve space and power limitations with the industry's densest storage platform. Each enclosure houses 60 drives in just 4U to deliver 2.4PB of raw storage per rack (with 4TB SATA drives). Our world-leading density and power efficiency mean that organizations can reduce their TCO.

Reliability: The SFA12K delivers world-class reliability features that protect the availability and integrity of your Hadoop data. A unique multi-RAID architecture combines up to 1,680 SATA, SAS, and SSD drives into a simply managed, multi-petabyte platform. The system is able to perform multiple levels of parity generation and real-time data integrity verification and error correction in the background without impacting Hadoop performance. DirectProtect further increases data resiliency and reliability by automatically detecting and correcting silent data corruption.
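The bandwidth and density figures quoted above can be checked with simple arithmetic. The sketch below uses only numbers stated in the brief (40GB/second per system, 60 drives per 4U enclosure, 4TB drives, up to 1,680 drives per system). It assumes decimal units (1TB = 1,000GB, 1PB = 1,000TB) and assumes for the last function that every drive slot holds a 4TB drive. The figure of 10 enclosures per rack is derived from the 2.4PB claim rather than stated in the brief, and all function names are hypothetical.

```python
import math

SYSTEM_BANDWIDTH_GB_S = 40    # per SFA12K system, from the brief
DRIVES_PER_ENCLOSURE = 60     # from the brief
ENCLOSURE_HEIGHT_U = 4        # from the brief
DRIVE_CAPACITY_TB = 4         # 4TB SATA drives, from the brief
MAX_DRIVES_PER_SYSTEM = 1680  # from the brief


def systems_for_bandwidth(target_gb_s):
    """Systems needed to reach a target aggregate bandwidth, assuming linear scaling."""
    return math.ceil(target_gb_s / SYSTEM_BANDWIDTH_GB_S)


def rack_raw_capacity_pb(enclosures_per_rack):
    """Raw capacity of a rack in PB for a given number of 60-drive enclosures."""
    return enclosures_per_rack * DRIVES_PER_ENCLOSURE * DRIVE_CAPACITY_TB / 1000


def max_system_raw_capacity_pb():
    """Raw capacity of a fully populated system, assuming all drives are 4TB."""
    return MAX_DRIVES_PER_SYSTEM * DRIVE_CAPACITY_TB / 1000


if __name__ == "__main__":
    print(systems_for_bandwidth(1000))   # systems for 1TB/second aggregate
    print(rack_raw_capacity_pb(10))      # 10 enclosures occupy 40U
    print(max_system_raw_capacity_pb())
```

Under these assumptions, 25 systems reach 1TB/second and 10 enclosures (40U) give 2.4PB per rack, which agrees with the brief's claims.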

Lowest Total Cost of Ownership (TCO) in the industry: TCO that's 50% lower than other enterprise storage solutions makes the SFA12K a smarter choice for Hadoop shared storage infrastructure, and you can support workloads in addition to Hadoop from the same storage.

The leading-edge SFA12K brings industry-leading performance, capacity, density, and reliability to your datacenter. The SFA12K is a performance powerhouse. The power, speed, and scalability of SFA deliver unparalleled performance improvements for Hadoop, in an IT-friendly platform with lower TCO.

For business executives seeking to understand how an organization is perceived by customers and the world, the SFA platform for Hadoop infrastructure helps you gain insights and understand your business better and faster than ever before. Because the DDN SFA12K is the faster shared storage platform, it is the ideal choice for accelerating Hadoop-based analytics to power better decisions.

DDN About Us

DataDirect Networks (DDN) is the world leader in massively scalable storage. We are the leading provider of data storage and processing solutions and professional services that enable content-rich and high-growth IT environments to achieve the highest levels of systems scalability, efficiency, and simplicity. DDN enables enterprises to extract value and deliver results from their information. Our customers include the world's leading online content and social networking providers, high-performance cloud and grid computing, life sciences, media production organizations, and security and intelligence organizations. Deployed in thousands of mission-critical environments worldwide, DDN's solutions have been designed, engineered, and proven in the world's most scalable data centers to ensure competitive business advantage for today's information-powered enterprise. For more information, go to www. or call ,

DataDirect Networks, Inc. All Rights Reserved. DataDirect Networks, SFA, and Storage Fusion Architecture are trademarks of DataDirect Networks. All other trademarks are the property of their respective owners.

Modernizing Hadoop Architecture for Superior Scalability, Efficiency & Productive Throughput. ddn.com

Modernizing Hadoop Architecture for Superior Scalability, Efficiency & Productive Throughput. ddn.com DDN Technical Brief Modernizing Hadoop Architecture for Superior Scalability, Efficiency & Productive Throughput. A Fundamentally Different Approach To Enterprise Analytics Architecture: A Scalable Unit

More information

With DDN Big Data Storage

With DDN Big Data Storage DDN Solution Brief Accelerate > ISR With DDN Big Data Storage The Way to Capture and Analyze the Growing Amount of Data Created by New Technologies 2012 DataDirect Networks. All Rights Reserved. The Big

More information

ANY SURVEILLANCE, ANYWHERE, ANYTIME

ANY SURVEILLANCE, ANYWHERE, ANYTIME ANY SURVEILLANCE, ANYWHERE, ANYTIME WHITEPAPER DDN Storage Powers Next Generation Video Surveillance Infrastructure INTRODUCTION Over the past decade, the world has seen tremendous growth in the use of

More information

Elasticsearch on Cisco Unified Computing System: Optimizing your UCS infrastructure for Elasticsearch s analytics software stack

Elasticsearch on Cisco Unified Computing System: Optimizing your UCS infrastructure for Elasticsearch s analytics software stack Elasticsearch on Cisco Unified Computing System: Optimizing your UCS infrastructure for Elasticsearch s analytics software stack HIGHLIGHTS Real-Time Results Elasticsearch on Cisco UCS enables a deeper

More information

Collaborative Research Infrastructure Deployments. ddn.com. Accelerate > DDN Case Study

Collaborative Research Infrastructure Deployments. ddn.com. Accelerate > DDN Case Study DDN Case Study Accelerate > Collaborative Research Infrastructure Deployments University College London Transforms Research Collaboration and Data Preservation with Scalable Cloud Object Storage Appliance

More information

Improving Time to Results for Seismic Processing with Paradigm and DDN. ddn.com. DDN Whitepaper. James Coomer and Laurent Thiers

Improving Time to Results for Seismic Processing with Paradigm and DDN. ddn.com. DDN Whitepaper. James Coomer and Laurent Thiers DDN Whitepaper Improving Time to Results for Seismic Processing with Paradigm and DDN James Coomer and Laurent Thiers 2014 DataDirect Networks. All Rights Reserved. Executive Summary Companies in the oil

More information

HP Vertica OnDemand. Vertica OnDemand. Enterprise-class Big Data analytics in the cloud. Enterprise-class Big Data analytics for any size organization

HP Vertica OnDemand. Vertica OnDemand. Enterprise-class Big Data analytics in the cloud. Enterprise-class Big Data analytics for any size organization Data sheet HP Vertica OnDemand Enterprise-class Big Data analytics in the cloud Enterprise-class Big Data analytics for any size organization Vertica OnDemand Organizations today are experiencing a greater

More information

EMC XtremSF: Delivering Next Generation Storage Performance for SQL Server

EMC XtremSF: Delivering Next Generation Storage Performance for SQL Server White Paper EMC XtremSF: Delivering Next Generation Storage Performance for SQL Server Abstract This white paper addresses the challenges currently facing business executives to store and process the growing

More information

Building & Optimizing Enterprise-class Hadoop with Open Architectures Prem Jain NetApp

Building & Optimizing Enterprise-class Hadoop with Open Architectures Prem Jain NetApp Building & Optimizing Enterprise-class Hadoop with Open Architectures Prem Jain NetApp Introduction to Hadoop Comes from Internet companies Emerging big data storage and analytics platform HDFS and MapReduce

More information

MaxDeploy Ready. Hyper- Converged Virtualization Solution. With SanDisk Fusion iomemory products

MaxDeploy Ready. Hyper- Converged Virtualization Solution. With SanDisk Fusion iomemory products MaxDeploy Ready Hyper- Converged Virtualization Solution With SanDisk Fusion iomemory products MaxDeploy Ready products are configured and tested for support with Maxta software- defined storage and with

More information

ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE

ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE Hadoop Storage-as-a-Service ABSTRACT This White Paper illustrates how EMC Elastic Cloud Storage (ECS ) can be used to streamline the Hadoop data analytics

More information

Optimizing Dell PowerEdge Configurations for Hadoop

Optimizing Dell PowerEdge Configurations for Hadoop Optimizing Dell PowerEdge Configurations for Hadoop Understanding how to get the most out of Hadoop running on Dell hardware A Dell technical white paper July 2013 Michael Pittaro Principal Architect,

More information

Driving IBM BigInsights Performance Over GPFS Using InfiniBand+RDMA

Driving IBM BigInsights Performance Over GPFS Using InfiniBand+RDMA WHITE PAPER April 2014 Driving IBM BigInsights Performance Over GPFS Using InfiniBand+RDMA Executive Summary...1 Background...2 File Systems Architecture...2 Network Architecture...3 IBM BigInsights...5

More information

EMC XtremSF: Delivering Next Generation Performance for Oracle Database

EMC XtremSF: Delivering Next Generation Performance for Oracle Database White Paper EMC XtremSF: Delivering Next Generation Performance for Oracle Database Abstract This white paper addresses the challenges currently facing business executives to store and process the growing

More information

SQL Server 2012 Parallel Data Warehouse. Solution Brief

SQL Server 2012 Parallel Data Warehouse. Solution Brief SQL Server 2012 Parallel Data Warehouse Solution Brief Published February 22, 2013 Contents Introduction... 1 Microsoft Platform: Windows Server and SQL Server... 2 SQL Server 2012 Parallel Data Warehouse...

More information

SYMANTEC NETBACKUP APPLIANCE FAMILY OVERVIEW BROCHURE. When you can do it simply, you can do it all.

SYMANTEC NETBACKUP APPLIANCE FAMILY OVERVIEW BROCHURE. When you can do it simply, you can do it all. SYMANTEC NETBACKUP APPLIANCE FAMILY OVERVIEW BROCHURE When you can do it simply, you can do it all. SYMANTEC NETBACKUP APPLIANCES Symantec understands the shifting needs of the data center and offers NetBackup

More information

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum Big Data Analytics with EMC Greenplum and Hadoop Big Data Analytics with EMC Greenplum and Hadoop Ofir Manor Pre Sales Technical Architect EMC Greenplum 1 Big Data and the Data Warehouse Potential All

More information

Cisco UCS and Fusion- io take Big Data workloads to extreme performance in a small footprint: A case study with Oracle NoSQL database

Cisco UCS and Fusion- io take Big Data workloads to extreme performance in a small footprint: A case study with Oracle NoSQL database Cisco UCS and Fusion- io take Big Data workloads to extreme performance in a small footprint: A case study with Oracle NoSQL database Built up on Cisco s big data common platform architecture (CPA), a

More information

MaxDeploy Hyper- Converged Reference Architecture Solution Brief

MaxDeploy Hyper- Converged Reference Architecture Solution Brief MaxDeploy Hyper- Converged Reference Architecture Solution Brief MaxDeploy Reference Architecture solutions are configured and tested for support with Maxta software- defined storage and with industry

More information

WOS Cloud. ddn.com. Personal Storage for the Enterprise. DDN Solution Brief

WOS Cloud. ddn.com. Personal Storage for the Enterprise. DDN Solution Brief DDN Solution Brief Personal Storage for the Enterprise WOS Cloud Secure, Shared Drop-in File Access for Enterprise Users, Anytime and Anywhere 2011 DataDirect Networks. All Rights Reserved DDN WOS Cloud

More information

Media Workflows Nice Shoes operates a 24x7 highly collaborative environment and needed to enable users to work in real-time. ddn.com.

Media Workflows Nice Shoes operates a 24x7 highly collaborative environment and needed to enable users to work in real-time. ddn.com. DDN Case Study Accelerating > Media Workflows Nice Shoes operates a 24x7 highly collaborative environment and needed to enable users to work in real-time. 2012 DataDirect Networks. All Rights Reserved.

More information

EMC ISILON OneFS OPERATING SYSTEM Powering scale-out storage for the new world of Big Data in the enterprise

EMC ISILON OneFS OPERATING SYSTEM Powering scale-out storage for the new world of Big Data in the enterprise EMC ISILON OneFS OPERATING SYSTEM Powering scale-out storage for the new world of Big Data in the enterprise ESSENTIALS Easy-to-use, single volume, single file system architecture Highly scalable with

More information

Cisco for SAP HANA Scale-Out Solution on Cisco UCS with NetApp Storage

Cisco for SAP HANA Scale-Out Solution on Cisco UCS with NetApp Storage Cisco for SAP HANA Scale-Out Solution Solution Brief December 2014 With Intelligent Intel Xeon Processors Highlights Scale SAP HANA on Demand Scale-out capabilities, combined with high-performance NetApp

More information

WOS for Research. ddn.com. DDN Whitepaper. Utilizing irods to manage collaborative research. 2012 DataDirect Networks. All Rights Reserved.

WOS for Research. ddn.com. DDN Whitepaper. Utilizing irods to manage collaborative research. 2012 DataDirect Networks. All Rights Reserved. DDN Whitepaper WOS for Research Utilizing irods to manage collaborative research. 2012 DataDirect Networks. All Rights Reserved. irods and the DDN Web Object Scalar (WOS) Integration irods, an open source

More information

Protecting Big Data Data Protection Solutions for the Business Data Lake

Protecting Big Data Data Protection Solutions for the Business Data Lake White Paper Protecting Big Data Data Protection Solutions for the Business Data Lake Abstract Big Data use cases are maturing and customers are using Big Data to improve top and bottom line revenues. With

More information

Minimize cost and risk for data warehousing

Minimize cost and risk for data warehousing SYSTEM X SERVERS SOLUTION BRIEF Minimize cost and risk for data warehousing Microsoft Data Warehouse Fast Track for SQL Server 2014 on System x3850 X6 (55TB) Highlights Improve time to value for your data

More information

IBM System x reference architecture solutions for big data

IBM System x reference architecture solutions for big data IBM System x reference architecture solutions for big data Easy-to-implement hardware, software and services for analyzing data at rest and data in motion Highlights Accelerates time-to-value with scalable,

More information

Geospatial Imaging Cloud Storage Capturing the World at Scale with WOS TM. ddn.com. DDN Whitepaper. 2011 DataDirect Networks. All Rights Reserved.

Geospatial Imaging Cloud Storage Capturing the World at Scale with WOS TM. ddn.com. DDN Whitepaper. 2011 DataDirect Networks. All Rights Reserved. DDN Whitepaper Geospatial Imaging Cloud Storage Capturing the World at Scale with WOS TM Table of Contents Growth and Complexity Challenges for Geospatial Imaging 3 New Solutions to Drive Insight, Simplicity

More information

Netapp HPC Solution for Lustre. Rich Fenton (fenton@netapp.com) UK Solutions Architect

Netapp HPC Solution for Lustre. Rich Fenton (fenton@netapp.com) UK Solutions Architect Netapp HPC Solution for Lustre Rich Fenton (fenton@netapp.com) UK Solutions Architect Agenda NetApp Introduction Introducing the E-Series Platform Why E-Series for Lustre? Modular Scale-out Capacity Density

More information

Maximum performance, minimal risk for data warehousing

Maximum performance, minimal risk for data warehousing SYSTEM X SERVERS SOLUTION BRIEF Maximum performance, minimal risk for data warehousing Microsoft Data Warehouse Fast Track for SQL Server 2014 on System x3850 X6 (95TB) The rapid growth of technology has

More information

White Paper Storage for Big Data and Analytics Challenges

White Paper Storage for Big Data and Analytics Challenges White Paper Storage for Big Data and Analytics Challenges Abstract Big Data and analytics workloads represent a new frontier for organizations. Data is being collected from sources that did not exist 10

More information

Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software

Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software WHITEPAPER Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software SanDisk ZetaScale software unlocks the full benefits of flash for In-Memory Compute and NoSQL applications

More information

Business white paper Invest in the right flash storage solution

Business white paper Invest in the right flash storage solution Business white paper Invest in the right flash storage solution A guide for the savvy tech buyer Business white paper Page 2 Introduction You re looking at flash storage because you see it s taking the

More information

Easier - Faster - Better

Easier - Faster - Better Highest reliability, availability and serviceability ClusterStor gets you productive fast with robust professional service offerings available as part of solution delivery, including quality controlled

More information

An Oracle White Paper November 2010. Backup and Recovery with Oracle s Sun ZFS Storage Appliances and Oracle Recovery Manager

An Oracle White Paper November 2010. Backup and Recovery with Oracle s Sun ZFS Storage Appliances and Oracle Recovery Manager An Oracle White Paper November 2010 Backup and Recovery with Oracle s Sun ZFS Storage Appliances and Oracle Recovery Manager Introduction...2 Oracle Backup and Recovery Solution Overview...3 Oracle Recovery

More information

Enabling High performance Big Data platform with RDMA

Enabling High performance Big Data platform with RDMA Enabling High performance Big Data platform with RDMA Tong Liu HPC Advisory Council Oct 7 th, 2014 Shortcomings of Hadoop Administration tooling Performance Reliability SQL support Backup and recovery

More information

IBM Spectrum Scale vs EMC Isilon for IBM Spectrum Protect Workloads

IBM Spectrum Scale vs EMC Isilon for IBM Spectrum Protect Workloads 89 Fifth Avenue, 7th Floor New York, NY 10003 www.theedison.com @EdisonGroupInc 212.367.7400 IBM Spectrum Scale vs EMC Isilon for IBM Spectrum Protect Workloads A Competitive Test and Evaluation Report

More information

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics ESSENTIALS EMC ISILON Use the industry's first and only scale-out NAS solution with native Hadoop

More information

ioscale: The Holy Grail for Hyperscale

ioscale: The Holy Grail for Hyperscale ioscale: The Holy Grail for Hyperscale The New World of Hyperscale Hyperscale describes new cloud computing deployments where hundreds or thousands of distributed servers support millions of remote, often

More information

The Ultimate in Scale-Out Storage for HPC and Big Data

The Ultimate in Scale-Out Storage for HPC and Big Data Node Inventory Health and Active Filesystem Throughput Monitoring Asset Utilization and Capacity Statistics Manager brings to life powerful, intuitive, context-aware real-time monitoring and proactive

More information

Cray: Enabling Real-Time Discovery in Big Data

Cray: Enabling Real-Time Discovery in Big Data Cray: Enabling Real-Time Discovery in Big Data Discovery is the process of gaining valuable insights into the world around us by recognizing previously unknown relationships between occurrences, objects

More information

Oracle Big Data SQL Technical Update

Oracle Big Data SQL Technical Update Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical

More information

EXPLORATION TECHNOLOGY REQUIRES A RADICAL CHANGE IN DATA ANALYSIS

EXPLORATION TECHNOLOGY REQUIRES A RADICAL CHANGE IN DATA ANALYSIS EXPLORATION TECHNOLOGY REQUIRES A RADICAL CHANGE IN DATA ANALYSIS EMC Isilon solutions for oil and gas EMC PERSPECTIVE TABLE OF CONTENTS INTRODUCTION: THE HUNT FOR MORE RESOURCES... 3 KEEPING PACE WITH

More information

Microsoft Private Cloud Fast Track

Microsoft Private Cloud Fast Track Microsoft Private Cloud Fast Track Microsoft Private Cloud Fast Track is a reference architecture designed to help build private clouds by combining Microsoft software with Nutanix technology to decrease

More information

Microsoft Windows Server Hyper-V in a Flash

Microsoft Windows Server Hyper-V in a Flash Microsoft Windows Server Hyper-V in a Flash Combine Violin s enterprise-class storage arrays with the ease and flexibility of Windows Storage Server in an integrated solution to achieve higher density,

More information

I/O Considerations in Big Data Analytics

I/O Considerations in Big Data Analytics Library of Congress I/O Considerations in Big Data Analytics 26 September 2011 Marshall Presser Federal Field CTO EMC, Data Computing Division 1 Paradigms in Big Data Structured (relational) data Very

More information

EMC XTREMIO EXECUTIVE OVERVIEW

EMC XTREMIO EXECUTIVE OVERVIEW EMC XTREMIO EXECUTIVE OVERVIEW COMPANY BACKGROUND XtremIO develops enterprise data storage systems based completely on random access media such as flash solid-state drives (SSDs). By leveraging the underlying

More information

DDN updates object storage platform as it aims to break out of HPC niche

Analyst: Simon Robinson, 18 Oct, 2013. DataDirect Networks has refreshed its Web Object Scaler (WOS), the company's platform for efficiently

SQL Server 2012 Parallel Data Warehouse. Solution Brief

Contents: Introduction; Microsoft Platform: Windows Server and SQL Server; SQL Server 2012 Parallel Data Warehouse; Built for Big Data...

FLASH STORAGE SOLUTION

Invest in the right FLASH STORAGE SOLUTION: A guide for the savvy tech buyer. Introduction: You're looking at flash storage because you see it's taking the storage world by storm. You're interested in accelerating

Apache Hadoop: The Big Data Refinery

Architecting the Future of Big Data. Whitepaper. Introduction: Big data has become an extremely popular term, due to the well-documented explosion in the amount of data

StarWind Virtual SAN for Microsoft SOFS

Cutting down SMB and ROBO virtualization cost by using less hardware with Microsoft Scale-Out File Server (SOFS). By Greg Schulz, Founder and Senior Advisory Analyst

Virtualizing Apache Hadoop. June, 2012

Table of Contents: EXECUTIVE SUMMARY; INTRODUCTION; VIRTUALIZING APACHE HADOOP; INTRODUCTION TO VSPHERE TM; USE CASES AND ADVANTAGES OF VIRTUALIZING HADOOP; MYTHS ABOUT RUNNING

SQL Server 2012 Performance White Paper

Published: April 2012. Applies to: SQL Server 2012. Copyright: The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication.

Pentaho High-Performance Big Data Reference Configurations using Cisco Unified Computing System

By Jake Cornelius, Senior Vice President of Products, Pentaho. June 1, 2012. Pentaho Delivers High-Performance

Dell Reference Configuration for Hortonworks Data Platform

A Quick Reference Configuration Guide. Armando Acosta, Hadoop Product Manager, Dell Revolutionary Cloud and Big Data Group; Kris Applegate, Solution

How Solace Message Routers Reduce the Cost of IT Infrastructure

This paper explains how Solace's innovative solution can significantly reduce the total cost of ownership of your messaging middleware platform and IT

IBM Enterprise Linux Server

IBM Systems and Technology Group, February 2011. Impressive simplification with leading scalability, high availability and security. Table of Contents: Executive Summary; Our

III Big Data Technologies

Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution

Tackling Big Data with High-Performance

Worldwide Headquarters: 211 North Union Street, Suite 105, Alexandria, VA 22314, USA. P.571.296.8060 F.508.988.7881 www.idc-gi.com Tackling Big Data with High-Performance Computing. WHITE PA

Protecting Information in a Smarter Data Center with the Performance of Flash

89 Fifth Avenue, 7th Floor, New York, NY 10003. www.theedison.com 212.367.7400. IBM FlashSystem and IBM ProtecTIER. Printed in

Software-defined Storage Architecture for Analytics Computing

Arati Joshi, Performance Engineering; Colin Eldridge, File System Engineering; Carlos Carrero, Product Management. June 2015. Reference Architecture

Flash Memory Arrays Enabling the Virtualized Data Center. July 2010

This White Paper describes a new product category, the flash Memory Array,

Business-centric Storage FUJITSU Hyperscale Storage System ETERNUS CD10000

Clear the way for new business opportunities. Unlock the power of data. Overcoming storage limitations: Unpredictable data growth

Microsoft SQL Server on Stratus ftserver Systems

WHITE PAPER. Security, scalability and reliability at its best. Uptime that approaches six nines. Significant cost savings for your business. Only from

Deploying Flash-Accelerated Hadoop with InfiniFlash from SanDisk

WHITE PAPER. 951 SanDisk Drive, Milpitas, CA 95035. 2015 SanDisk Corporation. All rights reserved. www.sandisk.com Table of Contents: Introduction

Optimizing Storage for Better TCO in Oracle Environments. Part 1: Management INFOSTOR. Executive Brief

A QuinStreet Executive Brief. 2012. To the casual observer, and even to business decision makers who don't work in information

Direct Scale-out Flash Storage: Data Path Evolution for the Flash Storage Era

Enterprise Strategy Group: Getting to the bigger truth. White Paper. Apeiron introduces NVMe-based storage innovation designed

Integrated Grid Solutions from SAS, EMC Isilon and Greenplum

EMC Perspective. Introduction: Intensifying competitive pressure and vast growth in the capabilities of analytic computing platforms are driving

Maximizing Hadoop Performance and Storage Capacity with AltraHD TM

Executive Summary: The explosion of internet data, driven in large part by the growth of more and more powerful mobile devices, has created

BIG DATA TRENDS AND TECHNOLOGIES

THE WORLD OF DATA IS CHANGING. Cloud. WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using on-hand database management tools.

Five Technology Trends for Improved Business Intelligence Performance

TechTarget Enterprise Applications Media E-Book. The demand for business intelligence data only continues to increase, putting BI vendors

Kaminario K2 All-Flash Array

The Kaminario K2 all-flash storage array delivers predictable performance, cost, scale, resiliency and simplicity so organizations can handle ever-changing and unforeseen business

CDH AND BUSINESS CONTINUITY:

WHITE PAPER. An overview of the availability, data protection and disaster recovery features in Hadoop. Abstract: Using the sophisticated built-in capabilities of CDH for tunable

Platfora Big Data Analytics

ISV Partner Solution Case Study: Platfora Big Data Analytics and Cisco Unified Computing System. Platfora, the leading enterprise big data analytics platform built natively on Hadoop and Spark, delivers

Make the Most of Big Data to Drive Innovation Through Research

White Paper. Bob Burwell, NetApp. November 2012. WP-7172. Abstract: Monumental data growth is a fact of life in research universities. The ability

Unisys ClearPath Forward Fabric Based Platform to Power the Weather Enterprise

Introducing Unisys All in One software based weather platform designed to reduce server space, streamline operations, consolidate

BEYOND TOOLS: BUSINESS INTELLIGENCE MEETS ANALYTICS. FLASH-OPTIMIZED STORAGE IS TRANSFORMING THE DATA CENTER

White Paper - February 2016. Flash-Optimized

Red Hat Storage Server

Marcel Hergaarden, Solution Architect, Red Hat. marcel.hergaarden@redhat.com May 23, 2013. Unstoppable, OpenSource Software-based Storage Solution. The Foundation for the Modern Hybrid

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved.

Big Data Is Less About Size, And More About Freedom TechCrunch Total data: bigger than big data 451 Group Findings: Big Data Is More Extreme Than Volume Gartner

The Flash-Transformed Server Platform: Maximizing Your Migration from Windows Server 2003 with a SanDisk Flash-enabled Server Platform

WHITE PAPER. www.SanDisk.com Table of Contents: Windows Server 2003

Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities

Technology Insight Paper. By John Webster, February 2015. Enabling you to make the best technology decisions. Enabling

EMC's Enterprise Hadoop Solution. By Julie Lockner, Senior Analyst, and Terri McClure, Senior Analyst

White Paper. EMC's Enterprise Hadoop Solution: Isilon Scale-out NAS and Greenplum HD. February 2012. This ESG White Paper was commissioned

SAS IT Intelligence for VMware Infrastructure: Resource Optimization and Cost Recovery Frank Lieble, SAS Institute Inc.

Paper 346-2009. ABSTRACT: SAS and VMware have collaborated on an offering that leverages

Unlock the value of data with smarter storage solutions.

Data is the currency of the new economy.... At HGST, we believe in the value of data, and we're helping the world harness its power.... Data is

Fast, Low-Overhead Encryption for Apache Hadoop*

Solution Brief. Intel Xeon Processors. Intel Advanced Encryption Standard New Instructions (Intel AES-NI). The Intel Distribution for Apache Hadoop* software

New Hitachi Virtual Storage Platform Family. Name Date

Familiar Challenges and Big Transformations: Too Much Information; Too Much Complexity; 24 x 7 Expectations; Continually Rising Costs; Software-Defined

Maxta Storage Platform Enterprise Storage Re-defined

WHITE PAPER. Software-Defined Data Center: The Software-Defined Data Center (SDDC) is a unified data center platform that delivers converged computing,

Big data management with IBM General Parallel File System

Optimize storage management and boost your return on investment. Highlights: Handles the explosive growth of structured and unstructured data; Offers

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM

Sneha D. Borkar (1), Prof. Chaitali S. Surtakar (2). (1) Student of B.E., Information Technology, J.D.I.E.T, sborkar95@gmail.com. (2) Assistant Professor, Information

INCREASING EFFICIENCY WITH EASY AND COMPREHENSIVE STORAGE MANAGEMENT

UNPRECEDENTED OBSERVABILITY, COST-SAVING PERFORMANCE ACCELERATION, AND SUPERIOR DATA PROTECTION. KEY FEATURES: Unprecedented observability

SOLUTION BRIEF KEY CONSIDERATIONS FOR BACKUP AND RECOVERY

Among the priorities for efficient storage management is an appropriate protection architecture. This paper will examine how to architect storage

Everything you need to know about flash storage performance

The unique characteristics of flash make performance validation testing immensely challenging and critically important; follow these best practices

Building the Business Case for Cloud: Real Ways Private Cloud Can Benefit Your Organization

In This Paper: Leveraging cloud technology can help drive down costs while enabling service-oriented IT. Private and hybrid cloud approaches improve

Cisco Data Preparation

Data Sheet. Unleash your business analysts to develop the insights that drive better business outcomes, sooner, from all your data. As self-service business intelligence (BI) and

FLASH 15 MINUTE GUIDE DELIVER MORE VALUE AT LOWER COST WITH XTREMIO ALL-FLASH ARRAY Unparalleled performance with in-line data services all the time

OVERVIEW: Opportunities to truly innovate are rare.

Dell Cloudera Syncsort Data Warehouse Optimization ETL Offload

Drive operational efficiency and lower data transformation costs with a Reference Architecture for an end-to-end optimization and offload

Accelerate > Converged Storage Infrastructure. DDN Case Study. ddn.com. 2013 DataDirect Networks. All Rights Reserved

The University of Florida's (ICBR) offers access to cutting-edge technologies designed to enable

5 KEY QUESTIONS FOR BIG DATA STORAGE STRATEGIES

TECHNOLOGY IN DEPTH. And Comprehensive Answers from DDN GRIDScaler. OCTOBER 2012. Big data is an increasingly complex and differentiated workflow with unique