Modernizing Hadoop Architecture for Superior Scalability, Efficiency & Productive Throughput
DDN Technical Brief

A Fundamentally Different Approach to Enterprise Analytics Architecture: A Scalable Unit Design Leveraging Shared High-Throughput Storage to Minimize Compute TCO

Abstract: This paper examines the limitations of a traditional Hadoop architecture built on commodity compute with Direct Attached Storage (DAS). It then reviews the design imperatives of DataDirect Networks' hScaler Apache Hadoop appliance architecture and how that architecture has been engineered to eliminate the limitations that plague today's purely commodity approaches.
The Impetus for Today's Hadoop Design

At a time when commodity networking operated at 10MB/s and individual disks were capable of 80MB/s of data transfer performance, and where multiple disks could be configured either on a network or in a server chassis, the obvious mismatch in performance attributes identified by data center engineers and analysts highlighted severe efficiency challenges in then-current systems designs and the need for better approaches to data-intensive computing.

As a result of this imbalance between network and storage resources in standard data centers, and the perceived high costs of enterprise shared storage, data-intensive processing organizations began to embrace new methods of processing data in which the processing routines are brought to the data, which lives on commodity computers that participate in distributed processing of large analytic queries. The most popular approach to this style of processing today is Apache Hadoop [Hadoop]. Hadoop supports the distribution of applications across commodity hardware in a shared-nothing fashion, where each commodity server independently owns its data and data is replicated across several commodity nodes for resiliency and performance.

Hadoop implements a computational process known as map/reduce: data sets are divided into fragments, the fragments are distributed uniformly across a commodity processing cluster, and the nodes process them in parallel. This approach was developed to minimize the cost and performance overhead of data movement across commodity networks and to accelerate data processing.

Since the emergence of Hadoop, the limitations of hard drive physics have created a new imbalance: hard drive performance advancements have not kept pace with increases in networking and processing performance [see Table 1]. Today, as high-speed data center networking approaches 100Gb/s, the gradual increase in disk performance has made inefficient spinning disk technologies the new data processing bottleneck for large-scale Hadoop implementations. While today's systems are still capable of economically utilizing the performance of spinning media (as opposed to SSDs, since the workload is still predominately throughput-oriented), the classic Hadoop function-shipping model is challenged by the ever-growing need for more node-local spinning disks, and the performance utilization of this media is being challenged by the scale-out approaches of today's Hadoop data protection and distribution software.

[Table 1: Commodity Computing Advancements — growth multipliers for HDD bandwidth (MB/s), CPU cores per socket, and Ethernet (Gb/s); the numeric values were not recovered in transcription.]
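To make the map/reduce model just described concrete, the following is a minimal sketch of the canonical word-count job against the standard Apache Hadoop MapReduce Java API (it is illustrative only, not DDN code): the map phase runs against local data fragments, the Shuffle moves intermediate (word, count) pairs between nodes, and the reduce phase sums them.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: runs on the node holding each input fragment ("split"),
  // emitting (word, 1) pairs without moving the raw data off the node.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reduce phase: receives all counts for a given word after the Shuffle
  // has moved intermediate pairs across the network, and sums them.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```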
Hadoop Systems Components & Bottlenecks

To illustrate the various areas of optimization that are possible with Apache Hadoop, we will review the core design tenets and the associated configuration impact on cluster efficiency.

Data Protection: Today's data protection layer in Hadoop is commonly implemented as a three-way replicated storage configuration, in which HDFS (the Hadoop Distributed File System, a Java-based namespace and data protection framework) receives writes sequentially from the host at each of the unique nodes. This method of data protection can benefit from relinquishing the responsibility of replication from HDFS. By treating HDFS as a conventional file system, centralized storage can be employed to reduce the number of data copies to one, using high-speed RAID or erasure coding techniques to protect the data. Freeing the compute node from the burden of data replication can increase Hadoop node performance by up to 50%. An ancillary benefit of this approach is a reduction in hard drives in the Hadoop architecture by as much as 60%, with the resulting economic, data center and environmental benefits.

Job Affinity: In large cluster configurations, Hadoop jobs are routinely forced to process data that is not local to the node, breaking the paradigm of map/reduce processing. The amount of data retrieved from other nodes on the network in a particular Hadoop job can be as high as 30%. The use of centralized, RDMA-connected storage can yield an 80% decrease in I/O wait times for remote data retrieval, as compared to transferring data via TCP/IP.

Map/Reduce Shuffle Efficiency: While commodity networks are now capable of delivering performance at rates of 56Gb/s and greater, conventional network protocols are unable to encapsulate data efficiently, and TCP/IP overhead continues to consume substantial portions of CPU cycles in these data-intensive operations. Historically, SAN and HPC networking technologies have been applied to this problem, making compute nodes more efficient through protocols that maximize bandwidth while minimizing CPU overhead.

[Table 2: Hadoop Compute Comparisons in seconds — 80GB and 500GB datasets run over 1 x 40GbE versus 1 x 56Gb InfiniBand, with percentage gain; the numeric values were not recovered in transcription.]
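As a sketch of how the offloads above surface at the Hadoop configuration level, the snippet below sets HDFS to a single data copy (assuming RAID or erasure-coded shared storage now provides the protection) and swaps an RDMA shuffle implementation in through Hadoop 2's pluggable shuffle mechanism. The Mellanox UDA plugin class name is an illustrative assumption, not a DDN-specified component; substitute whatever RDMA shuffle plugin your distribution ships.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SharedStorageJobConfig {
  public static Job configure(String jobName) throws Exception {
    Configuration conf = new Configuration();

    // With RAID or erasure-coded shared storage protecting the data, HDFS
    // no longer needs three copies; a single logical copy frees the compute
    // nodes from replication work and cuts the cluster's drive count.
    conf.setInt("dfs.replication", 1);

    // Hadoop 2's pluggable shuffle lets an RDMA-capable implementation
    // replace the default HTTP/TCP shuffle on the reduce side. The class
    // name below follows Mellanox's UDA packaging and is an assumption
    // made for illustration only.
    conf.set("mapreduce.job.reduce.shuffle.consumer.plugin.class",
        "com.mellanox.hadoop.mapred.UdaShuffleConsumerPlugin");

    return Job.getInstance(conf, jobName);
  }
}
```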
While it is counter-intuitive to think that a Hadoop system demands high-speed networking when the processing is shipped to the data, the Shuffle phase of map/reduce operations can in fact move a large amount of data across a Hadoop cluster, and the speed of this operation is a direct byproduct of the networking and protocol choices made at cluster architecture time. Today, RDMA encapsulation of Shuffle data, using InfiniBand or RDMA over Converged Ethernet networking, is proving to deliver dramatic efficiency gains for Hadoop clusters.

Data Nodes and Compute Nodes: Consider the I/O profile of a typical Hadoop job. Profiling shows that a job will pause and wait for the CPU before trying to fetch the next set of data. This process serialization causes the I/O subsystem to swing alternately from saturated to idle, an inefficiency that wastes about 30% of a job's run time. Establishing compute-only nodes in a Hadoop environment can deliver material benefits versus a conventional one-node-fits-all approach. This model provides much better sequential access to data storage while dramatically reducing job resets and pauses. This parallelization is a radically different approach to job processing and can speed up jobs at a hyper-linear rate, making the cluster faster as it grows. By leveraging high-throughput, RDMA-connected storage, compute-only nodes can save as much as 30% of the time they would otherwise spend on data pipelining (a sketch of this overlap appears after this section).

Data Center Packaging: When discussing efficiency, it is easy to overlook the data center impact of commodity hardware. At a time when whole data centers are being built for map/reduce computing, the economics are increasingly difficult to ignore. By turning Hadoop systems' design convention on its head and implementing a highly-efficient, highly-dense architecture in which compute and disk resources are minimized, the resulting effect can be dramatic. Efficient configurations of Hadoop scalable compute + storage units have demonstrated the ability to reduce data center impact by as much as 60%.
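The compute/data pipelining gain described above comes from overlapping I/O with computation rather than serializing them. The generic sketch below illustrates the idea with a simple double-buffering loop; BlockSource, readBlock and process are hypothetical stand-ins invented for illustration, not Hadoop or DDN APIs.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative double-buffering loop: while the CPU processes block N,
// a background thread fetches block N+1, so the I/O subsystem is not
// left idle while compute runs (and vice versa).
public class PipelinedReader {
  interface BlockSource {
    byte[] readBlock(long index); // blocking read of one input block
    long blockCount();
  }

  public static void run(BlockSource source) throws Exception {
    ExecutorService io = Executors.newSingleThreadExecutor();
    try {
      Future<byte[]> next = io.submit(() -> source.readBlock(0));
      for (long i = 0; i < source.blockCount(); i++) {
        byte[] current = next.get();      // wait only if I/O fell behind
        final long nextIndex = i + 1;
        if (nextIndex < source.blockCount()) {
          next = io.submit(() -> source.readBlock(nextIndex)); // prefetch
        }
        process(current);                 // compute overlaps the next fetch
      }
    } finally {
      io.shutdown();
    }
  }

  private static void process(byte[] block) {
    // Stand-in for the per-block map work.
  }
}
```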
Introducing hScaler: A Fundamentally New Approach to Enterprise Analytics

hScaler is a highly engineered, tightly integrated HW/SW appliance that features the Hortonworks distribution of the Apache Hadoop platform. It leverages DDN's Storage Fusion Architecture family of high-throughput, RDMA-attached storage systems to address the many levels of inefficiency that exist in today's Apache Hadoop environments, inefficiencies that continue to grow as CPU and networking advances outpace the legacy methods of data storage management and delivery in commodity Hadoop clusters. DDN's hScaler product was first and foremost engineered to be a simple-to-deploy, simple-to-operate, scale-out analytics platform that features high availability and is factory delivered to minimize time-to-insight. To be competitive in a market dominated by commodity economics, hScaler leverages the power of the world's fastest storage technology to exploit the power of industry-standard componentry. Key aspects of the product include:

- Turnkey appliance and Hadoop process management through DDN's DirectMon analytics cluster management utility.
- Fully-integrated Hadoop and high-speed ETL tools, all supported and managed by DDN in a "one throat to choke" model.
- A scalable unit design, where compute and DDN's SFA storage are built into an appliance bundle. These appliances can be iterated out onto a network to achieve aggregate performance and capacity equivalent to an 8,000-node Hadoop cluster.
- Flexible configuration: compute and storage can be added to each scalable unit independently, ensuring that the least amount of infrastructure is consumed for any performance and capacity profile.
- A unique approach to Hadoop whereby compute nodes and data nodes are scaled independently. This reengineering of the system and job scheduling design opens up the compute node to much more complex transforms of the data, in a nearly embarrassingly parallel scalability model that alone accelerates cluster performance by upwards of 30%.

At the core of hScaler is DDN's flagship SFA12K-40 storage appliance. The system is capable of delivering up to 40GB/s of throughput and over 1.4M IOPS, making it the world's fastest storage appliance. The system is configurable with both spinning and Flash disks, enabling Hadoop to efficiently deliver performance customized to the composition of the data and the processing requirements. The system also features the highest levels of data center density in the industry, housing up to 1,680 HDDs in just two data center racks; the SFA12K-40 is up to 300% more dense than competing storage systems. DDN SFA products demonstrate up to 800% greater performance than legacy enterprise storage and uniquely enable configurations where powerful, high-throughput storage can be cost-effectively coupled with today's data-hungry Hadoop compute nodes at speeds greater than direct-attached storage. Real-time SFA performance enables mitigation of the performance impact of drive or enclosure failures, preserving sustained cluster processing performance.
Summary

While Hadoop and the map/reduce paradigm have advanced time-to-insight by orders of magnitude, today's enterprises remain challenged to adopt Hadoop technology, owing to the complexity of adopting so many new Hadoop concepts and the substantial challenges of implementing them on commodity clusters. The root cause of today's hesitation in adopting Hadoop lies in complex deployment methods, which cause IT departments to take a hands-off approach because the majority of the architecture work is done by highly-skilled data scientists. With hScaler, DDN has engineered simplicity and efficiency into this next-generation Hadoop appliance, delivering a Hadoop experience that is not only IT-friendly but focused on deriving business value at scale. By offloading every aspect of Hadoop I/O and data protection, and by packaging the cluster with a highly-resilient, dense and high-throughput data storage platform, DDN has increased map/reduce performance by up to 700%. This enables hScaler to deliver new efficiencies and substantial savings to your bottom line.

About DDN

DataDirect Networks (DDN) is the world leader in massively scalable storage. We are the leading provider of data storage and processing solutions and professional services that enable content-rich and high-growth IT environments to achieve the highest levels of systems scalability, efficiency and simplicity. DDN enables enterprises to extract value and deliver results from their information. Our customers include the world's leading online content and social networking providers, high-performance cloud and grid computing, life sciences, media production organizations, and security and intelligence organizations. Deployed in thousands of mission-critical environments worldwide, DDN's solutions have been designed, engineered and proven in the world's most scalable data centers to ensure competitive business advantage for today's information-powered enterprise. For more information, go to www.ddn.com.

© 2013 DataDirect Networks, Inc. All Rights Reserved. DataDirect Networks, hScaler, DirectMon, Storage Fusion Architecture, SFA, and SFA12K are trademarks of DataDirect Networks. All other trademarks are the property of their respective owners.
