Enabling High performance Big Data platform with RDMA
|
|
- Marshall Tucker
- 8 years ago
- Views:
Transcription
1 Enabling High performance Big Data platform with RDMA Tong Liu HPC Advisory Council Oct 7 th, 2014
2 Shortcomings of Hadoop Administration tooling Performance Reliability SQL support Backup and recovery From 451 Research 2013 Hadoop survey 2
3 Where can we improve Hadoop? Hive Pig Map Reduce SQL (e.g. Impala) HBase HDFS (Hadoop Distributed File System) High demand to improve Real-time operation Fast execution Streaming data Issues Inherent data latency issue with HDFS Cannot support large number of small files Efficiency of Map Reduce, Hbase, Hive, etc. 3
4 HDFS Operation NameNode HDFS Federation NameNode Client Read Write DataNode DataNode DataNode Replication 4 8 Replication HDFS Federation Faster Disks Faster CPU and Memory Bigger network pipe 4
5 HARDWARE KERNEL USER RDMA (Remote Directory Memory Access) Application 1 Buffer 1 Application Buffer 1 2 OS Buffer 1 Buffer 1 Buffer 1 Buffer 1 OS RDMA over InfiniBand or Ethernet HCA HCA NIC Buffer 1 Buffer 1 NIC TCP/IP RACK 1 RACK 2 5
6 RDMA: Critical for Efficient Data Movement ZERO Copy Remote Data Transfer USER KERNEL Application Buffer Application Buffer HARDWARE Kernel Bypass Protocol Offload Low Latency, High Performance Data Transfers InfiniBand - 56Gb/s RoCE * RDMA over Converged Ethernet 6
7 HDFS Operation with RDMA NameNode NameNode Client Read Write DataNode DataNode DataNode Replication 4 8 Replication
8 HDFS RDMA Acceleration Solution 1 Hadoop HDFS-RDMA acceleration: 100% java code written on top of JXIO Same memory footprint as the vanilla client/server uses First results show double performance for HDFS WRITE operation With 3 replications compared to vanilla 8
9 Accelio, High-Performance Reliable Messaging and RPC Library Open source && Faster RDMA integration to application Maximize message and CPU parallelism 9
10 HDFS RDMA Acceleration Solution 2 Package available at: Big performance gain with RDMA support 10
11 Map Reduce Workflow 11
12 RDMA-Enabled MapReduce Unstructured Data Accelerator - UDA Uses RDMA to do the Shuffle & Merge Plug-in architecture Open-source Supported Hadoop Distributions Apache 3.0, Apache 2.2.x, Apache 1.3 Cloudera Distribution Hadoop 4.4 Inbox 12
13 Storage Limitations for Hadoop Hadoop using local disk to maintain data locality and reduce latency High-value that resides on external storage systems Copy data onto HDFS, run Analytics, and then copy the results to another system Wasting storage space As data sources increase, managing data is nightmare Option of just accessing the external data without having to deal with copying Need to provide performance 13
14 Storage: From Scale-Up to Scale-Out Scale-out storage systems using distributed computing architectures Scalable and resilient 14
15 Sequential Read Performance (singe port) 15
16 Fastest and Lowest Latency Storage Access with iser K 4K IO Size iscsi (TCP/IP) 1 x FC 8 Gb port 4 x FC 8 Gb port iser 1 x 40GbE/IB Port iser 2 x 40GbE/IB Port (+Acceleratio n) KIOPs
17 Lustre as Hadoop Storage Solution RDMA enables highest Lustre performance 17
18 Hadoop over Cloud?? Performance? Benefits: Lowering the cost of innovation Procuring large scale resources quickly Running closer to the data Simplifying Hadoop operations Concerns: Heavily utilized, rather than being massively provisioned Cloud storage is slower and expensive Data locality makes a big difference for performance 18
19 Fastest OpenStack Storage Access Compute Servers V V V OS M OS M OS M Storage Servers OpenStack (Cinder) iscsi/iser Target (tgt) Hypervisor (KVM) Open-iSCSI w iser RDMA Cache Adapter Adapter Local Disks Using RDMA to accelerate iscsi storage RDMA Capable Interconnect Using OpenStack Built-in components and management RDMA is already inbox and used by OpenStack RDMA enables faster performance, with much lower CPU% 19
20 Fast Interconnect with RDMA to Boost Big Data 4X Faster Run Time! Benchmark: TestDFSIO (1TeraByte, 100 files) 2X Higher Performance! Benchmark: 1M Records Workload (4M Operations) 2X faster run time and 2X higher throughput 2X Faster Run Time! Benchmark: MemCacheD Operations 3X Faster Run Time! Benchmark: Redis Operations 20
21 RDMA Can Accelerate All Layers Compute I/O Nodes Filesystem Storage 21
22 What s Happening with Big Data Platform Big Data Meets HPC! 22
23 Questions? All trademarks are property of their respective owners. All information is provided As-Is without any kind of warranty. The HPC Advisory Council makes no representation to the accuracy and completeness of the information contained herein. HPC Advisory Council Mellanox undertakes no duty and assumes no obligation to update or correct any information presented herein 23 23
State of the Art Cloud Infrastructure
State of the Art Cloud Infrastructure Motti Beck, Director Enterprise Market Development WHD Global I April 2014 Next Generation Data Centers Require Fast, Smart Interconnect Software Defined Networks
More informationSMB Direct for SQL Server and Private Cloud
SMB Direct for SQL Server and Private Cloud Increased Performance, Higher Scalability and Extreme Resiliency June, 2014 Mellanox Overview Ticker: MLNX Leading provider of high-throughput, low-latency server
More informationHigh Performance OpenStack Cloud. Eli Karpilovski Cloud Advisory Council Chairman
High Performance OpenStack Cloud Eli Karpilovski Cloud Advisory Council Chairman Cloud Advisory Council Our Mission Development of next generation cloud architecture Providing open specification for cloud
More informationSolving I/O Bottlenecks to Enable Superior Cloud Efficiency
WHITE PAPER Solving I/O Bottlenecks to Enable Superior Cloud Efficiency Overview...1 Mellanox I/O Virtualization Features and Benefits...2 Summary...6 Overview We already have 8 or even 16 cores on one
More informationComparing SMB Direct 3.0 performance over RoCE, InfiniBand and Ethernet. September 2014
Comparing SMB Direct 3.0 performance over RoCE, InfiniBand and Ethernet Anand Rangaswamy September 2014 Storage Developer Conference Mellanox Overview Ticker: MLNX Leading provider of high-throughput,
More informationUnstructured Data Accelerator (UDA) Author: Motti Beck, Mellanox Technologies Date: March 27, 2012
Unstructured Data Accelerator (UDA) Author: Motti Beck, Mellanox Technologies Date: March 27, 2012 1 Market Trends Big Data Growing technology deployments are creating an exponential increase in the volume
More informationDriving IBM BigInsights Performance Over GPFS Using InfiniBand+RDMA
WHITE PAPER April 2014 Driving IBM BigInsights Performance Over GPFS Using InfiniBand+RDMA Executive Summary...1 Background...2 File Systems Architecture...2 Network Architecture...3 IBM BigInsights...5
More informationCan High-Performance Interconnects Benefit Memcached and Hadoop?
Can High-Performance Interconnects Benefit Memcached and Hadoop? D. K. Panda and Sayantan Sur Network-Based Computing Laboratory Department of Computer Science and Engineering The Ohio State University,
More informationMaximizing Hadoop Performance and Storage Capacity with AltraHD TM
Maximizing Hadoop Performance and Storage Capacity with AltraHD TM Executive Summary The explosion of internet data, driven in large part by the growth of more and more powerful mobile devices, has created
More informationAchieving Real-Time Business Solutions Using Graph Database Technology and High Performance Networks
WHITE PAPER July 2014 Achieving Real-Time Business Solutions Using Graph Database Technology and High Performance Networks Contents Executive Summary...2 Background...3 InfiniteGraph...3 High Performance
More informationBuilding a Scalable Storage with InfiniBand
WHITE PAPER Building a Scalable Storage with InfiniBand The Problem...1 Traditional Solutions and their Inherent Problems...2 InfiniBand as a Key Advantage...3 VSA Enables Solutions from a Core Technology...5
More informationStorage, Cloud, Web 2.0, Big Data Driving Growth
Storage, Cloud, Web 2.0, Big Data Driving Growth Kevin Deierling Vice President of Marketing October 25, 2013 Delivering the Highest ROI Across all Markets HPC Web 2.0 DB/Enterprise Cloud Financial Services
More informationHadoop: Embracing future hardware
Hadoop: Embracing future hardware Suresh Srinivas @suresh_m_s Page 1 About Me Architect & Founder at Hortonworks Long time Apache Hadoop committer and PMC member Designed and developed many key Hadoop
More informationA Micro-benchmark Suite for Evaluating Hadoop RPC on High-Performance Networks
A Micro-benchmark Suite for Evaluating Hadoop RPC on High-Performance Networks Xiaoyi Lu, Md. Wasi- ur- Rahman, Nusrat Islam, and Dhabaleswar K. (DK) Panda Network- Based Compu2ng Laboratory Department
More informationLS DYNA Performance Benchmarks and Profiling. January 2009
LS DYNA Performance Benchmarks and Profiling January 2009 Note The following research was performed under the HPC Advisory Council activities AMD, Dell, Mellanox HPC Advisory Council Cluster Center The
More informationStorage Architectures for Big Data in the Cloud
Storage Architectures for Big Data in the Cloud Sam Fineberg HP Storage CT Office/ May 2013 Overview Introduction What is big data? Big Data I/O Hadoop/HDFS SAN Distributed FS Cloud Summary Research Areas
More informationAccelerating and Simplifying Apache
Accelerating and Simplifying Apache Hadoop with Panasas ActiveStor White paper NOvember 2012 1.888.PANASAS www.panasas.com Executive Overview The technology requirements for big data vary significantly
More informationMellanox Cloud and Database Acceleration Solution over Windows Server 2012 SMB Direct
Mellanox Cloud and Database Acceleration Solution over Windows Server 2012 Direct Increased Performance, Scaling and Resiliency July 2012 Motti Beck, Director, Enterprise Market Development Motti@mellanox.com
More informationMellanox Accelerated Storage Solutions
Mellanox Accelerated Storage Solutions Moving Data Efficiently In an era of exponential data growth, storage infrastructures are being pushed to the limits of their capacity and data delivery capabilities.
More informationECLIPSE Performance Benchmarks and Profiling. January 2009
ECLIPSE Performance Benchmarks and Profiling January 2009 Note The following research was performed under the HPC Advisory Council activities AMD, Dell, Mellanox, Schlumberger HPC Advisory Council Cluster
More informationSimplifying Big Data Deployments in Cloud Environments with Mellanox Interconnects and QualiSystems Orchestration Solutions
Simplifying Big Data Deployments in Cloud Environments with Mellanox Interconnects and QualiSystems Orchestration Solutions 64% of organizations were investing or planning to invest on Big Data technology
More informationModernizing Hadoop Architecture for Superior Scalability, Efficiency & Productive Throughput. ddn.com
DDN Technical Brief Modernizing Hadoop Architecture for Superior Scalability, Efficiency & Productive Throughput. A Fundamentally Different Approach To Enterprise Analytics Architecture: A Scalable Unit
More informationIntroduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data
Introduction to Hadoop HDFS and Ecosystems ANSHUL MITTAL Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Topics The goal of this presentation is to give
More informationHADOOP MOCK TEST HADOOP MOCK TEST I
http://www.tutorialspoint.com HADOOP MOCK TEST Copyright tutorialspoint.com This section presents you various set of Mock Tests related to Hadoop Framework. You can download these sample mock tests at
More informationHyper-V over SMB Remote File Storage support in Windows Server 8 Hyper-V. Jose Barreto Principal Program Manager Microsoft Corporation
Hyper-V over SMB Remote File Storage support in Windows Server 8 Hyper-V Jose Barreto Principal Program Manager Microsoft Corporation Agenda Hyper-V over SMB - Overview How to set it up Configuration Options
More informationHyper-V over SMB: Remote File Storage Support in Windows Server 2012 Hyper-V. Jose Barreto Principal Program Manager Microsoft Corporation
Hyper-V over SMB: Remote Storage Support in Windows Server 2012 Hyper-V Jose Barreto Principal Program Manager Microsoft Corporation Abstract In this session, we cover the Windows Server 2012 Hyper-V support
More informationCSE-E5430 Scalable Cloud Computing Lecture 2
CSE-E5430 Scalable Cloud Computing Lecture 2 Keijo Heljanko Department of Computer Science School of Science Aalto University keijo.heljanko@aalto.fi 14.9-2015 1/36 Google MapReduce A scalable batch processing
More informationSMB Advanced Networking for Fault Tolerance and Performance. Jose Barreto Principal Program Managers Microsoft Corporation
SMB Advanced Networking for Fault Tolerance and Performance Jose Barreto Principal Program Managers Microsoft Corporation Agenda SMB Remote File Storage for Server Apps SMB Direct (SMB over RDMA) SMB Multichannel
More informationI/O Considerations in Big Data Analytics
Library of Congress I/O Considerations in Big Data Analytics 26 September 2011 Marshall Presser Federal Field CTO EMC, Data Computing Division 1 Paradigms in Big Data Structured (relational) data Very
More informationIntegrated Grid Solutions. and Greenplum
EMC Perspective Integrated Grid Solutions from SAS, EMC Isilon and Greenplum Introduction Intensifying competitive pressure and vast growth in the capabilities of analytic computing platforms are driving
More informationBig Data Analytics - Accelerated. stream-horizon.com
Big Data Analytics - Accelerated stream-horizon.com StreamHorizon & Big Data Integrates into your Data Processing Pipeline Seamlessly integrates at any point of your your data processing pipeline Implements
More informationConnecting the Clouds
Connecting the Clouds Mellanox Connected Clouds Mellanox s Ethernet and InfiniBand interconnects enable and enhance worldleading cloud infrastructures around the globe. Utilizing Mellanox s fast server
More informationIOmark- VDI. Nimbus Data Gemini Test Report: VDI- 130906- a Test Report Date: 6, September 2013. www.iomark.org
IOmark- VDI Nimbus Data Gemini Test Report: VDI- 130906- a Test Copyright 2010-2013 Evaluator Group, Inc. All rights reserved. IOmark- VDI, IOmark- VDI, VDI- IOmark, and IOmark are trademarks of Evaluator
More informationAccelerating Spark with RDMA for Big Data Processing: Early Experiences
Accelerating Spark with RDMA for Big Data Processing: Early Experiences Xiaoyi Lu, Md. Wasi- ur- Rahman, Nusrat Islam, Dip7 Shankar, and Dhabaleswar K. (DK) Panda Network- Based Compu2ng Laboratory Department
More informationEMC IRODS RESOURCE DRIVERS
EMC IRODS RESOURCE DRIVERS PATRICK COMBES: PRINCIPAL SOLUTION ARCHITECT, LIFE SCIENCES 1 QUICK AGENDA Intro to Isilon (~2 hours) Isilon resource driver Intro to ECS (~1.5 hours) ECS Resource driver Possibilities
More informationDriving Revenue Growth Across Multiple Markets
Driving Revenue Growth Across Multiple Markets Kevin Deierling Vice President, Marketing October 24, 2014 Multiple Growing Market Segments Key Markets Driving Growth High-Performance Computing Web 2.0
More informationBenchmarking Hadoop & HBase on Violin
Technical White Paper Report Technical Report Benchmarking Hadoop & HBase on Violin Harnessing Big Data Analytics at the Speed of Memory Version 1.0 Abstract The purpose of benchmarking is to show advantages
More informationAn Oracle White Paper June 2012. High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database
An Oracle White Paper June 2012 High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database Executive Overview... 1 Introduction... 1 Oracle Loader for Hadoop... 2 Oracle Direct
More informationBig Fast Data Hadoop acceleration with Flash. June 2013
Big Fast Data Hadoop acceleration with Flash June 2013 Agenda The Big Data Problem What is Hadoop Hadoop and Flash The Nytro Solution Test Results The Big Data Problem Big Data Output Facebook Traditional
More informationHadoopTM Analytics DDN
DDN Solution Brief Accelerate> HadoopTM Analytics with the SFA Big Data Platform Organizations that need to extract value from all data can leverage the award winning SFA platform to really accelerate
More informationInstalling Hadoop over Ceph, Using High Performance Networking
WHITE PAPER March 2014 Installing Hadoop over Ceph, Using High Performance Networking Contents Background...2 Hadoop...2 Hadoop Distributed File System (HDFS)...2 Ceph...2 Ceph File System (CephFS)...3
More informationBig Data: A Storage Systems Perspective Muthukumar Murugan Ph.D. HP Storage Division
Big Data: A Storage Systems Perspective Muthukumar Murugan Ph.D. HP Storage Division In this talk Big data storage: Current trends Issues with current storage options Evolution of storage to support big
More informationInfiniBand Software and Protocols Enable Seamless Off-the-shelf Applications Deployment
December 2007 InfiniBand Software and Protocols Enable Seamless Off-the-shelf Deployment 1.0 Introduction InfiniBand architecture defines a high-bandwidth, low-latency clustering interconnect that is used
More informationConnecting Flash in Cloud Storage
Connecting Flash in Cloud Storage Kevin Deierling Vice President Mellanox Technologies kevind AT mellanox.com Santa Clara, CA 1 Five Key Requirements for Connecting Flash Storage in the Cloud 1. Economical
More informationVirtualizing Apache Hadoop. June, 2012
June, 2012 Table of Contents EXECUTIVE SUMMARY... 3 INTRODUCTION... 3 VIRTUALIZING APACHE HADOOP... 4 INTRODUCTION TO VSPHERE TM... 4 USE CASES AND ADVANTAGES OF VIRTUALIZING HADOOP... 4 MYTHS ABOUT RUNNING
More informationMellanox Academy Online Training (E-learning)
Mellanox Academy Online Training (E-learning) 2013-2014 30 P age Mellanox offers a variety of training methods and learning solutions for instructor-led training classes and remote online learning (e-learning),
More informationOracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here>
s Big Data solutions Roger Wullschleger DBTA Workshop on Big Data, Cloud Data Management and NoSQL 10. October 2012, Stade de Suisse, Berne 1 The following is intended to outline
More informationCSE 590: Special Topics Course ( Supercomputing ) Lecture 10 ( MapReduce& Hadoop)
CSE 590: Special Topics Course ( Supercomputing ) Lecture 10 ( MapReduce& Hadoop) Rezaul A. Chowdhury Department of Computer Science SUNY Stony Brook Spring 2016 MapReduce MapReduce is a programming model
More informationEMC s Enterprise Hadoop Solution. By Julie Lockner, Senior Analyst, and Terri McClure, Senior Analyst
White Paper EMC s Enterprise Hadoop Solution Isilon Scale-out NAS and Greenplum HD By Julie Lockner, Senior Analyst, and Terri McClure, Senior Analyst February 2012 This ESG White Paper was commissioned
More informationCDH AND BUSINESS CONTINUITY:
WHITE PAPER CDH AND BUSINESS CONTINUITY: An overview of the availability, data protection and disaster recovery features in Hadoop Abstract Using the sophisticated built-in capabilities of CDH for tunable
More informationNews and trends in Data Warehouse Automation, Big Data and BI. Johan Hendrickx & Dirk Vermeiren
News and trends in Data Warehouse Automation, Big Data and BI Johan Hendrickx & Dirk Vermeiren Extreme Agility from Source to Analysis DWH Appliances & DWH Automation Typical Architecture 3 What Business
More informationMaximizing Hadoop Performance with Hardware Compression
Maximizing Hadoop Performance with Hardware Compression Robert Reiner Director of Marketing Compression and Security Exar Corporation November 2012 1 What is Big? sets whose size is beyond the ability
More informationHadoop on the Gordon Data Intensive Cluster
Hadoop on the Gordon Data Intensive Cluster Amit Majumdar, Scientific Computing Applications Mahidhar Tatineni, HPC User Services San Diego Supercomputer Center University of California San Diego Dec 18,
More informationIntroduction to Cloud Design Four Design Principals For IaaS
WHITE PAPER Introduction to Cloud Design Four Design Principals For IaaS What is a Cloud...1 Why Mellanox for the Cloud...2 Design Considerations in Building an IaaS Cloud...2 Summary...4 What is a Cloud
More informationApache HBase. Crazy dances on the elephant back
Apache HBase Crazy dances on the elephant back Roman Nikitchenko, 16.10.2014 YARN 2 FIRST EVER DATA OS 10.000 nodes computer Recent technology changes are focused on higher scale. Better resource usage
More informationCase Study : 3 different hadoop cluster deployments
Case Study : 3 different hadoop cluster deployments Lee moon soo moon@nflabs.com HDFS as a Storage Last 4 years, our HDFS clusters, stored Customer 1500 TB+ data safely served 375,000 TB+ data to customer
More informationWelcome to the unit of Hadoop Fundamentals on Hadoop architecture. I will begin with a terminology review and then cover the major components
Welcome to the unit of Hadoop Fundamentals on Hadoop architecture. I will begin with a terminology review and then cover the major components of Hadoop. We will see what types of nodes can exist in a Hadoop
More informationLS-DYNA Best-Practices: Networking, MPI and Parallel File System Effect on LS-DYNA Performance
11 th International LS-DYNA Users Conference Session # LS-DYNA Best-Practices: Networking, MPI and Parallel File System Effect on LS-DYNA Performance Gilad Shainer 1, Tong Liu 2, Jeff Layton 3, Onur Celebioglu
More informationBuilding Enterprise-Class Storage Using 40GbE
Building Enterprise-Class Storage Using 40GbE Unified Storage Hardware Solution using T5 Executive Summary This white paper focuses on providing benchmarking results that highlight the Chelsio T5 performance
More informationFrom Ethernet Ubiquity to Ethernet Convergence: The Emergence of the Converged Network Interface Controller
White Paper From Ethernet Ubiquity to Ethernet Convergence: The Emergence of the Converged Network Interface Controller The focus of this paper is on the emergence of the converged network interface controller
More informationTHE EMC ISILON STORY. Big Data In The Enterprise. Copyright 2012 EMC Corporation. All rights reserved.
THE EMC ISILON STORY Big Data In The Enterprise 2012 1 Big Data In The Enterprise Isilon Overview Isilon Technology Summary 2 What is Big Data? 3 The Big Data Challenge File Shares 90 and Archives 80 Bioinformatics
More informationHadoop Ecosystem B Y R A H I M A.
Hadoop Ecosystem B Y R A H I M A. History of Hadoop Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used text search library. Hadoop has its origins in Apache Nutch, an open
More informationOutline. High Performance Computing (HPC) Big Data meets HPC. Case Studies: Some facts about Big Data Technologies HPC and Big Data converging
Outline High Performance Computing (HPC) Towards exascale computing: a brief history Challenges in the exascale era Big Data meets HPC Some facts about Big Data Technologies HPC and Big Data converging
More informationHadoop MapReduce over Lustre* High Performance Data Division Omkar Kulkarni April 16, 2013
Hadoop MapReduce over Lustre* High Performance Data Division Omkar Kulkarni April 16, 2013 * Other names and brands may be claimed as the property of others. Agenda Hadoop Intro Why run Hadoop on Lustre?
More informationHigh Performance Computing OpenStack Options. September 22, 2015
High Performance Computing OpenStack PRESENTATION TITLE GOES HERE Options September 22, 2015 Today s Presenters Glyn Bowden, SNIA Cloud Storage Initiative Board HP Helion Professional Services Alex McDonald,
More informationData-Intensive Programming. Timo Aaltonen Department of Pervasive Computing
Data-Intensive Programming Timo Aaltonen Department of Pervasive Computing Data-Intensive Programming Lecturer: Timo Aaltonen University Lecturer timo.aaltonen@tut.fi Assistants: Henri Terho and Antti
More informationBig Data With Hadoop
With Saurabh Singh singh.903@osu.edu The Ohio State University February 11, 2016 Overview 1 2 3 Requirements Ecosystem Resilient Distributed Datasets (RDDs) Example Code vs Mapreduce 4 5 Source: [Tutorials
More informationOpen source software framework designed for storage and processing of large scale data on clusters of commodity hardware
Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware Created by Doug Cutting and Mike Carafella in 2005. Cutting named the program after
More informationHADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics
HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics ESSENTIALS EMC ISILON Use the industry's first and only scale-out NAS solution with native Hadoop
More informationA REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM
A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM Sneha D.Borkar 1, Prof.Chaitali S.Surtakar 2 Student of B.E., Information Technology, J.D.I.E.T, sborkar95@gmail.com Assistant Professor, Information
More informationStarWind Virtual SAN for Microsoft SOFS
StarWind Virtual SAN for Microsoft SOFS Cutting down SMB and ROBO virtualization cost by using less hardware with Microsoft Scale-Out File Server (SOFS) By Greg Schulz Founder and Senior Advisory Analyst
More informationHadoop Architecture. Part 1
Hadoop Architecture Part 1 Node, Rack and Cluster: A node is simply a computer, typically non-enterprise, commodity hardware for nodes that contain data. Consider we have Node 1.Then we can add more nodes,
More informationThe virtualization of SAP environments to accommodate standardization and easier management is gaining momentum in data centers.
White Paper Virtualized SAP: Optimize Performance with Cisco Data Center Virtual Machine Fabric Extender and Red Hat Enterprise Linux and Kernel-Based Virtual Machine What You Will Learn The virtualization
More informationHow To Scale Out Of A Nosql Database
Firebird meets NoSQL (Apache HBase) Case Study Firebird Conference 2011 Luxembourg 25.11.2011 26.11.2011 Thomas Steinmaurer DI +43 7236 3343 896 thomas.steinmaurer@scch.at www.scch.at Michael Zwick DI
More informationIntroduction to Hadoop. New York Oracle User Group Vikas Sawhney
Introduction to Hadoop New York Oracle User Group Vikas Sawhney GENERAL AGENDA Driving Factors behind BIG-DATA NOSQL Database 2014 Database Landscape Hadoop Architecture Map/Reduce Hadoop Eco-system Hadoop
More informationEnabling the Use of Data
Enabling the Use of Data Michael Kagan, CTO June 1, 2015 - Technion Computer Engineering Conference Safe Harbor Statement These slides and the accompanying oral presentation contain forward-looking statements
More informationBuilding a cost-effective and high-performing public cloud. Sander Cruiming, founder Cloud Provider
Building a cost-effective and high-performing public cloud Sander Cruiming, founder Cloud Provider 1 Agenda Introduction of Cloud Provider Overview of our previous cloud architecture Challenges of that
More informationRoCE vs. iwarp Competitive Analysis
WHITE PAPER August 21 RoCE vs. iwarp Competitive Analysis Executive Summary...1 RoCE s Advantages over iwarp...1 Performance and Benchmark Examples...3 Best Performance for Virtualization...4 Summary...
More informationBig Data Storage Options for Hadoop Sam Fineberg, HP Storage
Sam Fineberg, HP Storage SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA unless otherwise noted. Member companies and individual members may use this material in presentations
More informationAdvancing Applications Performance With InfiniBand
Advancing Applications Performance With InfiniBand Pak Lui, Application Performance Manager September 12, 2013 Mellanox Overview Ticker: MLNX Leading provider of high-throughput, low-latency server and
More informationReplacing SAN with High Performance Windows Share over a Converged Network
WHITE PAPER November 2015 Replacing SAN with High Performance Windows Share over a Converged Network Abstract...1 Introduction...1 Early FC SAN (Storage Area Network)...1 FC vs. Ethernet...1 Changing SAN
More informationProtecting Big Data Data Protection Solutions for the Business Data Lake
White Paper Protecting Big Data Data Protection Solutions for the Business Data Lake Abstract Big Data use cases are maturing and customers are using Big Data to improve top and bottom line revenues. With
More informationInfiniBand Update Addressing new I/O challenges in HPC, Cloud, and Web 2.0 infrastructures. Brian Sparks IBTA Marketing Working Group Co-Chair
InfiniBand Update Addressing new I/O challenges in HPC, Cloud, and Web 2.0 infrastructures Brian Sparks IBTA Marketing Working Group Co-Chair Page 1 IBTA & OFA Update IBTA today has over 50 members; OFA
More informationScaling Objectivity Database Performance with Panasas Scale-Out NAS Storage
White Paper Scaling Objectivity Database Performance with Panasas Scale-Out NAS Storage A Benchmark Report August 211 Background Objectivity/DB uses a powerful distributed processing architecture to manage
More informationAchieving Mainframe-Class Performance on Intel Servers Using InfiniBand Building Blocks. An Oracle White Paper April 2003
Achieving Mainframe-Class Performance on Intel Servers Using InfiniBand Building Blocks An Oracle White Paper April 2003 Achieving Mainframe-Class Performance on Intel Servers Using InfiniBand Building
More informationLecture 32 Big Data. 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop
Lecture 32 Big Data 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop 1 2 Big Data Problems Data explosion Data from users on social
More informationENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE
ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE Hadoop Storage-as-a-Service ABSTRACT This White Paper illustrates how EMC Elastic Cloud Storage (ECS ) can be used to streamline the Hadoop data analytics
More informationAn Alternative Storage Solution for MapReduce. Eric Lomascolo Director, Solutions Marketing
An Alternative Storage Solution for MapReduce Eric Lomascolo Director, Solutions Marketing MapReduce Breaks the Problem Down Data Analysis Distributes processing work (Map) across compute nodes and accumulates
More informationBig Data: Are You Ready? Kevin Lancaster
Big Data: Are You Ready? Kevin Lancaster Director, Engineered Systems Oracle Europe, Middle East & Africa 1 A Data Explosion... Traditional Data Sources Billing engines Custom developed New, Non-Traditional
More informationAt-Scale Data Centers & Demand for New Architectures
Allen Samuels At-Scale Data Centers & Demand for New Architectures Software Architect, Software and Systems Solutions August 12, 2015 1 Forward-Looking Statements During our meeting today we may make forward-looking
More informationEMC XtremSF: Delivering Next Generation Performance for Oracle Database
White Paper EMC XtremSF: Delivering Next Generation Performance for Oracle Database Abstract This white paper addresses the challenges currently facing business executives to store and process the growing
More informationWhat s new in Hyper-V 2012 R2
What s new in Hyper-V 2012 R2 Carsten Rachfahl MVP Virtual Machine Rachfahl IT-Solutions GmbH & Co KG www.hyper-v-server.de Thomas Maurer Cloud Architect & MVP itnetx gmbh www.thomasmaurer.ch Before Windows
More informationCloudera Enterprise Reference Architecture for Google Cloud Platform Deployments
Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Important Notice 2010-2015 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, Impala, and
More informationNew Storage System Solutions
New Storage System Solutions Craig Prescott Research Computing May 2, 2013 Outline } Existing storage systems } Requirements and Solutions } Lustre } /scratch/lfs } Questions? Existing Storage Systems
More informationNew Features in PSP2 for SANsymphony -V10 Software-defined Storage Platform and DataCore Virtual SAN
New Features in PSP2 for SANsymphony -V10 Software-defined Storage Platform and DataCore Virtual SAN Updated: May 19, 2015 Contents Introduction... 1 Cloud Integration... 1 OpenStack Support... 1 Expanded
More informationHadoop Optimizations for BigData Analytics
Hadoop Optimizations for BigData Analytics Weikuan Yu Auburn University Outline WBDB, Oct 2012 S-2 Background Network Levitated Merge JVM-Bypass Shuffling Fast Completion Scheduler WBDB, Oct 2012 S-3 Emerging
More informationOracle Exadata: The World s Fastest Database Machine Exadata Database Machine Architecture
Oracle Exadata: The World s Fastest Database Machine Exadata Database Machine Architecture Ron Weiss, Exadata Product Management Exadata Database Machine Best Platform to Run the
More informationBig data management with IBM General Parallel File System
Big data management with IBM General Parallel File System Optimize storage management and boost your return on investment Highlights Handles the explosive growth of structured and unstructured data Offers
More informationFast, Low-Overhead Encryption for Apache Hadoop*
Fast, Low-Overhead Encryption for Apache Hadoop* Solution Brief Intel Xeon Processors Intel Advanced Encryption Standard New Instructions (Intel AES-NI) The Intel Distribution for Apache Hadoop* software
More informationCS2510 Computer Operating Systems
CS2510 Computer Operating Systems HADOOP Distributed File System Dr. Taieb Znati Computer Science Department University of Pittsburgh Outline HDF Design Issues HDFS Application Profile Block Abstraction
More information