HPC data becomes Big Data. Peter Braam
1 HPC data becomes Big Data Peter Braam
2 me
- Academia: Maths & Computer Science
- Entrepreneur with startups (5x)
  - 4 startups sold
  - Lustre emerged
  - Held executive jobs with acquirers
- 2014: Independent, advise, research
  - Advise SKA Cambridge
  - Research on automatic parallelization with Haskell community
  - Help others
Dec 2013 (C) 2013 Braam Research, All Rights Reserved
3 Contents
- Introduction: market & key questions
- Some Big Data problems & algorithms
- HPC storage
- Cloud storage
- Conclusions
4 Key questions & market trends
5 Two Questions
- Given an HPC storage system, how can it be used for Big Data analysis?
- What storage platforms are candidates to meet HPC and Big Data requirements?
6 IDC market data
- % of sites using co-processors: 28.2% / 76.9%
- HPC sites performing big data analysis: 67%
- % of compute cycles dedicated to big data: 30%
- % of sites using cloud infrastructure for HPC: 18.8% / 23.5%
- Year-over-year growth in high density servers ($): 25.5%
- Year-over-year growth in servers ($): -6.2%
7 Other facts
- Flash and much faster persistent memory tiers are inevitably coming. Multiple software challenges arise from this:
  - Management of tiers
  - Much faster storage software to keep up with devices
- Gap between disk and other system performance continues to increase
- There is embedded processing on servers with attached storage, and client-server processing with clients networked to servers. Pros & cons somewhat unclear.
8 Big Data Problems & Algorithms
9 Big Data Problems samples
- Input generally from simulation or sensors
- Climate modeling: simulate, then
  - Find the hottest day each year in Cape Town
  - Find very low pressure spots (typhoons) on Earth
- Genomics, Astronomy
  - Find patterns (e.g. strings, galaxies) in huge data sets
  - Pre-process data at TB/sec rates
- Data management
  - Move all files with data on a particular server
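The "hottest day each year" query on this slide is a group-by-key followed by a per-group maximum. A minimal sketch in Python follows; the record layout (ISO date string, temperature in Celsius), the function name, and the sample readings are all assumptions made up for illustration, not data from the talk.

```python
def hottest_day_per_year(records):
    """Return {year: (iso_date, temp_c)} with the hottest reading per year.

    `records` is any iterable of (iso_date, temp_c) pairs, so the same
    single pass works on a list or on a stream read from disk.
    """
    best = {}
    for iso_date, temp_c in records:
        year = int(iso_date[:4])  # "YYYY-MM-DD" -> YYYY
        if year not in best or temp_c > best[year][1]:
            best[year] = (iso_date, temp_c)
    return best


# Made-up sample readings, for illustration only.
sample = [
    ("2011-01-15", 31.2),
    ("2011-02-03", 35.8),
    ("2012-01-20", 33.1),
    ("2012-12-30", 29.4),
]
print(hottest_day_per_year(sample))
```

Because only one running maximum per year is kept, memory use stays small however large the input is; the cost is dominated by streaming the data once.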
10 Big Data Problems samples 2
- Social network, advertising & intelligence
  - Most of these become graph problems, some very hard
- Non-compliance in stock market transaction logs
- Replace legacy consumer information data warehousing with modern analytics
  - Replacements of Teradata / Netezza sometimes difficult
  - Modern platforms lack an easy-to-use analytics language
11 Wide variations
- Some problems (e.g. some graph problems) must be executed in RAM.
  - Graph500 benchmark: 2000x speedup in 2.5 years
- Other problems require many iterations through disk-resident data
- Netezza analytics systems use FPGAs for accelerated streaming (e.g. filtering, compressing)
12 Big Data Algorithms
Considerable variation:
- Machine learning
- Bayesian analysis
- Indexing, sorting (DB-like)
- Graph algorithms
- Maximal Information Coefficients generalize regressions
- Compressed sensing (aka sparse recovery)
- Topological Data Analysis
13 Ogres
Analogously to the Berkeley Dwarfs, big data problems have been classified: see Understanding Big Data Applications and Architectures, 1st JTC 1 SGBD Meeting, SDSC San Diego, March; Geoffrey Fox, Judy Qiu, Shantenu Jha (Rutgers)
14 So
Given these variations, a single architecture is not likely to address all big data problems well.
15 HPC Storage
16 HPC data
- Traditional model: cluster file system and
  - Single Shared File (with # cores readers / writers)
  - File Per Process (and 1 process per core)
- Tightly coupled problems allow little scheduling of tasklets or redistribution of I/O
- Problems
  - Throughput == #server nodes x (speed of slowest node)
  - Very sensitive to component variation
  - Monitoring tools fail to find the root cause
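The throughput rule on this slide can be put into a few lines of Python to show why component variation hurts so much. This is only a sketch of the slide's formula; the function name and the example speeds are illustrative assumptions.

```python
def aggregate_throughput(server_speeds_gb_s):
    """Slide formula: throughput == #server nodes x (speed of slowest node)."""
    if not server_speeds_gb_s:
        raise ValueError("need at least one server")
    return len(server_speeds_gb_s) * min(server_speeds_gb_s)


# Illustrative numbers: ten servers, with and without one degraded node.
uniform = [3.0] * 10
one_slow = [3.0] * 9 + [1.5]

print(aggregate_throughput(uniform))   # every node runs at the same speed
print(aggregate_throughput(one_slow))  # one slow node drags all ten down
```

In this model a single server running at half speed halves the throughput of the whole system, which is the sensitivity the slide points to for tightly coupled workloads.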
17 Results quite reasonable
- Systems like Lustre, GPFS, Panasas
  - Use carefully configured and tested hardware
  - Fast networks
  - Deliver 80% of slowest hardware component
  - Pipelines from clients to disk are uniformly wide
  - Servers can deliver ~3 GB/sec per controller
- Achilles heels:
  - Metadata
  - Availability
  - Data management
18 A sample of hard cases
- First write then read. Why the gap?
- Opening & creating files is too slow. Should run >2x faster! First seen at ORNL in
- Metadata performance on Sequoia and on Cove (50 & 5 SSD drives)
  - Low 1000s to ~15K ops/sec
  - Maximum seen ever ~50K ops
19 HPC hard cases ctd
- Larger numbers of concurrent metadata clients are not easy.
- Conclusion:
  1. Problems with systems like Lustre remain
  2. Sensitivity to uniformly good hardware
  3. Honest data from the users & understanding exists
  4. It has been used at very large scale
- Acknowledgement: graphs from a variety of presentations given at LADD 2013
20 Cloud data into HPC file system
- Intel's FastForward project
  - Ingest massive ACG graphs through Hadoop
  - Represent ACG using an HDF5 adaptation layer (HAL) & in Lustre DAOS objects.
  - Then compute.
- Acknowledgement: Figure from Intel's hpdd.intel.com wiki
21 Cloud Storage
22 Hybrid solutions may be best
- TACC Wrangler system
  - Big Data companion to Stampede
  - DSSD storage is PCI connected and has KV interface
  - 120 node Dell cluster with DSSD storage
  - 275M IOPS
- Undoubtedly
  - This will solve many big data problems well
  - There will be problems that don't fit or for which flash is too slow
23 Typical Cloud Storage
- Combines
  - memcached
  - key value stores or DBs: Relational, Distributed Key Value, Embedded Key Value (MySQL, Cassandra / HBase, RocksDB / LevelDB)
  - object stores (Swift, Ceph, ...)
- Results
  - Read heavy loads from one cluster: 100s of servers serving 10Ms of requests/sec
  - Only the embedded DBs keep up with flash and NVRAM
  - Flash means: ~10us / read or write, RAM means <1us
  - Flexible schemas for metadata
24 Manageability
- AWS elastic cloud: a masterpiece
- Open source solutions do similar: Cassandra, Ceph, OpenStack
25 Tiered storage
- When is tiered storage important?
  - For HPC, dumping RAM requires flash cache
  - Likely of increased importance: L1,2,3 / PCM / Flash / Disk / Tape
- Tiered storage can use container concept
  - Cache misses fetch a container to faster memory
  - High bandwidth transfers container relatively quickly
  - One time latency e.g. 1 sec
  - Then speed of faster tiers
- Key Point: neither cloud nor HPC has this now
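The container idea on this slide trades a one-time fetch latency for many accesses at the faster tier's speed. The sketch below models only that trade-off; the function name and the access counts are assumptions, while the 1 sec fetch latency and the ~10us flash access time are the figures quoted on the slides.

```python
def amortized_access_seconds(fetch_latency_s, fast_access_s, accesses):
    """Average cost per access when one container fetch is followed by
    `accesses` reads served at the faster tier's speed."""
    if accesses <= 0:
        raise ValueError("accesses must be positive")
    return (fetch_latency_s + accesses * fast_access_s) / accesses


FETCH_LATENCY_S = 1.0   # "one time latency e.g. 1 sec"
FLASH_ACCESS_S = 10e-6  # "~10us / read or write" for flash

# The access counts below are illustrative, not measurements.
for n in (1, 1_000, 1_000_000):
    cost = amortized_access_seconds(FETCH_LATENCY_S, FLASH_ACCESS_S, n)
    print(f"{n:>9} accesses: {cost:.6f} s per access")
```

The more accesses a fetched container receives, the closer the average cost gets to the fast tier's own latency, which is when the container approach pays off.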
26 Cloud object stores - Ceph
- Object is file with an id, not with a name
- Ceph manages
  - Removal and addition of storage
  - Failed nodes, racks
- Quite clever load balancing and data placement
  - CRUSH data placement perfect for management
27 Cloud objects still to demonstrate
- HPC bandwidth == #nodes x BW/node
  - only limited testing at scale, no models
- Not yet clear:
  - How it integrates with tiered storage
  - Dealing with mixed workloads
28 Data layout - placement
- How to place many stripes?
- Bottleneck in RAID arrays:
  - Rebuilding a drive goes at the rate of the BW of 1 drive; takes days
- Parity de-clustering & distributed spare
  - Rebuild at BW of N drives (N = 60 / 600 / 6000?)
  - For e.g. redundancy, speedup 60/10, 600/10, etc.
  - Benefit is large: 5x to 100x+
  - Algorithms & math are hard: block mappings
  - Somewhat unproven for HPC loads
- Cloud objects have a form of parity declustering
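The rebuild arithmetic on this slide can be sketched as follows. The speedup ratio (pool size over stripe width) is the slide's own 60/10 and 600/10; the drive capacity and sustained rebuild rate are hypothetical values chosen only to make the example concrete, and both function names are assumptions.

```python
def traditional_rebuild_hours(capacity_tb, drive_mb_s):
    """Classic RAID: the rebuild is limited by the bandwidth of one drive."""
    seconds = capacity_tb * 1e6 / drive_mb_s  # 1 TB taken as 1e6 MB
    return seconds / 3600


def declustered_speedup(pool_drives, stripe_width):
    """Slide's ratio: rebuild work spread over the pool, e.g. 60/10."""
    return pool_drives / stripe_width


# Hypothetical drive: 4 TB, rebuilding at 50 MB/s while serving other I/O.
base_hours = traditional_rebuild_hours(capacity_tb=4, drive_mb_s=50)

for pool in (60, 600, 6000):
    speedup = declustered_speedup(pool, stripe_width=10)
    print(f"N={pool}: {speedup:.0f}x faster, "
          f"{base_hours / speedup:.2f} h instead of {base_hours:.1f} h")
```

This is an upper bound under ideal load spreading; as the slide notes, the block mappings that achieve it are the hard part, and the network can become the limit instead.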
29 Data layout - erasure codes
- How to rebuild a single stripe faster
- Generalizes RAID, Reed-Solomon codes etc.
- Benefits: stripe reconstruction I/O 1-2x
- Tons of attention and publications
- If the network is the slowest component this is important
  - Parity de-clustering is hard on network
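To make "generalizes RAID" concrete, here is the simplest erasure code: single XOR parity, as used in RAID 5-style layouts. It is a stand-in for illustration only and is not one of the newer codes the slide refers to; Reed-Solomon codes extend the same idea to several parity blocks so that more than one lost block can be recovered. The function name and block contents are assumptions.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus one parity block.
data = [b"HPC!", b"big ", b"data"]
parity = xor_blocks(data)

# Lose data[1]; XOR of the surviving blocks and the parity restores it.
recovered = xor_blocks([data[0], data[2], parity])
assert recovered == data[1]
print(recovered)
```

Note that reconstruction reads every surviving block in the stripe. That read traffic is the stripe reconstruction I/O the slide says newer codes aim to reduce.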
30 Conclusions
31 Conclusions
- There are many Big Data algorithms
- There are many cloud storage solutions
- Big data on HPC: several vendors
- New specialized solutions (DSSD)
- More attention for modeling the problems & solutions
- Inevitably mileage will vary depending on the problem.
32 Thank you
