SDGen: Mimicking Datasets for Content Generation in Storage Benchmarks
Raúl Gracia-Tinedo (Universitat Rovira i Virgili, Spain)
Danny Harnik, Dalit Naor, Dmitry Sotnikov (IBM Research-Haifa, Israel)
Sivan Toledo, Aviad Zuck (Tel-Aviv University, Israel)

Pre-Introduction
- Stones in the backpack!
- Just thin air.
[Figure labels: Random Data, Zero Data]

Introduction
- Benchmarking is essential to evaluate storage systems: file systems, databases, micro-benchmarks (e.g. FileBench, LinkBench, Bonnie++, YCSB).
- Many storage benchmarks try to recreate real workloads: operations per unit of time, R/W behavior.
- But what about the data generated during a benchmark?
  - Real dataset: representative, proprietary, potentially large.
  - Simple synthetic data (zeros, random data): not representative, easy to create, reproducible.

The Problem
- Does the benchmarking data actually matter?
- ZFS example: a file system with built-in compression.
  - ZFS is significantly content-sensitive if compression is enabled.
  - The throughput also varies depending on the compressor.
- Conclusion: yes, it matters if data reduction is involved!

Current Solutions
- Some benchmarks try to emulate the compressibility of data (LinkBench, Fio, VDBench): they mix compressible and incompressible data in the right proportion.
- Problems (LinkBench data vs. real data):
  - Accurate compression ratios, but insensitive to the compressor.
  - Unrealistic compression times.
  - Heterogeneity is not captured.
[Figure: chunks made of random data and zeros, "50% compressible!"; comparison with zlib on text data (Calgary Corpus)]
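
The mixing approach described on this slide can be illustrated with a minimal sketch. It is not LinkBench's, Fio's or VDBench's actual code; the helper names `mixed_chunk` and `compression_ratio`, the 64 KiB chunk size, and the choice of zlib levels 1 and 9 as stand-ins for "different compressors" are assumptions made for illustration only.

```python
import os
import zlib


def mixed_chunk(size: int, compressible_fraction: float) -> bytes:
    """Build a chunk of zeros followed by random bytes (hypothetical helper)."""
    zero_len = int(size * compressible_fraction)
    return bytes(zero_len) + os.urandom(size - zero_len)


def compression_ratio(data: bytes, level: int = 6) -> float:
    """Original size divided by zlib-compressed size."""
    return len(data) / len(zlib.compress(data, level))


# A "50% compressible" chunk: half zeros, half random data.
chunk = mixed_chunk(64 * 1024, 0.5)

# The zeros collapse to almost nothing and the random half does not shrink,
# so the ratio stays close to 2 whatever effort level is used.
ratio_fast = compression_ratio(chunk, level=1)
ratio_best = compression_ratio(chunk, level=9)
print(f"level 1: {ratio_fast:.2f}, level 9: {ratio_best:.2f}")
```

Real text data typically compresses differently at different effort levels, which is the kind of compressor sensitivity that such mixed data fails to reproduce.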

Our Mission
- Complex situation:
  - Most storage benchmarks generate unrealistic contents.
  - Representative data is normally not shared due to privacy issues.
- This is not good for the performance evaluation of storage systems with built-in data reduction techniques.
- We need a common approach to generate realistic and reproducible benchmarking data.
- In this work, we focus on compression benchmarking.

Summary of our Work
- Synthetic Data GENerator (SDGen): an open and extensible framework to generate realistic data for storage benchmarks.
  - Goal: mimic real datasets.
  - Compact, reusable and anonymized dataset representation.
- Mimicking compression: identify the properties of data that are key to the performance of popular lossless compressors (e.g. zlib, lz4).
- Usability and integration: SDGen is available for download and has been integrated in popular benchmarks (LinkBench, Impressions).

SDGen: Concept & Overview
- Mimicking method: capture the characteristics of data that affect data reduction techniques in order to generate similar synthetic data.
- SDGen works in two main phases: the Scan Phase and the Generation Phase.
- SDGen lifecycle: Scan Dataset -> Build Characterization -> Share it -> Load Characterization -> Generate Data.
- SDGen can do full scans or use sampling.
- SDGen requires knowing what to scan for and how to generate data.
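
The lifecycle above (scan, build a characterization, share it, load it, generate data) can be sketched as follows. This is a toy illustration, not SDGen's actual format or API: the `Characterization` class, the JSON encoding, the function names, and the use of a byte-count histogram as the only captured property are all assumptions.

```python
import json
import random
from collections import Counter
from dataclasses import dataclass, asdict


@dataclass
class Characterization:
    """Hypothetical, minimal dataset summary (not SDGen's real format)."""
    chunk_size: int
    byte_counts: list  # 256 counters, one per byte value


def scan(data: bytes, chunk_size: int = 4096, sample_rate: float = 1.0,
         seed: int = 0) -> Characterization:
    """Scan phase: visit every chunk (sample_rate=1.0) or a random sample."""
    rng = random.Random(seed)
    counts = Counter()
    for start in range(0, len(data), chunk_size):
        if rng.random() <= sample_rate:
            counts.update(data[start:start + chunk_size])
    return Characterization(chunk_size, [counts.get(b, 0) for b in range(256)])


def share(c: Characterization) -> str:
    """Serialize the characterization; no original content is included."""
    return json.dumps(asdict(c))


def load(text: str) -> Characterization:
    return Characterization(**json.loads(text))


def generate(c: Characterization, size: int, seed: int = 0) -> bytes:
    """Generation phase: draw bytes following the recorded byte counts."""
    rng = random.Random(seed)
    return bytes(rng.choices(range(256), weights=c.byte_counts, k=size))


original = b"abracadabra" * 1000
restored = load(share(scan(original)))
synthetic = generate(restored, 1000)
```

Only the summary travels between the two phases, which is what makes the representation compact and shareable without exposing the dataset itself.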

Mimicking data for compression
We empirically found two properties that affect the behavior of compression engines:
- Repetition length distribution
  - Key for compression time and ratio.
  - Typically follows a power law.
- Byte frequency
  - Affects entropy coding.
  - Varies significantly depending on the data.
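
The two properties above can be measured with a short sketch. The byte entropy is the standard Shannon entropy of the byte histogram. The repetition scanner is a deliberately simplified, greedy LZ77-style match finder written for illustration; it is an assumption and not how SDGen or zlib actually find repetitions, and the `min_len` and `window` parameters are hypothetical.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(data).values())


def repetition_lengths(data: bytes, min_len: int = 4,
                       window: int = 32 * 1024) -> Counter:
    """Histogram of greedy match lengths against the most recent earlier
    occurrence of each min_len-byte sequence (simplified LZ77-style scan)."""
    last_seen = {}   # min_len-byte key -> latest position where it started
    hist = Counter()
    i = 0
    while i + min_len <= len(data):
        key = data[i:i + min_len]
        prev = last_seen.get(key)
        last_seen[key] = i
        if prev is not None and i - prev <= window:
            # Extend the match as far as the two sequences keep agreeing.
            length = min_len
            while i + length < len(data) and data[prev + length] == data[i + length]:
                length += 1
            hist[length] += 1
            i += length
        else:
            i += 1
    return hist
```

For example, `repetition_lengths(b"abcdefgh" * 4)` records a single repetition of length 24, because the last 24 bytes repeat the first 24.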

Generating synthetic data
Goals:
- Generate data with similar properties (repetition lengths, byte frequencies).
- Fast generation throughput.
At a high level, we generate a data chunk as follows:
1) Random decision: repetition or not?
2) Repetition: insert repeated data. No repetition: insert newly randomized data.
3) Pick a random repeated sequence length from the repetition histogram.
4) Insert repeated random data.
[Figure: the data generator takes the repetition length histogram and the byte frequency histogram as inputs, initializes a source of repetitions, and inserts repeated sequences into the synthetic chunk]
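
The steps above can be sketched as a small generator. This is a sketch under stated assumptions rather than SDGen's implementation: the slide does not say how the repetition decision is weighted, how long a run of fresh data is, or how large the source of repetitions is, so `repetition_prob`, `literal_len` and `source_len` are hypothetical parameters.

```python
import random
import zlib


def generate_chunk(size, rep_len_hist, byte_freq, repetition_prob=0.5,
                   literal_len=8, source_len=1024, seed=0):
    """Generate a synthetic chunk from a repetition-length histogram
    ({length: count}) and a 256-entry byte-frequency histogram."""
    rng = random.Random(seed)
    byte_values = range(256)
    rep_lengths = list(rep_len_hist.keys())
    rep_weights = list(rep_len_hist.values())

    # Initialize the source of repetitions with bytes that follow byte_freq.
    source = bytes(rng.choices(byte_values, weights=byte_freq, k=source_len))

    out = bytearray()
    while len(out) < size:
        if rng.random() < repetition_prob:
            # Repetition: pick a length from the histogram and copy that
            # many bytes from a random position in the source.
            length = min(rng.choices(rep_lengths, weights=rep_weights, k=1)[0],
                         source_len)
            start = rng.randrange(0, source_len - length + 1)
            out += source[start:start + length]
        else:
            # No repetition: insert newly randomized data (assumed run length).
            out += bytes(rng.choices(byte_values, weights=byte_freq,
                                     k=literal_len))
    return bytes(out[:size])


uniform = [1] * 256
mimic = generate_chunk(16384, {32: 1, 128: 1}, uniform, source_len=256)
fresh = generate_chunk(16384, {32: 1, 128: 1}, uniform,
                       repetition_prob=0.0, source_len=256)
print(len(zlib.compress(mimic)), len(zlib.compress(fresh)))
```

With repetitions enabled, the chunk compresses much better than the one built only from fresh data, because the compressor finds the copied sequences again.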

Evaluation
- Objective metrics: compression ratio, compression time.
- Additional mimicked properties: repetition length, entropy (byte frequencies).
- Datasets: Calgary/Canterbury corpus, Silesia corpus, PDFs (FAST conferences), Media (IBM engineers), sensor network data, Enwiki9, Private Mix (VMs, .xml, .html, ...).
- Compressors:
  - Target: lossless compression based on byte-level repetition finding and/or on entropy encoding (zlib, lz4).
  - We also tested other families of compressors (bzip2, lzma).

Evaluation: Mimicked Properties
- Experiment: compare repetition length distributions and byte entropy in real and SDGen data.
- SDGen generates data that closely mimics these metrics.

Evaluation: Compression Ratio & Time
- Experiment: capture per-chunk compression ratios and times for both synthetic and real datasets.
- Per-chunk compression ratio:
  - Compression ratios are closely mimicked.
  - Heterogeneity is also well captured within a dataset.
- Per-chunk compression time:
  - Compression times are harder to mimic (especially for lz4).
  - Still, for most data types compressors behave similarly.
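
A per-chunk measurement of this kind can be sketched as follows. The slide does not state the chunk size or the measurement harness, so the 32 KiB chunk size and the `per_chunk_stats` helper are assumptions; only zlib is shown because it ships with Python's standard library.

```python
import time
import zlib


def per_chunk_stats(data: bytes, chunk_size: int = 32 * 1024, level: int = 6):
    """Return a list of (compression_ratio, seconds) tuples, one per chunk."""
    stats = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        t0 = time.perf_counter()
        compressed = zlib.compress(chunk, level)
        elapsed = time.perf_counter() - t0
        stats.append((len(chunk) / len(compressed), elapsed))
    return stats


# Example on trivially compressible data; real and synthetic datasets would
# each be passed through the same function and their distributions compared.
stats = per_chunk_stats(bytes(100_000))
```

Comparing the distribution of the per-chunk values, rather than a single dataset-wide average, is what exposes heterogeneity within a dataset.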

Evaluation: Performance of ZFS
- Experiment: write 1 GB files, built by augmenting the previous datasets, to ZFS.
- ZFS exhibits similar behavior for both real data and our synthetic data.
- ZFS digests LinkBench data faster (+12% to +44%).
- DNA sequencing files in the Calgary corpus are especially hard to compress.

Evaluation: Integration with LinkBench
- Experiment: LinkBench write workload using distinct data types (ZFS + SSD storage).
- SDGen serves as the data generation layer for LinkBench.
- Write latency is similar for both the synthetic and the text dataset.

Conclusions & Future Directions
- Data is an important aspect of storage benchmarking when data reduction is involved (compression, dedup).
- We presented SDGen: a framework for generating realistic and sharable benchmarking data.
  - Idea: scan data, build a characterization, share it, generate data.
- We designed a method to mimic data compression ratios and times for popular lossless compressors.
- We plan to extend SDGen to mimic data deduplication.

Q&A
Thanks for your attention!
SDGen code:
Funding projects:
- Towards the next generation of open Personal Clouds
- Software-Defined Storage for Big Data

Backup: Generation Performance
- Characterizations of chunks can be used in parallel for generation.
- Generating incompressible data is more expensive.
- We plan optimizations that reuse random data wisely to boost throughput.
