Research Statement Yiying Zhang
My research interests span operating systems, distributed systems, computer architecture, networking, and data analytics, with a focus on building fast, reliable, and flexible systems for emerging hardware and applications. My doctoral research centers on file and storage systems, specifically on how to remove redundant levels of indirection in storage systems [1, 2, 3]. More recently, I have been exploring and building systems for next-generation fast, byte-addressable, non-volatile main memory (NVMM) [4, 5]. I have also worked on other aspects of storage systems [6, 7, 8], and I have experience in several different fields including machine learning and bioinformatics [9, 10]. My research approach has three characteristics:

Reexamining established principles in light of new hardware and applications. System software, hardware, and applications have changed dramatically over the past decades and will continue to evolve. I believe that these new technologies and applications call for a reexamination of how we apply many established computer science design principles. My dissertation research reconsiders indirection and its costs for new types of storage systems. More recently, I revisited traditional data replication schemes for NVMM.

Building real systems. I believe in prototyping research projects in real systems and understand the challenges and benefits of doing so. My work on a hardware prototype of a new I/O system is among the few research projects on flash-based SSDs built with real hardware. As a system designer, my experience with hardware has been both challenging and rewarding. I learned that implementing designs in reality can bring unexpected difficulties to light and inspire completely new designs.

Interdisciplinary research. I enjoy interdisciplinary research and have worked in several quite different areas, including storage systems, operating systems, distributed systems, computer architecture, networking, scheduling and optimization, machine learning, and bioinformatics. In several of my projects, I have looked at how software and hardware should interact with each other in light of technology shifts. My interdisciplinary background enables me to see the whole picture when building systems.

Below, I first describe my thesis and more recent research, then briefly discuss research I have done in other areas, and finally present my future research agenda.

1 De-indirection in Storage Systems

Indirection is a core technique in computer systems that offers many benefits. However, it is easy to overlook its costs. As software and hardware systems become more complex, redundant levels of indirection can exist in a single system. For example, running a file system on top of a device with indirection (such as a flash-based SSD) creates two levels of indirection: a block is first mapped from a file offset to its logical address and then from the logical address to its physical address in the device. Excess indirection can cause performance and memory overheads. For instance, flash-based SSDs maintain their level of indirection in SSD-internal DRAM, imposing performance, monetary, and energy costs. Such costs are of even greater concern when SSD capacity grows or when SSDs are deployed in mobile devices.
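To make the two levels of mapping concrete, the toy C sketch below resolves a file block first through the file system's map and then through the FTL's map, and then shows the de-indirected case in which the file system records physical addresses directly; the table sizes and contents are made up for illustration and this is not code from my systems.

    #include <stdint.h>
    #include <stdio.h>

    /* Toy illustration (hypothetical sizes and contents): a file-system map
     * (file block -> logical block address) stacked on an SSD flash
     * translation layer (logical block address -> physical flash page). */
    #define NBLOCKS 8

    static uint32_t fs_map[NBLOCKS]  = {3, 7, 1, 0, 5, 2, 6, 4}; /* file -> LBA */
    static uint32_t ftl_map[NBLOCKS] = {6, 4, 0, 2, 7, 1, 5, 3}; /* LBA  -> PPA */

    /* With both layers present, every access resolves two mappings; the second
     * table lives in SSD-internal DRAM and grows with device capacity. */
    static uint32_t resolve_two_levels(uint32_t file_block)
    {
        uint32_t lba = fs_map[file_block];   /* hop 1: file system metadata */
        return ftl_map[lba];                 /* hop 2: device mapping table */
    }

    /* De-indirection collapses the hops: the file system records the physical
     * address directly, so the per-block device-side table is no longer needed. */
    static uint32_t deindirected_map[NBLOCKS]; /* file -> PPA */

    int main(void)
    {
        for (uint32_t b = 0; b < NBLOCKS; b++)
            deindirected_map[b] = resolve_two_levels(b);

        for (uint32_t b = 0; b < NBLOCKS; b++)
            printf("file block %u -> physical page %u\n", b, deindirected_map[b]);
        return 0;
    }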
For my Ph.D. dissertation, I proposed removing excess indirection, or de-indirection, for flash-based SSDs. I used two methods to perform de-indirection. The first method is to avoid excess indirection in flash-based SSDs in the first place. The second method is to allow excess indirection to be created and later remove part of it dynamically.

Avoiding excess indirection. Using the first method of de-indirection, I designed and implemented Nameless Writes [1], a new device interface that removes the need for indirection in flash-based SSDs. Nameless writes allow the device to choose the location of a write and inform the file system of the name (i.e., the physical address) where the block now resides. The file system then records the physical address in its metadata for future accesses. One of the major challenges I encountered in designing nameless writes is that flash-based SSDs must move physical blocks for tasks like wear leveling, and these physical address changes need to be reflected in the file system metadata. I used a new method in which the device sends callbacks to the file system to inform it about physical address changes. I ported the Linux ext3 file system to nameless writes and developed a flash-based SSD emulator that models typical SSD configurations and firmware. Experiments show that nameless writes reduce the SSD mapping table size (and thus the amount of DRAM in the SSD) by 14× to 50× and improve random write performance by 20× compared to a typical traditional SSD.
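The following sketch shows, in toy user-level form, the general shape of this interface; the function names and callback signature are hypothetical, and the real interface in [1, 2] is a kernel block-device interface. The device chooses where a block lands and returns that address as its name, and a later device-initiated migration is reported through a callback so the file system can patch its metadata.

    #include <stdint.h>
    #include <stdio.h>

    /* A user-level toy of the nameless-write idea with hypothetical names;
     * it is not the actual interface from [1, 2]. */
    typedef uint64_t ppa_t;                 /* physical flash address */

    /* Callback the file system registers to learn about device-initiated moves. */
    typedef void (*migration_cb)(ppa_t old_addr, ppa_t new_addr, void *fs_private);

    /* Toy "device": appends writes and occasionally relocates a block. */
    static ppa_t next_free = 0;

    static ppa_t nameless_write(const void *data, size_t len)
    {
        (void)data; (void)len;
        return next_free++;                 /* device picks the location */
    }

    static void device_migrate(ppa_t victim, migration_cb cb, void *fs_private)
    {
        ppa_t new_addr = next_free++;       /* e.g., wear leveling moves the block */
        cb(victim, new_addr, fs_private);   /* tell the file system the new name  */
    }

    /* Toy "file system": a one-file inode that stores physical addresses directly. */
    static ppa_t inode_blocks[4];

    static void fs_on_migration(ppa_t old_addr, ppa_t new_addr, void *fs_private)
    {
        ppa_t *blocks = fs_private;
        for (int i = 0; i < 4; i++)
            if (blocks[i] == old_addr)
                blocks[i] = new_addr;       /* update metadata for future reads */
    }

    int main(void)
    {
        char buf[4096] = {0};
        for (int i = 0; i < 4; i++)
            inode_blocks[i] = nameless_write(buf, sizeof buf);

        device_migrate(inode_blocks[2], fs_on_migration, inode_blocks);

        for (int i = 0; i < 4; i++)
            printf("inode block %d lives at physical address %llu\n",
                   i, (unsigned long long)inode_blocks[i]);
        return 0;
    }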
I was then curious to see how nameless writes would work with real hardware, since they require fundamental changes in the internal workings of the device, its interface to the host operating system, and the host OS itself. Using an experimental flash-based SSD hardware board, I developed a hardware prototype of the nameless-write SSD [2]. During the prototyping, I discovered several challenges not foreseen in my emulation or in previously published work. Specifically, because the nameless-writes interface is fundamentally different from traditional I/O interfaces, it is difficult to integrate nameless writes into the existing fixed SATA storage interface. To tackle these problems, I redesigned the SSD storage system by placing minimal functionality in the device and moving complex functionality into the kernel. Nameless writes have attracted attention from several storage companies, and a major mobile device manufacturer even expressed interest in deploying nameless writes in its cell phones.

Reducing existing excess indirection. Using the second method of de-indirection, I designed, implemented, and evaluated the File System De-Virtualizer (FSDV) [3], a system that dynamically removes storage-device indirection costs. FSDV is a flexible, lightweight tool that reduces excess indirection by changing file system pointers to use device physical addresses. When FSDV is not running, the file system and the device both maintain their indirection layers and perform normal I/O operations. When it runs, FSDV significantly reduces indirection mapping table space in a dynamic way while preserving foreground I/O performance. Moreover, because most of the functionality is placed in FSDV, only small changes are required in existing storage systems.

2 Non-Volatile Main Memory

Next-generation non-volatile memories (NVMs) promise DRAM-like performance, persistence, and high density. They can attach directly to processors to form non-volatile main memory (NVMM) and offer the opportunity to build very low-latency storage systems. Recently, I have been working on understanding NVMM performance and building systems designed for NVMM.

Understanding and improving NVMM performance. I analyzed storage application performance with NVMM using a hardware NVMM emulator [5]. I found that although NVMM is projected to have higher latency and lower bandwidth than DRAM, these differences have only a small impact on application performance. Rather, the bottleneck of NVMM performance is the cost of ensuring that data resides safely in NVMM (rather than in the volatile caches). In response, I designed and implemented an approach that selectively flushes data from CPU caches to minimize this cost. This technique significantly improves performance (up to 240×) for applications that require strict durability and consistency guarantees over large regions of memory.
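As a rough illustration of the idea of selective flushing (not the mechanism evaluated in [5]), the sketch below writes back only the cache lines that cover a small modified record, rather than an entire mapped region, before treating the update as durable; the 64-byte line size, the x86 flush intrinsics, and the stand-in buffer are assumptions.

    #include <stdint.h>
    #include <stddef.h>
    #include <string.h>
    #include <immintrin.h>   /* _mm_clflush, _mm_sfence (x86) */

    #define CACHE_LINE 64    /* assumed line size */

    /* Flush only the cache lines covering [addr, addr+len) and fence, instead
     * of writing back an entire mapped region before declaring an update durable. */
    static void persist_range(const void *addr, size_t len)
    {
        uintptr_t p   = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE - 1);
        uintptr_t end = (uintptr_t)addr + len;

        for (; p < end; p += CACHE_LINE)
            _mm_clflush((const void *)p);   /* write back just the dirty lines   */
        _mm_sfence();                       /* order flushes before later stores */
    }

    /* Toy usage: a 64-byte log record in a buffer that merely stands in for a
     * mapped NVMM range. */
    static char nvmm_region[1 << 20];

    int main(void)
    {
        struct { uint64_t seq; char msg[56]; } *rec = (void *)nvmm_region;

        rec->seq = 1;
        memcpy(rec->msg, "hello, durable world", 21);
        persist_range(rec, sizeof *rec);    /* ~1 line flushed, not the whole MB */
        return 0;
    }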
Reliable and highly-available NVMM. Reliability and availability are critical to the success of NVMM as persistent storage in large-scale data center environments. Providing reliability and availability for NVMM is challenging, since the latency of data replication can squander the low latency that NVMM can provide. I reexamined traditional data replication methods and the tradeoffs among performance, reliability, availability, and consistency in light of new NVMM technologies. As a result, I designed, developed, and evaluated Mojim [4], a system that provides the reliability and availability that large-scale storage systems require while preserving the low-latency performance of NVMM. Mojim achieves these goals by using highly optimized RDMA-based replication protocols and a two-tier architecture in which the primary tier contains a mirrored pair of nodes and the secondary tier contains one or more backup nodes with weakly consistent copies of the data. I implemented Mojim as a generic layer in the Linux kernel and developed an optimized RDMA-based protocol to replicate fine-grained data in NVMM. Experiments show that, surprisingly, Mojim provides replicated NVMM with only 29% to 73% of the average latency of un-replicated NVMM and 0.5× to 3.5× its bandwidth. It is also 3.4× to 4× faster than existing replication schemes. Mojim is recent work, but many companies have already shown interest in Mojim and my other NVMM-related research; some of them are building NVMM-based systems with architectures and approaches that closely resemble Mojim.
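The toy sketch below shows the overall shape of such a two-tier replication flow; the names are hypothetical and this is not Mojim's interface or protocol. In a real deployment, the synchronous copy to the mirrored node would be an RDMA write, and the secondary backups would be updated off the critical path.

    #include <stdint.h>
    #include <string.h>
    #include <stdio.h>

    /* Hypothetical two-tier replication flow; the "nodes" are local buffers so
     * the control flow is visible without any RDMA setup. */
    #define REGION_SIZE 4096

    static char primary[REGION_SIZE];      /* primary node's NVMM region        */
    static char mirror[REGION_SIZE];       /* primary tier: mirrored pair       */
    static char backup[REGION_SIZE];       /* secondary tier: weakly consistent */

    /* Synchronous path: the caller's commit returns only after the mirrored
     * node holds the update (one replicated write plus an ack in a real system). */
    static void replicate_to_mirror(size_t off, size_t len)
    {
        memcpy(mirror + off, primary + off, len);
    }

    /* Asynchronous path: secondary backups are brought up to date lazily, so
     * they may briefly lag behind (weak consistency). */
    static void replicate_to_backup_async(size_t off, size_t len)
    {
        memcpy(backup + off, primary + off, len);   /* imagine this being queued */
    }

    /* Application-visible commit of a fine-grained range. */
    static void commit_range(size_t off, size_t len)
    {
        replicate_to_mirror(off, len);          /* on the critical path  */
        replicate_to_backup_async(off, len);    /* off the critical path */
    }

    int main(void)
    {
        memcpy(primary + 128, "balance=42", 10);
        commit_range(128, 10);
        printf("mirror sees: %.10s\n", mirror + 128);
        return 0;
    }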
Other aspects of NVMM. I have also been working on a few other NVMM-related projects, including optimizations for NVMM in virtualized environments and data placement in hybrid DRAM and NVMM systems. In addition, I am currently mentoring six Ph.D. students and visiting scholars to tackle problems in various areas, including architectural and programming-language support for NVMM, distributed NVMM systems, and OS optimizations for NVMM.

3 Other Research

Storage systems. Apart from the research described above, I have worked on several other aspects of storage systems. First, I analyzed data-center storage workloads and designed and developed a system for accelerating storage-level cache warmup [6]. Experiments show that this system speeds up cache warmup by 14% to 100% and provides 44% to 228% more server-load reduction than traditional cache warmup. Second, I designed and implemented a system that prevents correlated device failures in a flash-based storage array by carefully introducing slightly heavier dummy writes on one device, causing it to fail sooner, and then slowing down I/O rates on the surviving device [11]. Third, I analyzed the contents of different file servers for duplicate-aware usage, and designed and implemented Duplicate-Aware Disk Arrays (DADA) [7], a system that keeps track of block duplication and uses it to improve the reliability and availability of storage arrays. DADA reduces disk scrubbing and recovery time by 17% to 26%.

Scheduling, optimization, machine learning, and bioinformatics. Prior to my Ph.D. work in storage systems, I worked in several other fields. I analyzed energy-usage logs of the Condor cluster at the University of Wisconsin and developed scheduling algorithms to reduce the energy consumption of server clusters. I also worked at a startup company, where I was the primary developer of a new product, Locomotive Planning Optimizer, which optimizes locomotive routing and scheduling using integer programming and graph theory. While studying for my master's degree, I investigated several machine learning techniques, including decision trees, neural networks, and support vector machines, to learn the cleavage properties of HIV-1 protein sequences and other properties of DNA and RNA sequences [9, 10].

4 Future Research

Going forward, I hope to leverage my areas of expertise and to explore a broad range of other areas. Below, I discuss three research directions that I plan to take in the future: (1) building rack-scale systems for little big data analysis; (2) automating software and hardware system implementation; and (3) interdisciplinary research.

4.1 Rack-Scale Little Big Data

Analyzing the enormous amount of data in today's world is challenging. Much research has focused on supporting the analytics and processing of gigantic datasets on the scale of petabytes or exabytes. However, there are also many analytics tasks that involve only a few terabytes of data [12, 13, 14]. For example, as of October 2014, GenBank, a database that contains all publicly available DNA sequences, holds only 680 GB of data [15], and the Facebook friendship graph contains less than 2 TB of data [16]: an amount that can fit in the memory of a common server rack. Even for some datasets in the peta- or exabyte range, analytics is performed only on a condensed form of the raw data (e.g., a few summarizing values of a raw MRI image). These little big data (LBD) present their own challenges. Many of these analytics workloads are highly diverse in their data skew, job burstiness, and I/O and compute patterns. Another trend is the increasing demand for interactive data analytics, such as real-time risk management. Going forward, I plan to build Rack-Scale Analytical Systems (RacSAS) to more efficiently support LBD, the diversity among these datasets, and the need for interactive access.
With the growing complexity of hardware and the heterogeneity of analytics applications, a single framework and a fixed hardware configuration will not be enough. I advocate a reconfigurable hardware layer and a flexible, lightweight software layer. To achieve this flexibility, I believe we need to rethink the abstractions of different software and hardware layers and expose the right amount of information across layers. With such cross-layer information, we can build flexible system software and analytics frameworks that better utilize and share hardware resources and support application diversity. I would also like to explore more radical design approaches such as disaggregated computing, storage, and networking resources. In the following, I outline several research directions towards building RacSAS.

Configurable storage system. Many data analytics tasks are data-centric and exhibit heterogeneity, high skew, burstiness, and iterative patterns. To support such diversity, I plan to design a tiering architecture of DRAM, non-volatile main memory, and flash. The decreasing price of flash makes it possible to store all raw LBD data on flash in RacSAS. Emerging flash-based SSDs have easily programmable computation capabilities inside them [17] and can thus perform raw-data preprocessing. While flash-based SSDs offer better performance than hard disks, their performance can be irregular because of SSDs' internal operations [1]. Exposing the performance cost of such internal operations enables system software to better schedule and coordinate tasks at different layers, for example by allowing SSDs to schedule their wear-leveling operations at system idle time. To take another example, next-generation non-volatile memories promise low latency and byte addressability, but they are still likely to exhibit read-write performance imbalance; conveying such information to software layers would be helpful as well. To manage these storage devices and enable diverse applications, I believe that we should have highly flexible storage software. For example, it should be easy to configure what form the data is stored in, how much data is cached at each layer, what consistency and reliability model to use for different applications, and whether to remove data duplication. I plan to decouple traditional file and storage system functionalities to enable a configurable storage system. My previous work in cross-layer design [1, 2], file system design [1, 3], data caching [6, 8], reliability and consistency models [4], and data duplication [7] has prepared me for building this kind of configurable storage system.
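A per-application policy of this kind could be expressed as a small configuration structure; the sketch below is purely illustrative (its fields and enums are assumptions, not an existing RacSAS interface).

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>

    /* Illustrative only: one way a RacSAS-style storage layer could let each
     * application declare how its data is laid out, cached, and protected. */
    enum data_layout { LAYOUT_ROW, LAYOUT_COLUMNAR, LAYOUT_RAW };
    enum tier        { TIER_DRAM, TIER_NVMM, TIER_FLASH, TIER_COUNT };
    enum consistency { CONSIST_STRICT, CONSIST_RELAXED };
    enum reliability { RELIABLE_REPLICATED, RELIABLE_RECOMPUTE };

    struct storage_policy {
        enum data_layout layout;            /* what form the data is stored in    */
        size_t cache_mb[TIER_COUNT];        /* how much is cached at each tier    */
        enum consistency consistency;       /* per-application consistency model  */
        enum reliability reliability;       /* replicate, or recompute on loss    */
        bool deduplicate;                   /* whether to remove duplicate blocks */
    };

    /* Example: an interactive analytics job whose raw input can be recomputed,
     * so replication is relaxed and the full dataset lives on flash. */
    static const struct storage_policy interactive_lbd_policy = {
        .layout      = LAYOUT_COLUMNAR,
        .cache_mb    = { [TIER_DRAM] = 2048, [TIER_NVMM] = 65536, [TIER_FLASH] = 0 },
        .consistency = CONSIST_RELAXED,
        .reliability = RELIABLE_RECOMPUTE,
        .deduplicate = true,
    };

    int main(void)
    {
        printf("DRAM cache: %zu MB, dedup: %d\n",
               interactive_lbd_policy.cache_mb[TIER_DRAM],
               interactive_lbd_policy.deduplicate);
        return 0;
    }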
Rack-scale low-latency communication. As computation and storage get faster, data communication will become the bottleneck for delivering low-latency application performance. To enable fast and flexible data movement, I plan to reexamine networking architectures and protocols and optimize them for rack-scale data processing. My previous work on optimized RDMA protocols [4] provides a starting point towards a low-latency rack-scale network. I also hope to reexamine message passing over commodity networks and look into other possibilities at the rack scale, such as photonic networks and PCIe interconnects.

Redesigning OS and system software. With the increasing performance of new storage and networking technologies [5], the overhead of the OS and other system software is becoming the bottleneck in application latency. For RacSAS, an important issue is how to reduce software overhead, especially for real-time analytics. I would like to reexamine current OS and system software functionalities and redesign them in the context of RacSAS and LBD.

Reliability and security. Reliability and security will remain important issues for rack-scale systems. For example, how do we prevent a whole-rack failure? How do we prevent information leaks from a lost rack during transportation? From my previous work [4], I found that we should reconsider traditional data replication and the tradeoffs among reliability, availability, and performance for new technologies. I believe that RacSAS and LBD also call for a reexamination of reliability, security, and other system properties: for example, relaxing the reliability requirements for data that can be recomputed.

Adaptive resource sharing and scheduling. With the diversity of data analytics tasks and the flexibility of RacSAS's lower layers, scheduling and resource sharing will be a challenging problem. I plan to study typical LBD workloads in different domains and design scheduling systems that take into consideration domain-specific knowledge, per-workload information, and the resource information that RacSAS exposes.
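To illustrate how such a scheduler might combine per-workload information with resource state exposed by lower layers, the sketch below scores candidate nodes with a made-up cost function; the structures, fields, and weights are assumptions for illustration only.

    #include <stddef.h>
    #include <stdio.h>

    /* Illustrative only: pick a node for a task by combining per-workload
     * information with device state that a RacSAS-style stack exposes across
     * layers (e.g., whether an SSD is busy with internal operations). */
    struct node_state {
        double cpu_load;          /* 0.0 (idle) .. 1.0 (saturated)            */
        double ssd_internal_busy; /* exposed cost of ongoing device operations */
        double free_dram_gb;
    };

    struct task_profile {
        double io_weight;         /* how I/O-bound the workload is, 0..1 */
        double working_set_gb;
    };

    static double placement_cost(const struct task_profile *t,
                                 const struct node_state *n)
    {
        double cost = (1.0 - t->io_weight) * n->cpu_load
                    + t->io_weight * n->ssd_internal_busy;   /* cross-layer info */
        if (n->free_dram_gb < t->working_set_gb)
            cost += 1.0;                                     /* would spill to flash */
        return cost;
    }

    static size_t pick_node(const struct task_profile *t,
                            const struct node_state *nodes, size_t nnodes)
    {
        size_t best = 0;
        for (size_t i = 1; i < nnodes; i++)
            if (placement_cost(t, &nodes[i]) < placement_cost(t, &nodes[best]))
                best = i;
        return best;
    }

    int main(void)
    {
        struct node_state rack[2] = {
            { .cpu_load = 0.2, .ssd_internal_busy = 0.9, .free_dram_gb = 64 },
            { .cpu_load = 0.6, .ssd_internal_busy = 0.1, .free_dram_gb = 32 },
        };
        struct task_profile scan = { .io_weight = 0.8, .working_set_gb = 16 };

        printf("run I/O-heavy scan on node %zu\n", pick_node(&scan, rack, 2));
        return 0;
    }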
New programming model. MapReduce-like frameworks were designed for petascale data processing. I believe that LBD calls for a new, more versatile programming model that is suitable for diverse LBD applications and takes advantage of the flexibility of RacSAS.

Application adaptation. Finally, I would like to take a different approach from the application's point of view and help RacSAS users adapt their analytics algorithms and software. My previous experience in machine learning [9, 10] has prepared me for this effort.

4.2 Automating System Implementation

One of my longer-term research directions is to automate the implementation and testing of new software and hardware systems. Currently, building software and hardware systems is difficult, time-consuming, and error-prone. For example, modern file systems such as btrfs contain tens of thousands of lines of kernel code. Modern hardware devices such as SSDs have complex firmware that intelligently manages their resources. Implementing a new OS requires even more effort. At the same time, the increasing heterogeneity of applications and hardware technologies has increased the need for new systems that are optimized and customized for specific domains. My goal is to raise the level of abstraction of system implementation, so that system builders can use high-level programming models to specify system functionality. One possible approach is to reuse and decouple existing system components, while allowing system builders to modify and add components through well-defined interfaces. As a first step toward building such automated systems, I hope to build libraries of generic Linux driver components.
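As a sketch of what such decoupled components with well-defined interfaces might look like, the snippet below describes a hypothetical component as a small table of function pointers that a driver can compose; it illustrates the idea and is not an existing Linux facility.

    #include <stdio.h>
    #include <stddef.h>

    /* Hypothetical illustration: each reusable driver building block
     * (queueing, DMA mapping, error handling, ...) exports the same small
     * interface, so a generator or system builder can compose components
     * instead of rewriting boilerplate for every new device. */
    struct component_ops {
        int  (*init)(void *ctx);
        void (*teardown)(void *ctx);
    };

    struct component {
        const char *name;
        const struct component_ops *ops;
        void *ctx;                          /* component-private state */
    };

    /* A driver assembled from components just walks the table. */
    static int driver_init(struct component *parts, size_t n)
    {
        for (size_t i = 0; i < n; i++) {
            int err = parts[i].ops->init(parts[i].ctx);
            if (err) {
                while (i--)
                    parts[i].ops->teardown(parts[i].ctx);  /* unwind on failure */
                return err;
            }
            printf("initialized %s\n", parts[i].name);
        }
        return 0;
    }

    /* Two stand-in components; real ones would wrap queueing, DMA mapping, etc. */
    static int  noop_init(void *ctx)     { (void)ctx; return 0; }
    static void noop_teardown(void *ctx) { (void)ctx; }
    static const struct component_ops noop_ops = { noop_init, noop_teardown };

    int main(void)
    {
        struct component parts[] = {
            { "request-queue", &noop_ops, NULL },
            { "dma-mapper",    &noop_ops, NULL },
        };
        return driver_init(parts, sizeof parts / sizeof parts[0]);
    }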
4.3 Interdisciplinary Research

I would like to work on interdisciplinary projects with researchers from other fields. For example, I am interested in designing programming models for future hardware technologies, such as new type systems for reliable data access in non-volatile main memory. I am also interested in the security issues raised by emerging hardware devices, such as preventing information leaks from non-volatile main memory. As another example, I hope to make the computation and storage of bioinformatics data more efficient; one possibility is to integrate sequence alignment with storage deduplication techniques. I would also like to adapt database query optimization techniques for data selection across different applications in heterogeneous storage environments. Finally, I would like to redesign networking mechanisms and protocols for new hardware and application trends, for example by moving networking closer to processors.

Overall, I believe that my experience in storage systems, operating systems, distributed systems, architecture, networking, scheduling, and machine learning provides me with a solid background for my future research. As a professor, I want to be at the frontier of these research directions and make an impact in the real world. I look forward to joining a stimulating environment where I can learn from others and contribute to the research community.

References

[1] Yiying Zhang, Leo Prasath Arulraj, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. De-indirection for Flash-based SSDs with Nameless Writes. In Proceedings of the 10th USENIX Symposium on File and Storage Technologies (FAST '12), San Jose, California, February 2012.
[2] Mohit Saxena, Yiying Zhang, Michael M. Swift, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. Getting Real: Lessons in Transitioning Research Simulations into Hardware Systems. In Proceedings of the 11th USENIX Symposium on File and Storage Technologies (FAST '13), San Jose, California, February 2013.
[3] Yiying Zhang, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. Removing the Costs and Retaining the Benefits of Flash-Based SSD Virtualization with FSDV. In preparation.
[4] Yiying Zhang, Jian Yang, Amirsaman Memaripour, and Steven Swanson. Mojim: A Reliable and Highly-Available Non-Volatile Memory System. To appear in Proceedings of the 20th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '15), 2015.
[5] Yiying Zhang and Steven Swanson. Sync Stinks: Application Performance with Non-Volatile Main Memory. In preparation.
[6] Yiying Zhang, Gokul Soundararajan, Mark W. Storer, Lakshmi N. Bairavasundaram, Sethuraman Subbiah, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. Warming up Storage-Level Caches with Bonfire. In Proceedings of the 11th USENIX Symposium on File and Storage Technologies (FAST '13), San Jose, California, February 2013.
[7] Yiying Zhang and Vijayan Prabhakaran. Duplication Aware Disk Array. Microsoft Technical Report.
[8] Mohit Saxena, Michael M. Swift, and Yiying Zhang. FlashTier: A Lightweight, Consistent and Durable Storage Cache. In Proceedings of the EuroSys Conference (EuroSys '12), Bern, Switzerland, April 2012.
[9] Hyeoncheol Kim, Tae-Sun Yoon, Yiying Zhang, Anupam Dikshit, and Su-Shing Chen. Predictability of Rules in HIV-1 Protease Cleavage Site Analysis. In Proceedings of the 2006 International Conference on Computational Science (ICCS '06), Reading, United Kingdom, March 2006.
[10] Hyeoncheol Kim, Yiying Zhang, Yong-Seok Heo, Heung-Bum Oh, and Su-Shing Chen. Specificity Rule Discovery in HIV-1 Protease Cleavage Site Analysis. Computational Biology and Chemistry, 32:72-79, 2008.
[11] Yiying Zhang, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. Warped Mirrors for Flash. In Proceedings of the 29th IEEE Conference on Massive Data Storage (MSST '13), Long Beach, California, May 2013.
[12] Yanpei Chen, Sara Alspaugh, and Randy Katz. Interactive Analytical Processing in Big Data Systems: A Cross-Industry Study of MapReduce Workloads. Proceedings of the VLDB Endowment, 5(12), August 2012.
[13] Raja Appuswamy, Christos Gkantsidis, Dushyanth Narayanan, Orion Hodson, and Antony Rowstron. Scale-up vs Scale-out for Hadoop: Time to Rethink? In Proceedings of the 4th Annual Symposium on Cloud Computing (SOCC '13), Santa Clara, California, October 2013.
[14] Yahoo Inc. Yahoo! WebScope Datasets.
[15] National Institutes of Health (NIH). GenBank.
[16] Facebook Inc. Large-scale Graph Partitioning with Apache Giraph.
[17] Sudharsan Seshadri, Mark Gahagan, Sundaram Bhaskaran, Trevor Bunker, Arup De, Yanqin Jin, Yang Liu, and Steven Swanson. Willow: A User-Programmable SSD. In Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI '14), Broomfield, Colorado, October 2014.
