WHITE PAPER. Permabit Albireo Data Optimization Software. Benefits of Albireo for Virtual Servers. January 2012. Permabit Technology Corporation

Permabit Technology Corporation
Ten Canal Park, Cambridge, MA 02141 USA
Phone: 617.252.9600 FAX: 617.252.9977
info@permabit.com
www.permabit.com

Contents

Introduction
VMware Storage Background
Managing VMware Storage
VMware Storage Sprawl
Data Optimization Software
Permabit Albireo Data Optimization Software
Albireo Architecture
Albireo Performance
Benefits of Albireo for VMware
Conclusion
About Permabit
Find Out More

"The Albireo technology from Permabit will save an OEM 18-24 months getting to market, if they can do it at all. This stuff is so far ahead in its capabilities and performance I can't see why you would want to do it yourself, unless you already have it baked."
Steve Duplessie, Founder & Sr. Analyst, Enterprise Strategy Group

Virtualization has significantly reduced data center footprints and significantly increased storage costs.

Introduction

Server virtualization, as popularized by VMware, Microsoft, and others, is a widely used tactic for reducing data center costs. By reducing the number of physical servers, data centers have been able to significantly reduce their footprints and the costs of server acquisition, energy, management, and more. While server-related costs have decreased, the corresponding storage costs have not; in fact, storage costs are rising rapidly as a result of server virtualization. At first glance it is not obvious why storage costs have increased, given that cost per GB has consistently fallen. The answer lies in the management of virtualized servers.

This paper reviews the popular VMware vSphere Hypervisor and how its management impacts total storage consumption and cost. It then introduces Permabit Albireo Data Optimization Software as a means of reducing virtual server storage requirements.

VMware Storage Background

A VMware virtual machine uses a virtual disk (VMDK) to store its operating system, program files, and other data associated with its activities (Figure 1). The VMDK is a large physical file, or set of files, that can be moved, deleted, and copied as easily as any other file. To store and manage virtual disks, VMware vSphere uses its own special storage space called a VMFS datastore, which is similar to a file system on a logical volume. A VMFS datastore can be created on a wide variety of physical storage devices, including internal and external storage or networked storage devices.

Figure 1: Typical VMware Storage Layout (ESX Servers A, B, and C; Virtual Machines 1, 2, and 3; a VMFS volume holding the virtual disk files)

Managing VMware Storage

Creating VMFS datastores requires careful planning. For example, configuring fewer, larger VMFS volumes allows for more virtual machine capacity and reduces the odds of requiring additional space to be allocated. Larger VMFS datastores allow more flexibility for resizing virtual disks and reduce the number of VMFS datastores to manage. Alternatively, configuring more, smaller VMFS datastores can improve virtual disk performance (due to locking and SCSI reservation issues), reduce wasted storage space, and support applications such as Microsoft Cluster Service that require each cluster disk resource to have its own LUN.

VMware Storage Sprawl

The reason that VMware storage management causes such a sharp increase in storage demand can be demonstrated with a simple example. A virtual server runs applications A, B, C, and D, each of which requires Windows Server and a Microsoft SQL database. An IT best practice is to keep three complete copies of the server installed and running: one for production, one for production standby, and one for QA test (Table 1). This simple example requires twelve virtual machines, each of which contains a full copy of the operating system, the application software, and its own copy of the application data.

Table 1: Virtual Servers Running Applications A, B, C, D

Operating System      Database     Application  Intended Use
Windows Server 2008   MS SQL 2008  A            Production
Windows Server 2008   MS SQL 2008  A            Production standby
Windows Server 2008   MS SQL 2008  A            QA Test
Windows Server 2008   MS SQL 2008  B            Production
Windows Server 2008   MS SQL 2008  B            Production standby
Windows Server 2008   MS SQL 2008  B            QA Test
Windows Server 2008   MS SQL 2008  C            Production
Windows Server 2008   MS SQL 2008  C            Production standby
Windows Server 2008   MS SQL 2008  C            QA Test
Windows Server 2008   MS SQL 2008  D            Production
Windows Server 2008   MS SQL 2008  D            QA Test
Windows Server 2008   MS SQL 2008  D            Production standby

The storage challenge created by VMware is not caused solely by the example in Table 1. After all, having three running copies of each production application is a normal practice. Rather, this storage sprawl is a direct result of the ease with which virtual machine clones can be created. Because the need for additional physical server deployment is significantly reduced, VMware administrators now create additional virtual machines for patches, bug fixes, operating system service packs, and other internal departments (e.g., engineering, support, and marketing). The resulting storage growth is at the discretion of the administrator, but dozens of virtual machine clones result from such solid administrative use cases, each consuming an identical amount of disk space. Each virtual machine also requires space for snapshots, swap files, log files, ISO images, and diagnostic partitions. Depending on the storage management practices applied to VMware virtual servers, the amount of disk space needed for all virtual machines and their support files can quickly become staggering and costly.

Data Optimization Software

Implementing data optimization software, such as data deduplication, is the answer to VMware storage sprawl. Data deduplication software identifies duplicate chunks of data so that each unique chunk is stored only once. When applied to VMFS datastores, the amount of disk space necessary to store virtual machines can be reduced significantly.

Virtual machines are ideal candidates for data reduction because they so often contain identical operating systems and applications. As a result, the actual differences between files are quite small and lend themselves to significant data reduction via deduplication technology. Figures 2 and 3 illustrate this basic storage reduction technique. In Figure 2, three virtual machines with the same operating system and application software are shown, each with its own virtual disk storage. When data optimization is applied, the three virtual disks are reduced to one (Figure 3). It is clear that virtual server storage can benefit from storage optimization.

Storage optimization also benefits virtual machine performance. Virtual machine memory blocks can be deduplicated in cache so that each virtual machine runs faster with reduced disk access.
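The idea that each unique chunk is stored only once can be illustrated with a small Python sketch built around the twelve virtual machines of Table 1. Everything in it is an assumption made for illustration: the fixed 4 KB chunk size, the `DedupStore` class, the file names, and the tiny synthetic "images" standing in for real virtual disks. It is a generic content-addressed store, not Permabit's implementation.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size in bytes (illustrative only)


def split_chunks(data: bytes, size: int = CHUNK_SIZE):
    """Yield consecutive fixed-size chunks of data."""
    for offset in range(0, len(data), size):
        yield data[offset:offset + size]


class DedupStore:
    """Hypothetical store that keeps each unique chunk exactly once."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk bytes (stored once)
        self.files = {}   # file name -> ordered list of fingerprints

    def write(self, name: str, data: bytes) -> None:
        recipe = []
        for chunk in split_chunks(data):
            fingerprint = hashlib.sha256(chunk).hexdigest()
            # Only the first occurrence of a chunk consumes space.
            self.chunks.setdefault(fingerprint, chunk)
            recipe.append(fingerprint)
        self.files[name] = recipe

    def read(self, name: str) -> bytes:
        return b"".join(self.chunks[fp] for fp in self.files[name])

    def logical_bytes(self) -> int:
        """Space the files would need without deduplication."""
        return sum(len(self.chunks[fp])
                   for recipe in self.files.values() for fp in recipe)

    def physical_bytes(self) -> int:
        """Space actually consumed by unique chunks."""
        return sum(len(chunk) for chunk in self.chunks.values())


def build_demo_store() -> DedupStore:
    # Synthetic stand-ins: 8 distinct "OS" chunks and 4 distinct "database"
    # chunks shared by every VM, plus one chunk unique to each application.
    os_image = b"".join(bytes([i]) * CHUNK_SIZE for i in range(8))
    db_image = b"".join(bytes([100 + i]) * CHUNK_SIZE for i in range(4))
    store = DedupStore()
    for app in "ABCD":
        for role in ("production", "standby", "qa"):
            app_data = app.encode() * CHUNK_SIZE
            store.write(f"{app}-{role}.vmdk", os_image + db_image + app_data)
    return store


store = build_demo_store()
ratio = store.logical_bytes() / store.physical_bytes()
print(f"{len(store.files)} virtual disks, reduction {ratio:.2f}:1")
```

With these made-up sizes, the twelve virtual disks hold 156 chunks in total but only 16 unique ones, so the sketch reports a reduction of 9.75:1. Real virtual disks would behave differently; the point is only that shared operating system and database content is stored once no matter how many clones reference it.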

Figure 2: Virtual Machines Before Deduplication (Virtual Machines 1, 2, and 3 on an ESX cluster; the datastore on traditional RAID storage holds a separate OS and application copy for each virtual machine)

Figure 3: Virtual Machines After Deduplication (Virtual Machines 1, 2, and 3 on an ESX cluster; the deduplicated volume holds a single OS and application copy)

Permabit Albireo Data Optimization Software

Permabit Albireo Data Optimization Software, used with VMFS datastores, reduces storage demands by up to 97%. Albireo is embedded within the storage device connected to the virtual server host, where it identifies duplicate blocks of data and advises the storage device so that it can update its block pointers and avoid writing the same block of data to disk more than once. This saves disk space and reduces other downstream storage-related activities such as replication, snapshots, and backup.

Albireo massively improves the performance and efficiency of data creation, transmission, and storage. It integrates at any point (parallel, inline, or post-process) at a sub-file level, enabling deduplication to optimize both primary storage and downstream replication processes.

Albireo is delivered as a Software Development Kit (SDK) to OEMs. The SDK contains the Albireo software library, full API documentation, code samples, and application notes for integration.
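The advisory role described above, in which an index identifies duplicate blocks and the storage device updates its own block pointers, can be sketched in Python. This is not the Albireo SDK or its API: the names `AdvisoryIndex`, `StorageDevice`, and `advise` are hypothetical, the advice is delivered synchronously for brevity, and reference counting for overwritten blocks is left out.

```python
import hashlib


class AdvisoryIndex:
    """Hypothetical index: remembers where each fingerprint was first seen.

    It never stores or writes block data, only fingerprints and locations.
    """

    def __init__(self):
        self._first_seen = {}  # fingerprint -> physical block number

    def advise(self, block: bytes, location: int):
        """Return the earlier location of identical content, or None if new."""
        fingerprint = hashlib.sha256(block).digest()
        previous = self._first_seen.get(fingerprint)
        if previous is None:
            self._first_seen[fingerprint] = location
        return previous


class StorageDevice:
    """Hypothetical device that owns the data and the block pointers."""

    def __init__(self, index: AdvisoryIndex):
        self.index = index
        self.physical = {}  # physical block number -> block bytes
        self.pointers = {}  # logical block number -> physical block number
        self._next_physical = 0

    def write(self, logical: int, block: bytes) -> None:
        previous = self.index.advise(block, self._next_physical)
        # The device stays in control: it verifies the advice before acting.
        if previous is not None and self.physical.get(previous) == block:
            self.pointers[logical] = previous  # share the existing block
            return
        physical = self._next_physical
        self._next_physical += 1
        self.physical[physical] = block
        self.pointers[logical] = physical

    def read(self, logical: int) -> bytes:
        # Reads follow the pointer table only; the index is not involved.
        return self.physical[self.pointers[logical]]


device = StorageDevice(AdvisoryIndex())
device.write(0, b"OS" * 2048)   # first copy is written to disk
device.write(1, b"OS" * 2048)   # duplicate: only a pointer is updated
device.write(2, b"APP" * 100)   # new content is written to disk
print(len(device.pointers), "logical blocks,",
      len(device.physical), "physical blocks")
```

Because `read` consults only the device's own pointer table, the data stays readable even if the index were discarded, which mirrors the paper's point that the storage device retains control of its data.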

Albireo Architecture

Albireo's architecture combines the Albireo High Performance Index Engine with the Albireo content segmentation technologies, and is easily implemented via the Albireo SDK. As shown in Figure 4, Albireo operates as an advisory service outside of the storage application software data path. This ensures that data integrity is never at risk and that there is zero performance impact, an important requirement for successful VMware deployments.

Figure 4: Albireo Architecture (data sources: iSCSI, FC, NFS, CIFS)

1. OEM software pushes new data and internal placement information (e.g., filename, inode, offset or LUN, block) to Albireo
2. Content-aware segmentation breaks larger objects into variable-sized chunks
3. Unique content fingerprints are computed
4. Patented indexing technologies determine if the chunk has been previously seen
5. Previous placement information is pushed asynchronously to the OEM software for file, block, or extent unification

Albireo's High Performance Index Engine table can identify duplicate data in a matter of microseconds, orders of magnitude faster than other deduplication solutions.

In the case of VMware, incoming data is managed by the existing VMFS datastore from any VMware-supported source (e.g., FC, iSCSI, SCSI, or NFS). In a parallel integration scenario, once data is received, a copy is made and delivered via the Albireo API with its corresponding metadata (e.g., file name, offset, block, LUN). Using a hash algorithm (SHA-256 or MurmurHash3), unique content fingerprints are computed and compared to existing hash keys using patented high-speed indexing technologies. Information on whether or not a data chunk is a duplicate is asynchronously pushed to the storage application software via the Albireo API for file, block, or extent unification. If the data chunk is unique, then no action is required. If the data chunk is a duplicate, then the storage application software takes steps to modify its storage tables (e.g., the inode block data structure for UNIX systems).

The advantage of the Albireo architecture is that it operates outside of the storage data flow and avoids any performance penalty. There is no risk to data integrity because Albireo itself does not write the data to disk, and data can always be read even if Albireo were to become disabled. Further, when reading data, Albireo avoids having to perform data rehydration, a performance penalty and ease-of-use issue common with other data optimization technologies.

Albireo Performance

Albireo performance tests were performed by the Enterprise Strategy Group (ESG) in August 2011. To test how well Albireo could perform deduplication, a test environment was constructed using a modified open-source file system. A set of four VMware images, totaling 157 GB, was used. The ESG results confirmed that Permabit Albireo deduplication advisory services can be used to reduce capacity requirements for storing VMware images. ESG Lab recorded an outstanding deduplication rate of 97% (36.2:1) for four VMware virtual server images.

Table 2: VMware Deduplication Results with Albireo

Data Type      Before (GB)  After (GB)  Deduplication Rate  Deduplication Ratio
VMware Images  157          4.3         97%                 36.2:1
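The two measures in Table 2 are related by simple arithmetic, shown below in Python. The formulas are the conventional definitions (rate = 1 - after/before, ratio = before/after), which are assumed here because the paper does not spell them out; the function names are illustrative.

```python
def dedup_rate(before_gb: float, after_gb: float) -> float:
    """Fraction of capacity saved: 1 - after / before."""
    return 1.0 - after_gb / before_gb


def dedup_ratio(before_gb: float, after_gb: float) -> float:
    """Reduction ratio expressed as N in 'N:1': before / after."""
    return before_gb / after_gb


BEFORE_GB = 157.0  # from Table 2
AFTER_GB = 4.3     # from Table 2 (a rounded value)

rate = dedup_rate(BEFORE_GB, AFTER_GB)
ratio = dedup_ratio(BEFORE_GB, AFTER_GB)

# After-size implied by the reported 36.2:1 ratio, for comparison.
implied_after_gb = BEFORE_GB / 36.2

print(f"rate  = {rate:.1%}")
print(f"ratio = {ratio:.1f}:1")
print(f"after-size implied by 36.2:1 = {implied_after_gb:.2f} GB")
```

Computed from the rounded table values, the ratio comes out near 36.5:1 rather than the reported 36.2:1. A ratio of 36.2:1 corresponds to roughly 4.34 GB, which also rounds to 4.3 GB, so the difference is presumably a rounding effect in the published table; that explanation is an inference, not something the paper states.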

Benefits of Albireo for VMware

As shown in the ESG Lab results, Albireo can reduce the total disk capacity necessary to store VMFS datastores by over 97%. This is a huge storage savings that comes without compromising disk I/O performance or data integrity. Permabit Albireo is the only primary storage data optimization software that operates out of the data read path and therefore does not impact disk read performance. Even if Albireo were disabled or removed for any reason, the data remains accessible.

Albireo is an advisory service to the storage system and never modifies the data written to disk. The storage device always retains full control of the data being written. This protects data integrity and eliminates the need to decompress data during reads, a process that is expensive and necessary with compression-based data optimization technologies. Permabit is working closely with primary storage vendors to integrate Albireo into existing and planned storage devices to benefit virtual environments.

Albireo (al-beer-ee-oh) appears to the naked eye to be a single star but can be resolved with a telescope into a double star, consisting of a brighter yellow star and a fainter blue star.

Conclusion

The broader adoption of virtual servers is hindered by the huge amounts of disk storage consumed by hundreds to thousands of virtual machines. With each virtual machine requiring its own independent disk storage, even a reasonably sized virtual server deployment can require a considerable amount of disk space. Permabit Albireo Data Optimization Software has been shown in independent tests to reduce VMware storage requirements by over 97%. Because additional storage space is not consumed for each separate VMDK, VMware administrators can deploy more virtual machines as needed without sacrificing storage capacity. Deploying Albireo substantially reduces direct storage costs and, equally important, reduces associated energy, space, and cooling costs. Albireo operates completely out of the data read path, so virtual server and storage performance is maintained and data integrity is never compromised. Permabit's Albireo is truly a breakthrough for the full utilization of virtual servers.

About Permabit

Permabit is a recognized leader in data efficiency technology. We enable OEMs to leverage their R&D investment, increase margin, accelerate time to market, and achieve competitive advantage. Permabit Albireo software massively improves the performance and efficiency of data creation, transmission, and storage. Solutions built with Albireo are being delivered by leading hardware, software, and service providers.

Find Out More

To learn more about the Permabit Albireo technology, or to license our products, visit our website at www.permabit.com or call us directly at 617.252.9600.

© 2012 Permabit Technology Corporation. All Rights Reserved. Permabit is a registered trademark and the Permabit logo, Albireo logo, Permabit Enterprise Archive, and Scalable Data Reduction are trademarks of the Permabit Technology Corporation. All other products or services mentioned may be covered by registered trademarks, trademarks, service marks, or product names as designated by the companies who market those products.