IBM Spectrum Scale. Spectrum Scale 4.2 File compression

Similar documents
Cloud Computing with xcat on z/vm 6.3

Version 8.2. Tivoli Endpoint Manager for Asset Discovery User's Guide

Tip and Technique on creating adhoc reports in IBM Cognos Controller

Platform LSF Version 9 Release 1.2. Migrating on Windows SC

How to Deliver Measurable Business Value with the Enterprise CMDB

System z Batch Network Analyzer Tool (zbna) - Because Batch is Back!

Installing on Windows

Title. Click to edit Master text styles Second level Third level

How To Manage Energy At An Energy Efficient Cost

Energy Management in a Cloud Computing Environment

IBM Endpoint Manager. Security and Compliance Analytics Setup Guide

IBM XIV Management Tools Version 4.7. Release Notes IBM

Backups in the Cloud Ron McCracken IBM Business Environment

Rapid Data Backup and Restore Using NFS on IBM ProtecTIER TS7620 Deduplication Appliance Express IBM Redbooks Solution Guide

IBM Tivoli Storage FlashCopy Manager Overview Wolfgang Hitzler Technical Sales IBM Tivoli Storage Management

z/vm Capacity Planning Overview

z/osmf Software Deployment Application- User Experience Enhancement Update

Session 1494: IBM Tivoli Storage FlashCopy Manager

DFSMS Object Support Overview: Data Archiving with OAM Session: 17809

IBM Systems Director Navigator for i5/os New Web console for i5, Fast, Easy, Ready

IBM Software Services for Collaboration

Managed Services - A Paradigm for Cloud- Based Business Continuity

TSM for Virtual Environments Data Protection for VMware

IBM Endpoint Manager for Software Use Analysis Version Beta Features Guide

Creating a Cloud Backup Service. Deon George

Oracle on System z Linux- High Availability Options Session ID 252

IBM Tivoli Storage FlashCopy Manager

Tivoli Endpoint Manager for Security and Compliance Analytics. Setup Guide

Endpoint Manager for Mobile Devices Setup Guide

Data Transfer Tips and Techniques

Migrating LAMP stack from x86 to Power using the Server Consolidation Tool

Deploying a private database cloud on z Systems

Session Title: Cloud Computing 101 What every z Person must know

IBM i Network Install using Network File System

Communications Server for Linux

Performance Characteristics of VMFS and RDM VMware ESX Server 3.0.1

IBM Systems and Technology Group Technical Conference

Forecasting Performance Metrics using the IBM Tivoli Performance Analyzer

The Consolidation Process

Java Application Performance Analysis and Tuning on IBM System i

Performance and scalability of a large OLTP workload

Sterling Business Intelligence. Concepts Guide

IBM Smart Business Storage Cloud

Maximo Business Intelligence Reporting Roadmap Washington DC Users Group

IBM Security QRadar Version (MR1) Checking the Integrity of Event and Flow Logs Technical Note

Big Data & Analytics. A boon under certain conditions. Dr. Christian Keller General Manager IBM Switzerland IBM Corporation

IBM Tivoli Identitiy Manager 5.1: Writing Java Extensions and Application Code

IBM Tivoli Storage Manager for Virtual Environments

z/os V1R11 Communications Server system management and monitoring

Practical Web Services for RPG IBM Integrated Web services for i

IBM Endpoint Manager Version 9.2. Software Use Analysis Upgrading Guide

Arwed Tschoeke, Systems Architect IBM Systems and Technology Group

CS z/os Application Enhancements: Introduction to Advanced Encryption Standards (AES)

IBM BigFix Version 9.2. Patch for CentOS User's Guide IBM

IBM Endpoint Manager Version 9.1. Patch Management for Mac OS X User's Guide

IBM Cognos 10: Enhancing query processing performance for IBM Netezza appliances

IBM TSM DISASTER RECOVERY BEST PRACTICES WITH EMC DATA DOMAIN DEDUPLICATION STORAGE

IBM Endpoint Manager Version 9.0. Patch Management for Red Hat Enterprise Linux User's Guide

IBM Cognos Controller Version New Features Guide

Accelerate with ATS DS8000 Hardware Management Console (HMC) Best practices and Remote Support Configuration September 23rd, 2014.

Manage rapid NAS data growth while reducing storage footprint and costs

Real-time Compression: Achieving storage efficiency throughout the data lifecycle

Tivoli Endpoint Manager for Security and Compliance Analytics

IBM Cloud Orchestrator Content Pack for OpenLDAP and Microsoft Active Directory Version 2.0. Content Pack for OpenLDAP and Microsoft Active Directory

Release Notes. IBM Tivoli Identity Manager Oracle Database Adapter. Version First Edition (December 7, 2007)

IBM SmartCloud Analytics - Log Analysis. Anomaly App. Version 1.2

TSM (Tivoli Storage Manager) Backup and Recovery. Richard Whybrow Hertz Australia System Network Administrator

IBM Endpoint Manager Version 9.0. Patch Management for Windows User's Guide

IBM Enterprise Content Management Software Requirements

WebSphere Business Compass Version 7. Getting started with process maps

Determining which Solutions are the Best Fit for Linux on System z Workloads

IBM Tivoli Endpoint Manager Version 8.2. Client Manager for Endpoint Protection User's Guide

Virtual Machine Environments: Data Protection and Recovery Solutions

INCREASING EFFICIENCY WITH EASY AND COMPREHENSIVE STORAGE MANAGEMENT

Database lifecycle management

Patch Management for Red Hat Enterprise Linux. User s Guide

mmbackup and TSM Integration

Disaster Recovery Procedures for Microsoft SQL 2000 and 2005 using N series

An Oracle Technical White Paper June Oracle VM Windows Paravirtual (PV) Drivers 2.0: New Features

IBM TRIRIGA Anywhere Version 10 Release 4. Installing a development environment

OS Deployment V2.0. User s Guide

Big data management with IBM General Parallel File System

Linux. Managing security compliance

HP D2D NAS Integration with HP Data Protector 6.11

z/os Basics: z/os UNIX Shared File System environment and how it works

VMware ESX 2.5 Server Software Backup and Restore Guide on Dell PowerEdge Servers and PowerVault Storage

Quantum DXi6500 Family of Network-Attached Disk Backup Appliances with Deduplication

ServeRAID M5000 Series Performance Accelerator Key for IBM System x Product Guide

VMware Virtual SAN Backup Using VMware vsphere Data Protection Advanced SEPTEMBER 2014

Backup and Recovery: The Benefits of Multiple Deduplication Policies

6Gb SAS Host Bus Adapter Product Guide

Access to easy-to-use tools that reduce management time with Arcserve Backup

WebSphere Commerce V7.0

IBM Tivoli Storage Manager Suite for Unified Recovery

An Oracle White Paper January Using Oracle's StorageTek Search Accelerator

Running VirtualCenter in a Virtual Machine

SAN Conceptual and Design Basics

Tivoli Endpoint Manager for Configuration Management. User s Guide

Integrating ERP and CRM Applications with IBM WebSphere Cast Iron IBM Redbooks Solution Guide

Practical Performance Understanding the Performance of Your Application

Transcription:

IBM Spectrum Scale File compression

Agenda : Compression Overview : Compression Current Implementation : Closer Look to Compression : Current limitations to Compression implementation IBM Systems 2

: Compression SPECTRUM SCALE COMPRESSION OVERVIEW IBM Systems 3

: File Compression Compression / decompression is controlled per file Not live compression Policy-driven background compression/decompression CLI-driven compression/decompression with mmchattr Platform support: Linux, AIX, Windows Compression algorithm/library used: deflate/zlib favors saving space over speed Inline read of compressed file Inline decompression of compression group upon writes Improved storage efficiency Reclaim space from cold data Improvement in storage efficiency depends on the file itself Typically 2:1 to 5:1 But could reach 10:1 on trivial files or none on random ones Vision Which files to compress When to compress the file data How to compress the file data IBM Systems 4

: File Compression Improved i/o bandwidth Read/write compressed data reduces load on storage backend Read/write compressed data reduces network usage Client Reduced network traffic Improved client side caching Caching compressed data reduce cache (pagepool) consumption Reduced load on storage backend IBM Systems 5

: File Compression Please keep in mind With this release compression is primarily for: Cold Data If storage path is constraint, reading compressed data is slower than reading raw data due to on-the-fly zlib decompression overhead to be addressed by using faster algorithm (such as lz4), hardware acceleration, and other optimizations Reduced disk IO can benefit read performance more in a busier system Update to PTF 4.2.0.2 or above [10 March 2016] Bug fix which could have occurred on previous version when a user write on a compressed file, while mmrestripefs z is running in background IBM Systems 6

: Compression CURRENT IMPLEMENTATION IBM Systems 7

Compress/Decompress Files CLI command mmchattr command: mmchattr --compression { yes no } -I { yes defer } File mmchattr --compression yes -I yes /gpfs/gpfs0/test-64mb-0 mmchattr --compression yes -I defer /gpfs/gpfs0/test-64mb-0 mmchattr --compression no -I yes /gpfs/gpfs0/test-64mb-0 mmchattr --compression no -I defer /gpfs/gpfs0/test-64mb-0 The compression (--compression yes) or decompression (--compression no) is either Synchronous with -I yes : The file is compressed/decompressed immediately Deferred with I defer : the file will be compressed/decompressed with a mmrestripefs -z or mmrestripefile -z mmrestripefs gpfs0 z mmrestripefile z gpfs/gpfs0/test-64mb-0 The file is marked illcompressed in its flags until the restripe is done IBM Systems 8

Compress Files synchronously 64 MB The file before compression 64 MB IBM Systems 9

Compress Files synchronously The file after sync compression ARCHIVE COMPRESSED COMPRESSED IBM Systems 10

Compress Files synchronously < 7MB The file after sync compression Size 6592KB IBM Systems 11

Compress Files asynchronously, defer option 64 MB The file before compression 64 MB IBM Systems 12

Compress Files asynchronously, defer option defer ILLCOMPRESSED The file after async compression ARCHIVE COMPRESSED Flag illcompressed IBM Systems 13

Compress Files asynchronously, defer option mmrestripefs -z The file is restriped It is now COMPRESSED IBM Systems 14

Compress Files asynchronously, defer option 64 MB The file is compressed Its size is same due to is complexity IBM Systems 15

Compress/Decompress Files with management policy Policy-Driven, management policy Similar to data replication factor change through MIGRATE policy rule MIGRATE supports COMPRESS('yes') and COMPRESS ('no') Example RULE 'COMPRESSION' MIGRATE COMPRESS('yes') WHERE NAME LIKE blue% This rule compress all files starting by blue mmapplypolicy [-I {yes defer}] P <policy file name> Per-file Compression Control and Status bits in inode MISC_ATTRIBUTES c the file is selected to be compressed K some data blocks are ill-compressed IBM Systems 16

File attributes with compression COMPRESSED illcompressed Meaning Not displayed Not displayed The file is not compressed. Any pending decompression is completed Not displayed Displayed The file is marked for decompression which is deferred Displayed Not displayed The file is compressed. Any pending compression is completed Displayed Displayed The file is marked for compression. Compression is deferred, or part of the file is decompressed due to a file update or memory mapping IBM Systems 17

: Compression CLOSER LOOK TO COMPRESSION IBM Systems 18

A Closer Look At a Compressed File Before the compression of the file tsdbfs gpfs0 listda i :display disk address list for inode i in the file system gpfs0 The inode of the file has an address on disk 2 sector 569856 This sector contains indirect blocks addresses The indirect blocks have addresses of the data on disks at sector n: 3:314752 IBM Systems 19

A Closer Look At a Compressed File After the compression of the file tsdbfs gpfs0 listda i :display disk address list for inode i in the file system gpfs0 The inode of the file has not changed Indirect blocks addresses have not changed The data blocks have change Grouped by 10 in compression group Compressed data blocks have compressed flag: C Deallocated block have znull IBM Systems 20

Accessing compressed files Clients vm04 vm03 Network Servers vm02 vm01 /gpfs/gpfs0/test-64mb-0 The compressed file is 6.5 MB IBM Systems 21

Accessing compressed files: reads 6.5 MB Clients vm04 vm03 Before the read the RX is 6 MB Network Servers vm02 vm01 Read the file from the client over the network /gpfs/gpfs0/test-64mb-0 After the read the RX is 12.8 MB : size of compressed file on disk IBM Systems 22

Accessing compressed files: reads Reading compressed files The file is decompressed on the fly where the read is issued The file remains compressed on disk Reduced disk IOs, only the compressed blocks are retrieved from the disks Reduced network transfers, only the compressed blocks are send on the network The read performance is increased when The cost of decompression is balanced by the gain of blocks transfer So it depends on the network type and storage performance Example on VMs Compressed Decompressed Read from NSD server (disk local) 81.2 MB/s 125 MB/s Read from client 60.9 MB/s 55.0 MB/s Decompression slows down on the NSD server Compressed read faster on the client IBM Systems 23

Accessing compressed files: writes Modify 64KB in first data block Clients vm04 vm03 Network Servers vm02 vm01 The compression group containing the modified block is decompressed /gpfs/gpfs0/test-64mb-0 IBM Systems 24

Accessing compressed files: writes Clients vm04 vm03 Network File is illcompressed Servers vm02 vm01 /gpfs/gpfs0/test-64mb-0 IBM Systems 25

Accessing compressed files Writing in compressed files The parts of the files touched are decompressed The decompression is synchronous on the disks The granularity is 10 data blocks decompressed 1 data block touched, implies the 9 other of the decompression group are decompressed The file become illcompressed It needs to be restriped with mmrestripefs -z or mmrestripefile -z IBM Systems 26

: Compression CURRENT LIMITATIONS IBM Systems 27

Current Limitations File compression for cold data Zlib data compression library is used. It favors space over speed If the calculated potential saving space is below 10%, the compression group (10 data blocks) is not compressed The following operations are not supported: Compressing files in snapshots Compressing a clone or clone parent, or directories Compressing files in an AFM cache site or in an AFM-based asynchronous Disaster Recovery (DR) fileset Compressing small files (files that consume fewer than two subblocks, compressing small files into an inode) Compressing files in an FPO environment or horizontal storage pools Cloning a compressed file On Windows: mmapplypolicy command is not available. mmchattr command is available to compress/decompress. Compression of files in Windows hyper allocation mode is not supported. The following Windows APIs are not supported: FSCTL_SET_COMPRESSION to enable/disable compression on a file FSCTL_GET_COMPRESSION to retrieve compression status of a file In Windows Explorer, in the Advanced Attributes window, the compression feature is not supported. IBM Systems 28

DIO, Mapped Files, mmrestorefs, etc. Compress Direct IO file is not supported Direct IO is turned into buffered IO on compressed files To help enforce direct IO, Spectrum Scale does not compress a file that is already opened for Direct IO. An error is returned the compression command Memory Mapped File If the file is already compressed, upon page-in request, Spectrum Scale automatically decompresses the paged-in region and set the illcompressed flag of the file. Spectrum Scale does not compress a file already memory mapped. An error is returned HSM, backup, ADR, AFM: Not compression-aware HSM migrate/recall and backup/restore decompressed data and cause files to become illcompressed Use mmrestripefs or mmrestripefile command to recompress the recalled/restored files mmrestorefs: Restored data from snapshot is decompressed Do not compress files system (using policy, mmchattr, or mmrestripefs) during mmrestorefs to avoid a data loss risk (this will be fixed in PTF1). HSM, backup, ADR, AFM: Not compression-aware HSM migrate/recall and backup/restore decompressed data and cause files to become illcompressed mmrestripefs or mmrestripefile command to recompress the recalled/restored files. IBM Systems 29

Thank you. ibm.com/systems IBM Systems 30

Information and trademarks IBM, the IBM logo, ibm.com, IBM System Storage, IBM Spectrum Storage, IBM Spectrum Control, IBM Spectrum Protect, IBM Spectrum Archive, IBM Spectrum Virtualize, IBM Spectrum Scale, IBM Spectrum Accelerate, Softlayer, and XIV are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at http://www.ibm.com/legal/copytrade.shtml The following are trademarks or registered trademarks of other companies. Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries. IT Infrastructure Library is a Registered Trade Mark of AXELOS Limited. Linear Tape-Open, LTO, the LTO Logo, Ultrium, and the Ultrium logo are trademarks of HP, IBM Corp. and Quantum in the U.S. and other countries. Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates. Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom. ITIL is a Registered Trade Mark of AXELOS Limited. UNIX is a registered trademark of The Open Group in the United States and other countries. * All other products may be trademarks or registered trademarks of their respective companies. Notes: Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here. All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions. This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area. All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. Information about non-ibm products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the performance, compatibility, or any other claims related to non-ibm products. Questions on the capabilities of non-ibm products should be addressed to the suppliers of those products. Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography. This presentation and the claims outlined in it were reviewed for compliance with US law. Adaptations of these claims for use in other geographies must be reviewed by the local country counsel for compliance with local laws. IBM Systems 31

Special notices This document was developed for IBM offerings in the United States as of the date of publication. IBM may not make these offerings available in other countries, and the information is subject to change without notice. Consult your local IBM business contact for information on the IBM offerings available in your area. Information in this document concerning non-ibm products was obtained from the suppliers of these products or other public sources. Questions on the capabilities of non-ibm products should be addressed to the suppliers of those products. IBM may have patents or pending patent applications covering subject matter in this document. The furnishing of this document does not give you any license to these patents. Send license inquires, in writing, to IBM Director of Licensing, IBM Corporation, New Castle Drive, Armonk, NY 10504-1785 USA. All statements regarding IBM future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. The information contained in this document has not been submitted to any formal IBM test and is provided "AS IS" with no warranties or guarantees either expressed or implied. All examples cited or described in this document are presented as illustrations of the manner in which some IBM products can be used and the results that may be achieved. Actual environmental costs and performance characteristics will vary depending on individual client configurations and conditions. IBM Global Financing offerings are provided through IBM Credit Corporation in the United States and other IBM subsidiaries and divisions worldwide to qualified commercial and government clients. Rates are based on a client's credit rating, financing terms, offering type, equipment type and options, and may vary by country. Other restrictions may apply. Rates and offerings are subject to change, extension or withdrawal without notice. IBM is not responsible for printing errors in this document that result in pricing or information inaccuracies. All prices shown are IBM's United States suggested list prices and are subject to change without notice; reseller prices may vary. IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply. Any performance data contained in this document was determined in a controlled environment. Actual results may vary significantly and are dependent on many factors including system hardware configuration and software design and configuration. Some measurements quoted in this document may have been made on development-level systems. There is no guarantee these measurements will be the same on generally-available systems. Some measurements quoted in this document may have been estimated through extrapolation. Users of this document should verify the applicable data for their specific environment. IBM Systems 32