Christopher L. Poelker
Optimizing IT Data Services
US Commission on Cloud Computing: Commission on the Leadership Opportunity in U.S. Deployment of the Cloud (CLOUD2)
- Commission full report text available online at: www.techamericafoundation.org/cloud2
- ComputerWorld blog: http://www.computerworld.com/author/chris-poelker/
- Storage Area Networks for Dummies: http://www.wiley.com/wileycda/wileytitle/productcd-0470385138.html
APPROACHES TO IT MATURITY
Tech over Time: Creation vs. Evolution
- One Shot and Hope You Hit It
- Random Evolution
- Intelligent Design
- Omnipotent and Omnipresent
MATURITY REQUIRES CHANGE
Analysts' Take on IT Maturity Today
IT CHANGES RAPIDLY
The IT Life Cycle Runs in Waves
[Figure: timeline from 1960 to 2017 showing IT swinging in waves between converged and distributed models. Labels: Mainframe Computer, Mini-Computer, PC, Laptop, Dot Com Bust, Virtualization, Utility Computing, Cloud Computing, Smartphone, Tablet, Hyper-Converged?. Similar to the Gartner Tech Hype-Cycle. Source: http://www.computerworld.com/blog/intelligent-storage-networking]
The State of the Art of Storage: Data Moving Towards the App
[Figure: Movement of Data]
A SINGLE APPROACH MAY NOT BE FOR EVERYONE
The Cloud
NIST: Cloud Reference Architecture
Infrastructure Working Group: Choosing Applications for the Cloud
Cloud Formation Chart: IBM iSeries
12 Steps to Become Cloud Ready
1- Enable application and data mobility by virtualizing servers and storage
2- Audit applications to assess areas where cloud would be beneficial
3- Embrace encryption at rest and robust key management guidelines
4- Assess utilization / costs of existing infrastructure and operations
5- Determine data growth trends and dedupe or delete where required
6- Audit data assets by capacity and access metrics and assign classes
7- Create data storage tiers for structured and unstructured data classes
8- Consolidate infrastructure and minimize complexity (Policies / Automate)
9- Perform detailed analysis of application interdependencies
10- Outsource where appropriate
11- Head to the nearest Bar
12- Don't Worry, Be Happy!
CURRENT REALITIES
Challenges in IT
- Applications & Platforms (Physical & Virtual)
- Storage Platforms (Physical, Virtual, SSD, Spindles, Tape)
- Application Security, Regulations, Records Management, etc.
CURRENT REALITIES
Typical Data Center
CURRENT REALITIES
IT Operations Stack
Topology of an Optimized Data Center
[Figure: two sites linked over a WAN. Each site has physical and virtual compute resources and an Abstraction / Data Services layer with a management point (M). Production Disk Pools: FC Thin T1 Pool, SAS/SATA Thin T2 Pool. Backup Node: Thin Backup Pool, Disk to Disk Backup Pool with sector-level duplicate data elimination. DR Disk Pools: Thin Tiered DR Pool. Backup Node: Physical and Virtual Tape Pool.]
SIMPLIFY AND AUTOMATE
The IT Stack: Current vs. With Intelligent Abstraction
More than 40% Reduction in Complexity, 100% Data Agility
THE NEXT EVOLUTION
The Advent of Intelligent Abstraction
Intelligent Abstraction (IA) combines the power of virtualization and artificial intelligence with policy-based unified data services to enable freedom and automation of IT.
IT MATURITY PYRAMID
[Figure: pyramid. Labels: Distributed Hyper-Converged, Policy Based, Distributed, Converged. Annotation: = 2 x IBM i]
SOFTWARE DEFINED REFERENCE ARCHITECTURE
Virtual Abstraction + Unified Data Services
Intelligent Abstraction Services Stack:
- Application Integration Services
- REST API & Policy Manager
- Converged Network Services
- Data Mobility Services
- Automated Protection Services
- Storage Virtualization Services
INTELLIGENT HYPER-CONVERGED
Distributed Hyper-Converged Building Blocks
Securely Manage, Protect, and Share Information
[Figure: two building blocks, each with compute resources (physical, virtual, cloud), a management point (M), and an Intelligent Abstraction Services Stack: Application Integration Services, Converged Network Services, Data Mobility Services, Automated Protection Services, Storage Virtualization Services.]
Five Steps to Optimize IT Data Services
Before moving to the cloud, first optimize current IT infrastructure and operations to reduce costs, and then assess whether moving to a public cloud makes sense.
1. First focus on low-hanging fruit: backup and continuity
2. Implement snapshots and continuous data protection
3. Leverage protection storage for test and development
4. Virtualize servers and storage to consolidate and commoditize
5. Centralize and automate operations
What are Data Services?
Focus on the 4 Main Aspects of IT Data Services
- Provisioning
- Protection
- Replication
- Recovery
Changing the Data Services Paradigm through Innovation
Technical Innovations to Optimize Data Services
1. Server, Storage and Tape Virtualization
2. Continuous Data Protection and Snapshots
3. Global Deduplication and WAN Optimization
Virtualization: Complete Data Mobility
Move or copy data between tiers or arrays without application downtime
- Migrate data to different storage tiers while apps are up
- Replicate data between unlike storage arrays
- Implement thin provisioning on existing storage for better utilization
- Increase overall performance
[Figure: an Abstraction / Data Services layer in front of IP/iSCSI, direct attached disk, FC, and SSD storage, showing copy within array, synchronous copy, and asynchronous copy.]
Continuous Protection: Change the Physics of Backup
- Maintain a lower-cost tier of protection storage for critical system recovery and dev / test
- Move data only as it changes, not in bulk, to minimize impact
- Move only unique data to conserve space
[Figure: Data Services (CDP, VTL) between production storage (including direct attached disk) and the protection pool.]
Use Dedupe to Optimize Data Storage, WAN, and Archives
Optimize backup and archives with host-free, SAN-based data movement from the protection pool
[Figure: Data Services (CDP, VTL) between the production pool (including direct attached disk) and the protection pool (CDP, VTL and snapshots); serverless backup and virtual-to-physical movement from the protection pool to a backup server and physical tape library.]
Understanding the Problem of Data Growth and Costs
Primary Data: 20 Terabytes of data
- 2% change in data, 3% growth of data
- Five week retention
- Weekly backup of all data
- Daily incremental backup of new data
Total = Operations managing 110 Terabytes of data on tape
100TB Example: 100TB of primary data means operations manages 550 terabytes of data on tape! (5 x 110TB, since 100TB is five times the 20TB example)
- Tape restoration will require weekly full and daily tape restores until day of failure
- Data created post backup will have to be recreated
Why Dedupe Is Important: Without Dedupe
Sample parameters: Data volume = 20TB; 2% growth, 3% change weekly
Onsite Retention = 5 weeks
Cumulative data stored:
Week 1    20TB
Week 2    41TB
Week 3    63TB
Week 4    86TB
Week 5    110+TB
With Dedupe
Sample parameters: Data volume = 20TB; 2% growth, 3% change weekly
Onsite Retention = 5 weeks
Total data stored = 15.2TB
Redundant data NOT stored: 94.8TB
Cumulative data stored (primary data: 20TB):
Week 1    6.6TB
Week 2    10TB
Week 3    12TB
Week 4    14TB
Week 5    15.2TB
Perspective (dedupe ratio = space saved):
2:1  = 50%
5:1  = 80%
10:1 = 90%
20:1 = 95%
30:1 = 97%
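The "Perspective" figures follow from a simple relationship: a dedupe ratio of N:1 stores 1/N of the data, so the space saved is 1 - 1/N. A minimal Python sketch of that arithmetic follows; the function and variable names are illustrative and do not come from the slides.

```python
def space_saved_percent(ratio: float) -> float:
    """Percent of data NOT stored for a dedupe ratio of `ratio`:1."""
    if ratio < 1:
        raise ValueError("dedupe ratio must be at least 1")
    return (1 - 1 / ratio) * 100


# Ratios listed on the slide.
for ratio in (2, 5, 10, 20, 30):
    print(f"{ratio}:1 -> {space_saved_percent(ratio):.0f}% saved")

# Five-week example from the slide: 110TB protected, 15.2TB actually stored.
protected_tb = 110.0
stored_tb = 15.2
redundant_tb = protected_tb - stored_tb      # data not stored
effective_ratio = protected_tb / stored_tb   # effective dedupe ratio
print(f"Redundant data not stored: {redundant_tb:.1f}TB")
print(f"Effective ratio: {effective_ratio:.1f}:1")
```

By this arithmetic the five-week example works out to an effective ratio of a little over 7:1.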
Calculating the Benefits: Current Costs
Assuming LTO3 drives and 20 terabytes of production data:
- 20TB with 5 week retention = 110TB
- 110TB / 400GB (capacity of LTO3) = 282 tapes
- 282 x $70 per tape media = $19,740
- 80MB/s (speed of LTO3) = 6.91TB per day
- To back up 20TB in a 12-hour window, you need 6 drives
- 6 drives = $3,214 x 6 = $22,500
- $22,500 + $19,740 = $42,240
Total = $42,240 for each 20TB of primary data.
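The tape arithmetic above can be reproduced with a short Python sketch. It rests on some assumptions: the tape count uses 1TB = 1,024GB and the drive throughput uses decimal units (1TB = 1,000,000MB), because those conventions reproduce the slide's 282 tapes and 6.91TB per day. The drive total of $22,500 is taken as stated on the slide rather than recomputed from a per-drive price. All names are illustrative.

```python
import math

PRIMARY_TB = 20            # production data
RETAINED_TB = 110          # data on tape after five weeks (from the slide)
TAPE_CAPACITY_GB = 400     # LTO3 native capacity
TAPE_PRICE = 70            # dollars per cartridge
DRIVE_MB_PER_S = 80        # LTO3 native speed
WINDOW_HOURS = 12          # backup window
DRIVE_TOTAL = 22_500       # drive cost, taken as stated on the slide

# Tape count: assumes 1TB = 1,024GB, rounded up to whole cartridges.
tapes = math.ceil(RETAINED_TB * 1024 / TAPE_CAPACITY_GB)
media_cost = tapes * TAPE_PRICE

# Drive throughput: assumes decimal units (1TB = 1,000,000MB).
tb_per_drive_per_day = DRIVE_MB_PER_S * 86_400 / 1_000_000
tb_per_drive_per_window = DRIVE_MB_PER_S * WINDOW_HOURS * 3_600 / 1_000_000
drives = math.ceil(PRIMARY_TB / tb_per_drive_per_window)

total = media_cost + DRIVE_TOTAL

print(f"{tapes} tapes, ${media_cost:,} media")
print(f"{tb_per_drive_per_day:.2f}TB per drive per day, {drives} drives")
print(f"Total: ${total:,} per {PRIMARY_TB}TB of primary data")
```

Changing the retention, window, or drive generation only requires editing the constants at the top.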
Optimize by Adding Dedupe
Calculating the Benefits of Dedupe
Implement 2 x 80TB dedupe appliances at an approx. cost of $35,400
$42,240 - $35,400 = $6,840 savings for every 20TB
Savings:
- Cost of offsite tape storage contract
- Cost of any array licenses and storage for replication
- No more media costs
- Minimal WAN costs (97% less WAN to replicate deduped data)
- Faster recovery and DR
- No more shipping, storing or recalling tape
Low End Tape Versus Disk Based Backup
Costs are relative to the amount of data needing protection
High End Tape versus Disk Based Backup
DR: Fighting the Insurance Policy Mindset of Data Protection
Here is an example of lost revenue per hour by industry sector, based on publicly available data. This is useful for determining baseline outage costs. More accurate data can be obtained from an internal analysis. Remember these are HOURLY costs. Time to recover from tape?

Lost Revenue per Hour (U.S. Industry Dollars)
Industry                  Lost revenue per hour
Energy                    $2,800,000
Telecom                   $2,000,000
Manufacturing             $1,600,000
Finance                   $1,500,000
Information Technology    $1,350,000
Insurance                 $1,200,000
Retail                    $1,100,000
Pharmaceutical            $1,100,000
Chemical                  $700,000
Transport                 $670,000
Utilities                 $640,000
Healthcare                $640,000
Media                     $340,000
Recovery: Tape versus Continuous Protection
Tape (3rd party tape backup software, any server): time-consuming restore process via LAN (1Gb/s)
1. Locate & mount the tape
2. Restore (transfer) data to the host
3. Process the data
CDP (CDP target, virtual LUN, any server): instant data recovery over SAN (FC, iSCSI) without tape or restore
- No need to transfer data
- Instant recovery replaces data restore
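To get a feel for step 2 of the tape restore, the transfer time over the LAN can be estimated with a short Python sketch. It is a lower bound under stated assumptions: the link runs at its full line rate with no protocol overhead, decimal units are used, and tape locate/mount and post-restore processing are excluded. The 20TB figure is borrowed from the earlier sizing example, and the function name is illustrative.

```python
def restore_hours(data_tb: float, link_gbps: float = 1.0) -> float:
    """Hours needed to move `data_tb` terabytes over a `link_gbps` Gb/s link.

    Assumes decimal units and a fully utilized link (best case).
    """
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_gbps * 1e9)
    return seconds / 3600


# 20TB over the 1Gb/s LAN shown on the slide.
print(f"{restore_hours(20):.1f} hours")
```

Multiplying the result by an hourly outage cost gives a rough baseline for what a full restore over the LAN costs the business.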
Cross Platform Consistent Recovery
[Figure labels: Consistency Group, IPL, VTL, IP, CDP, CDP Target, Virtual LUN, SAN (FC, iSCSI)]
Instant data recovery over SAN without tape or restore
- No need to transfer data
- Instant recovery replaces data restore
Calculating the Benefits: Dedupe together with CDP based recovery
Calculate the benefits of implementing CDP and IPL from VTL. Assuming the cost of downtime is calculated using the numbers provided in my chart for a media company ($340,000 per hour) and an average current recovery time of only four hours, the calculations are as follows:
- $340,000 x 4 hours = $1,360,000 (current outage costs)
- Cost of new solution (2 sites at $100,000 per location) = $200,000
- Recovery time for NEW solution = 30 minutes ($340,000 / 2 = $170,000)
- $1,360,000 - $200,000 - $170,000 = $990,000
Total: $990,000 savings on first outage!
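The same calculation can be written as a small Python function so that other hourly costs and recovery times can be plugged in. This is a sketch of the slide's arithmetic only; the function and parameter names are illustrative.

```python
def first_outage_savings(hourly_loss: float,
                         current_hours: float,
                         new_hours: float,
                         solution_cost: float) -> float:
    """Savings on the first outage after paying for the new solution."""
    current_outage_cost = hourly_loss * current_hours
    new_outage_cost = hourly_loss * new_hours
    return current_outage_cost - solution_cost - new_outage_cost


# Media company example: $340,000 per hour, 4 hours today versus 30 minutes,
# with the new solution deployed at 2 sites for $100,000 each.
savings = first_outage_savings(340_000, 4, 0.5, 2 * 100_000)
print(f"${savings:,.0f} saved on the first outage")
```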
Leverage Protection Storage for Production
- Mount protection storage for recent snapshot copies for test/dev
- Use snapshots as source for moving data to archives
[Figure: Data Services between the production pool and the protection pool (CDP, VTL and snapshots), with the protection pool serving Data Warehouse, Test, Dev, and Hadoop Big Data workloads.]
Calculating the Benefits: Leverage Protection Pool for Test and Development
- Production storage costs: $2,000 per TB
- Protection storage costs: $600 per TB
- Prospective savings: $1,400 per TB for test and development
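Scaling the per-terabyte figures to a given test and development footprint is a one-line calculation. A minimal Python sketch follows, assuming the savings scale linearly with capacity; the 20TB footprint is a hypothetical input and the names are illustrative.

```python
PRODUCTION_COST_PER_TB = 2_000   # dollars, from the slide
PROTECTION_COST_PER_TB = 600     # dollars, from the slide


def savings_for_test_dev(capacity_tb: float) -> float:
    """Savings from hosting test/dev copies on protection storage."""
    return capacity_tb * (PRODUCTION_COST_PER_TB - PROTECTION_COST_PER_TB)


# Hypothetical 20TB test/dev footprint.
print(f"${savings_for_test_dev(20):,.0f}")
```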
Virtualize Everything
Virtualize servers and storage
Implementing virtualization can have a huge impact in multiple areas:
- Server virtualization commoditizes servers and enables server consolidation and mobility
- Storage virtualization commoditizes storage and enables complete data mobility and site resiliency
- Virtual tape for iSeries backup and DR enables rapid, cost-efficient protection for mission-critical apps
Summary: Calculating the Benefits
- The ability to move data more efficiently via virtualization can reduce storage costs by 50 percent or more.
- Consolidating applications onto virtual servers can reduce infrastructure costs by 30 to 60 percent.
- Dense virtual storage and servers reduce data center power, cooling, and floor space requirements. (For example, a single blade server may be able to run 50 applications versus 50 physical servers.) Similar to an LPAR!
- Virtual disk and tape storage with dedupe lowers costs, simplifies provisioning, and speeds recovery.
Centralize and Automate
Enterprise Wide Data Management
[Figure: small offices connect through r-hubs and e-hubs to the Primary and DR sites, with a consolidated management console and optimized continuous deduped replication. Legend: 512k small office data link; T1 bi-directional data link; OC-3 bi-directional active/active data link.]
Summary: Optimized Data Services with Public / Private Cloud
Optimized Data Services
- Simplify Operations
- Automate Processes
- Virtualize Infrastructure
[Figure labels: Edge, Core, DR; Enterprise Wide Optimized Single Instance of Data]
Optimized Private / Hybrid Cloud Example
[Figure: two sites, each with physical and virtual compute resources and a Data Services layer with a management point (M), linked over a WAN with 97% less WAN traffic; production pool and protection pool, with a public or private cloud.]
Fast System Restore via IPL
- Virtual tapes function as a data source for system IPL (boot)
- Fast recovery of systems either locally or at a remote site (use for DR, test, lab, etc.)
[Figure: an IBM i virtual tape image used for local recovery IPL and remote recovery IPL.]
iSeries Deployment Example
Insurance company, Western Europe
- iSeries with V7R1, V6, and V5
- TSM Version 6.3
- Each LPAR has a dedicated virtual library
- TSM has two dedicated virtual libraries
- Nonreplicated data copied to tape
- Specific libraries set for replication over 1 Gb/s link
[Figure: LPARs and Windows, Unix and Linux servers, each backing up to virtual libraries (VLIB, virtual LTO-x); STK 9310 tape library; replication site at 160 km.]
- Combines TSM and IBM i backup to one VTL (one solution)
- Specific libraries set to replicate tapes to second site to protect critical data (flexibility)
- Data maintained both on disk with deduplication and on physical tape (choice)
Automatic Tape Caching Simplifies Implementation
- Moves backups to physical tape based on policies (age, time, space, etc.)
- Virtual tape can be kept for a period after copy to tape
- Barcodes maintained
- Invisible to backup software
- Direct restore from tape
- One backup job for disk and tape
[Figure: "What the IBM i server sees" (backup to and restore from a tape library; backup driven) versus "What is actually happening" (backup to VTL, then VTL to tape; policy driven).]
Benefits of Deduped VTL vs. Internal i5/OS Virtual Tapes
i5/OS Virtual Tapes: Need to manage virtual tapes in relation to system disk consumption and integrated file system
Virtual Tape Library: No impact on system disk consumption, no management overhead

i5/OS Virtual Tapes: Consumes costly iSeries-specific storage
Virtual Tape Library: Storage independent of iSeries storage

i5/OS Virtual Tapes: Limited operational integration with tape
Virtual Tape Library: Direct operational integration with tape
Simple Implementation
The FalconStor VTL exactly emulates existing physical tape infrastructure for IBM i
- Virtual IBM 3580, 3590, 3592, TS1120 tape drives
- Virtual LTO-1, LTO-2, LTO-3, LTO-4, and LTO-5 tape drives
- Virtual IBM 3583, 3584, and 3590 libraries
All existing backup tools and methods can continue to be used in the same manner
- BRMS, LXI MMS, SAVLIB, Attempo, media policies, etc.
Daily operational impact is near zero
Summary: Private/Hybrid Cloud Value Proposition
- Simplify datacenter infrastructure via reference building blocks
- Provision storage anywhere in a few minutes
- 95% better recovery time (RTO)
- 99.999% improvement in recovery point (RPO)
- Save 80% or more on disk space
- 80% savings on bandwidth required for DR
- Use modular storage tiers to reduce CapEx by over 50%
- Eliminate backup windows
- Open systems protection is server-less and LAN-free
- Eliminate most array-based licenses (except for RAID)
- Everything is stored and moved as efficiently as possible
- Simplify operations
Technical References
- FalconStor VTL: http://www.falconstor.com
- Cloud Commission full report text available online at: www.techamericafoundation.org/cloud2
- ComputerWorld blog: http://www.computerworld.com/author/chris-poelker/
- Storage Area Networks for Dummies: http://www.wiley.com/wileycda/wileytitle/productcd-0470385138.html
- LinkedIn Groups: AS/400 Professionals