Distributed Block-level Storage Management for OpenStack
OpenStack APAC Conference
Daniel Lee
CCMA/ITRI
Cloud Computing Center for Mobile Applications, Industrial Technology Research Institute (雲端運算行動運用科技中心)

Outline
- Overview
- Brief on ITRI Cloud OS
- Cloud storage system in ITRI Cloud OS
  - Distributed Main Storage System
  - Distributed Secondary Storage System
- Integration of Cloud OS Storage System with OpenStack
- Summary
- Q&A

Overview
- Data is growing exponentially and moving to the Cloud.
- It needs to be available at all times, secured, and protected.
- The 9/11 attack on the Twin Towers, the 3/11 earthquake in Japan, and the 4/20 outage of Amazon data centers are examples of disasters.
  - Block-level and fully redundant
  - Backup / restore + wide-area
  - Thin provisioning
  - De-duplication
  - Others
- The Cloud OS Storage System has this functionality and is ready to integrate with OpenStack.

ITRI Cloud OS (An All-in-one IaaS Total Solution)
- Virtual Data Center Management
- Virtual Data Center Provisioning (VMware)
- Inter-PM Load Balancing and VM Fail-over (VMware)
- Inter-VM Load Balancing (F5)
- Physical Data Center Management (IBM)
- Physical Resource Management (HP/Dell)
- Primary/Secondary Storage Management (EMC)
- Security (Check Point)
- Power Management (IBM)
- Physical Compute Servers
- Distributed Main/Secondary Storage
- All-layer-2 Network
* The name in brackets (shown in blue on the slide) indicates the leading company for that component.

ITRI Cloud OS Service Interface
Multiplexing VDCs in a Physical DC
[Diagram labels]
- Virtual Data Center Management
- Physical Data Center Management
- Provision and Deploy
- Photo Sharing VDC, Video Streaming VDC, Web Conference VDC
- Cloud Application Developer
- Cloud Service Provider
- Monitor and Configure Virtual Resources
- Physical Cluster
- Cloud Service Infrastructure Administrator
- Carrier
- Monitor, Diagnose and Configure Physical Resources

VDCM Assets (VDC, VC, VM)
VDCM System Images
VDCM - Volumes
Cloud Storage System in Cloud OS
Cloud storage targets cloud-scale data centers and is designed to be scalable, available, and low-cost.
Main components:
- Distributed Main Storage subsystem (DMS)
  - Processes I/O requests against the current volume image
- Distributed Secondary Storage subsystem (DSS)
  - Block-level, incremental, metadata-only backup system
  - Volume snapshot / restore, de-duplication engine, garbage collection, and WADB (wide-area data backup) engine

Key Features of Cloud Storage System from the User's Perspective
- Volumes are used like raw, unformatted block devices
- Volume sizes from 1 GB to 2 TB are supported (64 TB architecture-wise)
- A volume can be attached by different instances (one at a time for now)
- Multiple volumes can be mounted to the same instance
- Data is automatically replicated on different physical storage servers (N=3)
- Volumes can be cloned, and point-in-time snapshots of volumes can be created
- Wide-area backup for disaster recovery
- Snapshots can be scheduled by assigning a backup policy with time, frequency, retention, and WADB option
- A system image can be saved for starting multiple VMs from the same image

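The backup policy mentioned above (time, frequency, retention, WADB option) can be pictured as a small record. The sketch below is illustrative only: the class name `BackupPolicy`, its field names, and both helper methods are assumptions, not the Cloud OS schema or API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class BackupPolicy:
    """Hypothetical policy record; names are illustrative, not the Cloud OS schema."""
    start: datetime        # time of the first scheduled snapshot
    every: timedelta       # frequency: interval between snapshots
    retention: timedelta   # how long each snapshot is kept
    wadb: bool = False     # also copy each snapshot to a remote site

    def __post_init__(self):
        # A non-positive interval would make the schedule loop forever.
        if self.every <= timedelta(0):
            raise ValueError("every must be a positive interval")

    def due_times(self, until: datetime) -> list:
        """Return all scheduled snapshot times from start up to and including until."""
        times = []
        t = self.start
        while t <= until:
            times.append(t)
            t += self.every
        return times

    def is_expired(self, taken_at: datetime, now: datetime) -> bool:
        """A snapshot expires once it is older than the retention window."""
        return now - taken_at > self.retention
```

Expired snapshots found this way would be candidates for the garbage collection that DSS performs.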
Key Features of Cloud Storage System from the Provider's Perspective
- Commodity hardware (JBOD) reduces cost
- Disk space management for up to multiple petabytes
- Disks can be added and removed without interruption
- Thin provisioning
- A specific level of I/O performance can be provisioned, if desired, by setting an I/O throttle
- Block-level de-duplication reduces the space requirement

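Thin provisioning means a volume advertises its full size but consumes physical space only for blocks that have actually been written. A minimal sketch of that idea follows; the class `ThinVolume`, the 4 KiB block size, and the zero-fill behaviour for unwritten blocks are assumptions for illustration, not details taken from DMS.

```python
class ThinVolume:
    """Toy thin-provisioned volume: space is allocated on first write only."""

    BLOCK_SIZE = 4096  # assumed block size, not a DMS parameter

    def __init__(self, size_blocks: int):
        self.size_blocks = size_blocks
        self._blocks = {}  # logical block id -> data, only for written blocks

    def _check(self, lbid: int) -> None:
        if not 0 <= lbid < self.size_blocks:
            raise IndexError("block id outside the provisioned range")

    def write(self, lbid: int, data: bytes) -> None:
        self._check(lbid)
        if len(data) != self.BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        self._blocks[lbid] = bytes(data)

    def read(self, lbid: int) -> bytes:
        self._check(lbid)
        # Unwritten blocks read back as zeros and occupy no space.
        return self._blocks.get(lbid, bytes(self.BLOCK_SIZE))

    def provisioned_bytes(self) -> int:
        return self.size_blocks * self.BLOCK_SIZE

    def allocated_bytes(self) -> int:
        return len(self._blocks) * self.BLOCK_SIZE
```

The gap between `provisioned_bytes()` and `allocated_bytes()` is the space a provider can hand out to other volumes.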
DMS in Cloud OS
DMS - Distributed Main Storage
Goal: a cloud-scale network storage solution that provides high capacity, high reliability, high availability, and high performance
Features:
- Volume operations: Create / Attach / Detach / Delete
- High reliability: data replication (N=3)
- High availability: no single point of failure
- High performance: client-side metadata/data cache
- Thin provisioning to optimize utilization of available storage
- Fast volume cloning
- Save Image; FastBoot
- Disk I/O throttle

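The slides do not say how the disk I/O throttle is implemented. One common way to cap I/O rate is a token bucket, sketched below purely as an illustration; the class name, parameters, and the choice of token bucket itself are assumptions. Time is passed in explicitly so the behaviour is easy to reason about.

```python
class TokenBucket:
    """Token-bucket rate limiter, shown as one possible I/O throttle (assumed technique)."""

    def __init__(self, rate_iops: float, burst: float, now: float = 0.0):
        self.rate = float(rate_iops)   # tokens (I/O requests) refilled per second
        self.capacity = float(burst)   # maximum tokens that can accumulate
        self.tokens = float(burst)     # start with a full bucket
        self.last = now                # time of the last refill

    def allow(self, now: float, cost: float = 1.0) -> bool:
        """Return True if a request of the given cost may proceed at time now."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A request that is refused would be queued or delayed by the caller until enough tokens have accumulated.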
DMS System Architecture
[Diagram]
- Compute Node: VMs with their volumes, DMS in-kernel client
- Service Node: NameNode (metadata)
- Storage Node: DataNode (data)
- Other label: RR
DMS has three components:
1. NameNode
2. DMS in-kernel client
3. DataNode

DMS Compute Node Architecture
[Diagram labels]
- Compute Node
- Domain U User Space: Applications (Read/Write Sectors)
- Domain U Kernel Space: /dev/hda1, Block Frontend Driver
- Xen / KVM: I/O Virtualization; Attach a disk to Dom U
- Domain 0 User Space: RPM (Attach/Detach Volume), User-space Daemon
- IOCTL
- Domain 0 Kernel Space: Block Backend Driver, In-Kernel Client, Block Devices (/dev/dms01, /dev/dms02.....)
- NameNode (Metadata)
- DataNodes (Payload)

Read Flow
[Diagram: Compute Node (VM, Volume, DMS In-kernel Client); Service Node (NameNode); three Storage Nodes (DataNode)]
1. Read request
2. Query metadata: (VolID, LBID)
3. HBID, N-locations
4. Read data
5. Read data from another DataNode if the previous try failed
6. Response (payload)
7. Report failed locations
OpenStack APAC Conference Copyright 2008 ITRI 工業技術研究院 /8/11

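The seven read steps can be sketched as a small in-memory simulation. Everything here is a stand-in: the classes `DataNode` and `NameNode`, the function `client_read`, and the dictionary-based storage are hypothetical and only mirror the step numbers on the slide, not the real in-kernel client.

```python
class DataNode:
    """Toy storage node holding payloads keyed by HBID."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.alive = True

    def read(self, hbid):
        if not self.alive:
            raise OSError(f"{self.name} unreachable")
        return self.blocks[hbid]


class NameNode:
    """Toy metadata service: (VolID, LBID) -> (HBID, replica locations)."""

    def __init__(self):
        self.table = {}
        self.failed_reports = []

    def lookup(self, vol_id, lbid):
        return self.table[(vol_id, lbid)]

    def report_failed(self, hbid, names):
        self.failed_reports.append((hbid, list(names)))


def client_read(namenode, datanodes, vol_id, lbid):
    """Handle a read request (step 1) by following steps 2 to 7."""
    hbid, locations = namenode.lookup(vol_id, lbid)   # steps 2 and 3
    failed = []
    for name in locations:
        try:
            payload = datanodes[name].read(hbid)      # step 4, or step 5 on retry
        except (OSError, KeyError):
            failed.append(name)
            continue
        if failed:
            namenode.report_failed(hbid, failed)      # step 7
        return payload                                # step 6
    namenode.report_failed(hbid, failed)
    raise OSError("all replicas failed")
```

With N=3 replicas, a read succeeds as long as at least one of the listed DataNodes still holds the block.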
Write Flow: Memory-Ack Approach
[Diagram: Compute Node (VM, Volume, DMS In-kernel Client); Service Node (NameNode); three Storage Nodes (DataNode)]
1. Write request
2. Request for new space: (VolID, LBID)
3. HBID, N-locations
4. Write data
5. Memory Ack
6. Response
6. Disk Ack
7. Commit (success, or report failed locations)
Memory-ack approach: the client sends the I/O response to the VM (step 6) right after receiving the memory acks from the DataNodes (step 5).

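The ordering that defines the memory-ack approach (respond to the VM after the memory acks, commit only after the disk acks) can be shown with a synchronous toy model. All class and function names below are hypothetical, and a real client would handle the acks asynchronously.

```python
class DataNode:
    """Toy storage node that buffers writes in memory before flushing to disk."""

    def __init__(self, name):
        self.name = name
        self.memory = {}
        self.disk = {}

    def receive(self, hbid, payload):
        self.memory[hbid] = payload
        return "memory-ack"            # step 5

    def flush(self, hbid):
        self.disk[hbid] = self.memory.pop(hbid)
        return "disk-ack"              # step 6 (disk ack)


class NameNode:
    """Toy metadata service that allocates space and records commits."""

    def __init__(self, node_names, replicas=3):
        self.node_names = list(node_names)
        self.replicas = replicas
        self.next_hbid = 0
        self.pending = {}
        self.committed = {}

    def allocate(self, vol_id, lbid):
        hbid = self.next_hbid
        self.next_hbid += 1
        locations = self.node_names[: self.replicas]
        self.pending[hbid] = (vol_id, lbid, locations)
        return hbid, locations         # step 3

    def commit(self, hbid, failed=()):
        vol_id, lbid, locations = self.pending.pop(hbid)
        good = [n for n in locations if n not in failed]
        self.committed[(vol_id, lbid)] = (hbid, good)


def client_write(namenode, datanodes, vol_id, lbid, payload, events):
    hbid, locations = namenode.allocate(vol_id, lbid)                    # steps 2 and 3
    acks = [datanodes[n].receive(hbid, payload) for n in locations]      # steps 4 and 5
    if all(a == "memory-ack" for a in acks):
        events.append("respond-to-vm")                                   # step 6 (response)
    for n in locations:
        datanodes[n].flush(hbid)                                         # step 6 (disk ack)
    events.append("disk-acks-received")
    namenode.commit(hbid)                                                # step 7
    events.append("commit")
    return hbid
```

The VM sees its write complete before the data is on disk, which lowers latency; the commit in step 7 is what makes the new location visible in the metadata.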
DSS in Cloud OS
- Shares the same homogeneous storage space with DMS
- Block-level incremental backup
  - Copy-on-write (COW) and volume cloning techniques
  - A backup is just a snapshot of metadata
- Supports volume-level restore
- Block-level de-duplication
- Garbage collection of expired, unused, and redundant disk blocks
- Wide-area data backup / restore and near-real-time replication
- The secondary storage itself is failure-tolerant (HA)

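To see why a backup can be "just a snapshot of metadata", consider a volume whose metadata maps logical block IDs to physical block IDs. The sketch below is a toy model under that assumption: `BlockStore`, `Volume`, and their methods are invented for illustration, and the garbage collector here is a simple reachability sweep rather than the DSS algorithm.

```python
class BlockStore:
    """Toy physical store: every write lands in a fresh block (copy-on-write)."""

    def __init__(self):
        self.blocks = {}
        self._next = 0

    def put(self, data):
        hbid = self._next
        self._next += 1
        self.blocks[hbid] = data
        return hbid


class Volume:
    """Toy volume whose metadata is a logical-to-physical block map."""

    def __init__(self, store):
        self.store = store
        self.map = {}          # LBID -> HBID
        self.snapshots = {}    # snapshot name -> copy of the map

    def write(self, lbid, data):
        # Never overwrite in place: old blocks stay intact for snapshots.
        self.map[lbid] = self.store.put(data)

    def read(self, lbid):
        return self.store.blocks[self.map[lbid]]

    def snapshot(self, name):
        self.snapshots[name] = dict(self.map)   # metadata only, no data copied

    def restore(self, name):
        self.map = dict(self.snapshots[name])

    def expire(self, name):
        del self.snapshots[name]

    def collect_garbage(self):
        """Free blocks referenced by neither the live map nor any snapshot."""
        live = set(self.map.values())
        for snap in self.snapshots.values():
            live.update(snap.values())
        dead = [h for h in self.store.blocks if h not in live]
        for h in dead:
            del self.store.blocks[h]
        return len(dead)
```

Taking a snapshot costs only a copy of the map, and blocks become reclaimable once the last snapshot that references them has expired.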
DSS Backup Policy
DSS Assign Backup Policy
DSS WADB
Data is backed up to a remote location at the time a volume snapshot is taken.
[Diagram: snapshot and restore over a network / VPN between Beijing and Shanghai]

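Since DSS backups are block-level and incremental, a wide-area backup plausibly needs to send only the blocks that changed since the previous snapshot. The helper below sketches that comparison; the function name and the representation of a snapshot as an LBID-to-HBID dictionary are assumptions, not the WADB protocol.

```python
def blocks_to_ship(previous_snapshot, current_snapshot):
    """Return {lbid: hbid} for blocks that are new or changed since previous_snapshot."""
    return {
        lbid: hbid
        for lbid, hbid in current_snapshot.items()
        if previous_snapshot.get(lbid) != hbid
    }
```

For the first backup of a volume, passing an empty dictionary as the previous snapshot selects every block.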
DSS Enabling WADB
De-duplication in DropBox
[Diagram: blocks labeled 1, 2, 3; DMS/DSS]

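Block-level de-duplication stores identical blocks once and lets several references point at the same copy. The slides do not describe the DSS de-dup engine, so the sketch below uses a common approach, content fingerprints with SHA-256, as an assumption; `DedupStore` and its methods are hypothetical.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical blocks are kept only once."""

    def __init__(self):
        self.blocks = {}   # fingerprint -> block data
        self.refs = {}     # fingerprint -> number of references

    def put(self, data: bytes) -> str:
        fingerprint = hashlib.sha256(data).hexdigest()
        if fingerprint not in self.blocks:
            self.blocks[fingerprint] = data      # first copy is stored
        self.refs[fingerprint] = self.refs.get(fingerprint, 0) + 1
        return fingerprint

    def get(self, fingerprint: str) -> bytes:
        return self.blocks[fingerprint]

    def stored_bytes(self) -> int:
        return sum(len(b) for b in self.blocks.values())
```

The reference counts tell a garbage collector when a shared block is no longer needed by any volume or snapshot.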
DSS Software Architecture Diagram
[Diagram labels]
- Backup/Restore Request
- SOAP server
- RPC
- Backup Policy Thread
- DSS main classes
- DSS Daemon
- DSS Manager
- Quartz Scheduler
- Timer Fired
- Thread Manager / Flow Control
- Restore, Snapshot, WADB, Garbage Collection, De-dup Scheduler
- DMS client protocol
- DB manager
- RS client Lib
- De-dup Engine
- DMS
- RS

OpenStack
- Web Interface
- Identity
- Security
- Compute: Hypervisors, Architectures (ARM, GPU)
- Storage: Object (Swift), Block, File
- Network: Switch, Router, Firewall

Integration of Cloud OS Storage System with OpenStack
[Diagram labels]
- User portal
- User Web Services (OpenStack API enabled)
- VDCM, LOG, PDCM, PDCM User, VMM, L2, PRM
- Nova API, Glance, Nova Manager
- Nova Compute: VM1, VM2, ..., VMn
- SECS, SLB, DB
- Nova Volume, Nova Network
- volume
- DMS, RS

Summary of Advantages
Feature | Benefits
Unlimited storage | Highly scalable read/write access
Leverages commodity hardware (JBOD) | No lock-in, lower price/GB
Built-in replication | Fully redundant (default: N=3)
Used as system volume | Save Image, Fast Boot, etc.
Thin provisioning | Optimizes utilization of available storage
Highly available (HA) storage servers (DMS/DSS) | Server failover for high availability
Snapshot and backup for block volumes | Data protection and recovery for VM data
Policy-based backup | Worry-free: set it and forget it
Wide-area data backup | For disaster recovery
Block-level data de-duplication | Stores more data in limited space
Standalone volume API, CLI available | Easy integration with other compute systems

More Features in V2.0
Feature | Benefits
Volume cloning | Fast cloning without copying data
Share virtual disk within cluster | Supports sharing of volumes and cluster file systems
Disk bandwidth QoS guarantee | Dynamic adjustment at the I/O request level
Storage space optimization | N-way replication for write-intensive blocks; N-parity-group for read-intensive blocks
Wide-area data replication | Near-real-time wide-area replication
Integration with Compute / Volume | Fully integrated with Compute for attaching block volumes and reporting on usage

Q&A Thank You!