Cloud File System Gateway & Cloud Data Management Interface (CDMI)
Author: Imran Khan, Solutions Architect, Calsoft Inc.
Presenter: Parag Kulkarni, VP Engineering, Calsoft Inc.
Agenda
- Cloud Storage Industry Challenges
- Brief about CDMI
- Cloud File System (FS)
- Cloud FS Architecture
- Cloud FS Modules
- Cloud FS Solution
- Conclusion
- Q&A Session
Abstract
Seamless extension of NAS to Cloud Storage using Cloud File System
Today, NAS stores data on local disks and/or SAN disks. Most enterprises have sufficient file storage capacity to run their day-to-day operations, provided some older data is moved to secondary storage. But since most data resides on primary storage (or on secondary storage within enterprise boundaries), it becomes necessary to extend storage capacity for NAS. With cloud storage becoming more secure, accessible, easy to use and cost effective, it can be considered as secondary storage for enterprise NAS Hierarchical Storage Management. Cloud storage can even serve as primary storage if enterprise storage devices are used as a cache to improve cloud data access throughput. Adding CDMI-based interfaces to a cloud file system makes it possible to integrate with any cloud storage provider and to store file-based data in cloud storage easily. The Cloud File System presented and implemented by Calsoft integrates with many cloud storage providers using CDMI. This helps enterprises store file-based data in cloud storage and provides throughput similar to local NAS by using efficient caching techniques.
Learning Objectives
Challenges
- Storing ever-growing data and optimally managing storage capacity
- Hierarchical Storage Management across enterprise storage and cloud storage
- Measuring/monitoring user guarantees and SLAs and mapping them over multiple clouds
- Optimizing storage capacity between on-premise and cloud storage pools
- Migrating between cloud storage platforms
Solution
- Adoption of cloud storage
- CDMI: a move to build an open standard for storing data in the cloud
- No impact on existing users/apps using NAS
- Abstract policy engine to monitor and map the SLAs
Cloud Storage Industry Challenges
- Access bandwidth, delay and disruption of service
- Common interface to multiple clouds
- Data security
- Data transfer policy
- Auxiliary features
Cloud Data Management Interface
CDMI
- A protocol for self-provisioning, administering and accessing cloud storage.
- It defines RESTful HTTP operations for assessing the capabilities of a cloud storage system and for exporting data via other protocols such as CIFS and NFS.
CDMI Benefits
- Manages containers, domains and security access
- Ease of monitoring / billing
- Covers storage that is functionally accessible by legacy or proprietary protocols
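To make the RESTful operations concrete, here is a minimal sketch of how a client might describe two CDMI requests: one that reads the capabilities of an endpoint and one that stores a small data object. The host name, container name and helper function names are illustrative assumptions, not part of the presented system, and the specification version string is an assumed value that should match whatever the provider advertises. The sketch only builds request descriptions; it does not send anything.

```python
import json

# Assumed value; use the version your provider actually supports.
CDMI_VERSION = "1.0.2"


def capabilities_request(base_url):
    """Describe a GET that reads the capabilities of a CDMI endpoint."""
    return {
        "method": "GET",
        "url": base_url.rstrip("/") + "/cdmi_capabilities/",
        "headers": {
            "Accept": "application/cdmi-capability",
            "X-CDMI-Specification-Version": CDMI_VERSION,
        },
        "body": None,
    }


def create_object_request(base_url, container, name, text):
    """Describe a PUT that stores a small text value as a CDMI data object."""
    body = {"mimetype": "text/plain", "metadata": {}, "value": text}
    return {
        "method": "PUT",
        "url": f"{base_url.rstrip('/')}/{container}/{name}",
        "headers": {
            "Accept": "application/cdmi-object",
            "Content-Type": "application/cdmi-object",
            "X-CDMI-Specification-Version": CDMI_VERSION,
        },
        "body": json.dumps(body),
    }


if __name__ == "__main__":
    # cloud.example.com is a placeholder host, not a real provider.
    req = create_object_request("https://cloud.example.com", "docs", "a.txt", "hello")
    print(req["method"], req["url"])
    print(req["body"])
```

Keeping the request as plain data makes it easy to hand to any HTTP client and to inspect in tests before touching a real cloud endpoint.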
Calsoft's Cloud File System
Leverages CDMI to provide a common interface for interacting with multiple clouds
Cloud Access
- File system interface for cloud access
- The file system caches data from the cloud to provide quicker access
- The file system interface is exposed to clients using NFS, CIFS, etc.
Cloud Request
- Converts file system demands into cloud requests
- Converts data objects back to a common file model
Multiple Cloud Framework
- Enables interaction with multiple clouds while abstracting out many operations
- Dynamically changes support for various cloud vendors
- Provides a set of policies to control access patterns
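The caching behaviour described above can be sketched as a read-through cache in front of a cloud backend. This is an illustration only: the class names are hypothetical, the "cloud" is an in-memory stand-in, and the write-through behaviour on writes is an assumption rather than something the slides specify.

```python
class InMemoryCloud:
    """Stand-in for a cloud plug-in; a real one would issue CDMI or REST calls."""

    def __init__(self):
        self.objects = {}
        self.fetches = 0  # counts round trips to the "cloud"

    def get(self, path):
        self.fetches += 1
        return self.objects[path]

    def put(self, path, data):
        self.objects[path] = data


class CachedCloudFS:
    """Serve reads from a local cache and go to the cloud only on a miss."""

    def __init__(self, cloud):
        self.cloud = cloud
        self.cache = {}

    def read(self, path):
        if path not in self.cache:
            # Cache miss: fetch from the cloud and populate the cache.
            self.cache[path] = self.cloud.get(path)
        # Cache hit (or freshly populated): no further cloud round trip.
        return self.cache[path]

    def write(self, path, data):
        # Assumed write-through policy: update the cache, then the cloud.
        self.cache[path] = data
        self.cloud.put(path, data)


if __name__ == "__main__":
    cloud = InMemoryCloud()
    cloud.put("/reports/q1.txt", b"quarterly numbers")
    fs = CachedCloudFS(cloud)
    fs.read("/reports/q1.txt")
    fs.read("/reports/q1.txt")
    print("cloud fetches:", cloud.fetches)
```

Only the first read reaches the backend; later reads of the same path are served locally, which is the effect the local cache is meant to deliver.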
Cloud File System Architecture
[Figure: architecture diagram. Labels, top to bottom: User 1 ... User n; CIFS/NFS server; Cloud File System, containing the Cloud Data Access Layer, 3rd-party cloud storage plug-ins, user management, policy management, etc.; LVM, disk driver, RAID, etc.; local / SAN disks. Cloud side: CDMI-compliant cloud storage reached via CDMI; Cloud 1 and Cloud 2, both CDMI non-compliant cloud storage, reached via SOAP / REST / WebDAV.]
Cloud File System Modules
[Figure: module diagram. Panel "Cloud interface and the policy engine": CFS user space; cloud interface; cloud plug-ins (S3, other, other). Panel "CFS user space and local FS wrapper": NFS/CIFS; user command translation; local FS wrapper and policy engine; other. Panel "Kernel": Cloud FS (CFS) with Layer 1 and Layer 2 functionality; decision "from cache or not?"; Local Cache FS (LCFS); ext3, reiser, other; NFS/CIFS.]
User Space vs. Kernel Space
- Most file systems in user space (using FUSE) are designed that way for ease of development and maintenance. Example: s3fs for the Amazon S3 cloud.
- That doesn't mean performance is guaranteed.
- A user-space FS makes one, or sometimes more than one, data copy.
- When using a cache, even on a hit, FUSE still needs a context switch and a data copy.
- Linus Torvalds: "I think that arguing that something _can_ be done with fuse, and thus _should_ be done with fuse is just ridiculous."
- Populating the local cache can be done with just a command; why pass a buffer the way FUSE does?
- Layered functionality inside the FS (for example, splitting) is easy to implement and could prove useful.
Policy Engine
Pricing models that most clouds use
- Storage based: $/GB
- Request based: $/1000 requests
- Data transfer based: $/GB
Other QoS parameters that determine the choice of cloud
- Easy provisioning
- Multi-tenancy
- Security
- Reliability
These parameters, especially pricing, are tracked by the service provider. There is no easy way for the user to track them, and there is no standard or specification that defines them.
The policy engine module is intended to address this:
- It tries to define these parameters across multiple clouds
- It monitors and keeps track of these parameters
- It allows a rule-based framework to control access to these clouds based on the QoS they provide
In the future, these QoS parameters may be standardized and made accessible via APIs, enabling users to program against them.
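As a rough illustration of how a policy engine could compare clouds on the three pricing dimensions listed above, the sketch below estimates a monthly bill for a given workload and picks the cheapest cloud. The cloud names and all prices are made-up placeholders, and the function names are hypothetical; real rates vary by provider and tier.

```python
from dataclasses import dataclass


@dataclass
class Pricing:
    storage_per_gb: float      # $/GB stored
    per_1000_requests: float   # $/1000 requests
    transfer_per_gb: float     # $/GB transferred


def monthly_cost(pricing, stored_gb, requests, transferred_gb):
    """Sum the storage, request and data transfer charges for one cloud."""
    return (pricing.storage_per_gb * stored_gb
            + pricing.per_1000_requests * requests / 1000
            + pricing.transfer_per_gb * transferred_gb)


def cheapest(clouds, stored_gb, requests, transferred_gb):
    """Return the name of the cloud with the lowest estimated cost."""
    return min(
        clouds,
        key=lambda name: monthly_cost(clouds[name], stored_gb, requests, transferred_gb),
    )


if __name__ == "__main__":
    # Placeholder prices, for illustration only.
    clouds = {
        "cloud_a": Pricing(storage_per_gb=0.10, per_1000_requests=0.01, transfer_per_gb=0.15),
        "cloud_b": Pricing(storage_per_gb=0.08, per_1000_requests=0.05, transfer_per_gb=0.20),
    }
    workload = dict(stored_gb=100, requests=200_000, transferred_gb=50)
    for name, pricing in clouds.items():
        print(name, round(monthly_cost(pricing, **workload), 2))
    print("cheapest:", cheapest(clouds, **workload))
```

Note that the cloud with the lower storage price is not automatically the cheaper one: a request-heavy workload can shift the balance, which is why tracking all three dimensions matters.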
Cloud File System Solution
Industry Challenges
- Access bandwidth, delay and disruption of service
- Common interface to multiple clouds
- Security
- Data transfer policy
- Auxiliary features
Calsoft Solution
- The policy engine in the cloud interface module can be used to distribute or replicate data across multiple clouds, so that loss of service from one cloud does not hamper access to any data.
- The plug-ins that interface with different clouds, supporting different communication protocols, can be written independently and loaded at run time.
- The policy engine can also select different security algorithms for different clouds, to be applied to the data as it is sent over the wire. This is more efficient because it is out of band in a cache-hit scenario.
- The policy engine is user controlled and XML based. The rules can be as simple or as comprehensive as needed.
- The cloud interface and plug-ins can do bookkeeping that can be used to verify the amount of data transferred and to compare the cost of that transfer against the billed amount.
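The slides say the policy engine is XML based but do not show its schema. The sketch below therefore uses an invented schema (a `policy` element holding `rule` elements with `match`, `cloud` and `replicas` attributes) purely to illustrate how user-written rules could drive placement and replication decisions. None of these element or attribute names come from the presented system.

```python
import xml.etree.ElementTree as ET
from fnmatch import fnmatchcase

# Hypothetical policy file: rules are evaluated in order, first match wins.
POLICY_XML = """
<policy>
  <rule match="*.log" cloud="cloud_b" replicas="1"/>
  <rule match="*.db" cloud="cloud_a" replicas="2"/>
  <rule match="*" cloud="cloud_a" replicas="1"/>
</policy>
"""


def load_rules(xml_text):
    """Parse the policy into (pattern, cloud, replicas) tuples."""
    root = ET.fromstring(xml_text)
    return [
        (rule.get("match"), rule.get("cloud"), int(rule.get("replicas", "1")))
        for rule in root.findall("rule")
    ]


def place(filename, rules):
    """Return (cloud, replicas) from the first rule whose pattern matches."""
    for pattern, cloud, replicas in rules:
        if fnmatchcase(filename, pattern):
            return cloud, replicas
    raise LookupError(f"no policy rule matches {filename!r}")


if __name__ == "__main__":
    rules = load_rules(POLICY_XML)
    for name in ("app.log", "orders.db", "notes.txt"):
        print(name, "->", place(name, rules))
```

A catch-all rule at the end keeps every file placeable; richer rules (per-cloud security algorithm, cost ceilings) would be additional attributes on the same structure.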
Conclusion
- Cloud File System is an idea that takes into consideration current developments in the cloud world related to Data storage as a Service (DaaS).
- It anticipates how the infrastructure around cloud services and management is changing.
- This model will improve performance and will enable seamless transitions across CDMI-compliant and non-compliant clouds for large enterprises with very little hassle.
Presenter Biography
Parag Kulkarni, VP Engineering, Calsoft Inc.
- A veteran of the storage industry
- More than 19 years of experience in architecting and developing products
- Key strength lies in quickly understanding product requirements and translating them into architectural and engineering specs for implementation
- Led the engineering team at Calsoft
- Led the development of the Database Editions product at Veritas (Symantec)
- A key contributing member at leading storage companies like Informix (IBM)
- Master of Technology in Computer Science from IIT Roorkee
- Degree in Industrial Management from University of Indore, India
Author Biography
Imran Khan, Solutions Architect, Calsoft Inc.
- A veteran of the storage industry
- More than 8 years of experience in architecting and developing products
- Has worked on products ranging from backup and replication, SAN simulators, multipathing and SMI-S to filesystems, journaling and link aggregation protocols
- Key strength is the ability to take a holistic view across stacks of different functionality and their interaction
- Bachelor's in Computer Engineering from University of Pune, India
Thank You
Questions & Answers
Contact info
Parag Kulkarni, VP Engineering, Calsoft Inc.
Email: parag.kulkarni@calsoftinc.com
Phone: +1 (408) 834 7086