Approved for Public Release; Distribution Unlimited. Case Number 15 0196

Cloud Gateway
Monica Stebbins

Agenda
- Cloud concepts
- Gateway concepts
- My work

Cloud concepts

What is Cloud
- Similar to hosted services or co-location (co-lo), where the hardware and software are managed by a third party

How Is It Different
- Promotes self-service ("As A Service")
  - IaaS
  - PaaS
  - SaaS
- Transmission is in REST/HTTPS
- Possible cost benefit of on-premises vs. off-premises
- Rapid deployment of resources: zero time for procurement

As A Service*
- Minimal capital investment at onset
- Incremental cost as your service usage grows
- Self-service
- Best practices built in
- Resource sharing
- Automated deployment
- Management services
- Lifecycle management
- Reuse
*http://en.wikipedia.org/wiki/platform_as_a_service#cite_note-1

IaaS
Infrastructure as a Service*
- Virtual machines/servers
- Storage
- Network
*http://en.wikipedia.org/wiki/infrastructure_as_a_service#infrastructure_as_a_service_.28iaas.29

PaaS
Platform as a Service*
- Operating system
- Web server/middleware
- Database
- Programming languages/applications
*http://clean-clouds.com/2011/11/what-is-platform-as-a-service-paas/

SaaS
Software as a Service*
- Delivers software as a Web service
- Cheaper than hosting it yourself (monthly fee)
- Eliminates the need to purchase hardware and to deploy and manage the application
- Third-party management of the application
*http://www.webopedia.com/term/s/saas.html

Transmission in REST/HTTPS
Representational State Transfer*
- Relies on a stateless, client-server, cacheable communications protocol
- Architecture style for designing networked applications
- Uses the HTTP protocol
- What does this mean for data center communications?
  - You need something that speaks REST to talk to the cloud
*http://en.wikipedia.org/wiki/representational_state_transfer
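
To make "something that speaks REST" a little more concrete, the sketch below builds the kind of HTTP request that would upload a single object. It is only an illustration: the endpoint, bucket name, and object key are hypothetical placeholders rather than any provider's real API, authentication headers are left out, and the request is constructed but never sent.

```python
from urllib.request import Request


def build_put_request(endpoint: str, bucket: str, key: str, payload: bytes) -> Request:
    """Build (but do not send) an HTTP PUT that would upload one object.

    The request is self-contained: URL, method, headers, and body carry
    everything the server needs, which is what "stateless" implies.
    """
    url = f"{endpoint}/{bucket}/{key}"
    headers = {
        "Content-Type": "application/octet-stream",
        "Content-Length": str(len(payload)),
    }
    return Request(url, data=payload, headers=headers, method="PUT")


# Hypothetical endpoint, bucket, and key, used for illustration only.
req = build_put_request(
    "https://objects.example.com", "test-backups", "genomes/file01.dat", b"ACGT"
)
print(req.get_method(), req.full_url)
```

A backup server that only knows how to write to a file share or a block device cannot produce requests like this on its own, which is the gap a gateway fills.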

Backups: Why use Cloud?
- Backup workloads are well suited for the cloud
  - Sequential writes
- Usually there is an offsite requirement for backups
- Most organizations would rather not manage their own

Backup Targets
[Figure: backup targets. Labels: Backup Server, Disk, Tape, Cloud, Cloud Gateway]

Backup Targets: Advantages and Disadvantages by Type

Disk
- Advantages: reliable; capacity
- Disadvantages: can be expensive; not energy efficient

Tape
- Advantages: cheap; reliable; energy efficient; capacity
- Disadvantages: people management; obsolete technology needs to be kept around; additional cost for offsite storage/management

Cloud
- Advantages: minimizes hardware purchases; management is offloaded; eliminates the need for secondary data centers; rapid procurement; elastic
- Disadvantages: bandwidth considerations; newer technology; management is offloaded

Questions?

Gateway concepts

What is a Cloud Gateway
A cloud storage gateway is a network appliance or server which resides at the customer premises and translates cloud storage APIs such as SOAP or REST to block-based storage protocols such as iSCSI or Fibre Channel or file-based interfaces such as NFS or CIFS.*
*http://en.wikipedia.org/wiki/cloud_storage_gateway
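
One small piece of that translation can be sketched as a mapping from a file path on a mounted share to an object key in a bucket. This is a deliberately simplified illustration, not how any particular gateway works: the drive letter and file names are hypothetical, and a real gateway may also chunk, deduplicate, compress, or encrypt data rather than store one object per file.

```python
from pathlib import PureWindowsPath


def share_path_to_object_key(mount_point: str, file_path: str) -> str:
    """Map a file under a mounted share to a forward-slash object key."""
    relative = PureWindowsPath(file_path).relative_to(PureWindowsPath(mount_point))
    return relative.as_posix()


# Hypothetical: the gateway's CIFS share is mounted as drive Z: on the server.
key = share_path_to_object_key("Z:\\", r"Z:\genomes\file01.dat")
print(key)
```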

What is a Cloud Gateway*
[Figure: cloud gateway functions. Labels: Cache, Global Namespace, Protocol Conversion, Storage Tiering, File Sharing]
*http://en.wikipedia.org/wiki/cloud_storage_gateway
*http://go.panzura.com/rs/panzura/images/customer%20faq%20v1.1.pdf

What is a Cloud Gateway: Cache
- The amount of storage on the physical box that holds the data to be replicated to the cloud

What is a Cloud Gateway: Protocol Conversion
- Conversion from SOAP/REST to block (iSCSI/Fibre Channel) and/or CIFS/NFS*
- Translates from cloud protocol to data center protocol
*http://en.wikipedia.org/wiki/cloud_storage_gateway

What is a Cloud Gateway: Global Namespace
[Figure: Global Namespace]

What is a Cloud Gateway: Storage Tiering
- Storage tiering is a method employed by administrators to reduce storage costs by moving data between types (or tiers) of storage
- Storage tiers can be characterized by drive type, drive size, and/or rotational speed
- Just as cloud can be a tier, cloud gateways can also be a tier

What is a Cloud Gateway: File Sharing
- File sharing allows multiple users access to files from a centralized location. The centralized nature of the repository means that the files do not need to be distributed to users on an individual basis.*
- Examples: employee share drives or project share drives
*http://en.wikipedia.org/wiki/file_sharing
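
Returning to the storage tiering slide: a tiering policy can be pictured as a rule that places data on a tier according to how recently it was used. The sketch below is an assumption-laden illustration; the age-based rule, the thresholds, and the tier names are all hypothetical, and real products expose their own policies.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds, chosen only for illustration.
LOCAL_DISK_WINDOW = timedelta(days=7)
GATEWAY_WINDOW = timedelta(days=30)


def choose_tier(last_access: datetime, now: datetime) -> str:
    """Pick a storage tier from the age of the last access."""
    age = now - last_access
    if age <= LOCAL_DISK_WINDOW:
        return "local disk"
    if age <= GATEWAY_WINDOW:
        return "cloud gateway"
    return "cloud"


now = datetime(2015, 1, 31)
for days in (1, 10, 90):
    print(days, "days old ->", choose_tier(now - timedelta(days=days), now))
```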

Some Other Features*
- Throttling
- Deduplication
- Software agents
- Compression
- Snapshots
- Encryption
*http://en.wikipedia.org/wiki/cloud_storage_gateway

How Cloud Gateways work
[Figure: How Cloud Gateways work]

Vendor Landscape
*http://www.computerlinks.co.uk/fms/18840.cloud_backup_vendor_landscape_report.pdf

Questions?

My Work
- The primary goal of this study was to evaluate cloud gateways, their ability to manage cloud storage data transmission issues, and how that relates to backups
- The assessment consisted of performing four functional tests with five cloud gateways. The functional tests were as follows:
  - Develop baseline measurements for backup and restore
  - Evaluate feasibility of deployment by installing each gateway
  - Configure and perform backups using the gateway
  - Configure and perform restores using the gateway
  - Measure the speed of the backup/restore to/from the gateway
  - Measure the speed of the backup/restore to/from the cloud

Evaluation: The Why
- Interested in the technology and how it interfaces with the cloud
- Looking for new backup/recovery strategies
- Picked five gateways and tried them out

Baseline Test Environment
- One Gigabit network connection between all components
- A total of fifty-three genome files downloaded from NCBI, sized between 1MB and 300MB, written sequentially to the backup media target
- A single virtual machine host server running Windows 2008 R2 with 500GB of storage, serving as both the backup server and the client to be backed up
- A cloud account created to provide a bucket/container to house the test backups in the cloud
- Backup software installed on the Windows server, facilitating backups to a cloud target
- A CIFS file share created on a traditional storage array and mounted to the Windows server

[Figure: Baseline Test Environment]

Gateway Test Environment
- One Gigabit network connection between all components
- A total of fifty-three genome files downloaded from NCBI, sized between 1MB and 300MB, written sequentially to the backup media target
- A single virtual machine host server running Windows 2008 R2 with 500GB of storage, serving as both the backup server and the client to be backed up
- A cloud account created to provide a bucket/container to house the test backups in the cloud
- Backup software installed on the Windows server, facilitating backups to a cloud target
- A CIFS file share created on each cloud gateway and mounted to the Windows server

[Figure: Gateway Test Environment]

Gateway Deployment Considerations
- Lack of support for proxies for outbound/inbound traffic
- Opening of non-standard ports on the firewall
- Cloud bucket with existing data
- Lack of support for local accounts
  - Install and configure Active Directory and DNS
- How restores work
  - Is it a one-step or two-step process to get data restored?
- Application crash consistency features
  - Ex. How would you back up database files?

Challenges with using backup software
- Backup software installed on the server writes backups from the server to the gateway. The gateway is then responsible for replicating from the gateway to the cloud bucket.
  - The backup software does not know about the replication; it is a separate job from the backup
  - Throughput rates for backups and restores were measured at two different points using different tools
    - Backup server to the gateway
    - Gateway to the cloud
- This causes a disconnect between what the backup software knows about and what is located in the cloud. The complications can be avoided by:
  - Using software agents
    - Deploy an agent to each host, which is cumbersome if you have thousands of hosts
  - Using backup software to do the replication job
    - Defeats the purpose of the gateway, which in addition to replication provides numerous other functions

Challenges with using backup software (continued)
- If the data required for restore resides on the appliance, restoration can be initiated through the backup software
- If backup software is managing backups, then restoration can become complicated
  - If the data required for restore has been evicted to the cloud, restoration may require two steps, depending on the gateway
- Some gateways have solved this problem by bridging the communication gap between backup software and cloud. In this case, restores will be a one-step process.
- If you have a gateway that doesn't do this effectively, restore involves:
  - Determining which files are required for restore through the backup software
  - Retrieving those files using the cloud gateway GUI, which fetches them from the cloud and places them on the gateway
  - Performing a restore as you normally would with the backup software
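
The one-step versus two-step restore logic described above can be sketched as a small decision function. This is an illustration only: the function name, the flag, and the step descriptions are hypothetical, and the list of required files is assumed to have already been determined through the backup software.

```python
def plan_restore(needed_files, cached_files, gateway_bridges_backup_software):
    """Return the ordered steps for a restore.

    needed_files: files the backup software says are required.
    cached_files: files currently held on the gateway appliance.
    gateway_bridges_backup_software: True if the gateway can fetch evicted
        data transparently when the backup software asks for it.
    """
    evicted = sorted(set(needed_files) - set(cached_files))
    if not evicted or gateway_bridges_backup_software:
        # Everything is local, or the gateway fetches on demand: one step.
        return ["restore via backup software"]
    # Otherwise evicted data must be brought back to the gateway first.
    return [
        f"fetch from cloud to gateway: {', '.join(evicted)}",
        "restore via backup software",
    ]


print(plan_restore(["a.bak", "b.bak"], ["a.bak"], False))
```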

Pre-population/Bring into Cache
- Some gateways allow you to prepopulate (bring back onto the appliance) the data required for restore
- If this is not done, restores will take much longer, as the data will need to be fetched from the cloud
*Riverbed Steelstore GUI

Cache Size and its role
- A proper cache-sizing exercise is important
- If all of your data resides in cache, the system never has to reach out to the cloud to retrieve data
- Considerations
  - What is the size of your backup?
  - How often do you back up?
  - How long will you retain the backup?
  - Is there data that should be permanently archived?
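
The first three considerations can be combined into a rough upper-bound estimate of the cache needed to keep every retained backup on the appliance. The formula below is an assumption for illustration: it treats every backup as a full copy and ignores deduplication, compression, and archived data, so a real sizing exercise would likely arrive at a smaller number. The input values are hypothetical.

```python
def estimate_cache_gb(backup_size_gb: float, backups_per_day: float, days_kept_in_cache: int) -> float:
    """Upper-bound cache estimate: every retained backup is a full copy."""
    return backup_size_gb * backups_per_day * days_kept_in_cache


# Hypothetical inputs: a 100 GB backup, taken once a day, kept local for a week.
needed = estimate_cache_gb(backup_size_gb=100.0, backups_per_day=1.0, days_kept_in_cache=7)
print(f"{needed:.0f} GB")
```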

Implementation Process
- Cloud Bucket
  - Configure cloud bucket
- Cloud Gateway (data replication from gateway to the cloud)
  - Install and configure each gateway
  - Input cloud bucket credentials into each gateway
- Backup Server (data backup from backup server to gateway)
  - Install and configure the backup server
  - Install and configure backup software

Configure Cloud Bucket
- Create bucket
- Retrieve access keys

Configure Cloud Gateway
- Add the cloud bucket to each gateway using the keys given when it was created
  - The gateway needs to know how to access your cloud destination; this is where the gateway will write backups
- Create a CIFS share on each gateway
  - This is where backup software will write backups on each gateway
- Schedule replication from gateway to cloud or accept defaults
  - How often do you want data to replicate from the gateway to the cloud? Keep in mind this can impact your network
- Measure throughput rates for backup and restore

Configure Backup Server
- Install backup software on the data center server
- Add each CIFS share created on each gateway to the backup software as a target
  - Tell the backup server where the CIFS shares are, so backups can be written
- Create/schedule backup jobs
- Back up/restore data to CIFS targets on the gateway
- Measure backup and restore rates
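
The study does not name the tools used to measure throughput, so the following is only a sketch of one way a rate in MB/s could be observed: time a file copy and divide size by elapsed time. It assumes 1 MB = 10**6 bytes, uses a temporary directory as a stand-in for a gateway share, and does not account for operating system write caching, which can inflate the figure for small files.

```python
import os
import shutil
import tempfile
import time


def copy_throughput_mb_s(src: str, dst: str) -> float:
    """Copy src to dst and return the observed rate in MB/s (1 MB = 10**6 bytes)."""
    size_mb = os.path.getsize(src) / 1_000_000
    start = time.perf_counter()
    shutil.copyfile(src, dst)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero reading
    return size_mb / elapsed


# Demo: a temporary directory stands in for the CIFS share on a gateway.
with tempfile.TemporaryDirectory() as workdir:
    src = os.path.join(workdir, "sample.dat")
    dst = os.path.join(workdir, "copy.dat")
    with open(src, "wb") as fh:
        fh.write(os.urandom(1_000_000))  # 1 MB test file
    rate = copy_throughput_mb_s(src, dst)
    copied_size = os.path.getsize(dst)

print(f"{rate:.1f} MB/s")
```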

Results
- Backing up to the gateway was on par with backing up to any local LAN-based disk array
- Replication from the gateway to the cloud was on par with or faster than backing up/restoring directly to/from the cloud
  - Added benefit of getting an extra copy of your backups
  - Having a primary copy already on disk alleviates concerns about backup failures and timeouts associated with cloud latency
- Gateways provide many other features reminiscent of traditional storage arrays: compression, encryption, deduplication

Results

Source  Target        Backup Throughput (MB/s)  Restore Throughput (MB/s)
Client  Traditional   15.6                      6.07
Client  Cloud Bucket  8.90                      0.558
Client  Gateway 1     24.4                      3.02
Client  Gateway 2     15.04                     27.9
Client  Gateway 3     14.82                     18
Client  Gateway 4     15.37                     24.1
Client  Gateway 5     4.34                      2.55
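
One way to read the table is to express each target's throughput relative to writing directly to the cloud bucket. The sketch below uses only the figures from the table; the choice of the cloud bucket row as the reference point is ours, and the gateway rows reflect client-to-gateway rates, not gateway-to-cloud replication.

```python
# (backup MB/s, restore MB/s) per target, copied from the results table.
results = {
    "Traditional": (15.6, 6.07),
    "Cloud Bucket": (8.90, 0.558),
    "Gateway 1": (24.4, 3.02),
    "Gateway 2": (15.04, 27.9),
    "Gateway 3": (14.82, 18.0),
    "Gateway 4": (15.37, 24.1),
    "Gateway 5": (4.34, 2.55),
}

cloud_backup, cloud_restore = results["Cloud Bucket"]

# Ratio of each target's rate to the direct-to-cloud rate.
for target, (backup, restore) in results.items():
    print(f"{target:<13} backup x{backup / cloud_backup:5.2f}  restore x{restore / cloud_restore:6.2f}")

gateways = {name: rates for name, rates in results.items() if name.startswith("Gateway")}
faster_backup = sum(1 for backup, _ in gateways.values() if backup > cloud_backup)
faster_restore = sum(1 for _, restore in gateways.values() if restore > cloud_restore)
print(f"gateways with faster backup: {faster_backup} of {len(gateways)}")
print(f"gateways with faster restore: {faster_restore} of {len(gateways)}")
```

On these figures, four of the five gateways accepted backups faster than the direct cloud path, and all five restored faster.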

Factors on Results
- Size of hardware/virtual machine
- Deduplication factors
- Network bandwidth
- Encryption
- Use of proxies for network traffic
- Compression
- Size of cache

Security Concerns
- Key management
- Encryption at rest
- Encryption in transit

Cloud Encryption Gateway*
- Intercepts sensitive data while it is still on-premises
- Replaces it with a random tokenized or strongly encrypted value
  - This renders it meaningless should anyone hack the data while it is in transit, processed, or stored in the cloud
- If encryption is used, the enterprise controls the key
- If tokenization is used, the enterprise controls the token vault
*http://perspecsys.com/resources-2/what-is-a-cloud-encryption-gateway/

Futures
- Big storage companies are buying up small gateway companies and incorporating gateway features into their main storage product lines

Thank you! Questions?
Monica Stebbins
The MITRE Corporation
mjurs@mitre.org