Cloud Archive & Long Term Preservation Challenges and Best Practices

Authors: Chad Thibodeau, Cleversafe, Inc.; Sebastian Zangaro, HP

SNIA Legal Notice
The material contained in this tutorial is copyrighted by the SNIA unless otherwise noted.
Member companies and individual members may use this material in presentations and literature under the following conditions:
- Any slide or slides used must be reproduced in their entirety without modification.
- The SNIA must be acknowledged as the source of any material used in the body of any document containing material from these presentations.
This presentation is a project of the SNIA Education Committee.
Neither the author nor the presenter is an attorney and nothing in this presentation is intended to be, or should be construed as legal advice or an opinion of counsel. If you need legal advice or a legal opinion please contact your attorney.
The information presented herein represents the author's personal opinion and current understanding of the relevant issues involved. The author, the presenter, and the SNIA do not assume any responsibility or liability for damages arising out of any reliance on or use of this information.
NO WARRANTIES, EXPRESS OR IMPLIED. USE AT YOUR OWN RISK.

Abstract
Cloud Archive Challenges and Best Practices
This session will appeal to Storage Vendors, Datacenter Managers, Developers, and those seeking a basic understanding of how best to implement a Cloud Storage Digital Archive and Cloud Storage Digital Preservation service. In addition, we will discuss how these approaches result in a greener implementation versus traditional in-house implementations. This session will examine current challenges within the Public Cloud Storage Industry, delve into some specific services profiles, and address some best practices for utilizing cloud storage for archive and preservation needs.

Agenda
- What is the problem?
- Challenges of Traditional vs. Public Cloud Storage
- Archive and Preservation Defined
- SNIA Cloud Archive and Preservation SIG
- Solution
- Services Profiles

Paradoxes of Archive & Preservation
- Data will be lost!
- Migration does not scale
- Access & use models keep changing
- Cost overwhelms everything complexity does not

Defining the Problem
- Cloud storage is more suitable for local applications that are less sensitive to latency (backup, archive).
- The use case of local backup to a remote location is not sensitive to the latencies of public cloud storage.
- Regulatory requirements oblige companies to keep cold data available at all times:
  - HIPAA
  - Sarbanes-Oxley
  - SAS 70
  - J-SOX (Japan)
  - Directive 2006/43/EC (EU)
  - Loi de sécurité financière (France)

Additional Challenges
- Lack of uniform semantics and standard interfaces
- Interoperability between public cloud providers
- Managing data format changes over time
- Authenticity verification
- Compliance and Governance
- Risk Management & Litigation
- Security
- Multi-tenancy

Archiving: Traditional storage vs. Public Cloud
Traditional
- Lower latency
- Power, cooling costs
- Administration costs
- Migration costs
  - Format
  - Storage platform
  - Backup
  - New technology adoptions (e.g. dedup)
Public Cloud
- Higher latency
- Service provider costs
- WAN costs (if using hybrid/public clouds)
- Migration costs (if using hybrid/public clouds)
  - From one provider to another

Defining the Problem
- Cloud-based storage is 74% less expensive than traditional storage infrastructures. [1]
- Operating costs are higher when using local, traditional storage (more capacity than data, redundancy, backups, administration costs, data center power/cooling costs):
  - Cooling equipment consumes about 45% of the power delivered to the data center.
  - Storage consumes 13% of total data center power, with 15% for servers.
[1] "File Storage Costs Less in the Cloud Than In-House", Andrew Reichman, Forrester, 2011.

A new class of data migration challenges?
[Figure: data moving between Cloud A and Cloud B over the WAN via vendor-specific APIs]
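The migration problem pictured above (moving data from Cloud A to Cloud B when each provider exposes its own API) can be sketched as a thin adapter layer with a verification step. This is a minimal illustration only: the `ObjectStore` interface, the `InMemoryStore` stand-in, and the `migrate` function are hypothetical names invented for this example, not any vendor's API.

```python
import hashlib
from typing import Dict, List, Protocol


class ObjectStore(Protocol):
    """Hypothetical minimal interface that each vendor adapter would implement."""

    def list_keys(self) -> List[str]: ...
    def get(self, key: str) -> bytes: ...
    def put(self, key: str, data: bytes) -> None: ...


class InMemoryStore:
    """Stand-in for a cloud provider; a real adapter would call the vendor API."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def list_keys(self) -> List[str]:
        return sorted(self._objects)

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data


def migrate(source: ObjectStore, target: ObjectStore) -> List[str]:
    """Copy every object and return the keys whose digests do not match."""
    failed: List[str] = []
    for key in list(source.list_keys()):
        data = source.get(key)
        expected = hashlib.sha256(data).hexdigest()
        target.put(key, data)
        # Read the object back from the target to confirm it arrived intact.
        if hashlib.sha256(target.get(key)).hexdigest() != expected:
            failed.append(key)
    return failed
```

With one adapter per provider, the migration loop itself stays provider-neutral, which is the same benefit a standard interface aims to deliver without custom adapters.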

Security
- Assurance that users see only what they are entitled to
- Assurance that administrators see only what they need to see, and not customer data
- Rights and role management
- Intrusion protection

Cloud storage is not going away
[Chart: Archiving in the Cloud, 2009-2015. Vertical axis: Revenue ($M), $0 to $2,000. Horizontal axis: years 2009 through 2015.]
Source: IDC, Worldwide Storage in the Cloud 2011-2015 Forecast: The expanding role of Public Cloud Storage Services

Archive vs. Preservation
Digital Archive: specially designed system / repository to store digital data
- Systems management
- Physical security
- Data security
- Data backups
- Disaster recovery
- ISO 9001 certification
- Manifest verification
- Virus check
- Format verification
- Fixity check
Digital Preservation: process to ensure long-term data availability
- Refresh
- Migration
- Replication
- Emulation
- Metadata Attachment
- Sustainability
- Timeless
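A fixity check, one of the archive functions listed above, confirms that stored bytes have not changed by recomputing a digest and comparing it with the value recorded at ingest. The sketch below is one plausible way to do this with Python's standard `hashlib`; the function names and the choice of SHA-256 are assumptions for illustration, not something the presentation prescribes.

```python
import hashlib
from pathlib import Path


def compute_fixity(path: Path, algorithm: str = "sha256",
                   chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path: Path, recorded_digest: str,
                  algorithm: str = "sha256") -> bool:
    """Compare the current digest with the one recorded when the file was archived."""
    return compute_fixity(path, algorithm) == recorded_digest.lower()
```

A scheduled job could call `verify_fixity` for every archived file and flag mismatches for repair from a replica.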

Definitions
Digital Archive Service: A storage repository or service used to secure, retain, and protect digital information and data for periods of time less than that of long-term data retention. A digital archive can be an infrastructure component of a complete digital preservation service, but is not sufficient by itself to accomplish digital preservation, i.e., long-term data retention.
Cloud Digital Archive Service: A cloud-based offering providing a digital archive service. Can be utilized as a component of a complete digital preservation service. Does not necessarily provide adequate services to accomplish digital preservation.

Definitions (cont.)
Cloud Digital Preservation Service: A cloud service providing digital preservation of information and data. A digital preservation service includes a comprehensive management and curation function that controls:
Supporting Infrastructure Information Data Storage Services

Cloud Reference Architecture
[Figure: cloud reference architecture diagram. Labels recoverable from the figure:]
- Actors: Cloud Provider, Cloud Consumer, Cloud Broker, Cloud Carrier (private or public network), Auditing
- Broker services: Service Intermediation, Service Aggregation, Service Arbitrage
- Service Layer: SaaS, PaaS, IaaS, DaaS
- Resource Abstraction and Control Layer
- Physical Resource Layer: Hardware, Facility, Network, Storage, Archive
- Other labels: Service Orchestration and Management, Business Support, Provisioning/Configuration, Service Creation Tools, Portability/Interoperability, Security/Privacy, Compliance, Performance, Administration, Monitoring/Reporting, Metering/Billing

[Figure: Information Governance Reference Model. Source: EDRM.net]

Cloud Archive and Preservation SIG
Advance the use of public, private and hybrid clouds for archival services and long term retention
- CDMI
- Market Education
- Best Practices
- Services Profiles
- Standards Promotion
- Industry Liaison
- Interoperability Demonstrations/Certifications and Plugfests
- Implementation Reference Model
Participating companies: BlueArc, Cleversafe, Computer Associates, EMC, HP, Hitachi Data Systems, IMERGE Consulting, Iron Mountain, NetApp, Novell, Oracle, SNIA, Spectra Logic, Strategic Research Corp

What is already standardized?
Benefits of industry standards:
- Allow storage vendors and developers to easily integrate with any cloud infrastructure.
- Allow data object migration between heterogeneous systems:
  - End-user site to public cloud
  - Public Cloud A to Public Cloud B
  - From public cloud back to the end user
Standards already exist, such as the Self-contained Information Retention Format (SIRF) and CDMI (the Cloud Data Management Interface).
SNIA's Cloud Data Management Standard (CDMI):
- Standardized data path (access) to the cloud
- Standardized metadata to express the archive requirements for the data put in the cloud
- Immutability in some cases

SIRF
Being developed by the Storage Networking Industry Association (SNIA) Long Term Retention (LTR) Technical Working Group (TWG).
An analogy: the standard physical archival box
- Archivists gather together a group of related items and place them in a physical box container.
- The box is labeled with information about its content, e.g., name and reference number, date, contents description, destroy date.
SIRF is the digital equivalent
- Logical container for a set of (digital) preservation objects and a catalog.
- The SIRF catalog contains metadata related to the entire contents of the container as well as to the individual objects.
- SIRF standardizes the information in the catalog.
[Photo courtesy Oregon State Archives]
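The box-and-label analogy above maps naturally onto a container that holds objects plus a catalog with container-level and per-object metadata. The sketch below illustrates that shape only. Every class and field name here is hypothetical, borrowed from the box label in the analogy (name, reference number, date, contents description, destroy date); it is not the SIRF catalog schema.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ObjectEntry:
    """Per-object catalog metadata (illustrative fields, not the SIRF schema)."""
    object_id: str
    description: str
    sha256: str


@dataclass
class ContainerCatalog:
    """Container-level metadata, mirroring the label on a physical archive box."""
    name: str
    reference_number: str
    date: str
    contents_description: str
    destroy_date: str
    objects: Dict[str, ObjectEntry] = field(default_factory=dict)

    def add(self, entry: ObjectEntry) -> None:
        """Register a preservation object in the catalog, rejecting duplicates."""
        if entry.object_id in self.objects:
            raise ValueError(f"duplicate object id: {entry.object_id}")
        self.objects[entry.object_id] = entry

    def label(self) -> str:
        """Render the one-line 'box label' for the whole container."""
        return (f"{self.name} [{self.reference_number}] {self.date}: "
                f"{self.contents_description} (destroy {self.destroy_date}, "
                f"{len(self.objects)} objects)")
```

Because the catalog travels with the container, a future system can interpret the contents without access to the application that originally wrote them.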

[Figure: Cloud Peering]

[Figure: CDMI Reference Model]

How does this work in CDMI?
- Standardizes the access to data in the cloud
  - Uses RESTful principles
  - Can be implemented on top of the provider's own interface
- The cloud client needs to discover what archiving capabilities are provided by the cloud.
  - CDMI does this through Capabilities, a type of resource that acts like a service catalog for the functions that the cloud offers customers.
- If the cloud offers the capability, the customer marks the data objects and containers with metadata (Data System Metadata) that specifies the requirements.
- Lastly, the cloud provider has a way of expressing what is actually being provided, also through metadata.
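The three steps above (discover capabilities, mark objects with data system metadata, read back what is provided) can be sketched without any network calls. The code below only builds and inspects the pieces a client would exchange. The helper names are hypothetical, and the metadata names, the `_provided` suffix, the `application/cdmi-object` media type, and the `X-CDMI-Specification-Version` header follow the author of this example's recollection of the CDMI specification; check them against the spec version your provider implements.

```python
import json
from typing import Dict, Tuple

# Media type used for CDMI data objects (assumed from the CDMI specification).
CDMI_OBJECT = "application/cdmi-object"


def split_by_capability(capabilities: Dict[str, str],
                        requested: Dict[str, str]
                        ) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Step 1: keep only the requirements the cloud advertises as supported."""
    supported: Dict[str, str] = {}
    unsupported: Dict[str, str] = {}
    for name, value in requested.items():
        offered = str(capabilities.get(name, "false")).lower() == "true"
        (supported if offered else unsupported)[name] = value
    return supported, unsupported


def build_create_request(container_path: str, name: str, value: str,
                         metadata: Dict[str, str],
                         spec_version: str = "1.0.2"
                         ) -> Tuple[str, Dict[str, str], str]:
    """Step 2: describe the PUT that creates an object marked with requirements."""
    url_path = f"{container_path.rstrip('/')}/{name}"
    headers = {
        "Accept": CDMI_OBJECT,
        "Content-Type": CDMI_OBJECT,
        "X-CDMI-Specification-Version": spec_version,  # assumed version string
    }
    body = json.dumps({"mimetype": "text/plain",
                       "metadata": metadata,
                       "value": value}, sort_keys=True)
    return url_path, headers, body


def provided_view(object_metadata: Dict[str, str]) -> Dict[str, str]:
    """Step 3: extract what the provider reports it is actually delivering."""
    return {k: v for k, v in object_metadata.items() if k.endswith("_provided")}
```

A client would send the request from `build_create_request` with any HTTP library, then compare the requested values against `provided_view` of the metadata returned on a later read.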

CDMI Services: Cloud Digital Archive
Storage Services
- Snapshot type
- Replication type/class
- DeDupe type/class
- Data Integrity
Data & Information Services
- Retention Period
- Permanent Deletion
- Confidentiality/Encryption
- Security Access, Audit logs
- Physical Migration
- Indexing/Searching
- Litigation Hold

CDMI Services: Cloud Digital Preservation
Storage Services
- Snapshot type
- Replication type/class
- DeDupe type/class
- Data Integrity
- Fixity computation
Data & Information Services
- Retention Period
- Permanent Deletion
- Confidentiality/Encryption
- Security Access, Audit logs
- Physical & Logical Migration
- Indexing/Searching
- Litigation Hold
- Digital Auditing
- Preservation Objects
- Provenance
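Retention Period, Litigation Hold, and Permanent Deletion from the service lists above interact: an object should only become deletable once its retention period has passed and no hold remains on it. The sketch below shows one way a service might encode that rule. The class, field, and function names are hypothetical, and the exact semantics (for example, whether the last retention day is inclusive) are assumptions for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Set


@dataclass
class ArchivedObject:
    """Illustrative record for an archived object (not a CDMI data structure)."""
    name: str
    retain_until: date                      # last day the object must be kept
    holds: Set[str] = field(default_factory=set)  # active litigation hold ids


def place_hold(obj: ArchivedObject, hold_id: str) -> None:
    """Attach a litigation hold; holds override an expired retention period."""
    obj.holds.add(hold_id)


def release_hold(obj: ArchivedObject, hold_id: str) -> None:
    """Remove a hold if present; releasing an unknown hold is a no-op."""
    obj.holds.discard(hold_id)


def may_delete(obj: ArchivedObject, today: date) -> bool:
    """Deletion is allowed only after retention ends and with no active holds."""
    return today > obj.retain_until and not obj.holds
```

Keeping the rule in a single predicate makes it straightforward to log each deletion decision for audit purposes.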

Summary
- Digital Archive and Preservation Services are becoming more prevalent and a basic requirement for businesses beyond traditional libraries and content repositories.
- Cloud-based digital archives and preservation services offer significant advantages regarding cost, power/cooling, datacenter footprint, security, and availability.
- Companies can take advantage of green cloud technologies for their archive and preservation requirements in place of using their own internal infrastructure, achieving >70% savings.

Q&A / Feedback
Many thanks to the following individuals for their contributions to this presentation.
- SNIA Cloud Archive and Preservation SIG
- Michael Peterson
- Don Post
- Chris Marsh
- Thomas Rivera
- Chad Thibodeau
- Mark Carlson
- Ray Clarke
- Bob Rogers
- Roger Cummings
- Sebastian Zangaro
Send any questions or comments on this presentation to SNIA: tracktutorials@snia.org


[Figure: Digital A&P Taxonomy]

[Figure: Digital Preservation Framework. Source: www.ltdprm.org]

We need a vision
[Figure: Archive & Preservation Evolution, timeline with marks at 1990, 2000, 2010, and 2020. Courtesy of LTDPRM.org]