Modernizing Data Storage Archive Infrastructure


Avoid massive cost, obsolescence and litigation risk by modernizing archive storage infrastructure

Table of Contents
Challenges of Unstructured Data
Factors Driving Reappraisal of Archival Storage Systems
Storing Static Data on Primary Storage Arrays is Expensive
Solving Archive Challenges
Scalability
Lower TCO
Ease of Upgrades
Global Data Deduplication
Conclusion
References
About NEC

Multiple risk factors spanning business, legal and regulatory dimensions make archive issues daunting for many organizations to address. Solving these challenges while ensuring long-term data access requires modernizing data archives and addressing key business requirements: long-term data preservation, search and retrieval accessibility, and security, all of which will evolve over a span of time measured in decades and beyond.

To modernize archive, organizations should take two steps: first, address the cost of exponentially increasing data volumes by adopting data deduplication technology to reduce storage consumption by 95 percent or more; and second, ensure that the software systems responsible for the long-term preservation and management of electronic data meet basic prerequisites:

Archive: archive content federation; metadata to support legal and compliance requirements; full-text indexing with efficient search capabilities; and separation of content storage from applications to facilitate physical and logical migration support at the data layer.

Storage: massive scalability, greater operational predictability, resiliency and little to no downtime.

Reappraisal should also address rapidly increasing storage management costs as well as the disruption caused by frequent technology refreshes. It has become clear that traditional approaches to managing increasing volumes of data have become inadequate due to rising complexity and risk.

Powered by Intel Xeon Processors

Of necessity, requirements for both day-to-day storage and data archive need to evolve to meet these challenges:

Organizations continue to create and lose more data. This problem is exacerbated by rapid data growth; one government agency expects tenfold data growth between 2006 and 2010, with ongoing annual growth of file data estimated at 40 percent. (1) Data loss occurs over time as the result of obsolete storage media, software or hardware.

"Data migration projects are disruptive, cause application and server downtime, and can sometimes even cause irrevocable data loss. Growth rates of 50 percent and greater will cause the cost of migration to soar even further in the years ahead." - Rob Stevenson, TheInfoPro

Organizations retain data for longer time periods. For many organizations, compliance with data retention policies requires records to be kept for decades, sometimes even indefinitely. At the same time, legal requirements are changing how organizations need to access and manage data. Despite advances in records and retention management technology, the majority of organizations have yet to address this issue and continue to keep historical data for extended periods without full consideration of best practices.

Retention length is also being driven by the need to extract greater value from data. Product lifecycles can stretch for extremely long periods of time, leading to the so-called long tail phenomenon, which enables firms to mine data over longer periods to find new value. The Storage Networking Industry Association (SNIA) has analyzed organizational practices to identify five key categories driving longer retention periods: business, legal, security, compliance, and other risk factors such as losing organizational memory.
This data supports the conclusion that losing historical data is a top concern for organizations generally, while compliance is the top concern for records and information managers (RIMs) and legal risk tops the concerns of IT, security and legal personnel. (2)

Challenges of Unstructured Data

The majority of stored data is static and unstructured. Collectively termed Electronically Stored Information (ESI), this includes email and email attachments; documents; file system content; instant messages; wiki, web and social networking content; databases; ERP and other host system output; voice and video files; and images. Data types, usage, users and other components of unstructured data sets should be expected to change regularly. The result of this unpredictability is that managing and migrating ESI can become a never-ending process, and a much more challenging one without comprehensive software for long-term data preservation.

Based on conversations TheInfoPro has had with storage professionals, an estimated 80 percent of large organizations lack standard data migration operating processes and tool sets: they use whatever local knowledge or on-board tool shipped with the storage array or filer, which can be resource-intensive and error-prone. Consequently, data migration projects are disruptive, cause application and server downtime, and can sometimes even cause irrevocable data loss. Growth rates of 50 percent and greater will cause the cost of migration to soar even further in the years ahead. (3)

Despite these challenges, enterprise data centers are performing migrations at ever-increasing rates, either by choice, to take advantage of significant improvements available in newer storage systems, or by force, to accommodate data growth, whether consolidating on new systems or re-purposing legacy storage systems to extract additional data value. Solving migration challenges is complicated by evolving circumstances governing archival storage.

"Large enterprises annually spend an estimated $300K on average for data migration software... When you take into account that these are annual spending trends, in an era where data has to be stored for decades, it is possible cumulative costs eventually exceed tens of millions of dollars." - TheInfoPro

According to TheInfoPro, large enterprises spend an estimated $300K per year on average for data migration software. Even more striking than those spending patterns are the labor and other costs involved with migrations, such as downtime, increased staffing, additional hardware and business disruption. These conditions can easily result in costs that exceed what organizations spend on data migration software tools. When you take into account that these are annual spending trends, in an era where data has to be stored for decades, it is possible that cumulative costs eventually exceed tens of millions of dollars. (4)

Factors Driving Reappraisal of Archival Storage Systems

Because of its long life, the cost implications associated with archive data can be quite large. In a 2003 case, several oil companies faced litigation because their MTBE additive had seeped into ground water. Gasoline production data was spread among hundreds of terabytes of stored data spanning as far back as 30 years. With data in both active and inactive databases, and other data available only from old backup tapes, the forecasted cost for litigation preparation, including ediscovery, review and production of required documents, was hundreds of millions, if not billions, of dollars. The case was settled in May 2008 for $423M. (5)

In the face of these costs and risks, organizations have implemented high-cost storage infrastructures in an attempt to enable quick response to business, legal and regulatory demands. Existing regulations, such as SEC Rule 17a-4, NASD 3010/3100, the Sarbanes-Oxley Act of 2002 and the Financial Services and Markets Act 2000, not only provide guidelines for how to store, retrieve and recover data, but often require certain data to be readily accessible for specified durations, whether two years, three years, five years, 99 years or for unspecified periods. While "readily accessible" is not well defined, it is often interpreted as available within a 24-hour period. The readily accessible requirement alone necessitates that data be stored on a medium that can support search, access and extraction of potentially large volumes of data within a short time frame; tape cannot support this requirement. The common legal process of filtering and culling information to respond to a discovery order with appropriate documents is also not possible when data are stored on tape.

Comparison of Tape vs. Disk-based Archive Storage Architecture

  Tape                        Disk
  Not readily accessible      Readily accessible
  $500K ediscovery cost       $50K ediscovery cost
  Frequent media failures     Infrequent media failures
  Not scalable                Scalable
  Slower performance          Faster performance
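To see how annual migration spending can compound into the tens of millions cited above, consider a rough model. Only the $300K annual software figure comes from this paper (attributed to TheInfoPro); the labor multiplier and growth rate below are illustrative assumptions, not published data.

```python
# Rough, illustrative model of cumulative data migration cost over decades.
# Only the $300K annual software spend comes from the paper; the labor
# multiplier and growth rate are assumptions for this sketch.

def cumulative_migration_cost(years, software_per_year=300_000,
                              labor_multiplier=2.0, annual_growth=0.10):
    """Sum annual migration spend, letting costs grow with data volume.

    labor_multiplier: assumed ratio of labor/downtime/hardware costs to
                      software spend (hypothetical).
    annual_growth:    assumed yearly growth in migration effort.
    """
    total = 0.0
    yearly = software_per_year * (1 + labor_multiplier)
    for _ in range(years):
        total += yearly
        yearly *= 1 + annual_growth
    return total

if __name__ == "__main__":
    for horizon in (10, 20, 30):
        print(f"{horizon} years: ${cumulative_migration_cost(horizon):,.0f}")
```

Under these assumptions, a decade of migrations already runs past $10M, consistent with the cumulative-cost concern raised above; the exact figure depends entirely on the assumed multiplier and growth rate.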

As a result of legal and regulatory developments, there are clear consequences for the cost of data storage: storage expenses related to legal discovery now represent 50 to 70 percent of the cost of litigation. (6) The costs to completely and comprehensively manage electronically stored information have risen dramatically, averaging $1,000 to $2,000 per gigabyte of data to identify, collect and process ESI. (7)

"Storage expenses related to legal discovery now represent 50 to 70 percent of the cost of litigation." - Contoural

As ediscovery grows in importance, requirements for storing archive data have begun to change. Historically, tapes have been used for archive data because they were viewed as inexpensive to purchase, do not have to be kept online, and do not consume power or require cooling. However, tapes are prone to media failures, which can render their data unrecoverable. Even without media failure, recovery from tape is time-consuming and costly because tape cannot be efficiently searched online. Companies utilizing third parties to retrieve relevant documents from tape for ediscovery find their costs can exceed $2,000 per tape. (8)

Since relying on tape for archive purposes leads to high ediscovery costs, organizations are shifting to disk-based archives, which can be significantly more cost-effective than tape. The Taneja Group estimates Fortune 1000 companies assume a minimum of $500,000 per lawsuit in discovery costs with tape-based archives, whereas ediscovery costs against a disk-based archive for the same lawsuit could easily be one-tenth that amount. (9)

Reappraisal of archive data storage should reflect the changing business environment, the evolution of laws and regulations, the different types of data, and the shortcomings of existing tape-based systems.
First-generation alternatives to tape did not effectively overcome the known issues: they have not scaled well, have performed poorly and have burdened organizations with additional operational costs. One reason cited for these failures is that companies did not deploy comprehensive records management software together with scalable storage designed for archive. Addressing the archive issue from a storage-only perspective also fails to address the core requirements for long-term data retention and access. (10)

Storing Static Data on Primary Storage Arrays is Expensive

Historically, disk-based storage, which was designed to enable dynamic modification and fast retrieval of data, has been used to access live, active data, with tape limited to backup and long-term preservation of static data. However, organizations are increasingly recognizing that even within a short timeframe (30 to 60 days), data updates are rare or perhaps even prohibited. The Storage Networking Industry Association now estimates that 60 to 80 percent of all stored data is static. (11)

Emerging from these facts is the understanding that two classes of disk-based storage are required: primary storage, associated with data creation and dynamic updating; and secondary storage, describing systems and devices intended for long-term access, moderate levels of retrieval performance, and much greater economy. When reappraising archive, firms need to evaluate the mix of storage devices and systems that properly accounts for the differing value and nature of data. While primary storage offers the fastest possible retrieval performance and is used just in case of need, storing all data on primary storage is costly, estimated at five to eight times more expensive than using secondary storage systems. (12) Given this large cost differential, it is worthwhile to reconsider assumptions about storage infrastructure, especially considering that data retrieval from secondary disk-based storage can easily meet the timelines imposed by ediscovery events or business opportunity. Disk-based secondary storage is a sound economic and risk-mitigation alternative.

"Storing all data on primary storage is costly, estimated at five to eight times more expensive than using secondary storage systems." - The Taneja Group

Considering the economic benefits of secondary storage, the stage has been set for new solutions that optimize the function and economics of archive storage.

Figure 1. NEC Unify Archive System Overview: scalable, resilient archive features

Solving Archive Challenges

Unify and NEC have joined forces to create a comprehensive solution for data archiving that meets all core requirements and features four standout characteristics: scalability; low TCO based on in-place technology refresh on an ageless storage platform (including application-level support for physical and logical migrations); easy upgrades; and global data deduplication. The solution is based on NEC's proven HYDRAstor storage grid technology and makes use of Unify's Core Archive Platform for Electronically Stored Information, with key applications for Search & Discovery, Records Management, Legal Case Management and Content Supervision, all with open web-based integration capabilities.
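The economics of tiering static data can be sketched with a simple blended-cost model. The static-data share (60 to 80 percent) and the five-to-eight-times cost ratio come from the figures cited above; the dollar-per-terabyte base price is a hypothetical placeholder, not a quoted price.

```python
# Illustrative blended-cost model for tiering static data to secondary storage.
# The static-data share (60-80%) and the 5-8x primary/secondary cost ratio
# are from the paper; the $/TB base price is a hypothetical placeholder.

def blended_cost_per_tb(static_fraction, cost_ratio, secondary_cost_per_tb=200.0):
    """Average $/TB when static data sits on secondary storage.

    static_fraction: share of data that is static (0..1)
    cost_ratio:      primary cost as a multiple of secondary cost
    """
    primary_cost = secondary_cost_per_tb * cost_ratio
    return (1 - static_fraction) * primary_cost + static_fraction * secondary_cost_per_tb

if __name__ == "__main__":
    all_primary = blended_cost_per_tb(0.0, 6)  # everything on primary storage
    tiered = blended_cost_per_tb(0.7, 6)       # 70% static moved to secondary
    print(f"all-primary: ${all_primary:.0f}/TB, tiered: ${tiered:.0f}/TB")
    print(f"savings: {100 * (1 - tiered / all_primary):.0f}%")
```

With a mid-range assumption of 70 percent static data and a 6x cost ratio, moving static data off primary storage cuts the blended cost per terabyte by more than half, which is the economic argument the section above makes in prose.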

Solution Highlights

Ease of Management
* Single point of access
* Corporate data tiering
* Long-term content access

Functionality
* Fast search & retrieval
* Federated search

Cost-effectiveness
* Global data deduplication
* Bandwidth-friendly replication

Compliance
* Auditable chain of custody
* WORM functionality

Scalability

Large enterprises already manage hundreds of terabytes of data, due in large part to the advent of personal computers, which has resulted in multiple sources of data and opportunities for repurposing data. This phenomenon, along with the lack of efficient means for data deduplication, has resulted in large amounts of redundant data. With annual data growth over the next five years expected to average 40 to 60 percent (1), most large enterprises can expect to be managing multiple petabytes of data by the early part of next decade.

Lower TCO

To deliver the best possible scalability, NEC and Unify have created a comprehensive core archival platform for ESI. This integrated hardware and software solution has been architected to address corporate data tiering, compliance and legal discovery requirements with a single point of access to disparate archived data formats, regardless of originating application or messaging system. Designed for long-term data management and accessibility, the product's dynamic data archive translation strategy is designed to promote content accessibility and readability over the lifespan of the data, regardless of the availability of the original application or the survivability of the originating data format. An integrated grid-based server bank provides memory-speed searches across large and growing volumes of archived data. The storage platform features global data deduplication to reduce storage space by up to 50 percent more effectively than alternative solutions.
The platform also includes an integrated, auditable chain of custody facility to track up to one trillion objects and events; support for different data types and email formats; and support for global, regional and centralized implementations of archive systems, with federated search to enable fast, accurate access and retrieval of data. Write-Once Read-Many (WORM) functionality for regulatory compliance and the capacity to discretely increment bandwidth or storage depending on workload requirements are just a few of the features that enable the NEC and Unify solution to deliver a better overall value proposition than alternatives.

Ease of Upgrades

Enabling easy upgrade of processors and disks without forklift upgrades allows resources to be added over time, and across technology generations, on an as-needed basis. When capacity is added, performance increases (unlike monolithic storage architectures, where performance degrades as capacity increases). HYDRAstor takes advantage of the inherent redundancy built into grid architectures to maintain availability and high performance without service disruption or loss of data.

The NEC-Unify solution relies on HYDRAstor's in-place technology refresh capability to eliminate downtime due to data migration. This capability reduces or eliminates the expense of data migration, which can be up to $1,800 per terabyte, as well as application and system downtime. Both bandwidth and capacity scale independently to allow deduplication of up to 20TB of data per day, dynamically utilizing whichever combination of bandwidth or storage is required to optimize performance and responsiveness.

"Data deduplication is a proven means of reducing storage space consumption by 95 percent or more. Because of application-aware deduplication, HYDRAstor is able to further reduce storage consumption by as much as 50 percent more effectively than other deduplication alternatives."

The NEC-Unify solution provides a simplified single management system and compliance archive application architecture that integrates the management of both archive functions and day-to-day storage. Management costs associated with provisioning (such as deploying multiple appliances, devices and arrays, and managing multiple data formats) are greatly reduced, while centralized control reduces operational expenses for personnel. By having only one system with one interface to administer, TCO is lowered for both hardware and data.

As a true archive technology, Unify software requires no associated relational database. This architecture ensures metadata and data are stored together within the storage system rather than maintained apart, reducing costs and providing infrastructure benefits for backup, restore, and disaster recovery/business continuity. The archive architecture also enhances support for replication strategies by replicating all metadata and data together, versus having to replicate each separately from a relational data store.

Global Data Deduplication

Global data deduplication enables the NEC-Unify solution to tune and optimize component interoperation to minimize access time, storage usage, hardware deployment, bandwidth consumption, and staff time spent on administration. Data deduplication is a proven means of reducing storage space consumption by 95 percent or more. One unique aspect of NEC's grid storage system is its ability to extend deduplication to become application-aware.
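To illustrate how deduplication achieves reductions of this magnitude, the sketch below splits data into fixed-size chunks and stores each unique chunk once, keyed by its SHA-256 hash. This is a generic, simplified technique for illustration only; it is not HYDRAstor's implementation, which uses its own grid-based, application-aware algorithms.

```python
import hashlib

# Generic fixed-block deduplication sketch (not NEC's actual algorithm):
# each unique chunk is stored once, keyed by its SHA-256 digest, and the
# stream is represented as a list of chunk references.

CHUNK_SIZE = 4096  # bytes; real systems often use variable-size chunking

def dedup_store(data: bytes, store: dict) -> list:
    """Split data into chunks, store unique ones, return a reference list."""
    refs = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # stored once, however often it recurs
        refs.append(digest)
    return refs

def restore(refs: list, store: dict) -> bytes:
    """Reassemble the original stream from chunk references."""
    return b"".join(store[d] for d in refs)

if __name__ == "__main__":
    store = {}
    # Ten identical "backups" deduplicate to a single stored copy of each chunk.
    backup = b"archive record\n" * 10_000
    for _ in range(10):
        refs = dedup_store(backup, store)
    logical = 10 * len(backup)
    physical = sum(len(c) for c in store.values())
    print(f"logical {logical} B, physical {physical} B, "
          f"saved {100 * (1 - physical / logical):.1f}%")
    assert restore(refs, store) == backup
```

Repeatedly archived content, such as weekly full backups, is where savings on the order of 90 percent and above arise: each later copy contributes only references, not stored bytes.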
NEC's solution extends the opportunity for cost savings and operational improvements. Because of application-aware deduplication, HYDRAstor is able to further reduce storage consumption by as much as 130 percent more effectively than application software-based deduplication.

Conclusion

This paper discusses the advantages of modernizing archive infrastructure. A modernized system offers the opportunity to increase flexibility, scalability and resiliency while lowering the costs associated with day-to-day management of storage infrastructure and those specifically associated with ediscovery. The NEC-Unify solution has the architectural foundation, features and proven deployment model to enable a single, all-encompassing data storage archive, regardless of application, server and networking infrastructure, or scope of operations, because of its scalability, in-place technology refresh capability, and overall resiliency and cost-efficiency. The offering provides the business continuance technology required for global operations.

NEC and Unify have created an opportunity for enterprise data centers to solve several fundamental problems associated with managing data over the very long term. The system offers comprehensive data management throughout the data lifecycle, can be deployed enterprise-wide, and eliminates the need for planned downtime during system upgrades.

Beyond delivering the scalability and resiliency required at the enterprise level with the lowest TCO, and just as significantly, the NEC-Unify archive solution does all this at a price point that enables all data to be brought under active management, extending the benefits across the enterprise.

References

1. Contoural presentation at TechTarget E-mail and File Archiving Seminar, July 2009
2. The Storage Networking Industry Association, http://www.csi1000.com/docs/100yratf_archive-Requirements-Survey_20070619.pdf
3. TheInfoPro Storage Study, Q3 2009
4. TheInfoPro Storage Study, Q3 2009
5. Contoural paper, Understanding Archiving from an IT Perspective, 2008, page 5
6. Contoural paper, Understanding Archiving from an IT Perspective, 2008, page 5
7. Contoural paper, Is There a Return on Investment for E-mail Archiving?, 2009, page 6
8. Contoural paper, ediscovery: Six Critical Steps for Managing E-mail, Lowering Costs and Reducing Risks
9. Taneja Group paper, Evaluating Grid Storage for Enterprise Backup, DR and Archiving: NEC HYDRAstor, September 2008, page 4
10. Peer Incite discussion hosted by Wikibon, July 31, 2009, entitled Building a Strategic Information Plan to Tame Unstructured Data
11. The Storage Networking Industry Association, http://www.csi1000.com/docs/100yratf_archive-Requirements-Survey_20070619.pdf
12. Taneja Group paper, Evaluating Grid Storage for Enterprise Backup, DR and Archiving: NEC HYDRAstor, September 2008, page 2

About NEC Corporation of America

NEC Corporation of America is a leading technology provider of network, IT and identity management solutions. Headquartered in Irving, Texas, NEC Corporation of America is the North America subsidiary of NEC Corporation. NEC Corporation of America delivers technology and professional services ranging from server and storage solutions, IP voice and data solutions, and optical network and microwave radio communications to biometric security and virtualization.
NEC Corporation of America serves carrier, SMB and large enterprise clients across multiple vertical industries. For more information, please visit www.necam.com.

NEC Corporation of America
2880 Scott Blvd.
Santa Clara, CA 95050
1 866 632-3226
1 408 844-1299
sales@necam.com
www.necam.com/hydrastor

2009, NEC Corporation of America. HYDRAstor, DynamicStor, DataRedux, Distributed Resilient Data (DRD), RepliGrid and HYDRAlock are trademarks of NEC Corporation; NEC is a registered trademark of NEC Corporation. Intel, the Intel logos, Xeon, and Xeon Inside are trademarks or registered trademarks of Intel Corporation in the U.S. and other countries. All other trademarks and registered trademarks are the property of their respective owners. All rights reserved. All specifications subject to change. (WP129-1_0909)