IBM Infrastructure for Long Term Digital Archiving
A new generation of archival storage
Rudolf Hruška, Information Infrastructure Leader, IBM Systems & Technology Group
rudolf_hruska@cz.ibm.com
2010 IBM Corporation
Agenda
- Concept and industry standards for digital archives
- IBM Information Archive Architecture
- Key functions and secured storage
- Using disks and tapes
- Smart Archive Solution for Email, Files, eDiscovery
IBM Smart Archive Strategy and Information Archive 2010 IBM Corporation
Globally, storage requirement is 80% file-based unstructured data, and growing
[Chart: Worldwide Storage Capacity Shipped by Segment, 2008-2013]
Source: IDC, State of File-Based Storage Use in Organizations: Results from IDC's 2009 Trends in File-Based Storage Survey, Dec 2009, Doc # 221138
Key Questions for Archives
- How is data added to the archive (Ingestion)?
- How is data stored, protected & managed over time (Compliance, Preservation)?
- How is data retrieved when needed (Discovery, Compliance)?
Digital Archiving Infrastructure Ecosystem
- Content Generating Applications
- Archiving Application: TSM Client, Content Collector, Optim, ISVs
- Content Repository: Content Manager, FileNet P8
- Storage Repository: IBM Information Archive, Tape
What are Open Archival Information Systems (OAIS)?
- Archival Information System: the hardware, software and people responsible for acquiring, preserving and disseminating information
- Reference model for long-term preservation of digital information
- Based on NASA work on space data handling and conservation
- ISO 14721:2002
Requirements for Archival Storage Solution
- Openness enabling maximum user flexibility
  - Interface: standard open file system interface
  - Files: ability to store content in open file formats
  - Platform: ability to migrate archive from one platform to another
- More control over your environment
  - A single repository for multiple applications, and content not tied to any specific application
  - A common management interface for storage and archive platforms
- Rapid content discovery
  - Integrated search across multiple data types
- Self-managing
  - Set of policies on the content
Storage Systems Concepts for Archival
[Diagram: three storage stacks, top to bottom]
- Application > Middleware > File System > Blocks > Disk Subsystem
- Application > Middleware > Files > File System > Disk Subsystem
- Application > Objects > Middleware > File System > Disk Subsystem
The two faces of archiving - Why archive?

Data Retention: protect your data for the long term with non-erasable, non-rewriteable storage solutions and demonstrate compliance with regulations
- E-mail: IBM Content Collector for Email, IBM CommonStore for Exchange/Domino, IBM FileNet Email Manager, Symantec Enterprise Vault, Zantaz, AXS-One, EMC MailXtender
- Enterprise Content Management: IBM Content Manager, CM OnDemand, IBM FileNet Image Manager, P8 Content Manager, IBM Content Collector Enterprise Bundle, Hyland OnBase, EMC Documentum
- Databases: IBM Optim, IBM CommonStore for SAP, IBM FileNet Application Connector for SAP, Solix, EMC DatabaseXtender
- File Systems: IBM Content Collector for File Systems, Tivoli Storage Manager (TSM) HSM for Windows and Space Management for Unix and Linux, TSM Archive Client, SSAM Client, Symantec/KVS Enterprise Vault, EMC DiskXtender

Space Management: move old, inactive files to less expensive storage, resulting in reduced backup windows and reduced storage costs

9/29/2010
Archive versus Backup

Archive                             Backup
For active retrieval                For operational support and recovery
Moves data                          Copies data
Adds operational efficiencies       Supports availability
Long-term in nature                 Short-term in nature
Data typically secured              Data typically overwritten
Useful for regulatory compliance    Not a good solution for regulatory compliance
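The "moves data" versus "copies data" distinction can be illustrated at the file-system level. The following is only a minimal sketch using the Python standard library, not how TSM, SSAM or Information Archive implement either operation; the function names `backup` and `archive` and the directory names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def backup(src: Path, dest_dir: Path) -> Path:
    """Backup copies data: the original stays on primary storage."""
    return Path(shutil.copy2(src, dest_dir / src.name))


def archive(src: Path, dest_dir: Path) -> Path:
    """Archive moves data: the original leaves primary storage."""
    return Path(shutil.move(str(src), str(dest_dir / src.name)))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical storage tiers, modelled as plain directories.
    primary = root / "primary"
    backup_store = root / "backup"
    archive_store = root / "archive"
    for directory in (primary, backup_store, archive_store):
        directory.mkdir()

    report = primary / "report.txt"
    report.write_text("quarterly figures")

    backup(report, backup_store)
    exists_after_backup = report.exists()    # still on primary storage

    archived_path = archive(report, archive_store)
    exists_after_archive = report.exists()   # gone from primary storage
    archived_exists = archived_path.exists()

print(exists_after_backup, exists_after_archive, archived_exists)
```

Because the archive step removes the file from primary storage, later backups of that storage have less to process, which is one way to read the "adds operational efficiencies" entry in the comparison.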
Introducing IBM Information Archive
- A universal, scalable, and secure storage repository for structured and unstructured information
- A fully integrated archive appliance for quick time to value
- Addressing the complete information retention needs of mid-size and enterprise clients
- Supports main archiving applications including IBM ECM and Optim
- Leverages tape in the backend for lower TCO
- A flexible repository that supports compliance and non-compliance archiving
Information Archive Overview
[Diagram: IBM Information Archive appliance and its connections]
- Clients on the customer network: NFS Client, SSAM Client, Administrator
- Remote Support Manager: Call Home
- Management Console: IA Graphical User Interface
- Information Archive Cluster: Information Archive Software (SSAM or TSM HSM, GPFS and IA Middleware)
- Disk Storage: primary data store
- IBM Tape Library
Advanced Protection Features
- Multiple Protection Levels: manages data retention. Maximum protection mode is for strict business, legal or regulatory retention needs
- Enhanced Tamper Protection: patent-pending feature eliminates root access
- Enhanced Disaster Recovery: Advanced Copy Services increase the availability of archived documents and prevent data loss in the event of a disaster
- Encryption: added security for data storage and remote data transmission
- Shredding: the destruction of deleted data to make it difficult to discover or reconstruct that data later
[Figure: Admin GUI]
Archive Collections for Flexibility and Scalability
- Supports multiple ingest and input models including custom applications
[Diagram: users and applications (ECM, archive repository, custom applications) reach one namespace over the LAN via NFS and SSAM; the clustered IBM Information Archive holds Disk Collection 1 (NAS), Disk Collection 2 (NAS) and Disk Collection 3 (SSAM), with tape behind it]
Key Information Archive Function Overview
- Protection Levels: Basic -> Intermediate -> Maximum
- Integrated management interface (GUI)
- Direct migration path from DR550 to IA
- Embedded data deduplication and compression functionality
- Embedded monitoring and alerting
- High availability and disaster protection
- Extensive auditing
Disaster protection through mirroring
[Diagram: an active and a standby IA node, each running the GPFS filesystem and IA middleware, serve the IA application over the LAN]
- Failover between IA nodes is aided by scripts
- Management via LAN
- IA Disk Collection (active) is mirrored synchronously or asynchronously via SAN to IA Disk Collection 2 (standby)
- IA tape or other devices
Using Blended Tape and Disk Solutions for Archive
Scenario: store 250TB with 25% growth rate over 10 years
[Chart: 10 Year TCO Analysis]

Best Practices
- Multiple degrees of protection: at least three copies of data in different locations and one out of region for DR
- Technology diversification: copies on different forms of media to avoid a media or system process disaster
- I/O isolation: at least one copy offline to avoid intentional / unintentional corruption
- Protect access to data: at rest and in transit

Source: Choosing the Right Hardware and Software for Data Protection, Mesabi Group
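To give a sense of scale for the 250TB scenario, the growth can be worked through in a few lines. This is a rough sketch only: it assumes the 25% growth compounds once per year, which the slide does not state, and the function name `projected_capacity_tb` is hypothetical. It does not reproduce the Mesabi Group TCO figures.

```python
def projected_capacity_tb(initial_tb: float, annual_growth: float, years: int) -> list[float]:
    """Capacity at the end of each year, assuming annual compounding (an assumption)."""
    return [initial_tb * (1 + annual_growth) ** year for year in range(years + 1)]


# Scenario from the slide: 250 TB, 25% growth, 10 years.
capacities = projected_capacity_tb(250.0, 0.25, 10)

for year, capacity in enumerate(capacities):
    print(f"Year {year:2d}: {capacity:8.1f} TB")

# Best practice above: at least three copies of the data.
MIN_COPIES = 3
total_with_copies_tb = capacities[-1] * MIN_COPIES
print(f"Year 10 with {MIN_COPIES} copies: {total_with_copies_tb:8.1f} TB")
```

Under that assumption the archive grows by a factor of about 9.3, to roughly 2,328 TB after ten years, before counting the additional copies recommended in the best practices.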
IBM Smart Archive Strategy
IBM Information Archive for Email, Files and eDiscovery is a specific IBM Smart Archive solution delivering the areas denoted in orange
[Diagram: IBM Smart Archive Strategy]
- Information sources: Reports, ERP / CRM (SAP, PeopleSoft), Content (Files) (Documents, Images), Paper, Collaborative (Quickr, SharePoint), Data, Email (Notes, Exchange)
- Value Added Services: Optimization Services, System Services, Managed Services, Reference Architecture, Information Governance
- Optimized and Unified Assessment, Collection and Classification
- Flexible and Secure Infrastructure with Unified Retention and Protection: On Premise (Custom Config), Appliance (Pre-Config), As A Service (SaaS, Multiple Options)
- Cloud Ready Archive Storage with Optional ECM
- Integrated Compliance, Records Management, Analytics and eDiscovery
IBM Information Archive for Email, Files and eDiscovery
What is in the package
- IBM ECM Software
  - Content Collector Discovery Analytics Starter Pack (CCDA SP)
  - Content Collector for Email
  - Content Collector for File Systems
  - eDiscovery Manager & eDiscovery Analyzer
  - IBM Content Manager Enterprise Edition
- IBM Hardware
  - IBM Information Archive
  - System x M3 Servers (2)
- IBM Customer Solution Center (CSC) pre-configuration services
  - Includes pre-configuration of all ECM software and Information Archive by IBM CSC
- Implementation services at the customer site
  - Integrate into customer environment
  - Implementation services are for the whole package including IA and ECM software
  - Delivered by IBM ECM Lab Services or by qualified IBM Business Partners
Why IBM Information Archive
- Purpose-built archive repository to store information for compliance or business purposes for long-term preservation
- Only IBM offers a truly flexible retention system (IBM Information Archive) with a multiple-collections option, each collection with its own information protection level
- Industry leader in compliance archiving: seamless retention / hold management of both data and document objects, and the industry's most secure repository through enhanced tamper protection
- Only IBM offers automated, integrated data and media migration capability (to disk and tape) to keep technology current and information viable
- IBM Information Archive can be combined with a WORM tape offering for the lowest TCO
- Ability to offer an end-to-end archiving and eDiscovery solution including hardware, software and services, all from the same vendor
- Compared to competitors, IBM supports an open, standards-based architecture: CIFS, NFS and open APIs with sample code enable a flexible architecture for partners to develop additional capabilities
Thank You rudolf_hruska@cz.ibm.com 2010 IBM Corporation
Disclaimer

The Techdocs information, tools and documentation ("Materials") are being provided to IBM Business Partners to assist them with customer installations. Such Materials are provided by IBM on an "as-is" basis. IBM makes no representations or warranties regarding these Materials and does not provide any guarantee or assurance that the use of such Materials will result in a successful customer installation. These Materials may only be used by authorized IBM Business Partners for installation of IBM products and otherwise in compliance with the IBM Business Partner Agreement.

This information is provided on an "AS IS" basis without warranty of any kind, express or implied, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. Some jurisdictions do not allow disclaimers of express or implied warranties in certain transactions; therefore, this statement may not apply to you.

Important notes: IBM reserves the right to change product specifications and offerings at any time without notice. This publication could include technical inaccuracies or typographical errors. References herein to IBM products and services do not imply that IBM intends to make them available in all countries. IBM makes no warranties, express or implied, regarding non-IBM products and services, and any implied warranties of merchantability and fitness for a particular purpose. IBM makes no representations or warranties with respect to non-IBM products. Warranty, service and support for non-IBM products is provided directly to you by the third party, not IBM. All part numbers referenced in this publication are product part numbers and not service part numbers. Other part numbers in addition to those listed in this document may be required to support a specific device or function. When referring to storage capacity, GB stands for one billion bytes; accessible capacity may be less.
Maximum internal hard disk drive capacities assume the replacement of any standard hard disk drives and the population of all hard disk drive bays with the largest currently supported drives available from IBM.

IBM Information and Trademarks

The following terms are trademarks or registered trademarks of the IBM Corporation in the United States or other countries or both: the e-business logo, IBM, System x, System p, System Storage. SnapLock is a registered trademark of Network Appliance Corporation in the United States. Intel, Pentium 4 and Xeon are trademarks or registered trademarks of Intel Corporation. Microsoft Windows is a trademark or registered trademark of Microsoft Corporation. Linux is a registered trademark of Linus Torvalds. Other company, product, and service names may be trademarks or service marks of others.