IBM Information Archive: Architecture and Internals




Christian Bolik (bolik@de.ibm.com), IBM Research & Development, November 2010

Disclaimer

Copyright IBM Corporation 2010. All rights reserved. U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED AS IS WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. IN ADDITION, THIS INFORMATION IS BASED ON IBM'S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE. IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, NOR SHALL HAVE THE EFFECT OF, CREATING ANY WARRANTIES OR REPRESENTATIONS FROM IBM (OR ITS SUPPLIERS OR LICENSORS), OR ALTERING THE TERMS AND CONDITIONS OF ANY AGREEMENT OR LICENSE GOVERNING THE USE OF IBM PRODUCTS AND/OR SOFTWARE.

IBM, the IBM logo, ibm.com, DB2, WebSphere, and FileNet P8 are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml

Agenda
- IBM's Smart Archive Strategy
- What is IBM Information Archive?
- Physical and Software Architecture of the IA Appliance
- Key Concepts and Features in IA

IBM Smart Archive Strategy
http://www.ibm.com/software/data/smart-archive/
[Figure: IBM Smart Archive strategy overview. Labels recovered from the diagram:]
- Information sources: Reports; ERP / CRM (SAP, PeopleSoft); Content (Documents, Images); Paper; Collaborative (Quickr, SharePoint); Data; Email (Notes, Exchange)
- Value Added Services: Optimization Services; System Services; Managed Services; Reference Architecture; Information Governance
- Optimized and Unified Assessment, Collection and Classification
- Flexible and Secure Infrastructure with Unified Retention and Protection: On Premise (Custom Config); Appliance (Pre-Config); As A Service (SaaS, Multiple Options); Cloud Ready Archive Storage with Optional ECM
- Integrated Compliance, Records Management, Analytics and eDiscovery

Existing IBM Archiving Solutions
[Figure: the Smart Archive strategy diagram, annotated with existing IBM products. Product labels recovered from the diagram:]
- Capture
- CMOD (Content Manager On Demand)
- ECM Repositories
- Optim
- SAP Archiving
- Content Collector and Classification Module
- Information Archive
- Enterprise Records and eDiscovery Analytics

Introducing IBM Information Archive
Next Generation Information Retention Solution
- Universal, scalable, and secure storage repository for structured and unstructured information, compliant or non-compliant
- Integrated archive appliance that combines the best of IBM software, hardware, and services
- Protects data by enforcing the industry's most stringent information retention laws
- Highly versatile, highly scalable information retention solution for mid-size and enterprise organizations

What is IBM Information Archive
- The successor of the IBM System Storage DR550
- A universal archiving repository for all types of content that addresses the complete information retention needs of midsize and enterprise clients faced with managing an increasing volume of information
- Combines fast accessible disk with low-cost tape within a single archive pool, so businesses can deploy an archive strategy that minimizes total cost of ownership over the life of the archived information
- Brings together IBM's General Parallel File System technology, Tivoli Storage Manager, and patent-pending Enhanced Tamper Protection to offer a high-performance, highly scalable, and secure platform
- Designed for archiving a broad range of electronic records, including e-mail, digital images, databases, applications, instant messages, account records, contracts or insurance claim documents, and other types of stored records

Information Archive Announcements
- 24.09.2010: IBM Information Archive for Email, Files, and eDiscovery
  - Bundles IA with servers and licenses for IBM Content Manager, IBM Content Collector for Files and Email, eDiscovery Manager and Analyzer
  - Offered with implementation services
- 22.02.2010: IBM Information Archive R1.2
  - Improved disaster recovery capabilities
  - Scales up to 608 TB (raw capacity; 444 TB usable with RAID 6)
- 26.10.2009: IBM Smart Archive Strategy
- 06.10.2009: IBM Information Archive R1.1

Information Archive Characteristics
- Universal
  - Common platform for archive of multiple types of data
  - Variety of data interfaces (NAS, TSM)
- Scalable
  - Scale-out of processing and storage
  - Tiered storage, including external tape
- Adaptable
  - Compliant and non-compliant archives
  - Pluggable architecture for future data-specific function
- Secure
  - Fully protected, lockable compliant store
  - No root access in full compliance mode through Enhanced Tamper Protection

Information Archive Architecture
[Figure: architecture diagram. Clients: NAS Client, TSM API Client, Web browser. Appliance front ends: NAS Interface, SSAM Server, IA Mgmt GUI. Back ends: GPFS Filesystem & IA Middleware, TSM Server. Disk storage is shown separately for Collection 1 and Collection 2.]

Physical Architecture

Hardware Redundancy
- All servers have redundant power supplies
- Redundant Ethernet switches
- Redundant Fibre Channel switches
- Dual iPDUs
- Bonded Ethernet port configuration
- Dual internal/external Ethernet paths

[Figure: 2231 IA3 File Archive Configuration, Main Rack (FC9910 specified; IA3 3-node, no application servers), with an iPDU on each side. Components from rack position 36 (top) to 1 (bottom):]
- RSM Server (FC5601): mandatory
- D1B Disk Exp #1-6: optional; 6+2P; 6+2P
- D1B Disk Exp #1-5: optional; 6+2P; 6+2P
- D1B Disk Exp #1-4: optional; 6+2P; 6+2P
- D1B Disk Exp #1-3: optional; 5+2P; S; 6+2P
- Keyboard, monitor, KVM: mandatory
- Two 24-port Brocade SAN24B4 FC switches: optional (but paired)
- Mgmt Server (FC5600): mandatory
- Two SMC 8126L2 26-port Ethernet 10/100/1G switches (46M2175): mandatory
- S2M Server: mandatory
- S2M Server (opt 1): optional
- S2M Server (opt 2): optional
- D1B Disk Exp #1-2: optional; 6+2P; 6+2P
- D1B Disk Exp #1-1: optional; 6+2P; 6+2P
- D1A Disk Ctrlr #1: mandatory; 5+2P; S; 6+2P

Capacity with 1 TB HDDs: 112 TB raw, 96 TB user (RAID 5), 82 TB user (RAID 6).
R1.2 added support for 2 TB drives: up to 224 TB raw, 164 TB usable (RAID 6).

Storage Redundancy

Storage Hardware Redundancy
- All servers have mirrored internal hard drives
- Each storage controller drawer has two controllers with failover capability
- RAID 6 used on all Fibre Channel-attached storage
- Dual paths from each archive node to storage controllers

[Figure: 2231 IS3 File Archive Expansion Rack (storage expansion rack for File Archive attachment only, IA3 with FC9910), with an iPDU on each side. Components from rack position 36 (top) to 1 (bottom):]
- D1B Disk Exp #2-5: optional; 6+2P; 6+2P
- D1B Disk Exp #1-5: optional; 6+2P; 6+2P
- D1B Disk Exp #2-4: optional; 6+2P; 6+2P
- D1B Disk Exp #1-4: optional; 6+2P; 6+2P
- D1B Disk Exp #2-3: optional; 5+2P; S; 6+2P
- D1B Disk Exp #1-3: optional; 5+2P; S; 6+2P
- D1B Disk Exp #2-2: optional; 6+2P; 6+2P
- D1B Disk Exp #1-2: optional; 6+2P; 6+2P
- D1B Disk Exp #2-1: optional; 6+2P; 6+2P
- D1B Disk Exp #1-1: optional; 6+2P; 6+2P
- D1A Disk Ctrlr #2: optional; 5+2P; S; 6+2P
- D1A Disk Ctrlr #1: mandatory; 5+2P; S; 6+2P

Capacity with 1 TB HDDs: 192 TB raw, 164 TB user (RAID 5), 140 TB user (RAID 6).
R1.2 added support for 2 TB drives: up to 384 TB raw, 280 TB usable (RAID 6).
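The RAID 6 capacity figures on the two rack slides can be reproduced from the per-enclosure drive layouts. The sketch below is an illustration only: it assumes "6+2P" means six data drives plus two parity drives and "S" means one spare drive, readings that are inferred from the totals and not stated on the slides. All function and variable names are hypothetical.

```python
import re

def parse_layout(layout):
    """Return (data_drives, total_drives) for a layout such as '5+2P; S; 6+2P'."""
    data = total = 0
    for part in (p.strip() for p in layout.split(";")):
        if part == "S":  # assumed: a single spare drive
            total += 1
            continue
        match = re.fullmatch(r"(\d+)\+(\d+)P", part)
        if match is None:
            raise ValueError(f"unrecognised array spec: {part!r}")
        d, p = int(match.group(1)), int(match.group(2))
        data += d
        total += d + p
    return data, total

def rack_capacity_tb(layouts, drive_tb):
    """Return (raw_tb, usable_tb) for a list of enclosure layouts."""
    parsed = [parse_layout(layout) for layout in layouts]
    raw = sum(total for _, total in parsed) * drive_tb
    usable = sum(data for data, _ in parsed) * drive_tb
    return raw, usable

FULL = "6+2P; 6+2P"
WITH_SPARE = "5+2P; S; 6+2P"

# Main rack, bottom to top: Ctrlr #1, Exp #1-1, #1-2, #1-3, #1-4, #1-5, #1-6
MAIN_RACK = [WITH_SPARE, FULL, FULL, WITH_SPARE, FULL, FULL, FULL]
# Expansion rack: Ctrlr #1, Ctrlr #2, then Exp #x-1 to #x-5 for both controllers
EXPANSION_RACK = [WITH_SPARE] * 2 + [FULL] * 4 + [WITH_SPARE] * 2 + [FULL] * 4

if __name__ == "__main__":
    for drive_tb in (1, 2):
        main = rack_capacity_tb(MAIN_RACK, drive_tb)
        exp = rack_capacity_tb(EXPANSION_RACK, drive_tb)
        print(drive_tb, "TB drives:", main, exp, (main[0] + exp[0], main[1] + exp[1]))
```

Under these assumptions, a fully populated main rack plus expansion rack with 2 TB drives gives the 608 TB raw and 444 TB usable quoted for R1.2.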

Software Architecture

Software Failover
- NFS client
  - Clustered NFS (CNFS) is integrated with GPFS clustering: when a node fails, the NFS virtual IP is moved to another server in the cluster, and the locks are migrated as well
  - Given the known stateless semantics of NFS, customers should sync data before they consider it committed to the system
- HTTP (read)
  - HTTP shares the virtual IP with clustered NFS. When CNFS fails over, HTTP sessions are redirected to the failed-to node
  - The client has to reauthenticate, because HTTP state is not shared between nodes in the cluster
- TSM/SSAM API and Archive Client
  - The TSM/SSAM server IP address is moved to another node; TSM/SSAM is restarted and performs transaction recovery
  - The API/Archive Client retries on the server IP address and either recovers and continues the current transaction, or fails the transaction and rolls it back
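The advice to sync data before treating it as committed can be followed on the client side with an explicit fsync. The following is a minimal sketch, assuming a POSIX-style client with the collection mounted over NFS; the function name is hypothetical and nothing here is specific to the IA product.

```python
import os

def write_committed(path, payload):
    """Write payload to a new file and return only after fsync has completed."""
    # O_EXCL makes the call fail if the file already exists, so an
    # archived document is never silently overwritten.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o640)
    try:
        view = memoryview(payload)
        while view:
            written = os.write(fd, view)  # may write fewer bytes than requested
            view = view[written:]
        # Ask the OS to flush the file to stable storage; on an NFS mount
        # this is expected to push cached writes to the server.
        os.fsync(fd)
    finally:
        os.close(fd)
```

An application would record a document as archived only after `write_committed` returns without raising. If a failover interrupts the write, the call raises an `OSError`; after removing any partial file, the application can retry it.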

Information Archive Collections
- Collection = set of archived documents managed under the same policy domain
- Types of collections: NAS, SSAM
- Multiple collections per IA appliance (up to 3 in R1) with separation of data

Interface Policy
- Interface protocol (NAS or TSM API)
- Object commit method (NAS only): XML file, timeout, NetApp SnapLock™

Retention Policy
- Controlled internally or by external application
- Time-based (internal) and event-based (external) retention
- Automatic (internal) or manual (external) deletion after expiration

Storage Policy
- Disk replication
- Tape storage tier
- Encryption (tape only)
- Deduplication
- Shredding (SSAM only in R1)

Compliance Policy

  Mode           Delete before expire?   Shorten retention period?   Lengthen retention period?
  Basic          Yes                     Yes                         Yes
  Intermediate   No                      Yes                         Yes
  Maximum*       No                      No                          Yes

*SSAM collections are always Maximum compliance
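The compliance-mode table can be expressed as a small lookup that an application might consult before attempting a change. This is only a model of the table above, not the appliance's API; every class and function name is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum

@dataclass(frozen=True)
class Permissions:
    delete_before_expiry: bool
    shorten_retention: bool
    lengthen_retention: bool

class Mode(Enum):
    # Values mirror the rows of the compliance policy table.
    BASIC = Permissions(True, True, True)
    INTERMEDIATE = Permissions(False, True, True)
    MAXIMUM = Permissions(False, False, True)

def effective_mode(collection_type, configured):
    """SSAM collections are always Maximum compliance, whatever is configured."""
    return Mode.MAXIMUM if collection_type.upper() == "SSAM" else configured

def may_change_retention(mode, current_days, requested_days):
    """Return True if the mode allows moving from current_days to requested_days."""
    perms = mode.value
    if requested_days < current_days:
        return perms.shorten_retention
    if requested_days > current_days:
        return perms.lengthen_retention
    return True  # no change requested

def may_delete(mode, expired):
    """Expired documents may be deleted in any mode; unexpired ones only if allowed."""
    return expired or mode.value.delete_before_expiry
```

For example, `may_change_retention(Mode.MAXIMUM, 365, 180)` is `False`, while lengthening the period is allowed in every mode.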

Customizable Policy-based Retention Features
[Figure: three timelines, each starting at Day 0 and ending at X]
- Time Based: Day 0, then a minimum fixed period, then X. Dispose after a fixed period from the creation date.
- Event Based with Fixed Protection Periods: Day 0, a minimum fixed period, the event, a fixed period, then X. Dispose after a fixed period from the event date.
- Event Based with no Fixed Protection Period: Day 0, the event, then X. Dispose after the event.
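The three retention schemes can be sketched as date arithmetic. The slide does not say how the minimum fixed period interacts with the post-event period; the sketch assumes the minimum period acts as a floor counted from the creation date, so the disposal date is the later of the two. Function names and parameters are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional

def time_based_expiry(created: date, period_days: int) -> date:
    """Time based: dispose a fixed period after the creation date."""
    return created + timedelta(days=period_days)

def event_based_expiry(created: date, event: Optional[date],
                       period_days: int = 0,
                       minimum_days: int = 0) -> Optional[date]:
    """Event based: dispose a fixed period after the event date.

    Returns None while no event has been signaled. With the defaults
    (both periods 0) this is the 'no fixed protection period' case,
    where disposal is allowed as soon as the event occurs.
    """
    if event is None:
        return None
    after_event = event + timedelta(days=period_days)
    floor = created + timedelta(days=minimum_days)  # assumed interpretation
    return max(after_event, floor)
```

With a one-year minimum and a 30-day post-event period, an event one month after creation leaves the document protected until the one-year mark, while an event two years later extends protection to 30 days after that event.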

Information Archive NAS Collection Architecture (R1)
1. Scale-out NAS interface with global namespace
2. XML-based application metadata (optional)
3. Parallel ingest processing, including indexing
4. Seamless tiered storage through TSM HSM
5. Disk-based metro and global mirroring
[Figure: NAS collection architecture with callouts 1 to 5. Labels: Ingest Processing; Plug-In (advanced indexing, analytics, tagging); Parse/Index; Data + Metadata]

Ingestion Chain of IA NAS Collections
[Figure: timeline of seven numbered steps between the Archiving Application and the Information Archive Manager. A legend attributes each step to the Archiving Application, the Event Log Processor, or the Policy Manager.]
Steps shown:
- Write document and metafile (files)
- Set retention and explicitly commit (optional)
- Process event log, dispatch change log
- Update audit log
- Implicitly commit document (if required)
- Apply policies, set service class and retention
- Pre-migrate to Level 2 storage

Meta Files (NAS Collections only)
- Use existing standards (NFS, XML) to extend the user's ability to manage files in the archive via NAS protocols
- NetApp SnapLock™ compatibility only provides time-based retention
- The addition of metafiles enables support of time-based and event-based retention, policy-based retention, and retention hold/release
- Customer/application can define user fields
  - Binding and non-binding (binding fields cannot be updated after the document is committed)
  - User-defined fields will be indexed in the future
- Event-based retention, holds, etc. can be signaled via EVENT fields in meta files
- User/application can use standard file protocols to update meta files

Meta File Example

Directory layout:
  /IIA/col1/data
  /IIA/col1/data/file1.doc
  /IIA/col1/meta
  /IIA/col1/meta/file1.doc

Sample document text shown on the slide:
  Now is the time for all good men to come to the aid of their country.

Meta file:
  <?xml version="1.0" encoding="utf-8"?>
  <fields>
    <_SYSTEM_md5Checksum> da1e100dc9e7bebb810985e37875de38 </_SYSTEM_md5Checksum>
    <USER_confidential> Yes </USER_confidential>
    <_EVENT_setRetention_> 20101231 </_EVENT_setRetention_>
  </fields>
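Because meta files are plain XML reachable over standard file protocols, an application can read them with any XML parser. The sketch below uses Python's standard library to load the fields from a meta file shaped like the example and to compare an MD5 checksum against a document's bytes. The function names are hypothetical, and reading the retention value as a YYYYMMDD date is an assumption based on the sample value, not a documented format.

```python
import hashlib
import xml.etree.ElementTree as ET
from datetime import date, datetime

def read_metafile(xml_bytes):
    """Return a dict of field name -> stripped text from a <fields> document."""
    root = ET.fromstring(xml_bytes)
    if root.tag != "fields":
        raise ValueError(f"unexpected root element: {root.tag!r}")
    return {child.tag: (child.text or "").strip() for child in root}

def checksum_matches(document, fields):
    """Compare the document's MD5 digest with the _SYSTEM_md5Checksum field."""
    expected = fields.get("_SYSTEM_md5Checksum")
    if expected is None:
        return False
    return hashlib.md5(document).hexdigest() == expected.lower()

def retention_date(fields):
    """Parse _EVENT_setRetention_ as YYYYMMDD (assumed format); None if absent."""
    raw = fields.get("_EVENT_setRetention_")
    if not raw:
        return None
    return datetime.strptime(raw, "%Y%m%d").date()
```

A caller would pass the bytes of the meta file under the collection's meta directory and the bytes of the matching file under the data directory.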

Security and Compliance Characteristics
- Role-based security
  - Security, Systems, and Archive Administrator roles
  - Archive User, Service Engineer, and Auditor roles
- Audit logs provide a compliant audit trail
- Physical security: locking cabinet
- To achieve maximum compliance protection with IBM IA, customers enable the patent-pending Enhanced Tamper Protection feature
  - Removes root login capability from the IBM IA cluster
  - Neither the customer nor IBM has root login authority
  - Once enabled, it cannot be disabled
  - Expected admin and support operations are pre-programmed to remove the need for root access
  - Best practice is to enable it during installation
  - For an unforeseen emergency requiring root access, there is a procedure that delivers a signed, time-bounded patch to the customer

Information Archive Integrated, Web-based Interface
- Initial configuration is integrated and guided
  - Start archiving data in 1 day or less
- Consolidated and integrated management in a single administrative interface*
  - Security, administration, monitoring, notifications (email and SNMP), troubleshooting, serviceability
  - Manage multiple collections from a single interface
  - Role-based administrative security
- Simple, efficient, consistent administrative experience
  - Wizards guide the user through the creation of objects
  - Overviews provide high-level situational awareness of system health
  - Consolidation of information for efficient administration
  - Task-oriented, with drill-down where needed for deeper configuration and troubleshooting

IBM Information Archive Key Feature Summary

Feature: IBM Information Archive
- Ease of use: Appears as a file server; simply drag and drop files to IA.
- Installation & Implementation: Information Archive is physically installed by a CE. Quick configuration via wizards. Dynamically add storage.
- User Interface: Task-oriented GUI to manage the entire solution (+ CLI).
- Application Interfaces: NFS, CIFS*, HTTP*, FTP*, SSAM, all within IA.
- Scalability: Supports billions of objects: 1 billion per collection, up to 3 collections per IA (more in the future*).
- Retention Policies: Accepts retention policies from applications or users (event-based or time-driven).
- Compliance / Security: Each collection can be configured with a different protection level (basic, intermediate, maximum). New patent-pending tamper-proof technology increases protection.
- Index / Search: Full-text indexing and search (both data and metadata) to be added in a future release*.

* Statement of direction for a future release

Links for IBM Information Archive
- Homepage: http://www.ibm.com/systems/storage/disk/archive
  - Also access support information (technical notes etc.) from this page
- Redbook: http://www.redbooks.ibm.com/abstracts/sg247843.html?open
  - Or go to http://www.redbooks.ibm.com and search for Information Archive
- IA Wiki at developerWorks: http://www.ibm.com/developerworks/wikis/display/tivolidoccentral/ibm+information+archive
  - Or go to http://www.ibm.com/developerworks/wikis and search the page for Information Archive

THANK YOU!