Archiving For Dummies, Oracle Special Edition
by Lawrence C. Miller, CISSP
Archiving For Dummies, Oracle Special Edition

Published by John Wiley & Sons, Inc., 111 River St., Hoboken, NJ

Copyright 2012 by John Wiley & Sons, Inc., Hoboken, New Jersey

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without the prior written permission of the Publisher. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) , fax (201) , or online at go/permissions.

Trademarks: Wiley, the Wiley logo, For Dummies, the Dummies Man logo, A Reference for the Rest of Us!, The Dummies Way, Dummies.com, Making Everything Easier, and related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries, and may not be used without written permission. Oracle is a registered trademark of Oracle International Corporation. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc., is not associated with any product or vendor mentioned in this book.

LIMIT OF LIABILITY/DISCLAIMER OF WARRANTY: THE PUBLISHER AND THE AUTHOR MAKE NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE ACCURACY OR COMPLETENESS OF THE CONTENTS OF THIS WORK AND SPECIFICALLY DISCLAIM ALL WARRANTIES, INCLUDING WITHOUT LIMITATION WARRANTIES OF FITNESS FOR A PARTICULAR PURPOSE. NO WARRANTY MAY BE CREATED OR EXTENDED BY SALES OR PROMOTIONAL MATERIALS. THE ADVICE AND STRATEGIES CONTAINED HEREIN MAY NOT BE SUITABLE FOR EVERY SITUATION. THIS WORK IS SOLD WITH THE UNDERSTANDING THAT THE PUBLISHER IS NOT ENGAGED IN RENDERING LEGAL, ACCOUNTING, OR OTHER PROFESSIONAL SERVICES.
IF PROFESSIONAL ASSISTANCE IS REQUIRED, THE SERVICES OF A COMPETENT PROFESSIONAL PERSON SHOULD BE SOUGHT. NEITHER THE PUBLISHER NOR THE AUTHOR SHALL BE LIABLE FOR DAMAGES ARISING HEREFROM. THE FACT THAT AN ORGANIZATION OR WEBSITE IS REFERRED TO IN THIS WORK AS A CITATION AND/OR A POTENTIAL SOURCE OF FURTHER INFORMATION DOES NOT MEAN THAT THE AUTHOR OR THE PUBLISHER ENDORSES THE INFORMATION THE ORGANIZATION OR WEBSITE MAY PROVIDE OR RECOMMENDATIONS IT MAY MAKE. FURTHER, READERS SHOULD BE AWARE THAT INTERNET WEBSITES LISTED IN THIS WORK MAY HAVE CHANGED OR DISAPPEARED BETWEEN WHEN THIS WORK WAS WRITTEN AND WHEN IT IS READ.

For general information on our other products and services, please contact our Business Development Department in the U.S. at
For details on how to create a custom For Dummies book for your business or organization, contact
For information about licensing the For Dummies brand for products or services, contact

ISBN: (pbk); ISBN: (ebk)

Manufactured in the United States of America
Contents at a Glance

Introduction
Chapter 1: Recognizing Today's IT Challenges
    Explosive Data Growth
    Diverse Data Types and Uses
    Legal and Regulatory Requirements
    Logical and Physical Data Migration
    Rising Costs
Chapter 2: Archive 101
    How Does an Archive Differ from a Backup?
    What Are the Different Types of Archives?
    Why Is Tape the Best Archive Media?
Chapter 3: Archive Components
    Archive Software
    Tape Software
    Tape Libraries
    Drives and Media
Chapter 4: Archive Use Cases
    Healthcare
    Media and Entertainment
    Telecommunications
    High-Performance Computing (HPC)
Chapter 5: Ten Key Factors to Consider in Implementing Your Archive
Publisher's Acknowledgments

We're proud of this book and of the people who worked on it. Some of the people who helped bring this book to market include the following:

Acquisitions, Editorial, and Vertical Websites
    Senior Project Editor: Zoë Wykes
    Editorial Manager: Rev Mengle
    Acquisitions Editor: Katie Feltman
    Senior Business Development Representative: Karen L. Hattan
    Custom Publishing Project Specialist: Michael Sullivan

Composition Services
    Senior Project Coordinator: Kristie Rees
    Layout and Graphics: Lavonne Roberts
    Proofreader: John Greenough
    Special Help from Oracle: Scott Allen, Doug Chamberlain, Donna Harland, Cindy McCurley, Arthur Pasquinelli, Christine Rogers, Allison Roth, Mark Schaffer, Kerstin Woods

Publishing and Editorial for Technology Dummies
    Richard Swadley, Vice President and Executive Group Publisher
    Andy Cummings, Vice President and Publisher
    Mary Bednarek, Executive Director, Acquisitions
    Mary C. Corder, Editorial Director

Publishing and Editorial for Consumer Dummies
    Kathleen Nebenhaus, Vice President and Executive Publisher

Composition Services
    Debbie Stailey, Director of Composition Services

Business Development
    Lisa Coleman, Director, New Market and Brand Development
Introduction

Since the beginning of time, mankind has communicated written ideas and information with symbols. From the cave paintings of the Paleolithic Age and the hieroglyphs of ancient Egypt to modern alphabets around the world, information becomes more or less permanent when it is written. All that is required to read these permanent records is the ability to see and interpret them.

Today, enormous amounts of information, whether trivial or profound, are written and recorded digitally in thousands of different applications and formats at an absolutely stunning pace. Yet, ironically, this digital information is written as symbols (1s and 0s on magnetic media) that represent other symbols (alphabets, for example) that cannot possibly be seen by human eyes, let alone interpreted, without the proper tools: computers and their associated software and applications.

Managing these vast repositories and archives for our use today is a challenge in and of itself. But what computers and technology will exist 50, 100, or even 1,000 years from now to interpret the wealth of information that modern society has amassed? What will be the predominant file format? Will your expensive enterprise hard disks be unreadable fossils in the next millennium? Or will all of our achievements over the last 50 years be lost to future generations in what Popular Mechanics has called the "Digital Ice Age"?
While this book can't answer all of these questions for the ages, it can help you solve your organization's archive and data management challenges today, and for at least the foreseeable future!

About This Book

This book consists of five short chapters, covering today's data archiving challenges, the basics of archives, archive components, use cases, and key factors to consider for your archive solution. Each chapter is written as a stand-alone chapter, so feel free to start reading anywhere and skip around throughout the book!

Icons Used in This Book

Throughout this book, we occasionally use icons to call attention to important information that is particularly worth noting. Here's what to expect.

This icon points out information that may well be worth committing to your nonvolatile memory!

If you're an insufferable insomniac or vying to be the life of a World of Warcraft party, take note. This icon explains the jargon beneath the jargon.

Thank you for reading, hope you enjoy the book, please take care of your writers! Seriously, this icon points out helpful suggestions and useful nuggets of information.
Chapter 1
Recognizing Today's IT Challenges

In This Chapter
    Seeing the data forest and all its trees
    Using, and re-using, different types of data
    Complying with data retention regulations
    Keeping data formats and media current
    Managing data storage costs

Data retention in our modern digital era is a major challenge for businesses and organizations of all sizes, in all industries, worldwide. Common issues include the explosive growth of digital data, different data types and uses, complex regulatory requirements, data migration difficulties, and rising power, space, cooling, and management costs. This chapter explores these data retention challenges in depth.
Explosive Data Growth

The march of digital data growth continues at a stunning pace. The 2011 IDC Digital Universe Study estimates that by 2020 the total amount of digital information created, captured, and replicated will grow to 35,000 exabytes (see Figure 1-1). Just to put that in context, it would take almost 1.9 quadrillion (yes, quadrillion) trees to print 35,000 exabytes of data! That's nearly 5,000 times the number of trees on the entire planet (which NASA estimates at approximately 400 billion)!

A terabyte is equal to 1024 gigabytes, a petabyte is equal to 1024 terabytes, and an exabyte is equal to 1024 petabytes.

Figure 1-1: The nature of storage and data management has to change!
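If you like to check the math yourself, the unit ladder and the tree comparison above can be worked through in a few lines of Python. This is only a sketch of the arithmetic: every figure comes straight from the text, and the variable names are made up for illustration.

```python
# Binary unit ladder from the text: each step up is a factor of 1024.
GB_PER_TB = 1024
TB_PER_PB = 1024
PB_PER_EB = 1024

projected_exabytes = 35_000   # IDC estimate for 2020, quoted in the text
trees_to_print = 1.9e15       # "almost 1.9 quadrillion" trees, quoted in the text
trees_on_planet = 400e9       # NASA estimate quoted in the text

# Convert the projection down the ladder.
projected_petabytes = projected_exabytes * PB_PER_EB
projected_gigabytes = projected_petabytes * TB_PER_PB * GB_PER_TB

# How many planets' worth of trees would the printout need?
planet_multiples = trees_to_print / trees_on_planet

print(f"{projected_petabytes:,} PB")
print(f"{projected_gigabytes:.3e} GB")
print(f"{planet_multiples:,.0f} times the planet's trees")
```

The last line works out to 4,750, which is where "nearly 5,000 times" comes from.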
In many industries (such as health care, life sciences, media/entertainment, and energy) and in specialized markets (such as video surveillance and product life cycle management), the shift to digital content is now beyond the point of no return. These digital transformations are already spurring exponential increases in image data and associated content. The expanding use of automated sensors, high-resolution medical scanners, earth observation satellites, and high-performance technical computing applications across a broad range of industries is likewise driving much of this data growth.

At the same time, companies are leveraging more collaboration, social networking, and web-based business applications to boost productivity and improve customer support. Large databases are at the heart of many of these applications. Data mining and analysis of these databases for business intelligence to improve efficiencies and market opportunities is driving the need for storage-intensive data warehousing.

Diverse Data Types and Uses

Enterprises must not only manage the growth of data, but also recognize the value and types of data and its anticipated uses within their organizations. It is widely estimated that more than 80 percent of all organizational data is unstructured. This means that the vast majority of your storage capacity is being used
for e-mail, documents, images, and audio and video files. This unstructured data probably has a different value than the data in your business-critical databases, for example. Rather than treating all of your data equally, shouldn't your lower-value data have a correspondingly lower storage cost?

According to IDC, unstructured data is projected to grow at a compound annual growth rate (CAGR) of more than 60 percent, compared to approximately 20 percent CAGR for transactional data.

Data use and re-use presents another challenge, or opportunity, for organizations seeking innovative solutions to their growing data storage costs. Eighty percent of all data (both structured and unstructured) is never again used or accessed after 90 days. How often do you look at an e-mail message, a sales transaction, or a shipping manifest that is more than 90 days old? Yet this data is frequently stored on the same high-speed, high-performance, high-cost disks as the rest of your active production data.

At the same time, when you do need a file from last year, it holds high value to you again. Therefore, you cannot simply delete all of that data. And with advances and new ways to search and analyze data coming every day, the data you consider inactive today may hold untapped value just around the corner.

Archive data must still be retained, protected, and readily accessible when needed, but there are lower-cost alternatives that are better suited to data that is infrequently or never again accessed.
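To get a feel for what those two growth rates mean side by side, here is a minimal sketch of the compound growth formula, volume x (1 + CAGR)^years. The 60 and 20 percent rates come from the IDC figures above; the 100 TB starting volume is a made-up number chosen only to make the comparison easy to read.

```python
def project(volume, cagr, years):
    """Compound a starting volume at a fixed annual growth rate."""
    return volume * (1 + cagr) ** years

start_tb = 100.0  # hypothetical starting volume for each class of data

for year in (1, 3, 5):
    unstructured = project(start_tb, 0.60, year)    # 60 percent CAGR
    transactional = project(start_tb, 0.20, year)   # 20 percent CAGR
    print(f"year {year}: unstructured {unstructured:8.1f} TB, "
          f"transactional {transactional:6.1f} TB")
```

Under these assumptions, the unstructured data grows more than tenfold in five years while the transactional data grows about two and a half times, which is why the two deserve different storage treatment.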
Legal and Regulatory Requirements

Increasingly stringent data retention and protection regulations and complex compliance requirements also contribute to the data growth problem. These include the U.S. Health Insurance Portability and Accountability Act (HIPAA), Gramm-Leach-Bliley Act (GLBA), Sarbanes-Oxley (SOX), Canada's Management of Information Technology Security (MITS) directive, the EU Statutory Audit and Company Reporting Directive (EuroSox), and Japan's Financial Instruments and Exchange Law (J-SOX), among others.

According to the Storage Networking Industry Association (SNIA), 80 percent of organizations participating in a recent survey responded that they are required to retain data for more than 50 years, and 68 percent of companies require a 100-year archive! These include governments, digital libraries, research organizations, and industries that need to keep track of data on population-wide drug interactions or individual aircraft for 10, 20, or 50 years, for example.

How long does your organization's archive horizon need to be? Challenge your retention requirements to ensure that they do not expose your organization to excess costs and liability, but still meet your business needs.

Not only do organizations today have longer data retention requirements, but they also have to have
their archives readily available and easily accessible, not just locked away in a cave somewhere.

It is absolutely critical that your organization have the right combination of archive hardware and software to ensure your data can be archived efficiently, securely, and reliably. You must be able to accurately catalog and index the contents of your archives and quickly restore them to online storage or other media when needed. In the event of litigation, this capability will help your organization reduce the scope of legal discovery and quickly comply with a subpoena while controlling legal costs.

Logical and Physical Data Migration

Long-term retention of digital information also creates unique technical issues for organizations. These include the logical and physical migration of archive data.

Data must be updated, typically every three to five years, to newer formats that are supported by current and future applications. This cycle is known as logical data migration. Although most common applications today provide some level of backward compatibility for data created and saved with older versions of that application, there are limits to that compatibility, particularly for proprietary applications and unstructured data.
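A three-to-five-year migration cycle adds up quickly over a long retention horizon. The sketch below counts how many migrations fall inside a retention period, assuming (purely for illustration) that migrations happen on a fixed schedule and that no migration is needed in the final year, when the data is due to expire anyway. The function name is hypothetical.

```python
def migrations_needed(retention_years, cycle_years):
    """Count migrations that fall strictly inside the retention period."""
    if retention_years <= 0 or cycle_years <= 0:
        raise ValueError("retention_years and cycle_years must be positive")
    # Migrations happen at cycle, 2*cycle, ... and stop before retention ends.
    return (retention_years - 1) // cycle_years

for horizon in (50, 100):
    for cycle in (3, 5):
        count = migrations_needed(horizon, cycle)
        print(f"{horizon}-year archive, {cycle}-year cycle: {count} migrations")
```

Under these assumptions, a 100-year archive on a five-year cycle is migrated 19 times, and on a three-year cycle 33 times, so the cost of each migration matters a great deal.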
For example, many word processing, CAD (computer-aided design), and graphics file formats that were in popular use just 10 or 15 years ago are now obsolete and unreadable. One solution to this problem is to convert data to a common plain-text format, such as ASCII (American Standard Code for Information Interchange) or Unicode. However, these formats do not maintain the original data structure or metadata, and cannot support rich-text features and graphic images.

Physical data migration refers to the need to copy archived data to newer storage media in order to preserve its integrity over time. Like logical data migration, this typically happens every three to five years, depending on the media type. Physical data migration is also necessary to ensure that current media formats are used, and that current backup and archiving software can read, write, and catalog the data properly.

Both logical and physical data migrations require extensive time and resources. As the volume of organizational data continues to grow, so too do the resources required to migrate that data.

Rising Costs

Although the cost per gigabyte of storage has steadily decreased over time, energy and storage management costs are increasing. Storage consumes almost half of all data center power today, and it is growing at a rapid rate. Within ten
years, the total power consumed by storage will easily represent the majority of the energy consumed in the data center.

The cost of managing this data is exploding as well. The increasing number of data sources, data formats, and government and industry regulations, together with the business-critical nature of data, is driving up management costs year over year even faster than energy utilization rates. Data and storage management will soon become the number one cost within many data centers.

Considering that 80 percent of all data older than 90 days is never looked at again, you need a better way to deal with massive amounts of data storage. It is more important than ever to align the value of data with the capabilities and cost of the media on which it is stored. This can best be achieved with storage and archive solutions that:

    Drive the cost of storage used for data that is almost never accessed again to virtually zero
    Assure access to valuable content that needs to be accessed over the long term
    Increase the amount of data that storage and database administrators can manage
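To see why aligning data value with media cost pays off, consider a rough blended-cost calculation. The 80/20 split between inactive and active data comes from the text; the cost figures are hypothetical units invented for this sketch, not real prices.

```python
def blended_cost_per_tb(active_fraction, fast_cost, archive_cost):
    """Average cost per TB when only active data stays on fast storage."""
    inactive_fraction = 1 - active_fraction
    return active_fraction * fast_cost + inactive_fraction * archive_cost

# Hypothetical cost units per terabyte (illustrative only, not real prices).
fast_cost = 100.0
archive_cost = 10.0

everything_on_fast = blended_cost_per_tb(1.0, fast_cost, archive_cost)
tiered = blended_cost_per_tb(0.2, fast_cost, archive_cost)  # 80 percent inactive

print(f"all on fast storage: {everything_on_fast:.1f} per TB")
print(f"tiered by activity:  {tiered:.1f} per TB")
```

With these made-up numbers, tiering cuts the average cost per terabyte from 100 to 28 units. Your own ratio depends entirely on your actual media costs and how much of your data is truly inactive.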
Chapter 2
Archive 101

In This Chapter
    Defining archives versus backups
    Using disk-based and tape-based archives
    Choosing the best archive media

This chapter explains exactly what an archive is and helps you to differentiate archives from backups. You also find out about the different archive types and why tape is the best media for long-term archive data.

How Does an Archive Differ from a Backup?

An archive is data storage that is used for long-term retention of permanent records and information. Archives consist of data that is no longer modified or regularly accessed but is still important and has value to the organization. Archive data is retained for a period of time, as defined by organizational policy (or indefinitely), for future reference and for legal or regulatory compliance. Archives must be cataloged, fully indexed, and searchable, so that data can be easily located and retrieved when needed.
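In practice, "no longer modified or regularly accessed" usually comes down to an age rule. Here is a minimal sketch that lists files under a directory that have not been modified in a given number of days. The function name and the 90-day default (borrowed from the 90-day figure in Chapter 1) are assumptions for illustration, not how any particular archive product works. It uses modification time because access times are not reliably updated on every file system.

```python
import time
from pathlib import Path

def archive_candidates(root, max_idle_days=90, now=None):
    """Yield files under root whose last modification is older than the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400  # seconds in a day
    for path in Path(root).rglob("*"):
        # Only regular files are candidates; directories are skipped.
        if path.is_file() and path.stat().st_mtime < cutoff:
            yield path
```

A real archive policy would then move each candidate to the archive repository and record it in the catalog; this sketch only identifies the candidates.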
Data archives are sometimes confused with data backups. Although both archives and backups may employ similar hardware and software technologies, they are distinctly different in several ways.

Data archives are used for long-term retention of permanent records. In contrast, data backup is a copy of data that is still in production and is regularly accessed or modified. Archive data is analogous to finished product, whereas production data (and its associated backups) is analogous to work-in-progress (WIP).

A backup is a copy of data. An archive is the data.

Because production data is regularly accessed and modified, it is susceptible to corruption or destruction. In such an event, the backup copy is used to restore the original data.

The purpose of an archive is long-term retention of permanent records. The purpose of a backup is to create a short-term copy of production data in case the original data is corrupted or destroyed.

Organizations typically employ a combination of different backup routines to maintain an accurate copy of production data. These include

    Full backups: All of the data is copied.
    Incremental backups: Only data that has changed since the last backup is copied.
    Differential backups: Only data that has changed since the last full backup is copied.
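The difference between incremental and differential backups is easy to blur, so here is a small sketch that applies the three definitions above to a list of files. The FileInfo type, the function name, and the integer timestamps are all hypothetical; real backup software tracks changes in far more sophisticated ways.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FileInfo:
    name: str
    modified: int  # hypothetical timestamp, e.g. hours since some starting point

def select_for_backup(files, kind, last_full, last_backup):
    """Return the names of files that a backup of the given kind would copy."""
    if kind == "full":
        return [f.name for f in files]                    # everything
    if kind == "incremental":
        return [f.name for f in files if f.modified > last_backup]  # since any backup
    if kind == "differential":
        return [f.name for f in files if f.modified > last_full]    # since last full
    raise ValueError(f"unknown backup kind: {kind}")

files = [FileInfo("a.doc", 5), FileInfo("b.doc", 15), FileInfo("c.doc", 25)]
last_full, last_backup = 10, 20  # full backup at 10, most recent backup at 20

for kind in ("full", "incremental", "differential"):
    print(kind, select_for_backup(files, kind, last_full, last_backup))
```

In this example the incremental backup copies only c.doc, while the differential backup copies both b.doc and c.doc, because b.doc changed after the last full backup even though a later backup already captured it.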
By comparison, archiving simply moves data to a separate repository, based on a pre-defined policy such as the last time a file was accessed or modified.

Both archives and backups must be cataloged so that data can be located when needed. However, archives also require robust indexing and searching capabilities. A typical archive request may be: "I need to locate all files that contain the phrase 'clinical drug trials' created between 1995 and 2002." A similar request for a file restore from backup should result in the requester being banned from ever using a computer again: "I just accidentally deleted a file that contained the word 'oops' created between 1995 and 2002, but I have no idea what the complete name of the file was, what directory it was in, or what server it was on. Can you drop everything and restore it for me?!"

Speed is important to both archives and backups, but for different reasons. The ability to quickly index files and perform accurate full-text searches of extremely large (several terabytes or more) data repositories is critical for locating archive data. Archive data is, by definition, data that is not regularly accessed or modified, so it can be migrated to an archive from the production environment at pretty much any time.

Backups today are increasingly being performed on production data in near real-time as backup systems and software become more robust and sophisticated. But regardless of the backup systems and software, backups can still limit access to certain files while running,
and can adversely affect system and network performance. For these reasons, most backups typically still occur in a backup window during nonproduction hours. Speed is critical to ensure that all production data can be backed up during the allotted backup window.

Speed is also critical when restoring backups. In the event of a disaster, quickly and correctly restoring systems and data can be a daunting task that is of utmost importance to the continuity of business operations. On a much smaller scale, individual disasters happen almost every day, requiring a fast recovery capability: "Eek, I just deleted my presentation for our sales meeting and it's only an hour away!"

Finally, archives and backups often use similar types of storage media. However, archives and backups each have unique characteristics that should more clearly dictate the storage media that is most appropriate for each use.

Archive data is written to media only once, but may be accessed many times over a period of many, many years. Over time, the amount of archive data within an organization typically grows exponentially (refer to Chapter 1). For these reasons, your primary factors for selecting archive media (in order) should be

    1. Reliability
    2. Cost
    3. Speed

Backup tapes are constantly handled and rotated through a backup cycle that performs numerous high-speed reads and writes of data. This significantly shortens the life of a backup tape. Although you may be replacing backup
tapes on a regular basis, archive tapes are not normally subjected to that same level of wear and tear. Archive tapes typically have a 30-year life, though you may perform data migrations more frequently (refer to Chapter 1).

Backup data is written to the same media many times over a relatively short period, defined by your organization's backup cycle. For example, your organization may have a six-week backup cycle that enables it to recover data or system configurations up to six weeks old. Typically, a backup must be completed during a limited backup window to minimize its impact during production hours. For these reasons, your primary factors for selecting backup media (in order) should be

    1. Speed
    2. Reliability
    3. Cost

Do not confuse a backup cycle with a recovery point objective (RPO). A backup cycle defines the oldest version of data that can be recovered. An RPO, used in disaster recovery and business continuity plans, defines the most current version of data that can be recovered.

What Are the Different Types of Archives?

Archives can be either disk-based or tape-based. A disk-based archive usually consists of large disk subsystems or storage arrays and is typically implemented with a tiered storage system.
A tiered storage system maintains an organization's production data on its highest-performance drives, such as serial-attached SCSI (SAS) or solid-state drives (SSD), and automatically moves archive data to slower drives, such as serial ATA (SATA).

Disk-based archive data must also be maintained at an off-site location for disaster recovery purposes. This requires a similar disk configuration at a secondary data center with sufficient network bandwidth for copying and replication between the two sites. Disk-based archives can be very costly to acquire, operate, and maintain.

A tape-based archive is usually implemented with a tape library. Tape-based archive data can be quickly accessed and restored when needed. Today's tape technology reduces the latency to access data to very acceptable times for most organizations.

Today's enterprise-class tape libraries have advanced capabilities that include automatic compression, WORM (Write-once, Read-many) technology, and encryption. These tape libraries can be automatically managed to augment expensive disk storage capacity with less costly tape-based storage.

Finally, tapes containing archive data can be easily and securely copied locally, and then transported to an off-site location for disaster recovery purposes. Alternatively, an additional copy of the archive data can be created at a remote location.
Why Is Tape the Best Archive Media?

Disk space is an important part of any enterprise data storage strategy, but it is simply not practical or even desirable to use disk exclusively for all of your enterprise storage needs.

Enterprise storage is not an either/or proposition. Flash storage, disk, and tape all have their place in an enterprise tiered-storage strategy, and you have to use the right tool for the job. Flash storage is ideal for tasks with intensive I/O requirements where speed is the most critical factor. Disk works best for primary storage and as a staging area for backups. And tape is ideal for backups and archives. Tape and disk storage systems can and should coexist in a tiered-storage strategy.

Many storage vendors paint tape storage as an inferior solution to disk, a last-generation technology on the verge of extinction (a dinosaur, if you will). But the reality is that tape storage is not a dinosaur. Tape storage continues to be a key component in the enterprise data center, and most of the world's information is actually stored on tape! This has been true for many years, and will be well into the future.

Tape also has better error correction rates and longer refresh cycles than disk (see Table 2-1).
Table 2-1: Tape versus Disk

Performance Characteristic           Disk       Tape
Max shelf life (bit rot)             10 yrs     30 yrs
Best practices for data migration    4-5 yrs    yrs
Uncorrected bit error rate
Power and cooling                    290x       1x

Finally, tape has a significantly lower total cost of ownership (TCO) compared to disk. The cost and performance advantages of tape include

    Acquisition costs. The Clipper Group (www.clipper.com) estimates that the cost to implement a disk-based archive is 15 times more than the cost of a tape-based archive.

    Energy savings. An enterprise-class tape library uses much less energy (290 times less according to the Clipper Group!) than disk because it doesn't spin 24/7 like disk. In a 2010 study, the Clipper Group concluded that the cost of energy alone for the average disk-based solution exceeds the entire TCO for the average tape-based solution.

    Management savings. Tape has a higher ratio of petabytes managed per storage administrator than disk. This translates to lower overall labor costs.

    Longevity. No matter how you store your data, eventually it has to be moved, either due to obsolescence or deterioration of the storage media. It is not uncommon for archive data to remain on
tape for up to a decade (though the tape itself can last up to 30 years); disk archives, by contrast, typically need to be replaced every three to five years.

    Scalability. Tape storage systems are highly scalable: simply add more tapes for additional capacity and more drives for performance. The amount of tape and capacity that can be stored in a tape library dwarfs the capacity of comparable disk storage systems. Thus, you get more petabytes of storage per square foot in the data center with tape than with disk.

    Data integrity and auditing. Assuming the data is good when you archive it and the storage media is properly maintained, with tape, WYSIWYG ("what you see is what you get") becomes "what you store is what you get." But disk is constantly subject to corruption due to bad sectors, disk failure, malware, or accidental overwrites.

Debunking five myths about tape storage

Myth #1: Tape is more expensive than disk. Tape costs less per terabyte, consumes less energy, and is less expensive to operate than disk.

Myth #2: Tape is cheaper to buy, but more expensive to operate. The Data Mobility Group reports the TCO for a Serial ATA-based disk storage system is 11 times higher than an LTO-based tape configuration over a seven-year period.
Myth #3: Tape has gone away; no enterprise data center uses it. Most enterprise organizations use a tiered storage strategy with tape as the foundation layer, and nearly half of the world's data is stored on magnetic tape.

Myth #4: Tape is unreliable. The bit error rate (BER) for Oracle's enterprise tape products is more than 4 million times better than enterprise disk.

Myth #5: Tape is a greater security risk than disk. Tape is designed to be portable and therefore has a higher potential for loss. As a result, tape encryption became a necessity long before other storage media encryption and is far more advanced. Tape encryption is built into the tape drive and runs without performance degradation.
Chapter 3
Archive Components

In This Chapter
    Managing your archive data
    Managing your tape data
    Checking out tape libraries
    Comparing tape drives and media

In this chapter, you learn about the key components of an enterprise archive: archive software, tape software, tape libraries, and drives and media.

Archive Software

Archive software components consist of content and data management software.

Content management software

Oracle's WebCenter Content unifies data into a single repository where organizations can track information uniformly using metadata and logging. This information can then be integrated with business processes and enterprise applications.
WebCenter Content offers best-in-class capabilities for managing data logically throughout its lifecycle based on business needs. Questions such as when data will need to be accessed, when it needs to be stored, and when it needs to be deleted apply to every instance of data a company manages or generates. WebCenter Content automatically manages those lifecycle decisions based on organizational policies to help organizations extract more value from the data.

Archiving best practices are to always have at least two copies of your archive data on tape. With WebCenter Content, you can keep up to four copies on tape.

WebCenter Content provides

    Intelligent content management
    Collaboration and re-use access to multiple applications
    Content management policies based on content
    Central search engine capabilities

Data management software

Working in conjunction with content management software applications, such as WebCenter Content (see the preceding section), Oracle's Sun Storage Archive Manager (SAM) provides physical storage management and is used to optimize data placement across multiple tiers of storage, which can include tape and remote storage, as well as high-capacity disk storage.
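The general idea behind this kind of data placement is that users see one file system while the contents quietly live on whichever tier suits them, and archived data is brought back to disk only when someone asks for it. The toy class below sketches that idea with two in-memory tiers. It is a conceptual illustration only: the class and method names are invented here, and it does not represent SAM's interfaces or internals.

```python
class TieredStore:
    """Toy model of one namespace spread over a disk tier and an archive tier."""

    def __init__(self):
        self.disk = {}      # name -> bytes currently held on primary disk
        self.archive = {}   # name -> bytes held on the archive tier

    def write(self, name, data):
        """New data always lands on primary disk first."""
        self.disk[name] = data

    def archive_and_release(self, name):
        """Copy a file to the archive tier, then free its disk space."""
        self.archive[name] = self.disk.pop(name)

    def listing(self):
        """Every file is visible, no matter which tier holds it."""
        return sorted(set(self.disk) | set(self.archive))

    def read(self, name):
        """Return file contents, staging from archive to disk if needed."""
        if name not in self.disk:
            self.disk[name] = self.archive[name]  # stage back to disk
        return self.disk[name]


store = TieredStore()
store.write("scan-0001.img", b"image bytes")
store.archive_and_release("scan-0001.img")
print(store.listing())              # the file is still listed
print(store.read("scan-0001.img"))  # reading it stages it back to disk
```

The design point is that the caller never has to know which tier a file is on: listing and reading work the same way either way.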
SAM presents the file system as if all data is located on primary disk. When data that resides only on archive devices is accessed, SAM dynamically stages the data to the primary disk or directly to the application for immediate access. SAM works transparently in the background with tiered storage and makes archive copies based on policies that define file system characteristics.

SAM can manage

    Thousands of SAN clients
    Hundreds of file systems
    Millions of files
    Petabytes of disk cache
    Exabytes of archive

WebCenter Content manages what the data does and what it means; SAM dynamically manages exactly where the data resides in a hierarchy of storage media and protects the data with advanced features that include integrity checks, WORM, and encryption.

Tape Software

Tape software components consist of tape analytics and the Linear Tape File System (LTFS).

Tape analytics

Oracle's StorageTek Tape Analytics software simplifies tape storage management, taking a proactive approach to eliminate library, drive, and media errors through an intelligent monitoring software application exclusively available for Oracle StorageTek tape libraries.
With StorageTek Tape Analytics software, you gain insight into detailed health information that helps you to make decisions about your tape environment prior to device failures (see Figure 3-1).

Figure 3-1: Oracle StorageTek Tape Analytics.

Efficiently monitoring your storage environment is key to cost management. When archive applications encounter problems due to tape drive or media exchange errors, assets sit idle, administrators scramble, and data transfer to end-users is delayed. Any of these setbacks may have significant costs associated with them, leaving storage budgets depleted and users frustrated. With StorageTek Tape Analytics' proactive approach to tape monitoring, errors are reduced, data flows freely, and the cost of managing an archive is ultimately reduced.

StorageTek Tape Analytics is built to meet four key needs of all archive storage environments:
  Smart: Intelligent algorithms compute hardware health recommendations.
  Secure: Out-of-band tape monitoring adds zero risk for implementation.
  Simple: A tool that monitors tape so customers don't have to. Easy to deploy with a single IP connection to each library and a single pane-of-glass interface.
  Scalable: Supports multiple libraries and multiple sites, and is designed to meet needs ranging from single-library users to the world's largest archives.

Linear Tape File System (LTFS)

In order to present a complete file image to a user, two types of data need to be stored: the file metadata, which contains the file structure, file names, file format, and other data elements that are indexed to simplify access to the data on the tape; and the file data, which is the raw file content stored on the tape.

An LTFS-formatted tape is split into two partitions. The smaller of the two partitions, at the beginning of the tape, holds the file metadata for all of the files on the tape. In the metadata partition, files are stored in a hierarchical directory structure. The rest of the tape, the second partition, is dedicated to data storage, as tape storage has done for decades.

Because LTFS is an open format, anyone with a compatible tape drive and the drivers to operate it can read an LTFS tape without assistance from any other software. Oracle's open source StorageTek Linear Tape File System (LTFS), Open Edition software enables customers to write files to tape in this self-describing format, much the same way files are written to disk and flash storage devices.
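To make the two-partition layout concrete, here is a minimal Python sketch. It is a toy in-memory model, not the LTFS on-tape format and not Oracle's software: the class name ToyLtfsTape, its methods, and the use of byte offsets as an index are all assumptions made for illustration. It shows only the idea that a small index (metadata) partition records where each file lives, while the data partition holds raw content written sequentially.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLtfsTape:
    """Toy model of a two-partition tape (hypothetical, for illustration only)."""
    # Index partition: maps a file path to (offset, length) in the data partition.
    index: dict = field(default_factory=dict)
    # Data partition: raw file content, appended sequentially as on tape.
    data: bytearray = field(default_factory=bytearray)

    def write_file(self, path: str, content: bytes) -> None:
        # Append the content to the data partition, then record its location.
        offset = len(self.data)
        self.data.extend(content)
        self.index[path] = (offset, len(content))

    def list_files(self) -> list:
        # A directory listing needs only the index partition.
        return sorted(self.index)

    def read_file(self, path: str) -> bytes:
        # Look up the location in the index, then read from the data partition.
        offset, length = self.index[path]
        return bytes(self.data[offset:offset + length])


tape = ToyLtfsTape()
tape.write_file("/projects/report.txt", b"quarterly archive")
tape.write_file("/projects/data.bin", b"\x00\x01\x02")
listing = tape.list_files()
report = tape.read_file("/projects/report.txt")
```

In this model, listing the files never touches the data partition, which is the property that lets a loaded tape show its folder structure much as a disk does.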
When a piece of tape media is loaded into a tape drive, the complete file folder image is displayed, with the file structure being pulled from the first partition and the raw file content being accessed from the second partition.

StorageTek LTFS is extremely flexible, with support for all three major tape drive offerings: Oracle's StorageTek T10000C tape drive and Oracle's StorageTek LTO-6 tape drives from HP or IBM.

Tape Libraries

A tape library is a key infrastructure component in a tiered-storage strategy. Tiered storage aligns the value of your data assets with the most appropriate storage media in order to reduce cost and effectively manage data throughout its lifecycle (see Figure 3-2). Tape libraries provide comprehensive, highly scalable storage solutions for backup and archive applications in enterprise, midrange, distributed, and entry-level data center environments.

Figure 3-2: Tiered storage comprises disk and tape.
To learn more about tiered storage, download your free copy of Storage Tiering For Dummies, Oracle Special Edition at us/products/servers-storage/storage/index.html.

Large-scale archive (more than 500 TB)

For enterprise data centers storing more than 500 TB, Oracle offers enterprise-class tape libraries: Oracle's StorageTek SL3000 and StorageTek SL8500 modular library systems. For large archives (greater than 5 PB), the Oracle StorageTek SL8500 modular library system (the world's first exabyte storage solution) delivers significant value through heterogeneous data consolidation and multigeneration media support in an ultra-dense footprint.

Both the StorageTek SL8500 and StorageTek SL3000 use a unique centerline architecture in which drives are kept at the center of the library, thereby alleviating robot contention. Robots travel one-third to one-half the distance required by other libraries, thereby improving cartridge-to-drive performance by up to 50 percent over other libraries.

StorageTek SL8500 and SL3000: At a glance

The StorageTek SL8500 and StorageTek SL3000 modular tape library systems (see the following table) are flexible, highly scalable storage solutions that feature

  Scalability and performance with capacity on demand so that you can install physical capacity in advance, then tap into it incrementally when you need it
  RealTime Growth capability to non-disruptively add more cartridge slots, drives, and robotics
  Easy consolidation with Any Cartridge Any Slot technology for seamless mixed-media support that allows you to combine heterogeneous data sources and media types slot by slot for optimal consolidation
  Industry-leading availability with redundant and hot-swappable robotics and library control cards

                                 SL8500                        SL3000
  Cartridge Slots:               Up to 100,000                 Up to 5925
  Capacity (Compressed):         Up to 1 exabyte               Up to 60 petabytes
  Tape Drives:                   Up to 640                     Up to 56
  Native Throughput (TB/hr):     Up to                         Up to 48.4
  Tape Drive Choices:            T10000C/B/A                   T10000C/B/A
                                 T9840D/C/B/A                  T9840D/C
                                 LTO 6/5/4/3/2                 LTO 6/5/4/3
  Number of Physical Partitions: Eight                         Eight
  Redundant Components:          Robotics, electronics,        Robotics, electronics,
                                 control path CAPS, fans,      control path CAPS, fans,
                                 power                         power
When integrated with a tiered-storage strategy that includes disk, Oracle's Storage Archive Manager (SAM) and Oracle's StorageTek T10000C tape media, the StorageTek SL8500 delivers a highly scalable enterprise-class archive system.

Entry to mid-range archive ( TB)

The StorageTek SL150 modular tape library is ideal for consolidation of distributed environments, which helps you save time, space, and energy by consolidating multiple libraries and applications into a central location. The StorageTek SL150 is also perfect for rack-based D2D2T (disk-to-disk-to-tape) solutions, when combined with Oracle's Storage Archive Manager and Oracle's disk storage solutions.

For organizations with small to mid-range ( TB) data archiving needs, the StorageTek SL150 provides a flexible and scalable archive solution. However, for customers in this segment who have high availability requirements or also need mainframe connectivity, the StorageTek SL3000 (discussed in the previous section) is the right archive solution.

StorageTek SL150: At a glance

The StorageTek SL150 is a simple, scalable, rack-mounted tape library, focused on growing businesses. The StorageTek SL150 features

  Industry-leading ease of use and simplicity with an installation wizard to ensure swift installation, remote management GUI, and customer serviceability model.
  Scalability from 30 to 300 cartridge slots enables capacity to reach 1.5 PB (compressed) with the use of StorageTek LTO 6 half-height tape drives
  Expansion modules that offer seamless performance and capacity upgradeability, by easily upgrading tape drives and adding cartridges

The SL150 modular tape library easily scales from entry to mid-range as data storage needs grow:

  SL150                        Base module              With 10 modules
  Cartridge Slots:             Up to 30                 Up to 300
  Capacity (Native):           Up to 75 terabytes       Up to 750 terabytes
  Tape Drives (LTO 5):         Up to two half-height    Up to 20 half-height
  Native Throughput (TB/hr):

Oracle's StorageTek SL150 modular tape library is the first scalable tape library designed for the entry to mid-range market. Built from Oracle software and StorageTek library technology, it delivers an industry-leading combination of ease of use and scalability.
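The native capacity figures quoted for the SL150 follow from simple arithmetic: cartridge slots multiplied by native capacity per cartridge. The short Python sketch below assumes 2.5 TB per cartridge, which is the LTO-6 native capacity given in this chapter, and decimal units (1 PB = 1000 TB). The function and constant names are hypothetical.

```python
LTO6_NATIVE_TB = 2.5  # native capacity per LTO-6 cartridge, as stated in this chapter


def library_native_capacity_tb(slots: int, tb_per_cartridge: float = LTO6_NATIVE_TB) -> float:
    """Native library capacity in TB, assuming every slot holds one full cartridge."""
    return slots * tb_per_cartridge


base_module_tb = library_native_capacity_tb(30)    # base module
ten_modules_tb = library_native_capacity_tb(300)   # fully expanded, 10 modules

# Ratio implied by the quoted 1.5 PB compressed figure (1.5 PB = 1500 TB).
implied_compression_ratio = 1500 / ten_modules_tb
```

Under these assumptions the results are 75 TB and 750 TB, matching the quoted native capacities, and the 1.5 PB compressed figure corresponds to a 2:1 ratio over native capacity. Actual compression depends on the data being written.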
Drives and Media

The most common tape drive and media format in use today is LTO (Linear Tape-Open). The latest version is LTO-6, with a native (uncompressed) capacity of 2.5 TB and a maximum speed of 160 MB/s. LTO-6 tape supports dual partitioning and the Linear Tape File System (LTFS, discussed earlier in this chapter), which enables creation of tape-based file systems that are similar to disk-based file systems.

For organizations that grow beyond the scalability of LTO, enterprise-class drives (such as Oracle's StorageTek T10000C) and cartridges (such as Oracle's StorageTek T10000 T2) provide higher capacity and throughput performance.

Tape drive capacity and throughput are two key considerations when comparing the overall expense of different tape storage solutions. Other factors to consider when comparing tape drive technology include

  Acquisition cost. It's important to evaluate the combined cost of all drives, media, and library slots, not just individual drive and media costs, because the drives have different capacity and performance points.
  Media re-use. Tape drives have different media re-use strategies. Some drives are able to write to previous generation media, while some tape drives allow you to reuse existing media at the full, higher capacity of future drive generations.
  Data integrity. StorageTek enterprise tape drives, like the StorageTek T10000C, have many features to improve reliability in archiving environments,
  including data integrity validation (DIV), which ensures that data is not corrupted while traveling along the data path, and StorageTek T10000 T2 media, which boasts 30+ years shelf life.
  Reliability. Drives are designed for different duty cycles and have different features to improve overall reliability.

Table 3-1 summarizes the characteristics of StorageTek T10000C drives and StorageTek LTO-6 tape drives.

Table 3-1  Tape Drive Characteristics

                               T10000C         LTO-6 Drive
  Capacity (uncompressed)      5 TB*           2.5 TB
  Throughput (uncompressed)    252 MB/sec**    160 MB/sec

  * 5.5 TB with StorageTek Maximum Capacity feature.
  ** Native sustained data rate. 240 MB/sec full file host data rate, includes wrap turnarounds.
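Capacity and throughput interact: a larger cartridge takes longer to fill unless the drive is proportionally faster. The Python sketch below works through that arithmetic using the figures in Table 3-1. It assumes decimal units and a drive that streams at its rated speed without interruption, which real workloads rarely sustain. The function names are hypothetical. The last calculation multiplies the 240 MB/sec full file host data rate by the 56 drives quoted for the SL3000 earlier in this chapter; the result lines up with that library's 48.4 TB/hr figure, although whether the figure was derived this way is an assumption.

```python
def hours_to_fill(capacity_tb: float, rate_mb_per_s: float) -> float:
    """Hours to write one full cartridge at a sustained rate (decimal units)."""
    capacity_mb = capacity_tb * 1_000_000  # 1 TB = 1,000,000 MB
    return capacity_mb / rate_mb_per_s / 3600


def aggregate_tb_per_hour(drives: int, rate_mb_per_s: float) -> float:
    """Combined throughput of several drives streaming at once, in TB/hr."""
    return drives * rate_mb_per_s * 3600 / 1_000_000


t10000c_hours = hours_to_fill(5.0, 252)   # about 5.5 hours
lto6_hours = hours_to_fill(2.5, 160)      # about 4.3 hours

# 56 drives at the 240 MB/sec full file host data rate
sl3000_tb_per_hour = aggregate_tb_per_hour(56, 240)  # about 48.4 TB/hr
```

The T10000C holds twice the data of an LTO-6 cartridge but takes only about a quarter longer to fill, which is why cost comparisons should look at drives, media, and slots together.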
Oracle's Optimized Solution for Lifecycle Content Management

For unstructured content management across any industry, Oracle's Optimized Solution for Lifecycle Content Management (see the following figure) brings together many of the archive components (discussed in this chapter) into a ready-to-implement architecture that removes guesswork for customers. From content ingestion and creation to long-term retention, this architecture brings together best-of-breed components in a streamlined solution for the best TCO, speed to implementation, and reduced risk.