THE CASE FOR ACTIVE DATA ARCHIVING

Written by Ray Quattromini
3rd September 2007

Contact details
Tel: 01256 782030 / 08704 286 186
Fax: 08704 286 187
Email: sales@data-storage.co.uk
Web: www.data-storage.co.uk
Introduction

For many years the traditional way of storing and maintaining data was to keep purchasing more disk arrays. Over the last few years the volume of information has increased enormously with the digital age, so a better way of storing and managing information clearly needs to be found.

Up to 80% of your organisation's data has not been accessed in the last 90 days, and at least 60% of it will not be required in the future. This data could be deleted if:

- Redundant data could be easily identified
- You could predict which items are required for future reference
- Data retention was not essential for corporate management or regulatory purposes

HSM

The traditional way of archiving data was called HSM (hierarchical storage management). It evolved from the mainframe, whereby all data in a given area was archived. During the 90s, HSM was really the only way of archiving information to a drive letter or device. The primary problems with HSM are as follows:

1. Data has to be pointed to a physical or mapped location.
2. No migration or policy rules can be applied to the data.
3. It is primarily a manual task.
4. Restores from archive are slow.
5. Users are not aware of archived data and cannot recover it. These operations must be carried out by system administrators.

Due to increasingly complex data compliance regulations, the option to archive and store everything is not ideal:

1. Users need to be able to find and recover files.
2. Compliance officers require discovery tools and audit trails.
3. Storing everything only increases the amount of storage space required.

Keep it on line

The reluctance of organisations to archive data is based upon lack of knowledge and the fear of losing valuable information. The mindset of keeping everything on RAID arrays and backing it up every night is still the one we encounter daily as a business. As a consequence, the size of online data volumes is spiralling out of control and storage management has become an ever-increasing challenge.
- Server performance and data access are diminishing
- Business legislation and user demands are requiring companies to increase disk space to alleviate the problem
- Data management puts high overheads on network and backup windows
- DR policies for recovery take longer as all data needs to be restored rather than only the most essential
- The annual cost of managing this data over its lifetime is more than 5x the initial purchase price of the equipment
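The access figures quoted in the introduction can be checked against your own file server before any archiving decision is made. The Python below is a minimal sketch under stated assumptions, not part of any archiving product: the function name stale_fraction and its parameters are hypothetical, and it relies on the file system's last-access timestamp, which some volumes update lazily or not at all (for example when mounted with noatime), so the result should be treated as indicative only.

```python
import os
import time

SECONDS_PER_DAY = 86_400


def stale_fraction(root, days=90, now=None):
    """Return (stale_bytes, total_bytes, fraction) for the files under root.

    A file counts as stale when its last-access time is more than `days`
    days before `now`. Both names are hypothetical, for illustration only.
    """
    now = time.time() if now is None else now
    cutoff = now - days * SECONDS_PER_DAY
    stale = total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file disappeared or is unreadable; skip it
            total += info.st_size
            if info.st_atime < cutoff:
                stale += info.st_size
    fraction = stale / total if total else 0.0
    return stale, total, fraction
```

Calling stale_fraction on a share with days=90 gives the proportion of stored bytes that would be candidates for an archive under a 90-day rule.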
Active Data Archiving

Clearly, for the reasons mentioned, a better and more efficient way of managing data needs to be found. Active Data Archiving is a method of managing data types by policy, controlling the movement of information through different tiers of storage so that it ends up in a more suitable storage location. Each migrated file leaves behind a 1-4k stub that records where the file is located in case of restore. An active archive ensures the data is always available and accessible to users, albeit with a slight delay in restoring the information depending on where and how it is stored.

Active Data Archiving overcomes many of these issues and enables companies to adopt an archiving strategy that can evolve and develop as business conditions dictate. By deploying an Active Data Archiving strategy, any business will soon reap the rewards and wonder why it hadn't done so sooner.

Existing Investment in IT

Numerous organisations have a huge ongoing investment in data storage systems, and this investment increases year on year as the demand to store more information grows. After 3-5 years, this equipment is replaced. By implementing an Active Data Archive solution we can extend the life of this investment: moving data to a secure active archive frees up valuable disk space on high-performing storage and slows the ongoing investment in more storage space, giving a huge ROI benefit. An additional benefit of active archiving is that you may be able to utilise your existing older storage systems to archive data.

Policies

As the need for corporate governance increases, companies need to know, retain and delete information by setting up policies.
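A policy of this kind can be pictured as a set of selection criteria paired with an action. The Python below is a minimal sketch of that idea, assuming a first-match-wins evaluation order; the class and field names (FileInfo, Rule, first_action) are hypothetical and do not describe any particular product, and the action and criteria names are taken from the ones this section lists.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class FileInfo:
    path: str
    owner: str
    size: int              # bytes
    file_type: str         # extension, e.g. ".doc"
    last_access: datetime


@dataclass
class Rule:
    action: str                            # e.g. "migrate", "delete", "copy"
    min_idle_days: Optional[int] = None    # last access date criterion
    file_types: Optional[frozenset] = None # file type criterion
    min_size: Optional[int] = None         # file size criterion, in bytes

    def matches(self, f: FileInfo, now: datetime) -> bool:
        # Every criterion that is set must hold; unset criteria are ignored.
        if self.min_idle_days is not None:
            if now - f.last_access < timedelta(days=self.min_idle_days):
                return False
        if self.file_types is not None:
            if f.file_type.lower() not in self.file_types:
                return False
        if self.min_size is not None and f.size < self.min_size:
            return False
        return True


def first_action(rules, f: FileInfo, now: datetime) -> Optional[str]:
    """Return the action of the first rule that matches, or None."""
    for rule in rules:
        if rule.matches(f, now):
            return rule.action
    return None


# Hypothetical policy: delete a disallowed file type, migrate idle files.
EXAMPLE_RULES = [
    Rule(action="delete", file_types=frozenset({".mp3"})),
    Rule(action="migrate", min_idle_days=90),
]
```

Ordering the rules from most to least specific keeps the outcome predictable when a file satisfies more than one of them.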
These policies contain rules which carry out instructions, such as:

- Copy
- Delete
- Migrate
- Move
- De-migrate
- Gather stats

based on file properties such as:

- Owner
- File size
- File type
- Date created
- Date modified
- Last access date
- Attribute

Energy Savings

As mentioned, typically 80% of stored information has not been accessed within 90 days. Moving this data to optical, tape or even a high-capacity SATA RAID array will save a considerable amount of energy. Energy will also be saved through better utilisation of existing storage systems and the slowing down of storage procurement. For more information on energy savings please visit http://www.green-datastorage.com.

Backups

Savings can be achieved by using an Active Data Archive, thus reducing the time taken to back up essential data.

- Backup windows can decrease if the volumes being backed up are reduced by 80%, so the need to continually back up old files is eliminated
- Existing investment in backup technologies can be reduced
- IT support and maintenance can be reduced
- You might also want to consolidate some of this data onto fewer servers, thereby reducing your backup software license costs

Restores

A restore can be for a single file or a whole server. By adopting an Active Archiving strategy the time taken to restore can be reduced by over 80%.

Corporate Governance and Data Compliance

The need for corporate governance and data compliance is becoming increasingly common in business. An Active Data Archive can become an essential part of a corporate governance and data compliance strategy. By setting policies on data types we can determine where data needs to be stored, how long it needs to be retained and when it needs to be deleted or moved. In addition, reporting and data discovery are necessary when trying to find or track file movements within the archive. Discovery can also be used to find file types which are not allowed to be stored on the corporate network and automatically delete them or move them to another location for processing. This is an essential part of any compliant archiving solution.

Server Performance

As disk arrays become full the data becomes fragmented. Implementing an Active Data Archive strategy can alleviate much of this fragmentation by keeping the disks optimised and storing only frequently accessed data, thus reducing data volumes by as much as 80%.

Disaster Recovery

More organisations are adopting a DR strategy as part of their corporate governance and compliance regulations. A DR strategy can be a cluster of servers, remote data replication to a DR site or backup tapes to restore from. An Active Data Archive can significantly help in a DR environment by reducing the amount of information that is being backed up or replicated to a DR site.
This will greatly assist DR restores because, for archived files, the only data on a file server that needs to be restored is the archive pointers, which are between 1k and 4k in size. In addition to the above, aged data can be automatically replicated to your DR site(s).

Secondary Storage

Whilst primary storage is fast, it is also expensive to purchase, enhance and maintain. An Active Data Archive can be used to move infrequently accessed information to alternative storage devices; these can be any combination of the following:

- Primary storage: Fibre Channel, SAS, iSCSI or SCSI
- SATA storage arrays
- Optical jukeboxes
- Network Attached Storage
- Tape libraries

Storage Management

All data needs to be managed, and how companies control and police the information lifecycle of their data is a continually growing problem. Many data compliance and corporate governance regulations require information to be stored and kept for a minimum number of years and thereafter deleted.
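A lifecycle of this kind, with data held on progressively cheaper tiers while it must be retained and deleted once the retention period ends, can be sketched as a simple age lookup. The Python below is an illustration under stated assumptions rather than any product's implementation: the tier names, the placement function and the retention_years parameter are hypothetical, the 60- and 120-day thresholds are borrowed from the insurer example in this section, a file exactly on a threshold is assumed to move to the next tier, and a year is approximated as 365 days.

```python
from datetime import date

# (minimum age in days, tier name), checked from oldest to newest.
# The 60 and 120 day thresholds follow the insurer example; the tier
# names themselves are hypothetical labels.
TIERS = [
    (120, "optical"),
    (60, "sata"),
    (0, "primary"),
]


def placement(created, today, retention_years=None):
    """Return the tier a file belongs on, or "delete" after retention.

    retention_years is a hypothetical parameter; None means keep forever.
    """
    age_days = (today - created).days
    # Assumption: a retention year is treated as 365 days.
    if retention_years is not None and age_days >= retention_years * 365:
        return "delete"
    for min_age, tier in TIERS:
        if age_days >= min_age:
            return tier
    return "primary"  # creation date lies in the future; treat as new
```

Keeping the thresholds in one table means a change in policy is a change to data rather than to logic.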
An Active Data Archive can alleviate all of these issues by creating policy rules based on the data type. For example, an insurer might issue a new policy to a customer, and this information may need to be accessed over the following 60 days for accuracy and formal checks; thereafter it can be archived. The information will therefore remain on primary storage for the first 60 days, move to SATA after this time and finally migrate to a Blu-ray optical disc after 120 days for long-term archive. All this can be achieved with very little human intervention once the policy rules have been created, as the whole information lifecycle process is fully automated.

Summary

I hope that by reading this document you can understand the genuine business benefits of implementing an Active Data Archiving solution. The technologies needed to achieve this, both in hardware and software, exist today, and we are 100% behind the need to educate and motivate organisations into looking at Active Archiving solutions. If you would like a demonstration, site visit or discussion on our solutions, please call us on 01256 782030.

About Fortuna Power Systems Ltd

Fortuna Power Systems Ltd is headquartered in Basingstoke, Hampshire and has been providing data storage solutions to private enterprise and public sector clients for over 13 years. Our clients are in diverse fields and include military, telecommunications, education, advertising, pharmaceutical, local authorities, software developers, government agencies and manufacturing industries.

Our data storage solutions specialists work with a consultative approach, tailoring each solution to fit perfectly within each individual customer's application and environment. We supply solutions which can scale with the ever-increasing and unpredictable data growth rates generated by today's applications. We are vendor independent, so there is no restriction on the type of solution that we can deploy.
We have many installations that include various storage devices, including RAID arrays, tape drives, libraries and autoloaders, optical jukeboxes, as well as Fibre Channel and iSCSI (IP SAN storage) and NAS (Network Attached Storage).

Storage hardware is nothing without storage management software. Fortuna Power Systems has an extensive portfolio of storage management solutions. Our consultants have over 70 years' experience and a real understanding of backup, application protection, file archiving, e-mail archiving, video archiving, server virtualisation and data compliance, as well as an extensive knowledge and understanding of operating systems, networks and infrastructures. We design, implement and maintain all the solutions we supply.

Quality Standards

We are a fully approved BS EN ISO 9001:2000 registered firm, number GB 8820.

Copyright 2007 Fortuna Power Systems Ltd. All rights reserved. You may view and print this publication solely for personal, informational, internal, non-commercial purposes. You must not change it in any way or remove any copyright or other proprietary notices. You may not reproduce, distribute or translate any part of the publication without the written permission of Fortuna Power Systems Ltd. These rights constitute a license to use and not a transfer of title. All brand names, trademarks and registered trademarks are the properties of their respective owners.