Handling Inactive Data Efficiently



Issue 4

Contents:
- Editor's Note
- Does this mean long term backup?
- Key benefits of archiving the data
- Does archiving file servers help?
- Managing messaging systems: How does archiving help?
- Are you ready to handle SharePoint growth?
- Database Archiving
- Gartner Research: Does Integrated Backup and Archiving Make Sense?
- Building an Archival Strategy
- About Ace Data

NOTE FROM THE EDITOR'S DESK:

Data is the backbone of any business: in a true sense, it is the electronic form of the meaningful information that drives the business. Data growth is uncontrollable, and arguably it should be; with social media becoming popular, data is growing at an even higher pace. Organizations are finding it difficult to handle this volume of data and to decide how long they need to keep it. Traditionally this data has been stored on RAID-protected, redundant storage and additionally protected by backing it up, and that practice needs to continue. However, the never-ending growth is forcing organizations to optimize the way they look at and work with their data: beyond a point, you cannot keep upgrading your storage and expanding your backup demands forever. Backup technologies have improved considerably over the years, from slow-moving tapes to RAID-protected disks and deduplicated backups, and have extended from D2T to D2D and D2D2T, and now to D2D2C, i.e. cloud-based backups. It is important for organizations to realize that they need a well-defined policy for data management and protection. These policies vary from industry to industry, based on the regulations that govern each industry, and between data types: what is the active life of your file data, emails or SharePoint farms?

Featuring research from Gartner

At Ace Data, we have been helping our customers manage their data and storage efficiently. In this newsletter, we highlight the key benefits of archiving data and draw out the thin line between backup and archiving. Many organizations believe that backup and archiving are the same thing, and that archiving is nothing more than retaining backups for a long time. For organizations with a genuinely small amount of data, this can be a safe strategy. In the true sense, however, archiving is not long-term backup: archiving is the process by which you move inactive data from primary disk to secondary disk, which helps in many ways detailed later in this newsletter.

We are going to highlight how archiving can help you manage active data more efficiently and improve the ROI on your storage infrastructure, without disrupting the end-user service experience. With the tremendous growth of data, archiving is becoming a necessity. As a solution provider, we offer solutions for archiving various forms of data while keeping it available both for users and for compliance and regulatory purposes. Our solutions help archive file servers, mailing solutions, SharePoint components, instant messaging and BlackBerry Enterprise Server data, storing a single copy for various objects and minimizing the storage requirements for archiving as well. These are closely integrated with eDiscovery and compliance requirements to ensure easy search when required.

From strategy to support, Ace Data Devices works closely with you to maximize the value of your IT investments and of your quality time. Team Ace Data will continue to venture into this ever-changing and growing market space to ensure that it lives true to its "feel relaxed" mission for all its customers.

Anuj Mediratta
Founder and Director Technology, Ace Data Devices Pvt. Ltd.

Handling Inactive Data Efficiently is published by Ace Data. Editorial supplied by Ace Data is independent of Gartner analysis.
All Gartner research is © 2013 by Gartner, Inc. All rights reserved. All Gartner materials are used with Gartner's permission. The use or publication of Gartner research does not indicate Gartner's endorsement of Ace Data's products and/or strategies. Reproduction or distribution of this publication in any form without prior written permission is forbidden. The information contained herein has been obtained from sources believed to be reliable. Gartner disclaims all warranties as to the accuracy, completeness or adequacy of such information. Gartner shall have no liability for errors, omissions or inadequacies in the information contained herein or for interpretations thereof. The opinions expressed herein are subject to change without notice. Although Gartner research may include a discussion of related legal issues, Gartner does not provide legal advice or services and its research should not be construed or used as such. Gartner is a public company, and its shareholders may include firms and funds that have financial interests in entities covered in Gartner research. Gartner's Board of Directors may include senior managers of these firms or funds. Gartner research is produced independently by its research organization without input or influence from these firms, funds or their managers. For further information on the independence and integrity of Gartner research, see "Guiding Principles on Independence and Objectivity" on its website.

Does this mean long term backup?

Different strategies are adopted for backing up data depending on the infrastructure and the RPO/RTO requirements. Many organizations still believe that retaining backups for a long time is as good as archiving the data. It is therefore very important to understand that backup and archival are not the same thing, and that archival is not simply an extension of backup.

Backups serve the purpose of getting you back to work when your data is lost or your services are down: revert to the last known-good backed-up version and you are back in production. Backups help in operational recoveries; they restore applications to production with the data in the same form as before the crash. The key purpose is to get back to production when applications or data crash.

Archival is a different strategy and needs to be understood separately. Archival serves the purpose of retaining large amounts of data for discovery and long-term retention. Even before long-term retention, archival helps by keeping older data slightly apart from the recent, active data, ensuring smooth performance for the active data and databases. Although archival data is not active data, discovery tools ensure that searching old data is as smooth and fast as searching recent data.

Let us take an example. A telecom organization we work with needs to retain all records for 10 years as a regulatory requirement. We recently proposed a backup solution to them. While the management approved it, one of the senior DBAs raised concerns over why he needed backups at all. As a policy, the database generally holds records for the last one month. To control the size of the database while retaining performance and data availability, the DBA extracts 15 days of data out of the database into CSV files, so at any time the database holds only 15 active days.

These extracted CSV files have been kept in store as MIS data for the last six years. Whenever a user or compliance auditor asks for a transaction older than 15 days, he retrieves it manually from the CSV files. He thought a backup solution would automate this process. It would not: what he is currently doing is manual archiving, and automating it requires a dedicated automation tool; professional database archival tools are available in the market.

We asked him a simple question: what would happen if the database crashed today? He said he would need to create a new one. Fine, and what about the last 15 days of data, assuming the older data is manually archived as above? He had no answer. The idea of taking backups is simple: if the database crashes, you can recover it from the backups. What do you lose? Time spent getting the data back, and perhaps some data created since the last backup; both can be minimised depending on the backup strategy you use. The purpose of backup, therefore, is to get you back in action. Backing up to the cloud, or sending backups offsite by tape movement or replication, protects you from a complete site disaster, but the general rule holds: active data gets backed up, and older or inactive data gets archived.

In recent backup applications, you will find that archiving backups is an option: for the time you consider a backup active, place it on online storage, then automatically tier it down to archival storage. This may mean a slightly longer recovery cycle for older backups, a luxury you can afford, since it keeps the cost of long-term retention low.

Source: Ace Data Devices

Key benefits of archiving the data

Before going into the benefits, let us once again quickly define archiving. Archiving is the process (preferably automated) that moves data from its online, active storage to an offline, inactive storage, for the various reasons discussed in this section. Remember that archiving is not restricted to one type of data: you can archive file servers, mailing applications, SharePoint farms, databases, and now even the big data generated by instant messaging tools, messenger application log files and BlackBerry Messenger logs. Let us now look at the benefits an organization can achieve by archiving its data.

1. Maintain production performance: As data grows over time, online production systems start to slow down. Searching for an old email or database record takes longer, and so does opening a file from a heavily loaded server. Archiving inactive data reduces the load on the production server and ensures that its performance is not impacted.

2. Reduce storage cost: Archiving handles old, inactive data, so it is generally done on low-cost storage using low-cost disks such as NL-SAS. The assumption is that a search of older data can be allowed more time, so you can afford a slower search. This brings down storage cost considerably. If you already have enterprise storage and not too much data to archive, consider adding low-cost disks to that storage and archiving onto them.

3. Increase employee productivity: With an automated tool, employees spend more time on their projects and less on archiving their own old data. You can schedule archival jobs for off-peak times so that archival processing does not hamper performance.

4. Reduce the backup window: Imagine backing up large file servers or mailboxes regularly without even knowing how much of the data is inactive yet gets backed up every week. Archiving old data reduces the storage used in production, and with it the resources spent on backup. When building the ROI case for an archival solution, don't forget to include the reduction in backup resources after archiving.

5. Reduce the risk of human error: This is especially true for mailing applications. If an employee receives or sends an email and then deletes it from his mailbox, who owns it now? The next backup cycle will not include that lost email. This could be business-critical in both positive and less positive senses: the employee may need it again, or management may need it to fight litigation. It is therefore recommended to archive the mail server online, meaning that any email moving across the mail server, in or out of the organization, gets archived. If you need it, search for it in the archival storage.

6. Streamline records management: When employees create local archive files and take business records home on flash drives, it is hard to determine where your company's critical business records are and what information is available. Import the PSTs into a centralized archive and automatically capture (and, where needed, restrict) emails and historical files from employees for comprehensive, convenient top-down records management.

7. Electronic discovery: By archiving old data, organizations can understand information risk with comprehensive data mapping, manage the legal-hold lifecycle, and run investigative searches across billions of documents in seconds. Archiving captures file and message content, attachment information and all metadata for faster, more accurate search results. Many archiving products offer advanced e-discovery features such as searching by full-text keyword and key phrase, word proximity, file size, format, date, sender, recipients, and more.

8. Compliance: Compliance requirements oblige most organizations to retain their data for long periods, often 7 to 10 years depending on the industry and data type. Retaining huge volumes of data on online storage and production databases has the adverse effects discussed above; archiving gives you an automated way to meet these requirements. Archiving tools today are certified against the various global records-management standards.

Source: Ace Data Devices
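As a rough illustration of the backup-window side of that ROI calculation, the sketch below computes how much a weekly full backup shrinks once inactive data is archived. All figures (server size, inactive fraction, throughput) are hypothetical assumptions, not results from any assessment.

```python
# Hypothetical ROI sketch: how archiving inactive data shrinks the weekly
# full-backup workload. All input figures are illustrative assumptions.

def backup_after_archiving(total_tb, inactive_fraction, throughput_tb_per_hr):
    """Return (TB backed up, backup window in hours) once the inactive
    fraction of data has been moved off the production server."""
    active_tb = total_tb * (1 - inactive_fraction)
    return active_tb, active_tb / throughput_tb_per_hr

# A 10 TB file server, 65% of data untouched for six months, and a
# 0.5 TB/hr backup throughput (all assumed values):
before_tb, before_hrs = backup_after_archiving(10, 0.0, 0.5)
after_tb, after_hrs = backup_after_archiving(10, 0.65, 0.5)
print(f"weekly full backup: {before_tb:.1f} TB in {before_hrs:.0f} h "
      f"-> {after_tb:.1f} TB in {after_hrs:.0f} h")
```

With those assumed numbers, the weekly full backup drops from roughly 10 TB in 20 hours to 3.5 TB in 7 hours, which is the kind of saving worth quantifying in the ROI case.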

Does archiving file servers help?

Every organization has a huge amount of file data. From the outset, the first kind of data an organization creates is file data, well before it adopts ERPs and databases. The point everyone misses is: what should the life of this file data be? When a new project starts, IT creates a shared folder on the file server so that all the project stakeholders can write their files there, access the same documents and move ahead together. When the project is over, no one thinks about what this data is being kept for, or how long it will sit on that file server. We have seen organizations with terabytes of file-server data and no one really knowing how to improve its performance. Most opt for adding more file servers or more storage; this looks good initially, but it is not the right long-term approach.

Archival tools are available that can archive your file servers. In most such cases we have run assessment tools first, and the results surprise the IT organization, though not us: typically 60-70% of the data on file servers has not been accessed in the last six months, and more than 50% has not been accessed in the last year. So what is it doing on the production file server?

So how does file archival work? Run the archival tool on the desired folders; it scans the data for files that match the rule set. Those files are automatically moved to the defined destination, leaving behind a small stub of a few bytes in place of a file that might have been KBs or MBs in size. To the file-server user, the file is still visible exactly as before, and he can access it the same way. The only difference is the background processing: when he clicks on the file, the stub points to the archival storage and the file opens from there.

File servers can be archived using various criteria: the age of the file, files not accessed for x months, files not modified for y months, or specific extensions. Rules can be set differently for different shared folders and scheduled to run automatically. If an archived file is changed or saved, the new file replaces the stub and comes back into production until it qualifies for archiving again in the next cycle.

What, then, are the real benefits of file archiving?

1. More space is created for active production data when older files are archived.

2. Storage upgrade costs fall: instead of upgrading the online storage with costly high-performance disks and processors, organizations can procure or upgrade lower-cost disks.

3. Small files take long to back up, so backup windows are long. Archive them to reduce the backup window for active data, and back up the archived data once a month or after the last archive cycle. The production file server is then impacted only while backing up active data.

4. Backing up less data brings down the cost of backups. Organizations typically take a full backup of file servers every week, and these backups can run through the whole weekend depending on data volumes. Backing up only 30% of that data means a much smaller backup window and much less media.

Source: Ace Data Devices
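The scan-move-stub cycle described above can be sketched in a few lines. This is a minimal illustration, not how any commercial product is implemented: the archive path, the six-month threshold and the plain-text stub format are all assumptions, and real tools handle stub recall transparently with filesystem drivers rather than leaving `.stub` files behind.

```python
import os
import shutil
import time

# Illustrative defaults; the path and threshold are assumptions.
ARCHIVE_ROOT = "/mnt/archive"   # hypothetical low-cost archival volume
MAX_IDLE_DAYS = 180             # rule: not accessed for six months

def archive_idle_files(share_root, archive_root=ARCHIVE_ROOT,
                       max_idle_days=MAX_IDLE_DAYS):
    """Move files idle for max_idle_days to archive storage, leaving a
    few-byte stub in place of each moved file. Here the stub simply
    records the file's new location."""
    cutoff = time.time() - max_idle_days * 86400
    for dirpath, _dirs, files in os.walk(share_root):
        for name in files:
            if name.endswith(".stub"):
                continue                      # already archived
            path = os.path.join(dirpath, name)
            if os.path.getatime(path) < cutoff:
                dest = os.path.join(archive_root,
                                    os.path.relpath(path, share_root))
                os.makedirs(os.path.dirname(dest), exist_ok=True)
                shutil.move(path, dest)       # full file to archive tier
                with open(path + ".stub", "w") as stub:
                    stub.write(dest)          # stub points at archive copy
```

Recall would be the reverse: when a stub is opened, read the recorded path and fetch the file back from the archive tier, exactly the background processing described above.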

Managing messaging systems: How does archiving help?

The mailing system is the backbone of every organization. No matter what industry you are in or which database or ERP you use, you still need a good mailing solution, and that brings another challenge for the IT organization. Organizations are struggling with the costs associated with the out-of-control growth of emails on their mail servers; business expansion adding employees globally, and heavy reliance on email for every part of the business, contribute heavily to this growth. On the other side, end users storing large mail files slow down their desktop or laptop performance. Many organizations try to control email with quotas, but users find it hard to cope once the quota limit is hit. Savvy users do manual archiving and manage well, but that creates another challenge for IT administrators: they have no access to these personal archives when an old mail is needed. It also increases the need to back up local desktops, and to create policies for backing up all mail files from the desktop in case a user has stored an archived mail file in a personal location.

It is therefore becoming almost essential for organizations to deploy an archiving system instead of managing ever-larger mail servers, and to deploy retention policies so that the enormous growth is finally checked somewhere. An organization first needs to classify its email by business relevance and compliance needs, then build a strategy for how emails move from production to the archive system. Email archiving is generally done in two main forms, especially useful for regulatory and discovery purposes:

1. Create a job on your mailboxes with rules such as "archive mails sent/received before a given date", "older than six months", or "by a given sender/receiver type". Run this job on a schedule, preferably in off-peak hours when running for the first time.

2. Enable journaling, i.e. online archiving: any email that comes into or goes out of a mailbox is archived as it moves across the mail servers.

The first approach is run once, when you implement the solution, to archive the old emails. The second approach keeps archiving online forever. Up to this point the mail servers and users get no direct benefit, since this merely makes a copy of the emails; it does, however, protect you in any legal or compliance situation. To get real benefits for the users and the mail servers, these emails need to be deleted from the mailbox or the desktop clients. This is done by a scheduled job generally referred to as stubbing. Using rules similar to those in point 1 above, you can delete the body text and attachments from the mailbox, reducing its size considerably; the user is left with only the mail header, i.e. sender/recipient and subject. When the user opens such an email, the integrated archival tool fetches it from the archival storage. Access can be slightly slower, but usage is trouble-free, with no manual archiving needed to stay under quota. Most archiving tools also offer web-based access to archived emails.

Integrated email and file archiving tools are now available that perform a kind of deduplication while archiving: the same email or attachment held by multiple users is archived only once, and if an attachment has already been archived as a file, it is not archived again; only pointers are created.

Organizations need a clear policy on how to handle emails through each email's lifetime. One such strategy could be:

1. Email sent/received: archive every email that passes through the mail server, so a copy resides in the archival storage.

2. After a defined number of days from send/receive: create a stub for the email, i.e. replace the original on the mail server/mail client.

3. The message remains visible in the mailbox as a pointer and is still accessible from the archive via the user's mailbox.

4. After a further defined number of days: delete the stub or pointer from the mailbox. The message is now accessible directly via a web browser with the user's credentials.

5. The message no longer resides on the mail server or in the mailbox.

You can also use archival tools to search and manage PSTs across the corporate network. The most important part of adopting an archiving tool is educating users on how archiving will improve their productivity and the upkeep of the mailing system: sudden reductions in mailbox size and stubbing should not take them by surprise. Mailbox quota problems are also answered easily, and most end users will no longer need to pull emails manually into PSTs to stay under quota.

Source: Ace Data Devices
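The staged lifecycle above can be expressed as a simple classification rule. This is a sketch only: the 30- and 90-day thresholds are hypothetical stand-ins, since the article deliberately leaves the exact day counts to each organization's policy.

```python
from dataclasses import dataclass

# Hypothetical stage thresholds; real values come from each
# organization's retention policy.
STUB_AFTER_DAYS = 30       # replace body with a header-only stub
DROP_STUB_AFTER_DAYS = 90  # remove even the pointer; web access only

@dataclass
class Message:
    age_days: int
    archived: bool = False   # True once journaling has captured a copy

def lifecycle_stage(msg: Message) -> str:
    """Classify a message under the staged policy sketched above."""
    if not msg.archived:
        return "archive-on-transit"   # stage 1: journal a copy immediately
    if msg.age_days < STUB_AFTER_DAYS:
        return "full-in-mailbox"      # body still on the mail server
    if msg.age_days < DROP_STUB_AFTER_DAYS:
        return "stubbed"              # pointer in mailbox, body in archive
    return "archive-only"             # reachable only via the web client
```

A scheduled job in a real archiving product would evaluate a rule like this against every mailbox and move each message to its next stage.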

Are you ready to handle SharePoint growth?

SharePoint is relatively new to the market compared with file servers and mailing applications, so it may seem strange to talk about old data in SharePoint. It is not. SharePoint supports collaborative projects: as an administrator you get regular requests for new sites for new projects, and like file servers, these sites keep growing. Mostly there is no policy defining the life of a site or a project, even if you can tell when one has become inactive. For organizations that adopted SharePoint with MOSS, you may have been creating sites and filling them with content for 2-3 years. In most organizations we have seen, 25-30% of sites and related content has in fact become inactive and is a fit for archiving, rather than for investing more resources in new sites.

In the classical approach to archiving, you search across the various content repositories and copy content to archival storage; once validated, you delete it from the original repository. Be sure you have a deletion policy in place, or you can end up in another difficult situation. Remember, though, that you only help yourself if you remove inactive data.

Alternatively, you can look at an archival tool. SharePoint archiving solutions copy documents and content to a central repository, and the original files can be replaced with stub files to reduce storage. The resulting archive is managed according to retention policy, and the content can be easily searched for legal discovery. Using cost-effective storage media for large volumes of archived content frees up high-end media, helping enterprises reduce storage and administrative costs. Archival tools available today can archive SharePoint content based on content type, owner, last modification date and time, workflow state and so on, and you can apply retention policies to move content through storage tiers to minimize storage cost.

As with file-server archiving, you get features like full-text search, restore of archived content, storage optimization through compression, and security through encryption. Even if you think you are new to SharePoint, consider choosing a solution that already scales to take on SharePoint content or has SharePoint archival on its roadmap, giving you the desired investment protection.

Source: Ace Data Devices

Database Archiving

Database archiving is becoming an important new topic for data managers. The need for it has surfaced at most IT organizations, and the problems it addresses, caused by the tremendous growth in data across all business lines, are only getting bigger. They include challenges with data-retention requirements, application renovations and e-discovery. Most IT data managers recognize the problems, but many do not yet see database archiving as a solution. This will change as the technology matures and spreads.

Database archiving is the practice of removing selected business records, those not expected to be referenced again, from operational databases and storing them in a separate archive data store where they can be retrieved if needed. In effect, the database is segregated into an active record set and an inactive record set: you archive business records from databases. Database archiving is an electronic form of records retention.

For example, an invoicing application has an operational database containing data for the invoices (such as deposits and withdrawals). As the data for a single transaction ages, it reaches a point where all intended or expected business uses of the information have been accomplished. The business record includes all data relative to the transaction, including the reference information the transaction points to; for example, customer name and address may be copied from the customer master record to complete the business record moved to the archive. An invoice is a critical piece of information when it is generated, and perhaps for another 90 days until payment is received. The retention policy may be simple (90 days after creation) or complex (one year after creation, unless the account is flagged as under review or has a negative balance).
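The complex policy in that example can be captured as a copy-then-delete move between an operational table and an archive table. The sketch below uses SQLite for illustration; the table and column names (`invoices`, `invoice_archive`, `created_day`, `under_review`, `balance`) are assumptions, and real database-archiving products do far more, such as capturing referenced master data along with each record.

```python
import sqlite3

# Policy from the example: archive an invoice one year after creation,
# unless the account is under review or has a negative balance.
# Schema names are illustrative assumptions.

def archive_eligible_invoices(conn: sqlite3.Connection, today: int) -> int:
    """Copy eligible invoice rows into the archive table, then delete
    them from the operational table. Returns the number of rows moved."""
    with conn:  # one transaction: copy and delete succeed or fail together
        cur = conn.execute(
            """
            INSERT INTO invoice_archive
            SELECT * FROM invoices
            WHERE ? - created_day >= 365
              AND under_review = 0
              AND balance >= 0
            """,
            (today,),
        )
        moved = cur.rowcount
        conn.execute(
            """
            DELETE FROM invoices
            WHERE ? - created_day >= 365
              AND under_review = 0
              AND balance >= 0
            """,
            (today,),
        )
    return moved
```

Running copy and delete inside a single transaction matters: a crash between the two steps must not leave a record either duplicated in both stores or missing from both.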
Data often reaches a state where all changes are final and all expected reference uses have been accomplished: the inactive state. The organization no longer needs the transaction, but may be required to keep the data available for many more years to satisfy government regulations, or may choose to keep it longer for unanticipated uses. Generally, the retention time required by law exceeds the time the organization would prefer to keep the data.

Data in the inactive state usually does not need to be accessed through the application programs; any unplanned access can be accomplished through simpler, generic query and reporting tools, so the data can safely be separated from the application and operational environments. Some business record types become inactive and can be archived almost as soon as they are generated; others may never reach a point where they can exist independently of the operational environment. However, most applications have transaction data that can safely be moved to a database archive for 80 to 90 percent of its required retention period. The data lifecycle, and the ability to achieve application and system independence, determine the suitability of the data for database archiving. If it qualifies, then as much as 90 percent of the operational data can be offloaded from the operational systems and retained in an archive data store that is cheaper and more efficient for managing inactive data.

Problems Addressed by Database Archiving

Adding database archiving to the data-management practices of an application is a significant move. It is added work to design and implement, as well as continuous administration over the life of the data. To justify it, there must be significant problems in the operational environment that make it worthwhile.
The driving factors behind these problems are longer retention periods, application renovation and the increased focus on electronic data for e-discovery.

Longer required retention periods have resulted in overloaded databases. Regulations passed in recent years have created retention periods often measured in decades. For many applications, these requirements mean that no data will be discarded from the affected databases for years; the databases simply grow and grow. They grow because you cannot discard data, because your business expands, and because you merge in data from companies you acquire. Growth is multidimensional and exponential. Databases are already reaching unmanageable sizes, and the impact on operational systems is challenging database administrators. The telltale signs of a database that could benefit from archiving are continuous upgrades of operational systems, with the attendant hardware and software costs, and lengthening times for backups, reorganizations, recovery and data extracts. Disaster recovery times are also lengthening, even though most administrators do not monitor this effect; when they do, they are quickly disturbed by how long a disaster recovery would take. Databases unnecessarily bloated with inactive data are also harder to tune and keep tuned: the inactive data is scattered throughout the database and interferes with every attempt to achieve a high state of performance. The impact on cost and operational efficiency can be huge.

E-discovery is creeping into the database world. Lawyers have discovered this source for investigation and are increasingly calling on IT shops to produce data and to maintain a legal hold on it for the life of the litigation.
A good database archiving practice can help guard against e-discovery failure risk by preserving the integrity of data throughout its lifecycle, making it easier and more accurate to find data, and using the archive to hold the data delivered. Source: Ace Data Devices

Gartner Research

Does Integrated Backup and Archiving Make Sense?

Backup and archiving have long been thought of as complementary, yet few organizations have effectively implemented these technologies together. This research discusses the pros and cons of unifying these activities.

Impacts

- IT organizations that fail to optimize data retention and the number of data copies will suffer continuously increasing storage costs and increase their governance risk.
- Integrated backup and archiving infrastructure may offer a short-term improvement in operational efficiency, but in the long term may fail to address users' requirements.
- Data life cycle policies that include backup and archiving will better support legal, compliance and user objectives.

Recommendations

- Deploy archiving for discovery and long-term record retention.
- … critical business processes will need access to data during its production and postproduction phases in order to protect it properly, and to provide appropriate access as it ages.
- Share hardware or software infrastructure for backup and archiving to reduce storage costs and provide management efficiencies, but be aware that requirements for recovery point objective (RPO), recovery time objective (RTO) and access to data should dictate the optimal infrastructure that is deployed.
- Include backup and archiving policies and processes in the scope of information governance.
- … that incorporate the service requirements of users, addressing both long-term access and immediate physical and logical recovery.

Strategic Planning Assumption(s)

By 2015, only 15% of organizations will attempt to converge backup and archiving policies and processes, up from 5% today.

Analysis

For years, organizations have been trying to improve the efficiency, cost and effectiveness of their backup processes. While strides have been made across all market segments, the problem of the shrinking backup window remains ever-present.
Archiving has been marketed as one solution to this problem: if you archive old data and remove it from primary storage, it won't be included in the backup stream, or even deduplicated, and hence will result in shorter backup windows. Similarly, strategies that promote removing old backups from the backup system and replacing them with archives have been marketed as a way to simplify restores and improve performance. Certainly, archiving can reduce the amount of storage an organization must procure and manage. Archiving has also been deployed for its ability to support e-discovery and compliance, frequently with the intent of no longer using backups for long-term retention and discovery activities better suited to a contextually aware product like archiving, a practice that Gartner endorses.

Backup complements archive and archive complements backup, theoretically. In reality, what Gartner continues to see is that these technologies are deployed in silos in most environments. Storage and backup administrators are responsible for data protection, and slough off archiving as something the lawyers and the business need to figure out. Application administrators, archivists and information architects realize that their retention and information governance policies should extend to backup (backup generates copies of data which need to be managed, after all), but they usually don't reach out to their data-protection peers. Different administrators, different buyers and different technology value propositions all contribute to a fragmented data life cycle implementation.

Gartner believes that organizations must get control of their data by implementing complementary backup and archiving policies for retention, access, recovery and discovery. For example, a simple retention policy may stipulate: backups should be retained for 90 days, and anything older than that is an archive; archives should be retained for seven years.
A simple access policy may stipulate: Backup to disk should meet RTO objectives of two hours, and archives should enable immediate transparent access to archived mail for one year. Organizations that consider the nuances of these policies will be well positioned for improved cost and risk reduction in 2012 and beyond.

Despite the information governance benefits that can be realized by managing backup and archiving holistically, most organizations today find this very difficult to do. Backup administrators and information architects/archivists haven't traditionally spoken the same language, and most tools and technologies on the market today address one or the other of these disciplines. Gartner rarely speaks to organizations interested in archiving that are concerned about deduplication ratios, virtual machine support or backup modernization. Conversely, it is rare to find a backup administrator who is interested in contextual analysis of data for retention, transparent access to retained data from mobile devices and advanced search. Nevertheless, the cost reduction and risk management benefits of looking at backup and archiving holistically as described here can be significant. But a word of caution: these efforts should be balanced with an organization's readiness to tackle this; it may be more costly to establish policies and procedures for converged archiving and backup for purposes of information governance than to continue with these as separate disciplines.

FIGURE 1: Impacts and Top Recommendations for Integrated Backup and Archiving. Source: Gartner (March 2012)

Impact: IT organizations that fail to optimize data retention and the number of data copies will suffer continuously increasing storage costs and increase their governance risk

Backup practices and policies are designed to ensure recoverability of data, and are an element of a risk management strategy. Managing information for compliance and e-discovery is also part of a risk management strategy, with a focus on ensuring that the right data can be produced when required. Both of these functions employ retention policy management; that is, the original data (in the case of archiving) or a copy of the data (in the case of backup) is kept for a specified period of time. While these disciplines serve very different purposes, the functionality inherent in backup and archiving (making a secondary copy of data and preserving it for a period of time) greatly overlaps, as both processes can keep the original data in place while also making a copy of the data.

Many organizations will deploy one of these strategies (nearly always backup) and call it a day. In this case, old, redundant and infrequently accessed data is being backed up over and over, leading to increased storage costs. For those organizations that do employ backup and archiving, many do not align policies for retention and deletion and, consequently, data that is archived continues to be backed up, again leading to increased storage costs.

Things can be better: operational efficiency can be improved via a rationalization of retention according to desired outcome. Cost can be contained by decreasing the amount of storage that is required when the number of redundant copies of data that are captured, pulled through the network and retained on disk and/or tape (and that are later copied or replicated for disaster recovery reasons) is reduced.
A well-deployed archiving strategy can save up to 60% in backup costs and reduce backup times by as much as 80% by reducing the redundant and nonactive data that is continuously captured and stored but is not likely to be accessed.

In the last three years, Gartner has found that backup retention has been decreasing, as measured by responses to questions about backup retention policies. From European and North American Gartner conference survey data, Gartner found that in late 2011, large enterprises were retaining 50% of backup data for 90 days or less, up from 42% in 2010 (see Figure 2). The largest backup retention period response is 30 to 60 days, with just over a quarter (26%) of those sampled using this policy. This trend accelerated beginning in 2009 due to global macroeconomic concerns and the need to reduce the cost of backup/recovery and consume less storage.

FIGURE 2: Backup Retention Policy. Source: Gartner (March 2012)

Despite half of organizations effectively using backup for operational recovery of recent (90 days or less) data, 31% of respondents retain data for one year or more, unchanged from the prior survey. However, the number of organizations retaining backup data indefinitely fell from 10% to 6%. Using backups as an archive is not Gartner's recommended best practice. Backup is for recovery, and archiving is for discovery and long-term data preservation. We reiterate our opinion that backups should be used for operational recoveries only, which typically is in the 30- to 90-day range (see "Evolving Best Practices for Backup, Archiving and Tape: Strategies for Alignment").

Recommendations:

- Ensure that backup is used for operational recovery and archive is used for preservation and/or discovery.
- … decrease storage costs and simplify the management process.

Impact: An improperly architected unified backup and archiving infrastructure may offer short-term improvement in operational efficiency but in the long term may fail to address users' requirements

If organizations were to design data life cycle management tools from scratch, with no legacy constraints, they might design a solution that would capture data once through a single set of infrastructure and be able to repurpose that copy for everything from operational recovery all the way through long-term record retention and e-discovery.

In the last five years, consolidation has been a major goal of IT organizations and is particularly evident in the backup space. Reducing the number of backup applications, deploying deduplication to contain the redundant number of copies of data and standardizing on common policies and practices to drive down errors and reduce risk have all been common activities within the data center. With this in mind, the notion of a collapsed set of infrastructure for backup and archiving, where multiple agents, policy engines, repositories and administrative consoles give way to a unified toolset, becomes attractive. Note that vendors today offer varying degrees of actual integration, but some have road maps for greater future optimization possibilities.
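The savings cited earlier, up to 60% of backup cost and 80% of backup time, rest on the observation that most protected data is inactive and need not stay in the backup stream. A rough, hypothetical back-of-the-envelope model (the 80% inactive fraction is an assumption for illustration, not a Gartner figure):

```python
def backup_after_archiving(total_gb: float, inactive_fraction: float) -> float:
    """Data left in the nightly backup stream once inactive data is archived out."""
    return total_gb * (1.0 - inactive_fraction)

# Hypothetical environment: 10 TB of primary data, 80% of it untouched for months.
total = 10_000.0  # GB of primary data
remaining = backup_after_archiving(total, 0.80)
print(round(remaining))  # roughly 2000 GB still flows through backup
```

The model is deliberately crude; real savings depend on deduplication ratios, change rates and how aggressively the archive policy is set.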
Infrastructure should be streamlined as much as possible. Solutions that reduce the software infrastructure and, particularly, that can manage retention holistically are desirable for cost containment and improved manageability.

Recommendations:

- Leverage the same infrastructure for backup and archiving to simplify management and to contain costs (e.g., common agent, common policy and reporting engine, common repository, etc.).
- … (operational recovery, preservation, discovery or audit).
- … retention and reduce the number of redundant copies of data that are captured and retained (especially if these copies are stored in multiple repositories).
- … capabilities for both backup and archive data, and that the solution can integrate with tape for longer-term retention of the archived data.

Impact: Information architects who develop data life cycle policies that include backup and archiving will better support legal, compliance and user objectives

More organizations are focusing on information governance as a way to ensure that their data meets internal and external compliance requirements (see Note 1). Certainly the management of backup and archiving falls into this definition. Managing information for compliance and e-discovery ensures that you can produce the right data when you need to, that you delete unnecessary data in a timely way and, importantly, that you understand where all the copies of your data reside. Archiving is more suited than backup to these requirements by virtue of its object-level retention management, indexing, basic e-discovery and auditing capabilities. Keeping track of the number of copies of a data object, by holding the copy of record in an archive and managing all other copies according to a well-documented retention and deletion schedule, means that organizations are less likely to be out of compliance with internal and external regulations for data management. With respect to backup, it's critical that organizations understand how many copies of a backup they're keeping, and for how long, in order to appropriately respond to requests for information.
With information governance as a focus, organizations can keep track of backup copies, move data into an archive when appropriate and, at that point, discontinue backup of the original data. As a documented policy, this lets the organization authoritatively and defensibly produce an inventory of all copies of data when required for discovery or audit.

Optimizing backup and archiving policies for purposes of information governance also means better responsiveness to end-user requests for information beyond e-discovery and audits. Users who have lost data should be able to retrieve it according to an SLA with the backup administrator, while users who need transparent access to much older data should be able to get to that data using whatever search criteria or access methodology makes sense for their application. Proper management of data as it ages means that user requirements for retrieval and/or access are consistently being met.

Recommendations:

- … unplanned redundant backup copies contribute to the explosion of data volume and little else.
- … including the generation of rogue and unmanaged copies (such as Microsoft Exchange .PST files), and endpoint device issues such as data loss and legal hold.
- … auditors, litigation support and end-users for data recovery or contextual search; understand which tools support which requirements and eliminate redundancy.

Evidence

1. Conference kiosk polling data was used from Gartner's 4Q11 European Data Center Summit and U.S. Data Center Conference. These conferences have a large enterprise audience, and the respondents in this survey represent a broad cross section of industry verticals.

Note 1: Information Governance

Gartner defines information governance as the specification of decision rights and an accountability framework to encourage desirable behavior in the valuation, creation, archival and deletion of information. It includes the processes, roles, standards and metrics that ensure the effective and efficient use of information in enabling an organization to achieve its goals.

Source: Gartner Research, G , D. Russell, S. Childs, 21 March 2012
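The defensible copy inventory discussed above — knowing where every backup, archive and replica of a data object resides so it can be produced on demand — can be sketched as a small registry. Class and field names here are hypothetical, for illustration only:

```python
from collections import defaultdict

class CopyInventory:
    """Track every copy (original, backup, archive, replica) of a data object."""

    def __init__(self):
        self._copies = defaultdict(list)

    def record(self, object_id: str, location: str, kind: str) -> None:
        """Register one copy of an object at a named storage location."""
        self._copies[object_id].append({"location": location, "kind": kind})

    def produce(self, object_id: str) -> list:
        """Return the full inventory for a discovery or audit request."""
        return list(self._copies[object_id])

inv = CopyInventory()
inv.record("contract-42", "primary-nas", "original")
inv.record("contract-42", "backup-pool-1", "backup")
inv.record("contract-42", "archive-tier", "archive")
print(len(inv.produce("contract-42")))  # 3
```

In practice this role is played by the archive's index and the backup catalog; the point is that a single, queryable view of all copies is what makes deletion and discovery defensible.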

Building an Archival Strategy

Organizations need to be careful while choosing the software and the overall archival strategy. Archiving is not just another piece of software; it involves a complete infrastructure based on what you choose to archive. Apart from the software that extracts the data, the destination is equally important, for various reasons. Last but not least, discovery and compliance are key reasons to archive, so organizations cannot ignore this aspect of their archiving strategy.

What to archive?

This is a simple question that does not have a standard answer. What to archive depends purely on the type and size of data an organization has and the resources it has to manage it. While some organizations may have no problem managing large amounts of data, others may find a couple of terabytes large enough. While deciding on this, an organization should consider two critical aspects:

1. Performance of the production servers for the given application.
2. The data size that they want to handle and invest in.

Given the multiple benefits archiving brings, organizations should start thinking about their archival strategy as soon as they expect their data size to grow. Remember that not having an archival strategy in place early might lead you to invest heavily in online storage and backup, along with heavier compute power, just to keep the lights on as data sizes grow. Organizations providing archival solutions do offer assessment services to help you decide how much data is worth archiving and what strategy should be adopted to get the best return on investment.

A typical example is video surveillance. Organizations maintaining such systems can record huge amounts of data that is read only at the time of litigation. Instead of investing heavily in online storage, such organizations should build in archival as part of the initial investment.
Organizations might choose to keep the last three months of recordings online and move older recordings to the archive through an automated process. This would divert their growth investment to low-cost archival storage rather than high-performance online storage.

What to look for in Archival Software

In a smaller setup, organizations can look for a small archival solution that archives just the data type the organization needs to archive. Budget could also drive this choice, and the software could have some limited features. However, an organization that either already has a lot of data or is growing fast should consider many aspects while deciding on the archival software:

1. The archival software should support multiple data types. Archival software is available that supports archiving file data, mailing applications, SharePoint, instant messaging data (such as Yahoo and MSN Messenger), call logs such as BlackBerry Enterprise Server call logs, and so on. Choosing common software enables easy scale-out.
2. The archival software should be able to classify data and be granular in the archival options and policies it provides.
3. The archival solution should support Write Once Read Many (WORM) capabilities.
4. The archival solution should support a wide variety of devices as destinations. The destination disk could have DAS, NAS or SAN connectivity. It should also support optical devices and tape media for archiving, so that if the organization chooses a strategy of moving to optical after a given time, the software supports it.
5. Searching of data should be smooth, and an easy-to-use interface should be available.
6. Search should also be possible through a web-based utility in the event that the primary system is not available for searching the desired data.
7. The solution should deliver comprehensive litigation support for proactive e-discovery.
8. Since archiving is done to take care of compliance needs, the archiving software should be capable of adhering to the compliance and review requirements of global standards like SEC 17a-4, NASD 3010, FERC 717, etc.
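The automated, age-based tiering described earlier (recent recordings stay online, older ones move to archive) can be sketched as a simple sweep job. This is a hedged illustration only; the directory layout, 90-day cutoff and function name are hypothetical, and a real archival product would also update its index and apply WORM retention.

```python
import os
import shutil
import time

CUTOFF_SECONDS = 90 * 24 * 3600  # roughly "last three months" stays online

def sweep_to_archive(online_dir: str, archive_dir: str) -> int:
    """Move files untouched for longer than the cutoff to archive storage."""
    moved = 0
    now = time.time()
    os.makedirs(archive_dir, exist_ok=True)
    for name in os.listdir(online_dir):
        path = os.path.join(online_dir, name)
        if os.path.isfile(path) and now - os.path.getmtime(path) > CUTOFF_SECONDS:
            shutil.move(path, os.path.join(archive_dir, name))
            moved += 1
    return moved
```

Run from a scheduler (cron or similar), a job like this is what defers growth investment to the low-cost archival tier.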

Where to archive?

This again is an important aspect, as this is where the archived data lies. There are two aspects to this:

1. On-premise or cloud based.
2. Type of hardware.

While most cloud service providers prefer to offer a disk-based archival solution, tapes have also been in use for archiving data. The choice of disk for cloud service providers is an obvious one: writes that have already travelled over internet bandwidth should not be slowed down further.

Hardware supporting RAIN (Redundant Array of Independent Nodes) architecture is built on multiple nodes. Every data item written to the archival storage is written at more than one location within the storage. This ensures that if one node fails, the data is available on a second node and gets regenerated on node replacement. This brings a slightly higher investment but offers more protection for organizations or data types that have to cater to stricter compliance needs.

Archival is not restricted to disk-based devices only. Most applications support archiving data to optical devices and tapes. However, since the reliability of tapes has always been a concern, disk-based archiving is more popular.

Source: Ace Data Devices

Cloud-based offerings lower the upfront investment. Service providers typically charge per user or per GB of archived data. Single instancing in many cases reduces the amount of data archived, further protecting your ongoing investments. Cloud-based archival solutions have a strong focus on email, file server and social media content. Data to be archived gets classified and compressed on-premise and stored in the public cloud, giving the advantage of data reduction locally and storage remotely. This can also help organizations outsource the management of their archival strategy, avoiding the need to hire trained manpower for this additional IT function.

On-premise solutions, on the other hand, keep the data in-house. They need initial investments in archival software and hardware, including compute and storage. In contrast to the cloud-based model, initial investments must be made anticipating growth. Manpower needs to be deployed to take care of archival operations and data management on the archived systems. In general, these solutions are used for capturing content generated within the data center, such as emails and corporate file servers.

In both cases, the destination has various options; for example, any low-cost disk solution could act as a good archival destination. However, vendors do provide specialized storage for archival purposes. These storages not only support the standard RAID and WORM capabilities but also support RAIN architecture.
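Single instancing, mentioned above as a way to reduce the amount of data archived, stores identical content (for example, the same attachment archived from many mailboxes) only once. A minimal content-hash sketch, with hypothetical class and method names:

```python
import hashlib

class SingleInstanceStore:
    """Store each unique blob once; duplicates only add a reference."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        """Return the content digest; store the blob only if unseen."""
        digest = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(digest, data)  # kept once, however often seen
        return digest

    def stored_bytes(self) -> int:
        return sum(len(b) for b in self._blobs.values())

store = SingleInstanceStore()
attachment = b"quarterly-report.pdf contents"
store.put(attachment)
store.put(attachment)        # same attachment from a second mailbox
print(store.stored_bytes())  # stored once: 29 bytes
```

Commercial archives combine this with compression and object-level indexing, which is where the per-GB savings in cloud pricing come from.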

About Ace Data

OUR STIMULATING PERFORMERS

Neeraj was among the first IT professionals to recognize the importance of information and data protection. Says he: "Data loss is worse than losing machinery through fire. You can always replace the machinery, but it's almost impossible to replace data collected over the years."

Anuj is the technical ace of the partnership. The two bring complementary skills to the table. Together, they have propelled Ace Data Devices to the pinnacle of their business. Says Anuj: "What sets us apart from others is that we're specialists. We're single-minded in our focus, that is, end-to-end solutions in business-critical data storage and backup technologies. Others dabble in this space; we own it."

So sit back and relax, because now data troubles will take a backseat forever so that you can take a better life forward.

… GIVE YOU CALM

We make your data work for you. Each practice spans solutions made up of hardware and software from multiple storage innovators, along with a comprehensive suite of professional and support services like:

BACKUP AND RECOVERY: Ace Data offers a suite of solutions and professional services that address local and remote backup, as well as disaster recovery.

STORAGE CONSOLIDATION: At Ace Data the consolidation solutions are tailor-made, keeping in mind the exact configuration and requirements.

ARCHIVAL MANAGEMENT: The Ace Data archival solutions enable you to manage the growth of structured and unstructured data through policy-based archiving and discovery capabilities.

CONTENT MANAGEMENT: Ace Data provides intricate content management solutions and is capable of trumping a challenge of any complexity.

IMPLEMENTATION AND INTEGRATION SERVICES: Ace Data is equipped with the knowledge and experience to study and analyze customers' needs and prepare a good price-performance solution.

RELAX: YOUR DATA ADMINISTRATION IS IN SAFE HANDS

With their information management solutions, the ensuing work pressure is taken care of efficiently. No wonder, then, that discerning customers prefer Ace Data for the management and safety of all their important data. Ace Data is capable of designing a tailor-made solution of global standard in the shortest possible time. Ace Data's fully transparent and process-driven approach and execution speed up the entire process, leaving nothing to chance, thus ensuring you absolute peace of mind.

- … management and support, it's a seamless series of actions. Every assignment we undertake goes through a defined process: our quality experts examine the business environment, identify need/concern areas, plan and design options, supervise implementation, and work out the management and support services thereafter.
- … transparency, effective reporting, time management and service deployment.
- … almost obsessed with data. They keep upgrading their skills and keep abreast of global advances at the Ace library and laboratory. The aim is to deliver solutions in the most efficient manner. Implementation follows detailed planning and documentation. Engineers arrive at each site armed with their bible: a step-by-step flowchart.
- … and technicians with extensive experience in handling complex technical challenges place their collective skills at your disposal. The team comes up with a customized solution, cost-effective and future-proof.
