Table of Contents

Introduction
Storage Challenges
How Deduplication Helps
How It Works
    Deduplication at source
    Deduplication at target
    Deduplication database
    Deduplication data store
    Recovery
    Compacting
    Compression and encryption
    Replicating data
    Enhancements in Acronis Backup Advanced 11.5 Update 6
When to Use Deduplication
    Use Case 1: Big environment with similar machines
    Use Case 2: WAN optimization
    Use Case 3: Production Application servers
Deduplication Restrictions
Deduplication Best Practices
Performance Tests
    Test configuration
    Backup and indexing tests
Useful Links
Appendix A. Estimating Storage Capacity
    Backup scheme and retention period
    Compression Ratio
    Number of machines and amount of data
    Unique data percentage
    Daily incremental backup size
    Storage capacity calculations
Appendix B. How to Upgrade Deduplication
    How to find the datastore file
    Re-attaching the managed vault
    New Deduplication-related configuration parameters
About Acronis
Introduction

Powered by the Acronis AnyData Engine, Acronis Backup Advanced delivers robust, easy-to-use unified data protection and disaster recovery for multi-system environments. Based on your business needs, you can deploy individual solutions or seamlessly blend them together into one efficient backup solution and manage it from a single unified console, the Acronis Management Server (AMS). Once blended, Acronis Backup Advanced provides an additional centralized storage option through Acronis Storage Node.

One of the key capabilities of Acronis Storage Node is deduplication. The deduplication technology helps reduce storage costs and network bandwidth utilization by eliminating the backup and transfer of duplicate data blocks.

Acronis Backup Deduplication will help you to:
- Reduce storage space usage, as only unique data is stored
- Eliminate the need to invest in deduplication-specific hardware
- Reduce network load, as less data is transferred, leaving more bandwidth for your production tasks

This document describes the main aspects of Acronis Backup Deduplication technology and the details of its implementation. It gives IT professionals the information they need to design and implement a backup infrastructure within their organization.
Storage Challenges

We are living in the era of big data. In 1990, the hard disk of a personal computer was 10 megabytes. Now, multi-terabyte disks are the norm, and every 10 minutes humanity creates as much data as was created from the dawn of civilization until the year [...]. You have to protect and back up all this data; otherwise your company may lose productivity, reputation, and time, and an entire business could even close down. However, 75% of SMBs surveyed by Acronis and IDC have admitted that their data is not fully protected.

One of the primary challenges is the sheer amount of data. Take, for example, a company with 400 employees using desktops and laptops. An average laptop can hold from fifty to a few hundred gigabytes of data on its hard disk, so all the PCs together contain from 20 to 150 TB of data. With 2:1 compression, the backup administrator would need to provision from 10 to 75 TB for every full backup, plus space for incremental and differential backups. Eventually this company may need to acquire as much as one petabyte of storage for PC backups alone.

Assume this company decides to invest in this enormous and expensive storage for PC backups. The next, even bigger challenge will be to deliver backups from the PCs to this storage. A one-hundred-megabit network can transfer only about 10 megabytes of data per second. That means a full backup for the same company would take from 2 weeks to 3 months: that is how long it takes to transfer 10 to 75 TB of data over a 100-Mbit network.

Yet every desktop has the same Windows, the same applications, and often numerous copies of the same data. Storing and transferring the same data multiple times to the same storage is a waste of time and resources. If the backup solution transferred and stored only unique data, the company could decrease its storage capacity and network requirements by up to 50 times. Deduplication allows companies to achieve these savings.
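The transfer-time range quoted above can be checked with a few lines of arithmetic. This is only a rough sketch: it assumes decimal units (1 TB = 1,000,000 MB) and a sustained rate of 10 MB/s, and the function name is illustrative.

```python
# Rough transfer-time check for the 100-Mbit example above.
# Assumptions: decimal units (1 TB = 1,000,000 MB) and a sustained 10 MB/s.

SECONDS_PER_DAY = 86_400


def transfer_days(data_tb: float, rate_mb_per_s: float = 10.0) -> float:
    """Return the number of days needed to move data_tb terabytes."""
    data_mb = data_tb * 1_000_000
    return data_mb / rate_mb_per_s / SECONDS_PER_DAY


for size_tb in (10, 75):
    print(f"{size_tb} TB: about {transfer_days(size_tb):.1f} days")
```

Under these assumptions, 10 TB takes a little under two weeks and 75 TB a little under three months, which matches the range given in the text.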
How Deduplication Helps

Deduplication minimizes the storage space taken by the data by detecting data repetition and storing identical data only once. Deduplication also reduces network load: if, during a backup, data is found to duplicate data that is already stored, it is not transferred over the network.

Acronis Backup will deduplicate backups saved to a managed vault when you enable deduplication during the vault creation. A vault where deduplication is enabled is called a deduplicating vault.

Deduplication operates on data blocks. It works for all operating systems supported by Acronis Backup Advanced.

Deduplication produces the maximum effect in the following cases:
- When creating full backups of similar data from different sources, like operating systems or virtual machines, and applications deployed from a standard image.
- When creating full backups of systems that you have previously backed up to the same deduplicating vault.
- When performing incremental backups of similar data from different sources, for example when you deploy OS updates to multiple systems and run the incremental backup.
- When performing incremental backups of data that does not change itself, but changes its location, for example when multiple pieces of data circulate over the network or within one system. Each time a file moves, it is included in the incremental backup, increasing its size. Deduplication solves the issue: each time a file appears in a new place, deduplication does not store a second copy.
How It Works

During deduplication, the backup data is split into blocks. The uniqueness of each block is checked against a special database that tracks the checksums of all stored blocks. Unique blocks are sent to the storage and duplicates are skipped.

For example, if ten virtual machines are being backed up to a deduplicating vault and some block is found in five of them, only one copy of this block will be stored.

Skipping duplicate blocks in this way saves storage space and network traffic.

The following sections describe the deduplication technology and its implementation in Acronis Backup Advanced.

Deduplication at source

When performing a backup to a deduplicating vault, Acronis Backup Agent calculates a fingerprint, or checksum, of each data block. This fingerprint is often called a hash value.

The data block size is 4 KB for disk-level backups and 1 byte to 256 KB for file-level backups. Each file smaller than 256 KB is considered a complete data block. Files larger than 256 KB are split into 256-KB blocks.

Before sending a data block to the vault, the agent queries the storage node to determine whether the block's hash value is already present. If so, the agent sends only the hash value; otherwise, it sends the block itself. The storage node saves the received data blocks in a temporary file.

Saving backups to a temporary file first, instead of writing them directly to the Deduplication Data Store, increases scalability and allows parallel processing of backups from multiple agents.
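The agent-side flow described above (split, hash, query, send) can be sketched as follows. This is a minimal illustration under stated assumptions, not the product's implementation: StorageNodeStub, back_up, and the use of SHA-256 are hypothetical, since the text does not name the hash function or the protocol calls. The stub also treats a block as known as soon as it lands in the temporary file, which is a simplification.

```python
import hashlib

BLOCK_SIZE = 256 * 1024  # file-level backups: files are split into 256-KB blocks


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Yield consecutive blocks; a file smaller than block_size is one block."""
    for offset in range(0, len(data), block_size):
        yield data[offset:offset + block_size]


class StorageNodeStub:
    """Hypothetical stand-in for the storage node (not the Acronis API)."""

    def __init__(self):
        self.known_hashes = set()  # hashes the node already knows about
        self.temp_file = []        # unique blocks awaiting indexing

    def has_hash(self, digest: str) -> bool:
        return digest in self.known_hashes

    def receive_block(self, digest: str, block: bytes) -> None:
        self.known_hashes.add(digest)
        self.temp_file.append(block)


def back_up(data: bytes, node: StorageNodeStub):
    """Return the backup's hash list and the number of block bytes sent."""
    hashes, bytes_sent = [], 0
    for block in split_blocks(data):
        digest = hashlib.sha256(block).hexdigest()  # assumed hash function
        if not node.has_hash(digest):
            node.receive_block(digest, block)       # unique block: send the data
            bytes_sent += len(block)
        hashes.append(digest)                       # the backup keeps the hash
    return hashes, bytes_sent


node = StorageNodeStub()
payload = b"A" * BLOCK_SIZE + b"B" * 100
_, first = back_up(payload, node)    # first backup sends both blocks
_, second = back_up(payload, node)   # identical backup sends no block data
print(first, second)
```

In this sketch the second backup of identical data transfers only hash values, which is the network saving the section describes.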
Some data, like encrypted files or disk blocks of a non-standard size, cannot be deduplicated. For higher efficiency, the agent always transfers this data to the vault without calculating hash values. For more information about the limitations of deduplication, see the Deduplication Restrictions section below.

Once the backup process is completed, the vault contains the resulting backup and the temporary file with the unique data blocks. The temporary file will be processed at the next stage. The backup (TIB file) contains hash values and the data that cannot be deduplicated. No further processing of this backup is needed; you can recover data from it.
Deduplication at target

After a backup to a deduplicating vault is completed, the storage node runs the indexing task. This task deduplicates the data in the vault as follows:

1. Data blocks are moved from the temporary file to a special file within the vault, storing duplicate items there only once. This file is the deduplication data store.
2. Hash values and links (or offsets) are saved, so the data can be easily reassembled (rehydrated).
3. After all the data blocks have been moved, the temporary file is deleted.

As a result, the data store contains a number of unique data blocks. Each block has one or more references from the backups. The references are contained in the deduplication database. The backups remain untouched. They contain hash values and the data that cannot be deduplicated.
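The three indexing steps above can be sketched in a few lines. This is an illustrative model only: the data store is a Python list, the deduplication database is a dict of hash to offset, and SHA-256 is an assumed hash function. None of these names come from the product.

```python
import hashlib


def index_temp_file(temp_blocks, data_store, dedup_db):
    """Move blocks from the temporary file into the data store.

    data_store: list of unique blocks (stand-in for the data store file)
    dedup_db:   dict mapping hash -> offset (list index) in data_store
    """
    for block in temp_blocks:
        digest = hashlib.sha256(block).hexdigest()  # assumed hash function
        if digest not in dedup_db:                  # step 1: store each block once
            dedup_db[digest] = len(data_store)      # step 2: save hash and offset
            data_store.append(block)
    temp_blocks.clear()                             # step 3: drop the temporary file


def rehydrate(backup_hashes, data_store, dedup_db):
    """Reassemble the original data from the hash values kept in a backup."""
    return b"".join(data_store[dedup_db[h]] for h in backup_hashes)


blocks = [b"os", b"app", b"os"]
backup = [hashlib.sha256(b).hexdigest() for b in blocks]  # what the backup keeps
temp, store, db = list(blocks), [], {}
index_temp_file(temp, store, db)
print(len(store), rehydrate(backup, store, db))
```

The backup itself is never rewritten here: it keeps its hash list, and the data store plus database are enough to rebuild the data.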
The following diagram illustrates the result of deduplication at target.

The diagram shows two archives, each with a separate set of backups. Blue blocks h1..h7 are hash values stored in the backup files. Green blocks are data blocks that cannot be deduplicated. Archive 2 is assumed to be encrypted, so it consists entirely of green blocks. As a result, the Deduplication Database contains hash values of blocks that can be deduplicated, and the Deduplication Data Store contains data blocks from both Archive 1 and Archive 2.

According to the Deduplication Best Practices section, the Deduplication Database and the Deduplication Data Store should be stored on separate disks for better performance.

The indexing task may take some time to complete. You can view this task's state on the management server by selecting the corresponding storage node and clicking View details. You can also manually start or stop this activity in that window.

If the RAM capacity on the storage node is not sufficient to deduplicate a large amount of data, the indexing activity may fail. The backups will continue to run successfully. You can add more RAM to the storage node, or delete unnecessary backups and run the compacting task. After the next scheduled backup, the indexing will start again.
Deduplication database

The Acronis Backup Storage Node that manages a deduplicating vault maintains the deduplication database, which contains the hash values of all data blocks stored in the vault, except for those that cannot be deduplicated, such as encrypted files.

The deduplication database is stored on the storage node's local disk. You can specify the database path when creating the vault.

The size of the deduplication database is about 1.5 percent of the total size of unique data stored in the vault. In other words, each terabyte of new (non-duplicate) data adds about 15 GB to the database.

If the database is corrupted or the storage node is lost while the vault itself survives, the new storage node can rescan the vault and recreate the vault database and the deduplication database. The vault database is a tiny database that contains the metadata of all archives stored in the vault.

Deduplication data store

The deduplication data store is a special set of files within the centralized managed deduplicating vault, where unique data blocks are stored, except those that cannot be deduplicated (see the Deduplication Restrictions section for more details).

Acronis Backup Storage Node keeps a separate deduplication data store for each deduplicating vault. Thus, the backup data is deduplicated within a single vault only.
Recovery

For recovery, Acronis Backup Agent requests the data from the Acronis Storage Node through a proprietary secure protocol. The Storage Node reads the backup data from the vault, and if a block references the deduplication data store, the Storage Node reads the data from it. For an Agent, the recovery process is transparent and independent of deduplication.

If the indexing task has not finished yet, part of the data may still be in the temporary file. Recovery is not affected; the only difference in this case is that the Storage Node takes the backup data from the temporary file instead of the Deduplication Data Store.

Compacting

After one or more backups have been deleted from the vault, either manually or through retention rules, the data store may contain blocks that are no longer referenced by any backup. Such orphan blocks are deleted by the compacting task, which the storage node runs on a schedule.

By default, the compacting task runs every Sunday night at 03:00. You can re-schedule the task by selecting the corresponding storage node, clicking View details, and then clicking Compacting schedule. You can also manually start or stop the task on that tab.

During compacting, the Storage Node iterates through all backups in the vault and marks all referenced blocks as used (the corresponding hash is marked as used in the Deduplication Database). After that, the Storage Node iterates through all blocks in the old Deduplication Data Store and moves
all used blocks to the newly created Data Store, updating the corresponding records in the Deduplication Database. After this process, the newly created Data Store replaces the old one, and all Deduplication Database records not marked as used are deleted.

The compacting process may require additional system resources. That is why the compacting task runs only when a sufficient amount of data to compact has accumulated. The threshold is determined by the Compacting Trigger Threshold configuration parameter.

Compression and encryption

Backup data is compressed before being sent to the server. The hash value for each data block is calculated before compression. If two equal blocks are compressed with different compression levels, they are still recognized as duplicates. Data blocks are not recompressed on the server and are stored with the compression level specified in the relevant backup plan.

Backups encrypted on the client side are not deduplicated, for security reasons. To leverage both encryption and deduplication, Acronis Backup Advanced lets you encrypt the managed vault itself. Anything written to such a vault is encrypted with the AES cryptographic algorithm, and anything read from it is decrypted transparently by the storage node using a vault-specific encryption key stored on the storage node. If the storage medium is stolen or accessed by an unauthorized person, the vault cannot be decrypted without access to the storage node server.

The AES cryptographic algorithm operates in Cipher Block Chaining (CBC) mode and uses a randomly generated key with a user-defined size of 128, 192, or 256 bits. The larger the key size, the longer it takes for the program to encrypt the archives stored in the vault, and the more secure the archives are. The encryption key is also encrypted with AES-256, using as a key a SHA-256 hash of the word generated from the user-entered password.
The word used for generating an encryption key is not stored anywhere on the disk; the word's hash is used for verification purposes. With this two-level security, the archives are protected from any unauthorized access, but recovering a lost word is not possible.

Replicating data

Replication is the process of copying backup data to another location. By default, the data is copied immediately after backup. A user has the option to postpone copying the backup by setting up replication inactivity time.
When the data is replicated or staged to a deduplicating vault, deduplication still applies: duplicate data blocks are not resent if they are already in the destination vault. In this case, only the unique data is transferred between the two Storage Nodes. The source Storage Node behaves like an Agent with deduplication at target enabled: it reads backup data blocks, generates hashes for them, and sends a request for the hashes to the target Storage Node. If blocks with these hashes are already in the deduplication data store, these blocks are not sent over the network. As a result, replication achieves the same network bandwidth savings as backing up data.

Enhancements in Acronis Backup Advanced 11.5 Update 6

Starting with Acronis Backup Advanced 11.5 Update 6, Acronis Backup Storage Node has a new indexing algorithm for server-side deduplication and a new format of the deduplication database.

Now you can deduplicate more data with the same memory: the minimum of 8 GB of RAM enables you to back up 3 TB of unique data, compared with 500 GB previously. In addition, performance remains the same, without degradation, even as the amount of stored unique data grows. With file-level backups, you can store more than 12 TB of unique data with the same amount of RAM.

If the data reaches the RAM limits, the indexing activity may fail. The error message contains a recommendation to increase the RAM. Nevertheless, the backups will still succeed. When you add more RAM to the storage node, the indexing will run after the next backup.

The index is now loaded into RAM on storage node service startup, so the indexing performance is much higher.

New configuration parameters allow you to choose between the old and new indexing algorithms and control the storage node memory allocation.

Note: the new indexing algorithm requires the new deduplication database format. See Appendix B. How to Upgrade Deduplication for more details.
New indexing algorithm

The new index is based on a combination of a hash table, a Bloom filter, and the principle of locality.

- You can store more data with the same RAM. Starting with Update 6, only 8 bytes are used per item (data block) instead of 80 bytes per item.
- Higher indexing speed with no degradation. Insertion into the new index is 300 times faster than before.
- Faster deduplication index rebuild. A 1 TB deduplication data store can now be rebuilt in 3 hours instead of 24 hours on average hardware.
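To illustrate why pairing a Bloom filter with a hash table helps, here is a minimal sketch. It is not the Update 6 index: the class names, the filter size, the number of hash functions, and the use of salted SHA-256 are all assumptions made for the example, and the locality-based part of the design is omitted.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits: int = 1 << 16, num_hashes: int = 4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, key: bytes):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


class BlockIndex:
    """Hash table guarded by a Bloom filter (illustrative only)."""

    def __init__(self):
        self.bloom = BloomFilter()
        self.table = {}  # block hash -> offset in the data store

    def lookup(self, block_hash: bytes):
        if not self.bloom.might_contain(block_hash):
            return None  # definitely a new block: skip the table lookup
        return self.table.get(block_hash)

    def insert(self, block_hash: bytes, offset: int) -> None:
        self.bloom.add(block_hash)
        self.table[block_hash] = offset


idx = BlockIndex()
idx.insert(b"h1", 0)
print(idx.lookup(b"h1"), idx.lookup(b"h2"))
```

The design point is that most lookups for new (unique) blocks are answered by the compact filter alone, so the larger table is consulted mainly for blocks that are likely duplicates.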
Improvements in deduplication performance of Acronis Backup Advanced 11.5 Update 6

Performance comparison tests led to the following conclusions:

1. Backup speed is constant in Update 6.
2. Indexing is much faster in Update 6.
3. Recovery speed is constant in Update 6.
When to Use Deduplication

Deduplication is most effective when the deduplication ratio is lowest. Here is the formula for calculating the deduplication ratio (for more calculations, see Appendix A. Estimating Storage Capacity):

Deduplication Ratio = Unique Data Percentage + (1 - Unique Data Percentage) / Number of Machines

It means that:

1. Deduplication is most effective in environments with a lot of duplicate data on each machine
2. Deduplication is most effective in environments with lots of similar machines, VMs, or applications to be backed up

In addition, deduplication can help in some other scenarios, such as WAN optimization. Let's take a look at some typical use cases:

Use Case 1: Big environment with similar machines

Environment
The backup system includes 100 similar workstations. Initial deployment of most of the workstations was done with Acronis Snap Deploy.

Deduplication effect
The workstations were deployed from a single image with Acronis Snap Deploy, so at least the OS and generic applications are identical on all of them, and there are a lot of duplicates. In addition, the number of workstations is high, which makes deduplication even more effective.

Conclusion
Deduplication is very effective for this scenario, achieving significant storage savings.
Use Case 2: WAN optimization

Environment
The backup system includes 40 similar workstations in the main office. Backup is performed to a remote location.

Deduplication effect
There is no information on whether the workstations were deployed from a single image. However, similar types of OS often have many similar files. Let's assume that each PC contains 50% unique data, which is still quite good for deduplication:

Deduplication Ratio = 50% + (100% - 50%) / 40 = 51.25%

This means that the approximate storage and network traffic savings will be 100% - 51.25% = 48.75%, almost a half. When backing up to a remote location, the WAN connection can be relatively slow, so halving the traffic is a big advantage.

Conclusion
Deduplication is an effective solution for this case, achieving time savings through WAN optimization.

Use Case 3: Production Application servers

Environment
The backup system includes five application servers with different applications installed. The amount of data to back up is 20 TB.

Deduplication effectiveness
Application servers have a huge amount of data and host different applications. This means there will be very few duplicates, if any. Moreover, the total amount of backup data to be processed is very high. In this case, a large amount of data will have to be indexed by the Storage Node, but the effect of this indexing will be very low because there are no duplicates. In the worst-case scenario, a single Storage Node may not be able to process all the backups during the day.

Conclusion
Deduplication is not effective for this case. Backing up to a simple high-capacity NAS will be a better solution.
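The deduplication ratio formula used in these use cases is easy to evaluate for your own environment. The sketch below reproduces the Use Case 2 numbers; the function name is illustrative, and the unique data percentage is passed as a fraction of 1.

```python
def deduplication_ratio(unique_fraction: float, machines: int) -> float:
    """Deduplication Ratio = U + (1 - U) / N, with U as a fraction of 1."""
    if machines < 1:
        raise ValueError("machines must be at least 1")
    return unique_fraction + (1.0 - unique_fraction) / machines


ratio = deduplication_ratio(0.50, 40)  # Use Case 2: 50% unique data, 40 PCs
savings = 1.0 - ratio
print(f"ratio = {ratio:.2%}, savings = {savings:.2%}")
```

As the number of machines grows, the second term shrinks toward zero, so the ratio approaches the unique data percentage itself. That is why Use Case 3, with mostly unique data, gains little regardless of scale.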
Deduplication Restrictions

Common restrictions

Deduplication does not apply if you protect the archive with a password and encryption. Data blocks of password-protected encrypted archives are stored in the backups as they would be in a non-deduplicating vault.

If you want to encrypt an archive while still deduplicating it, leave the archive unencrypted and encrypt the deduplicating vault itself with a password. You can do this when creating the vault. The data will still be safe at rest as well as in transit; the latter is ensured by the encryption of the proprietary backup protocol.

Disk-level backup

Deduplication of disk blocks does not apply if the volume's allocation unit size (also known as cluster size or block size) is not divisible by 4 KB.

Tip: The allocation unit size on most NTFS and ext3 volumes is 4 KB. This allows for block-level deduplication. Other examples of allocation unit sizes allowing for block-level deduplication include 8 KB, 16 KB, and 64 KB.

File-level backup

Deduplication of a file is not performed if the file is encrypted by the OS and the "In archives, store encrypted files in decrypted state" checkbox in the backup options is not selected (it is cleared by default).

In the NTFS file system, a file may have one or more additional sets of data associated with it, often called alternate data streams. When such a file is backed up, so are all its alternate data streams. However, these streams are never deduplicated, even when the file itself is.
Deduplication Best Practices

Deduplication is a complex process that depends on multiple factors. The most important factors that influence deduplication speed are:
- The speed of access to the deduplication database (IOPS)
- The RAM capacity of the storage node
- The number of deduplicating vaults created on the storage node

The following recommendations help increase deduplication performance.

Put the deduplication database and deduplicating vault on separate physical devices

To increase the speed of access to a deduplication database, the database and the vault must be located on separate physical devices.

It is best to allocate dedicated devices for the vault and the database. If this is not possible, at minimum avoid placing a vault or database on the same disk as the operating system, which performs a large number of background disk read/write operations that significantly slow down deduplication.

Selecting a disk for a deduplication database

The database must reside on a fixed drive. Do not place the deduplication database on external detachable drives or network devices, such as iSCSI over the LAN.

Minimal disk access time is important. The best choice is an enterprise-grade Solid-State Drive (SSD), though a fast SATA hard drive (7200 RPM or faster) or a SAS drive can be acceptable in limited-scale configurations. RAID arrays can work too, but note that RAID access performance may drop if large amounts of data are being written.

The volume that stores the deduplication database should have at least 10 GB of free space. When backing up a large number of machines, the required free space may be larger.
The disk space required for a deduplication database can be estimated by using the following formula:

S = U / , where
S is the disk size, in GB
U is the estimated amount of unique data in the deduplication data store, in GB.

For example, if the planned amount of unique data in the deduplication data store is U=5 TB, the deduplication database will require free disk space of

S = 5*1024 / = 170 GB

Selecting a disk for a deduplicating vault

To avoid losing data, we recommend using RAID. The best option is RAID 10 or similar. Note that RAID 5 and RAID 6 are prone to double-rebuild failures on large-capacity hard disks, and as such are not recommended. RAID 0 is not recommended because it has no redundancy, and RAID 1 does not increase performance. You can use either local disks or a SAN.

2.6 GB of RAM per 1 TB of unique source data

When the limit is reached, deduplication will stop, but backups and recoveries will continue to work. In general, the more RAM you have, the more unique data can be stored.

Only one deduplicating vault on each storage node

It is highly recommended that you create only one deduplicating vault on a storage node. Otherwise, the RAM will be divided in proportion to the number of vaults.

64-bit operating system

The storage node requires a 64-bit operating system. It is also recommended not to share the storage node with other resource-intensive applications, like Database Management Systems (DBMS) or Enterprise Resource Planning (ERP) systems.

Minimum of quad-core 2.5 GHz CPU

We recommend that you use a CPU of at least 2.5 GHz with at least four cores. Multi-CPU systems are also supported.
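The RAM guideline above (2.6 GB per 1 TB of unique data) can be turned into a quick sizing check. This is a back-of-the-envelope sketch: the function name is illustrative, and it assumes RAM is split equally between vaults when more than one deduplicating vault exists on a storage node.

```python
RAM_GB_PER_TB = 2.6  # guideline from this section


def max_unique_data_tb(ram_gb: float, vaults: int = 1) -> float:
    """Approximate unique data (TB) each vault can index with the given RAM.

    Assumes RAM is divided equally between the deduplicating vaults.
    """
    if vaults < 1:
        raise ValueError("vaults must be at least 1")
    return ram_gb / vaults / RAM_GB_PER_TB


for ram in (8, 48):
    print(f"{ram} GB RAM: about {max_unique_data_tb(ram):.1f} TB of unique data")
```

With 8 GB this gives roughly 3 TB, in line with the minimum configuration mentioned for Update 6. Splitting the same RAM across two vaults halves the capacity of each, which is the reason for the one-vault recommendation.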
Ample free space in the vault

Indexing of a backup may require as much free space as the backup data occupies immediately after saving it to the vault. Without taking compression or deduplication into account, this is equal to the size of the original data backed up.

We strongly recommend against creating a deduplicating managed vault on a FAT32 volume. The vault stores all deduplicated items in two very large files, and the maximum file size in the FAT32 file system is limited to 4 GB. The storage node may stop functioning when this limit is reached.

High-speed LAN

A 1-Gbit LAN is recommended. It will allow the software to perform 5-6 backups with deduplication in parallel.

Back up a typical machine before backing up multiple machines with similar contents

When backing up several machines with similar contents, it is recommended to back up one machine first and wait until indexing of the backed-up data is finished. After that, the other machines will be backed up much faster due to efficient deduplication: because the first machine's backup has been indexed, most of the data is already in the deduplication data store and will not be transferred or stored again.

Back up machines at different times

If you back up a large number of machines, spread out the backup operations over time. To do this, create several backup plans with various schedules, or use the spreading functions of the centralized backup plan in Acronis Backup Advanced.

Use fast cataloging

Indexing of a backup starts after its cataloging has been completed. To reduce the overall time required for backup processing, switch automatic cataloging to the fast mode. You can start full cataloging manually outside of the backup window.

Configure alert notifications

It is recommended that you configure the "Vaults" alert notification in the management server options. This can help you react promptly to abnormal situations. For example, a timely reaction to a "There is a vault with low free space" alert can prevent an error during the next backup to the vault.
Performance Tests

The decision to implement deduplication in a backup solution should be based on at least an approximate understanding of the expected deduplication performance. Acronis has measured the performance of deduplication in a few scenarios: backing up, indexing, and recovering. While the results below are configuration- and environment-specific, they may help you estimate results in your own environment.

Test configuration

The testing environment conforms to the requirements from the Deduplication Best Practices section.

Storage Node
- A dedicated server-grade system
- 2x CPUs: Intel Xeon E V2 (4 cores, 1.8 GHz), providing for parallel processing of several incoming backups (indexing is not parallel)
- 48 GB of RAM, for effective processing of 48 / 2.6 = about 18 TB of unique data
- 2 x 8 TB RAID volumes for the deduplicating vault
- Dedicated 150 GB SSD for the deduplication database

Client machines
- 20x virtual machines, backed up by 10 Acronis Backup Advanced virtual appliances
- Initial data size on each VM is 7 GB. Initially all VMs were identical.
- Data change rate is 5 GB per backup. In practice, the general average daily change rate is 1-2%, but for testing purposes we have sped things up.

Network
- 10 Gbit/s network connection between the virtual appliances and the Storage Node
Backup and indexing tests

Backing up and indexing ran for all VMs in cycles. Each cycle consisted of:

1. Deleting previously generated random data from each VM
2. Generating 5 GB of new random data on each VM
3. Backing up all VMs
4. Letting all backups complete
5. Reindexing the vault for all new backup data to be deduplicated

Tests have shown the following results.

Notes:

1. An incremental backup of all machines (20 VMs with 5 GB of new unique data on each of them, so 100 GB of incremental data in total) took 1250 seconds on average, so the backup speed was 288 GB/hour. There was no reduction of backup speed even after 100 cycles.
2. Indexing of the incremental backups of all machines took 1370 seconds on average, so the indexing speed was 263 GB/hour. There was no reduction of indexing speed even after 100 cycles.
3. The combined backup and indexing speed is 164 GB/hour.

Backup of VMware ESX virtual machines is, by default, performed with the help of CBT technology. Unchanged data and empty blocks are not processed, so the backup is faster than a backup of a physical machine.

RAM limits were not reached, so deduplication worked even during the 100th cycle. Unlike earlier updates of Acronis Backup Advanced, the latest update turns off indexing when the server is about to run out of RAM. Thus, backups will continue at the same speed even if there is more than 18.5 TB of unique data, the maximum for the 48 GB of RAM we had on the Storage Node.
Useful Links

You can find more information at:
- Acronis website:
- Acronis Backup Advanced online help:
- Acronis Storage Node hardware configuration wizard:
- Acronis Knowledge Base: https://kb.acronis.com
Appendix A. Estimating Storage Capacity

Deduplication technology saves on storage capacity, yet it does not eliminate the need for it. The required storage capacity depends on the following factors:
- Backup scheme
- Amount of data to back up per machine
- Retention period
- Unique data percentage
- Compression ratio
- Daily incremental backup size
- Number of machines

Here is some advice on estimating the required storage capacity.

Backup scheme and retention period

The backup scheme and retention period define how many backups will be on the storage after a specific period of time. The following table shows the numbers of backups (full, incremental, and differential) after specific periods, depending on the backup scheme and retention period:

Columns: number of Full, Diff., and Incr. backups after 4 weeks, 6 months, 1 year, and 2 years.
Rows (schemes):
- Simple
- GFS (keep monthly backups indefinitely)
- GFS (keep monthly backups for 1 year)
- TOH (2 levels)
- TOH (6 levels)
The size of a full backup changes because of new data. For example, if the daily incremental change rate is 1% and backups run only on workdays, the full backup size after 52 weeks may be up to 3.6 times the size of the first full backup. For capacity estimations, it is recommended to use the projected backup size at the end of the retention period. For more accurate results, you can take the average of the initial backup size and the projected last backup size.

The incremental backup size depends on the frequency of backups. The daily incremental backup size is one of the parameters defined initially. Backups can be configured to run weekly, monthly or on any day, so the actual incremental backup size needs to be estimated accordingly.

The differential backup size depends on the amount of daily changes and the number of days since the last full backup. To estimate the size of the largest differential backup, take the number of days between a full backup and the last of its differential backups. For example, in the case of GFS the longest period between a full and a differential backup is 15 days, so the maximum differential backup size is the size of the daily incremental data multiplied by 15. For a more accurate estimation, use the average of the first and the last differential backups of the same full backup.

Compression Ratio

The data in all types of backups is usually compressed. The default Normal compression level typically achieves a compression ratio of 40% to 60% (the compressed data is 40% to 60% of the original size).

Number of machines and amount of data

The number of machines affects your deduplicated backup in several ways:
- Backing up more than one machine in parallel creates additional LAN traffic
- Parallel connections from multiple machines require the Storage Node to handle multithreaded backups effectively
- More machines mean more backup data

The deduplication ratio also depends on the number of machines: the more machines you back up, the more storage savings you achieve.
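The full and differential size estimates described earlier in this appendix can be checked with a few lines of arithmetic. The sketch below assumes that daily changes accumulate additively (which is what yields the 3.6 factor quoted for 1% daily change over 52 five-day weeks); the function names and the 100 GB and 1 GB inputs are illustrative only.

```python
def projected_full_size(initial_gb, daily_change_pct, weeks, workdays_per_week=5):
    """Projected full backup size, assuming every workday adds
    daily_change_pct percent of the initial size (additive growth)."""
    growth = daily_change_pct / 100 * weeks * workdays_per_week
    return initial_gb * (1 + growth)


def max_differential_size(daily_incremental_gb, days_since_full):
    """Largest differential backup: daily changes accumulated since the last full."""
    return daily_incremental_gb * days_since_full


initial = 100.0                                   # GB, hypothetical first full backup
projected = projected_full_size(initial, 1, 52)   # about 3.6 times the initial size
average = (initial + projected) / 2               # the more accurate planning figure
largest_diff = max_differential_size(1.0, 15)     # GFS: up to 15 days after the full

print(projected, average, largest_diff)
```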
Unique data percentage

The amount of unique data on a machine depends on the role of the system:
- Virtual machines: 30% unique
- Office workstations: 50% unique
- Database servers: 65% unique
- File servers: 75% unique

Daily incremental backup size

The daily incremental backup size is the amount of data that is added or changed on the machine daily. It equals the size of a daily incremental backup before compression and deduplication.

Storage capacity calculations

When you plan the storage capacity of your system, the following formulas can help. Some of them are for initial backup planning (the initial backup size affects the initial backup time); the rest are for total capacity planning.

Initial storage capacity

The following formulas may help with calculating the first backup size, before incremental and differential backups:

Total Data Size = Number of Machines * Amount of data to back up per machine

Full Non-Deduplicated Backup Size = Total Data Size * Compression Ratio%/100%

Deduplication Ratio = Unique Data Percentage/100% + (1 - Unique Data Percentage/100%) / Number of Machines

Initial Storage Space = Full Non-Deduplicated Backup Size * Deduplication Ratio
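As a worked illustration, the initial storage formulas can be written as a small function. This is a minimal sketch: the function names and the sample inputs (20 virtual machines with 100 GB each, 50% compression ratio, 30% unique data) are assumptions chosen for the example, not measured values.

```python
def deduplication_ratio(unique_pct, machines):
    """Share of the compressed data that remains after deduplication."""
    unique = unique_pct / 100
    return unique + (1 - unique) / machines


def initial_storage_space(machines, data_per_machine_gb, compression_pct, unique_pct):
    """Estimated size of the first full backup of all machines, in GB."""
    total_data = machines * data_per_machine_gb
    full_non_dedup = total_data * compression_pct / 100
    return full_non_dedup * deduplication_ratio(unique_pct, machines)


ratio = deduplication_ratio(30, 20)               # 0.3 + 0.7 / 20
initial_gb = initial_storage_space(20, 100, 50, 30)
print(ratio, initial_gb)
```

With these inputs, 2000 GB of source data compresses to 1000 GB, and the deduplication ratio of about 0.335 brings the estimate to roughly 335 GB.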
Storage Space at the end of the period

Estimating the initial storage space is just one part of backup storage planning. The storage capacity at the end of the period is also required:

Total Backup Data Size = Number of Machines * (Amount of data to back up per machine + Daily Incremental Backup Size * Number of Weeks * 5)

Full Non-Deduplicated Backup Size = Total Backup Data Size * Number of Full Backups * Compression Ratio%/100%

Incremental Non-Deduplicated Backup Size = Incremental Backup Size * Number of Machines * Number of Incremental Backups * Compression Ratio%/100%

Differential Non-Deduplicated Backup Size = Differential Backup Size * Number of Machines * Number of Differential Backups * Compression Ratio%/100%

All Non-Deduplicated Backup Size = Full Non-Deduplicated Backup Size + Incremental Non-Deduplicated Backup Size + Differential Non-Deduplicated Backup Size

Deduplication Ratio = Unique Data Percentage/100% + (1 - Unique Data Percentage/100%) / Number of Machines

Total Backup Storage Space = All Non-Deduplicated Backup Size * Deduplication Ratio

These formulas rely on some assumptions that are acceptable for estimations:
- The Daily Incremental Backup Size is treated as a data addition rate. In practice, some of the data is changed and replaces existing data on a machine, so the actual number could be lower
- The period is measured in weeks and 5 is used as the number of workdays per week; holidays are counted as workdays
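The end-of-period formulas can be combined into one estimate, as sketched below. The function name, parameter names and all sample inputs are hypothetical; in particular, the backup counts (3 full, 10 incremental, 2 differential) are placeholders that should be replaced by the counts for your backup scheme and retention period. The compression ratio is applied as a multiplier to all three backup types.

```python
def total_backup_storage_gb(machines, data_per_machine_gb, daily_incr_gb, weeks,
                            full_count, incr_count, diff_count,
                            incr_size_gb, diff_size_gb,
                            compression_pct, unique_pct):
    """Estimated storage needed at the end of the period, in GB."""
    compression = compression_pct / 100
    unique = unique_pct / 100

    # Data per machine grows by the daily incremental size on 5 workdays a week.
    total_data = machines * (data_per_machine_gb + daily_incr_gb * weeks * 5)

    full = total_data * full_count * compression
    incremental = incr_size_gb * machines * incr_count * compression
    differential = diff_size_gb * machines * diff_count * compression
    all_non_dedup = full + incremental + differential

    dedup_ratio = unique + (1 - unique) / machines
    return all_non_dedup * dedup_ratio


estimate = total_backup_storage_gb(
    machines=20, data_per_machine_gb=100, daily_incr_gb=1, weeks=52,
    full_count=3, incr_count=10, diff_count=2,
    incr_size_gb=1, diff_size_gb=8,
    compression_pct=50, unique_pct=30,
)
print(round(estimate, 1), "GB")
```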
Appendix B. How to Upgrade Deduplication

The new indexing algorithm only works with the new deduplication database format. To apply this algorithm to an existing vault, you must recreate the vault's deduplication database.

Exporting data to a newly created vault is not recommended because it requires additional time to rehydrate and deduplicate the data, and additional space to store this data temporarily. Redirecting backups from an old vault to a new vault would result in a full backup of the data and a lower deduplication ratio, because data is not deduplicated across vaults.

There are two methods of recreating the deduplication database:
- Re-attaching the managed vault. If the vault is encrypted, you need to provide the encryption password when re-attaching the vault.
- Changing the deduplication database path. This method requires stopping the Acronis Storage Node Service.

Both methods use the same mechanism: if the storage node cannot find the database, it recreates it from scratch. The duration of this process depends on the datastore size and the disk I/O speed. On average, 1 TB of the datastore requires 3 hours. The next section describes how to find the datastore file.
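The average of 3 hours per 1 TB gives a quick way to plan the upgrade window. The sketch below is an illustration only: the function name is hypothetical, 1 TB is assumed to be 1024^4 bytes, and the actual duration depends on disk I/O speed. It accepts a list of sizes because two datastore files can exist while compacting is in progress, in which case their sizes are summed.

```python
HOURS_PER_TB = 3          # average stated in this appendix
BYTES_PER_TB = 1024 ** 4  # assumption: binary terabytes


def estimate_rebuild_hours(datastore_file_sizes_bytes):
    """Rough time to recreate the deduplication database from the datastore."""
    total_tb = sum(datastore_file_sizes_bytes) / BYTES_PER_TB
    return total_tb * HOURS_PER_TB


# One 2 TB datastore file.
single = estimate_rebuild_hours([2 * BYTES_PER_TB])

# Two files found during compacting: 1 TB and 0.5 TB.
during_compacting = estimate_rebuild_hours([BYTES_PER_TB, BYTES_PER_TB // 2])

print(single, during_compacting)
```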
How to find the datastore file

Use the vault path displayed on the vault page:

If a local path is displayed, this path is on the machine with the storage node. The datastore file has either the *.ds.0 or the *.ds.1 extension. Two files with these extensions are in the vault while compacting is in progress. If you find two files, use the sum of both file sizes to estimate the upgrade time.

Re-attaching the managed vault

This method requires that you know the location of the deduplication database files.

How to find the deduplication database files

Use the deduplication database path displayed on the vault page:

The deduplication database files have the *.db3, *.db3-wal, and *.db3-shm extensions.

Re-attaching the vault

The procedure varies slightly, depending on where the database files are stored. If the vault is encrypted, you must provide the encryption password when re-attaching the vault.

Deduplication database and datastore are in different folders
1. Note the folder where the old deduplication database files are stored (to be able to roll back the changes).
2. Detach the managed vault.
3. Attach the managed vault. When attaching, specify a different folder for the new deduplication database.
Deduplication database and datastore are in the same folder
1. Detach the managed vault.
2. Move the old deduplication database files to another folder.
3. Attach the managed vault. When attaching, specify the same path, including the same deduplication database path.

Validating the result

Once the operation is completed, verify that the database files are present in the corresponding folder. The new database files look as follows:

After verifying that the files are present, you can delete the old deduplication database files. If something goes wrong, you can detach and attach the vault again, pointing to the old deduplication database. If the database files were created successfully but you would like to switch back to the old indexing algorithm, you can force the storage node to use that algorithm by using the PreferedDedupIndex configuration parameter.

New Deduplication-related configuration parameters

By default, the Acronis Backup Advanced 11.5 Update 6 Storage Node uses the new indexing algorithm whenever possible and, at startup, consumes 80% of RAM while leaving at least 2 GB of RAM free. This behavior can be changed via the Windows Registry.

Choosing indexing algorithm

The following parameter specifies the preferred indexing algorithm. Using the most recent indexing algorithm is preferred.

PreferedDedupIndex
Possible values: 0 (use the best algorithm version), 1 (use the pre-Update 6 algorithm version) or 2 (use the Update 6 algorithm version)
Registry key: HKEY_LOCAL_MACHINE\SOFTWARE\Acronis\ASN\Configuration\StorageNode\PreferedDedupIndex

The parameter applies only to newly created deduplication databases. For existing vaults, the databases must be rebuilt after changing this parameter (see Appendix B. How to Upgrade Deduplication).
Configuring consumed memory limits

When the Acronis Backup Advanced 11.5 Update 6 Storage Node starts, it consumes a specified amount of memory for its index and other operating data. The amount of consumed memory is calculated based on the following rule:

The consumed memory is DatastoreIndexCacheMemoryPercent percent of RAM, but not more than the total available RAM minus DatastoreIndexReservedMemory.

When there is enough memory on the server, the Storage Node takes DatastoreIndexCacheMemoryPercent percent, which is 80% by default. If the amount of memory is small (less than 10 GB with the default parameters), the Storage Node leaves DatastoreIndexReservedMemory (2048 MB by default) for the operating system and other applications.

DatastoreIndexCacheMemoryPercent
Possible values: 0 to 100, in percent. The default value is 80%.
Registry key: HKEY_LOCAL_MACHINE\SOFTWARE\Acronis\ASN\Configuration\StorageNode\DatastoreIndexCacheMemorypercent

DatastoreIndexReservedMemory
Possible values: 0 up to the RAM size, in megabytes. The default value is 2048 MB.
Registry key: HKEY_LOCAL_MACHINE\SOFTWARE\Acronis\ASN\Configuration\StorageNode\DatastoreIndexReservedMemory
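The memory rule can be expressed as the smaller of two limits, which also shows where the 10 GB threshold for the default parameters comes from. The sketch below is an illustration, not Storage Node code: the function name is hypothetical, total RAM is used for both limits, and clamping the result at zero is an added assumption for the case where the reserve exceeds the RAM size.

```python
def index_memory_limit_mb(total_ram_mb, cache_percent=80, reserved_mb=2048):
    """Memory the index may consume: cache_percent of RAM, but never more
    than RAM minus the reserve left for the OS and other applications."""
    by_percent = total_ram_mb * cache_percent / 100
    by_reserve = total_ram_mb - reserved_mb
    # Assumption: never return a negative limit.
    return max(0, min(by_percent, by_reserve))


large_server = index_memory_limit_mb(48 * 1024)   # percent limit applies
small_server = index_memory_limit_mb(8 * 1024)    # reserve limit applies
threshold = index_memory_limit_mb(10 * 1024)      # both limits coincide at 10 GB

print(large_server, small_server, threshold)
```

With the defaults, 80% of RAM equals RAM minus 2048 MB exactly at 10 GB, so the reserve only takes effect on servers with less memory than that.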
About Acronis

Acronis sets the standard for new generation data protection through its backup, disaster recovery, and secure access solutions. Powered by the AnyData Engine and set apart by its image technology, Acronis delivers easy, complete and safe backups of all files, applications and operating systems across any environment: virtual, physical, cloud and mobile.

Founded in 2002, Acronis protects the data of over 5 million consumers and 300,000 businesses in over 130 countries. With more than 100 patents, Acronis products have been named best product of the year by Network Computing, TechTarget and IT Professional, and cover a range of features, including migration, cloning and replication.

For additional information, please visit the Acronis website. Follow Acronis on Twitter.