Brian LaGoe, Systems Administrator
Benjamin Jellema, Systems Administrator
Eastern Michigan University
Backup & Recovery Goals and Challenges
Traditional / EMU's Old Environment
Avamar Key Features
EMU Implementation
Reduce our backup windows
Ensure the integrity and recoverability of daily backups
Eliminate the need for tape or large standard disk arrays for backup
Deploy a scalable solution to keep up with high data growth rates
Cost-effectively retain backup data for extended periods of time
Streamline data protection processes and lower administrative costs
Significantly reduce the amount of backup data transferred over the network
Replicate backup data with encryption to an offsite location for disaster recovery
Provide an affordable and efficient method for backing up our new VMware environment
Provide a solution which can more easily meet legal and regulatory requirements for data retention and recovery
Provide enterprise-quality backup and recovery for servers beyond the datacenter to the rest of campus and our remote locations
Traditional backup stores many copies of the same files: from the same machine, from multiple machines, and as versions across time
Multiple copies of document A exist on different clients on the network
Multiple versions of documents exist on the same client
The same documents are backed up with every full backup
Full backups are taken once a week and retained for months and years
[Diagram: EMU's old environment — Solaris, Windows, Linux, NAS/SAN, EDM databases, and mail backed up through a staging server to an ATL P1000 tape library; daily incrementals and weekly fulls, with duplicate tapes sent offsite to Iron Mountain; over 600 tapes.]
Disk-based storage
Global de-duplication
Systemic fault tolerance: RAID/RAIN/checkpoints and replication
Heterogeneous IP-based client architecture
Scalable server architecture
Flexible deployment options
Centralized management
Completely disk-based solution
Eliminates user error in tape handling
Uses commodity-based hardware
Speed and performance of disk, with no tape-library seek-time overhead
Takes maximum advantage of inherent hard disk characteristics
Only unique objects are stored in backups; the data for each file is stored only once
Identify and store only unique sub-file data objects
Store objects taking maximum advantage of disk
Create and store trees that link all data objects
Recreate files for data restore
Break data into atoms (sub-file, variable-length segments of data)
Send and store each atom only once in the Avamar backup repository
Up to 500 times daily data reduction
At the source: de-duplication before data is transported across the network
At the target: assures coordinated de-duplication across sites, servers, and over time
Granular: small, variable-length sub-file segments guarantee the most effective de-duplication (a sketch of the idea follows below)
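To make the "atoms" idea concrete, here is a minimal, illustrative Python sketch, not Avamar's actual code: it cuts a byte stream into variable-length chunks wherever a simple rolling hash hits a fixed bit pattern, then stores each unique chunk only once, keyed by its SHA-1 digest. The chunk-size parameters and class names are assumptions chosen for readability.

import hashlib

# Content-defined chunking: boundaries depend on the data itself, so an insert
# near the start of a file only disturbs nearby chunks, not every chunk after it.
MASK = 0x0FFF                       # ~4 KB average chunk size (assumed value)
MIN_CHUNK, MAX_CHUNK = 1024, 16384  # assumed bounds, for illustration only

def chunk(data: bytes):
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling = ((rolling << 1) + byte) & 0xFFFFFFFF
        size = i - start + 1
        if (size >= MIN_CHUNK and (rolling & MASK) == MASK) or size >= MAX_CHUNK:
            chunks.append(data[start:i + 1])
            start, rolling = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks

class DedupStore:
    """Stores each unique chunk ("atom") exactly once, keyed by its SHA-1 digest."""
    def __init__(self):
        self.objects = {}                       # digest -> chunk bytes

    def backup(self, data: bytes):
        """Return the tree (list of digests) that describes this backup."""
        tree = []
        for c in chunk(data):
            digest = hashlib.sha1(c).hexdigest()
            self.objects.setdefault(digest, c)  # store only if never seen before
            tree.append(digest)
        return tree

    def restore(self, tree):
        """Recreate the original data from its tree of digests."""
        return b"".join(self.objects[d] for d in tree)

store = DedupStore()
payload = b"the same file backed up from many clients" * 1000
tree1 = store.backup(payload)                   # first backup stores the atoms
tree2 = store.backup(payload)                   # second backup stores nothing new
assert store.restore(tree1) == payload and tree1 == tree2

Because the same atoms recur across machines, across nightly backups, and across file versions, only the genuinely new atoms need to be stored, which is what drives the commonality factors shown later in the real-world statistics.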
RAID (redundant array of independent disks) protects against disk data corruption; RAID-5 nodes with hot-swappable disks
RAIN (redundant array of independent/inexpensive nodes) provides failover and fault tolerance across nodes
Replication is our modern version of offsite protection
Checkpoints in case of operational failures
Redundant Array of Independent Nodes (RAIN) architecture
Each server node has internal disk storage and CPU
Provides high availability and fault tolerance
Grid architecture for online scalability and performance
Daily integrity checks (verified checkpoints) for the Avamar server and data recoverability
RAID protection from disk failures
Parity across storage nodes (see the parity sketch below)
Utility and spare nodes
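As a toy illustration of how parity across storage nodes lets the grid ride out a node loss, the sketch below stripes data across a few simulated data nodes and rebuilds a lost stripe from the survivors plus an XOR parity block. This is purely illustrative; it is not how Avamar's RAIN layer is actually implemented.

from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def stripe(data: bytes, data_nodes: int):
    """Split data across N simulated data nodes and compute one parity block."""
    size = -(-len(data) // data_nodes)                     # ceiling division
    blocks = [data[i * size:(i + 1) * size].ljust(size, b"\0")
              for i in range(data_nodes)]
    return blocks, xor_blocks(blocks)

def rebuild(blocks, parity, lost):
    """Reconstruct the block held by a failed node from the survivors + parity."""
    survivors = [b for i, b in enumerate(blocks) if i != lost]
    return xor_blocks(survivors + [parity])

blocks, parity = stripe(b"backup data spread across the grid", data_nodes=3)
assert rebuild(blocks, parity, lost=1) == blocks[1]        # node 1 "failed"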
Host-based clients that support AIX, HP-UX, Linux, Mac OS, NetWare, Solaris, VMware, and Windows
File system and database plug-ins; database plug-ins for Microsoft Exchange, Microsoft SQL, DB2, SAP, and Oracle
Clients use operating system specific techniques and libraries to handle open files
Clients operate over standard TCP/IP protocols with either Axion or AES-128 encryption
Global source-based de-duplication and encryption eliminate the need for costly out-of-band backup networks
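The sketch below (our own illustration, not the Avamar wire protocol) shows why source-based de-duplication over standard TCP/IP carries so little data: the client sends only chunk hashes, the server answers with the digests it has never seen, and only those chunks are encrypted and transmitted. Fernet from the third-party cryptography package (AES-128-CBC plus an HMAC) stands in here for the client's transport encryption; the chunking itself is covered in the earlier sketch.

import hashlib
from cryptography.fernet import Fernet     # pip install cryptography

cipher = Fernet(Fernet.generate_key())     # stand-in for AES-128 transport encryption
server_known = set()                       # digests already stored on the grid
server_store = {}                          # digest -> encrypted chunk (illustrative)

def client_backup(chunks):
    """Send hashes first; transmit (encrypted) only the chunks the server lacks."""
    digests = [hashlib.sha1(c).hexdigest() for c in chunks]
    missing = {d for d in digests if d not in server_known}     # hash exchange
    for d, c in zip(digests, chunks):
        if d in missing:
            server_store[d] = cipher.encrypt(c)                 # only new data on the wire
            server_known.add(d)
    return len(missing)

chunks = [b"chunk-a", b"chunk-b", b"chunk-c"]
print(client_backup(chunks))   # first run: 3 chunks cross the network
print(client_backup(chunks))   # next run: 0 chunks cross the network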
Utility Node: provides internal Avamar server processes and services, including the Administrator server, cron jobs, DNS, external authentication, NTP, and web access
Data Storage Node: provides the Avamar storage; it communicates with the Utility Node and provides the RAIN-based storage. Currently supported configurations are up to 16 x 1 TB or 2 TB Data Storage Nodes
NDMP Accelerator Node: a specialized node which provides complete backup and recovery for NAS devices via the Network Data Management Protocol (NDMP) standard
Hardware:
Non-RAIN: 1 to 3 nodes, (1 x 1) or (1 x 2), providing a Utility Node and Data Nodes
RAIN: (1 x N + S), where S is the spare node; the smallest RAIN configuration is 4 hardware nodes
Software:
Software Edition: based on customer-supplied, EMC-certified hardware platforms
Avamar Data Store Edition: based upon EMC-specified hardware plus the Software Edition
Virtual Server Edition: one host on a VMware cluster, ideal for satellite offices
Provides the framework for least-privilege rights, which allows for data security while giving clients control over their own backups and restores
Tools are completely IP-based, allowing for efficient remote management
PostgreSQL database and provided tools allow for user-targeted reports
Consistent policy enforcement
Eliminates the need to manage a complete tape system, particularly at satellite offices
Provides encryption, allowing for compliance with data security standards
Environment Description
Installing Avamar Plug-ins
Daily Administration Tasks
Real World Statistics
33 Windows hosts
55 Solaris hosts
25 Linux hosts
36 Oracle databases with ~2 TB of data
6 Microsoft SQL databases
3 Macintosh hosts
1 EMC Celerra NAS host with ~3 TB of data
2 VMware ESX clients
2 Novell NetWare hosts
[Diagram: EMU Avamar deployment — the AS1A cluster in the main data center and the AS2A cluster in the remote data center (Utility Node, Data Nodes, spare node, NDMP Accelerator), backing up Solaris, Windows, Linux, database, mail, and NAS clients.]
Documentation and client software are available from the web service on the Utility Node, allowing for rapid deployment of agents
File system agents: easy to configure and deploy; very fast and efficient
Oracle RMAN: no longer a per-agent license; no longer need BCVs for temporary storage
SQL/VSS/OTM allows for the handling of open or hot files often found with SQL databases and certain system files
NDMP accelerator for NAS: had a few setbacks during migration, but we can now back up the entire Celerra environment in a few hours, instead of a few days or not at all
VMware: no per-VM license; global de-duplication at the sub-file level, even for VMDK files; we use NFS and back up with the NDMP accelerator, eliminating the need for any backup agents on the ESX servers
Avamar dashboard, backup status, and completion of Utility Node cron jobs: quick and easy at-a-glance monitoring
Capacity management: requires a proactive stance; you cannot simply wait until things get full. It is important to strive for a steady state, where old data expires at the same rate as new data is imported (a simple projection sketch follows this list)
Client installs: utilizing group policies provides consistency and rapid rollout
Changing retention based upon client needs
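Because capacity cannot be managed reactively, a simple projection like the one below helps check whether daily ingest and daily expiration are in balance, i.e. whether the steady state described above has been reached. This is our own sketch with made-up numbers, not an Avamar capacity tool.

def days_until_full(capacity_gb, used_gb, daily_new_gb, daily_expired_gb):
    """Project how long until the grid fills at the current growth and expiration rates."""
    net_daily = daily_new_gb - daily_expired_gb
    if net_daily <= 0:
        return None                 # steady state (or shrinking): never fills at these rates
    return (capacity_gb - used_gb) / net_daily

# Hypothetical numbers for illustration only.
headroom = days_until_full(capacity_gb=16000, used_gb=11000,
                           daily_new_gb=40, daily_expired_gb=35)
print("steady state reached" if headroom is None else f"~{headroom:.0f} days of headroom")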
Commonality Factor of Backups
Backup and Restore Times
Replication/Duplication Times
Network Loads
Server Load
Plugin Name             Clients  Backups  Avg % Common  Avg GB New
Windows File System          33      901         99.57      0.2192
Solaris Oracle RMAN          36      565         84.82      0.0571
Solaris File System          55     1512         98.87      0.3735
Linux Oracle RMAN             6      363         82.62      0.0218
Linux File System            25      616         99.70      0.1701
EMC Celerra via NDMP          1      198         91.18      0.4070
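One way to read the table: a 99.57% commonality factor means that, on an average night, only 0.43% of the scanned Windows file-system data is actually new to the grid. The short calculation below simply restates the rows above as new-data fractions and a backup-count-weighted average commonality; it is our own illustration, not Avamar reporting output.

# Rows copied from the table above: (plugin, clients, backups, avg % common, avg GB new)
rows = [
    ("Windows File System",  33,  901, 99.57, 0.2192),
    ("Solaris Oracle RMAN",  36,  565, 84.82, 0.0571),
    ("Solaris File System",  55, 1512, 98.87, 0.3735),
    ("Linux Oracle RMAN",     6,  363, 82.62, 0.0218),
    ("Linux File System",    25,  616, 99.70, 0.1701),
    ("EMC Celerra via NDMP",  1,  198, 91.18, 0.4070),
]

for name, clients, backups, pct_common, gb_new in rows:
    # The commonality factor is the share of scanned data already on the grid;
    # only the remainder has to be transmitted and stored.
    print(f"{name:<22} new-data fraction {100 - pct_common:5.2f}%  "
          f"avg {gb_new:.4f} GB new per backup")

# Backup-count-weighted average commonality across all plugins.
total_backups = sum(r[2] for r in rows)
weighted = sum(r[2] * r[3] for r in rows) / total_backups
print(f"Weighted average commonality: {weighted:.2f}%")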
The total backup window was reduced from 2-3 days for a full backup to 4-5 hours.
Typical file system backups complete in less than 15 minutes; all file systems are backed up nightly within 1.5 hours of starting.
2.5 TB of NAS files are backed up nightly in just over 4 hours.
Our smaller databases are backed up within minutes, while our larger databases can take around 5 hours depending on the speed of the server.
Restores of single files can be completed within seconds; larger database restores can take several hours, but are still significantly faster than what we were getting with tape.
Replication/Duplication Times: all backups are replicated daily to our secondary data center in less than 2 hours total.
Network Loads: with commonality factors typically better than 99%, very little data is transferred daily over our network. There is no need for a private backup network; backups and replication are easily done over long distances on standard IP connections.
Server Load: server load is typically minimal, but it is important to have enough memory for the server to do the daily de-dupe hash calculations. If a client has millions of small files, you need to ensure enough system memory: the general rule is that, for every one million files, Avamar requires 512 MB of physical RAM (a quick sizing calculation follows below).
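The 512 MB per million files rule of thumb translates into a quick sizing check; the sketch below is just that arithmetic applied to a hypothetical client (the file count is made up for illustration).

def client_ram_needed_gb(file_count, mb_per_million_files=512):
    """Rule of thumb quoted above: ~512 MB of physical RAM per one million client files."""
    return file_count / 1_000_000 * mb_per_million_files / 1024

# Hypothetical client with 4.5 million small files.
print(f"{client_ram_needed_gb(4_500_000):.2f} GB of RAM suggested")   # ~2.25 GB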
What is the return on our investment with Avamar at EMU?
We have drastically reduced our backup times
Reduced CPU time spent on backups and eliminated resource contention
Reduced the demand on the network
Been able to recover disk stores previously used solely for backup
We are more confident in the integrity and recoverability of daily backups, and of the replicated copies
No longer require extra staff to handle tapes, and have eliminated the risk of tapes being lost or stolen
We can more easily meet legal and regulatory requirements for data retention and recovery
We are confident we have a scalable architecture that will grow with our data