1 AFS Usage and Backups using TiBS at Fermilab Presented by Kevin Hill
2 Agenda History of AFS at Fermilab Current AFS usage at Fermilab About Teradactyl How TiBS and TeraMerge work
3 AFS History at FERMI Original Mission Pilot program started on FNALU (the central Unix cluster) Used for general purpose storage Many multi-flavored smaller systems FNALU to be Restricted to physics applications AFS seen as a method to share physics data site-wide and to other HEP labs that were beginning to use AFS
4 AFS History at FERMI Quickly the mission changed Growth of FNALU general user base due to migration off VMS FNALU became a general purpose UNIX cluster with a focus on Physics analysis Need for more CPU changed the system spec to a few large multi-CPU machines
5 The Last Few Years Usage increase Introduction of Linux a large factor Introduction of a stable Windows client provides a method for file sharing between Windows and UNIX Many more central web servers hosted from AFS Scientific users finally appreciate a single login environment and password
6 Current Usage Control Room Logbook Log archive for k5 and w2k logs UNIX/Windows/Mac file-sharing UPS product and patch distribution 6/2002 3/2003 Products (+16%)
7 Current Usage Central web servers keep pages in AFS Allows web authors (~900) to edit content directly w/o accounts on the central web servers Main WebServer Departmental/ Experiment Pubs 6/ / (+7%) 2 (+100%)
10 Statistics 6/2002 3/2003 Disk in Use Volumes Served 1.2TB TB (+25%) (+7%)
11 AFS Currently Sparc Servers SAN Based Storage 3 Database Servers 6 File Servers 1 Backup Server Running OpenAFS
12 Current Usage 4.19 TB total storage 1.42 TB in use Volumes 3667 AFS users
13 Backups Original backup system 5 DLT 7000's Operator assisted tape mounts Scripts using native afs backup utilities Window approaching 40+ hours for Full backup
14 Backups Current System 4 AIT3 drives 120 slot tape library Using TiBS Software from Teradactyl 4-5 hour backup window Now includes over 180 non-afs client machines Full over network rarely needed
15 About Teradactyl Founded in 1999 Located in Albuquerque, New Mexico Company develops advanced backup and recovery solutions Support for AFS, Linux, Mac OS, UNIX, and Windows Primary product line: True incremental Backup System (TiBS) Patented TeraMerge technology Primary business focus is education and government. DOE sole source for AFS backup solution at Fermi National Accelerator Lab.
16 Traditional Level Backup Approach Uses multiple levels 0-9, with each level indicating the extent of the backup 1. Level 0 = Full backup, all data on a client is backed up 2. Higher level backups send data that has either been changed or created since the most recent backup at a lower level Factors involved in determining a backup strategy size of data rate of data change network bandwidth
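The level rule above can be sketched in a few lines of Python. This is only an illustration of the traditional scheme as described on the slide, not of any particular backup product; the function name `reference_backup` and the sample dates are hypothetical.

```python
from datetime import date


def reference_backup(history, level):
    """Return the most recent earlier backup taken at a lower level.

    history: list of (date, level) tuples in chronological order.
    A new backup at `level` sends everything changed or created since
    the backup returned here. Returns None for a level 0 (full) backup
    or when no lower level backup exists yet.
    """
    for taken_on, taken_level in reversed(history):
        if taken_level < level:
            return (taken_on, taken_level)
    return None


# Hypothetical history: one full, two dailies, one weekly, one daily.
history = [
    (date(2003, 3, 2), 0),
    (date(2003, 3, 3), 2),
    (date(2003, 3, 4), 2),
    (date(2003, 3, 9), 1),
    (date(2003, 3, 10), 2),
]

print(reference_backup(history, 2))  # the weekly level 1 backup
print(reference_backup(history, 1))  # the level 0 full backup
print(reference_backup(history, 0))  # None: a full backup has no reference
```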
17 Example: 3 level backup (4 week cycle) Level 0: Full backup every four weeks Level 1: Weekly incremental backup Level 2: Daily incremental backup [Figure: typical network bandwidth over a four week period; axes: bandwidth used vs. time]
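A rough way to see the bandwidth profile of this cycle is to simulate it. The sketch below assumes a constant full size, a constant amount of new data per day, and that each day's changes touch different files; the sizes (1000 GB full, 10 GB/day) are made-up illustration values, not Fermilab figures.

```python
def level_for_day(day):
    """Level 0 every 28 days, level 1 weekly, level 2 on all other days."""
    if day % 28 == 0:
        return 0
    if day % 7 == 0:
        return 1
    return 2


def transfer_sizes(days, full_gb, daily_change_gb):
    """Return (day, level, gigabytes sent over the network) for each day."""
    sizes = []
    last_at_level = {}  # level -> day of the most recent backup at that level
    for day in range(days):
        level = level_for_day(day)
        if level == 0:
            size = full_gb
        else:
            # Send everything changed since the most recent lower level backup.
            ref_day = max(d for lvl, d in last_at_level.items() if lvl < level)
            size = (day - ref_day) * daily_change_gb
        last_at_level[level] = day
        sizes.append((day, level, size))
    return sizes


sizes = transfer_sizes(28, full_gb=1000, daily_change_gb=10)
total = sum(size for _, _, size in sizes)
print(sizes[0], sizes[7], sizes[8])
print("total GB over four weeks:", total)
```

Under these assumptions the full backup dominates the first day, the weekly level 1 backups grow through the cycle, and the daily level 2 backups reset each week.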
18 What's wrong with the traditional approach? Tradeoff between backup and restore 1. Fewer levels make restores faster, but increase the impact of backups on networks and clients. 2. More levels reduce the impact of backups, but increase restore time, complexity, and exposure to tape failure. Keeping multiple copies of unchanging data is inefficient Periodic full backups are taken when only a percentage of the data has changed. Incremental backups continue to take copies of changed data even after it has already been backed up.
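The restore side of this tradeoff can be illustrated by counting which backups a restore has to read under the traditional scheme. This is a minimal sketch; `restore_chain` and the backup labels are hypothetical names used only for this example.

```python
def restore_chain(history):
    """Return the backups a full restore must read, oldest first.

    history: list of (label, level) tuples in chronological order.
    Walking backwards from the newest backup, a backup is needed only
    if its level is lower than every level already collected; the walk
    stops at the level 0 full backup.
    """
    chain = []
    needed_below = None
    for label, level in reversed(history):
        if needed_below is None or level < needed_below:
            chain.append((label, level))
            needed_below = level
            if level == 0:
                break
    chain.reverse()
    return chain


three_level = [("full", 0), ("day1", 2), ("day2", 2),
               ("week1", 1), ("day8", 2), ("day9", 2)]
two_level = [("full", 0), ("day1", 1), ("day2", 1), ("day9", 1)]

print(restore_chain(three_level))  # full, week1, day9: three volumes to read
print(restore_chain(two_level))    # full, day9: two volumes to read
```

More levels mean more volumes in the chain, and the loss of any one of them breaks the restore.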
19 TeraMerge Minimizes the time required for daily network backups TiBS clients only send file changes since the last successful backup. Produces lower level backups without network intervention TiBS reuses data to generate new backup volumes. This process eliminates the need to copy files again from backup clients. Creates full backup volumes entirely offline A TiBS server uses the most recent backup data to create a current full backup image.
20 Incremental Network Merge Backup Diagram This process copies changes since the last successful network backup (Full or Incremental) from a backup client volume into the backup server cache. These changes are merged with previous incremental data to create a new incremental backup volume. The new volume represents all of the current changes since the last lower level backup. Incremental Network Merge Backups can be taken in parallel from several backup clients to a backup server disk. As each backup completes, the data is streamed efficiently to tape from the backup server cache. By reusing data in the backup server disk cache, this process reduces the workload that daily backups place on networks and backup clients.
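The merge step described above can be modeled with plain dictionaries. This is a conceptual sketch only: it assumes a backup volume can be represented as a mapping from path to file version, with `None` marking a deleted file. It does not reflect the actual TiBS volume format, and `merge_incremental` is a hypothetical name.

```python
DELETED = None  # assumed marker for a file removed on the client


def merge_incremental(previous_incremental, new_changes):
    """Merge the newest client changes into the cached incremental data.

    previous_incremental: changes accumulated since the last lower
        level backup (already in the backup server cache).
    new_changes: changes since the last successful network backup.
    The result holds all current changes since the last lower level
    backup; newer entries replace older ones.
    """
    merged = dict(previous_incremental)
    merged.update(new_changes)
    return merged


cached = {"a.txt": "v2", "b.txt": "v1"}
from_client = {"a.txt": "v3", "b.txt": DELETED, "c.txt": "v1"}

new_incremental = merge_incremental(cached, from_client)
print(new_incremental)
```

Only `from_client` crosses the network; the rest of the new incremental volume comes from data the server already holds.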
21 Flush Merge Backup Diagram This process takes cumulative data from a higher level Flush Merge or Incremental backup in the backup server disk cache and merges it with the previous Flush Merge backup at this level. The merged data is sent to the backup server cache. As individual backup volumes are completed they are efficiently streamed to tape. If there are no pending lower level backups, the data is removed from the backup cache once it has been written to tape. The Flush Merge process may be performed in parallel from multiple tape devices. Additional levels of Flush Merge backups allow a backup server to support larger amounts of data.
22 Full Merge Backup Diagram This process takes cumulative data from a higher level Flush Merge or Incremental Backup in the backup server disk cache and merges it with the previous Full Backup tape volume. The merged data is sent to the backup server cache. As individual backups are completed they are efficiently streamed to tape. The data is removed from the backup cache once it has been written to tape. This process may be performed in parallel from multiple tape devices.
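Building a new full backup offline can be sketched the same way. The example assumes a volume is a mapping from path to file version and that cumulative changes use `None` to mark deleted files; both are modeling assumptions for illustration, not the TiBS on-tape format, and `merge_full` is a hypothetical name.

```python
def merge_full(previous_full, cumulative_changes):
    """Produce a current full image without contacting the backup client.

    previous_full: contents of the previous full backup volume.
    cumulative_changes: all changes since that full backup, taken from
        the backup server cache; None marks a deleted file.
    """
    new_full = dict(previous_full)
    for path, version in cumulative_changes.items():
        if version is None:
            new_full.pop(path, None)  # deleted files leave the new full
        else:
            new_full[path] = version  # new and changed files replace old ones
    return new_full


old_full = {"a.txt": "v1", "b.txt": "v1", "d.txt": "v1"}
changes = {"a.txt": "v3", "b.txt": None, "c.txt": "v1"}

new_full = merge_full(old_full, changes)
print(new_full)
```

The unchanged file `d.txt` is carried over from the old full volume rather than copied again from the client.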
23 Key Advantages of TeraMerge Reduces loads on networks and backup clients Reduces number of tapes required for restores Reduces average restore times Recovers from single tape failures on lower level backups Current incremental data is mirrored in the cache Lower tape cost than tape mirroring and tape striping.
24 TiBS Backup Features TiBS only needs to back up new and changed files No periodic network full backups are required or recommended! Automatic verification of tapes as they are merged Back up sub-directories and not just entire file systems Single file backup capability Omit rules to eliminate unnecessary data backup Windows registry and security data UNIX special files are supported by TiBS Disk caching allows parallel backup processing TeraMerge of lower level off-line backups Automated generation of mirrored volumes Status notification can be sent to different administrators based on configuration Flexible and customizable reporting
25 TiBS Restore Features Efficient single pass restore Restores only versions of files required Restore incremental data from disk cache Search for any file on any tape Recover backup server from tape Redirect restore to any other TiBS client Data checksums ensure data is restored correctly Can restore data while continuing with backups Does not restore data which was intentionally deleted
26 TiBS Archive Features Long-term storage managed by site tape retention policies. Tape pools can have individual tape retention policies. Original files can be retained or manually deleted by backup administrator. A reference of every file is automatically maintained by the TiBS File Lookup Database (FLDB) for easy location. Archive tapes can be managed in separate tape library for easy retrieval. Tape scan utilities can check integrity of stored tape volumes over time. Archive tapes are produced by taking a final incremental from a backup client and then generating a final consolidated full backup from data already on the backup server.
27 TiBS Centralized Solution for AFS Generates new full and lower level backups offline Detects and reports corrupted volumes and orphaned vnodes Reports discrepancies between vldb and file servers Uses UNIX backup client to back up vldb and file server critical information Online file lookup database (CMU-ECE 20GB since Jan 2000) Only processes volumes which have been updated/created on a daily basis Supports both IBM-AFS and OpenAFS
28 OS Support Supports a wide variety of Linux OS clients. Backup Servers available in Linux and Solaris operating systems. Linux Backup Server supports AFS, Mac OS, UNIX, and Windows clients. A single server can support Terabytes of data from thousands of partitions. Backup Servers are compatible with the automated tape library interface. Linux tape support for AIT, DLT, SDLT, LTO and other popular formats. Clients available for Windows, Linux, Solaris, SGI, and MacOS
29 Conclusions AFS Good TiBS allows decent frequency of Full Backups without the network overhead Produces real full tapes Easy to use command line tools Needs GUI tools