Planning for an Amanda Disaster Recovery System




Bernd Harmsen
bjh@datasysteme.de
www.datasysteme.de
22nd April 2003

Contents

1 Introduction
  1.1 Why do we need a specialized Amanda Disaster Recovery System?
2 Goals
3 Disaster recovery with native tools and possible optimizations
  3.1 Provide working Hardware and Emergency System
  3.2 Restore a Linux-Backup-Client
  3.3 Restore a Linux-Backup-Server
  3.4 Make the System bootable
4 Starting points for optimization
  4.1 Essential Backup Tool
    4.1.1 Easy Amanda Database export / import
  4.2 Specialized Amanda Recovery System on CD
    4.2.1 Remote Access
    4.2.2 Full automatic partitioning, formatting and mounting
    4.2.3 Amrestore Scripts

1 Introduction

This document was written to provide information about how to do a disaster recovery with Amanda and to plan a specialized disaster recovery system for Amanda. We (ds-datasysteme) are a small company specialized in Linux networks, and we provide Amanda backup systems to our customers. We think that Amanda is a great backup tool: very fast, reliable and with low hardware requirements.

But we also think that Amanda is lacking some features for recovery. Recovery is more complicated than backup. This is normal, because during a recovery you have to deal with an undefined, unknown situation. (E.g. a customer who wants to get some files back normally only knows parts of the filename.) But this is OK.

The real problem for us is the case of a disaster recovery. When the hard disk of an important server is broken (or the server is completely lost), costs are high, time is short and customers are impatient. For this we need a more secure, reliable and fast way to get the system working again.

We would like to create a specialized Amanda Disaster Recovery System, maybe together with other members of the Amanda community, or to participate in an existing system. We would like to publish this system under the GPL or a similar license.

You can download current versions of this document and our Amanda add-ons from www.datasysteme.de. Simply follow the link Download -> Linux.

1.1 Why do we need a specialized Amanda Disaster Recovery System?

Because the disaster recovery process as described in Chapter 3 is too complicated (less reliable because of human errors) and too slow. A disaster recovery consists of many different steps that all need time and care. On the other hand, there are customers who want their server back. The following timesheet shows what we think is the maximum time we have for a disaster recovery:

0.0h  A server fails.
0.5h  The customer calls the support. A member of the support team has a diagnostic talk with the customer and packs some hardware for replacement.
1.5h  Now the support is on the way to the customer.
2.0h  The support member has arrived at the customer, analyzes the problem and repairs the system.
3.0h  The hardware is working again. Now the support member starts to recover the data from the Amanda backups. For this we plan:
        1.5h  Active work with the recovery tools.
        2.0h  Data transport over the network.
6.5h  The system is mostly working again.
8.0h  The system is well tested. All the small problems that came up are solved.

As you can see, it takes a whole working day to get the system up and running again. This is very long, and we should try to save time at some points. But this timesheet is also optimistic. We think that it is hard to meet its deadlines without a specialized disaster recovery system. It assumes that the support worker makes no major errors. With a less trained worker it can even take 16 hours.

2 Goals

What are the goals of a specialized Amanda Disaster Recovery System?

1. Make the disaster recovery easier and more reliable (less affected by human errors).
2. Make the disaster recovery faster.

3 Disaster recovery with native tools and possible optimizations

This section describes how a disaster recovery can be done without a specialized system. It uses only the installation media of a Debian GNU/Linux 3.0 system and the Amanda backup. The concept is to install a separate minimal Debian system on its own partition and use it to restore the original partitions.

This section has two intentions:

1. Provide a step-by-step guide for a disaster recovery. You can use it as a guide, but the procedure is not well tested, because I wrote it down after my last disaster recovery. Feel free to send me corrections and suggestions.

2. Show how complicated and time-consuming a disaster recovery can be and find some good points to start optimization. This is the main goal. The described way is too time-consuming and too complicated for a stressful situation with an impatient customer behind you. So we would like to build, or participate in, a more optimized and automated disaster recovery system.

3.1 Provide working Hardware and Emergency System

1. Provide working hardware.

2. Plan the partition table.
   In addition to the partitions for the system you want to recover (destination system), you must provide a partition for the emergency system. Put this partition at the beginning of the table and give it e.g. 300MB. You need a backup of all your partition tables for that.
   Possible optimization: Full automatic partitioning, initialization and mounting (see 4.2.2).

3. Install a Debian base system.
   Use your normal Debian installation method/media to install a base system on the additional partition. We will use this as the emergency system. Create the partitions as planned above, but only initialize and mount the partition for the emergency system. Install the following additional packages:
     Amanda: amanda-client, amanda-server, tar, dump
     Remote access: ssh, isdnutils-base, ipppd
   Possible optimization: Use a specialized Amanda Recovery System on a bootable CD (see 4.2).

4. Boot the emergency system.

5. Configure the IP network manually using ifconfig and route.

6. If you need remote access, e.g. for assistance from your office, configure ipppd manually.
   Possible optimization: Provide good defaults for the isdn config files (see 4.2.1).

7. Initialize and mount the destination partitions.
   Possible optimization: Full automatic partitioning, initialization and mounting (see 4.2.2).

   (a) Initialize the swap partition.
           mkswap <DEVICE>
   (b) Initialize the destination filesystem partitions. Initialize ext2 filesystems with the following command:
           mke2fs /dev/<device>
   (c) Mount the destination partitions. Compose the destination partitions under the mountpoint /mnt. Use the following steps for that:
       i. Mount the destination / partition under /mnt.
              mount /dev/<device> /mnt
       ii. Create mountpoints for the other partitions in the destination / filesystem, e.g. /var, /home, /groups, /usr.
              mkdir /mnt/<mountpoint>
       iii. Mount all other destination partitions.
              mount /dev/<device> /mnt/<mountpoint>

8. Set the correct date and time.
       date <MMDDhhmmYYYY>

3.2 Restore a Linux-Backup-Client

Use this section if you have a working Amanda backup server and want to restore a Linux backup client. Now we restore the data from our backup server to the inactive destination system. For each partition we first restore the last level 0 backup and then the last backup of each higher level.

1. Get root permissions.
       su

2. Go to the top-level directory of the selected destination partition.
       cd /mnt/<mountpoint>

3. Run amrecover.
       amrecover <CONFIG> -s <BACKUP-SERVER> -t <BACKUP-SERVER>

4. Set the source partition.
       sethost <NAME>
       setdisk <MOUNTPOINT>

5. Select all files and directories.
       add *

6. Verify the list of files marked for extraction. Note which tapes are needed.
       list

7. Note the number of the archive you need on each tape.
       history
   You will see lines like:
       201-2002-03-06 0 ds-daily4 8
   The last column shows the number of the archive and the second to last the name of the tape. You need all tapes listed since the last level 0 backup.

8. Start the restore.
       extract

9. Verify that the destination directory shown is correct.

10. Load the tape and wind to the beginning of the archive.
    (a) Log in on the Amanda backup server.
    (b) Load the tape requested by amrecover. Wait until the streamer is quiet again.
    (c) Wind to filemark X. Attention: X = archive-number - 1
            mt --file=/dev/<device> rewind
            mt --file=/dev/<device> fsf <X>
    (d) Wait until you get the next prompt.

11. Confirm to amrecover on the backup client that the correct tape is loaded.
        Load tape <NAME> now
        Continue? [Y/n]: Y

12. Wait until the restoration finishes.

13. Confirm the restoration of the original permissions of the top-level directory.
        set owner/mode for '.'? [yn] y

14. If amrecover wants another tape, proceed with step 10.

15. Leave amrecover.
        quit

16. Proceed with step 2 to restore the next partition.

3.3 Restore a Linux-Backup-Server

Use this section if your Amanda backup server itself is defective. Because the backup server has failed, there is no Amanda database and you cannot use amrecover. So we restore each partition with the less comfortable tool amrestore. You must find out manually which tapes and which archive-numbers you need for recovery.

1. Find out the tapes and archive-numbers.
   For each destination partition you need the last level 0 backup and the last backup of each higher backup level. But where are they?
   If you have a current export of your Amanda database, you can find this information very easily. Simply look in the file to see which tapes and archive-numbers are listed for the wanted partition.
   If you do not have a current copy of this database, you can find this information manually in the e-mails you have received from amverify in the past. But this is much more complicated. Here is an example:
   Below you find an extract from different amverify e-mails. Each e-mail shows the content of one tape. The last number of each entry shows the backup level, and the position of the "Checked ..." line (counted from the top) gives the number of the archive on the tape.
   In the example we want to restore the /home partition of our backup server amun. We start with the last level 0 backup in archive-number 11 on tape ds-daily4. After that we have to restore the last level 1 backup in archive-number 10 on tape ds-daily7. There is no level 2 backup, so we need only two tapes.

    Date: Wed, 5 Mar 2003 12:51:21 +0100
    Subject: ds-daily AMANDA VERIFY REPORT FOR ds-daily4

    Using device /dev/nst0
    Volume ds-daily4, Date 20030305
    Checked upuaut.datasys._boot.20030305.0
    Checked inpu.datasys._boot.20030305.0
    Checked amun.datasys. ra.datasys_e$.20030305.1
    Checked amun.datasys. aset.datasys_e$.20030305.1
    Checked amun.datasys. ra.datasys_d$.20030305.1
    Checked inpu.datasys._var_lib.20030305.0
    Checked amun.datasys._usr.20030305.0
    Checked amun.datasys. djhuti.datasys_e$.20030305.0
    Checked amun.datasys. djhuti.datasys_f$.20030305.1
    Checked inpu.datasys._var.20030305.3
    Checked amun.datasys._home.20030305.0

    Date: Thu, 6 Mar 2003 12:59:49 +0100
    Subject: ds-daily AMANDA VERIFY REPORT FOR ds-daily6

    Using device /dev/nst0
    Volume ds-daily6, Date 20030306
    Checked amun.datasys._usr.20030306.1
    Checked inpu.datasys._boot.20030306.1
    Checked upuaut.datasys._.20030306.1
    Checked upuaut.datasys._boot.20030306.1
    Checked amun.datasys._.20030306.1
    Checked inpu.datasys._var_lib.20030306.1
    Checked upuaut.datasys._var.20030306.1
    Checked amun.datasys. aset.datasys_e$.20030306.1
    Checked inpu.datasys._.20030306.1
    Checked amun.datasys. djhuti.datasys_f$.20030306.1
    Checked amun.datasys. ra.datasys_e$.20030306.1
    Checked amun.datasys._var.20030306.1
    Checked amun.datasys. ra.datasys_c$.20030306.1
    Checked amun.datasys. aset.datasys_c$.20030306.1
    Checked amun.datasys. djhuti.datasys_c$.20030306.1
    Checked amun.datasys. aset.datasys_d$.20030306.1
    Checked amun.datasys. djhuti.datasys_e$.20030306.1
    Checked inpu.datasys._var.20030306.0
    Checked amun.datasys. ra.datasys_d$.20030306.0
    Checked amun.datasys. djhuti.datasys_d$.20030306.0
    Checked amun.datasys._home.20030306.1

    Date: Fri, 7 Mar 2003 13:41:35 +0100
    Subject: ds-daily AMANDA VERIFY REPORT FOR ds-daily7

    Using device /dev/nst0
    Volume ds-daily7, Date 20030307
    Checked inpu.datasys._boot.20030307.1
    Checked amun.datasys._usr.20030307.1
    Checked upuaut.datasys._.20030307.1
    Checked upuaut.datasys._boot.20030307.1
    Checked amun.datasys._.20030307.1

    Checked inpu.datasys._var_lib.20030307.1
    Checked upuaut.datasys._var.20030307.2
    Checked amun.datasys. ra.datasys_d$.20030307.1
    Checked inpu.datasys._.20030307.1
    Checked amun.datasys._home.20030307.1

   Possible optimization: Provide an easy export/import mechanism for the Amanda database, so that amrecover can be used here (see 4.1.1).

2. TAR or DUMP?
   For each partition you must find out whether the backup was made using tar or dump. You find this information in your Amanda disklist file (e.g. /etc/amanda/<config>/disklist), if you have a separate backup of it.
   Possible optimization: Provide an Essential Backup tool that stores such information in a separate backup (see 4.1).

3. If you do not have root permissions in the emergency system, get them now.
       su

4. Restore the destination partitions.
   (a) Change to the top-level directory of the destination partition.
           cd /mnt/<mountpoint>
   (b) Insert the correct tape.
   (c) Wind to filemark X. Attention: X = archive-number - 1
           mt --file=/dev/<device> rewind
           mt --file=/dev/<device> fsf <X>
   (d) Run amrestore.
       For DUMP backups:
           amrestore -p /dev/<device> <HOSTNAME> <MPOINT>$ | restore -rv -b2 -f-
       For TAR backups:
           amrestore -p /dev/<device> <HOSTNAME> <MPOINT>$ | tar -xvpmi -f- --ignore-failed-read --same-owner
       Possible optimization: Provide simple scripts that run these nasty commands (see 4.2.3).
   (e) If there are more backup levels for this partition, proceed with step 4b.
   (f) If there are more partitions, proceed with step 4a.

3.4 Make the System bootable

1. Change / to the destination system.
   With this command the destination system becomes the active system. You can mostly use it as if you had booted it.
       chroot /mnt

2. Make sure that /proc is an empty directory.
   /proc is a virtual file system provided by the kernel. During the restore process it may have been restored with its contents, but it should only be a mountpoint.
       rm -f /proc/*

3. Check /etc/fstab.
   Does the fstab conform to the new partition table?

4. Check /etc/lilo.conf.
   Do the parameters root and boot conform to the new partition table?
       root =   Device that contains the / partition (e.g. /dev/sda2).
       boot =   Device that should contain the bootsector (e.g. /dev/sda).

5. Write a new bootsector.
       liloconfig

6. Exit the chroot.
       exit

7. Boot the restored destination system.
       shutdown -r now

8. That's all.

4 Starting points for optimization

This part shows the possible targets for optimization, extracted from chapter 3. At the moment this is more a brainstorming than a detailed concept. We would like to read your ideas about it.

4.1 Essential Backup Tool

This little script should collect all the essential information that is needed in case of a disaster recovery and store it in one or more safe places apart from the normal backups. It can be installed on all Linux hosts and started by (ana)cron, e.g. once a week. The information we consider essential is:

- Configuration (/etc/*, incl. the full Amanda config)
- Partition table
- Installed packages (dpkg --get-selections)
- Amanda database (only on the backup server, amadmin <CONFIG> export)

There are plans to provide the following ways to save this information:

- on a local floppy disk.
- by GPG-encrypted e-mail.
- by sftp or ftp.

This script is partly ready and in production use. It is part of our package dslinuxskripte, which you can download from www.datasysteme.de. Simply follow the link Download -> Linux.
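To make the idea of the essential backup more concrete, here is a minimal shell sketch of such a collector. It is not the script from dslinuxskripte; the function name essential_backup, the output file names and the argument order are assumptions made only for this illustration. Tools that are not installed are simply skipped, and the partition listing is only meaningful when the function runs as root.

```shell
#!/bin/sh
# Hypothetical sketch of an "essential backup" collector.
# Usage: essential_backup <output-dir> [config-dir] [amanda-config]
essential_backup() {
    out=$1
    etc_dir=${2:-/etc}        # configuration directory to save
    amanda_config=${3:-}      # set only on the backup server

    mkdir -p "$out" || return 1

    # 1. Configuration, including the Amanda config stored below /etc.
    tar -czf "$out/etc.tar.gz" -C "$etc_dir" . 2>/dev/null ||
        echo "warning: some files in $etc_dir could not be read" >&2

    # 2. Partition table listing (needs root to see the disks).
    #    "sfdisk -d <disk>" would give a dump that can be fed back to sfdisk.
    if command -v sfdisk >/dev/null 2>&1; then
        sfdisk -l > "$out/partition-table.txt" 2>/dev/null || true
    fi

    # 3. Installed packages (Debian).
    if command -v dpkg >/dev/null 2>&1; then
        dpkg --get-selections > "$out/packages.txt" 2>/dev/null || true
    fi

    # 4. Amanda database, only when a config name was given.
    if [ -n "$amanda_config" ] && command -v amadmin >/dev/null 2>&1; then
        amadmin "$amanda_config" export > "$out/amanda-db.txt" || true
    fi
}
```

A weekly cron job could call it as, for example, essential_backup /var/backups/essential /etc ds-daily and then copy the resulting directory to a floppy disk or send it away by encrypted e-mail. The target path and the config name ds-daily are only examples.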

4.1.1 Easy Amanda Database export / import

Provide a way to use amrecover even if the backup server has failed. For this we need an easy import of the Amanda database from the last essential backup. If there are problems with that, we can provide a script that extracts the information about tapes and archive-numbers from an Amanda database and optionally calls amrestore (see 4.2.3).

4.2 Specialized Amanda Recovery System on CD

Provide a bootable emergency system on CD that contains:

- a base system
- all necessary tools
- some scripts to make disaster recovery easier.
- an import function for the essential backups.
- maybe a kind of GUI where you only select the name of the host you want to restore and everything else runs automatically. But we think this is a lot of work and should be delayed to a later step.

4.2.1 Remote Access

Provide good call-in defaults for the isdn config files device.ippp0 and ipppd.ippp0. The support worker should only have to load the correct kernel module and change the MSN. With this feature a less trained worker can start the disaster recovery system, and someone in the main office can proceed or assist.

4.2.2 Full automatic partitioning, formatting and mounting

For this we can write a script that reads all necessary information from the essential backup of the selected host (see 4.1) and automatically:

- partitions the hard disk(s).
- initializes the partitions with the correct filesystem or swap.
- mounts the partitions for disaster recovery.

4.2.3 Amrestore Scripts

Provide little scripts (e.g. amrestoredump and amrestoretar) that run the following nasty amrestore commands on the backup server in cases where we cannot use amrecover. But maybe this can/should be more automatic.

For DUMP backups:
    amrestore -p /dev/<device> <HOSTNAME> <MPOINT>$ | restore -rv -b2 -f-

For TAR backups:
    amrestore -p /dev/<device> <HOSTNAME> <MPOINT>$ | tar -xvpmi -f- --ignore-failed-read --same-owner

E.g.:
    amrestoretar <DEVICE> <HOSTNAME> <MPOINT>
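As a rough illustration of how such wrapper scripts might look, the following shell sketch combines the tape positioning rule (X = archive-number - 1) with the two amrestore pipelines. The function names filemark_for and restore_plan and the argument order are assumptions for this example only. The sketch deliberately prints the commands instead of running them, so that they can be checked before a tape is touched.

```shell
#!/bin/sh
# Hypothetical helper: number of filemarks to skip for a given archive-number.
filemark_for() {
    echo $(( $1 - 1 ))
}

# Hypothetical helper: print the commands for one restore step.
# Usage: restore_plan <tar|dump> <device> <hostname> <mountpoint> <archive-number>
restore_plan() {
    kind=$1 device=$2 host=$3 mpoint=$4 archive=$5

    # Choose the unpack command as given in the text above.
    case $kind in
        tar)  unpack='tar -xvpmi -f- --ignore-failed-read --same-owner' ;;
        dump) unpack='restore -rv -b2 -f-' ;;
        *)    echo "unknown backup type: $kind" >&2; return 1 ;;
    esac

    # Position the tape, then stream the archive into the unpack command.
    echo "mt --file=$device rewind"
    echo "mt --file=$device fsf $(filemark_for "$archive")"
    echo "amrestore -p $device $host '$mpoint\$' | $unpack"
}
```

For the /home example from section 3.3, calling restore_plan tar /dev/nst0 amun /home 11 prints the rewind command, the fsf command for filemark 10 and the amrestore pipeline. The printed lines have to be run from the top-level directory of the destination partition, because both tar and restore unpack into the current directory.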