Website Disaster Recovery
Contents

Overview
  Disaster Preparedness for the Internet Age
  Some Fundamental Questions
Planning Your Recovery
  Start with a Backup Plan
  Backup Commandments
Types of Recovery
  Static (Manual) Recovery
  Static (Automatic) Recovery
  Dynamic (Automatic) Recovery
Automatic Recovery Time
  Best Case
  Worst Case
Spreading Your Risks
Licensing Issues
Where to Go From Here
Overview

Websites and database applications running on the Internet or an intranet are critical resources for many businesses in running their day-to-day operations. If a critical Internet website or intranet server goes down for any length of time, things come screeching to a halt very quickly. Time and money are wasted as sales are lost, workers sit idle, product doesn't ship and access to critical information is denied. Now is a good time to prepare for this by adding a section on Website Disaster Recovery to your Disaster Preparedness and Recovery Plan. Website Disaster Recovery is not about clustering servers for load balancing; it is about having redundant locations for your data and redundant ways to access it.

Disaster Preparedness for the Internet Age

There are so many ways that disasters can happen that one would go crazy trying to come up with a separate plan for each scenario. For websites and database applications it is critical to be able to recover not only the website content but also any underlying data, and to restore access to this information in a timely manner, no matter what has occurred.

Some Fundamental Questions

In deciding on the best course of action for disaster preparedness and recovery, you have to answer some questions.

1. How long can you afford to be down? The answer depends on what type of website and database you are dealing with and how critical your information is to your day-to-day operations.
   a. A personal blog can probably afford to be down a lot longer than an ecommerce site or internal customer service application.
   b. Your down time is inversely proportional to the amount of money you invest to mitigate this risk.
2. What are your Critical and Non-Critical data and files (resources)? Not all data and files are created equal, so you might not treat them the same. You need to make a list of resources and then determine how critical each one is to the running of your business. Here are some examples of resources.
   a. External, customer-facing website based on WordPress
   b. MySQL database for customer website (WordPress database)
   c. External, customer-facing order system (shopping cart)
   d. MySQL database for customer order system
   e. Internal CRM (customer management software)
   f. Database for internal CRM (customer management software)
   g. Internal FAQ / Helpdesk system
   h. Internal order processing system tied to external system via Java

Each of these resources needs to be listed in a Resource Definition Table.

Sample Resource Definition Table

Resource                Critical Level  Failover  Max Recovery  Duplication  External Backup
                        (1-10)          Needed    Time          Frequency    Frequency
Static Website          5               No        4 hours       Daily        Daily
Shopping Cart           10              Yes       10 minutes    Hourly       Daily
Shopping Cart Database  10              Yes       10 minutes    Real Time    2 x Daily
Marketing Blog          5               Yes       2 hours       Hourly       Daily
Intranet                7               Yes       30 minutes    Hourly       Daily
Internal KB             3               No        1 hour        NA           Daily

The times above are examples only; they are not recommended or typical times.

Planning Your Recovery

Once you have a Resource Definition Table you can plan what steps to take for a timely recovery. There are several ways to duplicate your website and critical databases; the complexity depends on how fast you need to be up and running again (Max Recovery Time). No matter what you decide, you should always have a backup plan.
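As a rough illustration, the sample Resource Definition Table can be kept as structured data so that it is easy to sort and query while planning. The sketch below is only one possible representation: the `Resource` class, its field names, and the helper functions are hypothetical, and the values are copied from the sample table rather than being recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """One row of the Resource Definition Table (field names are illustrative)."""
    name: str
    critical_level: int          # 1 (least critical) to 10 (most critical)
    failover_needed: bool
    max_recovery_minutes: int    # Max Recovery Time, converted to minutes
    duplication_frequency: str
    external_backup_frequency: str


# Values taken from the sample table; they are examples, not recommendations.
RESOURCES = [
    Resource("Static Website", 5, False, 240, "Daily", "Daily"),
    Resource("Shopping Cart", 10, True, 10, "Hourly", "Daily"),
    Resource("Shopping Cart Database", 10, True, 10, "Real Time", "2 x Daily"),
    Resource("Marketing Blog", 5, True, 120, "Hourly", "Daily"),
    Resource("Intranet", 7, True, 30, "Hourly", "Daily"),
    Resource("Internal KB", 3, False, 60, "NA", "Daily"),
]


def recovery_order(resources):
    """Most critical first; ties go to the shortest allowed recovery time."""
    return sorted(resources, key=lambda r: (-r.critical_level, r.max_recovery_minutes))


def needs_failover(resources):
    """Names of the resources that were marked as needing failover."""
    return [r.name for r in resources if r.failover_needed]


if __name__ == "__main__":
    for r in recovery_order(RESOURCES):
        print(f"{r.name}: level {r.critical_level}, recover within {r.max_recovery_minutes} min")
```

Sorting by critical level and then by Max Recovery Time gives a simple working order for deciding which resources to plan for first.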
Start with a Backup Plan

How often you back up and how many copies and versions you need really depends on how critical your data is. Having a backup is like having a parachute: when you really need one, it's too late to acquire one. Backups fail and can be undependable, so always have more than one and keep more than one day's worth of data.

Backup Commandments

1. You shall always have a current backup of your website and underlying data.
2. You shall keep more than one daily copy of your backups.
3. You shall keep backup copies in more than one location.
4. You shall make sure you know how to restore your backup.
5. Never depend on someone else to have a backup.

Types of Recovery

Static (Manual) Recovery

Many static (infrequently updated) sites can be recovered by uploading a local copy of your content to a new server and pointing your DNS to the new website location. Your actual recovery time depends on how fast you can set up a new server, upload your site to it, and have DNS point to the new location. This works both for database-driven sites and for sites with HTML content. If your website uses cPanel, you can make a daily backup of the site and simply restore that backup to the new server.

Requirements & Cost
1. Local backup of content
2. Ability to restore content to new (server) location
3. Ability to change DNS to point to new server
4. Low cost, slow recovery

Static (Automatic) Recovery

Automatic recovery for static sites is a little more complicated, as it requires duplicate servers and custom scripts to make sure your website and any databases are duplicated.
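To give a feel for what such custom scripts involve, here is a minimal sketch that only builds the commands a duplication job might run. It assumes the standard `rsync` and `mysqldump` command-line tools are available on the primary server; the host name, paths, and database name are hypothetical placeholders, and the functions return argument lists rather than executing anything.

```python
import shlex


def rsync_command(local_dir, backup_host, remote_dir):
    """Build an rsync command that mirrors local_dir to the backup server.

    -a preserves permissions and timestamps, -z compresses in transit, and
    --delete removes files on the backup that no longer exist locally.
    The trailing slash copies the directory's contents, not the directory itself.
    """
    source = local_dir.rstrip("/") + "/"
    return ["rsync", "-az", "--delete", source, f"{backup_host}:{remote_dir}"]


def mysqldump_command(database, dump_file):
    """Build a mysqldump command that writes one database to a dump file.

    --single-transaction takes a consistent snapshot of InnoDB tables
    without locking them for the duration of the dump.
    """
    return ["mysqldump", "--single-transaction", f"--result-file={dump_file}", database]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    commands = [
        mysqldump_command("shop", "/var/backups/shop.sql"),
        rsync_command("/var/www/example", "backup.example.com", "/var/www/example"),
    ]
    for cmd in commands:
        print(shlex.join(cmd))
```

A scheduled job could pass each list to `subprocess.run(cmd, check=True)` and then load the dump on the backup server; credentials, error handling, and the restore step are deliberately left out of this sketch.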
Requirements & Cost
1. Duplicate servers in separate locations
2. Duplicate website configuration for each server
3. Duplicate MySQL server configuration in each location
4. Custom scripts to copy website content to the backup server
5. Custom scripts to back up databases and restore them to the backup server
6. Server monitoring and DNS switching system
7. Medium cost, giving fast recovery, but the data may not be current

It is fairly easy to provide one or more redundant servers for fairly static websites. The technical difficulty is that a custom script is needed to copy each website and database. Changes to the server (domains added or deleted) require new scripting.

Dynamic (Automatic) Recovery

Providing automatic recovery for dynamic sites is slightly more complex, in that any MySQL databases have to be replicated in real time. This mirroring can be a Master:Slave or Master:Master relationship. The replication, while not instantaneous, is very close to real time. It is designed to be used over both fast and slow connections.

Requirements & Cost
1. Duplicate servers in separate locations
2. Duplicate website configuration for each server
3. Custom scripts to copy website content to the backup server
4. Correctly configured MySQL servers in each location
5. Server monitoring and DNS switching system
6. Slightly higher cost, giving fast recovery using current data

While this option is slightly more complex to set up and maintain, it provides the ability to get back up and running in almost real time.

Automatic Recovery Time
The time it takes to recover is the same for Static (Automatic) Recovery and Dynamic (Automatic) Recovery. There are 4 components involved in recovery time, shown here with some typical values:

1. Monitoring frequency: every X minutes (1-15)
2. Number of retests before a failure is declared: prevents false down times; for example, test 3 times
3. DNS refresh rate: 5 minutes; the actual DNS refresh delay depends on when DNS was last checked
4. DNS propagation rate: less than or equal to the DNS refresh rate

Typical best and worst case recovery times can be calculated as follows.

Best Case:

Monitoring frequency x (# Retests + 1) + 0 DNS refresh delay + 0 DNS propagation rate

For a 5 minute test interval with 3 retests you have 5 x (3+1) + 0 + 0 = 20 minutes
For a 1 minute test interval with 5 retests you have 1 x (5+1) + 0 + 0 = 6 minutes

Worst Case:

Monitoring frequency x (# Retests + 1) + DNS refresh delay + Max DNS propagation rate

For a 5 minute test interval with 3 retests you have 5 x (3+1) + 5 + 5 = 30 minutes
For a 1 minute test interval with 5 retests you have 1 x (5+1) + 5 + 5 = 16 minutes

Spreading Your Risks

It pays not to have all your eggs in one basket. For example, we run our mail server separately from our web server. This way, if our website goes down we still have email. If our mail server goes down, people can still reach our website and get our contact phone number. Here are some options.

1. Have separate servers for separate functions. With today's cloud-based virtual servers it is easy to separate mail, Internet and intranet servers from each other.
2. If possible, locate your servers in different locations. We have virtual server locations in the US (multiple locations), Canada, Europe and Asia.
3. Store backups off site from your server locations. Amazon's S3 service is well suited to daily backups.
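The best and worst case formulas from the Automatic Recovery Time section are easy to check for other settings with a few lines of code. The sketch below is a direct transcription of those formulas; the function and parameter names are made up for illustration, and it assumes, as the worked examples do, that the maximum DNS propagation time equals the DNS refresh rate.

```python
def recovery_time_minutes(monitor_interval, retests, dns_refresh_delay=0, dns_propagation=0):
    """Monitoring frequency x (# retests + 1) + DNS refresh delay + DNS propagation."""
    return monitor_interval * (retests + 1) + dns_refresh_delay + dns_propagation


def best_and_worst_case(monitor_interval, retests, dns_refresh_rate=5):
    """Return (best, worst) recovery time in minutes.

    Best case: no DNS refresh delay and no propagation delay.
    Worst case: a full DNS refresh delay plus a propagation delay assumed
    to be equal to the DNS refresh rate.
    """
    best = recovery_time_minutes(monitor_interval, retests)
    worst = recovery_time_minutes(monitor_interval, retests, dns_refresh_rate, dns_refresh_rate)
    return best, worst


if __name__ == "__main__":
    for interval, retests in [(5, 3), (1, 5)]:
        best, worst = best_and_worst_case(interval, retests)
        print(f"{interval} minute interval, {retests} retests: best {best} min, worst {worst} min")
```

With the values used in the worked examples, a 5 minute interval with 3 retests gives 20 and 30 minutes, and a 1 minute interval with 5 retests gives 6 and 16 minutes.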
Licensing Issues

Many companies use applications that have a license tied to them. Make sure your application license allows running on redundant servers.

Where To Go From Here

We can assist you in designing a recovery plan that fits both your needs and your budget. Contact us for a free consultation.

http://www.active-server.com/support/contact-us.php

We have been providing reliable website hosting, system administration and consulting services since 1997.