Internet Resiliency and Recovery
Scott Hofer, Executive Network Architect, IBM

- IBM Certified Executive Network IT Specialist
- M.S. Telecommunications
- 11 years with IBM Business Continuity and Resiliency Services
  - Network Offering Manager, 7 years
  - Delivery Support for mainframe and open systems, 4 years
- Previously
  - Network administrator, University of Colorado, Boulder
  - Satellite Tracking Station Manager, Defense Mapping Agency
- Patent: System, Method, and Program for Re-Routing Internet Packets
- Email: hofer@us.ibm.com

Abstract

Internet connectivity is a critical network component for businesses. The need to keep employees, end users, and suppliers connected is no longer optional. The technology to prepare your environment for Internet recovery includes these key components: IP addressing and DNS (Domain Name System) resolution. Additionally, you should assess the resiliency of your Internet services both in production and in recovery. Resiliency has many aspects, including inbound and outbound connectivity to the Internet, connections to multiple ISPs, multiple WAN and LAN connections, and critical routing decisions. This session reviews considerations for your Internet environment that address recovery and resiliency.

Agenda
- Why Internet Continuity?
- Internet Resiliency Overview
- Internet Recovery Methodologies Overview
- Preparing Your Environment for IP Redirect

Why Internet Continuity?
Companies have implemented many mission-critical tasks over the Internet:
- e-business
- VoIP
- Suppliers
- Video conferencing
- Customers
- e-meetings
- Business partners
- E-mail
- Remote access VPNs
- Data vaulting / rapid recovery
Internet continuity includes resiliency AND recovery.

Internet Resilience
Internet resiliency supports a business's ability to adapt and respond to risks and opportunities:
- Infrastructure strategy and design
- IT recovery
- High availability

Internet Resiliency Overview
[Figure: resilient Internet front end. Components shown: ISP 1, ISP 2, ISP 3 through ISP N; Internet; Edge Router 1 and Edge Router 2 (BGPv4); VLAN 1 and VLAN 2; Path Selection Device; Distribution Switch 1 and Distribution Switch 2; End Devices. Numbered callouts 1 to 6 mark the elements listed below.]
- Multiple Internet Service Providers (ISPs)
- Physically diverse network pathing (separate sheathing, separate ingress and egress points into the building, diverse POPs)
- Multiple edge routers
- Fully redundant cabling
- Automatic failover
- Redundant distribution layer
- Redundant connections to end devices
- Real-time best path selection
- Real-time load balancing

Resilient Internet Front End
- Can survive all but a catastrophic site-wide outage; the site itself is the only remaining single point of failure.

Internet Recovery
What to consider for Internet recovery options:
- Static IP/DNS
- DNS (Domain Name System) based
- IP address redirection

Internet Recovery: Static IP/DNS
[Figure: the production site (IP address range 1) and the recovery site (IP address range 2) each connect to the Internet through any ISP.]

Internet Recovery: Static IP/DNS Considerations
- Must plan time for the system IP address change
- Use different IP addresses at the recovery site
- IP addresses are more difficult for people to target than hostnames
  - For example, entering 192.168.0.1 into a web browser vs. www.mycompany.com
- Must communicate all IP address changes to:
  - System administrators
  - End users
  - Business partners
  - Vendors
  - Other
- IP addresses typically change back to their original state when going home

Internet Recovery: DNS Based
[Figure: the production site (IP address range 1, primary DNS server) and the recovery site (IP address range 2) each connect to the Internet through any ISP. An alternate DNS server sits at an alternate site (e.g., the recovery site), also reachable through any ISP.]

Internet Recovery: DNS Based Considerations
- Use new IP addresses at the recovery site along with DNS
- The hostnames may stay the same
- Can pre-stage the DNS configuration
- Must communicate all hostname changes to interested parties
- Can create test domains or hosts
  - Example domain: change host.mycompany.com to host.dr.mycompany.com
  - Example host: change host.mycompany.com to drhost.mycompany.com
- Must plan for time to propagate changes worldwide
  - Typically can cover the U.S. in 24 hours
  - International may take up to 72 hours
- Must plan time for the system IP address change
- IP addresses may not be available on a permanent basis; you may need to allow time to acquire, plan, and implement new IP addresses during an outage
- The DNS server must be recovered before this method can be used

Sample DNS Zone File

Production zone file:

    ; Database file COMPANY.COM.dns for company.com zone.
    ; Zone version: 259
    ;
    @ IN SOA dns. administrator. (...
    ; Zone records
    hostname1 A 64.96.43.1
    hostname2 A 64.96.43.2
    hostname3 A 64.96.43.3

Recovery zone file:

    ; Database file COMPANY.COM.dns for company.com zone.
    ; Zone version: 259
    ;
    @ IN SOA dns. administrator. (...
    ; Zone records
    hostname1 A 123.123.123.27
    hostname2 A 123.123.123.28
    hostname3 A 123.123.123.29

Internet Recovery: IP Address Redirect
[Figure: the production site and the recovery site both connect to the Internet through ISP A and use the same IP addresses at both sites.]
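
Pre-staging the recovery zone file described in the DNS-based slides can be scripted. The following is a minimal sketch, assuming the recovery addresses are known in advance and that A records appear one per line as in the sample zone files. The function name `build_recovery_zone` and the mapping `RECOVERY_ADDRESSES` are hypothetical names introduced for illustration; the hostnames and addresses are the ones from the sample.

```python
import re

# Hypothetical host-to-recovery-address map, copied from the sample zone files.
RECOVERY_ADDRESSES = {
    "hostname1": "123.123.123.27",
    "hostname2": "123.123.123.28",
    "hostname3": "123.123.123.29",
}

# Matches simple "<host> A <IPv4 address>" lines as they appear in the sample.
A_RECORD = re.compile(r"^(?P<host>\S+)\s+A\s+(?P<addr>\d{1,3}(?:\.\d{1,3}){3})\s*$")


def build_recovery_zone(production_zone, recovery_addresses):
    """Return the zone text with mapped A records pointing at recovery addresses.

    Lines that are not A records, or whose host is not in the map, pass through
    unchanged.
    """
    out = []
    for line in production_zone.splitlines():
        match = A_RECORD.match(line)
        if match and match.group("host") in recovery_addresses:
            host = match.group("host")
            line = f"{host} A {recovery_addresses[host]}"
        out.append(line)
    return "\n".join(out)


PRODUCTION_ZONE = """\
; Zone records
hostname1 A 64.96.43.1
hostname2 A 64.96.43.2
hostname3 A 64.96.43.3
"""

recovery_zone = build_recovery_zone(PRODUCTION_ZONE, RECOVERY_ADDRESSES)
print(recovery_zone)
```

The sketch only rewrites A records. In practice the SOA serial number would also need to be incremented so that secondary servers pick up the change, and the record TTLs in effect before the switch influence how long resolvers keep returning the production addresses.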

Internet Recovery: IP Address Redirect Considerations
- Eliminates most of the previous issues
  - The static IP/DNS and DNS-based issues revolve mostly around changing IP addresses
- May plan to keep the original hostnames during an outage
- May use the original or a new hostname during a test
- Planning issue: must move a large block of addresses OR use the same ISP at the production site and the recovery site

Preparing for IP Redirect Internet Recovery
Choose a methodology:
- Single homed, same ISP, static routes
- Single homed, same ISP, BGPv4
- Multi-homed, same ISPs, BGPv4
- Multi-homed, different ISPs, BGPv4

Preparing for IP Redirect Internet Recovery: Single homed, same ISP, static routes
- Install the same ISP at both the recovery and production sites
- Create static routes between you and the ISP (both sites)
- Have the ISP change the static route for a test or outage
- Subject to ISP change windows and response time
- Manual failover only
- Flexible down to a 30-bit mask

Preparing for IP Redirect Internet Recovery: Single homed, same ISP, BGPv4
- Install the same ISP at both the recovery and production sites
- Run BGPv4 between you and the ISP
- Full or partial Internet routing tables
- May use a private (or public) ASN (Autonomous System Number)
- Set up BGPv4 to prefer the production site route(s)
- Allows for automatic or manual failover
- Flexible down to a 30-bit mask

Preparing for IP Redirect Internet Recovery: Multi-homed, same ISPs, BGPv4
- Install the same ISPs at both the recovery and production sites
- Run BGPv4 between you and the ISPs
- Full Internet routing tables (requires 256 MB of memory on the router)
- Public ASN required (www.arin.net)
- Work with the ISPs or a continuity service provider to preconfigure BGP filters
- Allows for automatic or manual failover
- Flexible down to a 30-bit mask
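
The idea of setting up BGPv4 to prefer the production site route, with failover to the recovery site, can be illustrated with a toy model. This is a simplified sketch, not BGP best-path selection: the single `preference` number stands in for whatever attribute is actually tuned with the ISP, and `SiteRoute`, `select_route`, and the prefix 64.96.43.0/24 are assumptions introduced for illustration.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class SiteRoute:
    site: str
    prefix: str
    preference: int  # higher wins; stand-in for the real BGP attribute in use
    announced: bool  # False once the site stops advertising the prefix


def select_route(routes: List[SiteRoute], prefix: str) -> Optional[SiteRoute]:
    """Pick the announced route for the prefix with the highest preference."""
    candidates = [r for r in routes if r.prefix == prefix and r.announced]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.preference)


# Both sites advertise the same block; production is preferred.
production = SiteRoute("production", "64.96.43.0/24", preference=200, announced=True)
recovery = SiteRoute("recovery", "64.96.43.0/24", preference=100, announced=True)

normal = select_route([production, recovery], "64.96.43.0/24")

# Outage: the production site stops announcing, so traffic follows the recovery route.
outage = select_route([replace(production, announced=False), recovery], "64.96.43.0/24")

print(normal.site, outage.site)
```

Automatic failover corresponds to the production announcement disappearing on its own when the site goes down; manual failover corresponds to an operator withdrawing it or changing the preference.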

Preparing for IP Redirect Internet Recovery: Multi-homed, different ISPs, BGPv4
- ISP independent at the recovery and production sites
- Run BGPv4 between you and the ISPs
- Full Internet routing tables (requires 256 MB of memory on the router)
- Public ASN required
- Work with the ISPs or a continuity service provider to preconfigure BGP filters
- Allows for automatic or manual failover
- Requires a 24-bit mask or larger

IP Redirect Summary

    Methodology                            Same ISP?  BGP?  Full routing tables?  Public ASN?  Manual or automatic?
    Single homed, same ISP, static routes  Yes        No    No                    No           Manual
    Single homed, same ISP, BGPv4          Yes        Yes   Full or partial       Optional     Either
    Multi-homed, different ISPs, BGPv4     No         Yes   Yes                   Yes          Either
    Multi-homed, same ISPs, BGPv4          Yes        Yes   Yes                   Yes          Either

Ways to Speed Recovery (or Ease the Pain)
- Set up a DHCP server
  - Saves individual device configuration time
  - Eases end-user pain configuring PCs
- Remote access VPNs
  - End users and/or administrators may be:
    - Unable (or unwilling) to fly
    - Sticking out (or stuck in) a regional disaster (e.g., a hurricane)
    - Consoling family
    - In a hotel
- Site-to-site VPNs
  - Can save money vs. traditional telecommunications
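
The mask-size limits on the methodology slides (a 30-bit mask for the same-ISP options, a 24-bit mask or larger block when the ISPs differ) can be checked against a candidate address block with Python's standard `ipaddress` module. This is a minimal sketch for IPv4 blocks; the table `MAX_PREFIX_LENGTH`, the function `usable_methodologies`, and the example blocks are assumptions introduced for illustration.

```python
import ipaddress

# Longest prefix length (smallest block) each methodology can redirect,
# as stated on the methodology slides.
MAX_PREFIX_LENGTH = {
    "Single homed, same ISP, static routes": 30,
    "Single homed, same ISP, BGPv4": 30,
    "Multi-homed, same ISPs, BGPv4": 30,
    "Multi-homed, different ISPs, BGPv4": 24,
}


def usable_methodologies(block):
    """Return the methodologies whose mask limit admits the given IPv4 block."""
    network = ipaddress.ip_network(block, strict=True)
    return [
        name
        for name, limit in MAX_PREFIX_LENGTH.items()
        if network.prefixlen <= limit
    ]


# A /24 qualifies for every option; a /28 rules out the different-ISP option.
print(usable_methodologies("64.96.43.0/24"))
print(usable_methodologies("64.96.43.0/28"))
```

A check like this is only a first filter when choosing a methodology; the other criteria in the summary (ASN, routing table size, ISP arrangements) still apply.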

Creating a Test Plan
- Identify the IP address ranges to test
- Understand the impact to the production site
- Many continuity service providers have ranges of IP addresses available for test purposes
- Best to test an actual redirect of all IP addresses
- Redirecting a subset of the IP addresses is the next best thing

Internet Security Assessment
- Denial of service
- Intrusion prevention
- Anti-virus management
- Penetration testing
- Emergency response services

Conclusion
- Internet connectivity is no longer a luxury; it is business critical
- Planning is crucial to success

Q&A
Any questions?