Unix System Administration Caleb Phillips Data Redundancy CSCI 4113, Spring 2010
Where are We? Advanced Topics NTP and Cron Data Redundancy (RAID, rsync, backups...) Division of Labor (NFS, NIS, PAM, LDAP,...)... Guest Lectures
Overview Losing Data is Bad Want to Protect from: Hardware failures (predominantly disk failures) User Mistakes (oh noes! I be-leted it!?) Solutions: Real-time Data Redundancy ala RAID Many types/approaches Delayed Data Redundancy ala Backups Many types/approaches
RAID Tape-backup is sloooow and (can be) costly Redundant Array of Inexpensive Disks to the rescue! General idea is to spread/mirror data across multiple disks to be robust to hardware failures (and maybe get a performance boost too) Different schemes are called levels 0-5 in practice :-( :-)
RAID 0 Simple Striping Data is striped across multiple devices Not Redundant! No storage overhead Higher bandwidth
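To make "striped across multiple devices" concrete, here is a minimal sketch of round-robin block placement. The function name `stripe_location` and the one-block stripe unit are assumptions for illustration; real arrays stripe in configurable chunks.

```sh
# Map a logical block number to (disk, stripe) under simple round-robin striping.
# Assumes one block per stripe unit; real arrays use a configurable chunk size.
stripe_location() {
    block=$1    # logical block number, starting at 0
    ndisks=$2   # number of disks in the array
    echo "disk=$(( block % ndisks )) stripe=$(( block / ndisks ))"
}

# Walk the first six logical blocks of a hypothetical 3-disk array.
for b in 0 1 2 3 4 5; do
    echo "block $b -> $(stripe_location "$b" 3)"
done
```

Consecutive blocks land on different disks, which is where the extra bandwidth comes from. Losing any one disk loses every third block, which is why RAID 0 is not redundant.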
RAID 1 Simple Mirroring Two or more disks contain identical data Needs at least 2x the raw storage (at least half of it goes to redundancy) No slower than a single disk Good enough for many situations
RAID 2-5 Striping with Parity No one uses RAID 2 RAID 3 is byte-level striping with a single parity disk RAID 4 is block-level striping with a single parity disk RAID 5 is block-level striping with distributed parity information RAID 5 is the most popular Require at least 3 disks and can handle 1 disk failure Parity calculation may hurt performance slightly Less overhead than RAID-1 (1/3 instead of 1/2)
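The parity used by these levels is a bitwise XOR across the data blocks in a stripe. A minimal sketch, assuming a 3-disk stripe (two data blocks plus one parity block) and one-byte "blocks" with made-up values:

```sh
# Two data "blocks" (one byte each, for brevity) in a 3-disk stripe.
d0=165   # 0xA5
d1=60    # 0x3C

# The parity block is the XOR of the data blocks.
parity=$(( d0 ^ d1 ))

# Suppose the disk holding d1 dies: XOR the surviving block with the parity.
recovered=$(( d0 ^ parity ))

echo "parity=$parity recovered_d1=$recovered"
```

The same trick works for any single missing block in the stripe, which is why one parity block is enough to survive exactly one disk failure.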
RAID 6 Striping with Dual Parity Like RAID 5 but keep two parity blocks distributed Can handle 2 simultaneous disk failures
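The storage cost of each level comes down to simple arithmetic. The sketch below assumes n equal-sized disks and covers only the levels discussed here; `usable_disks` is a hypothetical helper name.

```sh
# usable_disks LEVEL NDISKS -> how many disks' worth of space holds user data
usable_disks() {
    level=$1
    n=$2
    case "$level" in
        0) echo "$n" ;;            # striping only, nothing spent on redundancy
        1) echo 1 ;;               # every disk holds the same data
        5) echo $(( n - 1 )) ;;    # one disk's worth of parity
        6) echo $(( n - 2 )) ;;    # two disks' worth of parity
        *) echo "unsupported level: $level" >&2; return 1 ;;
    esac
}

# Compare the levels on a hypothetical 4-disk array.
for level in 0 1 5 6; do
    echo "RAID $level on 4 disks: $(usable_disks "$level" 4) disk(s) usable"
done
```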
Hardware versus Software RAID Used to be that if you wanted RAID, you bought an expensive RAID controller card Nowadays, many of these cards just offload processing onto the CPU Software RAID, via the Linux Kernel, is just as fast and is cheaper and (often) more flexible mdadm tool controls/creates/monitors arrays /dev/md0, /dev/md1, /dev/md2,...
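A sketch of what creating an array with mdadm might look like. The partitions `/dev/sdb1`, `/dev/sdc1` and `/dev/sdd1` are hypothetical, and the `run` wrapper is an assumption added here so that the commands are only printed unless `DRY_RUN=0` is set, since creating an array destroys whatever is on the member devices.

```sh
# Dry-run wrapper: print each command; only execute it when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Hypothetical partitions; substitute real, EMPTY devices before running for real.
run mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb1 /dev/sdc1 /dev/sdd1

# Inspect the new array, then put a filesystem on it.
run mdadm --detail /dev/md0
run mkfs.ext3 /dev/md0
```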
Linux Software RAID Example on dante...
$ df -h
$ less /etc/fstab
$ less /etc/mdadm/mdadm.conf
$ cat /proc/mdstat
$ sudo mdadm --query --detail /dev/md0
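When reading /proc/mdstat, the thing to look for is the status map at the end of each array's second line: `[UU]` means all members are up, and an underscore marks a missing one. A minimal sketch of a check for that, assuming the usual mdstat layout; the function name `degraded_arrays` and the sample devices and sizes are made up.

```sh
# Print the name of every md array whose status map (e.g. [U_]) shows a
# missing member. Reads /proc/mdstat-style text on stdin.
degraded_arrays() {
    awk '
        /^md/ { name = $1 }                              # remember current array
        $NF ~ /^\[[U_]*_[U_]*\]$/ { print name }         # status map contains "_"
    '
}

# Sample input in the usual /proc/mdstat layout (made-up devices and sizes).
sample='Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      976630336 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sdc1[0]
      488254464 blocks super 1.2 [2/1] [U_]

unused devices: <none>'

printf '%s\n' "$sample" | degraded_arrays
```

On a real machine the call would be `degraded_arrays < /proc/mdstat`. For ongoing monitoring, mdadm has its own monitor mode.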
Logical Volume Management (LVM) More flexible partitioning scheme than standard schemes: LVM partitions can span multiple physical disks partitions can be resized and moved easily Physical Volume Physical Partitions Physical Extents (PEs) Logical Extents (LEs) Volume Group (VG) Logical Volumes (LVs) Filesystems
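The layering above (physical volumes, then a volume group, then logical volumes, then filesystems) maps directly onto the LVM command-line tools. A sketch, assuming two hypothetical partitions and the made-up names `vg0` and `home`; the `run` wrapper is an assumption that only prints the commands unless `DRY_RUN=0` is set.

```sh
# Dry-run wrapper: print each command; only execute it when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# 1. Turn two (hypothetical) partitions into physical volumes.
run pvcreate /dev/sdb1 /dev/sdc1

# 2. Pool them into one volume group, so space can span both disks.
run vgcreate vg0 /dev/sdb1 /dev/sdc1

# 3. Carve out a 20 GB logical volume and put a filesystem on it.
run lvcreate -L 20G -n home vg0
run mkfs.ext3 /dev/vg0/home

# 4. Later: grow the volume by 10 GB, then grow the filesystem to match.
run lvextend -L +10G /dev/vg0/home
run resize2fs /dev/vg0/home
```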
Backups In the beginning there was tar... Today: All hail rsync! rsync does backups over the network and only transfers the parts that have changed, so it is very efficient Still can take a while on big disks You can do tape/CD/DVD backups too if you want, but hard disks are cheap and fast and ubiquitous... Schedule periodic backups with cron Can keep several rolling backups too...
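One common way to keep several rolling backups is to rotate numbered snapshot directories and let rsync hard-link unchanged files against the previous snapshot. The sketch below is one possible approach, not the only one; the `daily.N` naming, the script path in the cron comment, and the default of 7 snapshots are all assumptions. DEST should be an absolute path, because rsync resolves a relative `--link-dest` against the destination directory.

```sh
# rotate DEST KEEP: shift DEST/daily.0 -> daily.1 -> ... and drop the oldest.
rotate() {
    dest=$1
    keep=$2
    i=$(( keep - 1 ))
    rm -rf "$dest/daily.$i"
    while [ "$i" -gt 0 ]; do
        prev=$(( i - 1 ))
        if [ -d "$dest/daily.$prev" ]; then
            mv "$dest/daily.$prev" "$dest/daily.$i"
        fi
        i=$prev
    done
}

# backup SRC DEST [KEEP]: rotate, then sync into a fresh daily.0.
# --link-dest hard-links unchanged files against the previous snapshot,
# so each snapshot only costs the space of what changed.
backup() {
    src=$1
    dest=$2
    keep=${3:-7}
    mkdir -p "$dest"
    rotate "$dest" "$keep"
    rsync -a --link-dest="$dest/daily.1" "$src/" "$dest/daily.0/"
}

# Run only when called with arguments, e.g. from a crontab line such as
# (hypothetical script path, nightly at 02:30):
#   30 2 * * * /usr/local/sbin/rolling-backup.sh /etc /backup/etc 7
if [ "$#" -ge 2 ]; then
    backup "$@"
fi
```

On the very first run there is no `daily.1` yet; rsync warns that the `--link-dest` directory does not exist and does a full copy.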
rsync examples
rsync -r /etc/ /backup/etc/
    -r  recurse into directories (without it rsync skips directories)
rsync -avc /etc/ /backup/etc/
    -a  archival mode (recursive; preserve perms and owner)
    -v  be verbose
    -c  checksum files to make sure they transferred w/o errors
rsync -e ssh -avc /etc me@elsewhere:/backup/
    -e  encapsulation mode: run the transfer over ssh (rsync has its own remote transfer protocol too)
Let's build a server... Web server Backup server Database server Email server DNS server Firewall
Let's build a server... Web server [big/fast RAIDed /var/ (RAID-5? RAID-1? reiserfs?); slow/static / (RAIDed? ext3?); swap space], Backup server, Database server, Email server, DNS server, Firewall
Let's build a server... Web server, Backup server [big/slow RAIDed /storage/ (RAID-1? RAID-5? xfs? jfs? ext4?); slow/static /; swap space], Database server, Email server, DNS server, Firewall
Let's build a server... Web server, Backup server, Database server [big/fast RAIDed /var/; slow/static /; swap space; regular/rolled backups], Email server, DNS server, Firewall
Let's build a server... Web server, Backup server, Database server, Email server [big/fast RAIDed /home/; big/fast RAIDed /var/; small/slow /; swap space], DNS server, Firewall
Let's build a server... Web server, Backup server, Database server, Email server, DNS server [slow/static /; fast memory, good connection; swap space; periodic backups], Firewall
Let's build a server... Web server, Backup server, Database server, Email server, DNS server, Firewall [slow/static everything; small solid-state (e.g. compact flash) media; periodic backups; Tripwire?]
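The basic idea behind a Tripwire-style check on a mostly static machine is to record checksums of known-good files and compare against them later. The sketch below shows only that idea and is not how Tripwire itself is implemented; it assumes GNU `sha256sum`, the function names are made up, and the manifest must be given as an absolute path stored outside the directory being checked. It does not notice newly added files.

```sh
# baseline DIR MANIFEST: record a SHA-256 checksum for every file under DIR.
baseline() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2 ) > "$2"
}

# verify DIR MANIFEST: succeed only if every recorded file is present and
# unchanged; otherwise return non-zero.
verify() {
    ( cd "$1" && sha256sum -c "$2" > /dev/null 2>&1 )
}
```

A typical use would be to run `baseline` once after the firewall is configured, keep the manifest on read-only or off-host media, and run `verify` periodically from cron.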