Building Elastix 2.4 High Availability Clusters with DRBD and Heartbeat (using a single NIC)

This information has been modified and updated by Nick Ross. Please refer to the original document found at:

Changes made to this document will be explained at the very end in Appendix A.

Document last updated October 15th, 2015.

Credits

A great deal of credit goes to Daniel Guevara and Amjad Jabali, who authored previous versions of this document. Daniel Guevara's document is linked above, but it appears Amjad Jabali's is offline. While I have added a great deal to this document and made many changes, so much of the work was done by these other authors that this document would not exist without them. Thanks for the great work, guys.
Operational Overview

What is DRBD?

failover failback switchover.

Equipment Overview
yum -y update
NOTE
Press t to change the partition system ID
Press 3 to choose partition number
Press w to save changes

RESTART SERVER

    mke2fs -j /dev/sda3
    dd if=/dev/zero bs=1M count=500 of=/dev/sda3; sync

    yum install heartbeat drbd83 kmod-drbd83

Note:

    192.168.1.243 voipserver.drbd
    192.168.1.242 voipbackup.drbd

/etc/drbd.conf:

    global { usage-count no; }
    resource r0 {
      protocol C;
      startup { wfc-timeout 10; degr-wfc-timeout 30; } #change timers to your need
      disk { on-io-error detach; } # or panic,...
      net {
        after-sb-0pri discard-least-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri call-pri-lost-after-sb;
        cram-hmac-alg "sha1";
        shared-secret "Cent0Sru!3z";
      }
      syncer { rate 5M; }
      on voipserver.drbd {
        device /dev/drbd0;
        disk /dev/sda3;
        address 192.168.1.242:7788;
        meta-disk internal;
      }
      on voipbackup.drbd {
        device /dev/drbd0;
        disk /dev/sda3;
        address 192.168.1.243:7788;
        meta-disk internal;
      }
    }

Note:

    after-sb-0pri discard-least-changes;
    after-sb-1pri discard-secondary;
    after-sb-2pri call-pri-lost-after-sb;

Reference:

    scp /etc/drbd.conf root@voipbackup.drbd:/etc/

    drbdadm create-md r0
    service drbd start
    cat /proc/drbd
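Reading /proc/drbd by eye gets tedious while both nodes are coming up, so here is a minimal sketch of a helper that pulls out the connection state, roles and disk states for one device. It assumes the DRBD 8.3 status line layout (cs:, ro: and ds: fields on the line for minor 0); drbd_summary is a hypothetical name used only for illustration and is not part of DRBD.

```shell
#!/bin/bash
# Sketch: summarise one DRBD minor from /proc/drbd-style text.
# Assumption: the status line follows the DRBD 8.3 layout, for example
#   " 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----"
# drbd_summary is a hypothetical helper name, not a DRBD command.
drbd_summary() {
    local file="${1:-/proc/drbd}"   # file to read; defaults to the live status
    local minor="${2:-0}"           # DRBD minor number (0 for /dev/drbd0)
    local line cs ro ds

    line=$(grep -E "^ *${minor}: " "$file") || {
        echo "minor ${minor} not found"
        return 1
    }

    # Extract the value that follows each field label.
    cs=$(echo "$line" | sed -n 's/.*cs:\([^ ]*\).*/\1/p')
    ro=$(echo "$line" | sed -n 's/.*ro:\([^ ]*\).*/\1/p')
    ds=$(echo "$line" | sed -n 's/.*ds:\([^ ]*\).*/\1/p')

    echo "connection=${cs} roles=${ro} disks=${ds}"

    # Return success only when connected with both disks up to date.
    [ "$cs" = "Connected" ] && [ "$ds" = "UpToDate/UpToDate" ]
}
```

On a live node, running drbd_summary with no arguments reads /proc/drbd directly; the return code can be used in a script to wait until the initial sync has finished.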
    drbdadm -- --overwrite-data-of-peer primary r0
    watch -n 1 cat /proc/drbd

    mkfs.ext3 /dev/drbd0
    mkdir /replica
    mount /dev/drbd0 /replica
    drbdadm role r0

    cd /replica
    amportal chown
    tar -zcvf etcasterisk.tgz /etc/asterisk
    tar -zxvf etcasterisk.tgz
    tar -zcvf varlibasterisk.tgz /var/lib/asterisk
    tar -zxvf varlibasterisk.tgz
    tar -zcvf usrlibasterisk.tgz /usr/lib/asterisk/
    tar -zxvf usrlibasterisk.tgz
    tar -zcvf varspoolasterisk.tgz /var/spool/asterisk/
    tar -zxvf varspoolasterisk.tgz
    tar -zcvf varlibmysql.tgz /var/lib/mysql/
    tar -zxvf varlibmysql.tgz
    tar -zcvf varlogasterisk.tgz /var/log/asterisk/
    tar -zxvf varlogasterisk.tgz
    tar -zcvf varwww.tgz /var/www/
    tar -zxvf varwww.tgz

    rm -rf /etc/asterisk
    rm -rf /var/lib/asterisk
    rm -rf /usr/lib/asterisk/
    rm -rf /var/spool/asterisk
    rm -rf /var/www
    rm -rf /var/lib/mysql/
    rm -rf /var/log/asterisk/

    ln -s /replica/etc/asterisk/ /etc/asterisk
    ln -s /replica/var/lib/asterisk/ /var/lib/asterisk
    ln -s /replica/usr/lib/asterisk/ /usr/lib/asterisk
    ln -s /replica/var/spool/asterisk/ /var/spool/asterisk
    ln -s /replica/var/lib/mysql/ /var/lib/mysql
    ln -s /replica/var/log/asterisk/ /var/log/asterisk
    ln -s /replica/var/www /var/www
    cd /

    service mysqld restart
    service mysqld stop
    service asterisk stop
    service httpd stop
    service elastix-updaterd stop
    service elastix-portknock stop

    ls /replica/

Note:

    drbdadm role r0

Execute df -h on the primary to confirm that our /dev/drbd0 partition is mounted.

Note: The secondary will not display the /dev/drbd0 partition unless it's assuming primary mode.

On the primary:

    umount /replica ; drbdadm secondary r0

On the secondary:

    mkdir /replica ; drbdadm primary r0 ; mount /dev/drbd0 /replica

    rm -rf /etc/asterisk
    rm -rf /var/lib/asterisk
    rm -rf /usr/lib/asterisk/
    rm -rf /var/spool/asterisk
    rm -rf /var/lib/mysql/
    rm -rf /var/log/asterisk/
    rm -rf /var/www

    ln -s /replica/etc/asterisk/ /etc/asterisk
    ln -s /replica/var/lib/asterisk/ /var/lib/asterisk
    ln -s /replica/usr/lib/asterisk/ /usr/lib/asterisk
    ln -s /replica/var/spool/asterisk/ /var/spool/asterisk
    ln -s /replica/var/lib/mysql/ /var/lib/mysql
    ln -s /replica/var/log/asterisk/ /var/log/asterisk
    ln -s /replica/var/www /var/www

    service mysqld restart
    service mysqld stop
    service asterisk stop
    service httpd stop
    service elastix-updaterd stop
    service elastix-portknock stop

    umount /replica/ ; drbdadm secondary r0

On the primary:

    drbdadm primary r0 ; mount /dev/drbd0 /replica

    chkconfig drbd on

Heartbeat Configuration

    chkconfig asterisk off
    chkconfig mysqld off
    chkconfig httpd off
    chkconfig elastix-updaterd off
    chkconfig elastix-portknock off

    service mysqld stop
    service asterisk stop
    service httpd stop
    service elastix-portknock stop
    service elastix-updaterd stop

/etc/ha.d/ha.cf:

    debugfile /var/log/ha-debug
    logfile /var/log/ha-log
    logfacility local0
    keepalive 2
    deadtime 30
    warntime 10
    initdead 120
    udpport 694
    bcast eth0
    auto_failback off
    node voipserver.drbd
    node voipbackup.drbd

NOTE: I've set auto_failback to off. This seems more appropriate to me. Use the following command on the current secondary to switch back:

    sh /usr/lib/heartbeat/hb_takeover

/etc/ha.d/haresources:

    voipserver.drbd drbddisk::r0 Filesystem::/dev/drbd0::/replica::ext3 IPaddr::192.168.1.245/24/eth0/192.168.1.255 mysqld asterisk httpd elastix-updaterd elastix-portknock
    voipserver.drbd MailTo::your@emailgoeshere.com,your@emailgoeshere.com::DRBD/HA-ALERT

    [root@voipserver.drbd ha.d]#
    [root@voipbackup.drbd ha.d]#

/etc/ha.d/authkeys:

    auth 1
    1 sha1 MySecret

    chmod 600 /etc/ha.d/authkeys

    service heartbeat start
    chkconfig --add heartbeat
    chkconfig heartbeat on

    drbdadm role r0

Execute df -h on the primary to confirm that our /dev/drbd0 partition is mounted.
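The timing values in ha.cf (keepalive, warntime, deadtime, initdead) interact, and a typo there can cause false failovers. The following is a rough sketch of a sanity check, not something Heartbeat ships. It assumes the values are whole seconds and applies a common rule of thumb: keepalive below warntime, warntime below deadtime, and initdead at least twice deadtime. The names ha_value and check_ha_timers are hypothetical.

```shell
#!/bin/bash
# Sketch: sanity-check the timing directives in a Heartbeat ha.cf file.
# Assumptions: values are plain integer seconds (no "ms" suffix), and the
# rule of thumb keepalive < warntime < deadtime, initdead >= 2 * deadtime.
# ha_value and check_ha_timers are hypothetical helper names.

# Print the last value given for a directive; comment lines never match
# because their first field starts with "#".
ha_value() {
    awk -v key="$2" '$1 == key { v = $2 } END { print v }' "$1"
}

check_ha_timers() {
    local file="${1:-/etc/ha.d/ha.cf}"
    local keepalive warntime deadtime initdead v

    keepalive=$(ha_value "$file" keepalive)
    warntime=$(ha_value "$file" warntime)
    deadtime=$(ha_value "$file" deadtime)
    initdead=$(ha_value "$file" initdead)

    # Every timer must be present and numeric before comparing.
    for v in "$keepalive" "$warntime" "$deadtime" "$initdead"; do
        case "$v" in
            ''|*[!0-9]*) echo "missing or non-integer timer"; return 1 ;;
        esac
    done

    if [ "$keepalive" -lt "$warntime" ] && [ "$warntime" -lt "$deadtime" ] \
        && [ "$initdead" -ge $((deadtime * 2)) ]; then
        echo "timers ok"
    else
        echo "timers inconsistent"
        return 1
    fi
}
```

With the values used in this document (2, 10, 30 and 120) the check passes. Run it on both nodes before starting heartbeat, since each node reads its own copy of the file.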
it doesn't lose connectivity.

Make Special Note:

Troubleshooting:

    tcpdump -i eth0:0 -s 1500 -w captura.pcap #capture traffic
    mv captura.pcap /var/www/html #move file to web for download

Credits

References

Author:
    cd /replica
    tar -zcvf etcasterisk.tgz /etc/asterisk
    tar -zxvf etcasterisk.tgz
    tar -zcvf varlibasterisk.tgz /var/lib/asterisk
    tar -zxvf varlibasterisk.tgz
    tar -zcvf usrlibasterisk.tgz /usr/lib/asterisk/
    tar -zxvf usrlibasterisk.tgz
    tar -zcvf varspoolasterisk.tgz /var/spool/asterisk/
    tar -zxvf varspoolasterisk.tgz
    tar -zcvf varlibmysql.tgz /var/lib/mysql/
    tar -zxvf varlibmysql.tgz
    tar -zcvf varlogasterisk.tgz /var/log/asterisk/
    tar -zxvf varlogasterisk.tgz
    tar -zcvf varwww.tgz /var/www/
    tar -zxvf varwww.tgz

    rm -rf /etc/asterisk
    rm -rf /var/lib/asterisk
    rm -rf /usr/lib/asterisk/
    rm -rf /var/spool/asterisk
    rm -rf /var/lib/mysql/
    rm -rf /var/log/asterisk/
    rm -rf /var/www

    ln -s /replica/etc/asterisk/ /etc/asterisk
    ln -s /replica/var/lib/asterisk/ /var/lib/asterisk
    ln -s /replica/usr/lib/asterisk/ /usr/lib/asterisk
    ln -s /replica/var/spool/asterisk/ /var/spool/asterisk
    ln -s /replica/var/lib/mysql/ /var/lib/mysql
    ln -s /replica/var/log/asterisk/ /var/log/asterisk
    ln -s /replica/var/www /var/www
    cd /
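The copy, remove and symlink sequence above repeats the same three steps for every directory, so it can be expressed as one small function. This is only a sketch under my own assumptions: it uses cp -a instead of the tar archives used above (a deliberate substitution, not the original method), and relocate_dir is a hypothetical name. It skips anything that is already a symlink, so it is safe to run twice.

```shell
#!/bin/bash
# Sketch: move one directory onto the replicated volume and leave a symlink
# in its place. Uses cp -a rather than the tar archives shown above.
# relocate_dir is a hypothetical helper name.
relocate_dir() {
    local replica="$1"   # mount point of the DRBD volume, e.g. /replica
    local dir="${2%/}"   # absolute path to move, e.g. /etc/asterisk
    local parent

    # Only act on real directories; an existing symlink means "already done".
    if [ ! -d "$dir" ] || [ -L "$dir" ]; then
        echo "skip $dir"
        return 0
    fi

    parent="${replica}$(dirname "$dir")"
    mkdir -p "$parent" || return 1

    # Copy with ownership and permissions preserved, then swap in the link.
    cp -a "$dir" "$parent/" || return 1
    rm -rf "$dir"
    ln -s "${replica}${dir}" "$dir"
    echo "moved $dir"
}

# Example (as root on the primary, /replica mounted, services stopped):
#   for d in /etc/asterisk /var/lib/asterisk /usr/lib/asterisk \
#            /var/spool/asterisk /var/lib/mysql /var/log/asterisk /var/www; do
#       relocate_dir /replica "$d" || break
#   done
```

Because the original directory is only removed after cp -a succeeds, a full disk or a permissions problem leaves the source data in place.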
APPENDIX I

IP Sourcing Part 2

The previous section ensures that external traffic will be sent from the box using the cluster IP address. What it does not do is use the cluster IP address on the internal LAN. This could be a problem for certain equipment on your LAN. For devices that register with your Asterisk PBX, the line "bindaddr=192.168.1.245" in sip_general_custom.conf will take care of the issue.

HOWEVER, a problem still exists with devices that your PBX registers with. For instance, VoipServer.drbd will try registering itself to another device on the LAN using the IP address 192.168.1.242. The only solution to this problem is to specify an IP source address when trying to reach individual hosts on the network. This is not often an issue, but it is something that you may run into.

To fix this, we need to implement a new service on our Linux system. These steps must be implemented on both the primary and secondary servers.

Step 1
Type the following command:

    nano /etc/init.d/pbxiprouting

Step 2
Paste the code found on the following page into the editor.

YOU MUST CHANGE THE IP ADDRESSES IN THE SCRIPT

There are two entries. One is under start(), the other is under stop(). I've used 192.168.1.29 as an arbitrary IP address. The IP address that you use here should represent another system on the internal network that your Asterisk PBX will INITIATE communication with. A good example would be an Analog Gateway Device, where your server reaches out to it in order to register. It can really be any device on the local network, aside from the servers in our DRBD cluster. If you wish to do this for multiple devices, you can copy and paste, entering multiple lines with different IP addresses.

Use CTRL+O and CTRL+X to save and exit.
Step 3
Enter the following command:

    chmod 755 /etc/init.d/pbxiprouting

Step 4
Verify that the script works, with the commands:

    service pbxiprouting start
    service pbxiprouting stop

Step 5
If the above works normally, the last step is to add an entry within your /etc/ha.d/haresources file.

Change:

    (...)IPaddr::192.168.1.245/24/eth0/192.168.1.255 mysqld asterisk httpd(...)

to

    (...)IPaddr::192.168.1.245/24/eth0/192.168.1.255 pbxiprouting mysqld asterisk httpd(...)

This change ensures that the necessary routing changes are only made when the cluster is owned by THAT host. It also ensures that the routing changes are removed when the host releases the cluster.
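Since the haresources edit in Step 5 has to be made identically on both servers, it may be less error-prone to script it. The sketch below is one possible way to do that with sed; add_pbxiprouting is a hypothetical name, and the function only filters text from standard input to standard output so you can inspect the result before touching the real file.

```shell
#!/bin/bash
# Sketch: insert "pbxiprouting" right after the IPaddr:: resource in a
# haresources line. Lines that already contain it, or that have no IPaddr::
# entry (such as a MailTo line), pass through unchanged.
# add_pbxiprouting is a hypothetical helper name.
add_pbxiprouting() {
    sed '/ pbxiprouting/!s/\(IPaddr::[^ ]*\)/\1 pbxiprouting/'
}

# Example (review the output before replacing the original file):
#   add_pbxiprouting < /etc/ha.d/haresources > /tmp/haresources.new
#   diff /etc/ha.d/haresources /tmp/haresources.new
```

Running the filter a second time changes nothing, so it does not matter if you forget which server has already been updated.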
Script for /etc/init.d/pbxiprouting

#!/bin/bash
# description: pbxiprouting
# process name: pbxiprouting
# Author: Nick Ross

# Load the init helpers (echo_success, echo_failure).
. /etc/init.d/functions

RETVAL=0

# Look up the PID of the running asterisk process, if any.
getpid() {
    pid=$(ps -eo pid,comm | grep "asterisk" | awk '{ print $1 }')
}

start() {
    echo -n $"Starting PBXIPRouting: "
    # Reach this host through the cluster alias interface.
    route add -host 192.168.1.29 dev eth0:0
    RETVAL=0
    if [ $RETVAL -eq 0 ]; then
        touch /var/lock/subsys/pbxiprouting
        echo_success
    else
        echo_failure
    fi
    echo
    return $RETVAL
}

stop() {
    echo -n $"Stopping PBXIPRouting: "
    route delete -host 192.168.1.29
    RETVAL=0
    rm -f /var/lock/subsys/pbxiprouting
    echo_success
    return $RETVAL
}

# See how we were called.
case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    status)
        getpid
        if [ -n "$pid" ]; then
            echo "PBXIPRouting (pid $pid) is running..."
        else
            RETVAL=1
            echo "PBXIPRouting is stopped"
        fi
        ;;
    restart)
        stop
        start
        ;;
    *)
        echo $"Usage: $0 {start|stop|status|restart}"
        exit 1
        ;;
esac
exit $RETVAL
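If you need routes for several devices, copying and pasting route lines into start() and stop() works, but a host list keeps the two functions in sync. Here is a minimal sketch of that idea. The second IP address, the variable names and pbx_route_cmds are all made up for illustration, and the function only prints the commands rather than running them.

```shell
#!/bin/bash
# Sketch: generate the route commands for a list of hosts so that start()
# and stop() always cover the same devices.
# PBX_ROUTE_HOSTS, PBX_ROUTE_DEV and pbx_route_cmds are hypothetical names;
# 192.168.1.30 is an invented second device.
PBX_ROUTE_HOSTS="192.168.1.29 192.168.1.30"
PBX_ROUTE_DEV="eth0:0"

# Print (do not execute) one route command per host.
# $1 is the action: "add" or "delete".
pbx_route_cmds() {
    local action="$1" host
    for host in $PBX_ROUTE_HOSTS; do
        case "$action" in
            add)    echo "route add -host $host dev $PBX_ROUTE_DEV" ;;
            delete) echo "route delete -host $host" ;;
            *)      echo "unknown action: $action" >&2; return 1 ;;
        esac
    done
}

# Example: inside start() you could run the generated commands with
#   pbx_route_cmds add | sh
# and inside stop() with
#   pbx_route_cmds delete | sh
```

Printing first and piping to sh second makes it easy to check exactly what will be executed before trusting it on a production PBX.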
APPENDIX J

IPSec for DRBD

If you are not using a two NIC configuration, with a secured and separate network for DRBD, it's very likely that your DRBD data is vulnerable while in transit. DRBD transmits raw disk data, without any encryption. Changes to your configuration, passwords, etc., are all transmitted over the wire and vulnerable to interception.

Luckily, this is very easy to secure in a Linux environment, via IPSec. This will have to be done on BOTH the primary and secondary server.

Step 1
Install the ipsec-tools package. Use the following command:

    yum install ipsec-tools

Step 2
Make a file to start the ipsec connection. Use the command:

    nano /etc/sysconfig/network-scripts/ifcfg-ipsec0

Step 3
Enter the following in the text editor (this assumes you are on VoipMain.drbd):

    DST=192.168.1.243
    TYPE=IPSEC
    ONBOOT=yes
    IKE_METHOD=PSK

(note: on voipserver, you would change the DST field to DST=192.168.1.242)

The DST field always contains the IP of the REMOTE server, NOT the IP of the server you are on.

CTRL+O saves, CTRL+X exits the editor.

Step 4
Make a key file. Type the command:

    nano /etc/sysconfig/network-scripts/keys-ipsec0

Step 5
Choose a key for the ipsec connection (change it from what I put below). Type in something like this in the editor:

    IKE_PSK=supersecretpassword12345!

CTRL+O saves, CTRL+X exits.

Step 6
Secure the file by typing the following command:

    chmod 600 /etc/sysconfig/network-scripts/keys-ipsec0

Step 7
Repeat this on the secondary server. Please remember to enter the proper IP address on the secondary, and do not simply copy and paste the same IP address. See step 3 again for clarification of the DST field.

Step 8
To get the tunnel working without a reboot, you'll have to start it manually. On both servers, type the command:

    ifup ipsec0

That's it, you are done. The ipsec connection should come online automatically when you reboot.
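Steps 2 through 6 are the same on both servers apart from the DST value, so they can be wrapped in one function that takes the remote IP as an argument. Treat the following as a sketch rather than a tested installer: write_ipsec_cfg is a hypothetical name, and the optional third argument (a target directory) exists only so you can try it against a scratch directory first.

```shell
#!/bin/bash
# Sketch: write ifcfg-ipsec0 and keys-ipsec0 for a host-to-host tunnel.
# write_ipsec_cfg is a hypothetical helper name.
#   $1  IP address of the REMOTE server (the DST field)
#   $2  pre-shared key
#   $3  target directory (defaults to /etc/sysconfig/network-scripts)
write_ipsec_cfg() {
    local remote_ip="$1"
    local psk="$2"
    local dir="${3:-/etc/sysconfig/network-scripts}"

    if [ -z "$remote_ip" ] || [ -z "$psk" ]; then
        echo "usage: write_ipsec_cfg REMOTE_IP PSK [DIR]" >&2
        return 1
    fi

    cat > "$dir/ifcfg-ipsec0" <<EOF
DST=$remote_ip
TYPE=IPSEC
ONBOOT=yes
IKE_METHOD=PSK
EOF

    # Create the key file under a restrictive umask so the key is never
    # readable by other users, then set the mode explicitly as in Step 6.
    ( umask 077; echo "IKE_PSK=$psk" > "$dir/keys-ipsec0" ) || return 1
    chmod 600 "$dir/keys-ipsec0"
}

# Example: on the server whose peer is 192.168.1.243
#   write_ipsec_cfg 192.168.1.243 'your-own-key-here'
```

Passing the remote IP explicitly on each server avoids the copy and paste mistake that Step 7 warns about.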
If you'd like to verify the ipsec connection is working, you can use tcpdump like so:

    tcpdump -n host 192.168.1.242 and host 192.168.1.243

Tcpdump should show AH and ESP fields, indicating the header and payload are protected by ipsec. It may take up to ten seconds before you see results.

If you ever want to turn off ipsec, the "ifdown ipsec0" command should be executed on both hosts. To prevent IPSec from starting automatically upon boot, go back to step 3 and set ONBOOT=no.
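Scanning tcpdump output by eye for AH and ESP can be unreliable on a busy link. The sketch below counts how many captured IP packets carry both markers. It assumes tcpdump prints such packets with "AH(" and "ESP(" on the same line, which is what I have seen but which may differ between tcpdump versions; ipsec_coverage is a hypothetical name.

```shell
#!/bin/bash
# Sketch: count how many tcpdump lines between the two nodes show both an
# AH and an ESP header.
# Assumption: protected packets are printed with "AH(" and "ESP(" on one
# line; check this against your own tcpdump output.
# ipsec_coverage is a hypothetical helper name.
ipsec_coverage() {
    awk '
        / IP /             { total++ }       # every captured IPv4 packet
        /AH\(/ && /ESP\(/  { protected++ }   # packets wrapped by IPSec
        END { printf "protected=%d total=%d\n", protected, total }
    '
}

# Example: capture 20 packets between the nodes and summarise them
#   tcpdump -n -c 20 host 192.168.1.242 and host 192.168.1.243 | ipsec_coverage
```

If the two numbers differ, some traffic between the nodes is still going out in the clear and the tunnel configuration on one side is worth rechecking.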