Increasing XenServer's VM density
Jonathan Davies, XenServer System Performance Lead
XenServer Engineering, Citrix, Cambridge, UK
24 Oct 2013
Outline
  1. Scalability expectations
  2. Hard limits
  3. Soft limits
  4. Benchmarks
Scalability expectations

Users expect VM density to scale with hardware.
Users get upset when it doesn't...
XenServer's VM density scalability

(Chart: in XS 6.1 and earlier, the hard density limit sits well below the hardware's theoretical capacity. In XS 6.2, the hard density limit is pushed above the practical density limit, which depends on the nature of the VMs.)
Hard limits
Hard limit 1: dom0 event channels

Cause of limitation:
  XenServer uses a 32-bit dom0. This means only 1,024 dom0 event channels:

    #define MAX_EVTCHNS(d) \
        (BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d))

  Various VM functions each use a dom0 event channel.

VM density hard limit:
  225 VMs per host (PV with 1 vcpu, 1 VIF, 1 VBD)
  150 VMs per host (HVM with 1 vcpu, 1 VIF, 3 VBDs)

Mitigation for XS 6.2:
  Hack for dom0 to enjoy 4,096 event channels:
  800 VMs per host (PV with 1 vcpu, 1 VIF, 1 VBD)
  570 VMs per host (HVM with 1 vcpu, 1 VIF, 3 VBDs)

Mitigation for future:
  Change the ABI to provide unlimited event channels; this would remove the limit.
Hard limit 2: blktap2 device minor numbers

Cause of limitation:
  blktap2 only supports up to 1,024 minor numbers (despite the kernel allowing up to 1,048,576):

    #define MAX_BLKTAP_DEVICE 1024

  Each virtual block device requires one device.

VM density hard limit:
  341 VMs per host (with 3 disks per VM)

Mitigation for XS 6.2:
  Double this constant to 2,048:
  682 VMs per host (with 3 disks per VM)

Mitigation for future:
  Move away from blktap2 altogether? This would remove the limit.
Hard limit 3: number of aio requests

Cause of limitation:
  Each blktap2 instance creates an asynchronous I/O context for receiving 402 events.
  The default system-wide number of aio requests (fs.aio-max-nr) was 444,416 in XS 6.1.

VM density hard limit:
  368 VMs per host (with 3 disks per VM)

Mitigation for XS 6.2:
  Set fs.aio-max-nr to 1,048,576:
  869 VMs per host (with 3 disks per VM)

Mitigation for future:
  Increase fs.aio-max-nr further, or use storage driver domains; this would remove the limit.
Hard limit 4: dom0 grant references

Cause of limitation:
  Windows VMs use receive-side copy (RSC) by default in XS 6.1.
  netback allocates (at least) 22 grant-table entries per virtual interface for RSC.
  dom0 had a total of 8,192 grant-table entries in XS 6.1.

VM density hard limit:
  372 VMs per host (with 1 interface per VM)

Mitigation for XS 6.2:
  Don't use RSC in Windows VMs anyway; this removes the limit.
Hard limit 5: connections to xenstored

Cause of limitation:
  xenstored uses select(2), which can only listen on 1,024 file descriptors:

    #define FD_SETSIZE 1024

  qemu opens 3 file descriptors to xenstored.

VM density hard limit:
  333 VMs per host (HVM)

Mitigation for XS 6.2:
  Make two qemu watches share a connection:
  500 VMs per host (HVM)

Mitigation for future:
  Upstream qemu doesn't connect to xenstored; this will remove the limit.
Hard limit 6: connections to consoled

Cause of limitation:
  Similarly, consoled uses select(2).
  Each PV domain opens 3 fds to consoled.

VM density hard limit:
  341 VMs per host (PV)

Mitigation for XS 6.2:
  Use poll(2) rather than select(2) in consoled; this removes the limit.
Hard limit 7: dom0 low memory

Cause of limitation:
  Each running VM eats about 1 MB of dom0 lowmem.

VM density hard limit:
  around 650 VMs per host

Mitigation for future:
  Use a 64-bit dom0; this will remove the limit.
Summary of hard limits

Limits on number of HVM guests with 1 vcpu, 1 VBD, 1 VIF (with PV drivers):

  Limitation              XS 6.1     XS 6.2     Future
  dom0 event channels     225        800        no limit
  blktap minor numbers    1024       2048       no limit
  aio requests            1105       2608       no limit
  dom0 grant references   372        no limit   no limit
  xenstored connections   333        500        no limit
  consoled connections    no limit   no limit   no limit
  dom0 low memory         650        650        no limit
  Overall limit           225        500        very high

Limited by: event channels (XS 6.1), xenstored (XS 6.2), something else (future)!
Limits on number of HVM guests with 1 vcpu, 3 VBDs, 1 VIF (with PV drivers):

  Limitation              XS 6.1     XS 6.2     Future
  dom0 event channels     150        570        no limit
  blktap minor numbers    341        682        no limit
  aio requests            368        869        no limit
  dom0 grant references   372        no limit   no limit
  xenstored connections   333        500        no limit
  consoled connections    no limit   no limit   no limit
  dom0 low memory         650        650        no limit
  Overall limit           150        500        very high

Limited by: event channels (XS 6.1), xenstored (XS 6.2), something else (future)!
Limits on number of PV guests with 1 vcpu, 1 VBD, 1 VIF:

  Limitation              XS 6.1     XS 6.2     Future
  dom0 event channels     225        1000       no limit
  blktap minor numbers    1024       2048       no limit
  aio requests            368        869        no limit
  dom0 grant references   no limit   no limit   no limit
  xenstored connections   no limit   no limit   no limit
  consoled connections    341        no limit   no limit
  dom0 low memory         650        650        no limit
  Overall limit           225        650        very high

Limited by: event channels (XS 6.1), dom0 lowmem (XS 6.2), something else (future)!
(Screenshot: 500 Windows VMs running on a single host.)
Soft limits
High dom0 CPU utilisation by xenstored

  top - 16:29:33 up 36 min,  1 user,  load average: 0.80, 0.56, 0.47
  Tasks: 132 total,   1 running, 131 sleeping,   0 stopped,   0 zombie
  Cpu(s): 40.1%us, 40.0%sy, 0.0%ni, 17.6%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
  Mem:   4186504k total,   443480k used,  3743024k free,    23696k buffers
  Swap:   524280k total,        0k used,   524280k free,   132504k cached

    PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
   7339 root  20   0  6732 2240  840 S 80.2  0.1 0:10.22 xenstored
   6665 root  20   0  4344 2636  584 S  0.4  0.1 0:04.03 fe
   7225 root  20   0 48892 5356 1736 S  0.3  0.1 0:03.35 xcp-rrdd
   7269 root  20   0 23704 3684 1308 S  0.3  0.1 0:03.47 xcp-rrdd-iostat
   7413 root  20   0  195m  21m 8932 S  0.3  0.5 0:10.28 xapi
   7283 root  20   0  7492 4860 1200 S  0.3  0.1 0:08.65 xcp-rrdd-xenpm
  10938 root  20   0 29808 1856  956 S  0.3  0.0 0:00.40 v6d
  16403 root  20   0  2428 1104  824 R  0.3  0.0 0:02.31 top
      1 root  20   0  2164  656  564 S  0.0  0.0 0:00.83 init
      2 root  20   0     0    0    0 S  0.0  0.0 0:00.00 kthreadd
      3 root  RT   0     0    0    0 S  0.0  0.0 0:00.01 migration/0
(Chart: xenstored's dom0 vcpu runs close to saturation while the other dom0 vcpus and the domU vcpus are mostly idle.)
Cause of limitation:
  xenstored CPU utilisation becomes a bottleneck.

Mitigation for XS 6.2:
  Reduce xenstore use by XenServer's toolstack:
    - remove some spurious writes
    - replace polling with watching
High dom0 CPU utilisation due to qemu

  top - 16:40:27 up  2:07,  1 user,  load average: 89.62, 87.22, 76.90
  Tasks: 1015 total,  65 running, 950 sleeping,   0 stopped,   0 zombie
  Cpu(s): 23.4%us, 55.5%sy, 0.0%ni, 4.8%id, 0.0%wa, 0.0%hi, 15.4%si, 0.5%st
  Mem:   4180480k total,  1615840k used,  2564640k free,     3804k buffers
  Swap:   524280k total,        0k used,   524280k free,   122852k cached

    PID USER     PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
   7143 root     20   0     0    0    0 R 33.9  0.0 17:21.63 rpciod/0
   6653 root     10 -10 12264 7796 1152 R 31.8  0.2 36:14.34 ovs-vswitchd
  16496 tcpdump  20   0  5508 2132 1248 R 10.5  0.1  5:35.12 tcpdump
  16970 root     20   0  2952 1552  736 R  6.3  0.0  0:00.11 top
    997 65583    20   0 24696 4732 1572 S  3.1  0.1  0:56.30 qemu-dm
   3195 65684    20   0 24632 4736 1572 S  3.1  0.1  0:27.34 qemu-dm
   3497 65656    20   0 24760 4740 1576 R  3.1  0.1  0:28.65 qemu-dm
   3562 65685    20   0 24696 4732 1572 S  3.1  0.1  0:26.97 qemu-dm
   3993 65546    20   0 24888 4744 1580 S  3.1  0.1  0:53.19 qemu-dm
   7597 65659    20   0 24632 4736 1576 S  3.1  0.1  0:28.86 qemu-dm
   8150 65550    20   0 24760 4740 1580 R  3.1  0.1  0:51.71 qemu-dm
   8679 65627    20   0 24632 4740 1576 R  3.1  0.1  0:31.18 qemu-dm
   8974 65661    20   0 24568 4736 1572 S  3.1  0.1  0:27.97 qemu-dm
  11937 root     20   0     0    0    0 S  3.1  0.0  1:12.92 nfsiod
  12545 65556    20   0 24824 4748 1584 S  3.1  0.1  0:58.46 qemu-dm
  14053 65598    20   0 24760 4736 1576 S  3.1  0.1  0:31.33 qemu-dm
  17752 65567    20   0 24952 4740 1576 S  3.1  0.1  0:56.82 qemu-dm
qemu burning dom0 CPU

With 200 idle Windows guests, each qemu utilising 3% of a CPU means 6 dom0 vcpus wasted!
What is qemu busy doing?

  Emulated device             qemu events per VM per second
  USB                         221
  CD-ROM                      38
  Buffered I/O & RTC timer    13
  Parallel port               1
  Serial port                 1
  VNC                         1
  qemu monitor                1

Mitigation for XS 6.2:
  - Use an event channel for buffered I/O notifications.
  - Provide options to disable all emulated devices.
Benchmarks
Bootstorm: booting 90 Win7 VMs

(Chart: elapsed time to fully boot 90 VMs, started 25 at a time, vs VM index; Tampa (XS 6.1) vs Clearwater (XS 6.2).)

XS 6.2 is 60% faster.

Machine used: Dell PowerEdge R815 (quad 16-core AMD Opteron 6272 @ 2.1GHz)
Bootstorm: booting 120 Win7 VMs

(Chart: elapsed time to fully boot 120 VMs, started 25 at a time, vs VM index; Tampa (XS 6.1) vs Clearwater (XS 6.2).)

XS 6.2 is 75% faster.

Machine used: Dell PowerEdge R815 (quad 16-core AMD Opteron 6272 @ 2.1GHz)
Bootstorm: booting 200 Win7 VMs

(Chart: elapsed time to fully boot 200 VMs, started 25 at a time, vs VM index; Tampa (XS 6.1) vs Clearwater (XS 6.2).)

XS 6.1 can't even get 200 VMs running! It took XS 6.2 just 13 minutes to boot 200 VMs (on this hardware).

Machine used: Dell PowerEdge R815 (quad 16-core AMD Opteron 6272 @ 2.1GHz)
LoginVSI: number of usable Windows VMs

(Chart: number of VMs performing acceptably vs number of VMs running; XS 6.1 vs XS 6.2.)

Machine used: quad 10-core Intel E7-4860 @ 2.27GHz
Questions?