SCLP: Segment-oriented Connection-less Protocol for High-Performance Software Tunneling in Datacenter Networks Ryota Kawashima Shin Muramatsu Hiroki Nakayama Tsunemasa Hayashi Hiroshi Matsuo Nagoya Institute of Technology, BOSCO Technologies, Inc.
The Goal Improving performance of overlay-based virtual networks Our Proposal SCLP VXLAN (UDP) (SCLP) Geneve (SCLP) (UDP) 1
Outline v Backgrounds Ø Network Virtualization Ø Tunneling Protocols Ø L4 protocol characteristics v Proposal Ø SCLP (Segment-oriented Connection-less Protocol) v Evaluation Ø -to- communication using VXLAN over SCLP 2
Network Virtualization v Multi-tenant Datacenter Networks l Each tenant can have its own virtual networks l Each virtual network shares the physical network resources Virtual networks Tenant 2 Tenant 1 Tenant 3 Physical network 3
The Overlay-based Approach v NVO3: Network Virtualization Overlays l RFC 7364, 7365 NVE : Network Virtualization Edge NVE Virtual Network NVE Virtual Network L3 tunnel Physical network 4
Outline v Backgrounds Ø Network Virtualization Ø Tunneling Protocols Ø L4 protocol characteristics v Proposal Ø SCLP (Segment-oriented Connection-less Protocol) v Evaluation Ø -to- communication using VXLAN over SCLP 5
Tunneling Protocols v L2-in-L3 Tunneling Ether Ether L2-L4 hdrs Virtual switch Ether L3 tunnel Ether Virtual switch Physical server Ether Physical server 6
Major Tunneling Protocols v VXLAN (RFC 7348) l UDP based l Linux kernel, OVS, ware NSX, Cisco Nexus 1000V Ethernet (Physical) IP (Physical) UDP VXLAN Ethernet (Virtual) Payload FCS v NVGRE (RFC draft) l GRE based (no L4 protocol) l Microsoft Hyper-V Ethernet (Physical) IP (Physical) NVGRE Ethernet (Virtual) Payload FCS 7
Upcoming Tunneling Protocol v Geneve (RFC draft) l UDP based l TLV option header l H/W segmentation offload (future) Ethernet (Physical) IP (Physical) UDP Geneve Opt. Ethernet (Virtual) Payload FCS 8
Yet Another Tunneling Protocol v STT (Stateless Transport Tunneling, RFC draft) l Pseudo-TCP header Ø Exploiting TSO (TCP Segmentation Offload) feature Ø Semantics of header fields are modified l ware NSX Ethernet (Physical) IP (Physical) ptcp STT Ethernet (Virtual) Payload FCS Protocol number is 6 (TCP) "This is a usual TCP packet!" NIC 9
Problems of Existing Protocols v Performance l VXLAN, NVGRE, Geneve Maximum throughput falls to one-half! v Compatibility l STT Middleboxes can discard STT packets! 10
Outline v Backgrounds Ø Network Virtualization Ø Tunneling Protocols Ø L4 protocol characteristics v Proposal Ø SCLP (Segment-oriented Connection-less Protocol) v Evaluation Ø -to- communication using VXLAN over SCLP 11
L4 Protocol Types v Message-oriented (e.g. UDP) l Packets are independent of each other v Segment-oriented (e.g. TCP) l Each packet has byte-level sequence l Consecutive packets can be reassembled 12
Why is L4 Protocol Important? Message-oriented Segment-oriented A sends a large Ethernet frame A sends a large Ethernet frame The frame is encapsulated and divided to multiple packets The frame is encapsulated and divided to multiple packets Each packet is decapsulated and forwarded to a destination The handles lots of frames Consecutive packets are reassembled Each reassembled packet is decapsulated and forwarded to a destination The handles fewer frames 13
Packet Structure Example (Tx) 14
Packet Structure Example (Rx) 15
Outline v Backgrounds Ø Network Virtualization Ø Tunneling Protocols Ø L4 protocol characteristics v Proposal Ø SCLP (Segment-oriented Connection-less Protocol) v Evaluation Ø -to- communication using VXLAN over SCLP 16
Our Proposal v SCLP: Segment-oriented Connection-less Protocol l L4 protocol l Segment-oriented l Connection-less l Usage: Outer L4 protocol of existing tunneling protocols Ø e.g.) VXLAN over SCLP, Geneve over SCLP Ethernet (Physical) IP (Physical) UDP VXLAN Ethernet (Virtual) Payload FCS Ethernet (Physical) IP (Physical) SCLPVXLAN Ethernet (Virtual) Payload FCS 17
Protocol Format v Identification: Original payload ID v F: First Segment Flag v Remaining: Remaining payload size 18
How SCLP Works (Tx) Original Payload (3000 bytes) Segmentation (MSS=1468) 'identification': 0x12345678 SCLP Payload 1 (1468 bytes) SCLP Payload 2 (1468 bytes) SCLP Payload 3 (64 bytes) 'F': 1 'remaining': 1532 'F': 0 'remaining': 64 'F': 0 'remaining': 0 19
How SCLP Works (Rx) id : - size : 0 offset : 0 id : 0x12345678 size : 3000 offset : 1468 id : 0x12345678 size : 3000 offset : 2936 id : 0x12345678 size : 3000 offset : 3000 Payload Buffer (NULL) Received payload 1 (id = 0x12345678, length = 1468, F = 1, remaining = 1532) 1468 Received payload 2 (id = 0x12345678, length = 1468, F = 0, remaining = 64) 2936 Received payload 3 (id = 0x12345678, length = 64, F = 0, remaining = 0) 3000 offset == size (len + rem) (normal order!) 20
2-Level Pre-reassembling v 1st: GRO (Generic Receive Offload) Protocol stack GRO Reassembling NIC driver v 2nd: NVE s decapsulation processing NVE Protocol stack Reassembling & Decapsulation 21
Implementation v VXLAN over SCLP l CVSW component l Virtual NIC implementation of NVE Applications Protocol stack CVSW Encapsulation Decapsulation v GSO/GRO offloading l Linux kernel module virtio OVS https://github.com/sdnnit/cvsw_net Protocol stack Offload module NIC driver Physical server GSO (SCLP) GRO (SCLP) 22
Outline v Backgrounds Ø Network Virtualization Ø Tunneling Protocols Ø L4 protocol characteristics v Proposal Ø SCLP (Segment-oriented Connection-less Protocol) v Evaluation Ø -to- communication using VXLAN over SCLP 23
Evaluation v Throughput of -to- communication with Iperf 1. TCP communication 2. Effect of 2-level pre-reassembling v Competitors l VXLAN (over UDP) l NVGRE l STT l Geneve (w/o HW offloading) 24
Evaluation Environment 25
Evaluation Results (TCP) v TCP communication 26
Evaluation Results (Pre-reassembling) GRO s reassembling NVE s reassembling 27
Conclusion v Network virtualization l Overlay-based approach has become popular l VXLAN is a de-facto tunneling protocol l UDP-based tunneling has performance problems v Proposal: SCLP l Segment-oriented and connection-less L4 protocol l 2-level pre-reassembling before decapsulation l STT-comparable performance v Future work l Implementation of OVS-based SCLP l Open source 28
NVE: Network Virtualization Edge v Tunnel End-Point l Physical switches l Virtual switches Ø Open vswitch (OVS), NSX switch, Hyper-V virtual switch NVE Encapsulates/ Decapsulates VNI VNI Overlay Module L3 tunnel Logical ports 29
Evaluation Results (UDP) v UDP communication 30
Offloading Effects (STT) Bandwidth [Mbps] 14000 12000 10000 8000 6000 4000 2000 ALL:on TSO:off GSO:off GRO:off ALL:off Offload Tx / Rx NIC / Kernel TSO Tx NIC GSO Tx Kernel GRO Rx Kernel GRO effect! 0 4096 8192 16384 32768 65536 Packet Size [bytes] 31