20. Switched Local Area Networks n Addressing in LANs (ARP) n Spanning tree algorithm n Forwarding in switched Ethernet LANs n Virtual LANs n Layer 3 switching n Datacenter networks John DeHart Based on material from Jon Turner, Roch Guerin and Kurose & Ross
Virtual LANs (VLAN) C A id=3 D n Allows hosts to be divided among different VLANs» Ethernet packets do not propagate beyond VLAN boundaries» to go between VLANs, packet must pass through a router but many switches support router-like functions that handle this VLANs often correspond to IP subnets, but need not n VLANs can increase network s traffic capacity n Packet s VLAN is identified by VLAN id carried in packet» 12 bit VLAN tag inserted just before the ethertype field» packets with VLAN id X are sent only on ports that belong to X host ports typically belong to one VLAN VLAN tag is typically added/removed by switches at host ports» routers and servers often participate in multiple VLANs in this case, endpoint adds the VLAN tag B id=7 E F 2
Ethernet Frame With VLAN Tag preamble (7 bytes) start of frame destination address pri d source address x8100 vlan id type (2 bytes) data (46-1500 bytes) CRC n Tag starts with two-byte value x8100» takes place of type field, allowing packet to be identified as tagged packet n Priority field (3 bits)» 0 for best-effort, 7 for highest priority n Drop Eligible Bit» indicates packet that can be preferentially discarded during congestion n VLAN Identifier (12 bits)» value of 0 means no vlan n Double tagging» a vlan tag that starts with 0x9100, identifies the first in a pair of tags» allows ISPs to use VLAN tags while carrying customer-tagged packets 3
VLANs and Overlay Networks n VLANs can be used to implement virtual links joining routers» configured by network managers» can be provisioned to provide guaranteed bandwidth» support for private WANs n Routers treat these much like physical links VLAN paths overlay links Ethernet switches» link rates can be configured so several overlay links can share physical links no constraint on individual link rates and several overlay links can share a single router port» makes it easy to add new links between routers, as needed» overlay link rates can be changed in response to traffic may require re-routing of some overlay links» effectively replaces router ports with cheaper switch ports 4
Some New-ish Developments n Faster STP response to topology changes» original protocol can take nearly a minute to converge on new spanning tree after a link fails» Rapid Spanning Tree Protocol cuts time to under 10 seconds n Computing multiple spanning trees» when VLANs were first introduced, all VLANs used the same spanning tree manual configuration required to use all available links» Multiple Spanning Tree Protocol allows automatic configuration of multiple subtrees (that is, may restrict a tree to a region) VLANs are mapped to trees (so several VLANs may share a tree) n Shortest Path Bridging (standardized in 2012)» link-state protocol (based on IS-IS) switches distribute topology information, compute shortest-pathtrees and configure local routing tables» Utilizes more of the network paths than STP allowed 5
MPLS MPLS header edge router n Multiprotocol Label Switching extends IP to provide virtual links within IP networks» MPLS-only switches are potentially less expensive than routers» MPLS features often added to standard routers» allows finer-grained management of traffic than IP routing alone n MPLS headers added by edge routers (before IP header) n Core routers switch packets using s in MPLS headers» s used to select entries in MPLS routing tables» packets may contain multiple stacked headers» routing table entries can be configured to select output based on, replace value with another, push/pop headers core router 6
Base MPLS Label Distribution n Relies on fact that all routers in a domain share the same routing table» Routers distribute s together with route entries Bandwidth can be included in modified routing protocols» IP packets are pre-pended with corresponding by ingress routers» Routers swap s at each hop based on downstream mapping» Egress router removes before forwarding the IP packet R6 R5 R4 modified link state flooding D A 7
Base MPLS Label Distribution Src In Dest R6 10 A 0 12 D 0 R5 8 A 1 iface In Dest 10 6 A 1 12 9 D 0 iface R6 R5 In R4 1 Dest 8 6 A 0 0 R2 iface R3 0 0 1 In D 0 Dest 6 - A 0 A iface 8
Base MPLS Label Distribution DA=A Src In Dest R6 10 A 0 12 D 0 R5 8 A 1 iface In Dest 10 6 A 1 12 9 D 0 iface DA=A R6 R5 In DA=A DA=A 1 0 R4 R3 L=8,DA=A Dest 8 6 A 0 L=10,DA=A R2 iface 0 0 1 L=6,DA=A In D L=6,DA=A 0 Dest 6 - A 0 DA=A A DA=A iface 9
Base MPLS Label Distribution n Relies on fact that all routers in a domain share the same routing table» Routers distribute s together with route entries» IP packets are pre-pended with corresponding by ingress routers» Routers swap s at each hop based on downstream mapping» Egress router removes before forwarding the IP packet <L2,R1> <L6,R1> <L5,R1> <L2,R1> <L6,R1> <L4,R1> <L2,R1> <L6,R1> <L2,R1> <L4,R1> <L4,R1> R1:158.130.0.0/16 10
Base MPLS Packet Forwarding n Relies on fact that all routers in a domain share the same routing table» Routers distribute s together with route entries» IP packets are pre-pended with corresponding by ingress routers» Routers swap s at each hop based on downstream mapping» Egress router removes before forwarding the IP packet <L2, 158.130.14.67> Packet to: 158.130.14.67 <L6, 158.130.14.67> <L4, 158.130.14.67> <L2, 158.130.14.67> <L6, 158.130.14.67> <L4, 158.130.14.67> <L2, 158.130.14.67> <L6, 158.130.14.67> <L4, 158.130.14.67> <L2, 158.130.14.67> <L4, 158.130.14.67> <158.130.14.67> 11
Explicit Routing n Overcome the shortest path, destination-based constraint of standard IP forwarding» Greater flexibility and control in distributing traffic across links Packets to: 158.130.0.0/16 R1 R2 R3 IP Forwarding <158.130.14.67> 12
Explicit Routing n Overcome the shortest path, destination-based constraint of standard IP forwarding» Greater flexibility and control in distributing traffic across links Packets to: 158.130.0.0/16 R1 R2 R3 From R1 MPLS Forwarding From R2 From R3 <158.130.14.67> 13
Layer 3 Switching n Many switches have extensive support for IP routing» routing features first added to connect subnets in different VLANs» feature sets have expanded as way to add value to products n Example features» IP forwarding, ARP, ICMP,, RIP,...» Diffserv QoS with 8 queues per link» IGMP support (snooping and querier functions)» Access Control lists (firewall functions) n Getting harder to distinguish routers and switches» routers support different kinds of layer 2 links (not just Ethernet) and support multiple L3 protocols (not just IP)» routers have more extensive feature sets, more configurable» routers have larger routing tables & buffers, flexible queueing» switches generally less expensive
Data Center Networks n 10 s to 100 s of thousands of hosts, often closely coupled, in close proximity:» e-business (e.g. Amazon)» content-servers (e.g., YouTube, Akamai, Apple, Microsoft)» search engines, data mining (e.g., Google) n Challenges» multiple applications, each serving many clients» managing/balancing load» multiple tenants must keep tenants isolated actions of one tenant must not interfere with another Inside a 40-ft Microsoft container, Chicago data center
Scaling Up n Example» Each rack has 32 servers each with 32 cores TOR switch with 1G and/or 10G links» 32 rack pod has 1024 servers, and 32K cores» 64 pod configuration has 32K servers, 1M cores n Networking challenges backbone switch pod switch Top-Of-Rack switch servers» addressing too many L2 addresses for typical switches» must balance load across many paths Ethernet routing too restrictive (even with advanced routing features) basic options: per packet load-balancing, flow-based load balancing» requires new class of data center switches n Some analysis of Amazon s AWS scale:......» http://www.enterprisetech.com/2014/11/14/rare-peek-massive-scale-aws/...... pod...
Blurring the L2/L3 Boundary n Market forces are pushing Ethernet beyond the LAN» traditionally, large volume of switch market has kept prices low» smaller sales volume for routers kept prices high n Technology improvements support greater capability» early switches were simple and cheap (no VLANs, a few thousand routing table entries)» modern switches can have >100K routing table entries and support thousands of VLANs, extensive IP features,...» market expectations have kept prices moderate, even as feature sets have ballooned n Not clear how successful some advanced features will be» big attraction of Ethernet is simple operation, but advanced features compromise simplicity» challenge for new switches to interoperate with legacy switches 17
Putting It Together An End-to-End Scenario browser DNS server Comcast network 68.80.0.0/13 school network 68.80.2.0/24 web page web server 64.233.169.105 Google s network 64.233.160.0/19 18
A day in the life connecting to the Internet connecting laptop needs to get its own IP address, addr of UDP IP Eth Phy UDP IP Eth Phy router (runs ) first-hop router, addr of DNS server: use request encapsulated in UDP, encapsulated in IP, encapsulated in 802.3 Ethernet Ethernet frame broadcast (dest: FFFFFFFFFFFF) on LAN, received at router running server Ethernet demuxed to IP demuxed, UDP demuxed to 19
A day in the life connecting to the Internet UDP IP Eth Phy UDP IP Eth Phy router (runs ) n server formulates ACK containing client s IP address, IP address of first-hop router for client, name & IP address of DNS server encapsulation at server, frame forwarded (switch learning) through LAN, demultiplexing at client client receives ACK reply Client now has IP address, knows name & addr of DNS server, IP address of its first-hop router 20
A day in the life ARP (before DNS, before HTTP) before sending HTTP request, need IP address of www.google.com: DNS DNS DNS DNS UDP DNS IP DNS query created, encapsulated ARP Eth Phy ARP query in UDP, encapsulated in IP, encapsulated in Eth. To send frame to router, need MAC address of router interface: ARP ARP reply ARP Eth Phy router (runs ) ARP query broadcast, received by router, which replies with ARP reply giving MAC address of router interface client now knows MAC address of first hop router, so can now send frame containing DNS query 21
A day in the life using DNS DNS DNS DNS DNS DNS UDP IP Eth Phy DNS DNS DNS DNS DNS UDP IP Eth Phy DNS server DNS router (runs ) IP datagram containing DNS query forwarded via LAN switch from client to 1 st hop router Comcast network 68.80.0.0/13 IP datagram forwarded from campus network into comcast network, routed (tables created by OSPF, IS-IS, EIGRP and/or BGP routing protocols) to DNS server demux ed to DNS server DNS server replies to client with IP address of www.google.com 22
A day in the life TCP connection carrying HTTP HTTP SYNACK SYNACK SYNACK HTTP TCP IP Eth Phy SYNACK SYNACK SYNACK TCP IP Eth Phy web server 64.233.169.105 router (runs ) to send HTTP request, client first opens TCP socket to web server TCP SYN segment (step 1 in 3- way handshake) inter-domain routed to web server web server responds with TCP SYNACK (step 2 in 3-way handshake) TCP connection established! 23
A day in the life HTTP request/reply HTTP HTTP HTTP HTTP HTTP TCP IP Eth Phy web page finally (!!!) displayed HTTP request sent into TCP socket HTTP HTTP HTTP HTTP HTTP TCP IP Eth Phy router (runs ) IP datagram containing HTTP request routed to www.google.com web server responds with HTTP reply (containing web page) web server 64.233.169.105 IP datagram containing HTTP reply routed back to client 24
Exercise 1. In the diagram at right, suppose that host d is on vlan 7 at switch D, host f is on vlan 3 at switch F and router c is on switch C and has connections to both VLANs. What sequence of links is used by a packet going from d to f assuming no other routers? c C B A id=7 id=3 E D f d F 25
Exercise 1. In the diagram at right, suppose that host d is on vlan 7 at switch D, host f is on vlan 3 at switch F and router c is on switch C and has connections to both VLANs. What sequence of links is used by a packet going from d to f assuming no other routers? f The packet would need to be delivered to router c so that it can be forwarded from vlan 7 to vlan 3. Assuming that host d knows the MAC address of router c and that entries are present in the switch forwarding tables of vlan 7 for that MAC address, the packet from host d is forwarded on links D-F, F-E, E-B, and B-C. Assuming that router c knows the MAC address of host f and that that entries are present in the switch forwarding tables of vlan 3 for that MAC address, the packet from host d is forwarded on links C-A, A-E, and E-F. c C B A id=7 id=3 E D d F 26
Exercise 2. The diagram at right represents a core network for some ISP. Assume all the nodes are MPLS switches and that each connects to one or more edge routers. Describe how MPLS can be used to distribute traffic B A F between switches C and F to use two different paths. Show MPLS routing table entries for all the switches along these paths, using different s on each hop. Can you spread the load like this if the nodes were all conventional routers, using OSPF-routing? C E D
Exercise 2. The diagram at right represents a core network for some ISP. Assume all the nodes are MPLS switches and that each connects to one or more edge routers. Describe how MPLS can be used to distribute traffic B A F between switches C and F to use two different paths. Show MPLS routing table entries for all the switches along these paths, using different s on each hop. Can you spread the load like this if the nodes were all conventional routers, using OSPF-routing? Two link disjoint paths between C and F are C-E-D-F and C-B-A-F. Two realize those paths, we need two distinct s (or sets of s). C would have two s, say L1 and L2, associated with subnets connected to F. Each would point to a different next hop, e.g., L1 points to E and L2 to B. Next would need associated mappings at the intermediate MPLS switches on each path. Specifically, B would have an entry mapping incoming L2 to next hop A with an outgoing of L2 A (which could be equal to L2), similarly, A would have an entry mapping incoming L2 A to next hop F and outgoing L2 F. A similar set of mappings would be present on the path C-E-D-F for L1. In this case, a similar outcome could be realized with OSPF, if both paths were equal cost shortest paths. In this case, traffic would be load-balanced across both paths. C E D 28