From Electronic Design Automation to NDA: Treating Networks like Chips or Programs George Varghese With Collaborators at Berkeley, Cisco, MSR, Stanford
Networks today SQL 1001 10* P1 1* P2 Drop SQL,P2 P1 P2 Load balancing Access Control Lists (ACLs) Multiple Protocols: 6000 RFCs (MPLS, GRE...) Multiple Vendors: Broadcom, Arista, Cisco,... Manual Configurations: Additional arcane programs kept working by masters of complexity (Shenker) Crude tools: SNMP, NetFlow, TraceRoute,... 2
Simple questions hard to answer today o Which packets from A can reach B? o Is Group X provably isolated from Group Y? o Is the network causing poor performance or the server? o Why is my backbone utilization poor? o Is my load balancer distributing evenly? o Where are there mysterious packet losses? BOTTOM UP ANALYSIS OF EXISTING SYSTEMS 3
Motivation to do better Internal: o Errors often caused by configuration changes External: (2012 NANOG Network Operator Survey): o o 35% > 25 tickets per month, > 1 hour to resolve Welsh: vast majority of Google production failures due to bugs in configuration settings As we migrate to services ($100B public cloud market), network failure will be a debilitating cost. 4
Networks Tomorrow Online services latency, cost sensitive Merchant Silicon Build your own router Rise of Data centers Custom networks Software defined Networks (SDNs) custom design routing program P4 (next generation SDN) redefine hardware forwarding at runtime TOP DOWN DESIGN OF FUTURE NETWORKS TO OPTIMIZE GOAL 5
Digital Hardware Design as Inspiration? Specification Specification Functional Description (RTL) Functional Verification Logical Synthesis Static Timing Place & Route Design Rule Checking (DRC) Layout vs Schematic (LVS) Parasitic Extraction Testbench & Vectors Post Siilicon validation Policy Language, Semantics Verification (Reachabilty) Synthesis (e.g., Forwarding Rules) Performance verification? Network Topology Design Static checking (local checks) Wiring Checkers Interference estimation? Test Packet Generation Dynamic checkers/ debuggers Electronic Design Automation (McKeown SIGCOMM 2012) Network Design Automation (NDA)?
Outline Part 1: Tools for operators today o Static Analysis, Test Packet Generation o Analysis via Symbolic Execution Part 2: Tools, processes for designers & operators tomorrow. o Network Design Automation o Synthesis via Optimization 7
Many forwarding flavors/ 1 essence 10010 IP Router 10* P1 1* P2 PREFIX MATCH 01A1A2 MAC Bridge 01A1A2 P1... EXACT MATCH 5, 6 MPLS Switch 5 P1,Pop 5... INDEXED LOOKUP ESSENTIAL INSIGHT FOR OPENFLOW. USE SAME INSIGHT FOR UNDERSTANDING EXISTING PROTOCOLS 8
Idea: Treat Network as a Program Model header as point in high dimensional space and all networking boxes as transformers of header space Match 0xx1..x1 + 1 Packet Forwarding Action Send to port 3 Rewrite header 3 2 NETWORK BOX ABSTRACTED AS SET OF GUARDED COMMANDS.. NETWORK BECOMES A PROGRAM CAN USE PL METHODS 9
Header Space Framework Model all networking boxes as transformers of header space Transfer Function: 10
Computing Reachability (Kazemian et al, NSDI 12) All All Packets Packets that that A A can use can to possibly communicate send with B T -1 1 A T -1 1 Box 1 T 1 (X,A) All Packets that A can possibly send to box 2 through box 1 T -1 2 Box 2 T 2 (T 1 (X,A)) All Packets that A can possibly send to box 4 through box 1 T -1 4 Box 4 T 4 (T 1 (X,A)) Box 3 B T -1 3 T -1 3 T 3 (T 2 (T 1 (X,A)) U T 3 (T 4 (T 1 (X,A)) 11
Tool 3: Automatic Test Packet Generation (Zheng et al CoNext 12: ) As in hardware, automatically generate test packets to detect faults Different optimization from hardware testing: o o o Maximize link/queue coverage Performance (e.g., latency) not stuck-at faults Respect constraints on terminal ports Up to160x reduction over all-pairs - aspects in Microsoft Autopilot Bounded network graph allows simple set cover compared to program testing (KLEE) 12
Semantics (Plotkin et al) Symmetry Modularity New semantics that has: Symmetry Theorem: Can reduce fat tree to thin tree using a simulation and verify reachability cheaply in latter Modularity Theorem: reuse of parts of switching network 13
Tool 4: Batfish (Fogel et al, NSDI 05) So far all tools are for network data plane Need control plane tools for proactive analysis Check configuration sanity before applying to the network Check safety in the presence of certain routing changes Check back-ups are properly implemented Config files Environment Parser Control plane logic Logic solver Query solver Data plane state Provenance Query 14
Other Work Geometric Packet Classification. (SIGCOMM 1998) Static Reachability of IP Networks (INFOCOM 2005) Anteater. (SIGCOMM 2011) Veriflow. (HotSDN 2012) SAT Based Data Plane Verification (HotSDN 2012) Flowlog (HotSDN 2012) NetKat/Netcore 15
PART 2: NETWORK SYNTHESIS VIA OPTIMIZATION 16
Network Design Automation? Specification Policy Language, Semantics Verification Synthesis (e.g., Forwarding Rules) Performance verification? Network Design Test Packet Generation Early work Static checking (Local) Wiring Checkers Dynamic checkers/ debuggers HOW MIGHT WE GO BEYOND EARLY WORK? WHAT NEW AREAS CAN WE TOUCH? 17
Static Checkers: Booleans Quantities A B 3 G C Probability 1.0 0.9 D 1 2 Flow Bandwidth Given end-to-end flow rates, calculate link loads in face of failures (Juniwal et al, in progress) Given flow rate histograms, pack as many flows as possible & keep overflow probability within threshold 18
Probabilistic Knapsack (w. Bjorner, Gopalan, Karp, Kannan) Packing distributions Knapsack of size 3 Which items? Probability 1.0 0.9 Item A Item B 1.0 1.0 0.5 0.9 Item C 1 2 1 2 Correctness: Failure probability < T, e.g., T = 0.05 Performance: Find subset that minimizes expected waste Likely hard: even checking if one subset fails is exponential 1 2 19
Other Synthesis Problems Synthesizing Rules: Synthesize ACLs based on policy (Kang et al, Princeton) Synthesizing Virtual Networks: Rao et al (Purdue) & Xie at al (Princeton) Synthesizing Tables within a router: Table Synthesis P4 Routers (Jose et al, Stanford) Synthesizing Transports: Deadline driven alternatives to TCP (MSR Cambridge) 20
Interactive Debugging (AEV 15) Old Traceroute A C New Trace Bit + ID in data packets Timestamp, Router queue information collected later Existing network debuggers (MSR Sherlock, Stanford NDB, Berkeley Xtrace are Batch Debuggers What might equivalent be of setting a Watch point and then stepping into network? Example: Stepping Into by New Trace route message. Old TraceRoute not real-time. New hardware 21
Exploiting Domain Structure Technique Header Space Analysis (Symbolic Execution) Structure exploited Limited negation, no loops, small equivalence classes Net Plumber (Incremental Verification) ATPG (Test Generation) Exploiting Symmetry Network Graph, rule dependencies structure Network graph limits size of state space compared to KLEE Known symmetries because of design (vs on logical structures) 22
Conclusion Inflection Point: Rise of services, data centers, Software Defined Networks Ideas: Symbolic execution (analysis) & optimization (synthesis) Intellectual Opportunity: Rethink existing techniques exploiting domain structure Systems Opportunity Working chips with billions of gates Why not large operational networks next? 23
Collaborators MSR: Nikolaj Bjorner, Patrice Godefroid, Karthick Jayaram Nuno Lopes, Ratul Mahajan. Ming Zhang Stanford: Peyman Kazemian, Nick McKeown, James Zheng Berkeley: Garvit Juniwal, Sanjit Seshia Cisco: Mohammed Alizadeh, Tom Edsall