Abstractions For Software-Defined Networks



Similar documents
How To Understand The Power Of The Internet

Frenetic: A Programming Language for OpenFlow Networks

Software Defined Networks

The Internet: A Remarkable Story. Inside the Net: A Different Story. Networks are Hard to Manage. Software Defined Networking Concepts

Software Defined Networking What is it, how does it work, and what is it good for?

Formal Specification and Programming for SDN

Software Defined Networking

Modular SDN Programming with Pyretic

Software-Defined Networking (SDN) enables innovation in network

SDN. What's Software Defined Networking? Angelo Capossele

SDN Programming Languages. Programming SDNs!

Software Defined Networking What is it, how does it work, and what is it good for?

SDN AND SECURITY: Why Take Over the Hosts When You Can Take Over the Network

Software Defined Networking & Openflow

What is SDN? And Why Should I Care? Jim Metzler Vice President Ashton Metzler & Associates

Open Source Network: Software-Defined Networking (SDN) and OpenFlow

Abstractions for Network Update

Languages for Software-Defined Networks

Implementation of Address Learning/Packet Forwarding, Firewall and Load Balancing in Floodlight Controller for SDN Network Management

OpenFlow Based Load Balancing

How the emergence of OpenFlow and SDN will change the networking landscape

Lithium: Event-Driven Network Control

Testing Challenges for Modern Networks Built Using SDN and OpenFlow

Applying SDN to Network Management Problems. Nick Feamster University of Maryland

A very short history of networking

Software-Defined Networking for the Data Center. Dr. Peer Hasselmeyer NEC Laboratories Europe

SDN/Virtualization and Cloud Computing

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

How the Emergence of OpenFlow and SDN will Change the Networking Landscape

OpenFlow and Onix. OpenFlow: Enabling Innovation in Campus Networks. The Problem. We also want. How to run experiments in campus networks?

Network Virtualization Solutions

The CompCert verified C compiler

Information- Centric Networks. Section # 13.2: Alternatives Instructor: George Xylomenos Department: Informatics

How To Understand The Power Of A Network In A Microsoft Computer System (For A Micronetworking)

From Active & Programmable Networks to.. OpenFlow & Software Defined Networks. Prof. C. Tschudin, M. Sifalakis, T. Meyer, M. Monti, S.

Software Defined Network Application in Hospital

CSCI-1680 So ware-defined Networking

OpenFlow: Load Balancing in enterprise networks using Floodlight Controller

Transactional Support for SDN Control Planes "

STRUCTURE AND DESIGN OF SOFTWARE-DEFINED NETWORKS TEEMU KOPONEN NICIRA, VMWARE

Central Control over Distributed Routing fibbing.net

Definition of a White Box. Benefits of White Boxes

SDN and NFV in the WAN

Project 3 and Software-Defined Networking (SDN)

Analysis of Network Segmentation Techniques in Cloud Data Centers

Designing Virtual Network Security Architectures Dave Shackleford

Leveraging SDN and NFV in the WAN

Network Virtualization Network Admission Control Deployment Guide

Improving Network Management with Software Defined Networking

PLUMgrid Toolbox: Tools to Install, Operate and Monitor Your Virtual Network Infrastructure

Software-Defined Networking Architecture Framework for Multi-Tenant Enterprise Cloud Environments

Network Functions Virtualization in Home Networks

Software Defined Networking and OpenFlow: a Concise Review

BROCADE NETWORKING: EXPLORING SOFTWARE-DEFINED NETWORK. Gustavo Barros Systems Engineer Brocade Brasil

基 於 SDN 與 可 程 式 化 硬 體 架 構 之 雲 端 網 路 系 統 交 換 器

SOFTWARE-DEFINED NETWORKS

Ethernet-based Software Defined Network (SDN) Cloud Computing Research Center for Mobile Applications (CCMA), ITRI 雲 端 運 算 行 動 應 用 研 究 中 心

Global Headquarters: 5 Speen Street Framingham, MA USA P F

A Presentation at DGI 2014 Government Cloud Computing and Data Center Conference & Expo, Washington, DC. September 18, 2014.

SDN CENTRALIZED NETWORK COMMAND AND CONTROL

SOFTWARE-DEFINED NETWORKING AND OPENFLOW

Outline. Institute of Computer and Communication Network Engineering. Institute of Computer and Communication Network Engineering

Cloud Networking Disruption with Software Defined Network Virtualization. Ali Khayam

Network Management: - SNMP - Software Defined networking

MASTER THESIS. Performance Comparison Of the state of the art Openflow Controllers. Ahmed Sonba, Hassan Abdalkreim

SOFTWARE DEFINED NETWORKS REALITY CHECK. DENOG5, Darmstadt, 14/11/2013 Carsten Michel

Virtualizing the SAN with Software Defined Storage Networks

The Business Case for Software-Defined Networking

SDN Software Defined Networks

Software Defined Networking

How Router Technology Shapes Inter-Cloud Computing Service Architecture for The Future Internet

A Look at the New Converged Data Center

Software Defined Networking and the design of OpenFlow switches

co Characterizing and Tracing Packet Floods Using Cisco R

What is SDN all about?

Testing Software Defined Network (SDN) For Data Center and Cloud VERYX TECHNOLOGIES

The Road to SDN: Software-Based Networking and Security from Brocade

Network Virtualization

Center SDN & NFV. Modern Data IN THE

Ten Things to Look for in an SDN Controller

An SDN Reality Check. Authored by. Sponsored by

How To Set Up A Net Integration Firewall

SOFTWARE-DEFINED NETWORKING AND OPENFLOW

Transport SDN - Clearing the Roadblocks to Wide-scale Commercial

Software Defined Networking

SDN and Streamlining the Plumbing. Nick McKeown Stanford University

How To Manage A Network With A Network Management System (Qoe)

The Mandate for a Highly Automated IT Function

ViSION Status Update. Dan Savu Stefan Stancu. D. Savu - CERN openlab

CARRIER LANDSCAPE FOR SDN NEXT LEVEL OF TELCO INDUSTRILIZATION?

Software Defined Networks (SDN)

Architecture of distributed network processors: specifics of application in information security systems

SDN. WHITE PAPER Intel Ethernet Switch FM6000 Series - Software Defined Networking. Recep Ozdag Intel Corporation

Enabling Practical SDN Security Applications with OFX (The OpenFlow extension Framework)

Introduction to Software Defined Networking

Trusting SDN. Brett Sovereign Trusted Systems Research National Security Agency 28 October, 2015

Transcription:

Abstractions For Software-Defined Networks Nate Foster Cornell Jen Rexford & David Walker Princeton

Software-Defined Networking The Good Logically-centralized architecture Direct control over the network Images by Billy Perkins

Software-Defined Networking The Good Logically-centralized architecture Direct control over the network The Bad Low-level programming interfaces Functionality derived from hardware Images by Billy Perkins

Software-Defined Networking The Good Logically-centralized architecture Direct control over the network The Bad Low-level programming interfaces Functionality derived from hardware Images by Billy Perkins The Ugly Program pieces don t compose Many distributed systems challenges

This talk: Outline SDN Basics Architecture Programming model Network-Wide Abstractions Global network view Network updates Modularity Composing programs Declarative policies and queries Vision Challenges Opportunities

SDN Basics

Network-Wide Abstractions

Network Updates, Formally Global Config C, Q Update u C,Q Packet Queue

Network Updates, Formally Global Config C, Q Update u C,Q Packet Queue Theorem An update u from C 1 to C 2 is per-packet consistent if and only if it preserves all properties satisfied by C 1 and C 2.

Network Updates, Formally Global Config C, Q Update u C,Q Packet Queue Theorem An update u from C 1 to C 2 is per-packet consistent if and only if it preserves all properties satisfied by C 1 and C 2.

Modularity

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ICFP 11, September 19 21, 2011, Tokyo, Japan. Copyright c 2011 ACM 978-1-4503-0865-6/11/09... $10.00 reflect the official policy or position of the US Military Academy, the Department of the Army, the Department of Defense, or the US Government. Copyright 2012 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by a contractor or affiliate of the U.S. Government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only. POPL 12, January 25 27, 2012, Philadelphia, PA, USA. Copyright 2012 ACM 978-1-4503-1083-3/12/01... $10.00 Frenetic [ICFP 11, POPL 12] Network Programming Language Streaming functional language no events! Declarative semantics Separates reads (queries) from writes (policy) Nate Foster Cornell University Christopher Monsanto Princeton University Abstract Frenetic: A Network Programming Language Modern networks provide a variety of interrelated services including routing, traffic monitoring, load balancing, and access control. Unfortunately, the languages used to program today s networks lack modern features they are usually defined at the low level of abstraction supplied by the underlying hardware and they fail to provide even rudimentary support for modular programming. As a result, network programs tend to be complicated, error-prone, and difficult to maintain. This paper presents Frenetic, a high-level language for programming distributed collections of network switches. Frenetic provides a declarative query language for classifying and aggregating network traffic as well as a functional reactive combinator library for describing high-level packet-forwarding policies. Unlike prior work in this domain, these constructs are by design fully compositional, which facilitates modular reasoning and enables code reuse. This important property is enabled by Frenetic s novel runtime system which manages all of the details related to installing, uninstalling, and querying low-level packet-processing rules on physical switches. Overall, this paper makes three main contributions: (1) We analyze the state-of-the art in languages for programming networks and identify the key limitations; (2) We present a language design that addresses these limitations, using a series of examples to motivate and validate our choices; (3) We describe an implementation of the language and evaluate its performance on several benchmarks. Categories and Subject Descriptors D.3.2 [Programming Languages]: Language Classifications Specialized application languages General Terms Languages, Design Keywords Network programming languages, domain-specific languages, functional reactive programming, OpenFlow 1. Introduction Today s networks consist of hardware and software components that are closed and proprietary. The difficulty of changing these components has had a chilling effect on innovation, and forced network administrators to express policies through complicated and frustratingly brittle interfaces. As discussed in recent a New York Rob Harrison Michael J. Freedman Princeton University Princeton University Jennifer Rexford Alec Story David Walker Princeton University Cornell University Princeton University Times article [30], the rise of data centers and cloud computing have brought these problems into sharp relief and led a number of networks researchers to reconsider the fundamental assumptions that underpin today s network architectures. In particular, significant momentum has gathered behind Open- Flow, a new platform that opens up the software that controls the network while also allowing packets to be processed using fast, commodity switching hardware [31]. OpenFlow defines a standard interface for installing flexible packet-forwarding rules on physical network switches using a programmable controller that runs separately on a stock machine. The most well-known controller platform is NOX [20], though there are several others [1, 8, 25, 39]. OpenFlow is supported by a number of commercial Ethernet switch vendors, and has been deployed in several campus and backbone networks. Using OpenFlow, researchers have already created a variety of controller applications that introduce new network functionality, like flexible access control [9, 33], Web server load balancing [21, 40], energy-efficient networking [22], and seamless virtual-machine migration [18]. Unfortunately, while OpenFlow and NOX now make it possible to implement exciting new network services, they do not make it easy. OpenFlow programmers must constantly grapple with several difficult challenges. First, networks often perform multiple tasks, like routing, access control, and traffic monitoring. Unfortunately, decoupling these tasks from each other and implementing them independently in separate modules is effectively impossible, since packet-handling rules (un)installed by one module often interfere with overlapping rules (un)installed by other modules. Second, the OpenFlow/NOX interface is defined at a very low level of abstraction. For example, the OpenFlow rule algebra directly reflects the capabilities of the switch hardware (e.g., bit patterns and integer priorities). Simple high-level concepts such as set difference require multiple rules and priorities to implement correctly and more powerful wildcard rules are a limited hardware resource that programmers must manage by hand. Third, controller programs only receive events for packets the switches do not know how to handle. Code that installs a forwarding rule might prevent another, different event-driven call-back from being triggered. As a result, writing programs for Open- Flow/NOX quickly becomes a difficult exercise in two-tiered programming programmers must simultaneously reason about the packets that will processed on switches and those that will be processed on the controller. Fourth, because a network of switches is a distributed system, it is susceptible to various kinds of race conditions. For example, a common NOX programming idiom is to handle the first packet of each network flow on the controller and install switch-level rules to handle the remaining packets. However, such programs can be susceptible to errors if the second, third, or fourth packets in a A Compiler and Run-time System for Network Programming Languages Compiler and Run-time System Translates high-level programs to switches Automatically manages low-level resources Christopher Monsanto Nate Foster Princeton University Cornell University Abstract Software-defined networks (SDNs) are a new kind of network architecture in which a controller machine manages a distributed collection of switches by instructing them to install or uninstall packet-forwarding rules and report traffic statistics. The recently formed Open Networking Consortium, whose members include Google, Facebook, Microsoft, Verizon, and others, hopes to use this architecture to transform the way that enterprise and data center networks are implemented. In this paper, we define a high-level, declarative language, called NetCore, for expressing packet-forwarding policies on SDNs. Net- Core is expressive, compositional, and has a formal semantics. To ensure that a majority of packets are processed efficiently on switches instead of on the controller we present new compilation algorithms for NetCore and couple them with a new run-time system that issues rule installation commands and traffic-statistics queries to switches. Together, the compiler and run-time system generate efficient rules whenever possible and outperform the simple, manual techniques commonly used to program SDNs today. In addition, the algorithms we develop are generic, assuming only that the packet-matching capabilities available on switches satisfy some basic algebraic laws. Overall, this paper delivers a new design for a high-level network programming language; an improved set of compiler algorithms; a new run-time system for SDN architectures; the first formal semantics and proofs of correctness in this domain; and an implementation and evaluation that demonstrates the performance benefits over traditional manual techniques. Categories and Subject Descriptors D.3.2 [Programming Languages]: Language Classifications Specialized application languages General Terms Languages, Design Keywords Software-defined Networking, OpenFlow, Frenetic, Network programming languages, Domain specific languages The views expressed in this paper are those of the authors and do not Rob Harrison David Walker US Military Academy Princeton University 1. Introduction A network is a collection of connected devices that route traffic from one place to another. Networks are pervasive: they connect students and faculty on university campuses, they send packets between a variety of mobile devices in modern households, they route search requests and shopping orders through data centers, they tunnel between corporate networks in San Francisco and Helsinki, and they connect the steering wheel to the drive train in your car. Naturally, these networks have different purposes, properties, and requirements. To service these requirements, companies like Cisco, Juniper, and others manufacture a variety of devices including routers (which forward packets based on IP addresses), switches (which forward packets based on MAC addresses), NAT boxes (which translate addresses within a network), firewalls (which squelch forbidden or unwanted traffic), and load balancers (which distribute work among servers), to name a few. While each of these devices behaves differently, internally they are all built on top of a data plane that buffers, forwards, drops, tags, rate limits, and collects statistics about packets at high speed. More complicated devices like routers also have a control plane that run algorithms for tracking the topology of the network and computing routes through it. Using statistics gathered from the data plane and the results computed using the device s specialized algorithms, the control plane installs or uninstalls forwarding rules in the data plane. The data plane is built out of fast, special-purpose hardware, capable of forwarding packets at the rate at which they arrive, while the control plane is typically implemented in software. Remarkably, however, traditional networks appear to be on the verge of a major upheaval. On March 11th, 2011, Deutsche Telekom,Facebook, Google,Microsoft,Verizon,and Yahoo!,owners of some of the largest networks in the world, announced the formation of the Open Networking Foundation [19]. The foundation s proposal is extraordinarily simple: eliminate the control plane from network devices. Instead of baking specific control software into each device, the foundation proposes a standard protocol that a separate, general-purpose machine called a controller can use to program and query the data planes of many cooperating devices. By moving the control plane from special-purpose devices onto stock machines, companies like Google will be able to buy cheap, commodity switches, and write controller programs to customize and optimize their networks however they choose. Networks built on this new architecture, which arose from earlier work on Ethane [4] and 4D [10], are now commonly referred to as Software-Defined Networks (SDNs). Already, several commercial switch vendors support OpenFlow [17], a concrete realization of the switch-controller protocol required for implementing SDNs, and researchers have used OpenFlow to develop new network-wide algorithms for server load-balancing, data center routing, energyefficient network management, virtualization, fine-grained access

Vision (and Challenges)

Tony Hoare s Mistake I call it my billion-dollar mistake. It was the invention of the null reference in 1965. My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.

Abstract Compilers should be correct. To improve the quality of C compilers, we created Csmith, a randomized test-case generation tool, and spent three years using it to find compiler bugs. During this period we reported more than 325 previously unknown bugs to compiler developers. Every compiler we tested was found to crash and also to silently generate wrong code when presented with valid input. In this paper we present our compiler-testing tool and the results of our bug-hunting study. Our first contribution is to advance the state of the art in compiler testing. Unlike previous tools, Csmith generates programs that cover a large subset of C while avoiding the undefined and unspecified behaviors that would destroy its ability to automatically find wrong-code bugs. Our second contribution is a collection of qualitative and quantitative results about the bugs we have found in open-source C compilers. Categories and Subject Descriptors D.2.5 [Software Engineering]: Testing and Debugging testing tools; D.3.2 [Programming Languages]: Language Classifications C; D.3.4 [Programming Languages]: Processors compilers General Terms Languages, Reliability Keywords compiler testing, compiler defect, automated testing, random testing, random program generation 1. Introduction The theory of compilation is well developed, and there are compiler frameworks in which many optimizations have been proved correct. Nevertheless, the practical art of compiler construction involves a morass of trade-offs between compilation speed, code quality, code debuggability, compiler modularity, compiler retargetability, and other goals. It should be no surprise that optimizing compilers like all complex software systems contain bugs. Miscompilations often happen because optimization safety checks are inadequate, static analyses are unsound, or transformations are flawed. These bugs are out of reach for current and future automated program-verification tools because the specifications that need to be checked were never written down in a precise way, if they were written down at all. Where verification is impractical, however, other methods for improving compiler quality can succeed. This paper reports our experience in using testing to make C compilers better. c ACM, 2011. This is the author s version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the 2011 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), San Jose, CA, Jun. 2011, http://doi.acm.org/10.1145/nnnnnnn.nnnnnnn Xuejun Yang Yang Chen Eric Eide John Regehr University of Utah, School of Computing { jxyang, chenyang, eeide, regehr }@cs.utah.edu 1 1 int foo (void) { 2 signed char x = 1; 3 unsigned char y = 255; 4 return x > y; 5 } Figure 1. We found a bug in the version of GCC that shipped with Ubuntu Linux 8.04.1 for x86. At all optimization levels it compiles this function to return 1; the correct result is 0. The Ubuntu compiler was heavily patched; the base version of GCC did not have this bug. We created Csmith, a randomized test-case generator that supports compiler bug-hunting using differential testing. Csmith generates a C program; a test harness then compiles the program using several compilers, runs the executables, and compares the outputs. Although this compiler-testing approach has been used before [6, 16, 23], Csmith s test-generation techniques substantially advance the state of the art by generating random programs that are expressive containing complex code using many C language features while also ensuring that every generated program has a single interpretation. To have a unique interpretation, a program must not execute any of the 191 kinds of undefined behavior, nor depend on any of the 52 kinds of unspecified behavior, that are described in the C99 standard. For the past three years, we have used Csmith to discover bugs in C compilers. Our results are perhaps surprising in their extent: to date, we have found and reported more than 325 bugs in mainstream C compilers including GCC, LLVM, and commercial tools. Figure 1 shows a representative example. Every compiler that we have tested, including several that are routinely used to compile safety-critical embedded systems, has been crashed and also shown to silently miscompile valid inputs. As measured by the responses to our bug reports, the defects discovered by Csmith are important. Most of the bugs we have reported against GCC and LLVM have been fixed. Twenty-five of our reported GCC bugs have been classified as P1, the maximum, release-blocking priority for GCC defects. Our results suggest that fixed test suites the main way that compilers are tested are an inadequate mechanism for quality control. We claim that Csmith is an effective bug-finding tool in part because it generates tests that explore atypical combinations of C language features. Atypical code is not unimportant code, however; it is simply underrepresented in fixed compiler test suites. Developers who stray outside the well-tested paths that represent a compiler s comfort zone for example by writing kernel code or embedded systems code, using esoteric compiler options, or automatically generating code can encounter bugs quite frequently. This is a significant problem for complex systems. Wolfe [30], talking about independent software vendors (ISVs) says: An ISV with a complex code can work around correctness, turn off the optimizer in one or two files, and usually they have to do that for any of the compilers they use (emphasis ours). As another example, the front Programming Language Abstractions Many high-profile mistakes! Polymorphism + references Bounded quantification Pretty much every C compiler :-) Finding and Understanding Bugs in C Compilers

Abstract Compilers should be correct. To improve the quality of C compilers, we created Csmith, a randomized test-case generation tool, and spent three years using it to find compiler bugs. During this period we reported more than 325 previously unknown bugs to compiler developers. Every compiler we tested was found to crash and also to silently generate wrong code when presented with valid input. In this paper we present our compiler-testing tool and the results of our bug-hunting study. Our first contribution is to advance the state of the art in compiler testing. Unlike previous tools, Csmith generates programs that cover a large subset of C while avoiding the undefined and unspecified behaviors that would destroy its ability to automatically find wrong-code bugs. Our second contribution is a collection of qualitative and quantitative results about the bugs we have found in open-source C compilers. Categories and Subject Descriptors D.2.5 [Software Engineering]: Testing and Debugging testing tools; D.3.2 [Programming Languages]: Language Classifications C; D.3.4 [Programming Languages]: Processors compilers General Terms Languages, Reliability Keywords compiler testing, compiler defect, automated testing, random testing, random program generation 1. Introduction The theory of compilation is well developed, and there are compiler frameworks in which many optimizations have been proved correct. Nevertheless, the practical art of compiler construction involves a morass of trade-offs between compilation speed, code quality, code debuggability, compiler modularity, compiler retargetability, and other goals. It should be no surprise that optimizing compilers like all complex software systems contain bugs. Miscompilations often happen because optimization safety checks are inadequate, static analyses are unsound, or transformations are flawed. These bugs are out of reach for current and future automated program-verification tools because the specifications that need to be checked were never written down in a precise way, if they were written down at all. Where verification is impractical, however, other methods for improving compiler quality can succeed. This paper reports our experience in using testing to make C compilers better. c ACM, 2011. This is the author s version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the 2011 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), San Jose, CA, Jun. 2011, http://doi.acm.org/10.1145/nnnnnnn.nnnnnnn Xuejun Yang Yang Chen Eric Eide John Regehr University of Utah, School of Computing { jxyang, chenyang, eeide, regehr }@cs.utah.edu 1 1 int foo (void) { 2 signed char x = 1; 3 unsigned char y = 255; 4 return x > y; 5 } Figure 1. We found a bug in the version of GCC that shipped with Ubuntu Linux 8.04.1 for x86. At all optimization levels it compiles this function to return 1; the correct result is 0. The Ubuntu compiler was heavily patched; the base version of GCC did not have this bug. We created Csmith, a randomized test-case generator that supports compiler bug-hunting using differential testing. Csmith generates a C program; a test harness then compiles the program using several compilers, runs the executables, and compares the outputs. Although this compiler-testing approach has been used before [6, 16, 23], Csmith s test-generation techniques substantially advance the state of the art by generating random programs that are expressive containing complex code using many C language features while also ensuring that every generated program has a single interpretation. To have a unique interpretation, a program must not execute any of the 191 kinds of undefined behavior, nor depend on any of the 52 kinds of unspecified behavior, that are described in the C99 standard. For the past three years, we have used Csmith to discover bugs in C compilers. Our results are perhaps surprising in their extent: to date, we have found and reported more than 325 bugs in mainstream C compilers including GCC, LLVM, and commercial tools. Figure 1 shows a representative example. Every compiler that we have tested, including several that are routinely used to compile safety-critical embedded systems, has been crashed and also shown to silently miscompile valid inputs. As measured by the responses to our bug reports, the defects discovered by Csmith are important. Most of the bugs we have reported against GCC and LLVM have been fixed. Twenty-five of our reported GCC bugs have been classified as P1, the maximum, release-blocking priority for GCC defects. Our results suggest that fixed test suites the main way that compilers are tested are an inadequate mechanism for quality control. We claim that Csmith is an effective bug-finding tool in part because it generates tests that explore atypical combinations of C language features. Atypical code is not unimportant code, however; it is simply underrepresented in fixed compiler test suites. Developers who stray outside the well-tested paths that represent a compiler s comfort zone for example by writing kernel code or embedded systems code, using esoteric compiler options, or automatically generating code can encounter bugs quite frequently. This is a significant problem for complex systems. Wolfe [30], talking about independent software vendors (ISVs) says: An ISV with a complex code can work around correctness, turn off the optimizer in one or two files, and usually they have to do that for any of the compilers they use (emphasis ours). As another example, the front Programming Language Abstractions Many high-profile mistakes! Polymorphism + references Bounded quantification Pretty much every C compiler :-) So language researchers have developed a body of techniques for modeling and reasoning precisely about language abstractions Operational semantics Denotational semantics Axiomatic semantics Bisimulations Finding and Understanding Bugs in C Compilers σ,c = φ [[e]] P P e e e v Γ e : τ

Abstract Compilers should be correct. To improve the quality of C compilers, we created Csmith, a randomized test-case generation tool, and spent three years using it to find compiler bugs. During this period we reported more than 325 previously unknown bugs to compiler developers. Every compiler we tested was found to crash and also to silently generate wrong code when presented with valid input. In this paper we present our compiler-testing tool and the results of our bug-hunting study. Our first contribution is to advance the state of the art in compiler testing. Unlike previous tools, Csmith generates programs that cover a large subset of C while avoiding the undefined and unspecified behaviors that would destroy its ability to automatically find wrong-code bugs. Our second contribution is a collection of qualitative and quantitative results about the bugs we have found in open-source C compilers. Categories and Subject Descriptors D.2.5 [Software Engineering]: Testing and Debugging testing tools; D.3.2 [Programming Languages]: Language Classifications C; D.3.4 [Programming Languages]: Processors compilers General Terms Languages, Reliability Keywords compiler testing, compiler defect, automated testing, random testing, random program generation 1. Introduction The theory of compilation is well developed, and there are compiler frameworks in which many optimizations have been proved correct. Nevertheless, the practical art of compiler construction involves a morass of trade-offs between compilation speed, code quality, code debuggability, compiler modularity, compiler retargetability, and other goals. It should be no surprise that optimizing compilers like all complex software systems contain bugs. Miscompilations often happen because optimization safety checks are inadequate, static analyses are unsound, or transformations are flawed. These bugs are out of reach for current and future automated program-verification tools because the specifications that need to be checked were never written down in a precise way, if they were written down at all. Where verification is impractical, however, other methods for improving compiler quality can succeed. This paper reports our experience in using testing to make C compilers better. c ACM, 2011. This is the author s version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the 2011 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), San Jose, CA, Jun. 2011, http://doi.acm.org/10.1145/nnnnnnn.nnnnnnn Xuejun Yang Yang Chen Eric Eide John Regehr University of Utah, School of Computing { jxyang, chenyang, eeide, regehr }@cs.utah.edu 1 1 int foo (void) { 2 signed char x = 1; 3 unsigned char y = 255; 4 return x > y; 5 } Figure 1. We found a bug in the version of GCC that shipped with Ubuntu Linux 8.04.1 for x86. At all optimization levels it compiles this function to return 1; the correct result is 0. The Ubuntu compiler was heavily patched; the base version of GCC did not have this bug. We created Csmith, a randomized test-case generator that supports compiler bug-hunting using differential testing. Csmith generates a C program; a test harness then compiles the program using several compilers, runs the executables, and compares the outputs. Although this compiler-testing approach has been used before [6, 16, 23], Csmith s test-generation techniques substantially advance the state of the art by generating random programs that are expressive containing complex code using many C language features while also ensuring that every generated program has a single interpretation. To have a unique interpretation, a program must not execute any of the 191 kinds of undefined behavior, nor depend on any of the 52 kinds of unspecified behavior, that are described in the C99 standard. For the past three years, we have used Csmith to discover bugs in C compilers. Our results are perhaps surprising in their extent: to date, we have found and reported more than 325 bugs in mainstream C compilers including GCC, LLVM, and commercial tools. Figure 1 shows a representative example. Every compiler that we have tested, including several that are routinely used to compile safety-critical embedded systems, has been crashed and also shown to silently miscompile valid inputs. As measured by the responses to our bug reports, the defects discovered by Csmith are important. Most of the bugs we have reported against GCC and LLVM have been fixed. Twenty-five of our reported GCC bugs have been classified as P1, the maximum, release-blocking priority for GCC defects. Our results suggest that fixed test suites the main way that compilers are tested are an inadequate mechanism for quality control. We claim that Csmith is an effective bug-finding tool in part because it generates tests that explore atypical combinations of C language features. Atypical code is not unimportant code, however; it is simply underrepresented in fixed compiler test suites. Developers who stray outside the well-tested paths that represent a compiler s comfort zone for example by writing kernel code or embedded systems code, using esoteric compiler options, or automatically generating code can encounter bugs quite frequently. This is a significant problem for complex systems. Wolfe [30], talking about independent software vendors (ISVs) says: An ISV with a complex code can work around correctness, turn off the optimizer in one or two files, and usually they have to do that for any of the compilers they use (emphasis ours). As another example, the front Programming Language Abstractions Many high-profile mistakes! Polymorphism + references Bounded quantification Pretty much every C compiler :-) Finding and Understanding Bugs in C Compilers So language researchers have developed a body of techniques for modeling and reasoning precisely about language abstractions Operational semantics Denotational semantics Axiomatic semantics Bisimulations Proving obvious theorems often reveals bugs [[e]] e e σ,c = φ Γ e : τ Writing down a semantics is an efficient way to communicate ideas A lot of effort has gone into making these techniques scalable! P P e v

Opportunities and Challenges

Opportunities and Challenges SDNs offer a unique opportunity to Define new abstractions for networks Develop their mathematical properties Design efficient implementations Deploy verification tools that provide assurance and avoid (the analogues of ) Hoare s mistake!

Opportunities and Challenges SDNs offer a unique opportunity to Define new abstractions for networks Develop their mathematical properties Design efficient implementations Deploy verification tools that provide assurance and avoid (the analogues of ) Hoare s mistake! Challenge #2 Want to program virtual networks Slices? Logical forwarding plane? Want to validate implementations, prove isolation properties

Opportunities and Challenges SDNs offer a unique opportunity to Define new abstractions for networks Develop their mathematical properties Design efficient implementations Deploy verification tools that provide assurance and avoid (the analogues of ) Hoare s mistake! Challenge #1 Combining conflicting policies Constraint-based policies? FML [Hinrichs+ 09] and Cologne [Liu+ 12] Challenge #2 Want to program virtual networks Slices? Logical forwarding plane? Want to validate implementations, prove isolation properties

Thank You! Collaborators Shrutarshi Basu (Cornell) Mike Freedman (Princeton) Stephen Gutz (Cornell) Rob Harrison (West Point) Chris Monsanto (Princeton) Joshua Reich (Princeton) Mark Reitblatt (Cornell) Emin Gün Sirer (Cornell) Cole Schlesinger (Princeton) Alec Story (Cornell) Jen Rexford (Princeton) David Walker (Princeton) http://frenetic-lang.org Funding