Euler: A System for Numerical Optimization of Programs




Swarat Chaudhuri (Rice University) and Armando Solar-Lezama (MIT)

Abstract. We give a tutorial introduction to Euler, a system for solving difficult optimization problems involving programs.

1 Introduction

This paper is a tutorial introduction to Euler, a system for solving unconstrained optimization problems of the following form:

  Let P be a function that is written in a standard C-like language, and freely uses control constructs like loops and conditional branches. Find an input x to P such that the output P(x) of P is minimized with respect to an appropriate distance measure on the space of outputs of P.

Many problems in software engineering are naturally framed as optimization questions like the above. While it may appear at first glance that a standard optimization package could solve such problems, this is often not so. For one, white-box optimization approaches like linear programming are ruled out here because the objective functions that they permit are too restricted. As for black-box numerical techniques like gradient descent or simplex search, they are applicable in principle, but often not in practice. The reason is that these methods work well only in relatively smooth search spaces; in contrast, branches and loops can cause even simple programs to have highly irregular, ill-conditioned behavior [1] (see Sec. 4 for an example). These challenges are arguably why numerical optimization has found so few uses in the world of program engineering.

The central thesis of the line of work leading to Euler is that program approximation techniques from program analysis can work together with black-box optimization toolkits, and make it possible to solve problems where programs are the targets of optimization. Thus, programs do not have to be black boxes, but neither do they have to fall into the constricted space of what is normally taken to be white-box optimization.

Specifically, the algorithmic core of Euler is smooth interpretation [1, 2], a scheme for approximating a program by a series of smooth mathematical functions. Rather than the original program, it is these smooth approximations that are used as the targets of numerical optimization. As we show in Sec. 4, the result is often vastly improved quality of results.

(This work was supported by NSF Awards #1156059 and #1116362.)

  double parallel() {
      Error = 0.0;
      for (t = 0; t < T; t += dt) {
          if (stage == STRAIGHT) {          // Drive in reverse
              if (t > ??) stage = INTURN;   // Parameter t1
          }
          if (stage == INTURN) {            // Turn the wheels towards the curb
              car.ang = car.ang - ??;       // Parameter A1
              if (t > ??) stage = OUTTURN;  // Parameter t2
          }
          if (stage == OUTTURN) {           // Turn the wheels away from the curb
              car.ang = car.ang + ??;       // Parameter A2
          }
          simulate_car(car);
      }
      Error = check_destination(car);       // Compute the error as the difference
      return Error;                         // between the desired and actual
  }                                         // positions of the car

Fig. 1. Sketch of a parallel parking controller

2 Using Euler

Programming Euler: Parallel parking. Euler can be downloaded from http://www.cs.rice.edu/~swarat/euler. We now show how to use the system to solve a program synthesis problem that reduces to numerical optimization. The goal here is to design a controller for parallel-parking a car. The programmer knows what such a controller should do at a high level: it should start by driving the car in reverse; then, at time t1, it should start turning the wheels towards the curb (let us say at an angular velocity A1) while continuing to move in reverse. At a subsequent time t2, it should start turning the wheels away from the curb (at velocity A2) until the car reaches the intended position. However, the programmer does not know the optimal values of t1, t2, A1, and A2, and the system must synthesize these values.

More precisely, let us define an objective function that determines the quality of a parallel parking attempt in terms of the error between the final position of the car and its intended position. The system's goal is to minimize this function. To solve this problem using Euler, we write a parameterized program, called a sketch, that reflects the programmer's partial knowledge of parallel parking. The core of this program is the function parallel shown in Fig. 1.
For space reasons, we omit the rest of the program; however, the complete sketch is part of the Euler distribution. It is easy to see that parallel encodes our partial knowledge of parallel parking. The terms ?? in the code are holes that correspond, in order, to the parameters t1, A1, t2, and A2; Euler will find appropriate values for them. Note that we can view parallel as a function parallel(t1, A1, t2, A2). This is the function that Euler is to minimize. (The input language of Euler is essentially the same as in the Sketch program synthesis system [4].)
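To make the reduction concrete, here is a minimal, self-contained sketch in Python of the objective this induces: a toy stand-in for parallel(t1, A1, t2, A2). The car model, constants, and goal position below are illustrative assumptions for exposition, not Euler's actual simulator or semantics; the point is only that, once the holes are named, the sketch denotes an ordinary numerical function of four parameters.

```python
import math

STRAIGHT, INTURN, OUTTURN = range(3)

def parallel(t1, A1, t2, A2, T=10.0, dt=0.1):
    """Toy version of the Fig. 1 sketch: simulate the three-stage
    controller and return the parking error (all dynamics made up)."""
    x, y, heading, ang = 0.0, 0.0, 0.0, 0.0   # pose and steering angle
    stage = STRAIGHT
    t = 0.0
    while t < T:
        if stage == STRAIGHT:                 # drive in reverse
            if t > t1:
                stage = INTURN
        if stage == INTURN:                   # turn wheels towards the curb
            ang -= A1 * dt
            if t > t2:
                stage = OUTTURN
        if stage == OUTTURN:                  # turn wheels away from the curb
            ang += A2 * dt
        # crude reverse-driving kinematics with a clamped steering angle
        steer = max(-0.5, min(0.5, ang))
        heading += math.tan(steer) * dt
        x -= math.cos(heading) * dt
        y -= math.sin(heading) * dt
        t += dt
    # error: squared distance from a (made-up) desired final position
    gx, gy = -3.0, -1.0
    return (x - gx) ** 2 + (y - gy) ** 2

err = parallel(2.0, 1.0, 6.0, 1.0)
```

Because of the stage switches, this function is discontinuous in t1 and t2, which is exactly the kind of objective the rest of the paper is about.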

Occasionally, a programmer may have some insight about the range of optimal values for a program parameter. This may be communicated using holes of the form ??(p,q), where (p,q) is a real interval. If such an annotation is offered, Euler will begin its search for the parameter from the region (p,q). Note, however, that this is no more than a hint to guide the search: Euler performs unconstrained optimization, and it is not guaranteed that the value finally found for the parameter will lie in this region. We leave the task of extending Euler to constrained optimization for future work.

Running Euler. Suppose we have built Euler and set up the appropriate library paths (specifically, Euler requires Gsl, the GNU Scientific Library), and that our sketch of the parallel parking controller has been saved in a file parallelpark.sk. To perform our optimization task, we compile the file by issuing the command "$ euler parallelpark.sk". This produces an executable parallelpark.out. Upon running this executable ("$ ./parallelpark.out"), we obtain an output of the following form:

  Parameter #1: -37.0916; Parameter #2: 19.4048; Parameter #3: -41.1728; Parameter #4: 1.11344; Optimal function value: 10.6003

That is, the optimal value for t1 is -37.0916, that for A1 is 19.4048, and so on. As mentioned earlier, Euler uses a combination of smooth interpretation and a black-box optimization method. In the present version of Euler, the latter is fixed to be the Nelder-Mead simplex method [3], a derivative-free nonlinear optimization technique. However, we also allow the programmer to run, without the support of smoothing, any optimization method available in Gsl. For example, to run the Nelder-Mead method without smoothing, the user issues the command "$ ./parallelpark.out -nosmooth -method neldermead". For other command-line flags supported by the tool, run "$ ./parallelpark.out -help".

3 System internals

We now briefly examine the internals of Euler.
First, we recall the core ideas of smooth interpretation [1, 2]. The central idea of smooth interpretation is to transform a program via Gaussian smoothing, a signal-processing technique for attenuating noise and discontinuities in real-world signals. A black-box optimization method is then applied to this smoothed program. Consider a program whose denotational semantics is a function P : R^k -> R: on input x in R^k, the program terminates and produces the output P(x). Also, let us consider Gaussian density functions N_{x,β} with mean x in R^k and a fixed standard deviation β > 0. Smooth interpretation aims to compute a function P_β equal to the convolution of P with N_{x,β}:

  P_β(x) = ∫_{y ∈ R^k} P(y) · N_{x,β}(y) dy.

For example, consider the program

  z := 0; if (x1 > 0 ∧ x2 > 0) then z := z - 2

where z is the output and x1 and x2 are inputs. The (discontinuous)

semantic function of the program is graphed in Fig. 2-(a). Applying Gaussian convolution to this function gives us a smooth function as in Fig. 2-(b). As computing the exact convolution of an arbitrary program is undecidable, any algorithm for program smoothing must introduce additional approximations. In prior work, we gave such an algorithm [1]; this is what Euler implements. We skip the details of this algorithm here.

Fig. 2. (a) A discontinuous program; (b) its Gaussian smoothing. [Both panels plot the output z over the inputs x1 and x2.]

However, it is worth noting that the function P_β is parameterized by the standard deviation β of N_{x,β}. Intuitively, β controls the extent of smoothing: higher values of β lead to greater smoothing and easier numerical search, while lower values imply a closer correspondence between P and P_β, and therefore greater accuracy of results. Finding a good value of β thus involves a tradeoff. Euler negotiates this tradeoff by starting with a moderately high value of β, optimizing the resultant smooth function, then iteratively reducing β and refining the search results.

Fig. 3. Control flow in Euler. [The Euler compiler turns the sketch into an executable containing the smoothed function; a top-level loop invokes the Gsl optimization backend on it for a decreasing sequence of values β0, β1, β2, ..., with query points x0, x1, x2, ...]

The high-level control flow in the tool is as in Fig. 3. Given an input file (say parallelpark.sk), the Euler compiler produces an executable called parallelpark.out. In the latter, the function parallel(x) has been replaced by a smooth function smoothparallel(β, x). When we execute parallelpark.out, β is set to an initial value β0, and the optimization backend (Gsl) is invoked on the function smoothparallel(β = β0, x). The optimization method repeatedly queries smoothparallel, starting with a random initial input x0 and perturbing it iteratively in subsequent queries, and finally returns a minimum to the top-level loop.
At this point, β is set to a new value and the same process is repeated. We continue the outer loop until it converges or there is a timeout.
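The whole pipeline, Gaussian smoothing plus the decreasing-β outer loop, can be sketched in a few lines of Python. This is a hedged illustration, not Euler's implementation: Euler smooths programs symbolically [1] rather than by sampling, and its backend is Gsl's Nelder-Mead rather than the naive coordinate search used here. The example program P is the two-input branch from Sec. 3 (taking the assignment to be z := z - 2).

```python
import random

def P(x1, x2):
    # The discontinuous example program:
    #   z := 0; if (x1 > 0 and x2 > 0) then z := z - 2
    z = 0.0
    if x1 > 0 and x2 > 0:
        z = z - 2.0
    return z

def smooth(f, beta, n=2000, seed=0):
    # Monte Carlo estimate of the Gaussian convolution
    #   f_beta(x) = integral of f(y) * N_{x,beta}(y) dy.
    # A fixed sample of offsets is reused at every query point, so the
    # estimated f_beta is a deterministic function we can minimize.
    rng = random.Random(seed)
    offs = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]
    def f_beta(x1, x2):
        return sum(f(x1 + beta * e1, x2 + beta * e2) for e1, e2 in offs) / n
    return f_beta

def local_descent(g, x, step=0.25, iters=100):
    # Deliberately naive search: move to the best axis-aligned neighbour
    # (ties keep the current point, so the search stalls on plateaus).
    for _ in range(iters):
        cands = [x, (x[0] + step, x[1]), (x[0] - step, x[1]),
                 (x[0], x[1] + step), (x[0], x[1] - step)]
        x = min(cands, key=lambda p: g(*p))
    return x

# Without smoothing the search stalls immediately: P has value 0
# everywhere around the start point, so no neighbour looks better.
stalled = local_descent(P, (-3.0, -3.0))

# Euler-style outer loop: minimize with a large beta, then restart the
# search on a less-smoothed objective from the previous answer.
# (The beta schedule below is ad hoc.)
x = (-3.0, -3.0)
for beta in (4.0, 2.0, 1.0, 0.5):
    x = local_descent(smooth(P, beta), x)
# x now lies inside the region x1 > 0, x2 > 0, where P attains its minimum -2.
```

Reusing one fixed set of Gaussian offsets across query points (common random numbers) is what lets a comparison-based search see the small, consistent slope that smoothing creates across the plateau.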

4 Results

We now present the results obtained by running Euler on our parallel parking example. These results are compared with those of running the Nelder-Mead method (Nm), without smoothing, on the same problem. As any local optimization algorithm starts its search from a random point x0, the minima x_min computed by Euler and Nm are random as well. However, the difference between the two approaches becomes apparent when we consider the distribution of parallel(x_min) under the two methods. In Fig. 4, we show the results of Euler and Nm on 100 runs from initial points generated by uniform random sampling of the region -10 < t1, A1, t2, A2 < 10. The values of parallel(x_min) on all these runs have been clustered into several intervals, and the number of runs leading to outputs in each interval plotted as a histogram. As we see, a much larger percentage of runs of Euler lead to lower (i.e., better) values of parallel(x_min).

Fig. 4. Percentages of runs that lead to specific ranges of parallel(x_min) (lower ranges are better).

The difference appears starker when we consider the number of runs that led to the best-case behavior of the two methods. The best output value computed by both methods was 10.6003; however, 68 of the 100 runs of Euler resulted in this value, whereas Nm had 26 runs within the range 10.0-15.0.

Let us now see what these different values of parallel(x_min) actually mean. We show in Fig. 6-(a) the trajectory of a car on an input x for which the value parallel(x) equals 40.0 (the most frequent output value, rounded to the first decimal, identified by Nm). The initial and final positions of the car are marked; the two black rectangles represent other cars; the arrows indicate the directions that the car faces at different points in its trajectory. Clearly, this parking job would earn any driving student a failing grade. On the other hand, Fig. 6-(b) shows the trajectory of a car parked using Euler on an input for which the output is 10.6, the most frequent output value found by Euler. Clearly, this is an excellent parking job.

Fig. 5. Landscape of numerical search for the parallel parking algorithm.

The reason why Nm fails becomes apparent when we examine the search space that it must navigate. In Fig. 5, we plot the function parallel(x) for different values of t2 and A1 (t1 and A2 are fixed at optimal values). Note that the search space is rife with discontinuities, plateaus, and local minima. In such extremely irregular search spaces, numerical methods are known not to work; smoothing works by making the space more regular. Unsurprisingly, similar phenomena were observed when we compared Euler with other optimization techniques implemented in Gsl.

These observations are not specific to parallel parking: similar effects are seen on other parameter synthesis benchmarks [1] that involve controllers with discontinuous switching. Several of these benchmarks, including a model of a thermostat and a model of a gear shift, are part of the Euler distribution. More generally, the reason why parallel is so ill-behaved is fundamental: even simple programs may contain discontinuous if-then-else statements, which can be piled on top of each other through composition and loops, causing exponentially many discontinuous regions. Smooth interpretation is only the first attempt from the software community to overcome these challenges; more approaches to the problem will surely emerge. Meanwhile, readers are welcome to try out Euler and advise us on how to improve it.

Fig. 6. (a) Parallel parking as done by Nm; (b) parallel parking as done by Euler. [Both panels mark the initial and final positions of the car.]

References

1. S. Chaudhuri and A. Solar-Lezama. Smooth interpretation. In PLDI, pages 279-291, 2010.
2. S. Chaudhuri and A. Solar-Lezama. Smoothing a program soundly and robustly. In CAV, pages 277-292, 2011.
3. J. A. Nelder and R. Mead. A simplex method for function minimization. The Computer Journal, 7(4):308-313, 1965.
4. A. Solar-Lezama, R. Rabbah, R. Bodík, and K. Ebcioğlu. Programming by sketching for bit-streaming programs. In PLDI, pages 281-294. ACM, 2005.