Why have SSA? Birth Points (cont) Birth Points (a notion due to Tarjan)



Similar documents
CSE 504: Compiler Design. Data Flow Analysis

Static Single Assignment Form. Masataka Sassa (Tokyo Institute of Technology)

Lecture 2 Introduction to Data Flow Analysis

Introduction to Data-flow analysis

Optimizations. Optimization Safety. Optimization Safety. Control Flow Graphs. Code transformations to improve program

Data Structure [Question Bank]

2) Write in detail the issues in the design of code generator.

1) The postfix expression for the infix expression A+B*(C+D)/F+D*E is ABCD+*F/DE*++

Advanced compiler construction. General course information. Teacher & assistant. Course goals. Evaluation. Grading scheme. Michel Schinz

CIS570 Lecture 9 Reuse Optimization II 3

DATA STRUCTURES USING C

Algorithms and Data Structures

Krishna Institute of Engineering & Technology, Ghaziabad Department of Computer Application MCA-213 : DATA STRUCTURES USING C

Data Structures. Level 6 C Module Descriptor

Data Structures and Algorithms

Instruction scheduling

PES Institute of Technology-BSC QUESTION BANK

Data Structures Fibonacci Heaps, Amortized Analysis

Review; questions Discussion of Semester Project Arbitrary interprocedural control flow Assign (see Schedule for links)

Obfuscation: know your enemy

10CS35: Data Structures Using C

CIS570 Lecture 4 Introduction to Data-flow Analysis 3

Stack Allocation. Run-Time Data Structures. Static Structures

3. The Junction Tree Algorithms

Efficient Data Structures for Decision Diagrams

Sequential Data Structures

Shortest Path Algorithms

Binary Search Trees CMPSC 122

Trace-Based and Sample-Based Profiling in Rational Application Developer

Static Analysis. Find the Bug! : Analysis of Software Artifacts. Jonathan Aldrich. disable interrupts. ERROR: returning with interrupts disabled

Class notes Program Analysis course given by Prof. Mooly Sagiv Computer Science Department, Tel Aviv University second lecture 8/3/2007

This asserts two sets are equal iff they have the same elements, that is, a set is determined by its elements.

ABCD: Eliminating Array Bounds Checks on Demand

GUJARAT TECHNOLOGICAL UNIVERSITY, AHMEDABAD, GUJARAT. Course Curriculum. DATA STRUCTURES (Code: )

Home Page. Data Structures. Title Page. Page 1 of 24. Go Back. Full Screen. Close. Quit

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

1. The memory address of the first element of an array is called A. floor address B. foundation addressc. first address D.

Solving Rational Equations and Inequalities

Common Data Structures

Pseudo code Tutorial and Exercises Teacher s Version

Partitioning under the hood in MySQL 5.5

A Note on Maximum Independent Sets in Rectangle Intersection Graphs

Automata and Computability. Solutions to Exercises

Algorithms and Data Structures

Lecture Notes on Binary Search Trees

Memory Allocation. Static Allocation. Dynamic Allocation. Memory Management. Dynamic Allocation. Dynamic Storage Allocation

Linear Scan Register Allocation on SSA Form

MAX = 5 Current = 0 'This will declare an array with 5 elements. Inserting a Value onto the Stack (Push)

x 2 + y 2 = 1 y 1 = x 2 + 2x y = x 2 + 2x + 1

CompSci-61B, Data Structures Final Exam

Chapter 7 - Roots, Radicals, and Complex Numbers

Absolute Value Equations and Inequalities

The Goldberg Rao Algorithm for the Maximum Flow Problem

Compilers I - Chapter 4: Generating Better Code

Dynamic Programming. Lecture Overview Introduction

Data Structures and Data Manipulation

1.4 Compound Inequalities

Tail Recursion Without Space Leaks

Intrusion Detection via Static Analysis

Java Software Structures

Satisfiability Checking

Data Structure with C

Binary Search Trees. A Generic Tree. Binary Trees. Nodes in a binary search tree ( B-S-T) are of the form. P parent. Key. Satellite data L R

Software Pipelining - Modulo Scheduling

Address Taken FIAlias, which is equivalent to Steensgaard. CS553 Lecture Alias Analysis II 2

Any two nodes which are connected by an edge in a graph are called adjacent node.

2. (a) Explain the strassen s matrix multiplication. (b) Write deletion algorithm, of Binary search tree. [8+8]

Introduction to Computers and Programming. Testing

6.3 Conditional Probability and Independence

Chapter 2: Linear Equations and Inequalities Lecture notes Math 1010

Small Maximal Independent Sets and Faster Exact Graph Coloring

CS 3719 (Theory of Computation and Algorithms) Lecture 4

language 1 (source) compiler language 2 (target) Figure 1: Compiling a program

gprof: a Call Graph Execution Profiler 1

Inside the PostgreSQL Query Optimizer

MapReduce and Distributed Data Analysis. Sergei Vassilvitskii Google Research

Data Structures Using C++ 2E. Chapter 5 Linked Lists

Instruction Scheduling. Software Pipelining - 2

Parallelization: Binary Tree Traversal

CSE 135: Introduction to Theory of Computation Decidability and Recognizability

The Tower of Hanoi. Recursion Solution. Recursive Function. Time Complexity. Recursive Thinking. Why Recursion? n! = n* (n-1)!

Level 2 Routing: LAN Bridges and Switches

Static Analysis of Virtualization- Obfuscated Binaries

Know or Go Practical Quest for Reliable Software

root node level: internal node edge leaf node Data Structures & Algorithms McQuain

An Evaluation of Self-adjusting Binary Search Tree Techniques

Performance Improvement In Java Application

Class Overview. CSE 326: Data Structures. Goals. Goals. Data Structures. Goals. Introduction

Test Case Design Techniques

A TOOL FOR DATA STRUCTURE VISUALIZATION AND USER-DEFINED ALGORITHM ANIMATION

GRAPH THEORY LECTURE 4: TREES

Efficiently Monitoring Data-Flow Test Coverage

Shortest Inspection-Path. Queries in Simple Polygons

Module 2 Stacks and Queues: Abstract Data Types

CIS570 Modern Programming Language Implementation. Office hours: TDB 605 Levine

S. Muthusundari. Research Scholar, Dept of CSE, Sathyabama University Chennai, India Dr. R. M.

Efficiency of algorithms. Algorithms. Efficiency of algorithms. Binary search and linear search. Best, worst and average case.

Glossary of Object Oriented Terms

CS473 - Algorithms I

Stacks. Data Structures and Data Types. Collections

Transcription:

Why have SSA? Building SSA Form SSA-form Each name is defined exactly once, thus Each use refers to exactly one name x 17-4 What s hard? Straight-line code is trivial Splits in the CFG are trivial x a + b x y - z Joins in the CFG are hard x 13 Slides mostly based on Keith Cooper s set of slides (COMP 512 class at Rice University, Fall 2002). Building SSA Form Insert Φ-functions f at birth points? z x q Used with kind permission. Rename all values for uniqueness s w - x SSA 2 Birth Points (a notion due to Tarjan) Consider the flow of values in this example Birth Points (cont) Consider the flow of values in this example x 17-4 x a + b x y - z The value x appears everywhere It takes on several values. Here, x can be 13, y-z, or 17-4 Here, it can also be a+b x 17-4 x a + b x y - z New value for x here 17-4 or y -z x 13 z x q If each value has its own name Need a way to merge these distinct values Values are born at merge points x 13 z x q New value for x here 13 or (17-4 or y -z) s w - x s w - x New value for x here a+b or (13 or (17-4 or y-z)) SSA 3 SSA 4

Birth Points (cont) Static Single Assignment Form Consider the flow of values in this example x a + b x 17-4 x y - z x 13 s w - x z x q These are all birth points for values All birth points are join points Not all join points are birth points Birth points are value-specific SSA-form Each name is defined exactly once Each use refers to exactly one name What s hard Straight-line code is trivial Splits in the CFG are trivial Joins in the CFG are hard Building SSA Form Insert Φ-functions at birth points Rename all values for uniqueness A Φ-function is a special kind of a move instruction that selects one of its parameters. The choice of parameter is governed by the CFG edge along which control reached the current block. y 1... y 2... y 3 Φ(y 1,y 2 ) However, real machines do not implement a Φ-function in hardware. SSA 5 SSA 6 SSA Construction Algorithm (High-level sketch) 1. Insert Φ-functions 2. Rename values that s all... 1. Insert Φ-functions at every join for every name 2. Solve reaching definitions 3. Rename each use to the def that reaches it (will be unique) of course, there is some bookkeeping to be done... SSA 7 SSA 8

Reaching Definitions Domain is DEFINITIONS, same as number of operations The equations REACHES(n 0 ) = Ø REACHES(n ) = p preds(n) DEFOUT(p ) (REACHES(p ) SURVIVED(p)) 1. Insert Φ-functions at every join for every name 2. Solve reaching definitions 3. Rename each use to the def that reaches it (will be unique) REACHES(n) is the set of definitions that reach block n DEFOUT(n) is the set of definitions in n that reach the end of n SURVIVED(n) is the set of defs not obscured by a new def in n Computing REACHES(n) Use any data-flow method (i.e., the iterative method) This particular problem has a very-fast solution (Zadeck) What s wrong with this approach Builds maximal SSA Too many Φ-functions (precision) Too many Φ-functions (space) Too many Φ-functions (time) Need to relate edges to Φ-functions parameters (bookkeeping) FK F.K. Zadeck, Incremental data-flow analysis in a structured program editor, Proceedings of the SIGPLAN 84 Conf. on SSA Compiler Construction, June, 1984, pages 132-143. 9 To do better, we need a more complex approach SSA 10 1. Insert Φ-functions a)calculate a.) dominance frontiers b.) find global names for each name, build a list of blocks that define it c.) insert Φ-functions global name n Moderately complex Compute list of blocks where each name is assigned. Use this list as the worklist. block b in which n is defined block d in b s dominance frontier Creates the iterated { insert a Φ-function for n in d dominance frontier add d to n s list of defining blocks This adds to the worklist! Use a checklist to avoid putting blocks on the worklist twice; keep another checklist to avoid inserting the same Φ-function twice. SSA 11 2. Rename variables in a pre-order walk over dominator tree (use an array of stacks, one stack per global name) Staring with the root block, b 1 counter per name for subscripts a.) generate unique names for each Φ-function and push them on the appropriate stacks b.) rewrite each operation in the block i. Rewrite uses s of global l names with the current version (from the stack) ii. Rewrite definition by inventing & pushing new name c.) fill in Φ-function parameters of successor blocks d.) recurse on b s children in the dominator tree Reset the state e.) <on exit from block b > pop names generated in b from stacks Need the end-of-block name for this path SSA 12

Aside on Terminology: Dominators Definitions x dominates y if and only if every path from the entry of the control-flow graph to the node for y includes x By definition, x dominates x We associate a Dom set with each node Dom(x ) 1 Immediate dominators For any node x, x there must be a y in Dom(x), y x such that y is closest to x We call this y the immediate dominator of x As a matter of notation, we write this as IDom(x) By convention, IDom(x0) is not defined for the entry node x0 SSA 13 Dominators (cont) Dominators have many uses in program analysis & transformation Finding loops A m 0 a + b n Building SSA form 0 a + b Making code motion decisions Dominator sets Block Dom IDom A A B A,B A C A,C A D A,C,D C E A,C,E C F A,C,F C G A,G A Dominator tree A B C G Let s look at how to compute dominators D E B p 0 c + d r 0 c + d F G C q 0 a + b r 1 c + d D e 0 b + 18 E e a + 17 1 s 0 a + b t 0 c + d u 0 e + f u 1 e + f F r 2 φ(r 0,r 1 ) y 0 a + b z 0 c + d e 3 φ(e 0,e 1 ) u 2 φ(u 0,u 1 ) v 0 a + b w 0 c + d x 0 e + f SSA 14 SSA Construction Algorithm (Low-level detail) Computing Dominance First step in Φ-function insertion computes dominance. A node n dominates m iff n is on every path from n 0 to m. > Every node dominates itself > n s immediate dominator is its closest dominator, IDOM(n) Computing DOM DOM(n 0 ) = { n 0 } DOM(n) = { n } ( p preds(n) DOM(p)) These equations form a rapid data-flow framework. Iterative algorithm will solve them in d(g) + 3 passes > Each pass does N unions & E intersections, > E is O(N 2 ) O(N 2 ) work Initially, DOM(n) = N, n n 0 B 2 B 6 Flow Graph Progress of iterative solution for DOM Iteration 0 1 2 3 DOM(n ) 4 5 6 7 0 0 N N N N N N N 1 0 01 0,1 012 0,1,2 013 0,1,3 0134 0,1,3,4 0135 0,1,3,5 0136 0,1,3,6 017 0,1,7 2 0 0,1 0,1,2 0,1,3 0,1,3,4 0,1,3,5 0,1,3,6 0,1,7 Results of iterative solution for DOM 0 1 2 3 4 5 6 7 DOM 0 0,1 0,1,2 0,1,3 0,1,3,4 0,1,3,5 0,1,3,6 0,1,7 IDOM 0 0 1 1 3 3 3 1 SSA IDOM(n) n, unless n is n 0, by convention. 15 SSA 16

B 2 B 6 Progress of iterative solution for DOM Iteration 0 1 2 3 DOM(n ) 4 5 6 7 0 0 N N N N N N N 1 0 01 0,1 012 0,1,2 013 0,1,3 0134 0,1,3,4 0135 0,1,3,5 0136 0,1,3,6 017 0,1,7 2 0 0,1 0,1,2 0,1,3 0,1,3,4 0,1,3,5 0,1,3,6 0,1,7 Results of iterative solution for DOM 0 1 2 3 4 5 6 7 DOM 0 0,1 0,1,2 0,1,3 0,1,3,4 0,1,3,5 0,1,3,6 0,1,7 IDOM 0 0 1 1 3 3 3 1 x Φ(...) B 2 x B... 4 x Φ(...) B 6 Dominance Frontiers & Φ-Function Insertion A definition at n forces a Φ-function at m iff n DOM(m) but n DOM(p) for some p preds(m) DF(n) is fringe just beyond region n dominates 0 1 2 3 4 5 6 7 DOM 0 0,1 0,1,2 0,1,3 0,1,3,4 0,1,3,5 0,1,3,6 0,1,7 DF 7 7 6 6 7 1 DF(4) is {6}, so in 4 forces a Φ-function in 6 Dominance Tree There are asymptotically faster algorithms. With the right data structures, the iterative algorithm can be made faster. See Cooper, Harvey, and Kennedy. SSA 17 x Φ(...) in 6 forces a Φ-function in DF(6) = {7} Dominance Frontiers in 7 forces a Φ-function in DF(7) = {1} in 1 forces a Φ-function in DF(1) = Ø (halt) For each assignment, we insert the Φ-functions SSA 18 Computing Dominance Frontiers Only join points are in DF(n) for some n Leads to a simple, intuitive algorithm for computing dominance frontiers B 2 For each join point x (i.e., preds(x) > 1) For each CFG predecessor of x Run up to IDOM(x) in the dominator tree, adding x to DF(n) for each n between x and IDOM(x) Dominance Frontiers 0 1 2 3 4 5 6 7 B 6 DOM 0 0,1 0,1,2 0,1,3 0,1,3,4 0,1,3,5 0,1,3,6 0,1,7 DF 1 7 7 6 6 7 1 For some applications, we need post-dominance, the post-dominator tree, and reverse dominance frontiers, RDF(n) > Just dominance on the reverse CFG > Reverse the edges & add unique exit node We will use these in dead d code elimination using SSA SSA 19 SSA Construction Algorithm (Reminder) 1. Insert Φ-functions at some join points a.) calculate dominance frontiers b.) find global names Needs a little more detail for each name, build a list of blocks that define it c.) insert Φ-functions global name n block b in which n is defined block d in b s dominance frontier insert a Φ-function for n in d add d to n s list of defining blocks SSA 20

SSA Construction Algorithm Finding global names Different between two forms of SSA Maximal uses all names Otherwise, we do not need a Φ-function Semi-pruned SSA uses names that are live on entry to some block > Shrinks name space & number of Φ-functions > Pays for itself in compile-time speed For each global name, need a list of blocks where it is defined > Drives Φ-function insertion > b defines x implies a Φ-function for x in every c DF(b) Pruned SSA adds a test to see if x is live at insertion point SSA 21 Excluding local names avoids Φ s Φ s for y & z B 2 b i a Φ(a,a) i Φ(i,i) i i a Φ(a,a) d) Assume a, b, c, & d defined before B 6 b With all the Φ-functions Lots of new ops Renaming is next 2. Rename variables in a pre-order walk over dominator tree (use an array of stacks, one stack per global name) Staring with the root block, b 1 counter per name for subscripts a.) generate unique names for each Φ-function and push them on the appropriate stacks b.) rewrite each operation in the block i. Rewrite uses s of global l names with the current version (from the stack) ii. Rewrite definition by inventing & pushing new name c.) fill in Φ-function parameters of successor blocks d.) recurse on b s children in the dominator tree Reset the state e.) <on exit from block b > pop names generated in b from stacks Need the end-of-block name for this path Adding all the details... for each global name i counter[i] 0 stack[i] Ø call Rename(n 0 ) NewName(n) i counter[n] counter[n] counter[n] + 1 push n i onto stack[n] return n i Rename(b) for each Φ-function in b, x Φ( ) rename x as NewName(x) for each operation x y op z in b rewrite y as top(stack[y]) rewrite z as top(stack[z]) rewrite x as NewName(x) for each successor of b in the CFG rewrite appropriate Φ parameters for each successor s of b in dom. tree Rename(s) for each operation x y op z in b pop(stack[x]) SSA 23 SSA 24

B 2 b i a Φ(a,a) c Φ(c,c) d Φ(d,d) i Φ(i,i) i i Before processing Assume a, b, c, & d defined d before B 2 b a Φ(a 0,a) b Φ(b 0,b) c Φ(c 0,c) d Φ(d 0,d) i Φ(ii 0,i) End of a Φ(a,a) c Φ(c,c) d) 1 1 1 1 0 a 0 b 0 c 0 d 0 i has not been defined a Φ(a,a) d) 1 1 1 1 1 End of End of B 2 B 2 b B 2 b 2 a Φ(a,a) d) 3 2 3 2 2 a 2 c 2 a Φ(a 2,a) b Φ(b 2,b) c Φ(c 3,c) d Φ(d 2,d) 3 3 4 3 2 a 2 b 2 c 2 d 2 c 3

Before starting End of B 2 b 2 a Φ(a 2,a) b Φ(b 2,b) c Φ(c 3,c) d Φ(d 2,d) 3 3 4 3 2 a 2 c 2 a Φ(a 2,a) b Φ(b 2,b) c Φ(c 3,c) d Φ(d 2,d) 4 3 4 4 2 a 2 c 2 a 3 d 3 End of End of c 4 a Φ(a 2,a) b Φ(b 2,b) c Φ(c 3,c) d Φ(d 2,d) d Φ(d 4,d) c Φ(c 2,c) 4 3 4 5 2 a 2 c 2 a 3 d 3 d 4 a Φ(a 2,a) b Φ(b 2,b) c Φ(c 3,c) d Φ(d 2,d) d Φ(d 4,d 3 ) c Φ(c 2,c 4 ) 4 3 5 5 2 a 2 c 2 d 3 a 3 c 4

End of B 6 Before c 4 c 4 a Φ(a 2,a 3 ) b Φ(b 2,b 3 ) c Φ(c 3,c 5 ) d Φ(d 2,d 5 ) d 5 Φ(d 4,d 3 ) c 5 Φ(c 2,c 4 ) b 3 B 6 4 4 6 6 2 a 2 b 3 c 2 d 3 a 3 c 5 d 5 a Φ(a 2,a 3 ) b Φ(b 2,b 3 ) c Φ(c 3,c 5 ) d Φ(d 2,d 5 ) d 5 Φ(d 4,d 3 ) c 5 Φ(c 2,c 4 ) b 3 B 6 4 4 6 6 2 a 2 c 2 i 0 a 1 Φ(a 0,a 4 ) b 1 Φ(b 0,b 4 ) c 1 Φ(c 0,c 6 ) d 1 Φ(d 0,d 6 ) i 1 Φ(ii 0,i 2 ) End of i 0 a 1 Φ(a 0,a 4 ) b 1 Φ(b 0,b 4 ) c 1 Φ(c 0,c 6 ) d 1 Φ(d 0,d 6 ) i 1 Φ(ii 0,i 2 ) After renaming Semi-pruned SSA form We re done c 4 c 4 a 4 Φ(a 2,a 3 ) b 4 Φ(b 2,b 3 ) c 6 Φ(c 3,c 5 ) d 6 Φ(d 2,d 5 ) y a 4 +b 4 z c 6 +d 6 i 2 i 1 +1 i 2 > 100 d 5 Φ(d 4,d 3 ) c 5 Φ(c 2,c 4 ) b 3 B 6 5 5 7 7 3 a 2 b 4 c 2 d 6 i 2 a 4 c 6 B 6 a 4 Φ(a 2,a 3 ) b 4 Φ(b 2,b 3 ) c 6 Φ(c 3,c 5 ) d 6 Φ(d 2,d 5 ) y a 4 +b 4 z c 6 +d 6 i 2 i 1 +1 i 2 > 100 d 5 Φ(d 4,d 3 ) c 5 Φ(c 2,c 4 ) b 3 Semi-pruned only names live in 2 or more blocks are global names.

SSA Construction Algorithm (Pruned SSA) What s this pruned SSA stuff? Minimal SSA still contains extraneous Φ-functions Inserts some Φ-functions where they are dead Would like to avoid inserting ng them Two ideas Semi-pruned SSA: discard names used in only one block > Significant reduction in total number of Φ-functions > Needs only local liveness information (cheap to compute) Pruned SSA: only insert Φ-functions where their value is live > Inserts even fewer Φ-functions, but costs more to do > Requires global live variable analysis (more expensive) In practice, both are simple modifications to step 1. SSA Construction Algorithm We can improve the stack management Push at most one name per stack per block (save push & pop) Thread names together by block To pop names for block b, b use b s thread This is another good use for a scoped hash table Significant reductions in pops and pushes Makes a minor difference in SSA construction time Scoped table is a clean, clear way to handle the problem SSA 37 SSA 38 SSA Deconstruction At some point, we need executable code Machines do not implement Φ instructions Need to fix up the flow of values...... X 17 Φ(x 10,x 11 )... x 17 Basic idea Insert copies Φ-function pred s Simple algorithm Adds lots of copies X 17 x 10 X 17 x 11 > Most of them coalesce away... x 17 SSA 39