Compiling for Parallelism & Locality. Dependence Testing in General. Algorithms for Solving the Dependence Problem. Dependence Testing




Compiling for Parallelism & Locality
CS553 Lecture: Data Dependence Analysis

Assignments
  - Deadline for project 4 extended to Dec 1
Last time
  - Data dependences and loops
Today
  - Finish data dependence analysis for loops

Dependence Testing in General

General code
  do i1 = l1,h1
    ...
      do in = ln,hn
        ... A(f(i1,...,in)) = ... A(g(i1,...,in)) ...

There exists a dependence between iterations I=(i1,...,in) and J=(j1,...,jn) when
  - f(I) = g(J)
  - (l1,...,ln) <= I,J <= (h1,...,hn)

Algorithms for Solving the Dependence Problem

Heuristics
  - GCD test (Banerjee76, Towle76): determines whether an integer solution is possible, no bounds checking
  - Banerjee test (Banerjee 79): checks real bounds
  - I-Test (Kong et al. 90): integer solution in real bounds
  - Lambda test (Li et al. 90): all dimensions simultaneously
  - Delta test (Goff et al. 91): pattern matches for efficiency
  - Power test (Wolfe et al. 92): extended GCD and Fourier-Motzkin combination
Use some form of Fourier-Motzkin elimination for integers, exponential worst-case
  - Parametric Integer Programming (Feautrier91)
  - Omega test (Pugh92)

Dependence Testing

Consider the following code
  do i = 1,5
    A(3*i+2) = A(2*i+1)+1

Question
  - How do we determine whether one array reference depends on another across iterations of an iteration space?
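For a loop this small, the question above can be answered by exhaustive enumeration, which makes a useful reference point before turning to the tests. The following is only a sketch: the helper name `brute_force_dependences` and the use of Python are illustrative assumptions, not part of the lecture.

```python
from itertools import product


def brute_force_dependences(write_index, read_index, lo, hi):
    """Hypothetical helper: list every (i_write, i_read) pair in [lo, hi]
    whose subscripts name the same array element."""
    return [
        (iw, ir)
        for iw, ir in product(range(lo, hi + 1), repeat=2)
        if write_index(iw) == read_index(ir)
    ]


# do i = 1,5:  A(3*i+2) = A(2*i+1)+1
pairs = brute_force_dependences(lambda i: 3 * i + 2, lambda i: 2 * i + 1, 1, 5)
print(pairs)  # [(1, 2), (3, 5)]
```

Iteration 1 writes A(5), which iteration 2 reads, and iteration 3 writes A(11), which iteration 5 reads. Enumeration costs time proportional to the square of the iteration count and needs known constant bounds, which is why compilers rely on the symbolic tests listed above.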

Dependence Testing: Simple Case

Sample code
  do i = l,h
    A(a*i+c1) = ... A(a*i+c2)

Dependence?
  a*i1 + c1 = a*i2 + c2, or
  a*i1 - a*i2 = c2 - c1
  Solution exists if a divides c2 - c1

Example
  do i = l,h
    A(2*i+2) = A(2*i-2)+1

Dependence?
  2*i1 - 2*i2 = -2 - 2 = -4 (yes, 2 divides -4)

Kind of dependence?
  Anti? i2 + d = i1, so d = -2
  Flow? i1 + d = i2, so d = 2

GCD Test

Generalize test to linear functions of iterators
  do i = l,h
    do j = l,h
      A(a1*i + a2*j + a0) = ... A(b1*i + b2*j + b0) ...

Again
  a1*i1 - b1*i2 + a2*j1 - b2*j2 = b0 - a0
  Solution exists if gcd(a1,a2,b1,b2) divides b0 - a0

Example
  do i = l,h
    do j = l,h
      A(4*i + 2*j + 1) = ... A(6*i + 2*j + 4) ...

  gcd(4,-6,2,-2) = 2
  Does 2 divide 4-1?
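The divisibility check behind the GCD test can be sketched in a few lines. This is an illustrative sketch only: the function name `gcd_test` and its argument layout are assumptions made here. As on the slides, it ignores loop bounds, so a True result means a dependence is possible, not that one exists.

```python
from functools import reduce
from math import gcd


def gcd_test(coeffs, constant):
    """Hypothetical helper: True if sum(coeffs[k] * x[k]) == constant can have
    an integer solution, i.e. gcd(coeffs) divides constant. Bounds are ignored."""
    g = reduce(gcd, (abs(c) for c in coeffs), 0)
    if g == 0:  # every coefficient is zero
        return constant == 0
    return constant % g == 0


# Simple case: 2*i1 - 2*i2 = -4
simple_case = gcd_test([2, -2], -2 - 2)
# Two-loop example: 4*i1 - 6*i2 + 2*j1 - 2*j2 = 4 - 1
two_loop_case = gcd_test([4, -6, 2, -2], 4 - 1)
print(simple_case, two_loop_case)  # True False
```

In the two-loop example the gcd is 2 and 4 - 1 = 3 is odd, so no integer solution exists and the two references are independent.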

Banerjee Test

  for (i=L; i<=U; i++) {
    x[a_0 + a_1*i] = ...
    ... = x[b_0 + b_1*i]
  }

Does a_0 + a_1*i = b_0 + b_1*i' for some integer i and i'?
If so then (a_1*i - b_1*i') = (b_0 - a_0)
Determine upper and lower bounds on (a_1*i - b_1*i')

  for (i=1; i<=5; i++) {
    x[i+5] = x[i];
  }

  upper bound = a_1*max(i) - b_1*min(i') = 4
  lower bound = a_1*min(i) - b_1*max(i') = -4
  b_0 - a_0 = -5, which lies outside [-4, 4], so there is no dependence

Distance Vectors: Legality

Definition
  - A dependence vector, v, is lexicographically nonnegative when the leftmost nonzero entry in v is positive or all elements of v are zero
    Yes: (0,0,0), (0,1), (0,2,-2)
    No: (-1), (0,-2), (0,-1,1)
  - A dependence vector is legal when it is lexicographically nonnegative (assuming that indices increase as we iterate)

Why are lexicographically negative distance vectors illegal?
What are legal direction vectors?

Direction Vector

Definition
  - A direction vector serves the same purpose as a distance vector when less precision is required or available
  - Element i of a direction vector is <, >, or = based on whether the source of the dependence precedes, follows or is in the same iteration as the target in loop i

Example
  do i = 1,6
    do j = 1,5
      A(i,j) = A(i-1,j-1)+1

  Direction vector: (<,<)
  Distance vector: (1,1)

Loop-Carried Dependences

Definition
  - A dependence D=(d1,...,dn) is carried at loop level i if di is the first nonzero element of D

Example
  do i = 1,6
    do j = 1,6
      A(i,j) = B(i-1,j)+1
      B(i,j) = A(i,j-1)*2

Distance vectors
  - (0,1) for accesses to A
  - (1,0) for accesses to B

Loop-carried dependences
  - The j loop carries dependence due to A
  - The i loop carries dependence due to B
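The bounds check of the Banerjee test for a single loop can be sketched as follows. The function name `banerjee_bounds` is an assumption made for illustration. Unlike the slide formulas, which implicitly assume positive coefficients, this sketch takes the minimum and maximum of each term so that negative coefficients are also handled. It treats i and i' as real values in [lo, hi], so a True result only means that a dependence cannot be ruled out.

```python
def banerjee_bounds(a0, a1, b0, b1, lo, hi):
    """Hypothetical helper for x[a0 + a1*i] = ... x[b0 + b1*i'], lo <= i, i' <= hi.

    Returns (lower, upper, may_depend), where [lower, upper] is the range of
    a1*i - b1*i' and may_depend says whether b0 - a0 falls inside that range."""
    def term_range(c):
        # Range of c*i over [lo, hi]; works for negative c as well.
        return min(c * lo, c * hi), max(c * lo, c * hi)

    a_min, a_max = term_range(a1)
    b_min, b_max = term_range(b1)
    lower = a_min - b_max
    upper = a_max - b_min
    return lower, upper, lower <= (b0 - a0) <= upper


# for (i=1; i<=5; i++) x[i+5] = x[i];   so a0=5, a1=1, b0=0, b1=1
result = banerjee_bounds(5, 1, 0, 1, 1, 5)
print(result)  # (-4, 4, False)
```

Here b_0 - a_0 = -5 falls outside [-4, 4], so the write to x[i+5] and the read of x[i] never touch the same element within the loop bounds.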

Parallelization

Idea
  - Each iteration of a loop may be executed in parallel if it carries no dependences

Example
  do i = 1,6
    do j = 1,5
      A(i,j) = B(i-1,j-1)+1
      B(i,j) = A(i,j-1)*2

  Parallelize i loop?
  Parallelize j loop?

Distance vectors
  - (0,1) for A (flow)
  - (1,1) for B (flow)

[Figure: iteration space]

Parallelization (reprise)

Why can't this loop be parallelized?
  do i = 1,100
    A(i) = A(i-1)+1

  Distance vector: (1)

Why can this loop be parallelized?
  do i = 1,100
    A(i) = A(i)+1

  Distance vector: (0)

[Figure: iteration spaces for iterations 1 2 3 4 5 ...]

Loop Permutation (reprise)

Sample code
  do i = 1,6
    do j = 1,5
      A(i,j) = A(i,j)+1

  do j = 1,5
    do i = 1,6
      A(i,j) = A(i,j)+1

Why is this legal?
  - No loop-carried dependences, so we can arbitrarily change order of iteration execution
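The definitions of legality, carried level, and parallelizability can be combined into a short sketch. The function names below are assumptions made for illustration, and loop levels are numbered from 1 (outermost), matching the d1,...,dn notation used earlier.

```python
def is_legal(vector):
    """Lexicographically nonnegative: first nonzero entry is positive, or all zero."""
    for d in vector:
        if d != 0:
            return d > 0
    return True


def carried_level(vector):
    """1-based level of the first nonzero entry, or None if loop-independent."""
    for level, d in enumerate(vector, start=1):
        if d != 0:
            return level
    return None


def parallel_loops(depth, distance_vectors):
    """Levels (1 = outermost) that carry none of the given dependences."""
    carried = {carried_level(v) for v in distance_vectors}
    return [level for level in range(1, depth + 1) if level not in carried]


print(parallel_loops(1, [(1,)]))            # []     A(i) = A(i-1)+1
print(parallel_loops(1, [(0,)]))            # [1]    A(i) = A(i)+1
print(parallel_loops(2, [(0, 1), (1, 0)]))  # []     both loops carry a dependence
print(parallel_loops(2, [(1, 1)]))          # [2]    only the outer loop carries it
```

A loop reported as parallel here is parallel only with the enclosing loops still run sequentially; for the vector (1,1), the inner loop's iterations are independent within each fixed outer iteration.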

Concepts

Improve performance by...
  - improving data locality
  - parallelizing the computation

Data Dependences
  - iteration space
  - distance vectors and direction vectors
  - loop carried

Transformation legality
  - must respect data dependences
  - scalar expansion as a technique to remove anti and output dependences

Data Dependence Testing
  - general formulation of the problem
  - GCD test

Next Time

Lecture
  - Loop transformations for parallelism and locality