Intelligent Heuristic Construction with Active Learning

1 Intelligent Heuristic Construction with Active Learning
William F. Ogilvie, Pavlos Petoumenos, Zheng Wang, Hugh Leather
The University of Edinburgh

2 Space is BIG!
[Figure: Hubble Ultra-Deep Field]
- Tiny region of space shown
- Despite this, many galaxies
- Each galaxy, billions of stars
- Relevance to heuristics?

3 Optimisation spaces are MUCH BIGGER!!!
[Figure labels: Atoms in the Universe; Combinations of GCC Optimisations]
- We can't pick from
- Rough heuristics instead
- Traditionally hard-coded
- Can take a year to perfect
- As if that wasn't bad enough

4 the problem is even worse than that!
- Each architectural change requires heuristics to be re-tuned
- Heuristics are inherently tied to the underlying hardware
- Most compilers support many different platforms
- Very difficult to keep up and getting harder
- We already have out-of-date compilers

5 Machine Learning to the rescue?
- Leverage machine learning techniques to create heuristics
- Well suited to the problem
- Lots of interesting research
- Can be better than humans
- But, it's also incredibly slow to learn
- We demonstrate how it's possible to accelerate training
- Create a heuristic which maps workload to processor

6 Quick Detour: Machine Learning 101
- Classification involves forming a correlation between the features of an object and its label
[Diagram: examples -> Machine Learning Algorithm -> Model; feature values -> Model -> best heuristic value]

7 Training a Heuristic
[Figure: thousands of examples plotted by input value 1 and input value 2]

8 Training a Heuristic
[Figure: thousands of examples (input value 1, input value 2) -> Machine Learning Algorithm]

9 Training a Heuristic
[Figure: thousands of examples (input value 1, input value 2) -> Machine Learning Algorithm -> mathematical model with GPU and CPU regions]

10 Using a Heuristic
[Figure: unseen features -> Mathematical Model (GPU and CPU regions over input value 1 and input value 2) -> predicted processor]
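
The train-then-predict flow on slides 7 to 10 can be sketched in a few lines. This is a minimal illustration, not the authors' code: it uses a 1-nearest-neighbour rule as the "mathematical model", and the data points, `train` and `predict` are hypothetical names and values chosen for the example.

```python
import math

def train(examples):
    """'Training' a 1-nearest-neighbour model simply stores the labelled examples."""
    return list(examples)

def predict(model, features):
    """Return the label of the stored example closest to `features`."""
    nearest = min(model, key=lambda example: math.dist(example[0], features))
    return nearest[1]

# Hypothetical training data: (input value 1, input value 2) -> faster device.
examples = [
    ((1.0, 1.0), "CPU"),
    ((2.0, 1.5), "CPU"),
    ((8.0, 9.0), "GPU"),
    ((9.0, 7.5), "GPU"),
]

model = train(examples)
small_input = predict(model, (1.5, 1.2))  # unseen features near the CPU examples
large_input = predict(model, (8.5, 8.0))  # unseen features near the GPU examples
print(small_input, large_input)
```

Any classifier could stand in for the nearest-neighbour rule here; what matters is the split between a costly training phase (gathering labelled examples) and a cheap prediction phase.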

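11 So what's wrong with this?
- Traditional approach almost universally adopted
[Figure: training examples plotted by feature 1 and feature 2]

12 Well, we actually only needed these!
[Figure: training examples plotted by feature 1 and feature 2]

13 So this was a complete waste of time!
- Random sampling inevitably leads to redundancy
[Figure: training examples plotted by feature 1 and feature 2]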

14 How much time was wasted?
- Correctness of labels is tied to heuristic quality, i.e. consistently wrong labels lead to a wrong model
- Sound data is essential, but very expensive
- E.g. are inputs X, Y, Z faster on CPU or GPU?
  1. Run program on CPU using X, Y, Z
  2. Run program on GPU using X, Y, Z
  3. GOTO 1 until a statistically significant difference is observed
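
The three-step labelling loop above can be sketched as follows. This is an illustration under stated assumptions: the slides do not say which statistical test was used, so the sketch uses Welch's t statistic with a fixed cutoff, and `run_cpu` and `run_gpu` are hypothetical callables that return one measured runtime each (simulated here with random noise rather than real program runs).

```python
import math
import random
import statistics

def welch_t(a, b):
    """Welch's t statistic for two samples of runtimes (assumed test, not from the slides)."""
    spread = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    diff = statistics.mean(a) - statistics.mean(b)
    if spread == 0:
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / spread

def label_example(run_cpu, run_gpu, threshold=4.0, min_runs=5, max_runs=100):
    """Alternate CPU and GPU runs until the runtimes clearly differ, then label."""
    cpu_times, gpu_times = [], []
    for _ in range(max_runs):
        cpu_times.append(run_cpu())  # step 1: run on CPU
        gpu_times.append(run_gpu())  # step 2: run on GPU
        # step 3: repeat until the difference looks statistically sound
        if len(cpu_times) >= min_runs and abs(welch_t(cpu_times, gpu_times)) > threshold:
            break
    faster = "CPU" if statistics.mean(cpu_times) < statistics.mean(gpu_times) else "GPU"
    return faster, len(cpu_times)

# Simulated measurements standing in for real program runs (hypothetical values).
rng = random.Random(0)
run_cpu = lambda: rng.gauss(1.0, 0.05)
run_gpu = lambda: rng.gauss(0.6, 0.05)

label, runs = label_example(run_cpu, run_gpu)
print(label, runs)
```

Even in this toy setting a single label costs at least `2 * min_runs` program executions, which is why every redundant training example is expensive.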

15 Compile-time Heuristics are Even Slower
- Labelling one single example requires iterative compilation
- compile code using different optimisation values
- repeated profiling to make statistically sound determination
- only then, associate best optimisation with code features
[Diagram: one .c file compiled to several .exe files; best optimisation wins]
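
The labelling procedure on this slide can be sketched as a search over optimisation values. This is a simplified illustration: `profile` is a hypothetical callable that would compile the code with a given optimisation value and return one measured runtime, and here it is replaced by a simulated cost function. The fixed number of repeats is also an assumption; the slide only asks for a statistically sound determination.

```python
import random
import statistics

def best_optimisation(candidates, profile, repeats=10):
    """Profile every candidate `repeats` times and return the lowest mean runtime."""
    mean_runtime = {}
    for value in candidates:
        runs = [profile(value) for _ in range(repeats)]
        mean_runtime[value] = statistics.mean(runs)
    return min(mean_runtime, key=mean_runtime.get)

def label_program(features, candidates, profile):
    """Associate the winning optimisation value with the program's code features."""
    return features, best_optimisation(candidates, profile)

# Simulated profiler (hypothetical): runtime is lowest at value 4, plus small noise.
rng = random.Random(1)
def simulated_profile(value):
    return (value - 4) ** 2 + 1.0 + rng.uniform(-0.01, 0.01)

features = (12, 3)  # hypothetical code features
example = label_program(features, [1, 2, 4, 8], simulated_profile)
print(example)
```

One label here costs `len(candidates) * repeats` compile-and-run cycles, which is the expense the slide is pointing at.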

16 What do we do about it?
- We cannot know where the informative examples lie
- But, we can let the algorithm make an educated guess
- You and I do not learn in a random, unstructured way
- We build up our knowledge gradually and iteratively
- Perhaps, let the algorithm do the same?

17 Active Supervised Learning
[Diagram: passive (random): thousands of random examples -> Machine Learning Algorithm -> final model]

18 Active Supervised Learning
[Diagram: passive (random): thousands of random examples -> Machine Learning Algorithm -> final model; active (iterative): few random examples -> ML Algorithm -> intermediate model]

19 Active Supervised Learning
[Diagram: passive (random): thousands of random examples -> Machine Learning Algorithm -> final model; active (iterative): few random examples -> ML Algorithm -> intermediate model -> completion reached? no: carefully select an example and return to the ML Algorithm; yes: final model]

20 How do we know when it's complete?
- Many criteria, including
  - time elapsed
  - loop iterations
  - cross-validation
[Diagram: few random examples -> ML Algorithm -> intermediate model -> completion reached? no: carefully select an example; yes: final model]
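
The active learning loop and two of the completion criteria listed above (loop iterations and time elapsed) can be sketched as below. This is a hedged outline, not the authors' implementation: `train`, `select` and `label` are hypothetical callables supplied by the caller, cross-validation is left out for brevity, and the one-dimensional demo problem is invented for illustration.

```python
import time

def active_learning(seed_examples, train, select, label,
                    max_iterations=200, time_budget_s=None):
    """Grow the training set one carefully selected example at a time."""
    labelled = list(seed_examples)
    model = train(labelled)                      # intermediate model
    start = time.monotonic()
    for _ in range(max_iterations):              # completion: loop iterations
        if time_budget_s is not None and time.monotonic() - start > time_budget_s:
            break                                # completion: time elapsed
        features = select(model, labelled)       # carefully select an example
        labelled.append((features, label(features)))  # the expensive step
        model = train(labelled)                  # rebuild intermediate model
    return model, labelled

# Toy demo (hypothetical): one feature, true boundary at 6.3.
def train(examples):
    return sorted(examples)

def select(model, labelled):
    # Pick the midpoint of the first neighbouring pair whose labels differ.
    for (x0, l0), (x1, l1) in zip(model, model[1:]):
        if l0 != l1:
            return (x0 + x1) / 2

def label(x):
    return "GPU" if x >= 6.3 else "CPU"

seed = [(0.0, "CPU"), (10.0, "GPU")]
model, labelled = active_learning(seed, train, select, label, max_iterations=10)
print(len(labelled), model[:3])
```

In the toy problem ten selected examples narrow the boundary to under 0.01 of the feature range, whereas random sampling would need far more labels for the same precision.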

21 What about selecting examples?
- Many algorithms available
- Used Query by Committee
- Easier to show than to tell
[Diagram: few random examples -> ML Algorithm -> intermediate model -> completion reached? no: carefully select an example; yes: final model]

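22 We start with a few random examples
[Figure: examples plotted by feature 1 and feature 2]

23 We form multiple intermediate models
[Figure: examples plotted by feature 1 and feature 2]

24 Each with a distinct algorithm
[Figure: examples plotted by feature 1 and feature 2]

25 A committee of different models
[Figure: examples plotted by feature 1 and feature 2]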

26 Here the committee disagrees, but we use this to our advantage
- Disagreement regions hold the greatest potential to improve the collective knowledge: learn from there!
[Figure: examples plotted by feature 1 and feature 2]

27 So what example do we learn from next?
- We ask each model to predict the label of random unseen examples drawn from the feature space
[Figure: examples plotted by feature 1 and feature 2]

28 Broadly the Committee will agree
[Figure: examples plotted by feature 1 and feature 2]

29 but we're interested in disagreement!
- Disagreement inevitably occurs around class boundaries
[Figure: examples plotted by feature 1 and feature 2]

30 We select one of these examples to label properly
[Figure: examples plotted by feature 1 and feature 2]

31 Then rebuild the intermediate models
- Notice the region of disagreement has shrunk
- Eventually the distinct models will converge
[Figure: examples plotted by feature 1 and feature 2]
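
The Query by Committee selection step walked through on slides 26 to 30 can be sketched like this. The slides do not state how disagreement is scored, so vote entropy is used here as an assumed measure; the three threshold-based committee members and the candidate pool are hypothetical stand-ins for models built with distinct algorithms.

```python
import math
import random
from collections import Counter

def vote_entropy(votes):
    """Entropy of the committee's votes: 0 when unanimous, higher when split."""
    total = len(votes)
    return -sum((c / total) * math.log(c / total) for c in Counter(votes).values())

def select_by_committee(committee, candidates):
    """Return the unseen candidate on which the committee disagrees most."""
    return max(candidates,
               key=lambda x: vote_entropy([member(x) for member in committee]))

# Hypothetical committee: each member places the CPU/GPU boundary differently.
committee = [
    lambda x: "GPU" if x[0] > 4 else "CPU",
    lambda x: "GPU" if x[0] > 5 else "CPU",
    lambda x: "GPU" if x[0] > 6 else "CPU",
]

# Random unseen examples drawn from the feature space (feature 1, feature 2).
rng = random.Random(0)
candidates = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(1000)]

chosen = select_by_committee(committee, candidates)
print(chosen)
```

The chosen point always falls between the members' boundaries, which is the region of disagreement; labelling it properly and retraining is what shrinks that region on the next iteration.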

32 Experimental Setup
- Demonstrate technique by creating an important heuristic
- Map workload to fastest device: CPU or GPU
- Much studied problem; choosing poorly can drastically degrade performance
- Specifically, given inputs for Rodinia HotSpot, PathFinder, SRAD and Matrix Multiplication, is it faster to use OpenMP (CPU) or OpenCL (GPU)?
- Compared number of training examples required to get a high-accuracy heuristic using passive versus active learning

33 A few gory details (most in the paper)
- Measured accuracy of randomly-trained vs. QBC-trained classifier using 500 test examples
- Intel Core i7 3.4GHz (8 HW Threads)
- NVIDIA GeForce GTX Titan (6GB)
- 12 distinct committee members
- 1 random example to begin
- 10,000 candidate examples
- 200 loop iterations

34 Random Training Examples
[Figure: 120 sample points labelled CPU or GPU; both axes: Program Input Parameter]

35 QBC Chosen Training Examples
- Same accuracy but quicker
[Figure: sample points labelled CPU or GPU; both axes: Program Input Parameter]

36 Lights, Camera, Action...
[Animation panels: Region of Disagreement over time; Shape of Model over time]
- Shows ib1 algorithm refining a HotSpot model over time, using training examples chosen by a committee

37 It works 3x faster on average!

38 Summary
- Desperately need fast, reliable method to generate heuristics
- Current implementations rely on learning randomly
- Randomness is problematic because of labelling costs
- We show active learning is much more efficient
- 3x faster at creating heuristics to map program inputs to best processor in a heterogeneous system
