RIPL. An Image Processing DSL. Rob Stewart & Deepayan Bhowmik. 1st May, 2014. Heriot Watt University


RIPL An Image Processing DSL Rob Stewart & Deepayan Bhowmik Heriot Watt University 1st May, 2014

Rathlin = Image processing language + FPGA

Motivation

Application scenario

FPGAs are a good fit for remote image processing
  reconfigurable
  energy efficient
FPGA constraints
  Memory
  Task scheduling
Language design tradeoffs
Solution: small DSL closely coupled to the FPGA instruction set

Requirements

Design

High-level imperative language
Language choices
  image algebra
  Existing languages/libraries
  FPGA instruction set abstraction
Platform-independent reference interpreter
  GPUs & CPUs

RIPL Language Features

Functions and procedures
    let rgb img a = foo(..) {.. } ;
    action(..) {.. } ;

Assignment
    let rgb image a = .. ;
    var grey image b;
    b := .. ;

Iteration & conditional branching
    for i in 0.. n {.. } ;
    if (.. ) {.. } else {.. } ;

Image algebra implementation
    b := (a (+) s)^2 ;
    c := max( sum(b), d) ;

Overloading
    let rgb image a = .. ;
    let rgb img b = .. ;
    let rgb image c = a - b ;
    let int x = 3 ;
    let int y = 4 ;
    let int z = x - y ;

Constructing Images in RIPL

image : pointset → valueset

    /* RGB img */
    let rgb image a = [1:3,1:2] {(1,56,35),(94,22,42),(155,134,99),
                                 (56,7,21),(245,1,32),(42,211,111)};

    /* Same RGB img in image algebra notation, b ∈ F^Y */
    let ptset Y = [1:3,1:2] ;
    let valset F = {(1,56,35),(94,22,42),(155,134,99),
                    (56,7,21),(245,1,32),(42,211,111)};
    let rgb image b = F^Y ;

    /* mutable variables */
    var grey image c;
    c := [1:2,1:3] {221,244,230,165,102,124};
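The idea that an image pairs a point set with a value set can be sketched outside RIPL. The following is a minimal Python illustration, not RIPL's implementation: the helper names `rect_pointset` and `make_image` are hypothetical, and the order in which values are assigned to points (first coordinate varying fastest) is an assumption, since the slides do not state it.

```python
from itertools import product

def rect_pointset(x1_range, x2_range):
    """Points (x1, x2) of a rectangular point set; x1 varies fastest (assumed order)."""
    (lo1, hi1), (lo2, hi2) = x1_range, x2_range
    return [(x1, x2)
            for x2, x1 in product(range(lo2, hi2 + 1), range(lo1, hi1 + 1))]

def make_image(pointset, valset):
    """Pair each point with one value: the image is a dict from points to values."""
    if len(pointset) != len(valset):
        raise ValueError("point set and value set must have the same size")
    return dict(zip(pointset, valset))

# Mirrors `let ptset Y = [1:3,1:2]` and `let valset F = {...}` from the slide.
Y = rect_pointset((1, 3), (1, 2))
F = [(1, 56, 35), (94, 22, 42), (155, 134, 99),
     (56, 7, 21), (245, 1, 32), (42, 211, 111)]
b = make_image(Y, F)
```

Representing the image as a dictionary keeps the "function from points to values" reading explicit; a real implementation would more likely use a dense array.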

Overloaded Operations

    /* add two integers */
    let int i = 3;
    let int j = 4;
    print(i+j);

    /* add two value sets */
    let valset v1 = {3,4,5};
    let valset v2 = {1,2,6};
    print(v1+v2);

    /* add two point sets */
    let ptset pt1 = {(1,2),(4,3)};
    let ptset pt2 = {(3,1),(5,2)};
    print(pt1+pt2);

    /* add a dog and a cat */
    var rgb image dog, cat, friends;
    cat = readfile("cat.bmp");
    dog = readfile("dog.bmp");
    friends = cat + dog;
    writefile(friends,"out.bmp");
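Operator overloading of this kind can be illustrated in Python, where `+` already works on integers and can be defined for a user type. This is only a sketch under assumptions: the `Image` class is hypothetical, pixels are single grey values, addition is taken to be pointwise, and no clamping to a pixel range is applied because the slides do not say how RIPL handles overflow.

```python
class Image:
    """Hypothetical image type: a dict from (x1, x2) points to pixel values."""

    def __init__(self, pixels):
        self.pixels = dict(pixels)

    def _pointwise(self, other, op):
        # Both operands must be defined over the same point set.
        if self.pixels.keys() != other.pixels.keys():
            raise ValueError("images must share the same point set")
        return Image({p: op(v, other.pixels[p]) for p, v in self.pixels.items()})

    def __add__(self, other):
        return self._pointwise(other, lambda u, v: u + v)

    def __sub__(self, other):
        return self._pointwise(other, lambda u, v: u - v)

# The same `+` symbol now serves integers and images.
i_plus_j = 3 + 4
cat = Image({(1, 1): 10, (2, 1): 20})
dog = Image({(1, 1): 1, (2, 1): 2})
friends = cat + dog
```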

Thresholding
Segment into regions of interest
Semi thresholding: pixels within threshold are retained

    var grey image a;
    var grey image b;
    a := readfile("pumpkin.jpg") ;
    b := X[100..255](a) ;
    writefile(b,"segmented.jpg") ;

Edge detection
Contour with abrupt brightness change
Important for segmentation & scene analysis
Convolve two kernels over original image to calculate approximations of the X & Y derivatives

$G = \sqrt{s^2 + t^2}$

    let grey image a = readfile("pumpkin.bmp") ;
    let int template s = [3,3] {-1, 0, 1,
                                -2, 0, 2,
                                -1, 0, 1 } ;
    let int template t = [3,3] {-1,-2,-1,
                                 0, 0, 0,
                                 1, 2, 1 } ;
    let grey image newimg = (((a (+) s)^2) + (((a (+) t)^2)))^(1/2) ;
    writefile(newimg,"out.bmp");

RIPL

    let rgb image a = readfile("images/bike.bmp");
    /* Sobel template definitions */
    let grey image newimg = (((a(+)s)^2) + (((a(+)t)^2)))^(1/2);
    writefile(newimg,"pumpkin-edges.bmp");

OpenCV

    Mat src, dst, grad_x, grad_y, abs_grad_x, abs_grad_y;
    int ddepth = CV_16S;  // assumed output depth; not defined on the slide
    src = imread("pumpkin.bmp");
    Sobel(src, grad_x, ddepth, 1, 0, 3, 1, 0, BORDER_DEFAULT);
    convertScaleAbs(grad_x, abs_grad_x);
    Sobel(src, grad_y, ddepth, 0, 1, 3, 1, 0, BORDER_DEFAULT);
    convertScaleAbs(grad_y, abs_grad_y);
    // blend the two absolute gradients into the result
    addWeighted(abs_grad_x, 0.5, abs_grad_y, 0.5, 0, dst);
    imwrite("pumpkin-edges.bmp", dst);

Tackling the pyramid with image algebra
  Image enhancement
  Edge detection
  Thresholding
  Connected components
  Morphological transformations
  Shape detection
  Image features

Image Algebra
Operations to express all image-to-image transformations
Small number of concise & simple operations
  25 point operations
  15 point set operations
  9 value set operations
  30 image operations
  37 template operations
  4 neighbourhood operations
Amenable to optimisation techniques that are machine
  independent: formal mathematical systems
  dependent: FPGAs, CPUs & GPUs

Image Algebra
Value sets: numeric data for points, of types $\mathbb{Z}$, $\mathbb{R}$, or $\mathbb{Z}_{2^k}$
Point sets: spatial relationship between points
Image pixels: tuple of point & value function, $(x, a(x))$
Image: function from points to values
  An $F$-valued image on $X$ is $a : X \to F$, or $a \in F^X$.
Rectangular point set $X = \mathbb{Z}_m^+ \times \mathbb{Z}_n^+$, where

$$\mathbb{Z}_m^+ \times \mathbb{Z}_n^+ = \{(x_1, x_2) \in \mathbb{Z}^2 : 1 \le x_1 \le m,\ 1 \le x_2 \le n\}$$
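As a small worked instance of these definitions, take $m = 3$ and $n = 2$. The image $a$ below is hypothetical and chosen only to make the pixel tuple concrete; it does not appear in the slides.

```latex
X = \mathbb{Z}_3^+ \times \mathbb{Z}_2^+
  = \{(1,1),\,(2,1),\,(3,1),\,(1,2),\,(2,2),\,(3,2)\},
  \qquad |X| = 3 \cdot 2 = 6 .

% Hypothetical integer-valued image a \in \mathbb{Z}^X, for illustration only:
a(x_1, x_2) = 10\,x_1 + x_2 ,
  \qquad \text{so the pixel at } x = (3,2) \text{ is } (x, a(x)) = \bigl((3,2),\, 32\bigr).
```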

Thresholding
For source image $a \in \mathbb{R}^X$ and threshold range $[h, k]$, the semithreshold image $b \in \mathbb{R}^X$ is given by:

$$b(x) = \begin{cases} a(x) & \text{if } h \le a(x) \le k \\ 0 & \text{otherwise} \end{cases}$$

Semithresholded image $b \in \mathbb{R}^X$ over $[100, 255]$ is $b := a \cdot \chi_{[100,255]}(a)$

    var grey image a;
    var grey image b;
    a := readfile("pumpkin.jpg") ;
    b := X[100..255](a) ;
    writefile(b,"segmented.jpg") ;
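The semithreshold definition translates directly into a few lines of Python. This is a sketch of the formula, not of RIPL's interpreter: the function names `chi` and `semithreshold` are hypothetical, and the image is represented as a dict from points to grey values.

```python
def chi(h, k, value):
    """Characteristic function of the range [h, k]: 1 inside, 0 outside."""
    return 1 if h <= value <= k else 0

def semithreshold(image, h, k):
    """b(x) = a(x) * chi_[h,k](a(x)): keep in-range pixels, zero the rest."""
    return {x: v * chi(h, k, v) for x, v in image.items()}

# A tiny 2x2 grey image with made-up values.
a = {(1, 1): 221, (2, 1): 99, (1, 2): 100, (2, 2): 30}
b = semithreshold(a, 100, 255)
```

Pixels equal to either bound are retained, because the range in the definition is closed.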

Edge detection
Edge enhanced image $b \in \mathbb{R}^Y$ is

$$b := \left[ (a \oplus s)^2 + (a \oplus t)^2 \right]^{1/2}$$

$$s_{(i,j)}(x) = \begin{cases} -2 & x = (i-1, j) \\ -1 & x = (i-1, j-1),\ (i-1, j+1) \\ 1 & x = (i+1, j-1),\ (i+1, j+1) \\ 2 & x = (i+1, j) \\ 0 & \text{otherwise} \end{cases}$$

$$t_{(i,j)}(x) = \begin{cases} -2 & x = (i, j+1) \\ -1 & x = (i-1, j+1),\ (i+1, j+1) \\ 1 & x = (i-1, j-1),\ (i+1, j-1) \\ 2 & x = (i, j-1) \\ 0 & \text{otherwise} \end{cases}$$

    let grey image a = readfile("pumpkin.bmp") ;
    let int template s = [3,3] {-1, 0, 1,
                                -2, 0, 2,
                                -1, 0, 1 } ;
    let int template t = [3,3] {-1,-2,-1,
                                 0, 0, 0,
                                 1, 2, 1 } ;
    let grey image newimg = (((a (+) s)^2) + (((a (+) t)^2)))^(1/2) ;
    writefile(newimg,"pumpkin-edges.bmp");

Rendered as image algebra: $\left[ (a \oplus s)^2 + (a \oplus t)^2 \right]^{1/2}$
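The edge-detection expression can be traced by hand with a small pure-Python sketch. It is not RIPL's implementation: `apply_template` and `sobel` are hypothetical names, the image is a list of rows, each template weight is multiplied by the neighbouring pixel at the same offset, and pixels outside the image are assumed to count as 0 (the slides do not specify border handling).

```python
import math

# The two 3x3 templates from the RIPL program, written row by row.
S = [[-1, 0, 1],
     [-2, 0, 2],
     [-1, 0, 1]]
T = [[-1, -2, -1],
     [0, 0, 0],
     [1, 2, 1]]

def apply_template(img, tmpl):
    """Weighted sum of each pixel's 3x3 neighbourhood; outside pixels count as 0."""
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += tmpl[dr + 1][dc + 1] * img[rr][cc]
            out[r][c] = acc
    return out

def sobel(img):
    """Pointwise sqrt(gs^2 + gt^2) of the two template responses."""
    gs, gt = apply_template(img, S), apply_template(img, T)
    return [[math.sqrt(x * x + y * y) for x, y in zip(row_s, row_t)]
            for row_s, row_t in zip(gs, gt)]

# A vertical step edge between the second and third columns.
img = [[0, 0, 10, 10],
       [0, 0, 10, 10],
       [0, 0, 10, 10]]
edges = sobel(img)
```

On this input the response is largest next to the step and zero in the flat region on the left, which is the behaviour the slide describes as a contour with abrupt brightness change.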

Tool Support

Syntax highlighting & code completion

Rendering RIPL programs as image algebra
Video demonstration

Implementation

RIPL syntax described in labelled BNF notation

    Prog.   Program ::= [Decl] Body ;
    CmdIf.  Command ::= SelectionStm ;
    EENorm. Exp ::= "[[" Exp "]]2" ;
    BNot.   Exp ::= " " Exp ;
    ESumIA. Exp ::= "\\sum" Exp ;
    ...

Compiled to lexer, parser & AST
RIPL Interpreter traverses user program using AST
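How an interpreter traverses an AST can be shown with a toy evaluator. The sketch below is in Python and is not the RIPL interpreter: the node classes `Lit`, `ESum` and `EMax` are hypothetical stand-ins for constructors that a grammar like the one above would generate, and only `sum` and `max` are covered.

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Lit:
    """A literal: either a number or a list of values such as {3,4,5}."""
    value: Any

@dataclass
class ESum:
    """sum(e): add up the values of a value set."""
    arg: Any

@dataclass
class EMax:
    """max(e1, e2): the larger of two scalar expressions."""
    left: Any
    right: Any

def evaluate(node):
    """Walk the tree bottom-up, dispatching on the node's constructor."""
    if isinstance(node, Lit):
        return node.value
    if isinstance(node, ESum):
        return sum(evaluate(node.arg))
    if isinstance(node, EMax):
        return max(evaluate(node.left), evaluate(node.right))
    raise TypeError(f"unknown AST node: {node!r}")

# AST for an expression shaped like `max( sum(b), d )` with b = {3,4,5}, d = 10.
tree = EMax(ESum(Lit([3, 4, 5])), Lit(10))
result = evaluate(tree)
```

Each grammar rule label corresponds to one constructor, so adding an operation means adding one node class and one branch in `evaluate`.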

BNF Converter ELisp backend https://github.com/robstewart57/bnfc/tree/elisp-backend

Symbolic RIPL Markup

    Operation        RIPL         RIPL-IA       IA symbol
    Negation         -x           -x            −x
    Ceiling          ceil(x)      \ceil*{x}     ⌈x⌉
    Floor            floor(x)     \floor*{x}    ⌊x⌋
    Rounding         [x]          [x]           [x]
    Projection       p(i,x)       p_i(x)        p_i(x)
    Sum              sum(x)       \sum x        Σ x
    Product          product(x)   \Pi x         Π x
    Maximum          max(x)       \vee x        ∨ x
    Minimum          min(x)       \wedge x      ∧ x
    Euclidean norm   [[x]]2       ‖x‖_2         ‖x‖₂
    Characteristic   X x (z)      \chi_x(z)     χ_X(z)
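The mapping from RIPL operations to RIPL-IA markup can be sketched as a lookup table. This Python fragment is an illustration only: the function `to_ripl_ia` and the table name `IA_MARKUP` are hypothetical, and it covers just the prefix-style operations whose markup is legible in the table above.

```python
# RIPL operation name -> RIPL-IA markup template, taken from the table above.
IA_MARKUP = {
    "ceil": r"\ceil*{%s}",
    "floor": r"\floor*{%s}",
    "sum": r"\sum %s",
    "product": r"\Pi %s",
    "max": r"\vee %s",
    "min": r"\wedge %s",
}

def to_ripl_ia(op, arg):
    """Render a one-argument RIPL operation as RIPL-IA markup text."""
    if op not in IA_MARKUP:
        raise ValueError(f"no RIPL-IA markup known for {op!r}")
    return IA_MARKUP[op] % (arg,)

markup = to_ripl_ia("sum", "b")
```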

Symbolic RIPL Markup

RIPL Interpreter

Evaluation

Parallel image processing
Current status:
  Profiling RIPL on Heriot-Watt Beowulf cluster
    CPU: 2 GHz Intel Xeon, 12 GB memory
    GPU: 1.6 GHz GeForce GT 610, 1 GB memory
  Feeding repa & accelerate benchmarks to community
Goal: to match OpenCV performance
(Optimising data parallel code is hard)

Future Work
  Full coverage of image algebra operations
  Implement algorithm libraries in RIPL
  RIPL dataflow compiler
  Hardware support for low-level image algebra operations


Thanks