OpenCL Static C++ Kernel Language Extension



Similar documents
The C Programming Language course syllabus associate level

Course materials. In addition to these slides, C++ API header files, a set of exercises, and solutions, the following are useful:

C++ INTERVIEW QUESTIONS

CSCI 253. Object Oriented Programming (OOP) Overview. George Blankenship 1. Object Oriented Design: Java Review OOP George Blankenship.

An Incomplete C++ Primer. University of Wyoming MA 5310

C++ Programming Language

Classes and Pointers: Some Peculiarities (cont d.)

SYCL for OpenCL. Andrew Richards, CEO Codeplay & Chair SYCL Working group GDC, March Copyright Khronos Group Page 1

Object Oriented Software Design II

Storage Classes CS 110B - Rule Storage Classes Page 18-1 \handouts\storclas

Object Oriented Programming With C++(10CS36) Question Bank. UNIT 1: Introduction to C++

C++FA 5.1 PRACTICE MID-TERM EXAM

CORBA Programming with TAOX11. The C++11 CORBA Implementation

How To Port A Program To Dynamic C (C) (C-Based) (Program) (For A Non Portable Program) (Un Portable) (Permanent) (Non Portable) C-Based (Programs) (Powerpoint)

KITES TECHNOLOGY COURSE MODULE (C, C++, DS)

Objectif. Participant. Prérequis. Remarque. Programme. C# 3.0 Programming in the.net Framework. 1. Introduction to the.

Lecture 3. Optimising OpenCL performance

OpenACC 2.0 and the PGI Accelerator Compilers

Glossary of Object Oriented Terms

Getting Started with the Internet Communications Engine

Object Oriented Software Design II

Chapter 4 OOPS WITH C++ Sahaj Computer Solutions

C++FA 3.1 OPTIMIZING C++

PROBLEM SOLVING SEVENTH EDITION WALTER SAVITCH UNIVERSITY OF CALIFORNIA, SAN DIEGO CONTRIBUTOR KENRICK MOCK UNIVERSITY OF ALASKA, ANCHORAGE PEARSON

16 Collection Classes

Java Interview Questions and Answers

CEC225 COURSE COMPACT

Konzepte objektorientierter Programmierung

Course Name: ADVANCE COURSE IN SOFTWARE DEVELOPMENT (Specialization:.Net Technologies)

N3458: Simple Database Integration in C++11

Crash Course in Java

Introducing PgOpenCL A New PostgreSQL Procedural Language Unlocking the Power of the GPU! By Tim Child

Multichoice Quetions 1. Atributes a. are listed in the second part of the class box b. its time is preceded by a colon. c. its default value is

Introduction to C++ Introduction to C++ Week 7 Dr Alex Martin 2013 Slide 1

Polymorphism. Problems with switch statement. Solution - use virtual functions (polymorphism) Polymorphism

Subject Name: Object Oriented Programming in C++ Subject Code:

vector vec double # in # cl in ude <s ude tdexcept> tdexcept> // std::ou std t_of ::ou _range t_of class class V Vector { ector {

Applied Informatics C++ Coding Style Guide

Debugging. Common Semantic Errors ESE112. Java Library. It is highly unlikely that you will write code that will work on the first go

C++ Overloading, Constructors, Assignment operator

Moving from CS 61A Scheme to CS 61B Java

Chapter 5 Functions. Introducing Functions

Habanero Extreme Scale Software Research Project

Object-Oriented Programming Lecture 2: Classes and Objects

Java Programming Fundamentals

CIS 190: C/C++ Programming. Polymorphism

El Dorado Union High School District Educational Services

Elemental functions: Writing data-parallel code in C/C++ using Intel Cilk Plus

Altera SDK for OpenCL

PART-A Questions. 2. How does an enumerated statement differ from a typedef statement?

MSP430 C/C++ CODE GENERATION TOOLS Compiler Version 3.2.X Parser Error/Warning/Remark List

Free Java textbook available online. Introduction to the Java programming language. Compilation. A simple java program

C++ Support for Abstract Data Types

To Java SE 8, and Beyond (Plan B)

Free Java textbook available online. Introduction to the Java programming language. Compilation. A simple java program

Lecture 1 Introduction to Android

CpSc212 Goddard Notes Chapter 6. Yet More on Classes. We discuss the problems of comparing, copying, passing, outputting, and destructing

MAHALAKSHMI ENGINEERING COLLEGE B TIRUCHIRAPALLI

GBIL: Generic Binary Instrumentation Language. Andrew Calvano. September 24 th, 2015

Sources: On the Web: Slides will be available on:

Introduction to OpenCL Programming. Training Guide

CISC 181 Project 3 Designing Classes for Bank Accounts

Mitglied der Helmholtz-Gemeinschaft. OpenCL Basics. Parallel Computing on GPU and CPU. Willi Homberg. 23. März 2011

13 Classes & Objects with Constructors/Destructors

Last Class: OS and Computer Architecture. Last Class: OS and Computer Architecture

Introduction to Programming System Design. CSCI 455x (4 Units)

Binary storage of graphs and related data

CompuScholar, Inc. Alignment to Utah's Computer Programming II Standards

Django Assess Managed Nicely Documentation

INTRODUCTION TO OBJECTIVE-C CSCI 4448/5448: OBJECT-ORIENTED ANALYSIS & DESIGN LECTURE 12 09/29/2011

Java Programming Language

Overview. Lecture 1: an introduction to CUDA. Hardware view. Hardware view. hardware view software view CUDA programming

Design Patterns in C++

PHP Object Oriented Classes and objects

Outline of this lecture G52CON: Concepts of Concurrency

irods and Metadata survey Version 0.1 Date March Abhijeet Kodgire 25th

LAB4 Making Classes and Objects

WORKSPACE WEB DEVELOPMENT & OUTSOURCING TRAINING CENTER

1 bool operator==(complex a, Complex b) { 2 return a.real()==b.real() 3 && a.imag()==b.imag(); 4 } 1 bool Complex::operator==(Complex b) {

Binary compatibility for library developers. Thiago Macieira, Qt Core Maintainer LinuxCon North America, New Orleans, Sept. 2013

How To Write A Program In Anieme Frontend (For A Non-Programmable Language)

explicit class and default definitions revision of SC22/WG21/N1582 = and SC22/WG21/ N

Handout 1. Introduction to Java programming language. Java primitive types and operations. Reading keyboard Input using class Scanner.

JAVA - METHODS. Method definition consists of a method header and a method body. The same is shown below:

Delegating Constructors (revision 3)

An Overview of the C++ Programming Language

Software Engineering Concepts: Testing. Pointers & Dynamic Allocation. CS 311 Data Structures and Algorithms Lecture Slides Monday, September 14, 2009

Compile-time type versus run-time type. Consider the parameter to this function:

Operator Overloading. Lecture 8. Operator Overloading. Running Example: Complex Numbers. Syntax. What can be overloaded. Syntax -- First Example

Basics of C++ and object orientation in OpenFOAM

qwertyuiopasdfghjklzxcvbnmqwerty uiopasdfghjklzxcvbnmqwertyuiopasd fghjklzxcvbnmqwertyuiopasdfghjklzx cvbnmqwertyuiopasdfghjklzxcvbnmq

First Java Programs. V. Paúl Pauca. CSC 111D Fall, Department of Computer Science Wake Forest University. Introduction to Computer Science

Rigorous Software Development CSCI-GA

Laboratory Assignments of OBJECT ORIENTED METHODOLOGY & PROGRAMMING (USING C++) [IT 553]

1 External Model Access

Programming by Contract. Programming by Contract: Motivation. Programming by Contract: Preconditions and Postconditions

In order to understand Perl objects, you first need to understand references in Perl. See perlref for details.

Curriculum Map. Discipline: Computer Science Course: C++

Constructor, Destructor, Accessibility and Virtual Functions

Transcription:

OpenCL Static C++ Kernel Language Extension Document Revision: 04 Advanced Micro Devices Authors: Ofer Rosenberg, Benedict R. Gaster, Bixia Zheng, Irina Lipov December 15, 2011

Contents 1 Overview... 3 1.1 Supported Features... 3 1.2 Unsupported Features... 3 1.3 Relations with ISO/IEC C++... 4 2 Additions and Changes to Section 6 - The OpenCL C Programming Language... 5 2.1 Additions and Changes to Section 5.7.1 - Creating Kernel Objects... 5 2.2 Passing Classes between Host and Device... 5 3 Additions and Changes to Section 6 - The OpenCL C Programming Language... 6 3.1 Building C++ Kernels... 6 3.2 Classes and derived classes... 6 3.3 Namespaces... 7 3.4 Overloading... 7 3.5 Templates... 8 3.6 Exceptions... 8 3.7 Libraries... 9 3.8 Dynamic operation... 9 4 Examples... 10 4.1 Passing a class from the Host to the Device... 10 4.2 Kernel overloading... 11 4.3 Kernel Template... 11 5 References... 12 Page 2

1 Overview This extension defines OpenCL Static C++ kernel language, which is a form of the ISO/IEC Programming languages C++ specification [1]. The language supports some key features such as overloading and templates that can be resolved at compile time (hence static), while restricting the use of language features which requires dynamic/runtime resolving. At the same time, the language is extended to support most of the extended features described in Section 6 of OpenCL spec: new data types (vectors, images, samples, etc.), OpenCL Built-in functions and more. 1.1 Supported Features The following list contains the major C++ features which are supported by this extension: Kernel and function overloading Inheritance o Strict inheritance o Friend classes o Multiple inheritance Templates: o Kernel templates o Member templates o Template default argument o Limited class templates (the virtual keyword is not exposed) o Partial template specialization Namespaces References this operator Note that supporting templates and overloading highly improves the efficiency of writing code, as it allows developers to avoid replication of code when not necessary. Using kernel template & kernel overloading requires support from the Runtime API as well. AMD provides a simple extension to clcreatekernel which enables the user to specify the desired kernel. 1.2 Unsupported Features The following list contains the major C++ features which are not supported by this extension: Virtual functions, i.e. methods marked with the virtual keyword. Abstract classes, i.e. a class defined only of pure virtual functions. Dynamic memory allocation, i.e. non-placement new/delete support is not provided. Exceptions, i.e. there is no support for throw/catch. The :: operator STL and other standard C++ libraries are not supported. The language specified in this extension can be easily expanded to support these features. Page 3

1.3 Relations with ISO/IEC C++ This extension focuses mainly on documenting the differences between the OpenCL Static C++ kernel language and the ISO/IEC Programming languages C++ specification [1]. Where possible, this extension leaves technical definitions to the ISO/IEC specification. The main motivation for this is simply that the C++ specification is a large and complicated definition and there seems little benefit in replicating it here. Page 4

2 Additions and Changes to Section 6 - The OpenCL C Programming Language 2.1 Additions and Changes to Section 5.7.1 - Creating Kernel Objects In the static C++ kernel language, a kernel can be overloaded, tepmlated, or both overloaded and templated. The syntax which explains how to do it is defined in chapters 3.4 & 3.5. In order to support these cases, we added the following error codes that can be returned by clcreatekernel: CL_INVALID_KERNEL_TEMPLATE_TYPE_ARGUMENT_AMD if a kernel template argument is not a valid type, i.e. is neither a valid OpenCL C type or a user defined type in the same source file. CL_INVALID_KERNEL_TYPE_ARGUMENT_AMD if a kernel type argument, used for overloading resolution, is not valid a valid type, i.e. as above is neither a valid OpenCL C type or user defined type in the same source program. 2.2 Passing Classes between Host and Device The extension allows a developer to pass Classes between the Host and the Device. The mechanism that is used to pass the class to the device and back is the existing buffer object APIs. The Class which is passed maintains its state (public and private members), and the Complier implicitly changes the Class to use either the Host-side or Device-side methods. On the Host side, the Application creates the Class and an equivalent memory object with the same size (using the sizeof function). It can then use the class methods to set or change values of the class members. When the class is ready, the Application uses a standard buffer API to move the class to the device either Unmap or Write, then sets the buffer object as the adequate kernel argument and enqueues the kernel for execution. When the kernel finished the execution, the Application can map back (or read) the buffer object into the Class, and continue working on it. Page 5

3 Additions and Changes to Section 6 - The OpenCL C Programming Language 3.1 Building C++ Kernels In order to compile a program which contains C++ kernels and functions, the Application must add following compile option to clbuildprogramwithsource -x language where language is defined as one of the following: clc in this case the source language is considered to be OpenCL C as defined in the The OpenCL Programming Language version 1.1 *2+. clc++ - in this case the source language is considered to be OpenCL C++ as defined in the following sections of the this document. 3.2 Classes and derived classes OpenCL C is extended to support classes and derived classes as per Sections 9 and 10 [1], with the limitation that virtual functions and abstracts classes are not supported. The virtual keyword is reserved and the OpenCL C++ compiler is required to report a compile time error if it is used in the input program. This limitation restricts class definitions to be fully statically defined. There is nothing prohibiting a future version of OpenCL C++ from relaxing this restriction, pending performance implications. A class definition should not contain any address space qualifier, either for members or for methods: class myclass public: int mymethod1() return x; void local mymethod2()x = 0; // illegal private: int x; local y; // illegal ; The class invocation inside a kernel, however, can be either in private or local address space: kernel void mykernel() myclass c1; local myclass c2;... Page 6

Classes can be passed as arguments to kernels, by defining a buffer object at the size of the class, and using it. The device will invoke the adequate device-specific methods, and will access the class members passed from the host. OpenCL C kernels (defined with kernel) may not be applied to a class constructor, destructor, or method, except in the case that the class method is defined static and thus does not require object construction to be invoked. 3.3 Namespaces Namespaces are support without change as per [1]. 3.4 Overloading As defined in [1] when two or more different declarations are specified for a single name in the same scope, that name is said to be overloaded. By extension, two declarations in the same scope that declare the same name but with different types are called overloaded declarations. Only kernel and function declarations can be overloaded; object and type declarations cannot be overloaded. As per [1] there are a number of restrictions to how functions can be overloaded and these are defined formally in Section 13 [1]; we note here that kernels and functions cannot be overloaded by return type. Also the rules for well-formed programs as defined by Section 13 [1] are lifted to apply to both kernel and function declarations. Overloading resolution is per [1], Section 13.1, but extended to account for vector types. The algorithm for "best viable function", Section 13.3.3 [1], is extended for vector types by inducing a partial-ordering as a function of the partial-ordering of its elements. Following the existing rules for vector types in OpenCL 1.1 [2], explicit conversion between vectors is not allowed. (This reduces the number of possible overloaded functions with respect to vectors but we do not expect this to be a particular burden to developers as explicit conversion can always be applied at the point of function evocation). For overloaded kernels, the following syntax is used as part of the kernel name: foo(type 1,...,type n ) where type 1,...,type n must be either OpenCL scalar or vector type, or can be a user defined type that is allocated in the same source file as the kernel foo. In order to allow overloaded kernels, the user should use the following syntax: attribute ((mangled_name(mymangledname))) The kernel mangled name is used as a parameter to passed to clcreatekernel() API. This mechanism is needed to allow overloaded kernels without changing the existing OpenCL kernel creation API. Page 7

3.5 Templates OpenCL C++ provides support, without restriction, for C++ templates as defined in Section 14 [1]. The arguments to templates are extended to allow for all OpenCL base types, including vectors and pointers qualified with OpenCL C address spaces, i.e. global, local, private, and constant. OpenCL C++ kernels (defined with kernel) can be templated and can be called from within an OpenCL C (C++) program or as an external entry point, i.e from the host. For kernel templates the following syntax is used as part of the kernel name (assuming a kernel called foo): foo<type 1,...,type n > where type 1,...,type n must be either OpenCL scalar or vector type, or can be a user defined type that is allocated in the same source file as the kernel foo. In the case a kernel is both overloaded and templated: foo<type 1,...,type n >(type n+1,...,type m ) Note that in this case overloading resolution is done by first matching non-templated arguments in order they appear in the definition and then substituting template parameters. This allows for the intermixing of template and non-template arguments in the signature. In order to support template kernels, the same mechanism for kernel overloading is used. The user should use the following syntax: attribute ((mangled_name(mymangledname))) The kernel mangled name is used as a parameter to passed to clcreatekernel() API. This mechanism is needed to allow template kernels without changing the existing OpenCL kernel creation API. An implementation is not required to detect name collision with user specified kernel mangled names involved. 3.6 Exceptions Support for exceptions, as per Section 15 [1] is not support. The keywords: try catch throw are reserved and it is required that the OpenCL C++ compiler produce a static compile time error if they are used in the input program. Page 8

3.7 Libraries Support for the general utilities library as defined in Sections 20-21 [1] is not provided. The standard C++ libraries and STL library are not supported. 3.8 Dynamic operation Features related to dynamic operation are not supported: virtual modifier not allowed o OpenCL C++ prohibits the use of virtual modifier. As a result, virtual member functions and virtual inheritance are not supported. Dynamic_cast that requires runtime check is not allowed Dynamic storage allocation and deallocation Page 9

4 Examples 4.1 Passing a class from the Host to the Device Here s an example for passing a Class from the Host to the device and back. The class definition needs to be the same on the Host code and Device code, besides the members type in the case of vectors. If the class includes vector data types, the definition needs to be according to the table which appears on Section 6.1.2 (Corresponding API type for OpenCL Language types ) Kernel code: Class Test setx (int value); private: int x; kernel foo ( global Test* InClass,...) If (get_global_id(0) == 0) InClass->setX(5); Host Code: Class Test setx (int value); private: int x; MyFunc () tempclass = new(test);... // Some OpenCL startup code create context, queue, etc. cl_mem classobj = clcreatebuffer(context, CL_USE_HOST_PTR, sizeof(test),&tempclass, event); clenqueuemapbuffer(...,classobj,...); tempclass.setx(10); clenqueueunmapbuffer(...,classobj,...); //class is passed to the Device clenqueuendrange(..., fookernel,...); clenqueuemapbuffer(...,classobj,...); //class is passed back to the Host Page 10

4.2 Kernel overloading Here s an example for defining and used mangled name for kernel overloading, and choosing the right kernel rom the host code. Assume the following kernels are defined: attribute ((mangled_name(testaddfloat4))) kernel void testadd(global float4 * src1, global float4 * src2, global float4 * dst) int tid = get_global_id(0); dst[tid] = src1[tid] + src2[tid]; attribute ((mangled_name(testaddint8))) kernel void testadd(global int8 * src1, global int8 * src2, global int8 * dst) int tid = get_global_id(0); dst[tid] = src1[tid] + src2[tid]; The names testaddfloat4 and testaddint8 are the external names for the two kernel instants. When calling clcreatekernel, passing one of these kernel names will lead to the right overloaded kernel. 4.3 Kernel Template The following example defines a kernel template, testadd. It also defines two explicit instants of the kernel template, testaddfloat4 and testaddint8. The names testaddfloat4 and testaddint8 are the external names for the two kernel template instants that must be used as parameters when calling to clcreatekernel API. template <class T> kernel void testadd(global T * src1, global T * src2, global T * dst) int tid = get_global_id(0); dst[tid] = src1[tid] + src2[tid]; template attribute ((mangled_name(testaddfloat4))) kernel void testadd(global float4 * src1, global float4 * src2, global float4 * dst); template attribute ((mangled_name(testaddint8))) kernel void testadd(global int8 * src1, global int8 * src2, global int8 * dst); Page 11

5 References [1] Programming languages C++. International Standard ISO/IEC 14881, 1998. [2] The OpenCL Programming Language 1.2. Rev15 Khronos 2011. Page 12