Programming/OpenCL

Introduction to OpenCL

OpenCL programming

The first thing to realise when porting a code to a GPU is that the GPU does not share memory with the CPU. In other words, a GPU does not have direct access to the host memory. The host memory is generally larger, but slower, than the GPU memory. To use a GPU, data must therefore be transferred from the main program to the GPU through the PCI bus, which has a much lower bandwidth than either memory. This means that managing data transfer between the host and the GPU is of paramount importance. Transferring the data and the code onto the device is called offloading.
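To get a feel for why the transfer cost matters, the time to move an array can be estimated as its size divided by the bandwidth of the link. The sketch below does this in C. The function names (`transfer_seconds`, `print_transfer_estimate`) are hypothetical helpers introduced for illustration, and any bandwidth figures passed in are assumptions to be replaced with values for the actual hardware.

```c
#include <stddef.h>
#include <stdio.h>

/* Seconds needed to move `bytes` over a link that sustains
   `bytes_per_second`. A deliberately simple model: latency and
   protocol overheads are ignored. */
double transfer_seconds(double bytes, double bytes_per_second)
{
    return bytes / bytes_per_second;
}

/* Print a rough comparison for an array of `n_floats` single-precision
   values. `bus_bw` and `device_bw` are caller-supplied bandwidths in
   bytes per second; they are assumptions, not measured values. */
void print_transfer_estimate(size_t n_floats, double bus_bw, double device_bw)
{
    double bytes = (double)n_floats * (double)sizeof(float);

    printf("array size           : %.2f GB\n", bytes / 1e9);
    printf("host -> device (bus) : %.2f ms\n",
           transfer_seconds(bytes, bus_bw) * 1e3);
    printf("read in GPU memory   : %.2f ms\n",
           transfer_seconds(bytes, device_bw) * 1e3);
}
```

For example, calling `print_transfer_estimate((size_t)1 << 28, 16e9, 500e9)` models a 1 GiB array with an assumed 16 GB/s bus and an assumed 500 GB/s GPU memory: roughly 67 ms to cross the bus against about 2 ms to read the same data once it is on the device. Under these assumptions, data should be kept on the GPU across as many kernels as possible.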

OpenCL (Open Computing Language) is a framework for writing programs that execute across heterogeneous platforms consisting of central processing units (CPUs), graphics processing units (GPUs), digital signal processors (DSPs), field-programmable gate arrays (FPGAs) and other processors or hardware accelerators. OpenCL specifies programming languages (based on C99 and C++11) for programming these devices and application programming interfaces (APIs) to control the platform and execute programs on the compute devices. OpenCL provides a standard interface for parallel computing using task- and data-based parallelism.

OpenCL vs CUDA

At first glance these two paradigms look quite similar; however, there are some differences:

CUDA is mature and efficient, and it has many tools and libraries.

However, it can only be used on NVIDIA GPUs.

OpenCL is designed for a wide range of processors, including CPUs and GPUs from vendors such as AMD and NVIDIA, as well as DSPs and FPGAs (many heterogeneous platforms).

However, it is not as mature or as widely used as CUDA C, and OpenCL programs tend to be more verbose than their CUDA equivalents.

In the following example, we take a code comprising two loops.

Example OpenCL C/C++ code
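As a minimal sketch of how a two-loop code maps onto OpenCL, assume the first loop initialises two arrays and the second adds them (this choice of loops is an assumption made for illustration). In OpenCL each loop body becomes a kernel executed once per work-item, with `get_global_id(0)` taking the place of the loop index. The C code below shows the serial loops, the corresponding OpenCL C kernel source as a string, and a host-only emulation in which an ordinary loop stands in for the work-items so that the mapping can be checked without a device. All function names are hypothetical, and the real host-side setup (platform, context, queue, buffers) is omitted.

```c
#include <stddef.h>

/* OpenCL C source for the two kernels, held as a string in the way a host
   program would hand it to clCreateProgramWithSource. It is shown for
   comparison only; this sketch does not compile or launch it. */
const char *const kernel_source =
    "__kernel void init(__global float *a, __global float *b,\n"
    "                   const unsigned int n)\n"
    "{\n"
    "    size_t i = get_global_id(0);\n"
    "    if (i < n) { a[i] = (float)i; b[i] = 2.0f * (float)i; }\n"
    "}\n"
    "__kernel void add(__global const float *a, __global const float *b,\n"
    "                  __global float *c, const unsigned int n)\n"
    "{\n"
    "    size_t i = get_global_id(0);\n"
    "    if (i < n) c[i] = a[i] + b[i];\n"
    "}\n";

/* Serial reference: two loops over the same index range. */
void two_loops_serial(float *a, float *b, float *c, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        a[i] = (float)i;
        b[i] = 2.0f * (float)i;
    }
    for (size_t i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
}

/* Loop bodies rewritten as per-work-item functions. `gid` plays the role
   of get_global_id(0); the bounds check mirrors the kernels above. */
void init_item(size_t gid, float *a, float *b, size_t n)
{
    if (gid < n) {
        a[gid] = (float)gid;
        b[gid] = 2.0f * (float)gid;
    }
}

void add_item(size_t gid, const float *a, const float *b, float *c, size_t n)
{
    if (gid < n)
        c[gid] = a[gid] + b[gid];
}

/* Host-only stand-in for enqueueing two 1-D kernels. On a device the
   work-items would run in parallel; here they run one after another.
   `global_size` may exceed `n`, as happens when the global size is
   rounded up to a multiple of the work-group size. */
void two_loops_as_kernels(float *a, float *b, float *c,
                          size_t n, size_t global_size)
{
    for (size_t gid = 0; gid < global_size; ++gid)
        init_item(gid, a, b, n);
    for (size_t gid = 0; gid < global_size; ++gid)
        add_item(gid, a, b, c, n);
}
```

Because both kernels work on the same arrays, `a` and `b` can stay in device memory between the two launches, so only `c` needs to be copied back to the host.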

