Detects and proves absence of data races and barrier divergence in OpenCL and CUDA kernels
Click on a component of the diagram for further details.
DSLs allow massively parallel computations to be expressed in a compact manner through high-level built-in operators.
We are investigating DSLs for image processing and linear algebra to be compiled into PENCIL and downstream to highly optimised OpenCL code
PENCIL is a Platform-Neutral Compute Intermediate Language being designed as part of the CARP project. The goal of PENCIL is to enable core functionality of accelerator applications to be succinctly expressed in an platform-independent manner, in a form that enables optimising compilation into OpenCL for a range of platforms. To aid compilation, the language will provide support for manually specified performance metadata.
As well as being an input language for optimising compilers, a design goal is for PENCIL to be suitable as a target language for DSL compilers.
Check out the preliminary design of PENCIL through our PENCIL poster and PENCIL position paper.
Open Computing Language (OpenCL) is an industry standard model for programming heterogeneous multicore and manycore systems, maintained by the Khronos Group. OpenCL is widely used for GPU programming, with OpenCL implementations provided by all major GPU vendors.
The compilation tools developed during CARP will target OpenCL as an output language, so that vendor-specific tools can be used to enable CARP-generated code to run on individual platforms.
In addition, to support OpenCL programming directly, we are designing verification and analysis techniques to help find bugs in, or prove correctness of, OpenCL kernels. See the GPUVerify tool for more details.
The main goal of CARP is to make it easy to write high-performance code that will run correctly across a range of platforms.
During the project we will evaluate our techniques with respect to hardware from the major vendors, including NVIDIA, AMD and project partner ARM.
By using OpenCL as a target for the CARP compilers, we ensure support across all platforms that support the OpenCL standard.
We are designing compilation tools to translate programs written in our linear algebra and image processing DSLs into PENCIL.
If PENCIL's design goals are met, it should be a straightforward process to construct these DSL → PENCIL compilers.
Furthermore, PENCIL's support for performance metadata will allow high-level information available at the DSL level to be carried through the compilation process, and communicated directly to the PENCIL → OpenCL compiler.
We are designing formal verification techniques to analyse accelerator applications written in both PENCIL and OpenCL.
For details of our OpenCL verification work, check out the GPUVerify tool.
Analysis of PENCIL programs will be based on a formal semantics for the language, which is being developed as part of the language design process.
The polyhedral model has shown great promise as a framework for building optimising compilers for parallel architectures, and is an excellent fit for graphics processing units.
PENCIL programs will be compiled into OpenCL using novel extensions of the polyhedral approach, exploiting performance metadata to go beyond what is possible in the traditional polyhedral model.
The PENCIL language will provide support for performance metadata to be provided either by the programmer, or by a DSL → PENCIL compiler.
Building on prior work on decoupled access/execute specifications, performance metadata will be used to allow optimisations that go beyond what can be achieved using the polyhedral model.
While high-level programming models such as PENCIL aim to reduce the need to write low-level code by hand, there will always be cases where, for reasons of performance, programmers wish to bypass high-level approaches and optimise manually.
To support this approach we are developing formal verification techniques for OpenCL, to assist programmers in writing correct low-level code, and we are designing PENCIL so that integration with hand-optimised components is straightforward.
Targeting the OpenCL standard ensures portability across multiple platforms, but does not guarantee portable performance: an OpenCL-accelerated application needs to be specialised to perform well on a particular architecture.
To cater for this we plan to exploit the wealth of recent progress on profile-guided auto tuning that has happened over the last few years, so that our PENCIL → OpenCL compiler repeatedy generates code variants for an application with respect to a given architecture until a high-peformance fit is obtained.
We are developing techniques for cost analysis that can provide information on energy usage of an application, and plan to use these techniques to enable auto tuning with respect to energy as well as runtime performance.
Project number 287767. Coordinator: Dr Alastair F. Donaldson, alastair.donaldson@imperial.ac.uk