A Modern GPU Compiler for .NET Programs

Developed by Marcel Köster

A modern, lightweight, and fast GPU compiler for high-performance .NET programs

  • What is ILGPU?

    ILGPU is a new JIT (just-in-time) compiler for high-performance GPU programs (also known as kernels) written in .NET-based languages. ILGPU is written entirely in C# without any native dependencies, which allows you to write GPU programs that are truly portable.

  • ILGPU combines the convenience of C++ AMP with the high performance of CUDA. Functions used in the scope of kernels do not have to be annotated (ordinary C# functions work as-is) and are allowed to work on value types. All kernels, including those that use hardware features such as shared memory, atomics, and warp shuffles, can be executed and debugged on the CPU using the integrated multi-threaded CPU accelerator.

  • ILGPU is free to use. It is released under the University of Illinois/NCSA Open Source License and is an open-source project supported mainly by G-Research. Support the project with contributions or a small donation to speed up development and keep the project alive.


High Performance

Fast kernel compilation, dispatch, and execution. Furthermore, type-safe kernel delegates avoid boxing.

High Convenience

Use the power of C# or F# to write high-level kernels and execute them on the GPU. No need to program in C++, CUDA, or OpenCL.

CPU Accelerator

Single- or multi-threaded execution of kernels on the CPU. This is also useful for debugging or emulation of specific target platforms.

Advanced Debugging

High-level kernel debugging using your favorite .NET debugger. Furthermore, the single-threaded execution feature lets you focus on the algorithm instead of the parallelism.

No Function Annotations

Functions do not have to be annotated in order to use them in the scope of kernels.

Any-CPU Builds

Compile your applications for any CPU. ILGPU will automatically adjust everything else for x86 or x64 platforms.

Implicitly Grouped Kernels

Implicitly grouped kernels let you implement high-level kernels without paying attention to low-level index computations or tiling.
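
As a rough illustration of what an implicitly grouped kernel can look like, here is a minimal sketch. It assumes the API names of the ILGPU 1.x releases (`Context.CreateDefault`, `Index1D`, `Allocate1D`, `LoadAutoGroupedStreamKernel`); earlier releases, including the one announced on this page, use differently named types, so adjust to the version you have installed. The kernel name `AddOffsetKernel` and the buffer size are made up for the example.

```csharp
using System;
using ILGPU;
using ILGPU.Runtime;

public static class Program
{
    // Implicitly grouped kernel: the first parameter is the global index of
    // the current thread. No group size, tiling, or manual index arithmetic
    // appears in the body. (Hypothetical kernel name, for illustration only.)
    static void AddOffsetKernel(Index1D index, ArrayView<int> data, int offset)
    {
        data[index] = index + offset;
    }

    public static void Main()
    {
        // Assumption: ILGPU 1.x entry points.
        using Context context = Context.CreateDefault();

        // preferCPU: true asks for the built-in CPU accelerator, so the
        // sketch does not need GPU hardware and can be stepped through
        // in a regular .NET debugger.
        using Accelerator accelerator = context
            .GetPreferredDevice(preferCPU: true)
            .CreateAccelerator(context);

        // Compile the kernel and obtain a strongly typed launcher delegate.
        var kernel = accelerator.LoadAutoGroupedStreamKernel<
            Index1D, ArrayView<int>, int>(AddOffsetKernel);

        // Allocate 1024 integers on the accelerator.
        using var buffer = accelerator.Allocate1D<int>(1024);

        // Launch one thread per element; ILGPU chooses the grouping.
        kernel((int)buffer.Length, buffer.View, 42);
        accelerator.Synchronize();

        // Copy the results back to the host and inspect one element.
        int[] result = buffer.GetAsArray1D();
        Console.WriteLine(result[10]);
    }
}
```

Switching `preferCPU` to `false` lets ILGPU pick a GPU device when one is available; the kernel itself stays unchanged.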

Multi-dimensional Indices

Multi-dimensional index types simplify address computations and kernel writing.

Array Views

Views of memory regions eliminate pointer arithmetic and greatly simplify index computations.

Shared Memory

Support for shared (scratch-pad) memory in kernels via array views. Static or dynamic allocation of shared memory is supported.

Atomics and Low-Level Intrinsics

Easy access to atomic functions and low-level intrinsics like warp shuffles. All functions are supported during CPU debugging.

High-Performance Math Functions

Default math functions and operations are mapped to high-performance math functions. Furthermore, there is support for fast math and forced 32-bit math to avoid doubles.

Features Comparison

Features ILGPU C++ AMP CUDA
.NET Code
C++ Code
Function Annotations Required
Intel GPUs
High-Level Abstractions (Implicitly Grouped Kernels, ...)
Low-Level Intrinsics
High-Performance Math Functions
Cross-Platform Support
Single-Compilation Cross-Platform Support
Direct Multi-GPU Support
Convenient Algorithm Debugging
Debugging on GPU Hardware
Kernel Profiling
CPU Runtime
CPU Runtime with Shared Memory and Low-Level Intrinsics
SIMD CPU Runtime

Yellow checkmarks indicate partial or limited support.
Features marked with an orange checkmark will be available in the future.

Ready to get started?

Take a look through the documentation, create your first ILGPU project, and join the community on Discord!


Updates and news related to ILGPU and the community.

New Talk to the Dev(s) Timeslot 2021!

There will be a weekly talk-to-the-dev(s) meeting on the Discord server every Wednesday from 10-11pm Berlin Time, 8-9am Canberra Time (+1 day, Thursday), 4-5pm New York Time, 1-2pm California Time. The first meeting using the updated time slot will take place on March 3rd 2021.

Enhanced Progress Visibility

Starting in February 2021, an updated ILGPU version will be available every 6-8 weeks. As the community grows, new features will be explicitly tracked on the GitHub issues page and linked to the milestone to which they belong.

Major version released February 15th, 2021

The new stable version offers significant performance improvements for the generated kernel programs and contains critical resource de-allocation fixes. It is strongly recommended to upgrade to this version as soon as possible to avoid resource- and GC-related de-allocation issues.


Frequently Asked Questions

  • Are exceptions supported?

    Exceptions require support for exception handlers and limited support for reference types. Changes to the "intended" control flow (which can be caused by exceptions) are currently not supported. However, there might be a conversion phase in the future that converts several exceptions into debug assertions.

  • Debug assertions are supported on all accelerators and can be enabled via one of the Context flags.

  • Reference types are currently not supported. However, limited support for reference types will be added in the future. This will also allow the implementation of delegates. Lambda functions (or delegates in general) are currently not supported since they require limited support for reference types and custom code-transformation passes. Support for lambda functions will be added in the future.

  • There is basic support for hardware-based kernel debugging and profiling. However, CPU-based kernel debugging is recommended in all cases due to the advanced debugging and testing capabilities.

  • The new ILGPU version supports .NET Framework 4.7 and .NET Standard 2.1 (e.g. .NET Core 3.1/.NET 5.0).

  • ILGPU supports .NET Core, which allows writing portable .NET applications. Since ILGPU is written in C# and does not rely on native libraries in the current version, kernels can be run on all .NET Core/.NET 5 compatible platforms. This allows you to compile your application (including GPU code) only once.