Kernel Execution Time Measurement

Is there any way to measure kernel execution times when using JCuda? I guess I can’t use the cutCreateTimer() function from the CUTIL library.

I don’t know if there is a dedicated function in JCuda, but a simple method could be:

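long timeBefore = System.nanoTime();
// Launch the kernel here, then wait for it to finish (e.g. with cuCtxSynchronize()),
// because kernel launches return before the kernel has completed
long timeAfter = System.nanoTime();
System.out.println("Execution in " + (timeAfter - timeBefore) + " ns");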
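If it helps, here is a slightly more complete sketch of the same idea. It does some untimed warm-up runs first and then averages over several timed runs, which smooths out JIT compilation and first-launch overhead. The class name NanoTimeBenchmark and the launchAndSync Runnable are only placeholders I made up for illustration: here the Runnable just sleeps, while in real code it would launch the kernel and then wait for the GPU to finish.

```java
import java.util.concurrent.TimeUnit;

public class NanoTimeBenchmark {

    /**
     * Runs the task warmupRuns times without timing it, then returns
     * the mean duration of timedRuns further runs, in milliseconds.
     */
    public static double averageMillis(Runnable task, int warmupRuns, int timedRuns) {
        if (timedRuns <= 0) {
            throw new IllegalArgumentException("timedRuns must be positive");
        }
        for (int i = 0; i < warmupRuns; i++) {
            task.run();
        }
        long timeBefore = System.nanoTime();
        for (int i = 0; i < timedRuns; i++) {
            task.run();
        }
        long timeAfter = System.nanoTime();
        return (timeAfter - timeBefore) / 1e6 / timedRuns;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for "launch the kernel, then synchronize".
        Runnable launchAndSync = () -> {
            try {
                TimeUnit.MILLISECONDS.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        double ms = averageMillis(launchAndSync, 2, 10);
        System.out.printf("Average execution time: %.3f ms%n", ms);
    }
}
```

The important part is that the Runnable must not return before the kernel has finished, otherwise only the launch overhead is measured.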


Yes, the way Bertrand suggested is appropriate for most cases. JCuda also offers functionality for measuring the elapsed time between two CUDA events, i.e. using the cudaEvent_t/CUevent classes. An example of how these events may be used with the Runtime API is here: - for the Driver API, it will be similar.

But the test from the above link showed that there is hardly a noticeable difference between measuring the time with CUDA events and measuring it with System.nanoTime() as Bertrand suggested, and for most cases the latter will be much more convenient to use.

Actually, I already started porting the CUTIL library to Java, as far as this makes sense: some of the functions are heavily tailored for C. In contrast to JCuda itself, the goal of the “JCudaUtils” will not be to resemble the original API, but to provide methods with functionality similar to CUTIL, in a slightly more Java-like style. However, there will also be a Timer class, which offers functions similar to the timer functions of CUTIL.
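To give a rough idea of what such a timer could look like, here is a minimal sketch. Note that this is not the actual JCudaUtils class: the name SimpleTimer and all of its methods are assumptions made for illustration, loosely following the start/stop/reset/average pattern of the CUTIL timer functions.

```java
public class SimpleTimer {
    private long startNanos;   // time stamp of the most recent start()
    private long totalNanos;   // accumulated time over all sessions
    private int sessions;      // number of completed start/stop pairs
    private boolean running;

    /** Starts a new timing session. */
    public void start() {
        startNanos = System.nanoTime();
        running = true;
    }

    /** Stops the current session and adds its duration to the total. */
    public void stop() {
        if (!running) {
            return; // stop() without a matching start() is ignored
        }
        totalNanos += System.nanoTime() - startNanos;
        sessions++;
        running = false;
    }

    /** Clears the accumulated time and the session count. */
    public void reset() {
        totalNanos = 0;
        sessions = 0;
        running = false;
    }

    public int getSessions() {
        return sessions;
    }

    /** Total time over all completed sessions, in milliseconds. */
    public double getTotalMillis() {
        return totalNanos / 1e6;
    }

    /** Mean time per completed session in milliseconds, 0 if there is none. */
    public double getAverageMillis() {
        return sessions == 0 ? 0.0 : totalNanos / 1e6 / sessions;
    }

    public static void main(String[] args) {
        SimpleTimer timer = new SimpleTimer();
        for (int i = 0; i < 3; i++) {
            timer.start();
            // The kernel launch and the synchronization would go here.
            timer.stop();
        }
        System.out.println("Average: " + timer.getAverageMillis() + " ms over "
            + timer.getSessions() + " sessions");
    }
}
```

As with plain System.nanoTime(), the stop() call should only happen after the kernel has really finished, so a synchronization has to come before it.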