Hi,
I am currently trying to implement my own version of the Jacobi iteration. However, when I invoke the kernel function multiple times (without changing any arguments), I always get “CUDA_ERROR_LAUNCH_FAILED” as the result.
The same kernel implementation works like a charm in plain CUDA. Only JCuda throws this error, and only from the second call onwards; the first call succeeds. Is there anything I have to take into account when calling a kernel multiple times? The error can even be reproduced with an empty kernel.
The call currently looks like this:
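for (int i = 0; i < iterations; i++) {
    cuLaunchKernel(function,
        gridSize, 1, 1,           // grid dimensions (x, y, z)
        blockSize, 1, 1,          // block dimensions (x, y, z)
        0, null,                  // shared memory bytes, stream (null = default stream)
        kernelParameters, null    // kernel parameters, extra options
    );
    cuCtxSynchronize();           // wait for the launch to finish before the next iteration
}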
Thanks,
Matthias