I have been trying to get the Visual Profiler working with an application written using JCuda. When I try to profile my application, I get the following exception at the point where it performs a host-to-device copy:
jcuda.CudaException: CUDA_ERROR_UNKNOWN
at jcuda.driver.JCudaDriver.checkResult(JCudaDriver.java:170)
at jcuda.driver.JCudaDriver.cuMemcpyHtoD(JCudaDriver.java:3023)
I don’t think there’s anything wrong with my memcpy code. If I skip the memcpy before the kernel launch, the launch itself throws a similar error. My application (as a runnable jar) works fine when I run it from the command line. I’m running it under Windows 7 with the 3.2a JCuda build.
Has anybody got this to work? Any suggestions?
Incidentally, CUDA_ERROR_UNKNOWN does not have the correct value in CUresult - it’s written as 99 rather than 999!
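To illustrate why the value matters, here is a minimal sketch of a code-to-name lookup. It is a hypothetical stand-in, not JCuda's actual CUresult class: the class name, the table and nameOf helpers, and the fallback message are all made up for this example. The only facts taken from CUDA itself are that the driver API header defines CUDA_SUCCESS as 0 and CUDA_ERROR_UNKNOWN as 999.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a CUresult-style constants class (NOT JCuda's real code).
class CuResultSketch {
    static final int CUDA_SUCCESS = 0;
    // Value defined for CUDA_ERROR_UNKNOWN in the CUDA driver API header (cuda.h).
    static final int CUDA_ERROR_UNKNOWN = 999;
    // The mistyped value reported in the post above.
    static final int MISTYPED_UNKNOWN = 99;

    // Builds a code-to-name table, using the given value for CUDA_ERROR_UNKNOWN.
    static Map<Integer, String> table(int unknownValue) {
        Map<Integer, String> names = new HashMap<>();
        names.put(CUDA_SUCCESS, "CUDA_SUCCESS");
        names.put(unknownValue, "CUDA_ERROR_UNKNOWN");
        return names;
    }

    // Looks up a result code; unknown codes get a generic fallback string.
    static String nameOf(Map<Integer, String> names, int code) {
        return names.getOrDefault(code, "unrecognized CUresult " + code);
    }

    public static void main(String[] args) {
        int fromDriver = 999; // what the native driver returns for an unknown error

        // With the mistyped constant, the code is not recognized.
        System.out.println(nameOf(table(MISTYPED_UNKNOWN), fromDriver));
        // prints "unrecognized CUresult 999"

        // With the corrected constant, the lookup succeeds.
        System.out.println(nameOf(table(CUDA_ERROR_UNKNOWN), fromDriver));
        // prints "CUDA_ERROR_UNKNOWN"
    }
}
```

Whether the real bindings are affected in the same way depends on how they map result codes to names, which this sketch does not attempt to reproduce. Any comparison such as result == CUresult.CUDA_ERROR_UNKNOWN would, however, fail against a constant of 99.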
Just a short note: I made a “mistake”, and updated to CUDA 4.0RC. They changed a lot there (one could say: Everything…). It will take some time until I have everything again “up and running” on my main development machine, but I’ll try to run a test with the Visual Profiler on a different (3.2) machine as soon as I get the chance.
OK, I did a short test, but it does not seem to work: Even with the runtime API, it reports an ‘unknown error’ for the first operation that involves GPU memory.
Currently, I have no idea what the problem might be. (A similar setup, namely JOCL with gDEBugger, seemed to work, so GPU profiling and debugging of a DLL that is loaded through Java should not be a problem in general.) Maybe this case is simply not supported by the Visual Profiler? Maybe the DLL has to be compiled with ‘debug’ settings? I’m not sure, and I will need more time to investigate this in detail. CUDA 4.0 seems to introduce some features for programmatic profiling, but I have only just started looking into them.
Sorry for the inconvenience. I see that being able to profile JCuda applications is an important feature, and I’ll try to find a solution for that.
As a matter of interest, I can report that using the JCuda 3.2 bindings under the CUDA 4.0 release candidate drivers and visual profiler works without any further changes.
Huh, admittedly, I did not test this, but there are lots of changes in the main CUDA library and especially in the Runtime Library. I probably need a vacation to get all these updates ready in time…