I’ve been trying to get JCuda to use multiple GPUs simultaneously. Is this possible?
The best I’ve managed is to select which GPU to use by setting the environment variable CUDA_VISIBLE_DEVICES, but I have not been able to run cuInit() with an index other than 0.
JCuda should, of course, support multiple GPUs, along with basically everything else that CUDA supports (apart from some limitations for asynchronous operations). However, the value passed to cuInit is NOT a device index: it is the initialization flags value, and it must be 0 at the moment. Devices are selected afterwards by their zero-based index. If you have multiple devices, it should be possible to obtain a handle to the second device with
CUdevice secondDevice = new CUdevice();
cuDeviceGet(secondDevice, 1);
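To make this more concrete, here is a minimal sketch of how all available devices might be enumerated through the driver API, assuming JCuda is on the classpath and at least one CUDA device is present. The class name MultiDeviceSketch and the helper decodeName are made up for this illustration, and creating one context per device is just one possible way to organize things, not necessarily the recommended one.

```java
import jcuda.driver.CUcontext;
import jcuda.driver.CUdevice;
import jcuda.driver.JCudaDriver;

import java.nio.charset.StandardCharsets;

public class MultiDeviceSketch {

    // Hypothetical helper: converts the NUL-terminated byte buffer
    // filled by cuDeviceGetName into a Java String.
    static String decodeName(byte[] raw) {
        int length = 0;
        while (length < raw.length && raw[length] != 0) {
            length++;
        }
        return new String(raw, 0, length, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Throw a CudaException on errors instead of returning error codes.
        JCudaDriver.setExceptionsEnabled(true);

        // The argument is the flags value, not a device index; it must be 0.
        JCudaDriver.cuInit(0);

        // Ask the driver how many devices are visible to this process.
        int[] count = new int[1];
        JCudaDriver.cuDeviceGetCount(count);

        for (int i = 0; i < count[0]; i++) {
            // Obtain the handle for the device with (zero-based) index i.
            CUdevice device = new CUdevice();
            JCudaDriver.cuDeviceGet(device, i);

            byte[] name = new byte[256];
            JCudaDriver.cuDeviceGetName(name, name.length, device);
            System.out.println("Device " + i + ": " + decodeName(name));

            // Create a context on this device; modules, memory and kernel
            // launches for the device would be handled while it is current.
            CUcontext context = new CUcontext();
            JCudaDriver.cuCtxCreate(context, 0, device);

            // Release the context again once the work for this device is done.
            JCudaDriver.cuCtxDestroy(context);
        }
    }
}
```

Note that devices hidden via CUDA_VISIBLE_DEVICES are not counted by cuDeviceGetCount, so the indices used here refer only to the devices that are still visible.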
I also just received the mail about CUDA 4.1. Of course, JCuda will support CUDA 4.1, but it will require a recompilation of the native libraries. I’m not sure when I will find the time for the update. Actually, it should not take too long, but considering the pending update to CUBLAS2, the implementation of JNpp, and the fact that NVIDIA does not give a roadmap for the final release, I still have to see what the best "strategy" is for the release of the next version.