I’ve tried JCuda on my multi-Tesla host in the normal (default) compute mode and it’s working correctly (all the threads go to device 0):
JVM 1 -> Tesla dev 0
JVM 2 -> Tesla dev 0
But we want each of our JVMs to bind to its own Tesla card by using the compute exclusive mode, set via the nvidia-smi tool:
JVM1 -> Tesla 0
JVM2 -> Tesla 1
When I try exclusive mode I notice that the first Jcuda instance goes to device 0 ok but the 2nd one tries device 0, fails as expected but doesn’t then move on to device 1. Is this a known limitation of jcuda? So now I suspect I should try this:
// assumes: import static jcuda.runtime.JCuda.*;
int[] devcount = new int[1];
cudaGetDeviceCount(devcount);
int[] devlist = new int[devcount[0]];
for (int i = 0; i < devcount[0]; i++) {
    devlist[i] = i; // load array in order with 0,1,…, devcount[0]-1
}
// hand the ordered candidate list to the runtime
cudaSetValidDevices(devlist, devcount[0]);
The CUDA Reference says cudaSetValidDevices will try devices on the devlist until “it finds one that works” which I assume means not blocked via exclusive mode because a previous thread is on it.
I have to admit that I only occasionally have the chance to work on a PC with 2 devices (no Tesla devices). Furthermore, my Linux PC has no device at all, and is only used for the compilation of the binaries. Thus I could not yet use the smi-tool to test the different compute modes, and cannot say whether this is a general problem with JCuda or only some unexpected behavior for which a workaround may be found.
When I try exclusive mode I notice that the first Jcuda instance goes to device 0 ok but the 2nd one tries device 0, fails as expected but doesn’t then move on to device 1.
Due to my lack of experience with this, I have to ask: In which way is it “failing”?
So now I suspect I should try this: …
…
The CUDA Reference says cudaSetValidDevices will try devices on the devlist until “it finds one that works” which I assume means not blocked via exclusive mode because a previous thread is on it.
The Programming Guide also states that: The devices that are in exclusive mode AND occupied by another thread should be skipped.
Of course it would be desirable to use this sort of “s(e)mi-automatic” device selection, but just for testing: Did you try setting the devices to be used manually and explicitly with cudaSetDevice?
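For such a manual test, the "try each device in order and skip the ones that are taken" logic could be sketched roughly as below. This is only an illustration of the fallback loop, not JCuda code: DevicePicker, pickFirstFree and the canUse probe are hypothetical names I made up, and the "occupied" array merely simulates a card that is blocked in exclusive mode. In a real test the probe would presumably call cudaSetDevice for the given ordinal, force the context to be created, and report whether that succeeded; I have not been able to verify that part on exclusive-mode hardware.

```java
import java.util.function.IntPredicate;

// Hypothetical helper, not part of JCuda: walks the candidate devices in
// order and returns the first one that the supplied probe accepts.
final class DevicePicker {

    /**
     * Returns the first device ordinal in [0, deviceCount) for which
     * canUse.test(ordinal) is true, or -1 if every device is rejected.
     */
    static int pickFirstFree(int deviceCount, IntPredicate canUse) {
        for (int dev = 0; dev < deviceCount; dev++) {
            if (canUse.test(dev)) {
                return dev; // first usable device wins
            }
        }
        return -1; // nothing usable, caller has to handle this
    }

    public static void main(String[] args) {
        // Simulated host with two cards: device 0 is already taken by
        // another process, device 1 is free (assumption for the demo only).
        boolean[] occupied = { true, false };
        int chosen = pickFirstFree(occupied.length, dev -> !occupied[dev]);
        System.out.println("chosen device: " + chosen); // prints "chosen device: 1"
    }
}
```

With the simulated probe the second JVM would end up on device 1, which is the behavior you expected from exclusive mode. If the real probe still gets stuck on device 0, that would at least tell us where exactly the failure occurs.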
Ok, I think I have this fixed. I was using cudaSetDevice and didn’t need to.
P.S. Regarding my problem with posting here, I must have made two accounts accidentally, one from my home laptop and one from work. I know which is the working one so I’ll change my bookmarks to the good one.