JCuda not trying devices after device id 0 on multiple-card hosts

I’ve tried JCuda on my multi-Tesla host in compute normal mode and it works correctly (all the threads go to device 0):

JVM 1 -> Tesla dev 0
JVM 2 -> Tesla dev 0

But we want each of our JVMs to bind to its own Tesla card by using compute exclusive mode, set via the nvidia-smi tool:

JVM1 -> Tesla 0
JVM2 -> Tesla 1

When I try exclusive mode, I notice that the first JCuda instance goes to device 0 OK, but the second one tries device 0, fails as expected, and then doesn’t move on to device 1. Is this a known limitation of JCuda? I now suspect I should try this:

// assumes: import static jcuda.runtime.JCuda.*;
int[] devcount = new int[1];
cudaGetDeviceCount(devcount);

int[] devlist = new int[devcount[0]];
for (int i = 0; i < devcount[0]; i++) {
    devlist[i] = i; // fill the array in order: 0, 1, …, devcount[0]-1
}

cudaSetValidDevices(devlist, devcount[0]);

The CUDA Reference says cudaSetValidDevices will try devices on the devlist until “it finds one that works”, which I assume means one that is not blocked via exclusive mode because a previous thread is on it.
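
To make the assumed behavior concrete, here is a minimal plain-Java sketch of "try each listed device until one works". It does not call JCuda and is not how the CUDA runtime implements cudaSetValidDevices; the class name DeviceFallback, the method firstUsable, and the canUse check are all hypothetical names used only for illustration.

```java
import java.util.function.IntPredicate;

// Plain-Java model of the selection order only; nothing here touches CUDA.
// DeviceFallback and firstUsable are hypothetical names, not JCuda API.
final class DeviceFallback {

    /**
     * Walks the candidate device ids in order and returns the first one
     * that the supplied check accepts, or -1 if every candidate is rejected.
     */
    static int firstUsable(int[] candidates, IntPredicate canUse) {
        for (int device : candidates) {
            if (canUse.test(device)) {
                return device; // first device that "works"
            }
        }
        return -1; // no candidate was usable
    }
}
```

With candidates {0, 1} and a check that rejects device 0 (standing in for a card already claimed in exclusive mode), firstUsable returns 1, which is the outcome wanted for the second JVM.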

Mark

Hello

I have to admit that I only occasionally have the chance to work on a PC with 2 devices (no Tesla devices). Furthermore, my Linux PC has no device at all, and is only used for the compilation of the binaries. Thus I have not yet been able to use the nvidia-smi tool to test the different compute modes, and cannot say whether this is a general problem with JCuda or only some unexpected behavior for which a workaround may be found.

When I try exclusive mode, I notice that the first JCuda instance goes to device 0 OK, but the second one tries device 0, fails as expected, and then doesn’t move on to device 1.

Due to my lack of experience with this, I have to ask: in what way is it “failing”?

So now I suspect I should try this: …

The CUDA Reference says cudaSetValidDevices will try devices on the devlist until “it finds one that works”, which I assume means one that is not blocked via exclusive mode because a previous thread is on it.

The Programming Guide also states that devices that are in exclusive mode and occupied by another thread should be skipped.
Of course it would be desirable to use this sort of “s(e)mi-automatic” device selection, but just for testing: did you try setting the devices to be used manually and explicitly with cudaSetDevice?
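
For such a manual test, each JVM needs to be told which device to use. A minimal sketch, assuming the index is passed in as a string (for example from a system property); the class DeviceArg, the method parseDeviceIndex, and the property name cuda.device mentioned below are hypothetical and not part of JCuda:

```java
// Hypothetical helper: validates a per-JVM device index before it is
// handed to cudaSetDevice. Nothing here calls JCuda itself.
final class DeviceArg {

    /**
     * Parses value as a device index and checks that it lies in
     * the range 0 .. deviceCount-1. Throws IllegalArgumentException otherwise.
     */
    static int parseDeviceIndex(String value, int deviceCount) {
        if (value == null) {
            throw new IllegalArgumentException("No device index given");
        }
        int index;
        try {
            index = Integer.parseInt(value.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Not a device index: " + value, e);
        }
        if (index < 0 || index >= deviceCount) {
            throw new IllegalArgumentException(
                "Device index " + index + " is outside 0.." + (deviceCount - 1));
        }
        return index;
    }
}
```

The second JVM could then be started with something like -Dcuda.device=1, and the validated result of parseDeviceIndex(System.getProperty("cuda.device"), devcount[0]) passed to cudaSetDevice.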

Ok, I think I have this fixed. I was using cudaSetDevice and didn’t need to.

ps. regarding my problem with posting here, I must have made two accounts accidentally, one from my home laptop and one from work. I know which is the working one so I’ll change my bookmarks to the good one.

Mark