This might be a pretty standard newbie question, but I couldn’t find any similar post.
I’m having a problem allocating a 2D array on the device using JCuda.
I allocate it with:
CUdeviceptr ptr = new CUdeviceptr();
long[] pitch = new long[1];
float[] testArr2 = new float[2*512];
JCudaDriver.cuMemAllocPitch(ptr, pitch, 512*Sizeof.FLOAT, 2L, 4); //512 elements per row, 2 rows
Once the kernel is done, I try to copy the data back to the host with:
CUDA_MEMCPY2D copyParams = new CUDA_MEMCPY2D();
copyParams.srcMemoryType = CUmemorytype.CU_MEMORYTYPE_DEVICE;
copyParams.srcPitch = pitch[0];
copyParams.srcDevice = ptr;
copyParams.srcXInBytes = 0;
copyParams.srcY = 0;
copyParams.dstMemoryType = CUmemorytype.CU_MEMORYTYPE_HOST;
copyParams.dstHost = Pointer.to(testArr2);
copyParams.dstXInBytes = 0;
copyParams.dstY = 0;
copyParams.Height = 2; //no of users = rows
copyParams.WidthInBytes = 512 * Sizeof.LONG;
System.out.println(JCudaDriver.cuMemcpy2D(copyParams) == CUresult.CUDA_ERROR_INVALID_VALUE );
The copy call consistently returns CUresult.CUDA_ERROR_INVALID_VALUE.
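To sanity-check the sizes involved, the byte arithmetic of the copy can be sketched in plain Java, without touching the GPU. This is only a rough check under stated assumptions: the class and method names are hypothetical (not part of JCuda), host rows are assumed to be tightly packed, and Float.BYTES and Long.BYTES stand in for Sizeof.FLOAT (4) and Sizeof.LONG (8). It compares the bytes a 2D copy would write against the bytes available in the host array.

```java
public class CopySizeCheck {
    // Total bytes a 2D copy writes: widthInBytes for each of `height` rows.
    public static long bytesRequested(long widthInBytes, long height) {
        return widthInBytes * height;
    }

    // True when a tightly packed host array can receive the whole copy.
    public static boolean fitsInHostArray(long widthInBytes, long height, long hostBytes) {
        return bytesRequested(widthInBytes, height) <= hostBytes;
    }

    public static void main(String[] args) {
        long rows = 2, cols = 512;
        long hostBytes = rows * cols * Float.BYTES; // float[2*512] holds 4096 bytes
        long widthAsLong = cols * Long.BYTES;       // 512 * 8 bytes per row
        long widthAsFloat = cols * Float.BYTES;     // 512 * 4 bytes per row

        // Width computed with the long size: 2 rows * 4096 bytes exceeds the host array.
        System.out.println(fitsInHostArray(widthAsLong, rows, hostBytes));
        // Width computed with the float size: 2 rows * 2048 bytes fits exactly.
        System.out.println(fitsInHostArray(widthAsFloat, rows, hostBytes));
    }
}
```

With the numbers from my snippets, a row width based on Sizeof.LONG asks for twice as many bytes as the float array (and the float-sized rows allocated on the device) can hold, which may be related to the error. This does not check anything about dstPitch, which my copy parameters never set.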
What am I missing?
Thanks,
Tina