Question about KernelLauncherSample


I get an exception after running the KernelLauncherSample:

Preparing the KernelLauncher...
Exception in thread "Main Thread" jcuda.CudaException: CUDA_ERROR_INVALID_SOURCE
	at jcuda.driver.JCudaDriver.checkResult(
	at jcuda.driver.JCudaDriver.cuModuleLoad(
	at KernelLauncher.init(
	at KernelLauncher.create(
	at KernelLauncher.create(
	at KernelLauncher.compile(
	at KernelLauncherSample.main(

I checked that the exception occurs in the KernelLauncher.init function. It may be a problem with

cuModuleLoad(module, cubinFileName);

The JCuda version is 0.3.0a, and the CUDA driver setup is:

Device 0: "GeForce GTX 470"
  CUDA Driver Version:                           3.10
  CUDA Runtime Version:                          3.10
  CUDA Capability Major revision number:         2
  CUDA Capability Minor revision number:         0


Hello Lemon,

Unfortunately, the CUDA documentation is kind of ridiculous on this point: CUDA_ERROR_INVALID_SOURCE is not listed as a return value of any of the functions, there is in general no description of the conditions under which a certain error code is returned, and the only documentation of this error code is


Yes, I already thought that it could be something like this :twisted:

However, did you use the original, unmodified KernelLauncherSample, with the original, unmodified source code? If you have modified the source code, you might have introduced an error…

EDIT: Hey, wait a moment: The CUDA driver is version 3.1, but you mentioned JCuda 0.3.0a. If you have the CUDA Version 3.1, you should use JCuda 0.3.1.


Hi Marco,

I am facing different problems.

  1. Using JCuda 0.3.0a and the unmodified sample source code, the same exception occurs.

  2. When I upgraded to JCuda 0.3.1, I imported jxxx-0.3.1.jar instead of jxxx-0.3.0a.jar and also replaced the .so files from JCuda-All-0.3.1-bin-linux-x86_64 (I hope I haven't missed any other steps).
    However, the same exception occurs in the KernelLauncherSample. Also, I failed to run the JCudaRuntimeSample:

Creating input data
Initializing device data using JCuda
Performing FFT using JCufft
Performing caxpy using JCublas
Performing scan using JCudpp
Error while loading native library with base name "JCudpp"
Operating system name: Linux
Architecture         : amd64
Architecture bit size: 64
Exception in thread "Main Thread" java.lang.UnsatisfiedLinkError: Could not load native library
	at jcuda.LibUtils.loadLibrary(
	at jcuda.jcudpp.JCudpp.assertInit(
	at jcuda.jcudpp.JCudpp.cudppPlan(
	at JCudaRuntimeSample.main(

Is the library built from cudpp v1.1.1?

Sorry ~ always lots of questions from me :(


Hello Lemon,

This forum is intended for questions :slight_smile:

The other native libraries except for JCudpp seem to work. I have contacted the person who provided the binaries for Linux 64, maybe he has an idea what might be wrong there.
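In the meantime, it may help to double-check what the JVM itself sees. A small plain-Java check (no JCuda required; the class name here is just an example) prints the properties that the native library loading mechanism relies on, matching the diagnostic lines in your output:

```java
public class LibPathCheck {
    public static void main(String[] args) {
        // The values used to pick the platform-specific library name:
        System.out.println("os.name  = " + System.getProperty("os.name"));
        System.out.println("os.arch  = " + System.getProperty("os.arch"));
        // The directories the JVM searches for System.loadLibrary(...):
        System.out.println("java.library.path = "
                + System.getProperty("java.library.path"));
    }
}
```

If the JCudpp .so file is not in one of the listed directories (or reachable via LD_LIBRARY_PATH), an UnsatisfiedLinkError like the one above is expected.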

Best regards,

OK. I will wait for the updated JCudpp and then test the KernelLauncherSample again.



I received a response - since I’m not a Linux expert, and am not sure if this can be a solution for you, I’ll post the relevant part as I received it:

I have tested it here and all seems to be working. One clue might be
that if we look at

$ ldd => (0x00007fff001ff000) => /usr/local/cuda/lib64/ (0x00007fed13d6d000) => /usr/local/cuda/lib64/ (0x00007fed12290000)

it is looking for in /usr/local/cuda/lib64/

so in /usr/local/cuda/lib64 one needs to make sure that the cudpp libs
are symbolically linked. Here is what I have in my
/usr/local/cuda/lib64 -> -> ->


So a quick fix might be to just create the symb. links.

How does your setup differ from the one described above, concerning the files in “/usr/local/cuda/lib64/”? Maybe creating these links can help to get it running?
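For reference, a sketch of what such a link chain could look like. The library file names below are placeholders (the actual names were not preserved in this thread), and the commands are demonstrated in a scratch directory so they can be tried safely; in practice you would run the ln commands in /usr/local/cuda/lib64 with sudo:

```shell
# Placeholder names -- substitute the actual cudpp .so file you have.
demo=$(mktemp -d)
cd "$demo"
touch libcudpp_x86_64.so                    # the file you downloaded/built
ln -s libcudpp_x86_64.so libcudpp.so.1.1.1  # versioned name
ln -s libcudpp.so.1.1.1 libcudpp.so         # name the dynamic loader expects
ls -l libcudpp.so*                          # verify the link chain
```

After creating the links in the real library directory, `ldd` on the JCudpp library should resolve the cudpp entry instead of reporting “not found”.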


Hi Marco,

My Linux setup is quite different from yours. I have no cudpp library in cuda/lib64.

$ echo $LD_LIBRARY_PATH   // I include 2 folders: the first one is for the CUDA libraries, the other one is for the JCuda libs

Under the CUDA library folder:
/usr/local/cuda/lib64 -> -> -> -> -> ->

Under the JCuda library folder:

 ldd => /usr/local/cuda/lib64/ (0x00002b4344d05000) => not found                 --> in my 0.3.0a version, it doesn't show this message => /usr/lib64/ (0x00002b4344f67000) => /lib64/ (0x00002b4345268000) => /lib64/ (0x00002b43454eb000) => /lib64/ (0x00002b43456f9000) => /lib64/ (0x00002b4345a51000) => /lib64/ (0x00002b4345c55000) => /lib64/ (0x00002b4345e70000)
        /lib64/ (0x00000039b5400000)

What steps should I take to get the same setup as yours?



As I mentioned: This is not my setting, but the setting of the contributor who provided the binaries for Linux 64.

But it seems it is mainly missing the “”. If you have the NVIDIA CUDA SDK installed, it should contain the appropriate CUDPP library. (Under Windows it's in
NVIDIA GPU Computing SDK\C\bin\win32\Release\cudpp32_31_9.dll
maybe you can find it in a similar place in your Linux installation.)

Maybe it's enough to copy the .so file from the SDK into the cuda/lib64 directory?


Dear Lemon,

Yes, Marco is right: you need to put the cudpp shared library in your /usr/local/cuda/lib64 (for 64 bit) folder and create symbolic links for the different names, in particular:

$ sudo ln -s

cudpp comes as a static library with the Linux SDK code samples from NVIDIA, and one could link against it statically when building, but the problem is that this static library is around 90 MB!

So I compiled cudpp into a shared lib from source and put it in /usr/local/cuda/lib64/ together with the symbolic links.

If you can't be bothered to compile cudpp as a shared lib yourself, you can download the one I have for 64-bit Linux, version 1.1.1, from here:

I hope this helps,

Hi kashif,

I downloaded the file from your link and put it in /usr/local/cuda/lib64/.

Do I need to do the following step? I tried, but failed:

 sudo ln -s
ln: creating symbolic link `' to `': File exists

The ldd output seems OK:

 ldd => /usr/local/cuda/lib64/ (0x00002b0bfd4e5000) => /usr/local/cuda/lib64/ (0x00002b0bfd720000) => /usr/lib64/ (0x00002b0bff223000) => /lib64/ (0x00002b0bff524000) => /lib64/ (0x00002b0bff7a7000) => /lib64/ (0x00002b0bff9b5000) => /lib64/ (0x00002b0bffd0d000) => /lib64/ (0x00002b0bfff11000) => /lib64/ (0x00002b0c0012c000) => /usr/lib64/ (0x00002b0c00336000)
        /lib64/ (0x00000039b5400000) => not found => /usr/lib64/ (0x00002b0c00d1d000)

But the “ => not found” part does not seem OK; am I missing some file again?

 ldd => /usr/local/cuda/lib64/ (0x00002b7063fcb000) => /usr/lib64/ (0x00002b706422d000) => not found => /usr/lib64/ (0x00002b7064c14000) => /lib64/ (0x00002b7064f14000) => /lib64/ (0x00002b7065197000) => /lib64/ (0x00002b70653a6000) => /lib64/ (0x00002b70656fd000) => /lib64/ (0x00002b7065901000) => /lib64/ (0x00002b7065b1d000) => /usr/lib64/ (0x00002b7065d26000)
        /lib64/ (0x00000039b5400000)

I need your help again, thanks.



Maybe kashif can give a more precise and profound hint here, but… you might try the same procedure (“sudo ln…”) with the libcutil library. A version of libcutil should definitely be in the SDK.


Oops, you are right, I forgot that it also needs libcutil, which I compiled as a shared lib. I have now linked libcutil as a static lib into my shared library, so please download it again from here:

and kindly try again.

Best wishes,

P.S. So sorry about the ln -s confusion… that was just to show you how to create links with the name of the lib that it was expecting… but if you have just the one file, then you don't need to do that.


Thanks Marco and kashif, I successfully upgraded to version 0.3.1; no error occurs when I run the JCudpp functions.

However, I still get the exception from the KernelLauncherSample. Marco, do you have any suggestions for tracing the problem?



Thanks to kashif for the support here, I’d be lost otherwise… :o

I did a quick web search on “cuModuleLoad CUDA_ERROR_INVALID_SOURCE” and found this thread - you might want to try adding the “-sm_20” argument for the NVCC:

System.out.println("Preparing the KernelLauncher...");
KernelLauncher kernelLauncher =
    KernelLauncher.compile(sourceCode, "add", "-sm_20");

I’m still sticking with my “old” GeForce 8800, so I’m not always aware of any CC 2.0 issues…



Thanks for your hint.

It should be

KernelLauncher kernelLauncher =
            KernelLauncher.compile(sourceCode, "add","-arch sm_20");

It is alright now.



Hi Marco,

In the sample code, the block size setting is hard-coded.

kernelLauncher.setBlockSize(size, 1, 1);

Can I set the block size dynamically to maximize the utilization of my graphics card?
I don't know how to calculate it.



Note that the kernel must be written appropriately, so that you can really use the respective block size.

You may query the maximum block size. There is not yet a device query example in JCuda, but you may have a look at the NVIDIA CUDA device query example, which shows how it basically works. If you have difficulties converting it to JCuda, you might want to have a look at the corresponding sample, which is for JOCL instead of JCuda, but with JCuda it should be quite similar.
If it does not work, maybe I'll find the time to port the device query example to JCuda.
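As a rough sketch (plain Java; the helper names are mine, not part of the JCuda API): assuming the device limit has already been queried, for example via cuDeviceGetAttribute with CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK, a launch configuration for a simple 1D kernel could be derived like this:

```java
public class LaunchConfig {
    // maxThreadsPerBlock would come from a device query; here it is a
    // plain parameter so the arithmetic can be shown without CUDA.
    static int blockSize(int n, int maxThreadsPerBlock) {
        return Math.min(n, maxThreadsPerBlock);
    }

    // Ceiling division: enough blocks to cover all n elements.
    static int gridSize(int n, int blockSize) {
        return (n + blockSize - 1) / blockSize;
    }

    public static void main(String[] args) {
        int n = 100000;
        int max = 1024; // typical limit for compute capability 2.0
        int block = blockSize(n, max);
        int grid = gridSize(n, block);
        System.out.println("blockSize=" + block + ", gridSize=" + grid);
    }
}
```

Since the last block may overshoot, the kernel itself must still guard with something like `if (index < n) { ... }`, which ties in with Marco's note that the kernel has to be written appropriately for the chosen block size.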


Thank you so much Marco and Lemon!

The simple change of adding “-arch sm_20” was the fix I needed.
I'm not sure, but it may well be the case for all Fermi architecture cards.