Question about how to compile the combined codes

system · 14 May 2011, 09:24

Hi Marco13,
I know that CUDA code is compiled by NVCC, and that in the C-CUDA version NVCC first divides the code into two parts: one for the host compiler and one for the GPU compiler (NVCC). I wonder how you split the code, or how you compile the Java-CUDA code with the JDK and NVCC. Can you show some details? Thanks very much!

Marco13 · 14 May 2011, 12:33

Hello

I’m not sure what you are referring to specifically, but I assume you mean the following: when you write a CUDA-C program, you can put ordinary C code into a .CU file. This .CU file is then compiled with NVCC. Roughly speaking, NVCC hands the C code to the underlying C compiler (Visual Studio, GCC…) and compiles the CUDA code (namely, the kernels) on its own. (It is not as simple as this suggests, but it gives the idea.) The resulting “mix” of C binary code and CUDA binary code is then linked by the underlying linker.

Of course, with Java, there is no underlying C code to compile, and no linking phase.

So when you want to use your own kernels with JCuda, you only compile the CUDA kernel code with NVCC to create a CUBIN file. This CUBIN file can then be loaded with the JCuda driver API, and the kernel can be executed.
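
To make the "compile only the kernel" step concrete, here is a minimal sketch of how the NVCC call could be started from Java. The class name `NvccCubinBuild` and the file names `kernel.cu` and `kernel.cubin` are only made up for illustration; `-cubin` and `-o` are the standard NVCC options. It assumes that `nvcc` is on the PATH and it is not part of JCuda itself.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class NvccCubinBuild {

    // Builds the NVCC command line that compiles a .cu file into a CUBIN file.
    static List<String> cubinCommand(String cuFile, String cubinFile) {
        List<String> command = new ArrayList<>();
        command.add("nvcc");      // assumes nvcc is on the PATH
        command.add("-cubin");    // emit a CUBIN file instead of an executable
        command.add(cuFile);
        command.add("-o");
        command.add(cubinFile);
        return command;
    }

    // Starts NVCC as an external process and returns its exit code.
    static int compile(String cuFile, String cubinFile)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(cubinCommand(cuFile, cubinFile))
                .inheritIO()      // show NVCC messages on the Java console
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 2) {
            // Example: java NvccCubinBuild kernel.cu kernel.cubin
            System.exit(compile(args[0], args[1]));
        }
        // Without arguments, only print the command that would be run.
        System.out.println(String.join(" ", cubinCommand("kernel.cu", "kernel.cubin")));
    }
}
```

The resulting CUBIN file is what would then be loaded through the JCuda driver API. The Java code itself is compiled with the normal JDK compiler, completely independently of NVCC.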

Hopefully I’ll one day find the time to sum this up more precisely in a “Getting started”-Tutorial… :o

bye
Marco

system · 15 May 2011, 10:21

Dear Marco, thanks for the details. Today I did some experiments with your JCuda code, and releasing JCuda for us is really impressive work. Thanks again for what you have done. When using JCuda, I think (maybe I am wrong) that it is not convenient to first compile the kernel with NVCC and then load it through the JCuda driver API. So I would like to suggest some compiler work: could the compiler itself compile the .cu files with NVCC and then link them with the Java code (maybe the Java bytecode), like Visual Studio does, perhaps based on some open-source Java compiler? I know that what you do is a big help for me, and this is just a rough thought; if anything about it makes you uncomfortable, please just ignore it. :D Thanks.
Regards~~:p

Marco13 · 15 May 2011, 13:38

Hello

Of course, using NVCC in this way may be considered … “inconvenient”, at least. That’s why I created the “KernelLauncher” class in the Utilities package: it allows automatically creating the CUBIN file and launching the kernel. The CUDA code can be loaded from a .CU file or from an “inlined” String in the Java code.
But note that for more recent CUDA versions, there are some issues with this class: the target architecture has to be specified manually; future versions of the KernelLauncher will do this automatically. Additionally, it will be extended to make it easier to handle PTX files, which are much more flexible than CUBIN files. And last but not least: with CUDA 4.0, the mechanisms for invoking kernels have changed significantly. This change will be added to the KernelLauncher as well (so for the user of the KernelLauncher, not much will change, but internally there will be some refactoring).
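
As a rough illustration of specifying the target architecture manually and creating a PTX file instead of a CUBIN, the NVCC command could be assembled like this. This is only a sketch and not the KernelLauncher code: the class name `NvccPtxBuild` and the file names are made up, and `sm_20` is just a placeholder that has to be replaced with an architecture supported by the actual GPU and NVCC version.

```java
import java.util.ArrayList;
import java.util.List;

public class NvccPtxBuild {

    // Builds the NVCC command line that compiles a .cu file into a PTX file
    // for an explicitly given target architecture (for example "sm_20").
    static List<String> ptxCommand(String cuFile, String ptxFile, String arch) {
        List<String> command = new ArrayList<>();
        command.add("nvcc");            // assumes nvcc is on the PATH
        command.add("-ptx");            // emit PTX instead of a CUBIN
        command.add("-arch=" + arch);   // target architecture, given manually
        command.add(cuFile);
        command.add("-o");
        command.add(ptxFile);
        return command;
    }

    public static void main(String[] args) {
        // "sm_20" is a placeholder architecture, not a recommendation.
        List<String> command = ptxCommand("kernel.cu", "kernel.ptx", "sm_20");
        System.out.println(String.join(" ", command));
    }
}
```

The command list can be passed to a `ProcessBuilder` to actually run NVCC. A PTX file is translated for the concrete device only when the module is loaded, which is one reason why it is more flexible than a CUBIN built for one fixed architecture.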

bye