Basic JCuda setup on Windows

I am a novice JCuda user and want to write JCuda programs on Windows, but:
1. I am unable to find an IDE for Windows. (I only found Nsight Eclipse Edition for Linux/Mac and Nsight Visual Studio Edition for C++.)
2. After installing the CUDA toolkit, the Nsight HUD Launcher was added, but I do not know what to do next.

(I moved this question here - the “About JCuda” thread is mainly for information about updates etc.)

The question is very broad and unspecific.

You don’t need a specific IDE. You can just pick a default Java IDE (e.g. “Eclipse IDE for Java Developers” from the Eclipse Downloads page).

Regarding the basic setup, it depends on your goals and how familiar you are with other aspects of Java development. You might want to have a look at the basic JCuda tutorial at jcuda.org. (It also links to the CodeProject article “GPU Computing Using CUDA, Eclipse, and Java with JCuda”, although this mainly focuses on Linux.) The tutorial also shows a “Basic Test”.

NOTE: The handling of native libraries has been changed (and hopefully simplified) in the JCuda 0.8.0RC (release candidate). The tutorial will be updated accordingly for the 0.8.0 “final” version.

Which version of CUDA and JCuda are you currently using?

Thanks! I am using the CUDA 6.5 toolkit.

Now I am trying to run a simple JCuda program from the NVIDIA samples using Eclipse on Windows. Here is my program:

            import static jcuda.driver.JCudaDriver.*;
            import java.io.*;
            import jcuda.*;
            import jcuda.driver.*;

            /**
            * This is a sample class demonstrating how to use the JCuda driver
            * bindings to load and execute a CUDA vector addition kernel.
            * The sample reads a CUDA file, compiles it to a PTX file
            * using NVCC, loads the PTX file as a module and executes
            * the kernel function. <br />
            */
            public class TestSampleCuda
            {
                /**
                 * Entry point of this sample
                 *
                 * @param args Not used
                 * @throws IOException If an IO error occurs
                 */
                public static void main(String args[]) throws IOException
                {
                    // Enable exceptions and omit all subsequent error checks
                    JCudaDriver.setExceptionsEnabled(true);

                    // Create the PTX file by calling the NVCC
                   String ptxFileName = preparePtxFile("C:\\Users\\590943\\workspace\\Assignments\\TestSampleCudaKernel.cu");
            //        String ptxFileName = "C:\\Users\\590943\\workspace\\Assignments\\JCudaVectorAddKernel.ptx";
                    // Initialize the driver and create a context for the first device.
                    cuInit(0);
                    CUdevice device = new CUdevice();
                    cuDeviceGet(device, 0);
                    CUcontext context = new CUcontext();
                    cuCtxCreate(context, 0, device);

                    // Load the ptx file.
                    CUmodule module = new CUmodule();
                    cuModuleLoad(module, ptxFileName);

                    // Obtain a function pointer to the "add" function.
                    CUfunction function = new CUfunction();
                    cuModuleGetFunction(function, module, "add");

                    int numElements = 1000000;

                    // Allocate and fill the host input data
                    float hostInputA[] = new float[numElements];
                    float hostInputB[] = new float[numElements];
                    for(int i = 0; i < numElements; i++)
                    {
                        hostInputA[i] = (float)i;
                        hostInputB[i] = (float)i;
                    }

                    // Allocate the device input data, and copy the
                    // host input data to the device
                    CUdeviceptr deviceInputA = new CUdeviceptr();
                    cuMemAlloc(deviceInputA, numElements * Sizeof.FLOAT);
                    cuMemcpyHtoD(deviceInputA, Pointer.to(hostInputA),
                        numElements * Sizeof.FLOAT);
                    CUdeviceptr deviceInputB = new CUdeviceptr();
                    cuMemAlloc(deviceInputB, numElements * Sizeof.FLOAT);
                    cuMemcpyHtoD(deviceInputB, Pointer.to(hostInputB),
                        numElements * Sizeof.FLOAT);

                    // Allocate device output memory
                    CUdeviceptr deviceOutput = new CUdeviceptr();
                    cuMemAlloc(deviceOutput, numElements * Sizeof.FLOAT);

                    // Set up the kernel parameters: A pointer to an array
                    // of pointers which point to the actual values.
                    Pointer kernelParameters = Pointer.to(
                        Pointer.to(new int[]{numElements}),
                        Pointer.to(deviceInputA),
                        Pointer.to(deviceInputB),
                        Pointer.to(deviceOutput)
                    );

                    // Call the kernel function.
                    int blockSizeX = 256;
                    int gridSizeX = (int)Math.ceil((double)numElements / blockSizeX);
                    cuLaunchKernel(function,
                        gridSizeX,  1, 1,      // Grid dimension
                        blockSizeX, 1, 1,      // Block dimension
                        0, null,               // Shared memory size and stream
                        kernelParameters, null // Kernel- and extra parameters
                    );
                    cuCtxSynchronize();

                    // Allocate host output memory and copy the device output
                    // to the host.
                    float hostOutput[] = new float[numElements];
                    cuMemcpyDtoH(Pointer.to(hostOutput), deviceOutput,
                        numElements * Sizeof.FLOAT);

                    // Verify the result
                    boolean passed = true;
                    for(int i = 0; i < numElements; i++)
                    {
                        float expected = i+i;
                        if (Math.abs(hostOutput[i] - expected) > 1e-5)
                        {
                            System.out.println(
                                "At index "+i+ " found "+hostOutput[i]+
                                " but expected "+expected);
                            passed = false;
                            break;
                        }
                    }
                    System.out.println("Test "+(passed?"PASSED":"FAILED"));

                    // Clean up.
                    cuMemFree(deviceInputA);
                    cuMemFree(deviceInputB);
                    cuMemFree(deviceOutput);
                }

                /**
                 * The extension of the given file name is replaced with "ptx".
                 * If the file with the resulting name does not exist, it is
                 * compiled from the given file using NVCC. The name of the
                 * PTX file is returned.
                 *
                 * @param cuFileName The name of the .CU file
                 * @return The name of the PTX file
                 * @throws IOException If an I/O error occurs
                 */
                private static String preparePtxFile(String cuFileName) throws IOException
                {
                    int endIndex = cuFileName.lastIndexOf('.');
                    if (endIndex == -1)
                    {
                        endIndex = cuFileName.length()-1;
                    }
                    String ptxFileName = cuFileName.substring(0, endIndex+1)+"ptx";

                    //File ptxFile = new File(ptxFileName);

                    File ptxFile = new File(ptxFileName);
                    System.out.println(ptxFile.getCanonicalPath());
                    if (ptxFile.exists())
                    {
                        return ptxFileName;
                    }

                    File cuFile = new File(cuFileName);
                    System.out.println(cuFile.getCanonicalPath());
                    if (!cuFile.exists())
                    {
                        throw new IOException("Input file not found: "+cuFileName);
                    }
                    String modelString = "-m"+System.getProperty("sun.arch.data.model");
                    String command =
                        "nvcc " + modelString + " -ptx "+
                        cuFile.getPath()+" -o "+ptxFileName;

                    System.out.println("Executing\n"+command);
                    Process process = Runtime.getRuntime().exec(command);

                    String errorMessage =
                        new String(toByteArray(process.getErrorStream()));
                    String outputMessage =
                        new String(toByteArray(process.getInputStream()));
                    int exitValue = 0;
                    try
                    {
                        exitValue = process.waitFor();
                    }
                    catch (InterruptedException e)
                    {
                        Thread.currentThread().interrupt();
                        throw new IOException(
                            "Interrupted while waiting for nvcc output", e);
                    }

                    if (exitValue != 0)
                    {
                        System.out.println("nvcc process exitValue "+exitValue);
                        System.out.println("errorMessage:\n"+errorMessage);
                        System.out.println("outputMessage:\n"+outputMessage);
                        throw new IOException(
                            "Could not create .ptx file: "+errorMessage);
                    }

                    System.out.println("Finished creating PTX file");
                    return ptxFileName;
                }

                /**
                 * Fully reads the given InputStream and returns it as a byte array
                 *
                 * @param inputStream The input stream to read
                 * @return The byte array containing the data from the input stream
                 * @throws IOException If an I/O error occurs
                 */
                private static byte[] toByteArray(InputStream inputStream)
                    throws IOException
                {
                    ByteArrayOutputStream baos = new ByteArrayOutputStream();
                    byte buffer[] = new byte[8192];
                    while (true)
                    {
                        int read = inputStream.read(buffer);
                        if (read == -1)
                        {
                            break;
                        }
                        baos.write(buffer, 0, read);
                    }
                    return baos.toByteArray();
                }


            }

It gives this error:

C:\Users\590943\workspace\Assignments\TestSampleCudaKernel.ptx
C:\Users\590943\workspace\Assignments\TestSampleCudaKernel.cu
Exception in thread "main" java.io.IOException: Input file not found: C:\Users\590943\workspace\Assignments\TestSampleCudaKernel.cu
at TestSampleCuda.preparePtxFile(TestSampleCuda.java:169)
at TestSampleCuda.main(TestSampleCuda.java:46)

Can you please help me?

The message clearly says
Input file not found: C:\Users\590943\workspace\Assignments\TestSampleCudaKernel.cu (coming from line 151)

Are you sure that the file exists?
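
One way to check this is to ask the JVM directly what it sees at that path. The following is only a minimal sketch in plain Java (no JCuda involved); the class name InputFileCheck and the method describe are made up for this example. On Windows it may also be worth looking at whether the Explorer hides a second extension, so that the file is really called something like TestSampleCudaKernel.cu.txt, but that is only a guess.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of JCuda: reports what the JVM
// finds at a given file name.
class InputFileCheck
{
    /**
     * Returns a short description of what exists at the given path:
     * a regular file, a directory, a missing parent directory,
     * or a missing file in an existing directory.
     */
    static String describe(String fileName)
    {
        Path path = Paths.get(fileName).toAbsolutePath().normalize();
        if (Files.isRegularFile(path))
        {
            return "OK: " + path;
        }
        if (Files.isDirectory(path))
        {
            return "Is a directory: " + path;
        }
        Path parent = path.getParent();
        if (parent != null && !Files.isDirectory(parent))
        {
            return "Missing directory: " + parent;
        }
        return "Missing file: " + path;
    }

    public static void main(String[] args)
    {
        // The default file name is only a placeholder
        String fileName = args.length > 0 ? args[0] : "TestSampleCudaKernel.cu";

        // Relative names are resolved against this directory
        System.out.println("Working directory: " + System.getProperty("user.dir"));
        System.out.println(describe(fileName));
    }
}
```

If this prints “Missing directory”, the path itself is wrong. If it prints “Missing file”, the directory exists and only the file name does not match.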

Yes, the file is there.

Now it shows this error:
Exception in thread "main" jcuda.CudaException: CUDA_ERROR_INVALID_IMAGE
at jcuda.driver.JCudaDriver.checkResult(JCudaDriver.java:288)
at jcuda.driver.JCudaDriver.cuModuleLoad(JCudaDriver.java:1906)
at TestSampleCuda.main(TestSampleCuda.java:50)

At line 50 I added one more line, cuModuleLoad(module, "TestSampleCudaKernel.ptx");, below the line CUmodule module = new CUmodule(); (in the “Load the ptx file” part).

The “TestSampleCudaKernel.ptx” is likely wrong, because it has to be the complete file name, including the path.

It should work with
cuModuleLoad(module, ptxFileName);
because the ptxFileName there should be the complete file name.

(If you want to insert it manually - although I do not recommend this - the path should probably be something like
“C:\Users\590943\workspace\Assignments\TestSampleCudaKernel.ptx”
if you properly compiled the PTX file)

Yes, I did that, but there is still no change. I think the PTX file is not being generated, so the problem is in loading it into the module.

EDIT: Ah, I noticed https://forum.byte-welt.net/byte-welt-projekte-projects/jcuda/21939-cl-exe-found-path.html#post140284 - I’ll respond there.


Old answer:

So does the file
C:\Users\590943\workspace\Assignments\TestSampleCudaKernel.ptx
exist on your system?

(If it cannot be created, then the “preparePtxFile” method should actually show an error message)

Yes, TestSampleCudaKernel.ptx is being generated now, but it shows this error:
jcuda.CudaException: CUDA_ERROR_NO_BINARY_FOR_GPU

If you are the same person as the one who wrote the other posts:
https://forum.byte-welt.net/byte-welt-projekte-projects/jcuda/21752-basic-jcuda-setup-windows.html#post140361
https://forum.byte-welt.net/byte-welt-projekte-projects/jcuda/19607-matrix-row-sum-jcuda.html#post140379
https://forum.byte-welt.net/byte-welt-projekte-projects/jcuda/20617-jcuda-cudaexception-cuda_error_invalid_image.html#post140362
https://forum.byte-welt.net/byte-welt-projekte-projects/jcuda/21939-cl-exe-found-path.html#post140360
then please try to focus on one topic, otherwise I don’t know what I should write where, and I don’t know what your actual question is, and what exactly works or does not work.


I’ll answer here:
https://forum.byte-welt.net/byte-welt-projekte-projects/jcuda/21939-cl-exe-found-path.html#post140360
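
As a general note on CUDA_ERROR_NO_BINARY_FOR_GPU: this error usually means that the PTX file was compiled for an architecture that the GPU in the machine does not support. One thing that may be worth trying is passing an explicit -arch option to NVCC. The following is only a sketch of how the NVCC command from preparePtxFile could be assembled with such an option. The class and method names are made up, and the compute capability values (major and minor) are placeholders that have to match the actual device.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JCuda
class NvccCommandSketch
{
    /**
     * Builds the argument list for an NVCC call that compiles the
     * given .cu file into a PTX file for the given compute capability.
     * The values for major and minor are assumptions of the caller.
     */
    static List<String> buildCommand(
        String cuFile, String ptxFile, int major, int minor)
    {
        List<String> command = new ArrayList<String>();
        command.add("nvcc");
        // 32 or 64 bit, as in the original sample (default: 64)
        command.add("-m" + System.getProperty("sun.arch.data.model", "64"));
        // Target architecture, e.g. sm_20 for compute capability 2.0
        command.add("-arch=sm_" + major + minor);
        command.add("-ptx");
        command.add(cuFile);
        command.add("-o");
        command.add(ptxFile);
        return command;
    }

    public static void main(String[] args)
    {
        // Placeholder file names and compute capability
        List<String> command = buildCommand(
            "TestSampleCudaKernel.cu", "TestSampleCudaKernel.ptx", 2, 0);
        System.out.println(command);
    }
}
```

Such a list can be passed to a ProcessBuilder instead of the single command string. That has the side effect that file names containing spaces are passed to NVCC as single arguments.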