Question from a beginner about jcuda

Hi!
I have a problem with JCuda. I tried to read a file into a byte array, send it to the device, and then clone it to an output pointer.
My aim is to check whether the input and output files are the same. However, the input and output text don’t match: I get lots of # symbols in my output file. Then I tried sending a float array instead of a byte array, and the same problem persists: I get 0.0 as the first element of the output float array. What am I missing? What’s wrong with my code?
Best regards
Zafer

public class byteToCuda {
    public static void main(String[] args) throws IOException{
 
        File myFile = new File("dene.odt");
        long fileLength = myFile.length();

       byte[] mybyte = getBytesFromFile(myFile);

       float[] zafer = new float[1];
       int size = zafer.length;
       zafer[0] = 299;
      

//----------------------------------JCUDA BEGINS--------------------------------------
       // allocate device memory
        Pointer devicePointer = new Pointer();
        JCuda.cudaMalloc(devicePointer, fileLength);

        Pointer devicePointer2 = new Pointer();
        JCuda.cudaMalloc(devicePointer2, size);

        // copy host memory to device
        JCuda.cudaMemcpy(devicePointer, Pointer.to(mybyte), fileLength,
            cudaMemcpyKind.cudaMemcpyHostToDevice);

        JCuda.cudaMemcpy(devicePointer2, Pointer.to(zafer), fileLength,
            cudaMemcpyKind.cudaMemcpyHostToDevice);

        // allocate device memory for result
        Pointer deviceOutputPointer = new Pointer();
        JCuda.cudaMalloc(deviceOutputPointer, fileLength);

        Pointer deviceOutputPointer2 = new Pointer();
        JCuda.cudaMalloc(deviceOutputPointer2, size);

        //copy devicePointer into deviceOutputPointer
       deviceOutputPointer = devicePointer.to((byte[])mybyte.clone());

       deviceOutputPointer2 = devicePointer2.to(zafer.clone());

       // allocate mem for the result on host side
        byte[] fromCUDA =new byte[(int)fileLength*2];

        float[] fromCUDA2 =new float[size];

        // copy result from device to host
        JCuda.cudaMemcpy(Pointer.to(fromCUDA), deviceOutputPointer, fileLength*2,
           cudaMemcpyKind.cudaMemcpyDeviceToHost);
          
        JCuda.cudaMemcpy(Pointer.to(fromCUDA2), deviceOutputPointer2, size,
           cudaMemcpyKind.cudaMemcpyDeviceToHost);
//----------------------------------JCUDA ENDS--------------------------------------

       OutputStream out = new FileOutputStream("denedim.odt");
        out.write(fromCUDA);
        out.close();

        System.out.println(fromCUDA2[0]);

     //Set the pointers free!
    JCuda.cudaFree(devicePointer);
    JCuda.cudaFree(deviceOutputPointer);

    JCuda.cudaFree(devicePointer2);
    JCuda.cudaFree(deviceOutputPointer2);
    }


    
    public static byte[] getBytesFromFile(File file) throws IOException {...}

}

Hello

There are some… strange lines in the code… For example, with
deviceOutputPointer = devicePointer.to((byte[])mybyte.clone());
you are overwriting the “deviceOutputPointer” with a NEW pointer (one that is NOT a device pointer, but points to the cloned byte array). It’s probably a matter of luck that it does not crash when you try to access this pointer as a real device pointer. Adding
JCuda.setExceptionsEnabled(true);
at the beginning will probably show some “Invalid pointer” error messages. (I recommend keeping this flag enabled during development. Otherwise you would have to check every CUDA call for errors.)
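
To illustrate the point about overwriting the pointer, here is a minimal plain-Java sketch with no JCuda dependency. FakePointer and its methods are hypothetical stand-ins invented for this illustration, not JCuda classes. The only assumption is that to(...) is a static factory, as the Pointer.to(mybyte) calls in the posted code suggest. Calling a static method through an instance compiles, but the instance is ignored and the variable ends up referring to a brand-new object:

```java
// Plain-Java analogy; FakePointer is a hypothetical stand-in, NOT jcuda.Pointer.
final class PointerOverwriteSketch {

    static final class FakePointer {
        final String target; // what this pointer refers to: "device" or "host"

        private FakePointer(String target) {
            this.target = target;
        }

        // Stand-in for "new Pointer()" followed by a device allocation.
        static FakePointer deviceAllocation() {
            return new FakePointer("device");
        }

        // Static factory: wraps a host array, independent of any instance.
        static FakePointer to(byte[] hostArray) {
            return new FakePointer("host");
        }
    }

    // Mirrors "deviceOutputPointer = devicePointer.to(mybyte.clone())".
    static String targetAfterReassignment() {
        FakePointer input = FakePointer.deviceAllocation();
        FakePointer output = FakePointer.deviceAllocation();
        byte[] hostData = {1, 2, 3};
        // "input" is ignored here; this is the same as FakePointer.to(...).
        output = input.to(hostData.clone());
        return output.target;
    }

    public static void main(String[] args) {
        System.out.println(targetAfterReassignment()); // prints "host"
    }
}
```

After the reassignment, the device allocation that the output variable originally referred to is no longer reachable through it, so every later call that uses this variable receives a host pointer instead.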

I have not tested it yet, but first I wanted to ask what you are trying to achieve. Is this really an attempt to check two files for equality more quickly? If so, I suspect that CUDA will not bring any speedup: copying the data from the host to the device takes some time, and it makes an “early return” impossible when the first difference is found. If this is an attempt to find all differences between two files, it might be possible to accelerate it with CUDA, but I’m not sure about that.
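
The “early return” idea can be sketched on the host in plain Java. This is only an illustration of the idea, not a benchmark, and the class and method names are made up for the example:

```java
// Host-side comparison that stops at the first difference.
final class EarlyExitCompare {

    /**
     * Returns the index of the first position where the arrays differ,
     * or -1 if both arrays have the same length and content.
     */
    static int firstMismatch(byte[] a, byte[] b) {
        int common = Math.min(a.length, b.length);
        for (int i = 0; i < common; i++) {
            if (a[i] != b[i]) {
                return i; // early return: the rest is never read
            }
        }
        // Same common prefix: equal only if the lengths match too.
        return a.length == b.length ? -1 : common;
    }

    public static void main(String[] args) {
        byte[] original = {10, 20, 30, 40};
        byte[] changed = {10, 99, 30, 40};
        System.out.println(firstMismatch(original, original.clone())); // prints -1
        System.out.println(firstMismatch(original, changed));          // prints 1
    }
}
```

On Java 9 or later, java.util.Arrays.mismatch(byte[], byte[]) returns the same kind of result without a hand-written loop.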

bye
Marco

Hi!
I just want to send a byte array to CUDA memory; cloning the array is just a job for the GPU (I couldn’t find any easier job). Sorry for the wrong code, I am a newbie and I just want to learn how to use CUDA.
Zafer

Sorry, I still did not understand what you intend to do: Do you only want to compare the file contents?

If you just want to get started with CUDA/JCuda: Have you seen the short Tutorial that I added recently?

Sorry for my unclear words. I want to do file operations with JCuda, but as far as I’ve read in forums, that is not a feasible job for CUDA. So I wanted to send a file as a byte array to CUDA memory, copy its content to another pointer, and read this pointer’s data back as a byte array to convert it to a file again. As a first step, I just tried to get back as output what I sent as input.

I can try to create an example for that, but it should be rather similar to what you posted so far.

The following is UNTESTED, just off the top of my head:

byte[] hostInput = getBytesFromFile(file);
int size = hostInput.length;

// Allocate input device memory
Pointer deviceInput = new Pointer();
JCuda.cudaMalloc(deviceInput, size);

// Copy input host memory to device
JCuda.cudaMemcpy(deviceInput, Pointer.to(hostInput), size,
    cudaMemcpyKind.cudaMemcpyHostToDevice);

// Allocate output device memory
Pointer deviceOutput = new Pointer();
JCuda.cudaMalloc(deviceOutput, size);

// --->--- Do something interesting here ---
// Copy from device input to device output
JCuda.cudaMemcpy(deviceOutput, deviceInput, size,
    cudaMemcpyKind.cudaMemcpyDeviceToDevice);
// ---<--- Do something interesting here ---

// Allocate mem for the result on host side
byte[] hostOutput =new byte[size];

// Copy output from device to host
JCuda.cudaMemcpy(Pointer.to(hostOutput), deviceOutput, size,
    cudaMemcpyKind.cudaMemcpyDeviceToHost);

In the lines marked with
// --->--- Do something interesting here ---
you could, for example, do some operation with CUBLAS or whatever (you could also launch a kernel with the driver API, but if you want to do that, you could use the JCudaDriver API for everything, although the runtime and driver APIs are interoperable…)
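
One more detail that matters for the float array in the original post: the size arguments of cudaMalloc and cudaMemcpy are byte counts, not element counts. For a byte array both numbers are the same, but a float array with one element occupies 4 bytes. A small plain-Java sketch of the computation follows. The helper name is hypothetical; in JCuda code the constant Sizeof.FLOAT is commonly used for the element size instead of Float.BYTES:

```java
// Computes the byte count to pass as the size of a float array.
final class ByteCountSketch {

    // Number of bytes occupied by a float array with the given length.
    static long floatArrayBytes(int elementCount) {
        return (long) elementCount * Float.BYTES; // Float.BYTES is 4
    }

    public static void main(String[] args) {
        float[] zafer = {299f};
        System.out.println(floatArrayBytes(zafer.length)); // prints 4
    }
}
```

The posted code passes size (which is 1) to cudaMalloc and to the copy back to the host, and fileLength to the copy of the float array to the device, so none of these three calls uses the 4 bytes that the single float actually needs.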