JCudaDriverCubinSample -- TEST FAILED

I am running Java 1.6 on Ubuntu 10.04 (amd64) with an Nvidia 460 card. CUDA works just fine.

I copied the Cubin example and the cu file. I compiled the cu file with nvcc with the appropriate flags and ran the code.

Test FAILED

After inserting some print statements in the check portion of the code

        // Verify the result
        boolean passed = true;
        for(int i = 0; i < numThreads; i++)
        {
            float expected = 0;
            for(int j = 0; j < size; j++)
            {
                expected += hostInput[i][j];
            }
            System.out.println(expected + " " + hostOutput[i]);
            if (Math.abs(hostOutput[i] - expected) > 1e-5)
            {
                passed = false;
                // break;
            }
        }

I get this

8128.0 3.4136074E19
8128.0 -3.705157E37
8128.0 -9.872553E28
8128.0 -2.640668E37
8128.0 -3.3055266E22
8128.0 NaN
8128.0 -9.1280878E18
8128.0 -0.026603693

Has anyone seen this before? I am at a loss.

Hello,

Are you sure that you have replaced the existing CUBIN file with the one that you created? It’s important, because CUBIN files are architecture specific. Also make sure that your new CUBIN file does not get overwritten in the “prepareCubinFile” method.

How exactly did you compile the CUBIN file? It might be necessary to add the -m64 parameter (if you have not already done that), as in
nvcc -m64 -cubin input.cu -o output.cubin
and, depending on the compute capability of your GPU, the argument indicating the architecture, like
nvcc -m64 -arch sm_20 -cubin input.cu -o output.cubin
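
If the compile step is driven from Java, the command line above could be assembled along these lines. This is only a sketch: the class name NvccCommand and the method build are hypothetical and not part of JCuda or the sample; only the nvcc flags (-m64, -arch, -cubin, -o) are taken from the commands above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JCuda: assembles the nvcc arguments
// for compiling a .cu file into an architecture-specific CUBIN file.
class NvccCommand {
    // arch is e.g. "sm_20" or "sm_21"; pass null to omit the -arch flag.
    static List<String> build(String arch, String cuFile, String cubinFile) {
        List<String> cmd = new ArrayList<String>();
        cmd.add("nvcc");
        cmd.add("-m64");            // 64 bit build, matching a 64 bit JVM
        if (arch != null) {
            cmd.add("-arch");       // target architecture of the GPU
            cmd.add(arch);
        }
        cmd.add("-cubin");
        cmd.add(cuFile);
        cmd.add("-o");
        cmd.add(cubinFile);
        return cmd;
    }

    public static void main(String[] args) {
        // Print the command so it can be compared with the one typed by hand.
        StringBuilder line = new StringBuilder();
        for (String part : build("sm_20", "input.cu", "output.cubin")) {
            if (line.length() > 0) line.append(' ');
            line.append(part);
        }
        System.out.println(line);
    }
}
```

The resulting list could be handed to a ProcessBuilder, but whether that fits into the sample depends on how its “prepareCubinFile” method is set up.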

bye
Marco

I use the compile command

nvcc -m64 -arch sm_21 -cubin JCudaCubinSample_kernel.cu -o JCudaCubinSample_kernel1.cubin

I changed the source to reflect the “1” in the name.

But you did point out something I had forgotten: -arch sm_21 instead of -arch sm_11. Now I get:

8128.0 16000.0
8128.0 16000.0
8128.0 16000.0
8128.0 16000.0
8128.0 16000.0
8128.0 16000.0
8128.0 16000.0
8128.0 16000.0
Test FAILED

Forget it ... problem solved with the switch to sm_21! I had forgotten that I had modified the CUDA C code to add up 125 and not the thread id.

Thanks Marco
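
For anyone reading this later, the two numbers in the output fit that explanation. The sketch below assumes that size is 128 and that the host input holds the values 0 to size-1 (neither is stated in this thread; the class name SumCheck is made up for illustration). Under those assumptions the host check expects 0 + 1 + ... + 127 = 8128, while a kernel that adds 125 once per element gives 128 * 125 = 16000.

```java
// Hypothetical illustration of the two sums seen in the output above.
class SumCheck {
    // Host-side reference: sum of the assumed inputs 0, 1, ..., size-1.
    static float expectedSum(int size) {
        float expected = 0;
        for (int j = 0; j < size; j++) {
            expected += (float) j;
        }
        return expected;
    }

    // Result of adding the same constant once per element, which is what
    // the modified kernel is described as doing.
    static float constantSum(int size, float value) {
        float sum = 0;
        for (int j = 0; j < size; j++) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        int size = 128; // assumption, chosen because it reproduces 8128.0
        System.out.println(expectedSum(size) + " " + constantSum(size, 125f));
        // prints "8128.0 16000.0"
    }
}
```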

Thanks for the info

