Topic: JCuda in emulation mode

  1. #1
    User byte (thread starter)
    Registered since 31.01.2015
    Technical posts: 14
    Mentioned in 1 post(s)
    Hi

    I have recently set up JCuda on my Linux x64 machine with CUDA v5.0. I would like to run the programs in emulation mode, as I don't currently own an NVIDIA graphics card. Since CUDA 5.0 no longer has a built-in emulation mode, can I use GPUOcelot (gpuocelot - A dynamic compilation framework for PTX - Google Project Hosting) with JCuda to run them in emulation mode? Can I pass additional linker options, such as library options (-locelot), to JCudaDriver to link against Ocelot's libraries?

    Please advise.

    Thanks

  2. #2
    Global Moderator Viertel Gigabyte
    Registered since 05.08.2008
    Technical posts: 4,963
    Mentioned in 324 post(s)
    Hello

    That's certainly an interesting idea.

    But admittedly, I only knew that gpuocelot existed; I never had a closer look at how it actually works and never really gave it a try. I could try to allocate some time and set up the toolchain to see whether I get it working, but I don't know when I would have the chance to do this (and even less whether it would actually work).

    However, I also have to admit that I'm not sure how using it from JCuda would work on a technical level. Your question about the linker options sounds like you think that it is possible to simply re-compile the JCuda binaries, and use the Ocelot binaries as a drop-in-replacement of the NVIDIA CUDA libraries. Did I understand this correctly?

    bye
    Marco

  3. #3
    User byte (thread starter)
    Quote from Marco13:

    Your question about the linker options sounds like you think that it is possible to simply re-compile the JCuda binaries, and use the Ocelot binaries as a drop-in-replacement of the NVIDIA CUDA libraries. Did I understand this correctly?
    Hi Marco,

    Thanks for the reply. Yes, you got it right. I tried to run a sample program in C with the Ocelot binaries, completely bypassing the CUDA libraries, and it worked perfectly fine on the CPU. I searched for the JCuda source code for Linux but could only find source code for Windows in the downloads section. It would be helpful if you could provide the source code (version 0.5.0b) for Linux as well, or explain how to compile the existing source code on Linux. I will then try to compile it against the Ocelot libraries and test it with the JCuda samples.

    P.S.: I have to do my college minor project using CUDA and Java, so any help in this direction would be much appreciated.

    Thanks
    Edited by nadal (31.01.2015 at 17:11)

  4. #4
    Global Moderator
    The source code on the site is for all platforms. It includes CMake files. So one way to get started could be to...

    - download CMake from the CMake website
    - start cmake-gui, and point it to the directory of the JCuda sources
    - select an output directory for the build files
    - press "Configure" (this will ask for the compilation target; in your case, this would be some GCC makefile)
    - press "Generate"

    This will generate the makefiles, which by default refer to the NVIDIA CUDA libraries (these are found by the "FindCUDA.cmake" script). In the simplest case, it should then be possible to replace these references in the makefiles with references to the gpuocelot files.

    (I'm really curious whether this will work, this could be a nice feature...)

  5. #5
    User byte (thread starter)
    Thanks.

    I was able to compile it with gpuocelot, and the Ocelot emulator seems to be invoked, as I no longer get the "Error inserting device" error when running the program. However, when I tried one of the samples from the site (JCudaVectorAdd), it failed with the log below:

    Code:
    $ java -cp ".:jcuda-0.5.0.jar" -Djava.library.path="../JCuda/sourcecode/JCuda-All-0.5.0b-src/lib" Test
     - cuDeviceGet() 
     - cuCtxCreate_v2() 
     - cuModuleLoad() 
     - cuModuleGetFunction() 
     - cuMemAlloc_v2() 
     - cuMemcpyHtoD_v2() 
     - cuMemAlloc_v2() 
     - cuMemcpyHtoD_v2() 
     - cuMemAlloc_v2() 
    Exception in thread "main" jcuda.CudaException: CUDA_ERROR_NOT_INITIALIZED
    	at jcuda.driver.JCudaDriver.checkResult(JCudaDriver.java:282)
    	at jcuda.driver.JCudaDriver.cuLaunchKernel(JCudaDriver.java:14106)
    	at Test.main(Test.java:92)

  6. #6
    Global Moderator
    I think this list of called functions is some sort of debug/trace output from gpuocelot?

    The strange thing is: it seems to call all the functions as expected, except for "cuInit" (which would explain why it later says "NOT_INITIALIZED", although I would have expected this error to happen earlier then...)

    You may try to add the (unofficial) log level setting
    Code:
    JCudaDriver.setLogLevel(LogLevel.LOG_DEBUGTRACE);
    at the beginning of the "main" method. The JCuda library should then additionally print the calls that it makes to the native library.

    Apart from that, I'm not sure whether any precautions are necessary in order to use gpuocelot (maybe some special value to be passed as the "flags" parameter for "cuInit"?). But since you say that it worked in the native version, this can hardly be the case...

  7. #7
    User byte (thread starter)
    Thanks. With debug logging enabled:
    Code:
    Executing cuInit
     - cuInit() 
    Executing cuDeviceGet for device 0
     - cuDeviceGet() 
    Executing cuCtxCreate
     - cuCtxCreate_v2() 
    Executing cuModuleLoad
     - cuModuleLoad() 
    Executing cuModuleGetFunction
     - cuModuleGetFunction() 
    Executing cuMemAlloc of 400000 bytes
     - cuMemAlloc_v2() 
    Executing cuMemcpyHtoD of 400000 bytes
    Initializing pointer data for Java NativePointerObject 0x7fafa69e1878
    Initializing ArrayBufferPointerData
     - cuMemcpyHtoD_v2() 
    Releasing ArrayBufferPointerData
    Executing cuMemAlloc of 400000 bytes
     - cuMemAlloc_v2() 
    Executing cuMemcpyHtoD of 400000 bytes
    Initializing pointer data for Java NativePointerObject 0x7fafa69e1878
    Initializing ArrayBufferPointerData
     - cuMemcpyHtoD_v2() 
    Releasing ArrayBufferPointerData
    Executing cuMemAlloc of 400000 bytes
     - cuMemAlloc_v2() 
    Executing cuLaunchKernel
    Initializing pointer data for Java NativePointerObject 0x7fafa69e1800
    Initializing PointersArrayPointerData
    Initializing pointer data for Java NativePointerObject 0x7fafa0076de0
    Initializing ArrayBufferPointerData
    Initializing pointer data for Java NativePointerObject 0x7fafa0076e00
    Initializing PointersArrayPointerData
    Initializing pointer data for Java NativePointerObject 0x7fafa0076e18
    Initializing NativePointerData
    Initializing pointer data for Java NativePointerObject 0x7fafa0076e20
    Initializing PointersArrayPointerData
    Initializing pointer data for Java NativePointerObject 0x7fafa0076e38
    Initializing NativePointerData
    Initializing pointer data for Java NativePointerObject 0x7fafa0076e40
    Initializing PointersArrayPointerData
    Initializing pointer data for Java NativePointerObject 0x7fafa0076e58
    Initializing NativePointerData
    Initializing pointer data for Java NativePointerObject (nil)
    Initializing NativePointerObjectPointerData
    Releasing PointersArrayPointerData
    Releasing ArrayBufferPointerData
    Releasing PointersArrayPointerData
    Releasing NativePointerData
    Releasing PointersArrayPointerData
    Releasing NativePointerData
    Releasing PointersArrayPointerData
    Releasing NativePointerData
    Releasing NativePointerObjectPointerData
    Exception in thread "main" jcuda.CudaException: CUDA_ERROR_NOT_INITIALIZED
    	at jcuda.driver.JCudaDriver.checkResult(JCudaDriver.java:282)
    	at jcuda.driver.JCudaDriver.cuLaunchKernel(JCudaDriver.java:14106)
    	at Test.main(Test.java:93)
    I will try to dig deeper and post any findings here.

    *** Edit ***

    Running the JCudaDeviceQuery example gave the following output:

    Code:
    JCuda/sourcecode/JCuda-All-0.5.0b-src/lib" JCudaDeviceQuery
     - cuInit() 
     - cuDeviceGetCount() 
    Found 1 devices
     - cuDeviceGet() 
     - cuDeviceGetName() 
     - cuDeviceComputeCapability() 
    Device 0: Ocelot PTX Emulator with Compute Capability 2.1
     - cuDeviceGetAttribute() 
        Maximum number of threads per block                  : 512
     - cuDeviceGetAttribute() 
        Maximum x-dimension of a block                       : 1024
     - cuDeviceGetAttribute() 
        Maximum y-dimension of a block                       : 1024
     - cuDeviceGetAttribute() 
        Maximum z-dimension of a block                       : 1024
     - cuDeviceGetAttribute() 
        Maximum x-dimension of a grid                        : 1048576
     - cuDeviceGetAttribute() 
        Maximum y-dimension of a grid                        : 1048576
     - cuDeviceGetAttribute() 
        Maximum z-dimension of a grid                        : 1048576
     - cuDeviceGetAttribute() 
        Maximum shared memory per thread block in bytes      : 65536
     - cuDeviceGetAttribute() 
        Total constant memory on the device in bytes         : 65536
     - cuDeviceGetAttribute() 
        Warp size in threads                                 : 32
     - cuDeviceGetAttribute() 
        Maximum pitch in bytes allowed for memory copies     : 256
     - cuDeviceGetAttribute() 
        Maximum number of 32-bit registers per thread block  : 1024
     - cuDeviceGetAttribute() 
        Clock frequency in kilohertz                         : 1000000
     - cuDeviceGetAttribute() 
        Alignment requirement                                : 256
     - cuDeviceGetAttribute() 
        Number of multiprocessors on the device              : 1
     - cuDeviceGetAttribute() 
        Whether there is a run time limit on kernels         : 1000
     - cuDeviceGetAttribute() 
        Device is integrated with host memory                : 0
     - cuDeviceGetAttribute() 
        Device can map host memory into CUDA address space   : 1
     - cuDeviceGetAttribute() 
        Compute mode                                         : 1
     - cuDeviceGetAttribute() 
        Maximum 1D texture width                             : 4092
     - cuDeviceGetAttribute() 
        Maximum 2D texture width                             : 4092
     - cuDeviceGetAttribute() 
        Maximum 2D texture height                            : 4092
     - cuDeviceGetAttribute() 
        Maximum 3D texture width                             : 4092
     - cuDeviceGetAttribute() 
        Maximum 3D texture height                            : 4092
     - cuDeviceGetAttribute() 
        Maximum 3D texture depth                             : 4092
     - cuDeviceGetAttribute() 
        Maximum 2D layered texture width                     : 4092
     - cuDeviceGetAttribute() 
        Maximum 2D layered texture height                    : 4092
     - cuDeviceGetAttribute() 
        Maximum layers in a 2D layered texture               : 4092
     - cuDeviceGetAttribute() 
        Alignment requirement for surfaces                   : 256
     - cuDeviceGetAttribute() 
        Device can execute multiple kernels concurrently     : 1
     - cuDeviceGetAttribute() 
        Device has ECC support enabled                       : 0
     - cuDeviceGetAttribute() 
    java: ocelot/ocelot/cuda/implementation/CudaDriverFrontend.cpp:485: virtual CUresult cuda::CudaDriverFrontend::cuDeviceGetAttribute(int*, CUdevice_attribute, CUdevice): Assertion `0 && "cuDeviceGetAttribute() - unsupported attribute requested: "' failed.
    Aborted
    Edited by nadal (01.02.2015 at 14:35)

  8. #8
    Global Moderator
    NOW it also prints "cuInit". But I think I might have found something that could be a problem: although the Google Code page ( https://code.google.com/p/gpuocelot/ ) does not seem to say anything about this, the GPU Ocelot FAQ says:
    Which versions of PTX are supported?
    GPU Ocelot robustly supports PTX 2.3 and CUDA 4.0. We are in the process of improving support for CUDA 4.1, though many CUDA programs compiled with CUDA 4.1 work correctly already.
    It might be a coincidence, but the error appears in the call to "cuLaunchKernel", which is the new kernel launching syntax. It might at least be worth a try to use the "old" syntax (which is, for example, still used in http://jcuda.org/samples/JCudaDriverTextureSample.java ). The error in the device query might likewise be caused by an attribute that was not yet present in CUDA versions before 5.0. But so far, this is just a guess...

  9. #9
    User byte (thread starter)
    Thanks. I will try this out.

  10. #10
    Global Moderator
    Sorry @nadal, I'm curious whether you got this working.

    When you tried Ocelot in native mode (with CUDA instead of JCuda), which program did you use to test it? I just wondered whether it also contained CUDA 5.x features (which should not work with Ocelot, if I interpreted the documentation correctly).

  11. #11
    User byte (thread starter)
    Hi

    Sorry, I was a bit busy with other work, so I couldn't post my progress here. I tried loading the module using the old signature, but unfortunately that didn't work either. The example I used with native CUDA 5 and Ocelot 2.3 is given here: CUDA Tutorial 01 - Ocelot

    I am getting the following error when using the old signature for loading a module:
    Code:
    java -cp ".:jcuda-0.5.0.jar" -Djava.library.path="/../JCuda/sourcecode/JCuda-All-0.5.0b-src/lib" JCudaReduction
     - cuInit() 
     - cuDeviceGet() 
     - cuModuleLoad() 
    #
    # A fatal error has been detected by the Java Runtime Environment:
    #
    #  SIGSEGV (0xb) at pc=0x00007fe14f0a9bdb, pid=3481, tid=140605992240896
    #
    # JRE version: OpenJDK Runtime Environment (7.0_55-b14) (build 1.7.0_55-b14)
    # Java VM: OpenJDK 64-Bit Server VM (24.51-b03 mixed mode linux-amd64 compressed oops)
    # Problematic frame:
    # C  [libocelot.so+0x3eebdb]  cuda::CudaDriverFrontend::_getContext()+0x2b
    #
    # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
    #
    # An error report file with more information is saved as:
    # /../Workspace/JCudaTest/src/hs_err_pid3481.log
    #
    # If you would like to submit a bug report, please include
    # instructions on how to reproduce the bug and visit:
    #   http://icedtea.classpath.org/bugzilla
    # The crash happened outside the Java Virtual Machine in native code.
    # See problematic frame for where to report the bug.
    #
    Aborted
    Let me know your thoughts/suggestions on this.

    Thanks
    Edited by nadal (09.02.2015 at 06:54)

  12. #12
    User byte (thread starter)
    I tried one of the other samples (JCudaBandwidthTest.java), which does not involve calling a kernel. It worked fine with Ocelot. So the problem seems to be with loading kernels from JCuda.

    Code:
    Running..................
    Bandwidths for PINNED
          1024 bytes : 262.144 MB/s
          2048 bytes : 1290.555 MB/s
          4096 bytes : 2542.002 MB/s
          8192 bytes : 4142.522 MB/s
         16384 bytes : 6213.784 MB/s
         32768 bytes : 9007.901 MB/s
         65536 bytes : 7872.007 MB/s
        131072 bytes : 9068.766 MB/s
        262144 bytes : 6774.396 MB/s
        524288 bytes : 6011.992 MB/s
       1048576 bytes : 6189.605 MB/s
       2097152 bytes : 6387.993 MB/s
       4194304 bytes : 6648.299 MB/s
       8388608 bytes : 6338.149 MB/s
      16777216 bytes : 6247.680 MB/s
      33554432 bytes : 6318.103 MB/s
      67108864 bytes : 6360.853 MB/s
     134217728 bytes : 6278.425 MB/s
    
    
    Running..................
    Bandwidths for PAGEABLE_ARRAY
          1024 bytes : 188.933 MB/s
          2048 bytes : 316.551 MB/s
          4096 bytes : 599.186 MB/s
          8192 bytes : 1118.481 MB/s
         16384 bytes : 2027.458 MB/s
         32768 bytes : 3043.486 MB/s
         65536 bytes : 5326.100 MB/s
        131072 bytes : 7101.467 MB/s
        262144 bytes : 8215.316 MB/s
        524288 bytes : 8010.010 MB/s
       1048576 bytes : 4603.888 MB/s
       2097152 bytes : 5959.438 MB/s
       4194304 bytes : 6417.104 MB/s
       8388608 bytes : 6188.937 MB/s
      16777216 bytes : 6105.142 MB/s
      33554432 bytes : 6003.300 MB/s
      67108864 bytes : 6043.225 MB/s
     134217728 bytes : 5810.144 MB/s
    
    
    Running..................
    Bandwidths for PAGEABLE_DIRECT_BUFFER
          1024 bytes : 352.463 MB/s
          2048 bytes : 742.355 MB/s
          4096 bytes : 1729.610 MB/s
          8192 bytes : 3423.922 MB/s
         16384 bytes : 5546.187 MB/s
         32768 bytes : 8441.366 MB/s
         65536 bytes : 9161.620 MB/s
        131072 bytes : 9418.788 MB/s
        262144 bytes : 9353.151 MB/s
        524288 bytes : 7617.892 MB/s
       1048576 bytes : 7465.613 MB/s
       2097152 bytes : 6634.691 MB/s
       4194304 bytes : 5231.544 MB/s
       8388608 bytes : 6083.846 MB/s
      16777216 bytes : 6316.651 MB/s
      33554432 bytes : 6053.726 MB/s
      67108864 bytes : 6092.908 MB/s
     134217728 bytes : 5870.631 MB/s
    
    
    Done

  13. #13
    Global Moderator
    Quote from nadal:
    The example I used with native CUDA 5 and Ocelot 2.3 is given here: CUDA Tutorial 01 - Ocelot
    This one uses the runtime API, with the kernel<<<...>>> launching syntax. Did you also try one that uses the Driver API? It seems that this gpuocelot test was intended for this: http://gpuocelot.googlecode.com/svn/.../vectorAdd.cpp

    The file that is mentioned in the error report ("/../Workspace/JCudaTest/src/hs_err_pid3481.log") could contain additional information (note: it may also contain some details about the settings on your PC that you might want to omit if you post it). But it is hard to pull really helpful information out of it: most likely, it also just says that there was an error in
    C [libocelot.so+0x3eebdb] cuda::CudaDriverFrontend::_getContext()+0x2b
    but possibly, at least, with an additional stack trace.
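
    As a rough illustration of pulling that one line out of such a report, here is a minimal sketch. It assumes the report has the layout shown in the crash output above (a "# Problematic frame:" line followed by the frame itself); the class and method names are made up for this example.

```java
import java.util.Arrays;
import java.util.List;

final class CrashFrameSketch {

    /**
     * Returns the line that follows "# Problematic frame:" with its leading
     * "#" removed, or null if the report contains no such marker.
     */
    static String problematicFrame(List<String> lines) {
        for (int i = 0; i + 1 < lines.size(); i++) {
            if (lines.get(i).trim().equals("# Problematic frame:")) {
                return lines.get(i + 1).replaceFirst("^\\s*#\\s*", "").trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Lines copied from the crash output posted earlier in this thread.
        List<String> report = Arrays.asList(
            "# Java VM: OpenJDK 64-Bit Server VM (24.51-b03 mixed mode linux-amd64 compressed oops)",
            "# Problematic frame:",
            "# C  [libocelot.so+0x3eebdb]  cuda::CudaDriverFrontend::_getContext()+0x2b",
            "#");
        System.out.println(problematicFrame(report));
    }
}
```

    For a real report, the lines could be read with Files.readAllLines(Paths.get(...)) and passed to the same method.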

  14. #14
    User byte (thread starter)
    Thanks. I think I can use the runtime API with Ocelot for my project, as there is no need to use the driver API or external kernels; hopefully the runtime API will suffice.

    The error report contains the following exceptions which, as you said, don't seem to give any significant information.
    Code:
    Internal exceptions (10 events):
    Event: 0.159 Thread 0x00007f73dc00a000 Threw 0x00000007d6ed3e90 at /build/buildd/openjdk-7-7u55-2.4.7/build/openjdk/hotspot/src/share/vm/prims/jvm.cpp:1244
    Event: 0.160 Thread 0x00007f73dc00a000 Threw 0x00000007d6ed8878 at /build/buildd/openjdk-7-7u55-2.4.7/build/openjdk/hotspot/src/share/vm/prims/jvm.cpp:1244
    Event: 0.160 Thread 0x00007f73dc00a000 Threw 0x00000007d6ede8e0 at /build/buildd/openjdk-7-7u55-2.4.7/build/openjdk/hotspot/src/share/vm/prims/jvm.cpp:1244
    Event: 0.160 Thread 0x00007f73dc00a000 Threw 0x00000007d6ee1b98 at /build/buildd/openjdk-7-7u55-2.4.7/build/openjdk/hotspot/src/share/vm/prims/jvm.cpp:1244
    Event: 0.161 Thread 0x00007f73dc00a000 Threw 0x00000007d6ee4ea0 at /build/buildd/openjdk-7-7u55-2.4.7/build/openjdk/hotspot/src/share/vm/prims/jvm.cpp:1244
    Event: 0.161 Thread 0x00007f73dc00a000 Threw 0x00000007d6ee9758 at /build/buildd/openjdk-7-7u55-2.4.7/build/openjdk/hotspot/src/share/vm/prims/jvm.cpp:1244
    Event: 0.162 Thread 0x00007f73dc00a000 Threw 0x00000007d6eedab8 at /build/buildd/openjdk-7-7u55-2.4.7/build/openjdk/hotspot/src/share/vm/prims/jvm.cpp:1244
    Event: 0.162 Thread 0x00007f73dc00a000 Threw 0x00000007d6ef1de8 at /build/buildd/openjdk-7-7u55-2.4.7/build/openjdk/hotspot/src/share/vm/prims/jvm.cpp:1244
    Event: 0.164 Thread 0x00007f73dc00a000 Threw 0x00000007d6ef5050 at /build/buildd/openjdk-7-7u55-2.4.7/build/openjdk/hotspot/src/share/vm/prims/jvm.cpp:1244
    Event: 0.164 Thread 0x00007f73dc00a000 Threw 0x00000007d6ef86d0 at /build/buildd/openjdk-7-7u55-2.4.7/build/openjdk/hotspot/src/share/vm/prims/jvm.cpp:1244
    Thanks for your help throughout. I can share the steps to use gpuocelot with JCuda for runtime API only in the form of an article which might help the JCuda community.

    Regards

  15. #15
    Global Moderator
    Quote from nadal:
    I can share the steps to use gpuocelot with JCuda for runtime API only in the form of an article which might help the JCuda community.
    That would be great

    However, I still wonder what might cause it to fail when launching a kernel. How did you compile the PTX file? Maybe there's another versioning issue, and you might have to add -arch sm_23 when compiling the PTX file, to make sure that it does not use newer features...
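
    To make the versioning idea concrete, here is a minimal sketch of how a Java program could build the nvcc call that compiles a kernel file to PTX for a fixed architecture. It only assumes the standard nvcc options -arch, -ptx and -o; the class name, the method names and the file name "kernel.cu" are made up for this example, and "sm_20" is just one possible value.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class PtxCompileSketch {

    /** Builds the nvcc command line that compiles a .cu file to PTX for one architecture. */
    static List<String> nvccPtxCommand(String cuFile, String arch) {
        // Derive the output name: "kernel.cu" becomes "kernel.ptx".
        String ptxFile = cuFile.endsWith(".cu")
            ? cuFile.substring(0, cuFile.length() - 3) + ".ptx"
            : cuFile + ".ptx";
        List<String> command = new ArrayList<>();
        command.add("nvcc");
        command.add("-arch");
        command.add(arch);
        command.add("-ptx");
        command.add(cuFile);
        command.add("-o");
        command.add(ptxFile);
        return command;
    }

    /** Runs nvcc and returns its exit code; this needs nvcc on the PATH. */
    static int compile(String cuFile, String arch) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(nvccPtxCommand(cuFile, arch))
            .inheritIO()
            .start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        // Only prints the command; call compile(...) to actually invoke nvcc.
        System.out.println(String.join(" ", nvccPtxCommand("kernel.cu", "sm_20")));
    }
}
```

    Whether a particular -arch value actually helps with Ocelot is not something this sketch can answer; it only pins the target so that the generated PTX does not silently follow the toolkit's default.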

  16. #16
    User byte (thread starter)
    Yes, I had tried both -arch sm_20 and sm_21 when generating the PTX file. I am not sure whether CUDA 5 is backward compatible, i.e. whether a program whose kernel launching code is written against the v4 API can run in a CUDA v5 environment.

    Thanks
    Edited by nadal (10.02.2015 at 10:54)

  17. #17
    Global Moderator
    It is backward compatible, in the sense that some of the samples still use the "old" syntax, and they are still running smoothly with CUDA 6.5.

  18. #18
    User byte (thread starter)
    Hi

    Listed below are the steps to use the Ocelot CUDA emulator with JCuda.

    Prerequisites:

    1. CUDA SDK 5.0
    2. JCuda-All-0.5.0b source code
    3. GPUOcelot (Downloads - gpuocelot - A dynamic compilation framework for PTX - Google Project Hosting and CUDA Tutorial 01 - Ocelot)

    Integration of Ocelot with JCuda

    The Ocelot emulator can be integrated with JCuda simply by linking the Ocelot libraries when building JCuda.
    1. Link the Ocelot library by adding "-locelot" to the file "link.txt" in each of the JNI component folders (or only in those of the components you want to emulate). For example, to emulate the JCuda runtime API, the file "link.txt" located at /JCuda/sourcecode/JCuda-All-0.5.0b-src/JCudaRuntimeJNI/CMakeFiles/JCudaRuntime-linux-x86_64.dir needs to be changed as shown below.

    Code:
    /usr/bin/c++  -fPIC    -shared -Wl,-soname,libJCudaRuntime-linux-x86_64.so -o ../lib/libJCudaRuntime-linux-x86_64.so CMakeFiles/JCudaRuntime-linux-x86_64.dir/src/JCudaRuntime.cpp.o -locelot /usr/local/cuda/lib64/libcudart.so -lcuda ../lib/libCommonJNI.a -Wl,-rpath,/usr/local/cuda/lib64
    2. Make sure that the Ocelot library is linked ahead of the cuda and cudart libraries, as shown above, so that they are bypassed completely and the NVIDIA driver is not invoked.

    3. Compile the JCuda source code with the above settings to generate new binaries, and point to them when running programs. This invokes the Ocelot emulator instead of the NVIDIA driver.

    Please note that the driver API may not work correctly due to compatibility issues between CUDA 5.0 and Ocelot 2.1.
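
    If several "link.txt" files have to be changed, the edit from steps 1 and 2 could also be scripted. The following is only a sketch of that idea, not part of the JCuda build: the class and method names are made up, and it assumes that the CUDA libraries appear in the link command as -lcuda, -lcudart, or as a path ending in libcuda.so or libcudart.so.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LinkLineSketch {

    /** True for tokens that reference the NVIDIA driver or runtime library. */
    static boolean isCudaLibrary(String token) {
        return token.equals("-lcuda") || token.equals("-lcudart")
            || token.endsWith("libcuda.so") || token.endsWith("libcudart.so");
    }

    /**
     * Returns the link command with "-locelot" placed directly before the first
     * CUDA library token (whitespace between tokens is normalized). A command
     * without CUDA libraries, or one that already has -locelot ahead of them,
     * is returned unchanged.
     */
    static String addOcelot(String linkCommand) {
        List<String> tokens = new ArrayList<>(Arrays.asList(linkCommand.trim().split("\\s+")));
        int firstCuda = -1;
        for (int i = 0; i < tokens.size(); i++) {
            if (isCudaLibrary(tokens.get(i))) {
                firstCuda = i;
                break;
            }
        }
        if (firstCuda < 0) {
            return linkCommand;
        }
        int ocelot = tokens.indexOf("-locelot");
        if (ocelot >= 0 && ocelot < firstCuda) {
            return linkCommand;
        }
        // A -locelot after the CUDA libraries is moved; removing it does not shift firstCuda.
        tokens.remove("-locelot");
        tokens.add(firstCuda, "-locelot");
        return String.join(" ", tokens);
    }

    public static void main(String[] args) {
        // Shortened form of the link command shown above, before the change.
        String original = "/usr/bin/c++ -fPIC -shared -o ../lib/libJCudaRuntime-linux-x86_64.so "
            + "JCudaRuntime.cpp.o /usr/local/cuda/lib64/libcudart.so -lcuda ../lib/libCommonJNI.a";
        System.out.println(addOcelot(original));
    }
}
```

    Reading each "link.txt", passing its single line through addOcelot and writing it back would then be the remaining file handling; that part is left out here.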

    Thanks

  19. #19
    Global Moderator
    Hi

    Good to hear that it worked (for the Runtime API, at least).

    Are the
    /usr/local/cuda/lib64/libcudart.so -lcuda
    flags still required for the compilation at all?

    If you don't mind, I'd put this information on the JCuda website (referring to this forum post, or with more detailed credits if you send me a PM with further information).

    bye
    Marco

  20. #20
    User byte (thread starter)
    Hi

    Yes, the CUDA flags are still required for compilation.
    Sure, the information can be put on the website. Let me know if you need any more information.

    Thanks
