Calculating standard math functions like sin, cos, sqrt and others with JCuda?

How can I calculate standard math functions like sin, cos, sqrt and others with JCuda?

Does anyone have experience with this? Are there any samples?

If it’s not possible with JCuda’s existing capabilities, how can I create it?

Are there plans to expand JCuda?

Thank you!

Hello

I’m not sure what you mean. You can simply use sin, cos and so on - see the CUDA Programming Guide, section C.1, “Standard Functions”.

(The question sounds a little bit like you are searching for a CUDA function to replace the java.lang.Math functions - but this does not make any sense).

Could you explain what exactly you want to achieve?

bye
Marco

Hello,

Yes, I want to replace the java.lang.Math functions.

I need to calculate simple functions many times.

Why does this not make any sense?

By the way, I found GPULib (http://www.txcorp.com/products/GPULib/). It has Java bindings for CUDA, but they are in an experimental stage, and I did not manage to use them. Its class VectorOp has many functions: sqrtF, expF, logF, sinF and others.

I don’t know much about programming in C; I am exploring the possibilities of GPU programming in Java.

Thanks!

Hello,

Well, it can make sense, of course, but only in a certain context. When you have a large program which calls Math.sin in several places, then you cannot simply replace all occurrences of Math.sin with another call like “Cuda.sin”, and expect your application to run faster.

The key idea of CUDA is that of data-parallel processing. That is, you apply the same computation several hundred, thousand or even million times - only operating on different parts of data. Each of these operations is then computed in its own thread.

As a trivial example where it may be faster to use CUDA: If you have, say, one million float values, and want to compute the sine of each value, then you can do something like this in Java:

int n = 1000000;
float input[] = inputArrayOfFloats(n);
float output[] = new float[n];

for (int i=0; i<n; i++)
{
    output[i] = (float)Math.sin(input[i]);
}

In CUDA (or JCuda) you would write a kernel, which executes this operation in parallel:

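extern "C"
__global__ void computeSine(float *input, float *output, int n)
{
    // Global index of this thread across all blocks
    int index = threadIdx.x + blockDim.x*blockIdx.x;
    // Skip surplus threads when n is not a multiple of the block size
    if (index < n)
    {
        output[index] = sin(input[index]);
    }
}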

This kernel could then be compiled into a CUBIN (CUDA binary) file, and loaded and executed using the CUDA Driver API.
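To make the index computation in the kernel a bit more tangible, here is a rough sketch in plain Java that only emulates on the CPU what the GPU threads would do. It is not JCuda code and does not use the GPU at all. The class name KernelIndexSketch, the method names and the block size of 256 are assumptions made up for this illustration. The two nested loops stand in for the blocks and threads that CUDA would run in parallel, and the `index < n` check matters whenever n is not a multiple of the block size, because the last block then contains threads with no element to work on.

```java
import java.util.Arrays;

// Hypothetical illustration class: a CPU emulation of the kernel's indexing,
// not part of JCuda or CUDA.
public class KernelIndexSketch {

    // Number of blocks needed so that blocks * blockSize >= n (ceiling division).
    static int gridSize(int n, int blockSize) {
        return (n + blockSize - 1) / blockSize;
    }

    // Emulates the computeSine kernel sequentially. On the GPU, every
    // (blockIdx, threadIdx) pair would be handled by its own thread.
    static float[] emulateComputeSine(float[] input, int blockSize) {
        int n = input.length;
        float[] output = new float[n];
        int blocks = gridSize(n, blockSize);
        for (int blockIdx = 0; blockIdx < blocks; blockIdx++) {
            for (int threadIdx = 0; threadIdx < blockSize; threadIdx++) {
                // Same formula as in the kernel: threadIdx.x + blockDim.x*blockIdx.x
                int index = threadIdx + blockSize * blockIdx;
                // Surplus "threads" of the last block have nothing to do
                if (index < n) {
                    output[index] = (float) Math.sin(input[index]);
                }
            }
        }
        return output;
    }

    public static void main(String[] args) {
        float[] input = {0.0f, 0.5f, 1.0f, 1.5f, 2.0f};
        // Block size 2 gives 3 blocks for 5 elements; one slot stays unused
        System.out.println(Arrays.toString(emulateComputeSine(input, 2)));
        // Blocks needed for one million elements with 256 threads per block
        System.out.println(gridSize(1000000, 256));
    }
}
```

The gridSize helper shows the kind of value you would have to choose for the number of blocks when launching the real kernel: enough blocks to cover all n elements, rounded up.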

Note that in this case, the operation (only one ‘sin’ computation) is still very simple, so the speedup may not yet be worth the effort. But if you have to perform more complex computations, preferably lots of arithmetic or trigonometric operations on a relatively small chunk of data, then CUDA will more likely bring noticeable performance benefits.

EDIT: BTW, thanks for pointing me at GPULib, I didn’t know this one before.

bye
Marco