Atomics give an error

Fredrik · 17 April 2016 at 10:04

Hi.

I am trying to create an example that calculates an average value. The kernel calculates a result and adds it to a total sum.
I'm using the latest OpenCL 2.0 with JOCL.

__kernel void averageKernel(__global const int *in, __global int *sum)
{
	int gid = get_global_id(0);
	if(get_local_id(0)==0){
		sum[0]=0;
	}
	float res = sin((float)gid) + cos((float)gid) +  sin((float)gid)  + in[gid];
	atomic_add(&sum[0],res); 
}

The kernel works with integers but not with doubles. With doubles I get the error:

**instance of overloaded function “atomic_add” matches the argument list argument types are: (__global double *, float)**

Thanks

//Fredrik

*** Edit ***

I was reading the API documentation: https://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/atomic_add.html

It seems that atomic_add() only supports integers; I don't know why.

Any suggestions for a workaround so that floats or doubles can be used?

Marco13 · 17 April 2016 at 10:46

I haven’t yet used atomics extensively. A quick web search reveals some options (Advanced application development: OpenCL 1.1: Atomic operations on floating point values), but for double values, this will likely be more tricky. The kernel that you posted so far seems to be a “dummy” kernel (at least, I’m not sure what the sin/cos computations should accomplish). If you intend to compute the average of a given array, I think a reduction with “+” and a subsequent division by the length of the array may be an option…
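
To make the “reduction, then divide by the length” idea concrete, here is a minimal CPU-side sketch in Java. It is not an OpenCL kernel; it only mimics the pairwise pattern that a device-side reduction would typically follow. The class and method names (PairwiseAverage, reduceSum, average) are made up for this illustration.

```java
// Illustrative sketch only: a pairwise "+" reduction on the CPU.
// All names here are hypothetical and not part of JOCL or OpenCL.
class PairwiseAverage {

    // Sums the array by repeatedly adding the upper half onto the lower half.
    // The order of additions is fixed, so the result is reproducible.
    static float reduceSum(float[] values) {
        if (values.length == 0) {
            return 0.0f;
        }
        float[] work = values.clone(); // leave the caller's array untouched
        int n = work.length;
        while (n > 1) {
            int half = (n + 1) / 2;
            for (int i = 0; i + half < n; i++) {
                work[i] += work[i + half];
            }
            n = half;
        }
        return work[0];
    }

    // Average = reduced sum divided by the number of elements.
    // Assumes a non-empty array.
    static float average(float[] values) {
        return reduceSum(values) / values.length;
    }

    public static void main(String[] args) {
        float[] data = {1f, 2f, 3f, 4f, 5f};
        System.out.println("sum     = " + reduceSum(data));
        System.out.println("average = " + average(data));
    }
}
```

Because every element is combined in a fixed pattern, no atomics are needed, and repeated runs add the values in the same order.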

Fredrik · 24 April 2016 at 09:46

I found an example that works with floats.

void atomicAdd(volatile __global float *addr, float val)
{
    union {
        unsigned int u64;
        float        f64;
    } next, expected, current;
    current.f64 = *addr;
    do {
        expected.f64 = current.f64;
        next.f64     = expected.f64 + val;
        current.u64  = atomic_cmpxchg((volatile __global unsigned int *)addr,
                                      expected.u64, next.u64);
    } while (current.u64 != expected.u64);
}

It should be used with care, though, since it has a big performance impact.

//Fredrik
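
For readers more at home on the Java side of JOCL, the same compare-and-swap pattern can be sketched on the host. This is only an analogy to the kernel helper above, assuming the float is stored as its raw bit pattern in an AtomicInteger; the class name AtomicFloatAdd and the method runDemo are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only: a float "atomic add" built from an integer
// compare-and-set, analogous to the atomic_cmpxchg loop in the kernel helper.
class AtomicFloatAdd {

    // 'bits' holds the float as its 32-bit pattern.
    static void atomicAdd(AtomicInteger bits, float val) {
        int expected;
        int next;
        do {
            expected = bits.get();
            next = Float.floatToRawIntBits(Float.intBitsToFloat(expected) + val);
            // Retry if another thread changed the value in the meantime.
        } while (!bits.compareAndSet(expected, next));
    }

    // Starts 'threads' threads that each add 1.0f 'addsPerThread' times.
    static float runDemo(int threads, int addsPerThread) {
        AtomicInteger bits = new AtomicInteger(Float.floatToRawIntBits(0.0f));
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < addsPerThread; i++) {
                    atomicAdd(bits, 1.0f);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return Float.intBitsToFloat(bits.get());
    }

    public static void main(String[] args) {
        System.out.println("total = " + runDemo(4, 1000));
    }
}
```

The retry loop hints at where the cost comes from: under contention, a thread may have to recompute and retry its addition several times before the compare-and-set succeeds.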

Marco13 · 24 April 2016 at 10:49

This looks like it was equivalent to the sample in the article that I linked to (although I wonder why it’s “64” in the variable names - a float value only has 32 bits…).

How large the performance impact actually is? Sorry, no idea, but … the built-in atomics (for int) won’t come for free either…

Fredrik · 24 April 2016 at 11:10

Yes, it is almost the same example; only the 64-bit part is different.

For me, the performance difference was several times.

I tried replacing the 64 with 32 and there was a difference of one decimal. Not much, but there should not be any?

Thanks

//Fredrik

Marco13 · 24 April 2016 at 16:29

Well, I would have to “align” (format) the code a bit better, but from now having looked twice over both, I’d say that they are equal - but regardless of that: The “64” parts are only variable names and technically don’t make a difference. Where did you see the “one decimal” difference that you mentioned?

Fredrik · 26 April 2016 at 12:33

Hi.

I will post the results from the program when I get home again at the weekend.

Thanks

//Fredrik

Fredrik · 30 April 2016 at 01:39

Hi.

I just arrived home. I ran an example, and with the 32-named vs. the 64-named version I get these results:

Average value calculated: (2.6229537/500) = 0.005245907 (32 named version)
Average value calculated: (2.6229506/500) = 0.005245901 (64 named version)

But, as you said, there should not be a difference. I need to do some more investigation.

Thanks

//Fredrik

Marco13 · 30 April 2016 at 05:01

When the naming of variables seems to make a difference, it’s more likely that there is some other sort of race condition involved. Did you try each version several times? When one version produces different results in each run, then there is another problem.
(Maybe I can try it out as well in the next days)

Fredrik · 1 May 2016 at 09:40

Hi.

You are correct, it seems to be some sort of race condition, since I'm getting a different result each time.

//Fredrik

Marco13 · 1 May 2016 at 11:08

Note, however, that a race condition may not be the only reason why you receive different results each time. The order in which the threads are run (i.e. “the order in which the gids are processed”) is not necessarily deterministic. And since you are adding floating point values, the order will affect the results. Floating point arithmetic is not associative:

public class FloatingPointArithmeticIsNotAssociative
{
    public static void main(String[] args)
    {
        double x=0.1+(0.2+0.3);
        double y=(0.1+0.2)+0.3;
        System.out.printf("%30.20f\n", x);
        System.out.printf("%30.20f\n", y);
        System.out.println("equal? "+(x==y));
    }
}

I think that even numerical tricks like https://en.wikipedia.org/wiki/Kahan_summation_algorithm won’t help here, as long as the order of additions is not predictable.
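
As a small illustration of that last point, the following sketch sums the same three float values in two different orders, once with a plain loop and once with a textbook Kahan (compensated) summation. The input values are chosen by hand to provoke rounding, and the class name SummationOrderDemo is made up; this is a single example, not a general statement about Kahan summation.

```java
// Illustrative sketch only: the result of a float sum can depend on the
// order of the additions, with or without Kahan compensation.
class SummationOrderDemo {

    // Plain left-to-right summation.
    static float naiveSum(float[] values) {
        float sum = 0.0f;
        for (float x : values) {
            sum += x;
        }
        return sum;
    }

    // Textbook Kahan summation with a running compensation term.
    static float kahanSum(float[] values) {
        float sum = 0.0f;
        float c = 0.0f; // compensation for lost low-order bits
        for (float x : values) {
            float y = x - c;
            float t = sum + y;
            c = (t - sum) - y;
            sum = t;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Same values, two different orders (hand-picked for this demo).
        float[] orderA = {1e8f, 1f, -1e8f};
        float[] orderB = {1e8f, -1e8f, 1f};
        System.out.println("naive A = " + naiveSum(orderA));
        System.out.println("naive B = " + naiveSum(orderB));
        System.out.println("kahan A = " + kahanSum(orderA));
        System.out.println("kahan B = " + kahanSum(orderB));
    }
}
```

For these particular inputs, both methods lose the 1 in the first order and keep it in the second, so the two orders disagree either way. A fixed order of additions, for example via a reduction, is what makes repeated runs agree.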