Sure, it's not critical whether it takes 400 or 600 ms. It would just be nice to make it shorter if possible. Let's say the calculation itself is done in 300 ms; then another 400-600 ms is needed just for the default init.
But as you said, it's only needed once; after that the lib is loaded and can be reused.
The 400-600 ms comes from my own benchmark; this is the normal startup time on my PC.
It's everything between:
cl_platform_id[] platforms = new cl_platform_id[1];
clGetPlatformIDs(platforms.length, platforms, null);
clBuildProgram(program, 0, null, null, null, null);
kernel = clCreateKernel(program, "somekernel", null);
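
For reference, here is a minimal sketch of how the one-time init cost could be measured separately from later calls. It does not touch OpenCL at all: `expensiveInit` is a hypothetical stand-in (it just sleeps) for the platform/program/kernel setup above, and `getContext` and `timeMillis` are names made up for this illustration.

```java
public class InitTiming {
    // Hypothetical stand-in for the expensive one-time setup
    // (platform query, program build, kernel creation). It only sleeps.
    static Object expensiveInit() {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new Object();
    }

    // Cached result of the setup, so the cost is paid only once.
    private static Object cached;

    static synchronized Object getContext() {
        if (cached == null) {
            cached = expensiveInit();
        }
        return cached;
    }

    // Runs the given task and returns the elapsed wall-clock time in milliseconds.
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long first = timeMillis(InitTiming::getContext);   // pays the init cost
        long second = timeMillis(InitTiming::getContext);  // reuses the cached object
        System.out.println("first call:  " + first + " ms");
        System.out.println("second call: " + second + " ms");
    }
}
```

Timing the first and second call like this should show whether the 400-600 ms really is paid only once, assuming the real setup is cached the same way.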