There are debuggers and profilers for OpenCL, like gDEBugger from http://www.gremedy.com/ (which now seems to be continued by AMD at http://developer.amd.com/tools/hc/gDEBugger/Pages/default.aspx), or the NVIDIA Visual Profiler. Using these with JOCL is possible to some extent (I intended to write a small "How To" about that ... when I have the time). However, these tools cannot analyze all possible reasons for a card being "out of resources". I think I already mentioned in another thread that CL_OUT_OF_RESOURCES seems to me (!) like a "standard message" that appears whenever anything goes wrong that is not explicitly covered by another error code -_-
So there are many possible reasons: Unspecified errors (like writing outside the bounds of a memory object), or attempts to allocate a memory object that is too large for the device.
Referring to your use of images: One reason could be an attempt to create too many image samplers (or to exceed any other limit that is implied by the values reported by clGetDeviceInfo - see http://jocl.org/samples/JOCLDeviceQuery.java, although this sample does not query all properties).
Referring to your problems with building some kernels: An attempt to use too much local memory (or maybe even a kernel that runs out of registers) could also cause this error - although your kernels did not seem to use local memory or many registers, so this may be unlikely here.
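To make the idea of checking against the reported limits a bit more concrete, here is a minimal sketch in plain Java. It contains no JOCL calls: `DeviceLimits` and `PreflightCheck` are hypothetical names that I made up for illustration, and the numbers in `main` are placeholders, not values from a real device. In a real program, the three limits would be filled from clGetDeviceInfo (CL_DEVICE_MAX_MEM_ALLOC_SIZE, CL_DEVICE_LOCAL_MEM_SIZE and CL_DEVICE_MAX_SAMPLERS). Passing such a check does not guarantee that CL_OUT_OF_RESOURCES will not show up, it only rules out a few of the known causes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical holder for a few limits as reported by clGetDeviceInfo. */
final class DeviceLimits {
    final long maxMemAllocSize; // CL_DEVICE_MAX_MEM_ALLOC_SIZE, in bytes
    final long localMemSize;    // CL_DEVICE_LOCAL_MEM_SIZE, in bytes
    final int maxSamplers;      // CL_DEVICE_MAX_SAMPLERS

    DeviceLimits(long maxMemAllocSize, long localMemSize, int maxSamplers) {
        this.maxMemAllocSize = maxMemAllocSize;
        this.localMemSize = localMemSize;
        this.maxSamplers = maxSamplers;
    }
}

/** Hypothetical pre-flight check: compares the planned usage against the limits. */
final class PreflightCheck {

    /** Returns one message per exceeded limit; an empty list means nothing was found. */
    static List<String> check(DeviceLimits limits, long largestBufferBytes,
                              long localMemBytesPerGroup, int samplerCount) {
        List<String> problems = new ArrayList<>();
        if (largestBufferBytes > limits.maxMemAllocSize) {
            problems.add("Buffer of " + largestBufferBytes
                + " bytes exceeds CL_DEVICE_MAX_MEM_ALLOC_SIZE (" + limits.maxMemAllocSize + ")");
        }
        if (localMemBytesPerGroup > limits.localMemSize) {
            problems.add("Local memory of " + localMemBytesPerGroup
                + " bytes exceeds CL_DEVICE_LOCAL_MEM_SIZE (" + limits.localMemSize + ")");
        }
        if (samplerCount > limits.maxSamplers) {
            problems.add(samplerCount
                + " samplers exceed CL_DEVICE_MAX_SAMPLERS (" + limits.maxSamplers + ")");
        }
        return problems;
    }

    public static void main(String[] args) {
        // Placeholder limits, NOT taken from any real device
        DeviceLimits limits = new DeviceLimits(256L * 1024 * 1024, 16L * 1024, 16);

        // A 512 MB buffer is larger than the assumed 256 MB allocation limit
        for (String problem : check(limits, 512L * 1024 * 1024, 8L * 1024, 4)) {
            System.out.println(problem);
        }
    }
}
```

The check returns a list instead of throwing at the first problem, so that all exceeded limits can be reported at once.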
Some of these reasons could possibly be detected beforehand and offline, using the NVIDIA Occupancy Calculator or the AMD Kernel Analyzer.
BTW: You seem to work towards some generic image manipulation/analysis library, right? Admittedly, I'm also not sure about some of the practical aspects of using OpenCL: On the one hand, it's intended to be device-independent; on the other hand, you still always have to query limits and, in the worst case, use different execution paths depending on the results. (Not to mention the question of how to properly handle the different OpenCL versions supported by different platforms that may be installed simultaneously, but that's another topic.) In that sense, in order to be "perfectly portable", one probably has to do many device queries and cover all cases, or try to choose or assume the lowest common denominator...
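Just to sketch what "lowest common denominator" could mean in code: the following plain Java snippet (again without any JOCL calls) takes one map of limits per device and keeps, for every limit that all devices report, the smallest value. The class name `CommonLimits` and the numbers in `main` are made up for illustration; the real values would have to be queried with clGetDeviceInfo for each device. It also assumes that "smaller" means "more restrictive", which fits maxima like CL_DEVICE_MAX_WORK_GROUP_SIZE or CL_DEVICE_IMAGE2D_MAX_WIDTH, but not every device property.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that reduces per-device limits to a common minimum. */
final class CommonLimits {

    /**
     * For each limit that is present for ALL devices, keeps the smallest value.
     * Limits that are missing for at least one device are dropped.
     */
    static Map<String, Long> lowestCommonDenominator(List<Map<String, Long>> perDevice) {
        Map<String, Long> common = new HashMap<>();
        boolean first = true;
        for (Map<String, Long> device : perDevice) {
            if (first) {
                // Start with everything the first device reports
                common.putAll(device);
                first = false;
                continue;
            }
            // Drop limits that this device does not report
            common.keySet().retainAll(device.keySet());
            // Keep the smaller of the two values for the remaining limits
            for (Map.Entry<String, Long> entry : common.entrySet()) {
                entry.setValue(Math.min(entry.getValue(), device.get(entry.getKey())));
            }
        }
        return common;
    }

    public static void main(String[] args) {
        // Placeholder values, NOT taken from real devices
        Map<String, Long> deviceA = Map.of(
            "CL_DEVICE_MAX_WORK_GROUP_SIZE", 1024L,
            "CL_DEVICE_IMAGE2D_MAX_WIDTH", 16384L);
        Map<String, Long> deviceB = Map.of(
            "CL_DEVICE_MAX_WORK_GROUP_SIZE", 256L,
            "CL_DEVICE_IMAGE2D_MAX_WIDTH", 8192L);

        System.out.println(lowestCommonDenominator(List.of(deviceA, deviceB)));
    }
}
```

Whether a single set of minimal limits is good enough, or whether separate execution paths per device are needed, probably depends on how much performance one is willing to give up on the more capable devices.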