How to use clCreateSubBuffer?


Hi Marco,

Can you please give me an example of how to use org.jocl.CL.clCreateSubBuffer?

Thank You!



I’ll post an example on Sunday or Monday. I think I created one when I tested the new OpenCL 1.1 functions.



OK, I did not test this function extensively :o Otherwise I would have noticed two minor issues. The first is that the creation flags should be a ‘long’ value (this is trivial to fix in the next release). The other is that creating the ‘buffer_create_info’ structure can be a hassle when trying to write it in the most generic way. The ‘buffer_create_info’ is a pointer to a structure that contains two size_t values:

typedef struct _cl_buffer_region {
    size_t origin;
    size_t size;
} cl_buffer_region;

At the moment, this is simply treated as a usual pointer. That means that one has to check on the Java side what the size of a size_t actually is. I have attached an example of how this may be done (I’ll have to verify this on a 64-bit machine), but I’ll try to find a simpler and more intuitive solution to handle this.

/*
 * JOCL - Java bindings for OpenCL
 *
 * Copyright 2012 Marco Hutter -
 */
package org.jocl.samples;

import static org.jocl.CL.*;

import java.nio.*;
import java.util.*;

import org.jocl.*;

/**
 * A sample demonstrating how to create sub-buffers
 * that have been introduced with OpenCL 1.1.
 */
public class JOCLSubBufferSample
{
    private static cl_context context;
    private static cl_command_queue commandQueue;

    /**
     * The entry point of this sample
     *
     * @param args Not used
     */
    public static void main(String args[])
    {
        simpleInitialization();

        // Create an array with 8 elements and consecutive values
        int fullSize = 8;
        float fullArray[] = new float[fullSize];
        for (int i=0; i<fullSize; i++)
        {
            fullArray[i] = i;
        }
        System.out.println("Full input array  : "+Arrays.toString(fullArray));

        // Create a buffer for the full array
        cl_mem fullMem = clCreateBuffer(context,
            CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR,
            Sizeof.cl_float * fullSize, Pointer.to(fullArray), null);

        // Create a sub-buffer. The (int) cast is needed because this
        // release still declares the creation flags as 'int'
        int subOffset = 2;
        int subSize = 4;
        cl_mem subMem = clCreateSubBuffer(fullMem,
            (int)CL_MEM_READ_WRITE, CL_BUFFER_CREATE_TYPE_REGION,
            createInfo(subOffset, subSize, Sizeof.cl_float), null);

        // Create an array for the sub-buffer, and copy the data
        // from the sub-buffer to the array
        float subArray[] = new float[subSize];
        clEnqueueReadBuffer(commandQueue, subMem, true,
            0, subSize * Sizeof.cl_float, Pointer.to(subArray),
            0, null, null);
        System.out.println("Read sub-array    : "+Arrays.toString(subArray));

        // Modify the data in the sub-array, and copy it back
        // into the sub-buffer
        subArray[0] = -5;
        subArray[1] = -4;
        subArray[2] = -3;
        subArray[3] = -2;
        clEnqueueWriteBuffer(commandQueue, subMem, true,
            0, subSize * Sizeof.cl_float, Pointer.to(subArray),
            0, null, null);
        System.out.println("Modified sub-array: "+Arrays.toString(subArray));

        // Read the full buffer back into the array
        clEnqueueReadBuffer(commandQueue, fullMem, true,
            0, fullSize * Sizeof.cl_float, Pointer.to(fullArray),
            0, null, null);
        System.out.println("Full result array : "+Arrays.toString(fullArray));
    }

    /**
     * Create a pointer to a 'buffer_create_info' struct for the
     * {@link CL#clCreateSubBuffer(cl_mem, int, int, Pointer, int[])}
     * call
     *
     * @param offset The sub-buffer offset, in number of elements
     * @param size The sub-buffer size, in number of elements
     * @param elementSize The size of the buffer elements (e.g. Sizeof.cl_float)
     * @return The pointer to the buffer creation info
     */
    private static Pointer createInfo(long offset, long size, int elementSize)
    {
        // The 'buffer_create_info' is a struct with two size_t
        // values on native side. This is emulated with a
        // byte buffer of the appropriate size
        ByteBuffer createInfo =
            ByteBuffer.allocate(2 * Sizeof.size_t).order(
                ByteOrder.nativeOrder());
        if (Sizeof.size_t == Sizeof.cl_int)
        {
            createInfo.putInt(0, (int)offset * elementSize);
            createInfo.putInt(Sizeof.size_t, (int)size * elementSize);
        }
        else
        {
            createInfo.putLong(0, offset * elementSize);
            createInfo.putLong(Sizeof.size_t, size * elementSize);
        }
        return Pointer.to(createInfo);
    }

    /**
     * Simple OpenCL initialization of the context and command queue
     */
    private static void simpleInitialization()
    {
        // The platform, device type and device number
        // that will be used
        final int platformIndex = 0;
        final long deviceType = CL_DEVICE_TYPE_ALL;
        final int deviceIndex = 0;

        // Enable exceptions and subsequently omit error checks in this sample
        CL.setExceptionsEnabled(true);

        // Obtain the number of platforms
        int numPlatformsArray[] = new int[1];
        clGetPlatformIDs(0, null, numPlatformsArray);
        int numPlatforms = numPlatformsArray[0];

        // Obtain a platform ID
        cl_platform_id platforms[] = new cl_platform_id[numPlatforms];
        clGetPlatformIDs(platforms.length, platforms, null);
        cl_platform_id platform = platforms[platformIndex];

        // Initialize the context properties
        cl_context_properties contextProperties = new cl_context_properties();
        contextProperties.addProperty(CL_CONTEXT_PLATFORM, platform);

        // Obtain the number of devices for the platform
        int numDevicesArray[] = new int[1];
        clGetDeviceIDs(platform, deviceType, 0, null, numDevicesArray);
        int numDevices = numDevicesArray[0];

        // Obtain a device ID
        cl_device_id devices[] = new cl_device_id[numDevices];
        clGetDeviceIDs(platform, deviceType, numDevices, devices, null);
        cl_device_id device = devices[deviceIndex];

        // Create a context for the selected device
        context = clCreateContext(
            contextProperties, 1, new cl_device_id[]{device},
            null, null, null);

        // Create a command-queue for the selected device
        commandQueue =
            clCreateCommandQueue(context, device, 0, null);
    }
}
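
For anyone who wants to check the expected result without an OpenCL device, here is a plain-Java analogy (not JOCL code, and only a sketch): a slice of a java.nio.FloatBuffer shares storage with its parent, roughly the way a sub-buffer shares memory with the buffer it was created from. The class and method names are made up for illustration.

```java
import java.nio.FloatBuffer;
import java.util.Arrays;

// Plain-Java analogy, no OpenCL involved: a slice of a FloatBuffer
// shares storage with its parent, similar to a sub-buffer.
class SubBufferViewDemo
{
    // Writes 'values' through a view that starts at element 'offset'
    // of 'parent' and spans values.length elements, then returns 'parent'.
    static float[] writeThroughView(float parent[], int offset, float values[])
    {
        FloatBuffer full = FloatBuffer.wrap(parent);
        full.position(offset);
        full.limit(offset + values.length);
        FloatBuffer sub = full.slice(); // view on the same memory
        sub.put(values);
        return parent;
    }

    public static void main(String args[])
    {
        float fullArray[] = { 0, 1, 2, 3, 4, 5, 6, 7 };
        float result[] = writeThroughView(
            fullArray, 2, new float[] { -5, -4, -3, -2 });
        System.out.println("Full result array : " + Arrays.toString(result));
    }
}
```

Run as is, it prints "Full result array : [0.0, 1.0, -5.0, -4.0, -3.0, -2.0, 6.0, 7.0]", which is what the sample above should also report for the full buffer. One thing the analogy ignores: on a real device, clCreateSubBuffer can fail with CL_MISALIGNED_SUB_BUFFER_OFFSET if the origin is not aligned to CL_DEVICE_MEM_BASE_ADDR_ALIGN.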


Hello Marco,

Your example helped me a lot! Thank You!



OK, the “createInfo” method was… -_- ehrm… could be simplified to

    private static Pointer createInfo(long offset, long size, int elementSize)
    {
        if (Sizeof.size_t == Sizeof.cl_int)
        {
            return Pointer.to(new int[] {
                (int)offset * elementSize,
                (int)size * elementSize });
        }
        else
        {
            return Pointer.to(new long[] {
                offset * elementSize,
                size * elementSize });
        }
    }

but this is still not so nice (and also not yet tested on 64-bit). Maybe I’ll create a “cl_buffer_region” class that would allow passing this info to the native side without having to check the size of size_t on the Java side.
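
Such a class could look roughly like the sketch below. This is purely hypothetical and not part of JOCL: the name BufferRegionSketch, its fields and the toNativeLayout method are invented for illustration, and the width of size_t is passed in by the caller (with JOCL it would come from Sizeof.size_t) so that the sketch runs without any native library.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of JOCL: keeps the region in bytes and
// writes it as two consecutive size_t values of a given width.
class BufferRegionSketch
{
    final long origin; // region start, in bytes
    final long size;   // region length, in bytes

    BufferRegionSketch(long offsetElements, long sizeElements, int elementSize)
    {
        this.origin = offsetElements * elementSize;
        this.size = sizeElements * elementSize;
    }

    // 'sizeOfSizeT' is an assumption supplied by the caller (4 or 8)
    ByteBuffer toNativeLayout(int sizeOfSizeT)
    {
        ByteBuffer b = ByteBuffer.allocate(2 * sizeOfSizeT)
            .order(ByteOrder.nativeOrder());
        if (sizeOfSizeT == 4)
        {
            // Fails loudly instead of silently truncating large values
            b.putInt(0, Math.toIntExact(origin));
            b.putInt(4, Math.toIntExact(size));
        }
        else if (sizeOfSizeT == 8)
        {
            b.putLong(0, origin);
            b.putLong(8, size);
        }
        else
        {
            throw new IllegalArgumentException(
                "Unsupported size_t width: " + sizeOfSizeT);
        }
        return b;
    }

    public static void main(String args[])
    {
        // 4 floats starting at element 2, as in the sample
        BufferRegionSketch region = new BufferRegionSketch(2, 4, 4);
        System.out.println("origin=" + region.origin + " size=" + region.size);
        System.out.println("bytes for 64-bit size_t: "
            + region.toNativeLayout(8).capacity());
    }
}
```

The caller would then only deal with element counts, and the check for the size_t width would live in one place instead of in every createInfo-style helper.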