Prevent memory leak?


#1

Hi

Is it possible that releasing memory objects created with clCreateBuffer() or clCreateImage2D() by calling clReleaseMemObject() isn’t enough?
I release every object as soon as I no longer need it, but the memory still appears to remain in use.
Are there other things in OpenCL that I have to release as well?

I tried stepping through my code line by line while monitoring the GPU memory usage when clReleaseMemObject() is called. But if I have, for example, 40 MB in use, it only drops to about 38 MB.
Running my code once is fine, but when I run 15 or more test cases with larger and larger images, the memory fills up completely and the program crashes with a memory allocation failure.

I hope someone knows how memory can be freed up.

Thanks in advance!


#2

Hello

You basically have to release everything that you create. When you don’t need them any more, you have to release memory objects, events, command queues… and finally, the context.
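As a minimal sketch of that "release everything you create" discipline: the helper below is hypothetical (the `ReleaseStack` class and its `register` method are not part of JOCL) and uses plain strings in place of real CL objects. It records one release action per created object and runs them in reverse order of creation, so the context goes last.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper, not part of JOCL: collects release actions
// and runs them in reverse order of registration.
class ReleaseStack implements AutoCloseable {
    private final Deque<Runnable> releases = new ArrayDeque<>();

    // Remember how to release a resource at the moment it is created.
    <T> T register(T resource, Runnable release) {
        releases.push(release);
        return resource;
    }

    // Run all release actions, most recently registered first.
    @Override
    public void close() {
        while (!releases.isEmpty()) {
            releases.pop().run();
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        try (ReleaseStack stack = new ReleaseStack()) {
            // Strings stand in for cl_context, cl_command_queue and cl_mem.
            stack.register("context", () -> log.add("release context"));
            stack.register("queue", () -> log.add("release queue"));
            stack.register("mem", () -> log.add("release mem"));
        }
        System.out.println(log);
        // prints [release mem, release queue, release context]
    }
}
```

In real code the release actions would be calls such as clReleaseMemObject(mem), clReleaseCommandQueue(commandQueue) and clReleaseContext(context).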

I just ran another test

import static org.jocl.CL.*;
import java.util.Arrays;
import org.jocl.*;

public class JOCLMemoryReleaseTest
{
	private static cl_context context;
	private static cl_command_queue commandQueue;    

    public static void main(String args[])
    {
    	defaultInitialization();
    	
    	float array[] = new float[10*1000*1000];
    	Arrays.fill(array, 1);
        int bufferSize = array.length * Sizeof.cl_float;
        for (int i=0; i<100000; i++)
        {
        	cl_mem mem = clCreateBuffer(context, CL_MEM_READ_WRITE,
        		bufferSize, null, null);
        	
        	// Make sure the mem is accessible by writing and reading
        	// one value 
        	clEnqueueWriteBuffer(commandQueue, mem, CL_TRUE, 0, 
       			1*Sizeof.cl_float, Pointer.to(array), 0, null, null);
        	array[0] = 0;
        	clEnqueueReadBuffer(commandQueue, mem, CL_TRUE, 0, 
       			1*Sizeof.cl_float, Pointer.to(array), 0, null, null);

        	System.out.println(
        		"Created "+mem+" with "+bufferSize+" bytes, " +
        		"valid? "+(array[0]==1)+", run "+i);
        	
        	clReleaseMemObject(mem);
        }
        shutdown();
    }

    private static void defaultInitialization()
	{
        // The platform, device type and device number
        // that will be used
        final int platformIndex = 0;
        final long deviceType = CL_DEVICE_TYPE_ALL;
        final int deviceIndex = 0;

        // Enable exceptions and subsequently omit error checks
        CL.setExceptionsEnabled(true);

        int numPlatformsArray[] = new int[1];
        clGetPlatformIDs(0, null, numPlatformsArray);
        int numPlatforms = numPlatformsArray[0];
        cl_platform_id platforms[] = new cl_platform_id[numPlatforms];
        clGetPlatformIDs(platforms.length, platforms, null);
        cl_platform_id platform = platforms[platformIndex];
        cl_context_properties contextProperties = new cl_context_properties();
        contextProperties.addProperty(CL_CONTEXT_PLATFORM, platform);
        int numDevicesArray[] = new int[1];
        clGetDeviceIDs(platform, deviceType, 0, null, numDevicesArray);
        int numDevices = numDevicesArray[0];
        cl_device_id devices[] = new cl_device_id[numDevices];
        clGetDeviceIDs(platform, deviceType, numDevices, devices, null);
        cl_device_id device = devices[deviceIndex];
        context = clCreateContext(
            contextProperties, 1, new cl_device_id[]{device}, 
            null, null, null);
        commandQueue = 
            clCreateCommandQueue(context, device, 0, null);
	}
    
    private static void shutdown()
    {
    	clReleaseCommandQueue(commandQueue);
    	clReleaseContext(context);
    }

}

(on Win7/64) and did not encounter any problem. How are you measuring the memory usage? Can you post an example where the problem occurs?

bye
Marco


#3

Hi

The program consists of a number of operations that are applied to images I provide.
For performance reasons it is generally advisable to reduce the number of times you read from and write to memory. Because of that I keep the image and memory objects where they are, so I can use the output of one operation as the input of another.

I figured that if I can reuse memory objects like that, it should also be possible to keep working with the same context and command queue. That is why I made a custom Java class called OpenCLPlatform.java where I instantiate all the different parts of OpenCL, such as the context, the command queue, context properties, platforms, devices, etc.

Because I use the same images over and over again and only want to copy them into memory once, I use a class named OpenCLImage, which holds some info about the image and a memory object so I can reuse the same memory in other operations.

The operations that can be applied to images all have a kernel and a program that I want to create only once, because I will be using the same kernel over and over again. I did this by giving the operation classes a static{} block where I create the program and the kernel.

I will provide the code of OpenCLPlatform and OpenCLImage, one of the many operations (they all look a lot alike), and part of the code where I use these operations on images and release their memory when I no longer need it.

OpenCLPlatform:

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

import org.jocl.CL;
import org.jocl.Pointer;
import org.jocl.Sizeof;
import org.jocl.cl_command_queue;
import org.jocl.cl_context;
import org.jocl.cl_context_properties;
import org.jocl.cl_device_id;
import org.jocl.cl_platform_id;

import static org.jocl.CL.*;


public class OpenCLPlatform {
	
	private static OpenCLPlatform INSTANCE = null;
	
	// global vars
	private cl_context_properties contextProperties = null;
	private cl_platform_id platforms[] = null;
	private cl_context context = null;
	private cl_device_id devices[] = null;
	private cl_command_queue commandQueue = null;
	
	private OpenCLPlatform(){
		long begin = System.currentTimeMillis();
		enableExceptions(true);
		createContextProperties();
		obtainPlatforms();
		createContext();
		obtainDevices();
		checkImageSupport();
		createCommandQueue();
		long end = System.currentTimeMillis();
		//System.out.println("Initialization OpenCLPlatform "+(end-begin));
	}
	
	public static OpenCLPlatform getInstance(){
		if(INSTANCE == null){
			INSTANCE = new OpenCLPlatform();
		}
		return INSTANCE;
	}
	
	public static String obtainKernel(String fileName) throws IOException{
		long begin = System.currentTimeMillis();
		
		BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(fileName)));
		StringBuffer sb = new StringBuffer();
		String line = null;
		
		while(true){
			line = br.readLine();
			if(line == null){
				break;
			}
			sb.append(line).append("\n");
		}
		String out = sb.toString();
		
		long end = System.currentTimeMillis();
		//System.out.println("Parsing kernel "+(end-begin));
		
		return out;
	}
	
	
	private void enableExceptions(boolean bool){
		long begin = System.currentTimeMillis();
		
		CL.setExceptionsEnabled(bool);
		
		long end = System.currentTimeMillis();
		//System.out.println("Enabling exceptions"+(end-begin));
	}
	
	private void createContextProperties(){
		long begin = System.currentTimeMillis();
		
		setContextProperties(new cl_context_properties());
		
		long end = System.currentTimeMillis();
		//System.out.println("Creating context properties "+(end-begin));
	}
	
	private void obtainPlatforms(){
		long begin = System.currentTimeMillis();
		
		setPlatforms(new cl_platform_id[1]);
		clGetPlatformIDs(getPlatforms().length,getPlatforms(),null);
		this.getContextProperties().addProperty(CL_CONTEXT_PLATFORM, getPlatforms()[0]);
		
		long end = System.currentTimeMillis();
		//System.out.println("Obtaining platforms "+(end-begin));
	}
	
	private void createContext(){
		long begin = System.currentTimeMillis();
		
		this.setContext(clCreateContextFromType(this.getContextProperties(), CL_DEVICE_TYPE_GPU, null, null, null));
		
		long end = System.currentTimeMillis();
		//System.out.println("Creating context "+(end-begin));
	}
	
	private void obtainDevices(){
		long begin = System.currentTimeMillis();
		
		long numBytes[] = new long[1];
		clGetContextInfo(this.getContext(), CL_CONTEXT_DEVICES, 0, null, numBytes);
		int numDevices = (int)numBytes[0]/Sizeof.cl_device_id;
		this.setDevices(new cl_device_id[numDevices]);
		clGetContextInfo(this.getContext(), CL_CONTEXT_DEVICES, numBytes[0], Pointer.to(this.getDevices()), null);
		
		long end = System.currentTimeMillis();
		//System.out.println("Obtaining devices "+(end-begin));
	}
	
	private void createCommandQueue(){
		long begin = System.currentTimeMillis();
		
		long properties = 0;
		properties |= CL_QUEUE_PROFILING_ENABLE;
		properties |= CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE;
		this.commandQueue = clCreateCommandQueue(this.getContext(), this.getDevices()[0], properties, null);
		
		long end = System.currentTimeMillis();
		//System.out.println("Creating commandqueue "+(end - begin));
	}
	
	public void destroy(){
		long begin = System.currentTimeMillis();
		
		clReleaseCommandQueue(commandQueue);
 		clReleaseContext(context);
 		
 		long end = System.currentTimeMillis();
 		//System.out.println("OpenCLPlatform destroy "+(end-begin));
		
	}
	
	public void checkImageSupport(){
		int imageSupport[] = new int[1];
        clGetDeviceInfo (devices[0], CL.CL_DEVICE_IMAGE_SUPPORT,
            Sizeof.cl_int, Pointer.to(imageSupport), null);
        //System.out.println("Images supported: "+(imageSupport[0]==1));
        if (imageSupport[0]==0)
        {
            //System.out.println("Images are not supported");
            System.exit(1);
            return;
        }
	}

	public cl_context getContext() {
		return context;
	}

	public void setContext(cl_context context) {
		this.context = context;
	}

	public cl_command_queue getCommandQueue() {
		return commandQueue;
	}

	public void setCommandQueue(cl_command_queue commandQueue) {
		this.commandQueue = commandQueue;
	}

	public cl_context_properties getContextProperties() {
		return contextProperties;
	}

	public void setContextProperties(cl_context_properties contextProperties) {
		this.contextProperties = contextProperties;
	}

	public cl_platform_id[] getPlatforms() {
		return platforms;
	}

	public void setPlatforms(cl_platform_id platforms[]) {
		this.platforms = platforms;
	}

	public cl_device_id[] getDevices() {
		return devices;
	}

	public void setDevices(cl_device_id devices[]) {
		this.devices = devices;
	}
}

OpenCLImage:


import java.awt.image.BufferedImage;
import java.awt.image.ColorConvertOp;
import java.awt.image.ColorModel;
import java.awt.image.DataBufferInt;
import java.awt.image.WritableRaster;
import java.util.Hashtable;

import javax.media.jai.PlanarImage;
import javax.swing.JFrame;
import javax.swing.JScrollPane;

import org.jocl.Pointer;
import org.jocl.Sizeof;
import org.jocl.cl_image_format;
import org.jocl.cl_mem;

import static org.jocl.CL.*;

public class OpenCLImage {
	
	private cl_mem memObject;
	
	private int width;
	private int height;
	
	public OpenCLImage(cl_mem m, int width, int height){
		OpenCLPlatform platform = OpenCLPlatform.getInstance();
		long begin = System.currentTimeMillis();
		
		setMemObject(m);
		setWidth(width);
		setHeight(height);
				
		long end = System.currentTimeMillis();
		//System.out.println("Create OpenCLImage w/ memory object "+(end-begin));
	}
	
	public OpenCLImage(PlanarImage img){
		long begin = System.currentTimeMillis();
		
		setWidth(img.getWidth());
		setHeight(img.getHeight());
		
		BufferedImage imgB = img.getAsBufferedImage();
		BufferedImage imgCol = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        ColorConvertOp op = new ColorConvertOp(img.getColorModel().getColorSpace(),imgCol.getColorModel().getColorSpace(),null);
        op.filter(imgB, imgCol);
		
		
		 DataBufferInt dataBufferSrc =
	                (DataBufferInt)imgCol.getRaster().getDataBuffer();
	        int dataSrc[] = dataBufferSrc.getData();
		
		
		OpenCLPlatform platform = OpenCLPlatform.getInstance();
		cl_image_format imgformat = new cl_image_format();
		imgformat.image_channel_order = CL_RGBA;
		imgformat.image_channel_data_type = CL_UNSIGNED_INT8;
		
		memObject = clCreateImage2D(platform.getContext(), CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR, new cl_image_format[]{imgformat}, width, height, width * Sizeof.cl_uint, Pointer.to(dataSrc), null);
		
		long end = System.currentTimeMillis();
		//System.out.println("Create OpenCLImage w/ planarimage"+(end-begin));
	}
	
	public PlanarImage getAsPlanarImage(){
		long begin = System.currentTimeMillis();
		
		PlanarImage imgOut = PlanarImage.wrapRenderedImage(getAsBufferedImage());
		
		long end = System.currentTimeMillis();
		//System.out.println("OpenCLImage getAsPlanarImage "+(end-begin));
		
		return imgOut;
	}
	
	public BufferedImage getAsBufferedImage(){
		OpenCLPlatform platform = OpenCLPlatform.getInstance();
		
		long begin = System.currentTimeMillis();
		BufferedImage outImg = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
		
		DataBufferInt dataBufferDst =
	            (DataBufferInt)outImg.getRaster().getDataBuffer();
	        int dataDst[] = dataBufferDst.getData();
	        clEnqueueReadImage(
	            platform.getCommandQueue(), memObject, true, new long[3],
	            new long[]{width, height, 1},
	            width * Sizeof.cl_uint, 0,
	            Pointer.to(dataDst), 0, null, null);
	        
	        
	        
	    long end = System.currentTimeMillis();
		//System.out.println("OpenCLImage getAsBufferedImage "+(end-begin));
		return outImg;
	}
	
	public void destroy(){
		long begin = System.currentTimeMillis();
		
		clReleaseMemObject(memObject);
		this.memObject = null;
		
		long end = System.currentTimeMillis();
		//System.out.println("OpenCLImage destroy"+(end-begin));
	}

	public int getWidth() {
		return width;
	}

	public void setWidth(int width) {
		this.width = width;
	}

	public int getHeight() {
		return height;
	}

	public void setHeight(int height) {
		this.height = height;
	}

	public cl_mem getMemObject() {
		return memObject;
	}

	public void setMemObject(cl_mem memObject) {
		this.memObject = memObject;
	}
}

an Operation:


import static org.jocl.CL.CL_MEM_WRITE_ONLY;
import static org.jocl.CL.CL_RGBA;
import static org.jocl.CL.CL_UNSIGNED_INT8;
import static org.jocl.CL.clBuildProgram;
import static org.jocl.CL.clCreateKernel;
import static org.jocl.CL.clCreateProgramWithSource;
import static org.jocl.CL.clSetKernelArg;

import java.io.IOException;

import org.jocl.CL;
import org.jocl.Pointer;
import org.jocl.Sizeof;
import org.jocl.cl_image_format;
import org.jocl.cl_kernel;
import org.jocl.cl_mem;
import org.jocl.cl_program;

import static org.jocl.CL.*;

public class SubstractOperation {
	private static final String KERNEL_SOURCE_FILE = "kernels/substract.cl";
	private static final String KERNEL_METHOD = "substract";
	
	private static cl_program program;
	private static cl_kernel kernel;
	
	private int imgWidth;
	private int imgHeight;
	
	private cl_mem output;
	static{
		try {
			OpenCLPlatform platform = OpenCLPlatform.getInstance();
			
			long begin = System.currentTimeMillis();
			
			String kernelString = OpenCLPlatform.obtainKernel(KERNEL_SOURCE_FILE);
			program = clCreateProgramWithSource(platform.getContext(), 1, new String[]{kernelString}, null, null);
			clBuildProgram(program, 0, null, null, null, null);
			kernel = clCreateKernel(program, KERNEL_METHOD, null);
			
			long end = System.currentTimeMillis();
			//System.out.println("Retrieve/Compile/Build Program, Kernel --> Substract "+(end-begin));
		} catch (IOException e) {
			e.printStackTrace();
		}
	}
	public SubstractOperation(OpenCLImage img1, OpenCLImage img2){
		OpenCLPlatform platform = OpenCLPlatform.getInstance();
		
		long begin = System.currentTimeMillis();
		
		this.imgWidth = img1.getWidth();
		this.imgHeight = img1.getHeight();
		
		cl_image_format imageFormat = new cl_image_format();
        imageFormat.image_channel_order = CL_RGBA;
        imageFormat.image_channel_data_type = CL_UNSIGNED_INT8;
		
		int err[] = new int[1];
		this.output = CL.clCreateImage2D(
                platform.getContext(), CL_MEM_READ_WRITE,
                new cl_image_format[]{imageFormat}, imgWidth, imgHeight,
                0, null, err);
		
		//System.out.println(err[0]);
		
		
		
		//assign the arguments
		clSetKernelArg(kernel, 0, Sizeof.cl_mem, Pointer.to(img1.getMemObject()));
		clSetKernelArg(kernel, 1, Sizeof.cl_mem, Pointer.to(img2.getMemObject()));
		clSetKernelArg(kernel, 2, Sizeof.cl_mem, Pointer.to(this.output));
		
		long end = System.currentTimeMillis();
		//System.out.println("Creating buffers / Setting kernel arguments --> Substract "+(end-begin));
	}
	
	public OpenCLImage execute(){
		OpenCLPlatform platform = OpenCLPlatform.getInstance();
		
		long begin = System.currentTimeMillis();
		
		long global_work_size[] = new long[2];
		global_work_size[0] = imgWidth;
		global_work_size[1] = imgHeight;
		//long local_work_size[] = new long[]{1};
		
		clEnqueueNDRangeKernel(platform.getCommandQueue(), kernel, 2, null, global_work_size, null, 0, null, null);
		
		
		long end = System.currentTimeMillis();
		//System.out.println("Setting worksize / Running kernel --> Substract "+(end-begin));
		
		return new OpenCLImage(output,this.imgWidth,this.imgHeight);
	}
}

example usage code:

		int maskY = 3;
		int[]kernel = new int[maskX*maskY];
		Arrays.fill(kernel, 0);
		int maskXKey = 1;
		int maskYKey = 1;
		
		
		
		OpenCLImage imgin1 = new OpenCLImage(img1);
		OpenCLImage imgin2 = new OpenCLImage(img2);
		
		//showImage(imgin1.getAsBufferedImage(),"Original image1");
		//showImage(imgin2.getAsBufferedImage(),"Original image2");
		
		ErodeOperation erode;
		DilateOperation dilate;
		SubstractOperation sub;
		MaxOperation max;
		
		erode = new ErodeOperation(imgin1, kernel, maskX, maskY, maskXKey, maskYKey);
		OpenCLImage imgout1 = erode.execute();
		
		//showImage(imgout1.getAsBufferedImage(),"Erode image1");
		
		dilate = new DilateOperation(imgin2, kernel, maskX, maskY, maskXKey, maskYKey);
		OpenCLImage imgout2 = dilate.execute();
		
		//showImage(imgout2.getAsBufferedImage(), "Dilate image2");
		
		sub = new SubstractOperation(imgout1, imgout2);
		OpenCLImage imgout3 = sub.execute();
		
		//showImage(imgout3.getAsBufferedImage(),"Erode image1 - Dilate image2 = Diff1");
		
		imgout1.destroy();
		imgout2.destroy();
		imgout1 = null;
		imgout2 = null;
		
		//---
		
		erode = new ErodeOperation(imgin2, kernel, maskX, maskY, maskXKey, maskYKey);
		imgout1 = erode.execute();
		
		//showImage(imgout1.getAsBufferedImage(),"Erode image2");
		
		dilate = new DilateOperation(imgin1, kernel, maskX, maskY, maskXKey, maskYKey);
		imgout2 = dilate.execute();
		
		//showImage(imgout2.getAsBufferedImage(), "Dilate image1");
		
		sub = new SubstractOperation(imgout1,imgout2);
		OpenCLImage imgout4 = sub.execute();
		
		//showImage(imgout4.getAsBufferedImage(), "Erode image2 - Dilate image1 = Diff2");
		
		imgout1.destroy();
		imgout2.destroy();
		imgout1 = null;
		imgout2 = null;
		
		//---
		
		max = new MaxOperation(imgout3, imgout4);
		OpenCLImage outimg = max.execute();
		imgout3.destroy();
		imgout4.destroy();
		imgout3=null;
		imgout4=null;
		//PlanarImage diff = outimg.getAsPlanarImage();
		//showImage(diff.getAsBufferedImage(),"Max Diff1 & Diff2 = Diff");
		
		//outimg.destroy();
		imgin1.destroy();
		imgin2.destroy();
		imgin1 = null;
		imgin2 = null;
		
		
		return outimg;

Beware of the many commented-out lines; this is an experimental process for me, as I'm pretty new to OpenCL (you can tell by my previous beginner questions).

I hope i provided enough details.

Again, thanks in advance!

#4

Hi

Of course, the context, command queue and kernels should be created only once, so everything should be fine here.

I tried another simple test like that


import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;

import javax.imageio.ImageIO;

public class ImageToolsTest
{
    public static void main(String[] args) throws IOException
    {
        BufferedImage img1 = ImageIO.read(new File("image00.png"));
        BufferedImage img2 = ImageIO.read(new File("image04.png"));
       
        for (int i=0; i<10000; i++)
        {
            OpenCLImage imgin1 = new OpenCLImage(img1);
            OpenCLImage imgin2 = new OpenCLImage(img2);
           
            SubstractOperation sub = new SubstractOperation(imgin1, imgin2);
            OpenCLImage imgout3 = sub.execute();
           
            BufferedImage b = imgout3.getAsBufferedImage();
            System.out.println("Result in run "+i+": "+b);
           
            imgin1.destroy();
            imgin2.destroy();
        }
    }
}

But of course, I have commented out the kernel.

As far as I can see, the only really “variable” CL objects (which may be created and deleted) are the cl_mems inside the OpenCLImage. So this might be the source of the problem. However, when you are always calling “destroy” on the images, this should be OK.

Can you describe in more detail under which condition you receive a memory allocation failure? Note that there is a maximum size for memory allocations (this is a device property that can be queried using CL_DEVICE_MAX_MEM_ALLOC_SIZE). Maybe the images are just too large?
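A rough way to check this on paper, as a sketch only: the class and method names below are made up for illustration, and the 128 MiB limit is an assumed placeholder, not a value queried from any device. An image with the CL_RGBA / CL_UNSIGNED_INT8 format used in this thread needs 4 bytes per pixel, so its raw size can be compared with whatever CL_DEVICE_MAX_MEM_ALLOC_SIZE reports. The device may add padding, and images also have their own dimension limits (CL_DEVICE_IMAGE2D_MAX_WIDTH and CL_DEVICE_IMAGE2D_MAX_HEIGHT), so this is only a lower bound.

```java
// Hypothetical helper for estimating the raw size of an RGBA8 image.
class ImageAllocationCheck {

    // CL_RGBA with CL_UNSIGNED_INT8: 4 channels of 1 byte each per pixel.
    static long rgba8Bytes(int width, int height) {
        return (long) width * height * 4L;
    }

    // True if the raw image data does not exceed the given allocation limit.
    static boolean fits(long imageBytes, long maxMemAllocSize) {
        return imageBytes <= maxMemAllocSize;
    }

    public static void main(String[] args) {
        // Assumed limit for illustration; a real program would query
        // CL_DEVICE_MAX_MEM_ALLOC_SIZE from the device instead.
        long assumedLimit = 128L * 1024 * 1024;
        long bytes = rgba8Bytes(4096, 4096);
        System.out.println(bytes + " bytes, fits: " + fits(bytes, assumedLimit));
        // prints 67108864 bytes, fits: true
    }
}
```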

bye
Marco


#5

I'm almost sure that the image isn’t too large for the GPU to handle, because when I run only that image it doesn’t throw an exception, unlike when I run all the images in a row. The program also always gets stuck at the same place in the code.

I use Process Explorer to track the memory usage of the GPU, and I can clearly see that it just keeps filling up without the memory being released. In the end, when it’s almost completely full and the program tries to allocate another memory object for an image operation, it crashes.

While writing this post I noticed that I made a mistake when overwriting an OpenCLImage object… After the overwrite the variable holds the new cl_mem, but the old one still remains…

Thanks for giving me some additional insight in memory allocation etc.

I appreciate the help :wink:


#6

[QUOTE=Unregistered;17394]By overwriting the cl_mem object it gets the new value but the old one still remains…
[/QUOTE]

I did not understand what you mean… Did you somehow solve the problem? To me it seemed you were releasing the right cl_mem objects…?


#7

One of the simpler operations I did was a contraststretch where I took an OpenCLImage input object, and returned an OpenCLImage output object.

The problem was situated in these lines of code:

...
//input comes from previous operations
OpenCLImage input;
ConstrastStretchOperation op = new ConstrastStretchOperation (input, ...);

//apply contrast stretch twice for better results
input = op.execute();

op = new ConstrastStretchOperation (input, ...);
input = op.execute();

...

This was solved by doing:

...

//apply contrast stretch twice for better results
op = new ConstrastStretchOperation (input, ...);
OpenCLImage contrast1 = op.execute();
input.destroy();

op = new ConstrastStretchOperation (contrast1, ...);
OpenCLImage contrast2 = op.execute();
contrast1.destroy();

...

Assigning a new value to the Java variable that holds a cl_mem does not release the old cl_mem object; the old allocation stays alive on the device until clReleaseMemObject() is called on it. Therefore I now store each result in a new variable and destroy the old one explicitly.
Previously my memory usage grew to more than a gigabyte, whereas now it doesn't go over 50 MB. :slight_smile:
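
The effect can be reproduced without a GPU. The sketch below is an illustration only: `FakeDevice` is a made-up stand-in that counts live handles, not anything from JOCL. The leaky variant overwrites the handle on each pass, as in the original code; the fixed variant releases the old handle before dropping the reference to it.

```java
import java.util.HashSet;
import java.util.Set;

class ReassignLeakDemo {

    // Hypothetical stand-in for the device: tracks handles not yet released.
    static class FakeDevice {
        private final Set<Integer> live = new HashSet<>();
        private int nextId = 0;

        int allocate() {
            int id = nextId++;
            live.add(id);
            return id;
        }

        void release(int id) {
            live.remove(id);
        }

        int liveCount() {
            return live.size();
        }
    }

    // Leaky: each pass overwrites the handle without releasing the old one.
    static int leaky(FakeDevice device, int passes) {
        int handle = device.allocate();
        for (int i = 0; i < passes; i++) {
            handle = device.allocate(); // previous handle is lost here
        }
        device.release(handle);
        return device.liveCount();
    }

    // Fixed: the old handle is released before the reference is replaced.
    static int fixed(FakeDevice device, int passes) {
        int handle = device.allocate();
        for (int i = 0; i < passes; i++) {
            int next = device.allocate();
            device.release(handle);
            handle = next;
        }
        device.release(handle);
        return device.liveCount();
    }

    public static void main(String[] args) {
        System.out.println("leaky: " + leaky(new FakeDevice(), 2)); // prints leaky: 2
        System.out.println("fixed: " + fixed(new FakeDevice(), 2)); // prints fixed: 0
    }
}
```

With two passes, as in the contrast stretch example, the leaky variant leaves two allocations behind on every run, which adds up quickly over many test cases with large images.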

Problem solved in my eyes and an important lesson learned.