Included .cu files not updated

djmj · September 5, 2011, 21:53

I have an include file that contains my #define constants, such as the block size.

This file is included by the kernel functions that need the block sizes or tile sizes.

The problem is that the compiler does not notice when this file has been updated.
It only checks whether the main kernel .cu file has been updated.

Is this the usual behavior, or is it JCuda-specific?

/*
 * Shared Memory Tile Size, must be a multiple of 16
 */	
#define TILE_SIZE 64

/*
 * Number of Threads in a Block
 *
 * Maximum number of resident blocks per multiprocessor : 8
 *
 * ///////////////////
 * Compute capability:
 * ///////////////////
 *
 * Cuda [1.0 - 1.1] ~	
 *	Maximum number of resident threads per multiprocessor 768
 *	Optimal Usage: 768 / 8 = 96
 * Cuda [1.2 - 1.3] ~
 *	Maximum number of resident threads per multiprocessor 1024
 *	Optimal Usage: 1024 / 8 = 128
 * Cuda [2.x] ~
 *	Maximum number of resident threads per multiprocessor 1536
 *	Optimal Usage: 1536 / 8 = 192
 */	
#define BLOCK_SIZE_DEF 96

/*
 * Number of Threads in a Block, mainly for reduction kernels,
 * must be a power of two
 *
 * ///////////////////
 * Compute capability:
 * ///////////////////
 *
 * Cuda [1.0 - 1.1] ~	
 *	BLOCK_SIZE_DEF = 96 --> 128  --> 6 of 8 possible resident blocks per multiprocessor
 * Cuda [1.2 - 1.3] ~
 *	BLOCK_SIZE_DEF = 128 --> 128
 * Cuda [2.x] ~
 *	BLOCK_SIZE_DEF = 192 --> 256
 */	
#define BLOCK_SIZE_POW2 128


/*
 * Twice the block size, for the reduction kernel: each thread
 * loads two elements from global memory at the first reduction level
 */
#define BLOCK_SIZE_POW2_DOUBLE (BLOCK_SIZE_POW2 << 1)
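
The arithmetic in the comments above can be checked with a few lines of Java. This is only a minimal sketch: the class and method names (BlockSizes, optimalBlockSize, nextPowerOfTwo) are hypothetical, and the resident-thread and resident-block limits are simply the values quoted in the comments.

```java
class BlockSizes {
    // Maximum number of resident blocks per multiprocessor, as stated in the comments.
    static final int MAX_RESIDENT_BLOCKS = 8;

    // Threads per block that exactly fill a multiprocessor with 8 resident blocks.
    static int optimalBlockSize(int maxResidentThreads) {
        return maxResidentThreads / MAX_RESIDENT_BLOCKS;
    }

    // Smallest power of two that is greater than or equal to n (for n >= 1).
    static int nextPowerOfTwo(int n) {
        if (n <= 1) {
            return 1;
        }
        return Integer.highestOneBit(n - 1) << 1;
    }

    public static void main(String[] args) {
        // Resident-thread limits for compute capability 1.0-1.1, 1.2-1.3 and 2.x.
        int[] limits = {768, 1024, 1536};
        for (int limit : limits) {
            int def = optimalBlockSize(limit);
            int pow2 = nextPowerOfTwo(def);
            System.out.println(limit + " resident threads: BLOCK_SIZE_DEF = " + def
                    + ", BLOCK_SIZE_POW2 = " + pow2
                    + ", resident blocks at POW2 size = " + (limit / pow2));
        }
    }
}
```

Rounding up to a power of two keeps the reduction kernels simple, at the cost of fewer resident blocks on some devices (for example, 768 / 128 = 6 instead of 8).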

Marco13 · September 6, 2011, 01:17

Hello

How are you compiling the main .cu file? Is the file that you posted a header that is included via
#include "myConstants.h"
in the main file?

bye
Marco

djmj · September 6, 2011, 08:02

It is included like this:

extern "C"
{
	#include "inc/GPU_utilBlockSize.cu"
}

Marco13 · September 6, 2011, 09:41

And how are you compiling it? (So far I don't see a reason why this should not work, although the 'extern "C"' should not be necessary here.)

djmj · September 6, 2011, 17:37

Eclipse compiles it when I start my program, for example.

Java Host Code:

UtilCuda.kernelLauncherCreateSetupCall(
	UtilCudaKernels.KERNELS_UTIL_PATH + "GPU_norm_kernel.cu",
	"normColumn",
	UtilCuda.getNewDim3(width),
	UtilCuda.getNewDim3(BLOCK_SIZE),
	new Object[] {
		mat_hd,
		width,
		heigth,
		Util.getCeil(heigth, elemPerBlock),
		outputTileCount });

Normalize Kernel:

extern "C"
{
	#include "inc/GPU_utilBlockSize.cu"

	//include other necessary files
	__global__ void normColumn(float** inOutMat_g,
		const unsigned int inWidth_s,
		const unsigned int inHeigth_s,
		const unsigned int inTileCount_s,
		const unsigned int inOutputTileCount_s)
	{
		//do kernel
	}
}

GPU_utilBlockSize.cu:

#define TILE_SIZE 64
#define BLOCK_SIZE_DEF 96
#define BLOCK_SIZE_POW2 128
#define BLOCK_SIZE_POW2_DOUBLE (BLOCK_SIZE_POW2 << 1)

Marco13 · September 7, 2011, 01:29

Does this UtilCuda class use the KernelLauncher internally? Note that the KernelLauncher checks whether the output file already exists and is newer than the input file. If you modify only an included file, the KernelLauncher will not notice the change. You can add the 'forceRebuild' flag to the 'compile' call of the KernelLauncher to force the kernel to be rebuilt every time.
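
The timestamp check described here, and the reason it misses changes to included files, can be sketched as follows. This is not the KernelLauncher's actual code: StaleCheck, isStale and isStaleFile are hypothetical names, and the .ptx output name is an assumption. The sketch only shows how a caller could decide when to force a rebuild by also comparing against the included files.

```java
import java.io.File;

class StaleCheck {
    // Pure comparison on timestamps (milliseconds since the epoch).
    // An output timestamp of 0 means the output file does not exist.
    static boolean isStale(long outputModified, long... inputsModified) {
        if (outputModified == 0L) {
            return true; // nothing compiled yet
        }
        for (long inputModified : inputsModified) {
            if (inputModified > outputModified) {
                return true; // at least one input is newer than the output
            }
        }
        return false;
    }

    // Same check on files. File.lastModified() returns 0 for a missing file.
    static boolean isStaleFile(File output, File... inputs) {
        long[] times = new long[inputs.length];
        for (int i = 0; i < inputs.length; i++) {
            times[i] = inputs[i].lastModified();
        }
        return isStale(output.lastModified(), times);
    }

    public static void main(String[] args) {
        File ptx = new File("GPU_norm_kernel.ptx"); // assumed output name
        File mainCu = new File("GPU_norm_kernel.cu");
        File include = new File("inc/GPU_utilBlockSize.cu");

        // Checking only the main file misses edits to the include.
        boolean mainOnly = isStaleFile(ptx, mainCu);
        // Listing the include as an additional input catches them.
        boolean withIncludes = isStaleFile(ptx, mainCu, include);

        System.out.println("rebuild (main file only): " + mainOnly);
        System.out.println("rebuild (with includes):  " + withIncludes);
    }
}
```

The result of the second check could then be used to decide whether to request a forced rebuild, instead of rebuilding unconditionally on every start.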

djmj · September 7, 2011, 07:21

Ah, yes. UtilCuda has a few wrapper methods for creating and launching a KernelLauncher.