Freeing a device pointer with an offset should have the same effect as it has in CUDA-C if you passed such an offset pointer (instead of the pointer originally returned by the allocation) to cudaFree, which means that it should cause a “cudaErrorInvalidDevicePointer” error.
Concerning the question about obtaining the first elements of a pointer: maybe this is a misunderstanding, but a pointer in C has no such concept as an “array length”. Pointers do not know how large the (valid) memory region is that they are pointing to. So, according to your example: the pointer to the 8 elements is in C identical to a pointer to the first 4 elements. The pointer is pointing to the same memory location (namely, to the beginning of the “array”) and does not know whether there are 4, 8 or 1000 “valid” elements beyond this location.
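To illustrate the point about pointers not knowing their length, here is a minimal sketch in plain host C (no CUDA involved). The function name `sum_first` is made up for this example. The only thing that distinguishes “the first 4 elements” from “all 8 elements” is the count that the caller passes in:

```c
#include <assert.h>
#include <stddef.h>

/* Sums the first `count` elements starting at `data`.
   The pointer itself carries no length information, so the caller
   has to say how many elements are valid. (Hypothetical helper.) */
float sum_first(const float *data, size_t count)
{
    assert(data != NULL || count == 0);
    float total = 0.0f;
    for (size_t i = 0; i < count; ++i) {
        total += data[i];
    }
    return total;
}
```

Calling `sum_first(values, 4)` and `sum_first(values, 8)` on the same 8-element array uses exactly the same pointer; only the count differs.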
You are most likely asking these questions because you want to “reduce” the size of a CUDA array at runtime, step by step, right? This is not possible: an existing allocation cannot be resized in place. The only solution would be to create a new, smaller array and copy the first elements from the large one into the small one. But of course, this would take time and might be unnecessary. If the memory consumption is not absolutely critical, and if you do not absolutely need the “unused” memory in order to allocate new arrays, you should probably allocate the array only once, with its maximum size, in the beginning. In later steps, you would then not “make the array smaller”, but only store the current length, i.e. maintain a variable that says how many elements of the large array are currently really used. I can hardly imagine how the algorithm should work without this information, so you most likely already have something like this. And this value may, by the way, also be the value that says how many elements have to be copied back to the host in the end.
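A rough sketch of this “allocate once, track the used length” idea, again in plain host C as a stand-in: the struct `FloatBuffer` and the `buffer_*` functions are hypothetical names for this example, and in a real CUDA program the `calloc`, `memcpy` and `free` calls would correspond to a device allocation, a device-to-host copy and a device free.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A buffer that is allocated once with its maximum size.
   `used` says how many elements are currently really in use. */
typedef struct {
    float *data;
    size_t capacity; /* number of allocated elements, never changes */
    size_t used;     /* number of elements currently in use */
} FloatBuffer;

/* Allocates `capacity` zeroed elements; returns 1 on success, 0 on failure. */
int buffer_init(FloatBuffer *b, size_t capacity)
{
    b->data = calloc(capacity, sizeof *b->data);
    b->capacity = (b->data != NULL) ? capacity : 0;
    b->used = b->capacity;
    return b->data != NULL;
}

/* "Shrinks" the buffer by only lowering the used count.
   No memory is freed or reallocated. */
void buffer_shrink(FloatBuffer *b, size_t new_used)
{
    assert(new_used <= b->used);
    b->used = new_used;
}

/* Copies only the used elements to `dst` and returns how many were copied.
   `dst` must have room for at least `b->used` elements. */
size_t buffer_copy_out(const FloatBuffer *b, float *dst)
{
    if (b->used > 0) {
        memcpy(dst, b->data, b->used * sizeof *b->data);
    }
    return b->used;
}

/* Frees the allocation using the original, unmodified pointer. */
void buffer_free(FloatBuffer *b)
{
    free(b->data);
    b->data = NULL;
    b->capacity = 0;
    b->used = 0;
}
```

The allocation stays at its maximum size the whole time. Only `used` changes between steps, and the same value decides how many elements are copied back in the end.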