In general, there is no other way. Kernels can only write to global (device) memory. If you need this data on the host side, it has to be copied explicitly with cudaMemcpy. This is the same in CUDA C and in JCuda.
The Pointer.to methods intentionally accept only arrays. One reason is to emphasize that it’s simply not possible to take the address of a local variable in Java. In C you can write (host!) code like
void modify(int *data) { *data = 123; }
int main(void) {
    int value = 0;
    modify(&value);
    // 'value' is now '123'
    return 0;
}
but there is no such thing in Java. In Java, the simplest (although, admittedly, still clumsy and inconvenient) way of “emulating” this is
void modify(int data[]) { data[0] = 123; }
void main() {
int value[] = { 0 };
modify(value);
// 'value[0]' is now '123'
}
Specifically referring to the code snippet you posted: you are right that ‘h_answer’ has to be declared as an array. So the C code
long h_answer;
...
cudaMemcpy(&h_answer, ...)
has to be translated to
long h_answer[] = {0};
...
cudaMemcpy(Pointer.to(h_answer), ...)
In some cases, it would make sense to let “Pointer.to” also accept single values, namely when the value is only going to be read. Specifically, it would then be possible to write something like
long hostInputThatWillOnlyBeRead = 123;
...
cudaMemcpy(d, Pointer.to(hostInputThatWillOnlyBeRead), ...);
The Pointer.to method would simply put the value into an array internally. But I thought that it might cause confusion (especially for beginners). If such a method existed, it would also be possible to write something like
long h_answer;
...
cudaMemcpy(Pointer.to(h_answer), ...)
But although this code would look like it uses a pointer to the local variable, this would NOT be the case, of course. The value of the local variable could not be modified through this pointer, because the pointer would refer to an internal copy. This simply cannot be achieved in Java. I hoped that the necessity to explicitly use an array would avoid such potential errors.
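To make the pitfall concrete, here is a minimal plain-Java sketch that does not use JCuda at all. The helper `pointerTo(long)` is hypothetical (no such overload exists in JCuda); it only mimics what a single-value overload would have to do internally, namely copy the value into a fresh array. `simulateCopy` is likewise just an assumed stand-in for a copy such as cudaMemcpy that writes a result into host memory.

```java
public class PointerToValueDemo {

    // Hypothetical stand-in for a single-value Pointer.to(long) overload:
    // the value is copied into a new one-element array.
    static long[] pointerTo(long value) {
        return new long[] { value };
    }

    // Assumed stand-in for a copy (such as cudaMemcpy) that writes
    // a result into the given host memory.
    static void simulateCopy(long[] destination, long result) {
        destination[0] = result;
    }

    // Writes through the hypothetical single-value "pointer".
    // Only the internal copy is modified; the local variable is not.
    static long viaLocalVariable() {
        long hAnswer = 0;
        simulateCopy(pointerTo(hAnswer), 123);
        return hAnswer;
    }

    // Writes through an explicit one-element array.
    // The result is visible to the caller afterwards.
    static long viaArray() {
        long[] hAnswer = { 0 };
        simulateCopy(hAnswer, 123);
        return hAnswer[0];
    }

    public static void main(String[] args) {
        System.out.println("local variable: " + viaLocalVariable()); // prints "local variable: 0"
        System.out.println("array element: " + viaArray());          // prints "array element: 123"
    }
}
```

The first variant silently loses the result, which is exactly the kind of error that the array-only signature is meant to prevent.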