Help~ Any example about HashMap (counting the frequency of key)

Hi all,

I would like to ask any jcuda example to do the counting function. Just using hashMap in Java. For example,

Map<String,Integer> countMap = new HashMap<String, Integer>();
String [] keys = new String{"a", "b","a","c"};

for(int i=0; i<keys.length; i++){
          countMap.put(keys**, countMap.get(keys**)+1);
          countMap.put(keys**, 1);



I think this is not trivial in general, and more importantly: It’s probably hard to achieve a nice speedup there. Assuming that you really want to cound the frequency of Java String objects, you’d have to convert all these Objects to byte[] arrays (which may be expensive), copy them from the host to the device (which might be expensive as well), implement a hash map in device memory (which might be tricky), then compute the hash codes of these byte[] arrays, handle hash collisions properly, and finally compare the strings with hash collisions to finally find out whether the counter in the target array has to be increased.

One aspect which would be interesting for me would be to use shared memory for the hash table (although this would be limiting its size) and the use of atomics for the counters. I’m not a CUDA expert, but even if someone has a deep insight and understanding of this subject, I think this could hardly be more than an academic example, since the operations involved are probably memory-bound. Maybe I’ll find the time to give it a try, but I would not expect it to be notably faster than the plain Java implementation.



I had another short look at the problem. It seems to be harder than I thought in the first place. An approach that resembles the Java version (using some sort of hash map) will hardly be feasible due to the inherently dynamic nature of a hash map data structure (Simply said: It’s not possible to allocate the required memory before the kernel is launched, since the structure of the hash map will depend on the data).

Specifically for the “Word Count” problem, there are some existing solutions. For example, on this page: . They are using an Map/Reduce approach for the word count with CUDA. But this is not a simple kernel that may be called, but it’s a larger framework and IMHO not very accessible (for me, at least)

Considering that it could take weeks or months to create a CUDA kernel that (embarassingly enough) does the same thing as a 3 lines(!) Java function, and considering the very high probabilty that any possible speedup is eaten up by the conversions of Strings to byte arrays and the memory transfer, I’m pretty sure that this is in no way worth the effort.

However, if this is only intended as an academic proof of concept, you may want to have a closer look at the Map/Reduce approach in the above link.


Hi Marco,

Thanks for your great support on my question. I will put effort on study the Map/Reduce approach. Thanks for your great help.