I need a way to do key-value lookups across (potentially) hundreds of GB of data. Ideally something based on a distributed hashtable that works nicely with Java. It should be fault-tolerant and free.

The store should be persistent, but would ideally cache data in memory to speed things up.

It should support concurrent reads and writes from multiple machines (reads will be 100x more common, though). Basically, the purpose is to do a quick initial lookup of user metadata for a web service.

Can anybody recommend anything?

Open Chord is an implementation of the Chord protocol in Java. Chord is a distributed hash table protocol, which should fit your needs well.
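For anyone unfamiliar with Chord: it assigns each key to the first node whose identifier is equal to or follows the key's identifier on a circular identifier space. Below is a rough single-process sketch of just that key-to-node assignment. It does not use Open Chord's API; the class and method names (IdentifierRing, addNode, removeNode, successor) are made up for illustration, and real Chord adds routing between nodes, which is left out here.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

/** Toy identifier ring (hypothetical): maps a key to its successor node. */
final class IdentifierRing {
    // Node identifiers in ring order, each mapped to the node's address.
    private final TreeMap<BigInteger, String> nodes = new TreeMap<>();

    /** Hashes a name to a non-negative 160-bit identifier with SHA-1. */
    static BigInteger id(String name) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            return new BigInteger(1, sha1.digest(name.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }

    void addNode(String address) {
        nodes.put(id(address), address);
    }

    void removeNode(String address) {
        nodes.remove(id(address));
    }

    /** Returns the first node at or after the key's identifier, wrapping around. */
    String successor(String key) {
        if (nodes.isEmpty()) {
            throw new IllegalStateException("ring has no nodes");
        }
        Map.Entry<BigInteger, String> entry = nodes.ceilingEntry(id(key));
        return (entry != null ? entry : nodes.firstEntry()).getValue();
    }
}
```

The useful property is that removing a node only moves the keys that node owned; keys owned by other nodes stay where they are.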

You might want to take a look at Hazelcast. It is distributed/partitioned, super lightweight, easy to use, and free.

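// Static accessor from early Hazelcast releases; later versions get the map from a HazelcastInstance.
java.util.Map<String, String> map = Hazelcast.getMap("mymap");
map.put("key1", "value1");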

Regards,

-talip

Depending on the use case, Terracotta might be just what you need.

You should probably specify whether it needs to be persistent or not, in memory or not, etc. You could try: http://www.danga.com/memcached/

Distributed hash tables include Tapestry, Chord, and Pastry. One of these should suit your needs.

OpenChord sounds promising, but I'd also consider BDB or some other non-SQL hashtable. Making it distributed can be dead easy (at least if the number of storage nodes is (almost) constant): just hash the key on the client to find the appropriate server.
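To make the "hash the key on the client" idea concrete, here is a minimal sketch. It assumes every client is configured with the same ordered list of servers; the class name ClientSideSharding and the method serverFor are hypothetical, and CRC32 is just one convenient stable hash, not something any of the stores mentioned here requires.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.zip.CRC32;

/** Picks the storage node for a key by hashing on the client (hypothetical helper). */
final class ClientSideSharding {
    private final List<String> servers; // fixed, ordered list shared by every client

    ClientSideSharding(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        this.servers = List.copyOf(servers);
    }

    /** Returns the server responsible for the given key. */
    String serverFor(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        // CRC32 yields a non-negative long, so the modulo result is a valid index.
        int index = (int) (crc.getValue() % servers.size());
        return servers.get(index);
    }
}
```

Note that with plain modulo hashing, changing the number of servers remaps most keys to a different server, which is why this only stays easy while the node count is (almost) constant.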

nmdb sounds like it's exactly what you need: a distributed in-memory cache with persistent on-disk storage. Current back-ends include qdbm, Berkeley DB, and (recently added after a quick email to the developer) Tokyo Cabinet. Key/value sizes are limited, though I believe that limit can be lifted if you don't need TIPC support.
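The general pattern of an in-memory cache sitting in front of a persistent store can be sketched in a few lines of Java. This is not how nmdb is implemented and does not use its API; ReadThroughCache is a hypothetical name, and the backingStore function is a stand-in for whatever on-disk lookup you actually use.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Bounded in-memory LRU cache in front of a slower persistent lookup (hypothetical). */
final class ReadThroughCache {
    private final Function<String, String> backingStore; // stand-in for the on-disk store
    private final Map<String, String> cache;

    ReadThroughCache(int maxEntries, Function<String, String> backingStore) {
        this.backingStore = backingStore;
        // accessOrder = true keeps the least recently used entry first.
        this.cache = new LinkedHashMap<>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Serves from memory when possible, otherwise loads from the backing store. */
    synchronized String get(String key) {
        String value = cache.get(key);
        if (value == null) {
            value = backingStore.apply(key);
            if (value != null) {
                cache.put(key, value); // may evict the least recently used entry
            }
        }
        return value;
    }
}
```

With reads far outnumbering writes, most lookups should be served from memory and only misses touch the disk.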

DNS has the ability to do this. I'm not sure how large each of your records is (8GB of lots of small data?), but it might work.