I want to use redis to store a large set of user_ids and, for each of these ids, the "group id" to which that user was previously assigned:

User_ID | Group_ID
   1043 | 2 
   2403 | 1

The number of user_ids is quite large (~ 10 million); the number of unique group ids is about 3 to 5.

My purpose for this LuT is routine:

  • find the group id for a given user, and

  • return a list of other users (of specified length) with the same group id as that given user

There might be an idiomatic way to do this in redis, or at least a way that is most efficient. If so, I would like to know what it is. Here is a simplified version of my working implementation (using the python client):

# assume a redis server is already running 
# create some model data:
import numpy as NP
NUM_REG_USERS = 10000    # small sample size, for illustration only
user_id = NP.random.randint(1000, 9999, NUM_REG_USERS)
cluster_id = NP.random.randint(1, 4, NUM_REG_USERS)
D = zip(cluster_id, user_id)

from redis import Redis
r = Redis()

# populate the redis LuT:
for t in D:
    # cast NumPy integers to plain Python types; newer redis-py versions reject them
    r.sadd(str(t[0]), int(t[1]))

# the queries:
# is user_id 1034 in Group 1?
r.sismember("1", 1034)

# return 10 users in the same Group 1 as user_id 1034:
list(r.smembers("1"))[:10]     # smembers returns a set, so convert before slicing; assume user_id 1034 is in group 1

So I have implemented this LuT using ordinary redis sets; each set is keyed to a Group ID (1, 2, or 3), so there are three sets in total.

Is this the best way to store this data given the kind of queries I want to run against it?

Using sets is a good basic approach, though there are a couple of things in there you might want to change:

Unless you store the group ID for each user somewhere, you will need up to 5 round trips (one SISMEMBER per group) to find the group for a user - each operation is O(1), but you still need to consider latency. Usually it's easy enough to do this without too much effort - you probably have other properties stored for each user, so it's trivial to add one for group id.
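As a rough sketch of that idea, assuming each user gets a small hash: the "user:<id>" key name and the "group_id" field below are made up for illustration, and the group sets keep the plain "1", "2", "3" keys from the question. The functions take any redis-py style client `r`.

```python
def user_key(user_id):
    # Hypothetical key naming scheme: one small hash per user.
    return "user:%d" % user_id


def assign_group(r, user_id, group_id):
    """Record the membership in both directions."""
    # Forward lookup: user -> group, stored as a field on the user's hash.
    r.hset(user_key(user_id), "group_id", group_id)
    # Reverse lookup: group -> users, the same sets the question already uses.
    r.sadd(str(group_id), user_id)


def group_of(r, user_id):
    """Return the user's group id as an int, or None if the user is unknown."""
    raw = r.hget(user_key(user_id), "group_id")
    return None if raw is None else int(raw)
```

With a real client this would be used as `assign_group(r, 1043, 2)` followed by `group_of(r, 1043)`, which costs one HGET instead of one SISMEMBER per group.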

You probably want SRANDMEMBER instead of SMEMBERS - SMEMBERS returns the whole set, so you transfer every member just to keep 10, and I think it will give you the same 10 items from your million item set every time.
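A minimal sketch of the SRANDMEMBER version, assuming the group sets are keyed by the group id as in the question and that `r` is a redis-py style client. The helper name `peers_of` is hypothetical. It asks for one extra member so the given user can be dropped if the sample happens to include them.

```python
def peers_of(r, user_id, group_id, count=10):
    """Return up to `count` random other members of the user's group."""
    # With a positive count, SRANDMEMBER returns distinct members.
    sample = r.srandmember(str(group_id), count + 1)
    # Members may come back as bytes or str; int() accepts both.
    others = [int(m) for m in sample if int(m) != user_id]
    return others[:count]
```

Only `count + 1` members cross the wire, whatever the size of the set.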