Say I have entities to store. For now it might be good enough to treat them as blobs. I would like the entities to be stored on a cluster. The important thing, the identifier of an entity, is an (x,y) integer coordinate, so they are essentially located in a two-dimensional grid. Updating any entity requires locking its 4 neighbors. Since I want redundancy anyway, I figured the best approach is to use that redundancy to make sure the neighbors are always available. Here's what the distribution could look like:

   1  2  3  4  5  6
1 [F][F][E][E][G][G]
2 [F][F][E][E][G][G]
3 [D][D][A][A][B][B]
4 [D][D][A][A][B][B]
5 [H][H][C][C][I][I]
6 [H][H][C][C][I][I]

If A, B, C, D, E, F, G, H, I are servers, A is the owner of the (3,3) entity, and it must know (2,3) and (3,2), which belong to other servers. Arranged in blocks of four, each entity always has two sides owned by other servers. Using triple redundancy, I want to force a local copy of the neighbors. This should give me essentially linear scalability.
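To make the layout concrete, here is a minimal sketch of which servers would have to hold a copy of a given entity so that every owner has its neighbors locally. It assumes x is the column and y is the row of the grid above, 1-based coordinates, and a fixed 6x6 grid; `LAYOUT`, `owner`, and `holders` are hypothetical names used only for illustration.

```python
BLOCK = 2   # each server owns a 2x2 block of entities
SIZE = 6    # grid is 6x6 in the example above (assumption for this sketch)

# Server letters per block, row by row, copied from the grid above.
LAYOUT = ["FEG",
          "DAB",
          "HCI"]

def owner(x, y):
    """Server that owns entity (x, y); x is the column, y is the row, both 1-based."""
    return LAYOUT[(y - 1) // BLOCK][(x - 1) // BLOCK]

def holders(x, y):
    """Owner of (x, y) plus every server that owns one of its 4 neighbors.

    Those servers need a local copy of (x, y) to update their own entities.
    """
    result = {owner(x, y)}
    for nx, ny in [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]:
        if 1 <= nx <= SIZE and 1 <= ny <= SIZE:  # skip neighbors outside the grid
            result.add(owner(nx, ny))
    return result

print(sorted(holders(3, 3)))  # ['A', 'D', 'E']
```

For an interior entity this yields three holders (the owner and two adjacent servers), which matches the triple redundancy described above.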

Is there a database that lets me define the sharding/replication key so that I can specify this kind of distribution, or is there a way of combining x and y into a single value that could be used to achieve this?
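One possible way of combining x and y into a single value is bit interleaving (a Z-order or Morton code). This is a technique brought in for illustration, not something a particular database is known to provide. With 0-based coordinates, the two lowest bits of the interleaved value encode the position inside a 2x2 block, so dropping them leaves a key shared by exactly the four entities of one block. The names `interleave` and `shard_key` are hypothetical, and the sketch only covers ownership; copying neighbors to adjacent servers would still have to be handled separately.

```python
def interleave(x, y, bits=16):
    """Interleave the bits of x and y (Z-order / Morton code); both must be non-negative."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)       # bit i of x goes to even position 2i
        key |= ((y >> i) & 1) << (2 * i + 1)   # bit i of y goes to odd position 2i+1
    return key

def shard_key(x, y):
    """Key shared by all four entities of a 2x2 block; x and y are 1-based."""
    return interleave(x - 1, y - 1) >> 2  # drop the two in-block position bits

# The four entities of the block containing (3,3) share one key.
print(shard_key(3, 3), shard_key(4, 3), shard_key(3, 4), shard_key(4, 4))  # 3 3 3 3
```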

What I am after is low latency and redundancy, not saving disk space. My entities have a "locality of reference" property: transactions only ever access the neighbors. But using the same key for an entity and its neighbors would lead to every entity having the same key.

I realize that duplicating BLOB data can be costly storage-wise, so you may want to limit redundancy. The advantage of sharding the data is that, in a denormalized form, you can greatly speed up searching and overall performance.

Just an idea, but perhaps you could approach this in a normalized way instead by creating three tables:

Coordinates - Column (ID), Column (Xcor), Column (Ycor)

Data - Column (ID), Column (Checksum?)

CoordinateData - Column (CoordinateID), Column (DataID)

With CoordinateData as a mapping table. This normally isn't well suited for indexing or searching, but if you stored something like a checksum string, you could use another medium for storing and finding the raw data.
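A minimal sketch of those three tables, using Python's built-in `sqlite3` module purely for illustration. The column types, the constraints, and the helper names `store` and `checksum_at` are assumptions; only the table and column names come from the outline above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.executescript("""
CREATE TABLE Coordinates (
    ID   INTEGER PRIMARY KEY,
    Xcor INTEGER NOT NULL,
    Ycor INTEGER NOT NULL,
    UNIQUE (Xcor, Ycor)
);
CREATE TABLE Data (
    ID       INTEGER PRIMARY KEY,
    Checksum TEXT NOT NULL          -- key for finding the raw blob stored elsewhere
);
CREATE TABLE CoordinateData (
    CoordinateID INTEGER NOT NULL REFERENCES Coordinates(ID),
    DataID       INTEGER NOT NULL REFERENCES Data(ID),
    PRIMARY KEY (CoordinateID, DataID)
);
""")

def store(x, y, checksum):
    """Insert a coordinate, a data row, and the mapping between them."""
    coord_id = conn.execute(
        "INSERT INTO Coordinates (Xcor, Ycor) VALUES (?, ?)", (x, y)).lastrowid
    data_id = conn.execute(
        "INSERT INTO Data (Checksum) VALUES (?)", (checksum,)).lastrowid
    conn.execute("INSERT INTO CoordinateData VALUES (?, ?)", (coord_id, data_id))

def checksum_at(x, y):
    """Look up the checksum for a coordinate through the mapping table."""
    row = conn.execute(
        "SELECT d.Checksum FROM Coordinates c "
        "JOIN CoordinateData cd ON cd.CoordinateID = c.ID "
        "JOIN Data d ON d.ID = cd.DataID "
        "WHERE c.Xcor = ? AND c.Ycor = ?", (x, y)).fetchone()
    return row[0] if row else None
```

The checksum returned by the lookup would then serve as the key into whatever separate store holds the raw blobs.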

Like I said, just an idea.