I want to maintain a large ordered list (millions of elements) in the Google App Engine datastore. Fast insertion is required.

The simplest way would be to add an indexed property (or column) "order_num" representing the order. For example, a list [A, B, C] would be stored like this:

content   order_num
--------------------
   A         1
   B         2
   C         3  

However, this does not give fast insertion. For example, if I want to insert X after A, I have to renumber B and C to "make room" for X, i.e., let B become 3, C become 4, and X be 2. This would be a disaster if I have millions of elements.

I found a feasible solution called the "gap approach", described here. This approach keeps a gap between adjacent elements, like this:

content   order_num
--------------------
   A         1000
   B         2000
   C         3000

When I want to insert X after A, I can simply add X with its order_num set to (1000 + 2000) / 2 = 1500; no renumbering is required.
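A minimal sketch of that midpoint step in plain Python, with no datastore calls. The function name `order_num_between` is hypothetical, and returning None when the gap is used up is just one way to signal that renumbering is due:

```python
def order_num_between(before, after):
    """Return an integer strictly between two order_num values.

    Returns None when the two values are adjacent, i.e. the gap is
    exhausted and some renumbering would be needed first.
    """
    if after - before < 2:
        return None
    return (before + after) // 2


print(order_num_between(1000, 2000))  # 1500
print(order_num_between(1500, 1501))  # None
```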

But as the gaps become smaller, renumbering may be required. My question is: is there any known strategy for renumbering, and for deciding the size of the gaps?

Thanks!

UPDATE

Here is more detail. Say I have a list of elements in the database, and each element has an integer property named my_num. The value of my_num is an arbitrary positive integer. Suppose I have a list [A, B, C, D], and their my_num values are

 element    my_num   
---------------------
   A          5        
   B          2
   C         10
   D          7

Now, let us define an accum() operator:

accum(n) = element[0].my_num + element[1].my_num + ... + element[n-1].my_num

So the accum values for each element are

 element    my_num   accum 
----------------------------
   A          5        5
   B          2        7
   C         10       17
   D          7       24

But the accum values probably should not be stored in the database, because the list is constantly updated. It is better to keep insertion fast.

I want to design a query whose input is an integer x:

query(x) = element[i] if accum(i-1) < x <= accum(i)

For example, query(11) is C and query(3) is A.
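To make the definitions concrete, here is a rough in-memory sketch of query(x) that accumulates my_num one element at a time. The list of (name, my_num) tuples stands in for the stored entities and is only an assumption for illustration; it is not a datastore schema:

```python
# Elements in list order, as (name, my_num) pairs.
elements = [("A", 5), ("B", 2), ("C", 10), ("D", 7)]


def query(elements, x):
    """Return the first element whose running total reaches x, else None."""
    if x <= 0:
        return None  # x is expected to be a positive integer
    running = 0
    for name, my_num in elements:
        running += my_num  # running total so far, i.e. the accum value
        if x <= running:
            return name
    return None  # x is larger than the total of all my_num values


print(query(elements, 11))  # C
print(query(elements, 3))   # A
```

This scan is linear in the number of elements, which is exactly the cost the question is trying to avoid.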

Is it possible to design a datastore schema that makes this query fast? Or is the only way to accumulate one by one at query time, which is what I am planning to do?

Alternatively, could you use decimals, or a string?

content     order
-------------------- 
   A         'a' 
   B         'b' 
   C         'c'

Then to insert D between A and B, give it the value 'aa'.

An algorithm for generating the strings is best shown for a binary string: if you want to insert something between "1011" and "1100", do the following:

  • Avalue = 1+0*(1/2)+1*(1/4)+1*(1/8)
  • Bvalue = 1+1*(1/2)+0*(1/4)+0*(1/8)

Average: new value = 1+0*(1/2)+1*(1/4)+1*(1/8)+1*(1/16), so the new string is "10111".

content     order
-------------------- 
   A         '1011' 
   new!      '10111' 
   B         '1100' 
   C         '1101'

Since you always average two values, the average will always have a finite binary expansion, and hence a finite string. It effectively defines a binary tree.
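The averaging step can be sketched in Python as below. This is only an illustration under assumptions: the helper name `midpoint` is made up, keys are non-empty strings of '0' and '1', and the first digit is read as the integer part as in the worked example. Padding both keys to the same width and adding them as integers yields the average with exactly one extra digit:

```python
def midpoint(lo, hi):
    """Return a binary string that sorts strictly between lo and hi."""
    width = max(len(lo), len(hi))
    # Pad on the right with zeros; this does not change the value.
    a = int(lo.ljust(width, "0"), 2)
    b = int(hi.ljust(width, "0"), 2)
    if a >= b:
        raise ValueError("lo must sort before hi")
    # (a + b) / 2 is a + b written with one more digit after the point.
    mid = format(a + b, "0{}b".format(width + 1))
    # Trailing zeros carry no value, so drop them to keep the key short.
    return mid.rstrip("0")


print(midpoint("1011", "1100"))   # 10111
print(midpoint("1011", "10111"))  # 101101
```

Plain string comparison then agrees with the numeric order, so an indexed string property could be sorted on directly.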

As you know, binary trees don't always turn out well balanced; in other words, some strings will be much longer than others after enough insertions. To keep them short, you could use any even number base. It has to be even because then the expansion of any average of two values is finite.
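A small check of the even-base remark, using exact fractions. The helper `has_finite_expansion` is a hypothetical name; it relies on the fact that a fraction terminates in a given base exactly when every prime factor of its reduced denominator divides the base:

```python
from fractions import Fraction
from math import gcd


def has_finite_expansion(value, base):
    """True if the Fraction `value` terminates when written in `base`."""
    d = value.denominator
    g = gcd(d, base)
    while g > 1:
        # Strip every factor the denominator shares with the base.
        while d % g == 0:
            d //= g
        g = gcd(d, base)
    return d == 1


# Base 10 (even): the average of 0.1 and 0.2 is 0.15, which terminates.
print(has_finite_expansion((Fraction(1, 10) + Fraction(2, 10)) / 2, 10))  # True
# Base 3 (odd): the average of 0.1 and 0.2 in base 3 is 1/2, which does not.
print(has_finite_expansion((Fraction(1, 3) + Fraction(2, 3)) / 2, 3))     # False
```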

But whatever you do, strings will probably become long, and you will have to do some housekeeping at some point, cleaning up the values so that the string space is used efficiently. What this algorithm gives you is the certainty that between clean-ups, the system will keep ticking along.
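One possible shape for that housekeeping, sketched under assumptions: read the elements in their current order, then hand out fresh binary keys of equal, minimal length. The function `rebalanced_keys` is hypothetical, and writing the new keys back to the datastore is left out:

```python
def rebalanced_keys(count):
    """Return `count` fresh equal-length binary keys in ascending order."""
    # Enough digits to write the numbers 1..count in binary.
    width = max(1, count.bit_length())
    return [format(i, "0{}b".format(width)) for i in range(1, count + 1)]


print(rebalanced_keys(3))  # ['01', '10', '11']
```

After such a pass, every key has about log2(n) digits for n elements, and later insertions can go back to averaging neighbouring keys.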

You probably want to consider using app-engine-ranklist, which uses a tree-based structure to maintain a rank order in the datastore.

Or, if you can describe your requirements in more detail, maybe we can suggest an alternative that involves less overhead.