Using two databases as one example of this situation: CouchDB and Cassandra.


CouchDB utilizes a B+ Tree for document indexes (using a clever modification to operate within their append-only atmosphere) - more particularly as documents are modified (place/update/remove) they're appended towards the running database file in addition to a full Leaf -> Node path in the B+ tree of all of the nodes effected through the up-to-date revision immediately after the document.

These piece-mealed index revisions are inlined right alongside the modifications so that the entire index is really a union of the very recent index modifications appended in the finish from the file together with additional pieces further in the information file which are still relevant and weren't modified yet.

Searching the B+ tree is O(logn).


Cassandra keeps record secrets sorted, in-memory, in tables (let us think about them as arrays with this question) and creates them out separate (sorted) sorted-string tables every so often.

We are able to think about the gathering famous these tables because the "index" (from things i understand).

Cassandra is needed to compact/combine these sorted-string tables every so often, developing a more complete file representation from the index.

Searching a sorted array is O(logn).


Presuming an identical degree of complexity between either maintaining partial B+ tree portions in CouchDB versus partial sorted-string indices in Cassandra and considering that both provide O(logn) search time which do you consider will make a much better representation of the database index and why?

I'm particularly curious if there's an implementation detail about one within the other that causes it to be particularly attractive or if they're both a clean and you simply pick whichever data structure you'd rather useOrwill work better towards the developer.

Appreciate the ideas.