An interesting finding from Kresten Krab Thorup on how key distribution is impacting performance:
Innostore uses a B-tree, and we realized that it was really suffering from the random keys, because it then needs to do I/O on random nodes of the B-tree.
So we changed the keys to be
<<timestamp>>:<<random-bits>>, i.e., such that successive writes have keys that are lexicographically close. The random bits are there to make the chance of conflict small enough.
Using such keys causes the underlying B-tree to write to only a few nodes at a time, and ideally Innostore only needs to keep in memory the tree nodes along the path from the root of the tree to the node currently being added to.
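To make the key scheme concrete, here is a minimal Python sketch of a timestamp-prefixed key generator. It is not the code used in the original work: the function name `make_key`, the nanosecond timestamp, the 20-digit zero padding, and the 4 random bytes are all assumptions chosen for illustration. The padding is what makes lexicographic order follow time order.

```python
import os
import time


def make_key(now_ns=None, random_bytes=4):
    """Build a '<timestamp>:<random-bits>' key (hypothetical helper).

    now_ns       -- timestamp in nanoseconds; defaults to the current time
    random_bytes -- number of random bytes appended to reduce collisions
    """
    if now_ns is None:
        now_ns = time.time_ns()
    # Zero-pad to a fixed width so that comparing keys as strings
    # gives the same order as comparing the timestamps as numbers.
    timestamp = f"{now_ns:020d}"
    # Random suffix: two writers in the same nanosecond still get
    # distinct keys with high probability.
    suffix = os.urandom(random_bytes).hex()
    return f"{timestamp}:{suffix}"


if __name__ == "__main__":
    for _ in range(3):
        print(make_key())
```

Keys produced close together in time share a long common prefix, so they land in neighbouring positions in the B-tree, which is the property the quote relies on. How many random bits are "enough" depends on the write rate and the timestamp resolution, which the source does not specify.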
Original title and link: Riak, Bitcask, Innostore and The Impact of Key Distribution (NoSQL databases © myNoSQL)