Riak, Bitcask, Innostore and The Impact of Key Distribution
An interesting finding from Kresten Krab Thorup[1] on how key distribution affects performance:
Innostore uses a B-tree, and we realized that it was really suffering from the random keys, because it then needs to do I/O on random nodes of the B-tree.
So we changed the keys to be
<<timestamp>>:<<random-bits>>
i.e., such that successive writes have keys that are lexicographically close. The random bits are there to make the chance of conflict small enough.
Using such keys causes the underlying B-tree to write to only a few nodes at a time, and ideally innostore only needs to keep in memory the tree nodes corresponding to a path from the root of the tree to the node currently being added to.
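To make the key scheme concrete, here is a minimal Python sketch. It is not the code from the original post: the function name make_key, the zero-padded nanosecond timestamp, and the choice of 4 random bytes are all assumptions made for illustration.

```python
import os
import time


def make_key(random_bytes=4, now_ns=None):
    """Build a key shaped like <timestamp>:<random-bits> (hypothetical helper)."""
    # Assumption: a nanosecond wall-clock timestamp; the post does not
    # say which clock or resolution was used.
    ts = time.time_ns() if now_ns is None else now_ns
    # Zero-pad to a fixed width so that lexicographic order of the
    # string matches numeric order of the timestamp.
    prefix = f"{ts:020d}"
    # The random suffix only lowers the chance that two writes with the
    # same timestamp collide; it does not affect ordering across timestamps.
    suffix = os.urandom(random_bytes).hex()
    return f"{prefix}:{suffix}"


if __name__ == "__main__":
    for _ in range(3):
        print(make_key())
```

Keys built this way sort in roughly the order they were created, so a B-tree keyed on them should see inserts clustered near its rightmost leaves rather than scattered across the whole tree.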
[1] Kresten Krab Thorup: Programmer, Entrepreneur, Programmer, Scientist, Programmer, CTO at Trifork
Original title and link: Riak, Bitcask, Innostore and The Impact of Key Distribution (NoSQL databases © myNoSQL)
via: http://lists.basho.com/pipermail/riak-users_lists.basho.com/2011-April/003819.html