The Benefits of Virtual Nodes and Performance Results
Sam Overton and Tom Wilkie of Acunu explain the advantages of using virtual nodes in distributed data storage engines and the performance they’ve measure introducing virtual nodes in Acunu platform when compared with Apache Cassandra:
One of the factors that limits the amount of data that can be stored on each node is the amount of time it takes to re-replicate that data when a node fails. That time matters, because it is a period during which the cluster is more vulnerable than normal to data loss. The challenge is that the more data stored on a node, the longer it takes to re-replicate it. Therefore, to store more data per node safely, we want to reduce the time taken to return to normal. This was one of our aims with virtual nodes.
Virtual Nodes reduces the time taken to re-replicate data as it involves every node in the cluster in the operation. In contrast, Apache Cassandra v1.1 will only involve a number of nodes equal to the Replication Factor (RF) of your keyspace. What’s more, with Virtual Nodes, the cluster remains balanced after this operation - you do not need to shuffle the tokens on the other nodes to compensate for the loss!
Original title and link: The Benefits of Virtual Nodes and Performance Results (©myNoSQL)
via: http://www.acunu.com/2/post/2012/07/virtual-nodes-performance-results.html