NoSQL Benchmarks NoSQL use cases NoSQL Videos NoSQL Hybrid Solutions NoSQL Presentations Big Data Hadoop MapReduce Pig Hive Flume Oozie Sqoop HDFS ZooKeeper Cascading Cascalog BigTable Cassandra HBase Hypertable Couchbase CouchDB MongoDB OrientDB RavenDB Jackrabbit Terrastore Amazon DynamoDB Redis Riak Project Voldemort Tokyo Cabinet Kyoto Cabinet memcached Amazon SimpleDB Datomic MemcacheDB M/DB GT.M Amazon Dynamo Dynomite Mnesia Yahoo! PNUTS/Sherpa Neo4j InfoGrid Sones GraphDB InfiniteGraph AllegroGraph MarkLogic Clustrix CouchDB Case Studies MongoDB Case Studies NoSQL at Adobe NoSQL at Facebook NoSQL at Twitter



10gen: MongoDB’s Fault Tolerance Is Not Broken…

Sitting comfortably? Check. Popcorn? Check. Let’s press play.

In an interview for InfoQ, 10gen’s Jared Rosoff replies to Emin Gün Sirer: “Broken by Design: MongoDB Fault Tolerance“.

  1. MongoDB lies when it says that a write has succeeded

    JR: “[…] Today the default behavior of official MongoDB drivers is Receipt Acknowledged, which means that you wait until the server has processed your write before returning to the client.”

    The key word here is today. As clearly explained by Sirer, the behavior he described was in all versions of MongoDB and all the drivers until the 2.2 release and the drivers update in Nov.2012.

    Basically the “fire-and-forget” behavior (nb: even this description is not accurate; a better one would be “load-and-forget”) has been the default for almost 3 years. With the 2.2 release and the corresponding drivers update it was changed to “Receipt acknowledged”. But new default acknowledges that the data was received on the server, but not that it was written anywhere. If you want your data to exist on multiple machines or be on a disk you need to use different settings.

  2. Using getLastError slows down write operations.

    JR: “GetLastError is underlying command in the MongoDB protocol that is used to implement write concerns. Intuitively, waiting for an operation to complete on the server is slower than not waiting for it.”

    The problem here is not that the operation slows down because it has to wait for the acknowledgement. The real problem is that getLastError requires an extra network roundtrip. Not to mention that your old code was probably polluted by all these extra calls.

  3. getLastError doesn’t work pipelined

    JR: “[…] For many bulk loads, performing multiple inserts with periodic checks of getLastError is the right choice. […]”

    Read the above.

  4. getLastError doesn’t work multi-threaded

    JR: ” Threads do not see getLastError responses for other thread’s operations. MongoDB’s getLastError command applies to the previous operation on a connection to the database, and not simply whatever random operation was performed last. […]”

    If I’m reading this correctly, it seems like Sirer’ hypothesis was that connections can be shared acrossed threads. I have to agree that many drivers do not provide thread-safe connections. So, linking getLastError behavior to the current connection seems OK.

  5. Write Concerns are broken

    JR: “As described in the above sections, WriteConcerns provide a flexible toolset for controlling the durability of write operations applied to the database. You can choose the level of durability you want for individual operations, balanced with the performance of those operations. With the power to specify exactly what you want comes the responsibility to understand exactly what it is you want out of the database.”

    Let’s look at what Sirer wrote:

    […] one could use WriteConcern.SAFE, FSYNC_SAFE or REPLICAS_SAFE for the insert operation [2]. There are 13 different concern levels, 8 of which seem to be distinct and presumably the remaining 5 are just kind of there in case you mistype one of the other ones. WriteConcern is at least well-named: it corresponds to “how concerned would you be if we lost your data?” and the potential answers are “not at all!”, “be my guest”, and “well, look like you made an effort, but it’s ok if you drop it.” Specifically, that’s three different kinds of SAFE, but none of them give you what you want: (1) SAFE means acknowledged by one replica but not written to disk, so a node failure can obliterate that data, (2) FSYNC_SAFE means written to a single disk, so a single disk crash can obliterate that data, and (3) REPLICAS_SAFE means it has been written to two replicas, but there is no guarantee that you will be able to retrieve it later.

    If you want a different explanation: even if there are 13 different WriteConcern types available, there is none that offers the option of having the data written to disk on more replicas.

The period is over and in my mind the score is clearly 4-1. But I know that this is not a real game and there will be some concluding that I’m not getting it. I’m OK living with that though.

Original title and link: 10gen: MongoDB’s Fault Tolerance Is Not Broken… (NoSQL database©myNoSQL)