MongoDB: The Size of the Document and Why it Matters

Kyle Banker (@Hwaet) explains some of the possible implications of using very large documents in MongoDB:

  1. If you’re doing a full-document, replace-style update, that entire 500k needs to be serialized and sent across the wire. This could get expensive on an update-heavy deployment.
  2. Same goes for queries. If you’re pulling back 500k at a time, that has to go across the network and be deserialized on the driver side.
  3. While most atomic updates happen in-place, the document usually has to be rewritten in-place on the server, as this is dictated by the BSON format. If you’re doing lots of $push operations on a very large document, that document will have to be rewritten server-side, which, again, on a heavy deployment, could get expensive.
  4. If an inner-document is frequently manipulated on its own, it can be less computationally expensive both client-side and server-side simply to store that “many” relationship in its own collection. It’s also frequently easier to manipulate the “many” side of a relationship when it’s in its own collection.
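To get a feel for the first two points, here is a minimal sketch comparing how many bytes the client would send for a replace-style update versus a targeted `$push`. It makes a few assumptions: the `post` document and its comments are made up, and JSON from the standard library stands in for BSON, so the numbers are only a rough proxy for real wire sizes.

```python
import json


def wire_size(payload):
    """Approximate bytes on the wire; JSON is only a stand-in for BSON here."""
    return len(json.dumps(payload).encode("utf-8"))


# Hypothetical blog post with many embedded comments (roughly 500 KB in total).
post = {
    "_id": 1,
    "title": "Document size",
    "comments": [{"author": f"user{i}", "text": "x" * 480} for i in range(1000)],
}
new_comment = {"author": "kyle", "text": "nice post"}

# Replace-style update: the client sends the whole modified document.
replacement = dict(post, comments=post["comments"] + [new_comment])
replace_bytes = wire_size(replacement)

# Targeted update: the client sends only a filter and a $push operator document.
push_bytes = wire_size({"_id": 1}) + wire_size({"$push": {"comments": new_comment}})

print(f"replace-style update: ~{replace_bytes} bytes")
print(f"$push update:         ~{push_bytes} bytes")
```

The targeted update only shrinks what crosses the network. As point 3 notes, the server may still have to rewrite the large document, so that cost does not go away.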

If going embedded all the way works for your use case, then there’s probably no problem with it. But with these extra-large documents, and a heavy load, you may start to see consequences in terms of performance and/or manipulability.
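The trade-off between embedding and a separate collection might look roughly like this. It is an illustrative sketch only: plain Python dicts and lists stand in for documents and collections, and the field names (`post_id`, `comments`) and helper functions are hypothetical, not part of any MongoDB API.

```python
# Embedded: the "many" side lives inside the parent document.
embedded_post = {
    "_id": 1,
    "title": "Document size",
    "comments": [
        {"author": "ann", "text": "first"},
        {"author": "bob", "text": "second"},
    ],
}

# Referenced: each comment is its own small document pointing at its parent.
post = {"_id": 1, "title": "Document size"}
comments = [
    {"_id": 10, "post_id": 1, "author": "ann", "text": "first"},
    {"_id": 11, "post_id": 1, "author": "bob", "text": "second"},
]


def edit_embedded(doc, author, new_text):
    """Edit a comment inside the parent; the unit being changed is the whole parent."""
    for comment in doc["comments"]:
        if comment["author"] == author:
            comment["text"] = new_text
    return doc


def edit_referenced(collection, comment_id, new_text):
    """Edit a comment stored on its own; only that small document is changed."""
    for comment in collection:
        if comment["_id"] == comment_id:
            comment["text"] = new_text
            return comment
    return None


touched_embedded = edit_embedded(embedded_post, "bob", "edited")
touched_referenced = edit_referenced(comments, 11, "edited")

print(touched_embedded)    # the full parent document, comments and all
print(touched_referenced)  # just the one comment document
```

With small documents the difference is negligible, which is why embedding is often fine. It starts to matter when the parent grows large and the inner items are edited often.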

I’d say that these probably apply to most of the document databases out there.