To SQL or to NoSQL?

Bob Lambert’s thoughts on a post about migrating from a NoSQL database back to a relational database:

I really liked this post, but particularly for these two points:

  • All data is relational, but NoSQL is useful because sometimes it isn’t practical to treat it as such for volume/complexity reasons.
  • In the comments, Jonathon Fisher remarked that NoSQL is really old technology, not new. (Of course you have to like any commenter that uses the word “defenestrated”.)

I have lost count of how many times I’ve read exactly these arguments. But let’s take a different look at this post and the original article:

  1. the first thing that strikes me is that there’s no mention of which NoSQL database was used, nor any explanation of what led to choosing it in the first place. What if it was just an experiment? What if the initial implementation was just a fashionable decision?

  2. all data is relational

    A more accurate statement would be “all data is connected”. These connections can be represented in many different forms, and the right form often depends on how we use the data. This is exactly the core principle of data modeling in the NoSQL world too.

    The relational model is the most common one as a consequence of the popularity of relational databases. One quick example of data that is connected but not relational is hierarchical data, an area where relational databases still do not excel (even if some have built custom solutions).

  3. “data in a relational model is optimized for the set of all possible operations”.

    Actually, the relational model optimizes for space efficiency and set operations. Nothing optimizes for everything. Graph data and traversal operations are an obvious counterexample: relations and operations that are outside the capabilities of a relational database. And there are quite a few other examples: massive sparse matrices, etc.

  4. “Todd Homa recounts one horror story that shows how NoSQL data modelers must be aware of the corners into which they paint themselves as they optimize one access path at the expense of others.”

    This is like saying that everyone using a relational database has given up any chance of growing their application beyond a single server. Both statements are quite inaccurate.
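
To make points 2 and 3 a bit more concrete, here is a minimal Python sketch of the same hierarchy represented in two ways: as flat rows with a parent reference (roughly what a relational table would hold) and as a single nested document (roughly what a document store might hold). The category tree, the ids, and all function names are hypothetical and exist only for illustration; this is not taken from the original article or from any particular database.

```python
# Hypothetical category tree; names and ids are illustrative only.

# Representation 1: flat rows with a parent reference (adjacency list),
# the shape a relational table would typically hold.
ROWS = [
    {"id": 1, "parent_id": None, "name": "root"},
    {"id": 2, "parent_id": 1, "name": "books"},
    {"id": 3, "parent_id": 1, "name": "music"},
    {"id": 4, "parent_id": 2, "name": "fiction"},
    {"id": 5, "parent_id": 4, "name": "sci-fi"},
]

# Representation 2: one nested document, the shape a document store might hold.
DOC = {
    "id": 1, "name": "root", "children": [
        {"id": 2, "name": "books", "children": [
            {"id": 4, "name": "fiction", "children": [
                {"id": 5, "name": "sci-fi", "children": []},
            ]},
        ]},
        {"id": 3, "name": "music", "children": []},
    ],
}


def subtree_from_rows(rows, root_id):
    """Collect descendant names of root_id; one scan over rows per tree level."""
    names, frontier = [], [root_id]
    while frontier:
        children = [r for r in rows if r["parent_id"] in frontier]
        names.extend(r["name"] for r in children)
        frontier = [r["id"] for r in children]
    return sorted(names)


def subtree_from_doc(node):
    """Collect descendant names of an already-located node by walking children."""
    names = []
    for child in node["children"]:
        names.append(child["name"])
        names.extend(subtree_from_doc(child))
    return sorted(names)


def find_in_rows(rows, node_id):
    """Point lookup by id: a single filter, no knowledge of the hierarchy."""
    return next((r["name"] for r in rows if r["id"] == node_id), None)


def find_in_doc(node, node_id):
    """Point lookup by id: has to search down from the root."""
    if node["id"] == node_id:
        return node["name"]
    for child in node["children"]:
        found = find_in_doc(child, node_id)
        if found is not None:
            return found
    return None
```

Both representations hold the same connections and answer the same questions. What differs is which access path is cheap: the nested form hands you a subtree directly once you hold its node, while the flat form makes lookups by id trivial and pays for subtree reads with repeated scans.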

Last, I think we should change the “choose the right tool for the job” advice to something a bit clearer: “understand and choose the trade-offs that correspond to your requirements”. It doesn’t sound as nice, but I think it’s better.

