


NoSQL debate: All content tagged as NoSQL debate in NoSQL databases and polyglot persistence

Addiction to Familiar Systems

Marco Arment on switching from familiar systems or programming languages to better ones:

The fear of making the “wrong” choice actually makes the familiar, mastered PHP more attractive. […] If you can get PHP programmers to agree that they need to stop using it, the first question that comes up is what to use instead, and they’re met with a barrage of difficult choices and wildly different opinions and recommendations.

The same problem plagues anyone interested in switching to Linux (Which distro? Which desktop environment? Which package manager?), and the paralysis of choice-overload usually leads people to abandon the choice and just stick with Windows or OS X. But the switching costs of choosing the “wrong” programming language for a project are much larger.

Once you master a programming language or system you start seeing the other options from a different perspective. It doesn’t mean you have a better or an objective perspective though. What you’ve got is a new dimension you are considering in all decisions: familiarity. Every other option you have will go through your familiarity filter: does it feel familiar? does it allow me to do what I’ve been doing all this time? does it work in a similar way?

You might think that using a familiar system is all about productivity. I think that is only partially true. A familiar system doesn’t come with a learning curve, so in the early stages it feels productive. But many times you’ll just have to write the same things over and over again, avoid the same traps, and make the tweaks you’ve learned. In a way, this part of being productive feels like repetition.

But what does all this have to do with databases? The answer is probably obvious.

Familiarity is, in so many cases, the main reason new systems start with a relational database. It feels familiar. It is familiar. As your application grows and new features are needed, there will be cases where the relational database becomes a less than optimal solution. But in the name of familiarity, you’ll be tempted to stick with it. Make a change here and there, declare a feature too complicated, tweak it, optimize it. Repeat.

After a while, taking a step back might make you realize that what you’ve built is no longer familiar. Or maybe it’s still familiar to you, but to a new project team member it will feel different and new. Or maybe it’s very similar to a different database that you could have started with.

The costs of sticking with familiar programming languages, systems, or databases could be much larger than you’d think.

Original title and link: Addiction to Familiar Systems (NoSQL database©myNoSQL)

Techmemed: NoSQL Is a Premature Optimization

This article by Bob Warfield was the first NoSQL post to be Techmemed. Main point:

In fact, I would argue that starting with NoSQL because you think you might someday have enough traffic and scale to warrant it is a premature optimization, and as such, should be avoided by smaller and even medium sized organizations.  You will have plenty of time to switch to NoSQL as and if it becomes helpful.  Until that time, NoSQL is an expensive distraction you don’t need.

Put differently: don’t use NoSQL for the wrong reasons. But isn’t this what NoSQL people have been saying for so long: don’t use a relational database for the wrong reasons?




FUD in NoSQL

Jay Janssen:

Over time SQL FUD has cropped up from all the NoSQL evangelists hammering on about how NoSQL solves all the traditional problems RDBMSs have. Do they have valid points? In a lot of cases, sure. But that doesn’t mean that NoSQL is without its own problems.

Can you name who’s spreading FUD about SQL or relational databases? And then who’s dismissive about NoSQL databases or polyglot persistence?

Let’s say for a moment that he’s right and there are indeed two camps fighting over the same market and spreading FUD about competitors. In the NoSQL camp we could count a bunch of open sourcers, a couple of startups, and possibly some consultants searching for their next gig. In the other camp, we have a few multi-billion-dollar corporations with dedicated departments for marketing, pre-sales, sales, and consulting (I’m leaving aside the enterprise IT departments). Which of these camps has more training, experience, and opportunities in spreading FUD?

Original title and link: FUD in NoSQL (NoSQL databases © myNoSQL)


Tuning SQL, No Need for NoSQL

The most common complaint against NoSQL is that if you know how to write good SQL queries then SQL works fine. If SQL is slow you can always tune it and make it faster.

Let’s keep in mind two things:

  1. performance is not equivalent to scalability
  2. most NoSQL databases have been created to deal with scalability issues. And they offer a different, non-relational, data model.

A NoSQLite might counter that this is what key-value database is for. All that data could have been retrieved in one get, no tuning, problem solved. The counter is then you lose all the benefits of a relational database and it can be shown that the original was fast enough and could be made very fast through a simple tuning process, so there is no reason to go NoSQL.

There’s a third thing too: you shouldn’t always obey the Law of the instrument. Polyglot persistence is a viable option.
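To make the two positions in this exchange concrete, here is a minimal sketch of the same read done both ways. Everything in it is an assumption for illustration: the `users`/`posts` schema, the index standing in for the "tuning" step, and a plain Python dict standing in for a key-value database. It says nothing about which approach is faster or scales better; it only shows what each side is actually proposing.

```python
import json
import sqlite3

# Relational side: two normalized tables, read back with a JOIN.
# The schema and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
    CREATE INDEX posts_by_user ON posts (user_id);  -- stands in for "tuning"
    INSERT INTO users VALUES (1, 'alice');
    INSERT INTO posts VALUES (10, 1, 'first'), (11, 1, 'second');
""")


def profile_via_sql(user_id):
    """Assemble a user profile from normalized rows (assumes >= 1 post)."""
    rows = conn.execute(
        "SELECT u.name, p.title FROM users u "
        "JOIN posts p ON p.user_id = u.id "
        "WHERE u.id = ? ORDER BY p.id",
        (user_id,),
    ).fetchall()
    return {"name": rows[0][0], "posts": [title for _, title in rows]}


# Key-value side: the same profile stored pre-joined under a single key.
# A dict stands in for the key-value database here.
kv_store = {
    "user:1": json.dumps({"name": "alice", "posts": ["first", "second"]}),
}


def profile_via_get(user_id):
    """Fetch the whole profile with one get; no JOIN, but no ad hoc queries."""
    return json.loads(kv_store[f"user:{user_id}"])
```

Both functions return the same profile for user 1. The difference is where the work happens: the relational version joins at read time and can still answer other questions about the same tables, while the key-value version pays at write time by keeping the pre-joined document up to date.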



Yet NoSQL Dogma

Given all this I did some benchmarks and as expected the NOSQL community was hurt […]

He is right about my comments not being clear. Let me try to clarify.

  1. If all you need is MySQL minus the relational model, using HandlerSocket for key-value access is OK.

  2. If you are ready to give up the relational model and turn your database into a key-value store, then the door is open to trying other data models: document databases, key-value stores, etc.

    The fact that MySQL can be used as a key-value store is a good thing, but not looking around for alternatives is dogmatic. Other storage solutions might actually offer better alternatives (performance, scalability, data model richness, development agility, etc.).

  3. If you need to build a distributed system, there are solutions out there that either maintain ACID characteristics (like VoltDB) or go fully scalable (like Cassandra, HBase, Riak, etc.).

So, what is my point? If your relational database works for you that’s perfect. If it doesn’t, you have alternatives. Just look around and pick something that solves the problem. We are in a polyglot persistence world. We have options. We provide solutions and move along.



NoSQL Speed Bumps

Dr. Dobb’s Ken North writes a history of NoSQL failures:

Organizations using the new NoSQL data stores are like the settlers who often traveled uncharted territory in the 19th-century American West. Some days involve the beauty of the journey; but on other days, you’re dodging arrows.

I’m not disputing that NoSQL databases are still young and that people using them are running into bugs or even misusing them. But aren’t both of these issues still present in the relational database world after 30 years?



When NoSQL makes better sense than MySQL

Tinniam V Ganesh:

However when there is a need to scale the application to be capable of handling millions of transactions the NoSQL model works better. There are several examples of such databases – the more reputed are Google’s BigTable, HBase, Amazon’s Dynamo, CouchDB & MongoDB.

BigTable is available only to Google’s employees, Amazon’s Dynamo only to Amazon employees, CouchDB is not scalable per se, and MongoDB’s scaling architecture is young and complicated. How about this list: Cassandra, HBase, Hypertable, Riak, Project Voldemort, Amazon SimpleDB, Hibari(?), MongoDB?



Developers: Key to NoSQL Adoption

In a recent ☞ CouchOne post, the author[1] wrote:

Developer adoption matters above all else in the early days of technological change. That is why all of these companies are placing big bets on training and documentation programs.

I have asked myself: why developer adoption? Was new technology adoption always targeting developers? And I think I have a couple of answers:

  • targeting developers is the easiest way to get feedback from people that will use your product. Keep in mind that business solutions rarely rely directly on technology. They usually rely on their technical teams which rely on technology, so most of the time your product’s users will be the developers, not the businesses
  • more importantly, developers will quickly come up with use cases and scenarios where your product fits (nb they will also complain a lot about what doesn’t work). You could then use these use cases/scenarios to build your pre-sales/sales pitches
  • last, but not least, developers will do a part of the marketing and PR for you. If they don’t, then your product might not be as useful as you thought.

On the other hand, if your product positioning is not very clear, betting your adoption on developers only might also be interpreted as a sign of weakness (as in “hey, they have no idea what their product should do”). Some will see your product as a vitamin instead of a painkiller. Not to mention that developers will say good things, but also a lot about what they don’t like and where your product sucks.

  1. I think the author is J. Chris Anderson (@jchris).


Why NoSQL … Why Not

Interesting article from Xeround’s Avi Kapuya: ☞ NoSQL: The Sequel. A couple of comments, though:


In other words, in SQL, the data model does not enforce a specific way to work with the data — it is built with an emphasis on data integrity, simplicity, data normalization and abstraction, which are all extremely important for large complex applications.

I’d say that data normalization is not a goal per se, but a solution to a problem (data duplication, frequent updates to common entities). But what if this solution is introducing another bigger problem (read JOINs)?

The NoSQL approach presents huge advantages over SQL databases because it allows one to scale an application to new levels

Plus it may give you more flexibility in your data model, and it may be a better storage option (in terms of operations, complexity, performance, etc.) for different formats of data.

Why not NoSQL

At the system level, data models are key. Not having a skilled authority to design a single, well-defined data model, regardless of the technology used, has its drawbacks.

Actually I think the reality might be a bit different. Because NoSQL imposes a “narrow predefined access pattern”, it will require one to spend more time understanding and organizing data. Secondly, the final model will reflect and be based on the reality of the application, and not only on pure theory (as is the case with most initial relational model designs).
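As a rough illustration of what a “narrow predefined access pattern” means for modeling, here is a small sketch. The `OrderStore` class, its method names, and the order/customer data are all hypothetical, and in-memory dicts stand in for a real key-value database. The point is only that every query the application needs has to be decided up front and given its own key layout.

```python
class OrderStore:
    """Toy in-memory key-value store (hypothetical, for illustration only).

    Assumes an order never moves to a different customer once written.
    """

    def __init__(self):
        self._orders = {}        # order id -> order record
        self._by_customer = {}   # customer -> list of order ids

    def put(self, order_id, customer, total):
        # Each write maintains every key layout the application reads from.
        if order_id not in self._orders:
            self._by_customer.setdefault(customer, []).append(order_id)
        self._orders[order_id] = {"customer": customer, "total": total}

    def get(self, order_id):
        # Access pattern 1: look up a single order by its id.
        return self._orders[order_id]

    def orders_for(self, customer):
        # Access pattern 2: list a customer's orders, in insertion order.
        ids = self._by_customer.get(customer, [])
        return [self._orders[i] for i in ids]
```

A question nobody planned for, such as “all orders above a given total”, has no key to serve it and would need either a full scan or a new layout plus a backfill. That is the extra up-front thinking about the data that the paragraph above refers to.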

At the architecture level, two major issues are interfaces and interoperability. Interfaces for the NoSQL data services are yet to be standardized.

The interface limitation is a temporary issue in terms of getting more/better/quicker tooling support and probably a longer term issue for developers needing to learn different models. But as we’ve agreed, NoSQL has a small, predefined access mode and so we are not talking about learning completely new languages.

Personally, I think the real issue is the steep learning curve of understanding each NoSQL database’s semantics and operational behavior, rather than the lack of a common API.

Interoperability is an important point, especially when data needs to be accessed by multiple services.

I’m not seeing the problem here. As far as I know, each relational database comes with its own per-language drivers. On the NoSQL side, there are already quite a few products using standard protocols.

Moving to the operational realm, here, from my experience, lies the toughest resistance, and rightfully so… The operational environment requires a set of tools that is not only scalable but also manageable and stable, be it on the cloud or on a fixed set of servers. […] Operation needs to be systematic and self contained.

Now, this is completely the other way around. If you read any large scale application story, you’ll notice the pattern: the operational costs were a significant factor in deciding to use NoSQL. Just check the stories of Twitter, Adobe, Adobe products, Facebook. Complexity is a fundamental dimension of scalability and right now the balance is towards NoSQL databases.

It is my opinion that a SQL database built on NoSQL foundations can provide the highest value to customers who wish to be both agile and efficient while they grow.

Unfortunately I don’t think that’s actually possible, or at least not for all solutions. But if we just want some common access language, we will probably get it.

If, on the other hand, what we want is more tunable and scenario-specific engines, we will probably get these too. (nb: as far as I’ve heard, the PostgreSQL community is learning a lot from the various NoSQL databases and trying to bring in as many of the good ideas as they can).


My conclusion is simple. As with programming languages where we are not stuck with COBOL, polyglot persistence is here to stay and it’ll only get better.


Distributed Database Systems

After an intro about large scale classical RDBMS setups, ☞ Will Fitch’s post started well:

What advantages and disadvantages will come with this new architecture [distributed database systems]? What hardware can I reuse efficiently with this new setup? What vendor do I choose to go with? What kind of code changes and culture shock will this introduce to the developers and DBAs?

But then it slowly turned into: how not to make a technical decision. There are two parts that make me think the decision was probably already made:

While there are a few distributed solutions out there: Hadoop, Cassandra, Hypertable, Amazon SimpleDB, etc., one stands out in my opinion – VoltDB.

You cannot say you’re making an informed decision when mixing a data processing framework with NoSQL databases and DaaS, plus you leave aside products like HBase or Riak or Membase.

And then it is this part that made me think the VoltDB pre-sales team had already done its job:

We’re used to writing code that connects to a database and executes a stored procedure that lives in the database and is written in SQL. Introducing this new architecture would completely change our environment. Stored procedures would likely be written in Java or another JIT language. The CRUD functionality would then execute that instead.

There’s nothing fundamentally wrong with having preferences, but technical decisions should be based on a good understanding of the evaluated products and a lot of experimentation and prototyping. It shouldn’t be the other way around.



SQL or NoSQL: Stop Being Religious

But lot of people asked me why I am part of Zynga database team when there is no MySQL being used […]

As a consultant, I help lot of other companies to scale using NoSQL systems apart from MySQL especially on large data handling; as the data store solution should help to scale the systems to yield the desired results; especially MySQL should be used for typical OLTP workloads and combination of MySQL and NoSQL or any other data warehouse clusters for analytics and/or OLAP workloads by combining with right application and caching components based on the business model and how the data is generated, stored, accessed and processed. ☞ Venu Anuganti

For the majority, technology is just a small part of reaching the business goals. As engineers, we should stop being religious about our technical fetishes and deliver value.

That doesn’t mean, though, that we cannot keep having our techy tea parties (read: with tons of beer) where we bash every other product.
