


nosql databases: All content tagged as nosql databases in NoSQL databases and polyglot persistence

Are SQL Databases Dead?

A couple of interesting thoughts about the NoSQL market from a long-time Oracle and MySQL DBA, Sean Hull:

There’s a lot to be excited about in this new realm of db, and some interesting bigger trends that are pushing things in a new way.

You don’t need to agree with all of these. Nor do you have to consider that they are all true. But it’s almost always useful to understand how your audience is seeing things.

✚ This reminded me of the series of posts another long-time DBA wrote after he made the jump: part 1, part 2, and part 3.

Original title and link: Are SQL Databases Dead? (NoSQL database©myNoSQL)


Do all roads lead back to SQL? Some might and some might not

Seth Proctor for Dr.Dobb’s:

Increasingly, NewSQL systems are showing scale, schema flexibility, and ease of use. Interestingly, many NoSQL and analytic systems are now putting limited transactional support or richer query languages into their roadmaps in a move to fill in the gaps around ACID and declarative programming. What that means for the evolution of these systems is yet to be seen, but clearly, the appeal of Codd’s model is as strong as ever 43 years later.

Spend a bit of time reading (really reading) the above paragraph—there are quite a few different concepts put together to make the point of the article.

SQL is indeed getting closer to the NoSQL databases, but mostly to Hadoop. I still stand by my thoughts in The premature return to SQL.

Most NoSQL databases already offer some limited ACID guarantees. And some flavors of transactions are supported or are being added. But only as long as the core principles can still be guaranteed or the trade-offs are made obvious and offered as clear choices to application developers.

The relational model stays with the relational databases. If some of its principles can be applied (e.g. data type integrity, optional schema enforcement), I see nothing wrong with supporting them. Good technical solutions know both what is needed and what is possible.

Original title and link: Do All Roads Lead Back to SQL? | Dr Dobb’s (NoSQL database©myNoSQL)


NoSQL Shouldn’t Mean NoDBA

Nick Heudecker (Gartner):

The results were largely what I expected, except for the respondent profile. Database administrators (DBAs) appear to be significantly underrepresented in the NoSQL space, representing only 5.5% of respondents.

The question here is: why is this happening? Keeping in mind the survey’s audience is “NoSQL adopters”, I’m wondering what combination of the following explains the results:

  1. DBAs see no value in NoSQL
  2. DBAs see no job security
  3. DBAs see a drop in their revenue with NoSQL
  4. DBAs are misinformed
  5. DBAs are change-resistant (putting them in the later phases of adoption)

I’d go with a combination of 5 (explained mostly by 3) and 4.

Original title and link: NoSQL Shouldn’t Mean NoDBA (NoSQL database©myNoSQL)


The path of disruption: The dirty truth about big data and NoSQL

Andrew C. Oliver:

The dirty secret is that big data and NoSQL vendors aren’t just targeting gigantic, consumer-facing companies like Facebook or Google. The technology applies much more broadly, and as the supply of high-concurrency, low-cost, flexible data storage increases, so will demand. If you can hoard all that data cheaply, why not mine it cheaply as well and compete with the big names?

It’s called the path of disruption.

The best and shortest explanation can be found in Ben Thompson’s “Chromebooks and the Cost of Complexity”:

The key thing to notice is that products improve more rapidly than consumer needs expand. This means that while the incumbent product may have once been subpar, over time it becomes “too good” for most customers, offering features they don’t need yet charging for them anyways. Meanwhile, the new entrant has an inferior product, but at a much lower price, and as its product improves — again, more rapidly than consumer needs — it begins to peel away customers from the incumbent by virtue of its lower price. Eventually it becomes good enough for nearly all of the consumers, leaving the incumbent high and dry.


Original title and link: The path of disruption: The dirty truth about big data and NoSQL (NoSQL database©myNoSQL)


NoSQL? No Thanks

Steve Jones in “NoSQL? No Thanks”:

There continues to be a disproportionate amount of hype around ‘NoSQL’ data stores. By disproportionate I mean ‘completely and utterly out of scale with the actual problems of the vast majority of companies’.

Can you imagine how many such posts I’ve read since starting this blog? Sometimes I think that running a “Bashing NoSQL” blog could be a good business and help me fund this one too.

While there’s usually some truth behind every complaint about NoSQL, generalizations lead to less useful conclusions. For Steve Jones’s post, I’ll leave aside the clear example of an unnecessary generalization, which largely voids the points of the post, and try to focus on the rest.

The author suggests that the actual problems faced by “the vast majority of companies” are related to data interactions for traditional reporting, complex analytics, and embedding in applications. The hypothesis is that NoSQL databases and the Hadoop toolkit are making these problems worse.

Put this way, it sounds right, doesn’t it?

Before cars were invented, we had horses and carriages and dirt roads or no roads. Once the car was invented, people couldn’t ride their horses anymore, they couldn’t initially carry as much merchandise as they did with carriages, and on top of all this they had to redo the whole infrastructure. Cars made everything worse.

Both of these arguments miss the root cause. The whole rationale behind NoSQL databases and Hadoop is that the existing solutions were prohibitively expensive for the current requirements, or they couldn’t handle the volume, velocity, variety, and variability of the data in this age.

There are no data interactions without data. Little data means less useful reports and inaccurate or expensive data analysis.

Saying “no thanks to NoSQL and Hadoop” is implicitly saying no to the future of your business.

Last, but not least:

business users couldn’t care less what developers use as long as they deliver.

What’s the industry where technology doesn’t make the difference? If there is one, how long will it last?

Original title and link: NoSQL? No Thanks (NoSQL database©myNoSQL)

NoSQL and RDBMS - The right and left brain of data

One of the best explanations for polyglot database architectures:

Comparing NoSQL and relational databases is a lot like comparing the left and right sides of the brain. Too much focus on structural differences and attributes can overshadow the fact that we’re stuck with both sides of the brain and we need both to make the best use of sensory data.

Original title and link: NoSQL and RDBMS - The right and left brain of data (NoSQL database©myNoSQL)


Why NoSQL Can Be Safer than an RDBMS

Robin Schumacher[1]:

That said, I disagree with many of the article’s statements, the most important being that companies should not consider NoSQL databases as a first choice for critical data. In this article, I’ll show first how a NoSQL database like Cassandra is indeed being used today as a primary datastore for key data and, second, that Cassandra can actually end up being safer than an RDBMS for important information.

You already know how this goes: “First they ignore you, then they laugh at you, then they fight you, then you win”. I’ll let you decide where major NoSQL databases are today.

  1. Robin Schumacher is VP of Products at DataStax. He’s also my boss.

Original title and link: Why NoSQL Can Be Safer than an RDBMS (NoSQL database©myNoSQL)


PostgreSQL and the NoSQL world

As I linked earlier today to the MemSQL and JSON story, I’ve thought again about PostgreSQL and its community approach of bringing in new features. It’s hard to miss what they are doing. And I think they are doing it right.

The PostgreSQL community looks outside the box and listens. What features are users of NoSQL databases most excited about? Can we offer native support for them? Can we integrate with these other tools? These are the right questions to ask when considering expanding outside your space to help your users.

PostgreSQL 9.2 introduced a JSON data type and JSON functions and operators. Here’s what I wrote when linking to a post about the augmented support for the JSON data type.

PostgreSQL can also embed a JavaScript engine (V8) through the PL/v8 extension, bringing even more power to the JSON data type. And there’s support for foreign data wrappers, a feature for querying external data sources from PostgreSQL; version 9.3 made them writable.

While it might sound easy to watch what others are doing and then do it yourself (this is probably well-known as the Microsoft strategy), the reality is that there’s a lot of complexity in following this strategy. Besides asking the right questions when picking what features to bring in, there are always the technical and design decisions:

  1. can we actually support this?
  2. can we support it in a way that won’t break or negatively impact existing features?
  3. how should we expose these “imported” features so we make them appealing to existing users (with their vision of the product), while keeping them attractive and familiar to new users?

The last question is the hardest one to answer well.

✚ Here’s also a post I’ve linked to showing how to use PostgreSQL as a schemaless database.

Original title and link: PostgreSQL and the NoSQL world (NoSQL database©myNoSQL)

RavenDB 2.5 with Dynamic Aggregation and Query Streaming

Jan Stenberg summarizes on InfoQ the latest RavenDB release:

A stable version 2.5 of the document database RavenDB has been released with dynamic aggregation allowing for complex queries and an Unbounded results API using query streaming to retrieve large result sets in a single request.

While the Hadoop space is lately about SQL and speed, the NoSQL databases are starting to look into an area where users have high expectations: advanced queries over large amounts of data. If you remember the early days, pretty much everything was about key-based access and then map-reduce data sifting. Today we have many different query languages or data processing frameworks. And there’s still a lot to come.
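
For anyone who missed those early days, here is roughly what "key-based access and then map-reduce" looked like. This is a minimal sketch in plain Python, not RavenDB's API or any other product's. The in-memory `store`, the `get` and `map_reduce` helpers, and the sample order documents are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical in-memory "document store": key -> document.
store = {
    "orders/1": {"customer": "alice", "total": 30},
    "orders/2": {"customer": "bob", "total": 15},
    "orders/3": {"customer": "alice", "total": 20},
}


def get(key):
    """Key-based access: fetch one document if you already know its key."""
    return store.get(key)


def map_reduce(documents, map_fn, reduce_fn):
    """Apply map_fn to every document, group the emitted (key, value)
    pairs by key, then collapse each group with reduce_fn."""
    groups = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def emit_customer_total(doc):
    # Map step: one (customer, total) pair per order.
    yield doc["customer"], doc["total"]


def sum_values(key, values):
    # Reduce step: add up all totals emitted for one customer.
    return sum(values)


totals = map_reduce(store.values(), emit_customer_total, sum_values)
print(totals)
```

Anything beyond "give me the document with this key" meant writing a map function and a reduce function by hand, which is exactly the gap that richer query features aim to close.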

Original title and link: RavenDB 2.5 with Dynamic Aggregation and Query Streaming (NoSQL database©myNoSQL)


When NoSQL Databases Are Good for You and Your Company

One of those NoSQL popularization articles published by big media sites. If you haven’t read one before, this one isn’t too bad (but, like most such attempts, it makes too many generalizations and uses too many buzzwords).

Original title and link: When NoSQL Databases Are Good for You and Your Company (NoSQL database©myNoSQL)


Big Data: 3 Questions to Ask When Comparing Relational to NoSQL Databases

Dave Beulke[1]:

  1. What features justify using NoSQL over a relational database?
    Hopefully, the answer is not buzz words or acronyms and is something truly in the NoSQL databases;

  2. What type of Big Data transaction consistency is required?
    Ask if the big data analysis in the NoSQL database needs to be repeatable and reliable.

  3. Do you realize that a Big Data application using an open source NoSQL database is going to cost more than a relational database?

I think the first question is great, the second could be a great question if databases supporting repeatable and re-creatable results[2] actually existed, but the third one convinced me someone has something to sell.

  1. Dave Beulke is an internationally recognized DB2 consultant, DB2 trainer and education instructor. 

  2. Dave Beulke refers to ACID guarantees, but those are not actually providing repeatable and re-creatable results in the true sense. I think the Datomic model is the closest to offering this behavior.
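
To make "repeatable and re-creatable" a bit more concrete: the idea is that values are never overwritten, so a read pinned to a past transaction returns the same answer no matter what is written later. Below is a toy sketch of that idea in plain Python. It is not Datomic's API, and the `VersionedStore` class, its `put`/`get` methods, and the `as_of` parameter are all hypothetical names chosen for illustration.

```python
import bisect


class VersionedStore:
    """Toy append-only store: every write is kept, tagged with a
    monotonically increasing transaction id."""

    def __init__(self):
        self._tx = 0
        # key -> (sorted list of transaction ids, list of values)
        self._history = {}

    def put(self, key, value):
        """Append a new value for key and return its transaction id."""
        self._tx += 1
        txs, values = self._history.setdefault(key, ([], []))
        txs.append(self._tx)
        values.append(value)
        return self._tx

    def get(self, key, as_of=None):
        """Return the value of key as of a transaction id
        (the latest transaction when as_of is omitted)."""
        txs, values = self._history.get(key, ([], []))
        if as_of is None:
            as_of = self._tx
        # Index of the last write with a transaction id <= as_of.
        idx = bisect.bisect_right(txs, as_of)
        return values[idx - 1] if idx else None


db = VersionedStore()
t1 = db.put("price", 10)
t2 = db.put("price", 12)
print(db.get("price", as_of=t1), db.get("price"))
```

A report computed with `as_of=t1` can be re-run at any later time and see the same data, which is a different property from what ACID transactions alone give you.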

Original title and link: Big Data: 3 Questions to Ask When Comparing Relational to NoSQL Databases (NoSQL database©myNoSQL)


NoSQL and Big Data Money News


  1. Cloudant has received an undisclosed investment from Samsung Ventures

  2. Think Big Analytics, a Big Data consulting company, raised $3 million from former Cisco executive Dan Scheinman and WI Harper Group

Hortonworks announces Certification Program for Apache Hadoop

Hortonworks’ New Certification Program Enables the Next Generation Data Architecture with Apache Hadoop:

[…]today announced the launch of the Hortonworks Certified Technology Program, designed to help customers choose leading enterprise software that has been tested to integrate with Hortonworks Data Platform (HDP), the only 100-percent open source Apache Hadoop distribution. By certifying technologies, Hortonworks is taking the risk out of the technology selection, thereby accelerating and simplifying customers’ big data projects. The Program strengthens and expands the Apache Hadoop ecosystem, while helping to increase the enterprise capabilities of Apache Hadoop.

I assume the model here is that vendors pay Hortonworks for this certification and they can use the Hortonworks stamp when talking to customers.

DataStax’s Next Great Data Developer Contest

Two scholarships of up to $10,000 each for computer science students from North America enrolled in a Bachelor’s or Master’s program. Announcement here and blog post here.

Last, but not necessarily money-related:

MySQL 5.6 Released

I’m still reading about what’s new in MySQL 5.6, but what caught my eye while skimming over the docs is support for online DDL.

Original title and link: NoSQL and Big Data Money News (NoSQL database©myNoSQL)