


NoSQL: All content tagged as NoSQL in NoSQL databases and polyglot persistence

NoSQL databases: 10 Things you Should Know About Them

5 pros and 5 cons by Guy Harrison:

Five advantages of NoSQL

  1. Elastic scaling
  2. Big data
  3. Goodbye DBAs (see you later?)
  4. Economics
  5. Flexible data models

Five challenges of NoSQL

  1. Maturity
  2. Support
  3. Analytics and business intelligence
  4. Administration
  5. Expertise

A few amendments though:

  • Elastic scaling: partially correct. Right now only a few NoSQL databases feature elastic scaling: Cassandra, HBase, Riak, Project Voldemort, Membase, and, just recently, CouchDB through BigCouch.

  • Big data: partially correct. Some of the NoSQL databases do not scale horizontally and so they are not a perfect fit for BigData.

  • Goodbye DBAs (see you later?): Maybe you won’t call them DBAs, but someone should still do data modeling and think about data access patterns.

  • Support and Expertise: I see these as just sub-categories of maturity (or the lack thereof).

Original title and link for this post: NoSQL databases: 10 Things you Should Know About Them (published on the NoSQL blog: myNoSQL)


The Future of MySQL

Narayan Newton:

However, things are most definitely changing. The trend for the last year has been major developments outside of MySQL AB, funded by everyone from Google to Percona to MontyProgram. In fact, the 5.4 release of MySQL is little more than a re-packaging of external patches. This is a far cry from an earlier MySQL AB driven development model. With Oracle’s purchase of Sun and by extension MySQL AB, this change has accelerated.

Sounds really sad for such an important piece that, as part of LAMP, has pushed the web forward. But I know at least 2 NoSQL databases that will use this opportunity in their favor.


20 Linux Monitoring Tools for SysAdmins

Not only NoSQL:

Need to monitor Linux server performance? Try these built-in commands and a few add-on tools. Most Linux distributions are equipped with tons of monitoring tools. These tools provide metrics which can be used to get information about system activities. You can use these tools to find the possible causes of a performance problem. The commands discussed below are some of the most basic commands when it comes to system analysis and debugging server issues such as:

  1. Finding out bottlenecks.
  2. Disk (storage) bottlenecks.
  3. CPU and memory bottlenecks.
  4. Network bottlenecks.
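
To give an idea of the kind of metric these tools report, here is a minimal sketch that parses text in the format of Linux’s /proc/meminfo and computes the share of memory in use. The function names and the SAMPLE values are made up for illustration; on a real server you would read the text from /proc/meminfo instead.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, _, rest = line.partition(":")
        fields = rest.split()
        # The first field after the colon is the numeric value.
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def memory_used_percent(info):
    """Percentage of memory not available to new workloads."""
    total = info["MemTotal"]
    available = info["MemAvailable"]
    return 100.0 * (total - available) / total


# Hypothetical sample, shaped like the first lines of /proc/meminfo.
SAMPLE = """MemTotal:       8000000 kB
MemFree:        1000000 kB
MemAvailable:   2000000 kB
"""

info = parse_meminfo(SAMPLE)
print(memory_used_percent(info))  # 75.0 for the sample above
```

A persistently high value here would point at a memory bottleneck rather than a disk or network one.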



Django and NoSQL Databases Latest Status Update

Recently, in Django and NoSQL Databases Revisited, I covered the coordinated efforts to make Django a NoSQL-friendly framework. Alex Gaynor, the main person behind this initiative, which has the support of the Django community, has ☞ published the final report of the GSoC project:

With this past week GSOC has officially come to its close, and I’m here to report on the status of the query-refactor. The original purpose of this branch was to do refactorings to the internals of the ORM, and produce a prototype backend for a non-relational database to demonstrate that this was a viable option. At this time far more work has gone into the latter half of the project: I have developed a fully functioning MongoDB backend that demonstrates the possibility of using the ORM, almost unmodified, on non-relational databases. However, some of the larger refactors that I was originally hoping to do have ultimately not happened; on the other hand, they are evidently not necessary for a functioning backend. At this time there are a number of outstanding tasks, such as porting the ListField to work on Postgres and completing the work on embedded documents.
However, the largest open question is what of this work should be merged into trunk, and what should live external. My recommendation would be for any changes in Django itself to be merged, including the new form fields, but for the MongoDB backend (and, indeed, any future backends) to live external to Django, until such a time as it obtains a user base anywhere approaching our current backends, as well as more individuals dedicated to maintaining it.

The guys over at ☞ All Buttons Pressed also commented on the outcome of this project:

The biggest design issue (in my opinion) is how to handle AutoField. In the GSoC branch, non-relational model code would always need a manually added NativeAutoField(primary_key=True) because many NoSQL DBs use string-based primary keys. As you can see in Django-nonrel, a NativeAutoField is unnecessary. The normal AutoField already works very well and it has the advantage that you can reuse existing Django apps unmodified and you don’t need a special NativeAutoField definition in your model. Hopefully this issue will get fixed before official NoSQL support is merged into trunk.


InformationWeek on MongoDB

Informed journalism:

In addition to MongoDB, systems such as CouchDB or Cassandra are in use on social networking and game sites, including Farmville, and online retailing, such as


Like other large cluster software, such as Hadoop, MongoDB generates two replicas of the original data set in the server cluster so that it can tolerate a hardware failure

  1. Journalism warning signs have been created by Tom Scott and can be found ☞ here.



Suggest a simple NoSQL database for java project

That’s not the way to find out whether to use a NoSQL database, or which one. Take a look at getting started with NoSQL to better understand how to ask the right question.



NoSQL Databases and Security

Jeff Darcy writes about NoSQL systems’ security (or rather the lack of it):

Most NoSQL stores have no concept of security. […] Mostly it falls into two categories: encryption and authentication/authorization (collectively “auth”). For encryption, there’s a further distinction to be made between on-the-wire and at-rest encryption.

As far as I know:

  • CouchDB supports authentication/authorization
  • Yahoo! recently contributed to Hadoop an authentication module based on Kerberos and SASL

What about the others?



NoSQL Databases and Data Warehousing

I didn’t know data warehousing strictly imposes a relational model:

From a philosophical standpoint, my largest problem with NoSQL databases is that they don’t respect relational theory. In short, they aren’t meant to deal with sets of data, but lists. Relational algebra was created to deal with the large sets of data and have them interact. Reporting and analytics rely on that.

I’d bet people building and using Hive, Pig, Flume and other data warehousing tools would disagree with Eric Hewitt.



Django and NoSQL Databases Revisited

Django decided a long time ago that Ruby on Rails cannot be the only framework where people can have fun integrating with all NoSQL databases. During this year’s DjangoCon Europe there were several sessions dedicated to Django and NoSQL databases:

What NoSQL support in the Django ORM looks like, and how do we get there

Alex Gaynor speaks about what needs to change in Django ORM to make it more NoSQL friendly:

Reinout van Rees has a summary of the talk ☞ here.

Using MongoDB in your app

Peter Bengtsson talks about his experience of moving to MongoDB after using ZODB for the last 10 years.

Some notes from the talk are available ☞ here.

Relax your project with CouchDB

Benoît Chesneau talks about what makes CouchDB appealing to Python developers. He also covers the CouchDBkit Python framework.

Django and Neo4j: Domain Modeling that Kicks Ass

Not coming from DjangoCon, but still about Django and Neo4j, is Tobias Ivarsson’s presentation: “Django and Neo4j - Domain modeling that kicks ass”:

Derek Stainer summarizes the slide deck ☞ here.

Django and NoSQL Panel

A fantastic panel on the future of Django and NoSQL databases that you can watch online. Reinout van Rees published a transcript of the panel ☞ here.

All in all a lot of NoSQL excitement in the Django world! Or should it be the opposite?

Update: Here is the latest Django and NoSQL Databases status update


VC Perspective on BigData and NoSQL Databases

Fantastic overview of the BigData and NoSQL databases market from a VC:

[…] Though many companies in the Fortune 1000 are starting to experiment with Hadoop, today only 10-20% of enterprises need big data solutions. This number could grow as high as 40-50% in 5 years.


Too many NoSQL database companies have already been created (Cloudera, 10gen, MongoDB, VoltDB, CouchDB, etc). While the user interest in such databases is increasing (many Fortune 1000 companies have started Hadoop evaluation projects), the market won’t be able to sustain them. I expect to see significant consolidation in the next 3-5 years.


For no reason apparent to me, NoSQL database companies are trying to reinvent the data warehousing and business intelligence infrastructures that have been created over the years.

Note also the fantastic BigData definition:

The data in these sets is at the terabyte or petabyte scale, it is semi-structured, highly distributed, and much of it is of unknown value so it must be processed quickly to identify the interesting parts to keep.



NoSQL Databases and CMS or ECM or DMS

Purists will probably say I’m generalizing a bit (as there are some differences between CMS, ECM, DMS, etc.), but more and more people in the market are starting to think that these systems would make good use of NoSQL databases:

DMS of the future will need to adopt NoSQL:

  • because new systems are built for the Internet: highly available documents are a required feature. Imagine a place where everybody can really write simultaneously, with no locks.
  • If you request a specific document, you will get it; there is no difference here from an RDBMS, and NoSQL is even more performant.
  • “Eventual consistency” will not really change anything: when you need a global view of the data (stats, for example), you will get it “consistent”.
  • Backup of documents can be done easily, and you will fall in love with replication.
  • Sharding is your friend for large distributed databases of documents.
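
The sharding point in the list above can be sketched in a few lines. This is only an illustration of hash-based placement, assuming documents are addressed by a string ID; the shard_for function and the document IDs are hypothetical and not taken from any particular NoSQL database.

```python
import hashlib
from collections import Counter


def shard_for(doc_id, num_shards):
    """Map a document ID to a shard index in [0, num_shards)."""
    # Use a stable hash so every client computes the same placement;
    # Python's built-in hash() is randomized per process for strings.
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


# Hypothetical document IDs spread over 4 shards.
doc_ids = [f"document-{n}" for n in range(1000)]
placement = Counter(shard_for(doc_id, 4) for doc_id in doc_ids)
print(sorted(placement.items()))
```

Plain modulo placement like this remaps most documents whenever the number of shards changes, which is why distributed stores often rely on consistent hashing or range-based partitioning instead.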

☞ Why ECM & e-Archiving Solutions Should Adopt NoSQL

Definitely not the first one talking about the future of CMS and NoSQL. The first system that comes to mind when I think of CMS and NoSQL is ☞ Lily:

Lily CMS architecture

Create Your Library of NoSQL Papers

Nice little “hack”:

Instead of navigating a bunch of web pages and downloading some PDFs, I decided to automate the process and write a tiny program to do it for me. […] This script very neatly downloads everything to the directory of your choosing. It also thoughtfully names the files with their difficulty rating as the first character so you can sort them ASCII-betically and make a halfway decent list to help you learn your way into NoSQL nerdery.
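
The naming trick is easy to reproduce. The sketch below is not the original script; it only shows the idea of prefixing each file name with a single-digit difficulty rating so that a plain sort orders the papers from easiest to hardest. The rated_filename helper, the titles and the ratings are all hypothetical, and the download step is left out.

```python
def rated_filename(rating, title):
    """Build a file name whose first character is the difficulty rating."""
    # Assumes a single-digit rating, so plain string sorting works.
    # Replace anything that is not a letter or digit to keep names portable.
    safe = "".join(c if c.isalnum() else "_" for c in title).strip("_")
    return f"{rating}_{safe}.pdf"


# Hypothetical (rating, title) pairs; a real script would scrape these.
papers = [
    (3, "Example Advanced Paper"),
    (1, "Example Intro Paper"),
    (2, "Example Intermediate Paper"),
]

names = sorted(rated_filename(rating, title) for rating, title in papers)
for name in names:
    print(name)
```

Sorting the resulting names as plain strings yields the reading order, easiest first.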