document database: All content tagged as document database in NoSQL databases and polyglot persistence

Enterprise-class NoSQL

What is distinctive about an enterprise-class NoSQL database is its support for additional enterprise-scale application requirements, namely: ACID (atomic, consistent, isolated, and durable) transactions, government-grade security and elasticity, as well as automatic failover.

What is distinctive about an enterprise-class NoSQL database is what my company is selling.

If that were true, I doubt we would have any other databases around, considering MarkLogic’s age and perfect fit.

Snarky comments aside, enterprise requirements are so complicated, numerous, political, and sometimes non-technical that I don’t think anyone will ever be able to come up with a definition, or even an extremely long checklist, of what enterprise-grade means.

Original title and link: Enterprise-class NoSQL (NoSQL database©myNoSQL)


New PostgreSQL guns for NoSQL market

Joab Jackson (PCWorld):

Embracing the widely used JSON data-exchange format, the new version of the PostgreSQL open-source database takes aim at the growing NoSQL market of nonrelational data stores, notably the popular MongoDB.

I’ve always appreciated the openness of the PostgreSQL developers to consider new features and their efforts to bring these to a relational database. What’s missing from the picture is how many users are actually using these features.



Monitoring CouchDB with Munin

A long, but extremely useful list of metrics to get from CouchDB:

The most of monitoring systems plugins for CouchDB are unable to handle all the described cases since they are trying to work with just /_stats resource - it’s good, but, as you may noted, not enough to see full picture of your CouchDB.

However, at least for Munin there is one that’s going to handle all this post recommendations.



A map to reviewing RavenDB code base

Ayende answers a series of questions about specific details of the RavenDB code base. This could be a very good starting point for those interested in how RavenDB is implemented.



MMS and the state of backups in MongoDB land

So just to be clear, if you are doing it yourself, you are probably settling for something other than a consistent snapshot. Even then, it’s not simple.

I’m always fascinated by companies introducing products by calling out how shitty and complicated their other products are. Axion. Now cleans 10 times better than before.



What versions of Erlang should you use with CouchDB

Russell Branca goes through a list of Erlang versions to identify those that are safe to use with CouchDB:

There has been some discussion on what versions of Erlang CouchDB should support, and what versions of Erlang are detrimental to use. Sadly there were some pretty substantial problems in the R15 line and even parts of R16 that are landmines for CouchDB. This post will describe the current state of things and make some potential recommendations on approach.

Very useful.



Choice of NoSQL databases from Cloudera

Adam Fowler [1] looks at the potential confusion for Cloudera’s customers when talking about NoSQL databases:

As for Cloudera customers I’m not too sure. It may confuse people asking Cloudera about NoSQL. Below is a potential conversation that, as a sales engineer for NoSQL vendor MarkLogic, I can see easily happening:

This announcement struck me as being too publicized. It’s normal for companies with similar interests to partner, but a fair amount of care should be put into clearing up any possible confusion, and I don’t think this happened.

Just to summarize: Cloudera provides support for HBase and Accumulo, and it has deals with MongoDB and Oracle. I assume that in the sales process Cloudera will go with: “we work with whatever you already have in place”. As for recommending a NoSQL solution to their customers, it will probably go as in Adam Fowler’s post, to which we could probably add Oracle too.

  1. Adam Fowler works for MarkLogic. 



An intro to bulk updates in MongoDB

The MongoHQ guys introduce the new bulk update API in MongoDB:

What is the most interesting in this new functionality is how MongoDB has implemented one common, fluent API across all of the MongoDB drivers. Apart from some language-centric casing and variations in how the results and errors are handled, the consistency of the API implementations are remarkably high.

  1. I didn’t know MongoDB didn’t support bulk operations. Many other NoSQL solutions (e.g. Cassandra, RethinkDB, CouchDB) have supported them for much longer, since their early versions.
  2. The API is what you’d actually expect.



RW locks are hard

Mark Callaghan continues his research and benchmarking of MongoDB, TokuMX, and InnoDB. This post focuses on the impact of locks in MongoDB and the different solutions that were implemented over time in InnoDB. Fantastic read.

MongoDB and TokuMX saturated at a lower QPS rate then MySQL when running read-only workloads on a cached database with high concurrency. Many of the stalls were on the per-database RW-lock and I was curious about the benefit from removing that lock. I hacked MongoDB to not use the RW-lock per query (not safe for production) and repeated the test. I got less than 5% more QPS at 32 concurrent clients. I expected more, looked at performance with PMP and quickly realized there were several other sources of mutex contention that are largely hidden by contention on the per-database RW-lock. So this problem won’t be easy to fix but I think it can be fixed.
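For readers who have not met the primitive being discussed, here is a minimal sketch of a readers-writer lock in Python. It is not MongoDB’s or InnoDB’s implementation; the class and method names are made up for illustration, and it makes no fairness guarantees. Note that even readers must briefly take the internal mutex to update the reader count, which is one general reason such locks can become a point of contention in read-only workloads with many concurrent clients.

```python
import threading


class ReadWriteLock:
    """Illustrative RW lock: many concurrent readers or one exclusive writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0      # threads currently holding the lock for reading
        self._writer = False   # True while a writer holds the lock

    def acquire_read(self):
        with self._cond:
            # Readers only wait for an active writer, never for other readers.
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                # The last reader out wakes any waiting writer.
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            # A writer needs exclusive access: no writer and no readers.
            while self._writer or self._readers > 0:
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()
```

A query path would wrap its work in `acquire_read()` / `release_read()`, while an update would use the write pair; a real implementation would also have to deal with writer starvation, which this sketch ignores.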



5 Myths about NoSQL vs Relational Databases

Ryan Betts, the CTO of VoltDB, addresses an article by MongoDB’s CEO Max Schireson that seems to have struck a chord:

Recently Max Schireson, CEO of MongoDB, shared his thoughts on relational databases. His statements deserve a direct and frank opposing response. Let’s walk through the myths that Mr. Schireson promoted.

Compared with “Why the clock is ticking for MongoDB” by PostgreSQL’s Robert Haas, this one makes some debatable arguments, e.g. “All popular SQL systems support document types”: aside from SOA committees and MarkLogic, I’ve never heard of anyone enjoying XML. The arguments aren’t inaccurate, but they’re painting VoltDB’s space in too bright a color palette.



MongoDB is growing up

If Curt Monash says so…

With that caveat, the MongoDB rewrite story is something like:

  • Updating has been reworked. Most of the benefits are coming later.
  • Query optimization and execution have been reworked. Most of the benefits are coming later, except that …
  • … you can now directly filter on multiple indexes in one query; previously you could only simulate doing that by pre-building a compound index.
  • One of those future benefits is more index types, for example R-trees or inverted lists.
  • Concurrency improvements are down the road.
  • So are rewrites of the storage layer, including the introduction of compression.



maxTimeMS in MongoDB 2.6

Jason McCay (MongoHQ) explains the new maxTimeMS API in MongoDB 2.6:

There are a number of scenarios where a flag like this can be helpful. For example, if you are in discovery mode and want to protect your database performance against unintended runaway operations, you could ensure all your queries include this flag.

Another scenario would be the batching of results, allowing you to define the amount of time/effort the database should spend returning results until it quits and moves on to the next request. In this situation, the cursor would continue to return results until the allotted amount of time has expired.
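As a rough illustration of the first scenario, here is a small Python sketch that applies a time limit to every query it runs. The helper name `find_with_time_limit` is hypothetical; the only driver feature it relies on is the cursor method `max_time_ms()`, which PyMongo exposes for the `maxTimeMS` option. The `collection` argument is assumed to behave like a PyMongo collection.

```python
def find_with_time_limit(collection, query, limit_ms):
    """Return matching documents, asking the server to give up after limit_ms.

    `collection` is assumed to behave like a PyMongo collection: find()
    returns a cursor, and the cursor's max_time_ms() sets maxTimeMS.
    """
    cursor = collection.find(query).max_time_ms(limit_ms)
    # Materialize the results; the time limit applies to the server-side work.
    return list(cursor)
```

With a real PyMongo collection, a query that exceeds the limit raises `pymongo.errors.ExecutionTimeout`, so callers in “discovery mode” would typically catch that exception and report the runaway query instead of letting it hold up the database.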
