document database: All content tagged as document database in NoSQL databases and polyglot persistence

The State of CouchDB With Comments

With all the confusion surrounding the Couch[addsuffixhere] world and its lack of energy, my attention has slowly shifted away. Thus, it was only last night that I read Jan Lehnardt’s “State of CouchDB” post.

A couple of notes:

  • Jan Lehnardt has left Couchbase (as an employee) and plans to focus part of his time on Apache CouchDB again
  • he’s the first person directly related to CouchDB to publicly acknowledge the confusion created around CouchDB and the companies connected to it in one way or another. I was really, really tired of repeating this for the last 2 years.
  • “people who really get to know CouchDB are extremely passionate about it”: I read this as passive-aggressive. According to my data, fewer and fewer people were interested in CouchDB by the day.
  • there are plans for what’s coming next in CouchDB. The post gives a short list of 4 things, but the real ones are in the gist I’ve linked to earlier
    • BigCouch merging has been mentioned for so long that right now it feels like “waiting for the unicorn”
  • Jan Lehnardt mentions (and is excited about) the alphabet of _ouchDB projects. I’ll say it one more time: they’re probably cool, but long term they’ll perpetuate the confusion. Unfortunately there’s not much to be done now.
  • I’m glad to read a state of the union from a person who has been involved with CouchDB for so long. But in the world of open source, it’s only the facts that matter. Sometimes reviving a project or regaining users is more difficult than starting from scratch.
  • “We had a hard year, lost our traction, and we still came out on top.” Nope.

Original title and link: The State of CouchDB With Comments (NoSQL database©myNoSQL)

via: http://writing.jan.io/2013/02/01/the-state-of-couchdb.html


CouchDB Future Feature List

I’m saving the current list of CouchDB Future Features so I can check back on it in 6 months [1]:


  [1] I am not saying this as a mild form of “filing for future claim chowder”.


CouchDB In-Browser JavaScript Debugger

Interesting project, and if it works it could prove very useful, considering almost everything in CouchDB is JavaScript, at least from a developer’s perspective.

via: http://jhs.iriscouch.com/files/debugger/debug.html


NoSQL on MySQL: Stating the Obvious

Matthew Aslett on Couchbase’s and DataStax’s reactions to Oracle’s announcement of MySQL’s support for a NoSQL API:

Sure, Couchbase and DataStax laid it on a bit thick, but these are corporate blog posts – it goes with the territory.

I’ve already linked to and commented on these: Couchbase’s reaction and DataStax’s reaction. What I didn’t know (more accurately, I should probably write “I hoped”) is that this sort of reaction comes with the “corporate” badge. But I’ll keep my hope, considering the exhaustive list of reactions from other NoSQL companies.

via: http://blogs.the451group.com/information_management/2013/02/13/nosql-on-mysql-stating-the-obvious/


Reactions to MySQL 5.6: Couchbase

Bob Wiederhold (Couchbase CEO) about MySQL 5.6, their use of the NoSQL term, and the PR message touting the new version as the solution “combining the best of both worlds”:

What we see is a whole new wave of applications that have very different requirements than applications had just a few years ago. More often than not they are cloud-based, need to support a huge and dynamically changing number of users, need to store huge amounts of data, and need a highly flexible data model that allows them to adjust to rapidly changing data capture requirements and process lots of semi-structured and unstructured data. The fundamentally different architectural decisions embedded in NoSQL technologies – along with the easy scalability, consistently high performance, and flexible data model advantages (along with all the other tradeoffs) NoSQL provides – are turning out to be a better fit for an increasing number of these applications.

That doesn’t mean MySQL (or relational databases) will go away or won’t play a significant role in the database industry in the future.

Bob Wiederhold is also interested in how Oracle positions their products in terms of NoSQL:

As a side note it’s curious that the MySQL team seems out of step with other parts of Oracle. While the MySQL team seems to be convinced MySQL can do it all, Oracle’s NoSQL team seems to feel differently and is busily trying to catch up to NoSQL leaders like Couchbase, MongoDB, and Cassandra with their own NoSQL product. If relational technology is a one size fits all technology, why is Oracle itself making such a big investment in developing its own NoSQL product?

My supposition, expressed in the post MySQL 5.6 - What’s new, is that NoSQL is just a critical checkbox for the marketing and sales departments. Oracle NoSQL Database and its precursor Berkeley DB seem to live silently inside the giant.

via: http://blog.couchbase.com/why-mysql-56-no-real-threat-nosql


A Human-Readable Jackrabbit Persistence Manager Prototype for OrientDB

Jackrabbit still has a very special place in my heart. I’ve fought it many times, sometimes losing, most of the time winning. But after more than 7 years, it is still the main storage engine serving the content of InfoQ. So this OrientDB engine for Jackrabbit by Thomas Kratz caught my attention:

This has some limitations, as jackrabbit will still access only one node at a time, being able to traverse the graph at the storage level is simply not intended by the whole api. But it works, it’s readable, can be modified at the db level easily.

via: http://thomaskratz.blogspot.de/2013/01/a-human-readable-jackrabbit-persistence.html


NoSQL Bug Fix Releases: Redis 2.6.10 and RavenDB 2.01

The RavenDB team has released RavenDB 2.01, which is mostly a bug fix version. The change log is here.

Redis also has a new bug fix release, 2.6.10, which includes non-critical fixes and 5 small improvements. The change log is here.


Using Hadoop Pig With MongoDB

In this post, we’ll see how to install MongoDB support for Pig and we’ll illustrate it with an example where we join 2 MongoDB collections with Pig and store the result in a new collection.

Color me very biased this time, but all these (especially the JOIN) can be done directly using RethinkDB.

via: http://chimpler.wordpress.com/2013/02/07/using-hadoop-pig-with-mongodb/


MongoDB 2.4 Highlights

MongoDB 2.4 is just around the corner:

[Image: MongoDB 2.4 highlights]

From Mike Friedman’s Roadmap slidedeck.


MongoDB Is Still Broken by Design 5-0

My score after the first period was 4-1. But Emin Gün Sirer contested the 1 in his follow-up post to 10gen’s reply:

Until recently, MongoDB did not talk about requestStart() and requestDone() in any context except when talking about how to ensure a very weak consistency requirement. Namely, if you don’t use this pair of operations, then a write to the database followed by a read from the database, by the same client, can return old values. So, I write 42 for key k with a WriteConcern.SAFE, read key k, and get some other number, because the Mongo driver can, by default, very well send the first request to one node over one connection, and the second one to another, over another connection. So requestStart() and requestDone() were billed as a mechanism to avoid that scenario; I saw no mention that they were required for correctness in multithreaded settings. I bet there is plenty of multithreaded code that does not follow that pattern. Such code is broken; if you’re a Mongo user, it’d be a good idea to check if you ever use getLastError without a bracketing requestStart() and Done().

5-0.
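The failure mode in the quote can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyConnection` and `ToyPool` are hypothetical names, nothing here talks to MongoDB or uses a real driver, and `request_start()`/`request_done()` merely mimic the role the quote assigns to `requestStart()` and `requestDone()`. The pinning is pool-wide for brevity, whereas a real driver would have to pin per thread.

```python
import itertools


class ToyConnection:
    """Hypothetical stand-in for one socket; it only knows about its own last write."""

    def __init__(self, name):
        self.name = name
        self.last_error = None

    def insert(self, doc, store):
        # Unacknowledged write: a failure is recorded on this connection, never raised.
        if doc["_id"] in store:
            self.last_error = "duplicate key"
        else:
            store[doc["_id"]] = doc
            self.last_error = None

    def get_last_error(self):
        return self.last_error


class ToyPool:
    """Hypothetical pool: round-robin unless a request has been pinned."""

    def __init__(self, size=2):
        self._conns = [ToyConnection(f"conn{i}") for i in range(size)]
        self._cycle = itertools.cycle(self._conns)
        self._pinned = None

    def request_start(self):
        self._pinned = next(self._cycle)

    def request_done(self):
        self._pinned = None

    def acquire(self):
        return self._pinned if self._pinned is not None else next(self._cycle)


def unpinned_write(pool, store, doc):
    # The insert and the error check may land on different connections.
    pool.acquire().insert(doc, store)
    return pool.acquire().get_last_error()


def pinned_write(pool, store, doc):
    # Bracketing keeps both calls on the same connection.
    pool.request_start()
    try:
        pool.acquire().insert(doc, store)
        return pool.acquire().get_last_error()
    finally:
        pool.request_done()


if __name__ == "__main__":
    store = {1: {"_id": 1}}  # _id 1 already exists, so the insert must fail
    pool = ToyPool()
    print("unpinned:", unpinned_write(pool, store, {"_id": 1}))
    print("pinned:  ", pinned_write(pool, store, {"_id": 1}))
```

In this model the unpinned call reports no error even though the insert failed, because the check ran on a connection that never saw the write. The bracketed version reports the duplicate key.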

via: http://hackingdistributed.com/2013/02/07/10gen-response/#id2


10gen: MongoDB’s Fault Tolerance Is Not Broken…

Sitting comfortably? Check. Popcorn? Check. Let’s press play.

In an interview for InfoQ, 10gen’s Jared Rosoff replies to Emin Gün Sirer: “Broken by Design: MongoDB Fault Tolerance”.

  1. MongoDB lies when it says that a write has succeeded

    JR: “[…] Today the default behavior of official MongoDB drivers is Receipt Acknowledged, which means that you wait until the server has processed your write before returning to the client.”

    The key word here is today. As clearly explained by Sirer, the behavior he described was present in all versions of MongoDB and all the drivers until the 2.2 release and the driver updates in Nov. 2012.

    Basically the “fire-and-forget” behavior (nb: even this description is not accurate; a better one would be “load-and-forget”) has been the default for almost 3 years. With the 2.2 release and the corresponding driver updates it was changed to “Receipt acknowledged”. But the new default only acknowledges that the data was received by the server, not that it was written anywhere. If you want your data to exist on multiple machines or to be on disk, you need to use different settings.

  2. Using getLastError slows down write operations.

    JR: “GetLastError is underlying command in the MongoDB protocol that is used to implement write concerns. Intuitively, waiting for an operation to complete on the server is slower than not waiting for it.”

    The problem here is not that the operation slows down because it has to wait for the acknowledgement. The real problem is that getLastError requires an extra network roundtrip. Not to mention that your old code was probably polluted by all these extra calls.

  3. getLastError doesn’t work pipelined

    JR: “[…] For many bulk loads, performing multiple inserts with periodic checks of getLastError is the right choice. […]”

    Read the above.

  4. getLastError doesn’t work multi-threaded

    JR: “Threads do not see getLastError responses for other thread’s operations. MongoDB’s getLastError command applies to the previous operation on a connection to the database, and not simply whatever random operation was performed last. […]”

    If I’m reading this correctly, it seems like Sirer’s hypothesis was that connections can be shared across threads. I have to agree that many drivers do not provide thread-safe connections. So, linking getLastError behavior to the current connection seems OK.

  5. Write Concerns are broken

    JR: “As described in the above sections, WriteConcerns provide a flexible toolset for controlling the durability of write operations applied to the database. You can choose the level of durability you want for individual operations, balanced with the performance of those operations. With the power to specify exactly what you want comes the responsibility to understand exactly what it is you want out of the database.”

    Let’s look at what Sirer wrote:

    […] one could use WriteConcern.SAFE, FSYNC_SAFE or REPLICAS_SAFE for the insert operation [2]. There are 13 different concern levels, 8 of which seem to be distinct and presumably the remaining 5 are just kind of there in case you mistype one of the other ones. WriteConcern is at least well-named: it corresponds to “how concerned would you be if we lost your data?” and the potential answers are “not at all!”, “be my guest”, and “well, look like you made an effort, but it’s ok if you drop it.” Specifically, that’s three different kinds of SAFE, but none of them give you what you want: (1) SAFE means acknowledged by one replica but not written to disk, so a node failure can obliterate that data, (2) FSYNC_SAFE means written to a single disk, so a single disk crash can obliterate that data, and (3) REPLICAS_SAFE means it has been written to two replicas, but there is no guarantee that you will be able to retrieve it later.

    If you want a different explanation: even though there are 13 different WriteConcern types available, none of them offers the option of having the data written to disk on multiple replicas.
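To make the distinctions in point 5 concrete, here is a toy model of the three cases from Sirer’s quote plus the combination this post says is missing. It is a sketch under stated assumptions, not MongoDB’s API: `ToyNode`, `write`, `still_readable`, and the two failure functions are hypothetical names, and the model only distinguishes “held in memory” from “flushed to disk” on each of two replicas.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Hypothetical replica with volatile memory and a durable disk."""
    memory: dict = field(default_factory=dict)
    disk: dict = field(default_factory=dict)

    def receive(self, key, value):
        self.memory[key] = value

    def fsync(self):
        self.disk.update(self.memory)

    def crash_and_restart(self):
        # Anything that was not flushed to disk is gone after a restart.
        self.memory = dict(self.disk)

    def lose_disk(self):
        # The whole machine, memory and disk, is lost.
        self.memory.clear()
        self.disk.clear()


def write(nodes, key, value, replicas=1, fsync=False):
    """Acknowledge once `replicas` nodes hold the value, flushing if asked."""
    for node in nodes[:replicas]:
        node.receive(key, value)
        if fsync:
            node.fsync()
    return True


def crash_all(nodes):
    for node in nodes:
        node.crash_and_restart()


def lose_first_disk(nodes):
    nodes[0].lose_disk()


def still_readable(replicas, fsync, failure):
    """Run one acknowledged write, apply a failure, report whether the value survived."""
    nodes = [ToyNode(), ToyNode()]
    write(nodes, "k", 42, replicas=replicas, fsync=fsync)
    failure(nodes)
    return any(node.memory.get("k") == 42 for node in nodes)


if __name__ == "__main__":
    for replicas, fsync in [(1, False), (1, True), (2, False), (2, True)]:
        print(
            f"replicas={replicas} fsync={fsync}:",
            "crash_all ->", still_readable(replicas, fsync, crash_all),
            "| lose_first_disk ->", still_readable(replicas, fsync, lose_first_disk),
        )
```

In this model only the last combination (two replicas, each flushed to disk) survives both a crash of every node and the loss of one machine, which is the guarantee the paragraph above is asking for.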

The period is over and in my mind the score is clearly 4-1. But I know that this is not a real game and there will be some concluding that I’m not getting it. I’m OK living with that though.


NoSQL Hosting: Redis and RavenDB

More service providers for hosted NoSQL solutions:

  1. Garantia Data to Offer its Redis & Memcached Hosting Services in Europe: “In-memory NoSQL Company extends Redis Cloud and Memcached Cloud to European Amazon Web Services users.”
  2. CloudBird Launch, now with RavenDB 2.0 support - The CloudBird Blog: “Today we’re cracking open the Champagne as we peel off the beta label and officially welcome production databases to our RavenDB hosting service. What’s more we’re also introducing support for the Raven 2.0 RTM.”

It’s no longer just “a database for every taste”, but steadily becoming more of “a database for every taste served from anywhere you like”.
