


couchdb tips: All content tagged as couchdb tips in NoSQL databases and polyglot persistence

CouchDB Incremental Backups

you can just rsync or copy the files while CouchDB is running. Thanks to CouchDB’s append-only file structure this is safe.

Yes, it’s so easy. Operational cost optimization.
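As a rough illustration of the "just copy the files" idea, here is a minimal Python sketch that stands in for rsync. The directory layout and file names are assumptions made for the example, not CouchDB defaults, and comparing size and modification time is only a crude approximation of what rsync does.

```python
import shutil
import tempfile
from pathlib import Path


def backup_couch_files(data_dir, backup_dir):
    """Copy every *.couch file whose size or mtime differs from its backup copy."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(data_dir.glob("*.couch")):
        dst = backup_dir / src.name
        src_stat = src.stat()
        if dst.exists():
            dst_stat = dst.stat()
            unchanged = (
                dst_stat.st_size == src_stat.st_size
                and int(dst_stat.st_mtime) == int(src_stat.st_mtime)
            )
            if unchanged:
                continue
        shutil.copy2(src, dst)  # copy2 keeps the mtime for the next comparison
        copied.append(src.name)
    return copied


# Demo on a throwaway directory (hypothetical paths and file names).
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "couchdb"
    data_dir.mkdir()
    db_file = data_dir / "mydb.couch"
    db_file.write_bytes(b"first write")
    first_run = backup_couch_files(data_dir, Path(tmp) / "backup")
    second_run = backup_couch_files(data_dir, Path(tmp) / "backup")
    with db_file.open("ab") as fh:  # simulate CouchDB appending to the file
        fh.write(b" appended")
    third_run = backup_couch_files(data_dir, Path(tmp) / "backup")
```

The second run copies nothing because the file has not changed; the third run picks the file up again after the append.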

Original title and link: CouchDB Incremental Backups (NoSQL databases © myNoSQL)


CouchDB: Creating a Pagination Index

I’m trying to create a pagination index view in CouchDB that lists the doc._id for every Nth document found.

Answer on ☞ stackoverflow. Not sure whether paginating with CouchDB leads to a similar solution.
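To make the goal concrete, here is a small client-side sketch of what such an index contains. It is not the stackoverflow answer, only an illustration under the assumption that the sorted list of ids is already at hand; the function name is made up for the example.

```python
def page_start_ids(sorted_ids, page_size):
    """Return the _id that begins each page: every Nth id, starting with the first."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    return sorted_ids[::page_size]


ids = ["doc-%03d" % i for i in range(10)]
starts = page_start_ids(ids, 4)
```

Each returned id could then be passed as the `startkey` of a query with `limit` set to the page size.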


CouchDB group_level for hierarchical data

CouchDB group_level applied:

CouchDB supports something called group_level in the view queries. On pcapr, we never really had the need to use this feature though we have over 52 different views. But in a recent internal project, we had the need to display folders in the application that can be expanded and collapsed. Each document in CouchDB represents a file of sorts and contains the relative path name. One of the views in the app is a classic folder view that can be expanded recursively. Obviously, from a scaling perspective, we don’t want to load all this data up front and that’s exactly where group_level comes in. This was my first time playing with this capability and I have to say, once you grok this, it’s totally cool.

You can read more about CouchDB group_level ☞ here and ☞ here.
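For readers who have not used group_level, the following Python sketch simulates what it does to view rows whose keys are path arrays. It assumes a map function that emits the path components as the key and 1 as the value, with a counting reduce; the rows and names are invented for the example, and Python's sort order is only a stand-in for CouchDB's collation.

```python
def group_rows(rows, group_level):
    """Truncate each array key to group_level elements and sum the values per group."""
    grouped = {}
    for key, value in sorted(rows):
        prefix = tuple(key[:group_level])
        grouped[prefix] = grouped.get(prefix, 0) + value
    return [(list(prefix), total) for prefix, total in grouped.items()]


# Hypothetical map output: one row per file, key = path components.
rows = [
    (["src", "lib", "a.py"], 1),
    (["src", "lib", "b.py"], 1),
    (["src", "main.py"], 1),
    (["docs", "readme.txt"], 1),
]

top_level = group_rows(rows, 1)
second_level = group_rows(rows, 2)
```

Here `top_level` holds one row per top-level folder with its file count, and `second_level` goes one step deeper. Against a real view, expanding a single folder would typically combine `group_level=2` with `startkey=["src"]` and `endkey=["src", {}]`.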



CouchDB Built-In Reduce Functions

Via ☞ Mikeal Rogers:

Currently (CouchDB 0.11.0) there are three built-in reduce functions. Built-in reduce functions run right inside CouchDB and are implemented in Erlang. In most cases they are much faster than the equivalent JavaScript reduce functions.

J. Chris Anderson explains why built-in reduce functions are faster:

The deal is that map function (and intermediate reduce function) output is cached in the view index (as it is generated). So in those cases the function overhead is not an issue.

But in the case of reduce, for any query that has parameters (even a startkey and endkey) the JavaScript function will be executed once (and fed the intermediate cached results), to give the accurate answer to the query. Once is fine and fast enough, but in the case of group=true or group_level=N queries, the JavaScript function is executed once per row of output so that starts to slow things down (in a linear way, related to the # of group rows, not the # of map rows, so it’s still “scalable”)

Using the builtin reduces avoids the interprocess communication overhead of calling the JavaScript function once per row, in a situation where the output cannot be cached.

In the normal operations, the JavaScript is batched and cached, so the effects of the slowness are mitigated.
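A built-in reduce is selected by naming it in the view definition instead of supplying JavaScript source. The three built-ins are `_sum`, `_count` and `_stats`. The sketch below builds such a design document as a Python dict; the design document name, view name and map function are hypothetical.

```python
import json

design_doc = {
    "_id": "_design/example",  # hypothetical design document name
    "views": {
        "by_type": {  # hypothetical view name
            "map": "function (doc) { if (doc.type) { emit(doc.type, 1); } }",
            "reduce": "_count",  # built-in reduce, evaluated in Erlang
        }
    },
}

# JSON body that would be PUT to /<db>/_design/example
body = json.dumps(design_doc, indent=2)
```

Querying this view with `group=true` would then return one count per `type` without a round trip to the JavaScript view server for each output row.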


CouchDB on ZFS

Very interesting idea of having FS-level compression kick in automatically on CouchDB compaction, from Victor Igumnov:

CouchDB was made for next generation filesystems such as ZFS and BTRFS. First off, unlike PostgreSQL or MySQL, CouchDB can be snapshotted while in production without any flushing or locking trickery since it uses an append-only B-Tree storage approach. That alone makes it a compelling database choice on ZFS/BTRFS.

Second, CouchDB works hand-in-hand with ZFS’s block level compression. ZFS can compress blocks of data as they are being written out to the disk. However, it only does it for new blocks and not retroactively. Now, the awesome part, CouchDB on compaction writes out a brand new database file which can utilize the new gzip compression settings on ZFS. This means you can try out different gzip compression settings just by compacting your CouchDB.
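Compaction itself is triggered over HTTP with a POST to the database's `_compact` resource. As a minimal sketch, the snippet below only builds that request and does not send it; the host and database name are assumptions, and changing the ZFS compression setting is a separate step that is not shown.

```python
import urllib.request


def compaction_request(base_url, db_name):
    """Build (but do not send) the POST request that starts database compaction."""
    return urllib.request.Request(
        url="%s/%s/_compact" % (base_url.rstrip("/"), db_name),
        data=b"",
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical server address and database name.
req = compaction_request("http://localhost:5984/", "mydb")
```

Passing `req` to `urllib.request.urlopen` against a running server would start the compaction, which rewrites the database file and so lets the new filesystem compression setting apply.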


CouchDB Cheat Sheet

Jan-Piet Mens put together a six-page CouchDB cheat sheet:

You can download it from ☞ here.

Document Versioning in CouchDB

Just because CouchDB uses Multi-Version Concurrency Control (or MVCC[1]) doesn’t mean that you’ll also get document versioning support by default. So, how would you go about it if you need document versioning in CouchDB?

This article takes a look at 4 different approaches and details the first one, as it looks to be simple, scalable, and supported by CouchDB replication:

  1. store the previous version as a binary attachment
  2. append old versions to an array embedded in the current document
  3. keep each version as a separate document
  4. skip compaction

How does storing previous document version as a binary attachment to the new version work?

[…] when the document is loaded from the CouchDB server, the string representation is saved before being parsed into JSON. Later, when the document is saved, the string representation is attached as a new binary attachment, with the corresponding rev as its name, and a content type of application/json. This way any CouchDB library can just open the stored rev, and see it as a normal document.

This means that each time the document is updated, the client will also store the previous version as an attachment to the latest version. At any time, a user can load any of the old versions.
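The attachment approach can be sketched on the client side using CouchDB's inline attachment format (a base64 `data` field under `_attachments`). This is an illustration of the idea rather than the article's code; the helper name and the sample document are made up.

```python
import base64
import json


def attach_previous_version(raw_previous, updated):
    """Return a copy of `updated` carrying the previous revision as an inline attachment.

    raw_previous: the JSON string exactly as it was loaded from CouchDB.
    updated:      the edited document as a dict.
    """
    previous = json.loads(raw_previous)
    rev = previous["_rev"]
    doc = dict(updated)
    doc["_rev"] = rev  # CouchDB needs the current rev to accept the update
    # Carry forward any attachments already on the edited document.
    attachments = dict(doc.get("_attachments", {}))
    attachments[rev] = {
        "content_type": "application/json",
        "data": base64.b64encode(raw_previous.encode("utf-8")).decode("ascii"),
    }
    doc["_attachments"] = attachments
    return doc


# Hypothetical document as loaded from the server, then edited.
raw = '{"_id": "note-1", "_rev": "1-abc", "title": "Draft"}'
new_doc = attach_previous_version(raw, {"_id": "note-1", "title": "Final"})
restored = json.loads(base64.b64decode(new_doc["_attachments"]["1-abc"]["data"]))
```

For older versions to survive later updates, the edited document has to keep the attachment stubs that came back when it was loaded, which is why the helper copies any existing `_attachments` instead of replacing them.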



News from CouchDB Camp

  1. Firstly a very nice trick from Christian: ☞ Couch Potato Bookmarklet - Lazy Features for CouchDB’s Futon: a bookmarklet for bulk deleting CouchDB documents

    So, I created the CouchDB Potato Bookmarklet (because I’m lazy). The bookmarklet creates a new delete column and provides a “Delete Documents” link to delete all the checked documents. I also added a “Select All Documents” which only selects non-design documents (so that I don’t accidentally delete a CouchDB view). These links can be found in the right navigation column under the “Recent Databases” section.

  2. Embedded below is a video of J. Chris Anderson talking about CouchDB @ E-VAN
  3. Couchio has announced a very interesting event CouchCamp and we added it to our new NoSQL events page.

  4. Last, (my) internet has been “spammed” by tons of redirects to the article ☞ CouchDB Moves to the Cloud with Couchio. According to the article:

    “We’ll be including Apache Lucene full-text indexing,” Katz said. “That’s an add-on for CouchDB that people usually have to download and build themselves.”

Basic CouchDB Cheat Sheet

The very basics of CouchDB


Using Multiple Start and End Keys for CouchDB Views

CouchDB view collation is great and only has one real drawback that has caused me any real pain – the inability to handle queries that need to be parameterised by more than one dimension.

Do you know any solutions for this scenario?
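One common workaround, sketched here as an assumption rather than as an answer from the linked post, is to issue one range query per key range and merge the results on the client. The helper names and sample keys below are hypothetical.

```python
import json
from urllib.parse import parse_qs, urlencode


def range_query_strings(ranges):
    """Build one view query string per (startkey, endkey) pair, keys JSON-encoded."""
    return [
        urlencode({
            "startkey": json.dumps(start, separators=(",", ":")),
            "endkey": json.dumps(end, separators=(",", ":")),
        })
        for start, end in ranges
    ]


def merge_rows(result_sets):
    """Concatenate rows from several responses, keeping the first row seen per doc id."""
    seen = set()
    merged = []
    for rows in result_sets:
        for row in rows:
            if row["id"] not in seen:
                seen.add(row["id"])
                merged.append(row)
    return merged


query_strings = range_query_strings([
    (["2010", "a"], ["2010", "m"]),
    (["2011", "a"], ["2011", "m"]),
])
decoded = parse_qs(query_strings[0])
merged = merge_rows([
    [{"id": "d1", "key": ["2010", "b"]}],
    [{"id": "d1", "key": ["2011", "b"]}, {"id": "d2", "key": ["2011", "c"]}],
])
```

The cost is one HTTP request per range, so this only suits a small number of ranges.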


CouchDB: Intercepting Document Updates and Server-Side Processing

For the last couple of weeks, we’ve run a series of CouchDB Tips and I think that this one would be a nice addition to the list.

Starting with its version 0.10, CouchDB offers the possibility to perform server-side processing of incoming documents before they are committed. This feature is available through document update handlers defined in a design doc. You can find more details and a sample application of this feature on the linked page.
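As a minimal sketch of what such a handler looks like, the design document below defines one update function under the `updates` key. The handler receives the stored document (or null) and the request, and returns the document to save together with the response. The design document name, handler name and the `hits` field are all made up for the example.

```python
import json

UPDATE_HANDLER_JS = """
function (doc, req) {
  if (!doc) {
    return [null, 'No such document'];  // nothing is saved
  }
  doc.hits = (doc.hits || 0) + 1;        // server-side change before commit
  return [doc, 'Hit recorded'];
}
"""

design_doc = {
    "_id": "_design/example",  # hypothetical design document name
    "updates": {"bump": UPDATE_HANDLER_JS.strip()},  # hypothetical handler name
}


def update_url(base_url, db_name, ddoc, handler, doc_id):
    """URL a client would PUT to in order to run the handler against one document."""
    return "%s/%s/_design/%s/_update/%s/%s" % (
        base_url.rstrip("/"), db_name, ddoc, handler, doc_id
    )


body = json.dumps(design_doc)
url = update_url("http://localhost:5984", "mydb", "example", "bump", "doc-1")
```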


CouchDB List Functions

Just another trick for your CouchDB toolbox:

List functions are a mechanism for iterating over rows in a view to produce output. CouchDB list functions are typically used to generate alternate formats for output (Atom, XML, HTML, etc.). I still want to generate JSON for consumption by my Sinatra application. Hopefully, that will not prove difficult.
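A list function that emits plain JSON could look roughly like the one embedded below. This is a sketch with hypothetical names, not the quoted author's code: it streams the view's row values as a JSON array using `getRow()` and `send()`, and assumes a JavaScript runtime that provides `JSON.stringify`.

```python
import json

LIST_FUNCTION_JS = """
function (head, req) {
  start({headers: {'Content-Type': 'application/json'}});
  var row, first = true;
  send('[');
  while ((row = getRow())) {
    send((first ? '' : ',') + JSON.stringify(row.value));
    first = false;
  }
  send(']');
}
"""

design_doc = {
    "_id": "_design/example",  # hypothetical design document name
    "lists": {"as_json": LIST_FUNCTION_JS.strip()},  # hypothetical list name
}


def list_url(base_url, db_name, ddoc, list_name, view_name):
    """URL that runs a list function over the rows of a view in the same design doc."""
    return "%s/%s/_design/%s/_list/%s/%s" % (
        base_url.rstrip("/"), db_name, ddoc, list_name, view_name
    )


body = json.dumps(design_doc)
url = list_url("http://localhost:5984", "mydb", "example", "as_json", "by_type")
```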

Other CouchDB tips&tricks