couchdb tips: All content tagged as couchdb tips in NoSQL databases and polyglot persistence
Thursday, 25 November 2010
CouchDB Incremental Backups
you can just rsync or copy the files while CouchDB is running. Thanks to CouchDB’s append-only file structure this is safe.
Yes, it’s so easy. Operational cost optimization.
Original title and link: CouchDB Incremental Backups (NoSQL databases © myNoSQL)
via: http://comments.gmane.org/gmane.comp.db.couchdb.user/11410
Monday, 11 October 2010
CouchDB: Creating a Pagination Index
I’m trying to create a pagination index view in CouchDB that lists the doc._id for every Nth document found.
Answer on ☞ stackoverflow. Not sure whether paginating with CouchDB leads to a similar solution.
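To make the question concrete, here is a minimal sketch of the "every Nth doc._id" idea in plain Python. It is not the Stack Overflow answer: the function name `page_start_ids` and the sample ids are hypothetical, and it assumes the ids are already in hand in sorted order (in CouchDB they would come from a view or `_all_docs`).

```python
from typing import List


def page_start_ids(sorted_ids: List[str], page_size: int) -> List[str]:
    """Return the id that starts each page: every Nth id in key order."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    # Slicing with a step keeps the ids at positions 0, N, 2N, ...
    return sorted_ids[::page_size]


# Hypothetical document ids, sorted the way a view would return them.
ids = sorted(f"doc-{i:03d}" for i in range(10))
index = page_start_ids(ids, page_size=4)
print(index)
```

Each entry in the resulting index could then serve as the `startkey` of a query with `limit` set to the page size.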
Wednesday, 6 October 2010
CouchDB group_level for hierarchical data
CouchDB group_level applied:
CouchDB supports something called group_level in the view queries. On pcapr, we never really had the need to use this feature though we have over 52 different views. But in a recent internal project, we had the need to display folders in the application that can be expanded and collapsed. Each document in CouchDB represents a file of sorts and contains the relative path name. One of the views in the app is a classic folder view that can be expanded recursively. Obviously, from a scaling perspective, we don’t want to load all this data up front and that’s exactly where the group_level comes in. This was my first time playing with this capability and I have to say, once you grok this, it’s totally cool.
You can read more about CouchDB group_level ☞ here and ☞ here.
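A rough way to picture what group_level does for a folder view is to emulate it in plain Python. This is only a sketch under assumptions: the sample paths, the `query` helper, and the `prefix` argument (standing in for a startkey/endkey range) are hypothetical, and the reduce is assumed to be a simple sum over values of 1.

```python
from typing import Dict, List, Sequence, Tuple

Row = Tuple[List[str], int]

# Hypothetical map output: one row per file, keyed by its path split on "/".
rows: List[Row] = [
    (["docs", "api", "index.html"], 1),
    (["docs", "api", "views.html"], 1),
    (["docs", "guide.html"], 1),
    (["src", "main.py"], 1),
]


def query(rows: Sequence[Row], group_level: int,
          prefix: Sequence[str] = ()) -> Dict[Tuple[str, ...], int]:
    """Emulate a summing reduce queried with group_level under a key prefix."""
    result: Dict[Tuple[str, ...], int] = {}
    for key, value in sorted(rows):
        if list(key[:len(prefix)]) != list(prefix):
            continue  # stands in for a startkey/endkey range on the prefix
        group_key = tuple(key[:group_level])  # keep only the first N elements
        result[group_key] = result.get(group_key, 0) + value
    return result


top = query(rows, group_level=1)                    # collapsed root folders
docs = query(rows, group_level=2, prefix=["docs"])  # expand "docs" one level
print(top)
print(docs)
```

Expanding a folder then means asking for one more level under that folder's prefix, so only the children being displayed are ever loaded.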
via: http://labs.mudynamics.com/2010/10/02/using-couchdb-group_level-for-hierarchical-data/
Monday, 5 July 2010
CouchDB Built-In Reduce Functions
Via ☞ Mikeal Rogers:
Currently (CouchDB 0.11.0) there are three built-in reduce functions. They are implemented in Erlang and run right inside CouchDB, which in most cases makes them much faster than an equivalent JavaScript reduce.
J. Chris Anderson explains why built-in reduce functions are faster:
The deal is that map function (and intermediate reduce function) output is cached in the view index (as it is generated). So in those cases the function overhead is not an issue.
But in the case of reduce, for any query that has parameters (even a startkey and endkey) the JavaScript function will be executed once (and fed the intermediate cached results), to give the accurate answer to the query. Once is fine and fast enough, but in the case of group=true or group_level=N queries, the JavaScript function is executed once per row of output so that starts to slow things down (in a linear way, related to the # of group rows, not the # of map rows, so it’s still “scalable”)
Using the builtin reduces avoids the interprocess communication overhead of calling the JavaScript function once per row, in a situation where the output cannot be cached.
In the normal operations, the JavaScript is batched and cached, so the effects of the slowness are mitigated.
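For reference, the three built-ins are `_sum`, `_count` and `_stats`, and a view opts in by naming one in place of a JavaScript reduce. The sketch below shows a design document written as a Python dict, plus a plain-Python stand-in for what each built-in computes over numeric values. The design document id, view name, and map function are hypothetical, and `builtin_reduce` is an illustration, not CouchDB's Erlang code.

```python
import json
from typing import List

# Hypothetical design document: the view name and map function are
# illustrative; "_sum" is one of the built-in reduce names.
design_doc = {
    "_id": "_design/example",
    "views": {
        "bytes_by_type": {
            "map": "function (doc) { emit(doc.type, doc.size); }",
            "reduce": "_sum",
        }
    },
}


def builtin_reduce(name: str, values: List[float]):
    """Plain-Python stand-in for what each built-in computes over numbers."""
    if name == "_sum":
        return sum(values)
    if name == "_count":
        return len(values)
    if name == "_stats":
        return {
            "sum": sum(values),
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "sumsqr": sum(v * v for v in values),
        }
    raise ValueError(f"unknown built-in reduce: {name}")


print(json.dumps(design_doc, indent=2))
stats = builtin_reduce("_stats", [1, 2, 3])
print(stats)
```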
via: http://wiki.apache.org/couchdb/Built-In_Reduce_Functions
Tuesday, 29 June 2010
CouchDB on ZFS
A very interesting idea from Victor Igumnov: having FS-level compression kick in automatically on CouchDB compaction:
CouchDB was made for next generation filesystems such as ZFS and BTRFS. First off, unlike PostgreSQL or MySQL, CouchDB can be snapshot while in production without any flushing or locking trickery since it uses an append only B-Tree storage approach. That alone makes it a compelling database choice on ZFS/BTRFS.
Second, CouchDB works hand-in-hand with ZFS’s block level compression. ZFS can compress blocks of data as they are being written out to the disk. However, it only does it for new blocks and not retroactively. Now, the awesome part, CouchDB on compaction writes out a brand new database file which can utilize the new gzip compression settings on ZFS. This means you can try out different gzip compression settings just by compacting your CouchDB.
Monday, 21 June 2010
CouchDB Cheat Sheet
Jan-Piet Mens put together a six-page CouchDB cheat sheet.
You can download it from ☞ here.
Tuesday, 1 June 2010
Document Versioning in CouchDB
Just because CouchDB uses Multi-Version Concurrency Control (or MVCC[1]) doesn’t mean that you’ll also get document versioning support by default. So how would you go about it if you need document versioning in CouchDB?
This article takes a look at four different approaches and details the first one, as it looks to be simple, scalable, and supported by CouchDB replication:
- store the previous version as a binary attachment
- append old versions to an array embedded in the current document
- keep each version as a separate document
- skip compaction
How does storing the previous document version as a binary attachment to the new version work?
[…] when the document is loaded from the CouchDB server, the string representation is saved before being parsed into JSON. Later, when the document is saved, the string representation is attached as a new binary attachment, with the corresponding rev as its name, and a content type of application/json. This way any CouchDB library can just open the stored rev, and see it as a normal document.
This means that each time the document is updated, the client will also store the previous version as an attachment to the latest version. At any time, a user can load any of the old versions.
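The client-side step described above could look roughly like the following. This is a sketch, not the code from the linked article: the function name `attach_previous_version`, the sample document, and the rev `1-abc` are hypothetical, and the actual HTTP save is left out. It builds the document body with the old version as an inline base64 attachment named after its rev.

```python
import base64
import json
from typing import Any, Dict


def attach_previous_version(raw_previous: str,
                            updated: Dict[str, Any]) -> Dict[str, Any]:
    """Return the document body to save, with the old version attached."""
    previous = json.loads(raw_previous)
    new_doc = dict(updated)
    new_doc["_id"] = previous["_id"]
    new_doc["_rev"] = previous["_rev"]  # the rev being replaced by this update
    # Carry over attachments already on the document (the older versions).
    attachments = dict(previous.get("_attachments", {}))
    attachments[previous["_rev"]] = {
        "content_type": "application/json",
        "data": base64.b64encode(raw_previous.encode("utf-8")).decode("ascii"),
    }
    new_doc["_attachments"] = attachments
    return new_doc


# Hypothetical string representation, kept as loaded from the server.
raw = '{"_id": "note-1", "_rev": "1-abc", "text": "first draft"}'
to_save = attach_previous_version(raw, {"text": "second draft"})
restored = json.loads(base64.b64decode(to_save["_attachments"]["1-abc"]["data"]))
print(restored["text"])
```

Because the attachment holds the exact string that was loaded, decoding it gives back the old version as an ordinary JSON document.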
References
- [1] ☞ Multiversion concurrency control (MVCC or MCC) is a concurrency control method used to provide concurrent access
via: http://blog.couch.io/post/632718824/simple-document-versioning-with-couchdb
Friday, 28 May 2010
News from CouchDB Camp
- First, a very nice trick from Christian: ☞ Couch Potato Bookmarklet - Lazy Features for CouchDB’s Futon: a bookmarklet for bulk deleting CouchDB documents
So, I created the CouchDB Potato Bookmarklet (because I’m lazy). The bookmarklet creates a new delete column and provides a “Delete Documents” link to delete all the checked documents. I also added a “Select All Documents” which only selects non-design documents (so that I don’t accidentally delete a CouchDB view). These links can be found in the right navigation column under the “Recent Databases” section.
- Embedded below is a video of J. Chris Anderson talking about CouchDB @ E-VAN
- Couchio has announced a very interesting event, CouchCamp, and we added it to our new NoSQL events page.
- Last, (my) internet has been “spammed” by tons of redirects to the article ☞ CouchDB Moves to the Cloud with Couchio. According to the article:
“We’ll be including Apache Lucene full-text indexing,” Katz said. “That’s an add-on for CouchDB that people usually have to download and build themselves.”
Tuesday, 30 March 2010
Basic CouchDB Cheat Sheet
The very basics of CouchDB
via: http://www.andyjarrett.co.uk/blog/index.cfm/2010/3/29/CouchDB-quick-refstarting-guide
Wednesday, 24 March 2010
Using Multiple Start and End Keys for CouchDB Views
CouchDB view collation is great and only has one real drawback that has caused me any real pain – the inability to handle queries that need to be parameterised by more than one dimension.
Do you know any solutions for this scenario?
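To see why a single startkey/endkey pair falls short with two dimensions, here is a small Python emulation. The `[year, price]` keys and the `key_range` helper are hypothetical. Python compares lists element by element, which is assumed here to mirror how CouchDB orders array keys made of numbers.

```python
from typing import List

# Hypothetical view keys of the form [year, price], in sorted (view) order.
keys: List[List[int]] = sorted([
    [2008, 50], [2008, 500], [2009, 5], [2009, 50],
    [2009, 900], [2010, 50], [2010, 700],
])


def key_range(keys: List[List[int]], startkey: List[int],
              endkey: List[int]) -> List[List[int]]:
    """Emulate one contiguous startkey..endkey slice of a sorted view."""
    return [k for k in keys if startkey <= k <= endkey]


# Intent: years 2008-2010 AND prices 10-100.
got = key_range(keys, [2008, 10], [2010, 100])
wanted = [k for k in keys if 2008 <= k[0] <= 2010 and 10 <= k[1] <= 100]
print(got)     # the contiguous slice also contains out-of-range prices
print(wanted)  # what a true two-dimensional filter would return
```

The range only constrains the second element at the two ends of the slice, so everything in between comes back regardless of price. One workaround, not necessarily the one in the linked post, is to run one range query per value of the first dimension and merge the results.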
via: http://jamietalbot.com/2010/03/24/using-multiple-start-and-end-keys-for-couchdb-views/
Friday, 26 February 2010
CouchDB: Intercepting Document Updates and Server-Side Processing
For the last couple of weeks, we’ve run a series of CouchDB Tips and I think that this one would be a nice addition to the list.
Starting with version 0.10, CouchDB can perform server-side processing of incoming documents before they are committed. This feature is available through document update handlers defined in a design doc. You can find more details and a sample application of this feature on the linked page.
Tuesday, 23 February 2010
CouchDB List Functions
Just another trick for your CouchDB toolbox:
List functions are a mechanism for iterating over rows in a view to produce output. CouchDB list functions are typically used to generate alternate formats for output (Atom, XML, HTML, etc.). I still want to generate JSON for consumption by my Sinatra application. Hopefully, that will not prove difficult.
Other CouchDB tips & tricks
- Paginating with CouchDB
- Access CouchDB document revisions with RelaxDB
- Generic CouchDB _changes consumer using node.js
- A Stub Ruby Library for CouchDB
via: http://japhr.blogspot.com/2010/02/collating-not-reducing-with-couchdb.html