riak: All content tagged as riak in NoSQL databases and polyglot persistence
Tuesday, 28 February 2012
Paginating With Riak
Alexander Sicular explaining why pure key-value stores require a different approach when an application needs to paginate through result sets:
Riak at its core is a distributed key/value persisted data store that also happens to do a lot of other things. Now break that down. Looking at those words individually we have “distributed”, meaning that your data lives on a number of different machines in your cluster. Good thing, right? Yes. However it also means that no single machine is the canonical reference for all your data. Which in turn means that you need to ask multiple machines for your data and those machines will return data to you when they see fit, ie. not in order. Moving on, we have “key/value”. In regards to the topic at hand, this means that Riak has no insight into any data held within your keys, ie. Riak does not care if your stored json object has an age value in it. Next, we have “persisted”. Riak has no native internal index, meaning Riak will not store on disk the data you send it in any useful way - useful to you at least. Lastly, we have “happens to do a lot of other things.” Thankfully for us, one of those other things is Map/Reduce.
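To make the point concrete, here is a minimal sketch of what pagination over unordered, multi-node results involves: collect the partial results, impose an order in a reduce-like step, and only then slice out a page. It uses plain Python lists as stand-ins for per-node responses; the `paginate` function, the `age` field, and the sample data are assumptions for illustration, not Riak's API.

```python
from typing import Any, Dict, List


def paginate(partials: List[List[Dict[str, Any]]], sort_key: str,
             page: int, page_size: int) -> List[Dict[str, Any]]:
    """Merge unordered per-node results, sort them, and return one page."""
    # Results arrive in whatever order each node answers, so flatten first.
    merged = [item for part in partials for item in part]
    # The store has no native ordering; the reduce-like step must impose one.
    merged.sort(key=lambda obj: obj[sort_key])
    start = (page - 1) * page_size
    return merged[start:start + page_size]


# Hypothetical responses from two nodes, each in arbitrary order.
node_a = [{"key": "u3", "age": 41}, {"key": "u1", "age": 23}]
node_b = [{"key": "u2", "age": 35}, {"key": "u4", "age": 19}]

page_1 = paginate([node_a, node_b], "age", page=1, page_size=2)
page_2 = paginate([node_a, node_b], "age", page=2, page_size=2)
```

Note that every page request repeats the full gather and sort, which is why this approach gets expensive as the result set grows.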
Original title and link: Paginating With Riak (©myNoSQL)
Monday, 27 February 2012
Multiple Index Queries in Riak Using Python
Sreejith K describing his riak_multi_query Python library for multi-index queries:
One of the advantage of using LevelDB with Riak is that they support Secondary Indexes. […] I wrote a Python wrapper that allows multiple index queries using Secondary indexes and MapReduce. The basic idea is as follows:
- Query Multiple Indexes and get the associated keys
- Pass the keys to a MapReduce job where Multiple filters are again evaluated. The map phase applies all the conditions to individual keys.
Now imagine this library would run those queries in parallel.
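The two steps, plus the parallel index lookups, can be sketched as follows. This is not the riak_multi_query API: `OBJECTS`, `INDEXES`, `index_lookup`, and `multi_index_query` are hypothetical in-memory stand-ins, and the sketch assumes the candidate keys from all indexes are pooled before the filter step applies every condition.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for a bucket and its secondary indexes.
OBJECTS = {
    "u1": {"city": "Berlin", "plan": "pro"},
    "u2": {"city": "Berlin", "plan": "free"},
    "u3": {"city": "Paris", "plan": "pro"},
}
INDEXES = {
    "city": {"Berlin": {"u1", "u2"}, "Paris": {"u3"}},
    "plan": {"pro": {"u1", "u3"}, "free": {"u2"}},
}


def index_lookup(field, value):
    """Return the keys that one index holds for one value."""
    return set(INDEXES.get(field, {}).get(value, set()))


def multi_index_query(conditions):
    """conditions maps field name -> required value."""
    # Step 1: query every index, here concurrently, and pool the keys.
    with ThreadPoolExecutor() as pool:
        key_sets = list(pool.map(lambda c: index_lookup(*c),
                                 conditions.items()))
    candidates = set().union(*key_sets)
    # Step 2: the "map phase" re-checks all conditions on each key.
    return sorted(
        key for key in candidates
        if all(OBJECTS[key].get(f) == v for f, v in conditions.items())
    )


matches = multi_index_query({"city": "Berlin", "plan": "pro"})
```

Intersecting the key sets in step 1 would shrink the candidate list before the filter runs; pooling them keeps the sketch closer to the description above, where the map phase does the final filtering.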
via: http://foobarnbaz.com/2012/02/25/multi-index-queries-in-riak/
Riak Precommit Hooks for Creating Secondary Indexes
I wanted to create a secondary index on a field so that when I want to look up the data, it doesn’t require a full M/R to do. The great Riak Handbook showed a couple of examples of creating a secondary index, and it looked simple enough. It actually is pretty simple, but the documentation and examples are few and far between, so I’m going to share my experience. […] Whenever an item is created in this bucket, I want a secondary index based on the app. I never update items in this bucket, only create. In fact, I never delete these items either, but figured my code should handle that case.
Erlang code included.
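The original hook is written in Erlang. As a rough illustration of the logic only, here is a Python sketch in which the object is a plain dict; the `value`, `indexes`, and `deleted` fields and the function name are assumptions, not Riak's object format. The `_bin` suffix follows Riak's naming convention for binary secondary indexes.

```python
import json


def precommit_add_app_index(obj):
    """Return the object to store, with an index derived from its 'app' field."""
    # Deletes carry no useful body, so pass them through untouched.
    if obj.get("deleted"):
        return obj
    data = json.loads(obj["value"])
    app = data.get("app")
    if app is None:
        return obj
    # Copy rather than mutate, so the caller's object is left unchanged.
    indexes = dict(obj.get("indexes", {}))
    indexes["app_bin"] = str(app)
    return {**obj, "indexes": indexes}


incoming = {"key": "e1", "value": json.dumps({"app": "billing", "msg": "x"})}
stored = precommit_add_app_index(incoming)
```

Because the hook runs before the write is committed, the index entry is stored together with the object, and a later lookup by app does not need a full M/R pass.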
via: http://startup-tech.tumblr.com/post/15300179505/riak-precommit-hooks
Tuesday, 21 February 2012
Riak Performance of Link Walking vs MapReduce
If you are asked to compare (or you just wonder about) the performance of link walking and map-reduce in Riak, keep in mind the following details of how the two mechanisms are implemented:
The biggest difference I see is that the link-walk uses an Erlang function where your MapReduce query uses a Javascript function (link-walking is implemented as a MapReduce query internally).
Serializing/deserializing to JSON as well as contention for Javascript VMs likely accounts for the lost time.
The quote, with my emphasis, comes from Bryan Fink’s email to the Riak mailing list.
Thursday, 2 February 2012
NoSQL Market from Couchbase Perspective
James Phillips (Couchbase) for Curt Monash:
- MongoDB is the big competition. He believes Couchbase has an excellent win rate vs. 10gen for actual paying accounts.
- DataStax/Cassandra wins over Couchbase only when multi-data-center capability is important. Naturally, multi-data-center capability is planned for Couchbase. (Indeed, that’s one of the benefits of swapping in CouchDB at the back end.)
- Redis has “dropped off the radar”, presumably because there’s no particular persistence strategy for it.
- Riak doesn’t show up much.
I assume this is the 100,000-foot overview you would get from a pre-sales or sales department.
Wednesday, 1 February 2012
Riak Used by Auric Systems to Meet PCI Compliance Requirements
Auric Systems International, a leader in merchant transaction processing solutions, relies on Basho’s Riak to power its PaymentVault(TM) solution for PCI compliance. Riak was chosen because of the simplicity by which it replicates data, including stored encrypted credit card tokenized data, its ability to automate the aging of data, and its availability as open source.
After spending half an hour on the pcisecuritystandards site, I still couldn’t figure out what Level 1 PCI compliance means, so I can’t tell what Riak brought to the table.
If you thought all systems in the financial sector need transactions and are using relational databases, then I guess you were wrong. Read also Card payment systems and the CAP theorem to see the requirements of another financial service.
Thursday, 19 January 2012
Pros and Cons of Using MapReduce With Distributed Key-Value Stores: HBase, Cassandra, Riak
Old Quora question with very good answers.
- (pro) can (potentially) query live data
- (pro) can (conceptually) be highly efficient at joining data sets that are identically sharded on the join key (the joins can be pushed down into the key-value store itself)
- (con) full scans (the most common pattern for map-reduce) is most likely to be much faster with raw file system access
- (con) because of the better decoupling of computation and storage in the GFS+Map-Reduce model - tolerating hot spots (resulting from MR jobs) is much easier
- (con) key-value stores are rarely arranged to have schemas optimized for analytics
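The second "pro" above, joining data sets that are identically sharded on the join key, can be illustrated with a small sketch. When both sides use the same shard function, matching rows always land in the same shard, so each shard can be joined locally without moving data. All names here (`shard_of`, `shard`, `local_join`, `sharded_join`) and the sample records are hypothetical; a CRC32-based shard function is assumed so that the assignment is deterministic.

```python
import zlib
from collections import defaultdict


def shard_of(key, n_shards):
    """Deterministic shard assignment shared by both data sets."""
    return zlib.crc32(key.encode("utf-8")) % n_shards


def shard(records, key_field, n_shards):
    """Split records into n_shards lists by the join key."""
    shards = [[] for _ in range(n_shards)]
    for rec in records:
        shards[shard_of(rec[key_field], n_shards)].append(rec)
    return shards


def local_join(left, right, key_field):
    """Hash join of two record lists that live on the same shard."""
    by_key = defaultdict(list)
    for rec in right:
        by_key[rec[key_field]].append(rec)
    return [{**l, **r} for l in left for r in by_key.get(l[key_field], [])]


def sharded_join(left_shards, right_shards, key_field):
    """Join shard i of the left side only with shard i of the right side."""
    out = []
    for left, right in zip(left_shards, right_shards):
        out.extend(local_join(left, right, key_field))
    return out


users = [{"user": "ann", "city": "Oslo"}, {"user": "bob", "city": "Rome"}]
orders = [{"user": "ann", "order": 1}, {"user": "bob", "order": 2},
          {"user": "ann", "order": 3}]
joined = sharded_join(shard(users, "user", 4), shard(orders, "user", 4), "user")
```

No shard ever looks at another shard's records, which is the sense in which the join can be pushed down into the store itself.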
Basho: Congratulations, Amazon!
A dynamo-as-a-service offered by Amazon on their ecosystem will appeal to some. For others, the benefits of a Dynamo-inspired product that can be deployed on other public clouds, behind-the-firewall, or not on the cloud at all, will be critical.
Objective. Clear. To the point.
via: http://basho.com/blog/technical/2012/01/18/Congratulations-Amazon/
Tuesday, 17 January 2012
Bug Fix Release Riak 1.0.3 Available for Download
No mentions of any critical bugs in the announcement, but it is almost always a good idea to stay up to date.
Thursday, 12 January 2012
Eventual and Strong Consistency, Sloppy and Strict Quorums, and Other Lessons and Thoughts on Distributed Systems
Anything I’d write would just take time away from reading and thinking about the email Joseph Blomstedt posted to the Riak list.
Tuesday, 3 January 2012
Nmap Scripts for Riak, Redis, Memcached
If you take a look at the topic of security in the NoSQL context, you’ll notice that things are far from perfect. So, any contributions in this area are welcome. Patrik Karlsson added a couple of network exploration Nmap scripts for Riak, Redis, and Memcached. And while these will not help much with security, they might prove useful for managing your NoSQL deployments:
- Added the script riak-http-info that lists version and statistics information from the Basho Riak distributed database.
- Added the script memcached-info that lists version and statistics information from the distributed memory object caching service memcached
- Added the script redis-info that lists version and statistic information gathered from the Redis network key-value store.
- Added the redis library and the script redis-brute that performs brute force password guessing against the Redis network key-value store.
Tuesday, 20 December 2011
Grails 2.0 and NoSQL
Graeme Rocher:
Grails 2.0 is the first release of Grails that truly abstracts the GORM layer so that new implementations of GORM can be used. […] The MongoDB plugin is at final release candidate stage and is based on the excellent Spring Data MongoDB project which is also available in RC form. […] Grails users can look forward to more exciting NoSQL announcements in 2012 with upcoming future releases of GORM for Neo4j, Amazon SimpleDB and Cassandra in the works.
This is great news.
The really big news would be a Grails version that no longer defaults to using Hibernate for accessing a relational database.
via: http://blog.springsource.org/2011/12/15/grails-2-0-released/