Riak: All content tagged as Riak in NoSQL databases and polyglot persistence
One of the major releases that happened around the end of February, and which I missed due to some personal problems, is Riak 1.1. I assume that by now everyone using Riak already knows all the goodies the Basho team packaged into this release, but for those who are not yet on board, here is a summary:
From the Release notes:
- Numerous changes to Riak Core which address issues with cluster scalability, and enable Riak to better handle large clusters and large rings
- New Ownership Claim Algorithm: The new ring ownership claim algorithm introduced as an optional setting in the 1.0 release has been set as the default for 1.1. The new claim algorithm significantly reduces the amount of ownership shuffling for clusters with more than N+2 nodes in them.
Riak KV improvements:
- Listkeys backpressure: Backpressure has been added to listkeys to prevent the node listing keys from being overwhelmed.
- Don’t drop post-commit errors on floor
- The MapReduce interface now supports requests with empty queries. This allows the 2i, list-keys, and search inputs to return matching keys to clients without needing to include a reduce_identity query phase.
- MapReduce error messages have been improved. Most error cases should now return helpful information all the way to the client, while also producing less spam in Riak’s logs.
- Bitcask and LevelDB improvements
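To make the empty-query change concrete, here is a minimal sketch of the JSON body of a MapReduce job that uses a 2i input. It only builds the request body; sending it (normally a POST to the `/mapred` resource of Riak's HTTP interface) is left out. The bucket name, index name, and key are hypothetical, and the helper `build_mapred_request` is mine, not part of any Riak client.

```python
import json

# Identity reduce phase that older releases needed just to get the keys back.
REDUCE_IDENTITY_PHASE = {
    "reduce": {
        "language": "erlang",
        "module": "riak_kv_mapreduce",
        "function": "reduce_identity",
    }
}

def build_mapred_request(bucket, index, key, pre_1_1=False):
    """Build a MapReduce job body with a 2i input (all names are illustrative)."""
    # Before 1.1 a phase was required; from 1.1 on the query list may be empty.
    query = [REDUCE_IDENTITY_PHASE] if pre_1_1 else []
    return {
        "inputs": {"bucket": bucket, "index": index, "key": key},
        "query": query,
    }

# Hypothetical bucket, index, and key, used only for illustration.
body = json.dumps(build_mapred_request("users", "email_bin", "jane@example.com"))
print(body)
```

With `pre_1_1=True` the same helper produces the older form, which makes the difference between the two requests easy to see side by side.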
Then there’s also Riaknostic and the new Riak admin tool: Riak Control.
What is Riaknostic?
From the initial Riaknostic announcement:
Riaknostic is an Erlang script (escript) that runs a series of “diagnostics” or “checks”, inspecting your operating system and Riak installation for known potential problems and then printing suggestions for how to fix those problems. Riaknostic will NOT fix those problems for you, it’s only a tool for diagnostics. Some of the things it checks are:
- How much memory does the Riak process currently use?
- Do Riak’s data directories have the correct permissions?
- Did the Riak node crash in the past and leave a dump file?
The Riaknostic project page is here.
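To give a feel for what such checks look like, here is a rough sketch in Python of two of the checks listed above: data directory permissions and a leftover crash dump. This is not Riaknostic's code (Riaknostic is an Erlang escript), and the function names and messages are my own. The only detail taken from Erlang itself is the default crash dump file name, `erl_crash.dump`. Like Riaknostic, the sketch only reports suggestions and fixes nothing.

```python
import os

def check_data_dir(path):
    """Return suggestions for one data directory (empty list means no problem found)."""
    if not os.path.isdir(path):
        return [f"{path}: directory does not exist; check the configured data path"]
    suggestions = []
    # The process needs read, write, and traverse access to its data directory.
    if not os.access(path, os.R_OK | os.W_OK | os.X_OK):
        suggestions.append(f"{path}: not readable/writable by this user; fix owner or mode")
    return suggestions

def check_crash_dump(log_dir, dump_name="erl_crash.dump"):
    """Report a crash dump left behind by an earlier crash of the node."""
    dump = os.path.join(log_dir, dump_name)
    if os.path.exists(dump):
        return [f"{dump}: crash dump found; the node crashed in the past, inspect the dump"]
    return []

def run_checks(data_dirs, log_dir):
    """Run every check and collect the suggestions; nothing is modified."""
    suggestions = []
    for path in data_dirs:
        suggestions.extend(check_data_dir(path))
    suggestions.extend(check_crash_dump(log_dir))
    return suggestions
```

Calling `run_checks` with the data and log directories of your own installation returns a list of plain-text suggestions, or an empty list when nothing looks wrong.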
What is Riak Control?
From the Riak Control GitHub page:
Riak Control is a set of webmachine resources, all accessible via the /admin/* paths, allow you to inspect your running cluster, and manipulate it in various ways.
Now, that description doesn’t do Riak Control any justice. Riak Control is a very fancy REST-driven admin interface for Riak. You don’t have to take my word for it, so check this screenshot:
Riak Control covers different details of a Riak cluster:
- general cluster status
- details about the cluster
- details about the ring
This blog post gives more details about Riak Control and a couple more sexy screenshots. If you’d like to dive a bit deeper into Riak Control, after the break you can also watch a 25-minute video of Mark Phillips talking about it.
Riak and WebMachine are the two systems that make me wish I knew Erlang, so I could dive in and learn more about them. I’m already (slowly) working to change this.
If you are asked to compare (or you just wonder about) the performance of link walking and map-reduce in Riak, keep in mind the following details of how the two mechanisms are implemented:
My emphasis on Bryan Fink’s email from Riak’s mailing list.
Original title and link: Riak Performance of Link Walking vs MapReduce ( ©myNoSQL)
Auric Systems International, a leader in merchant transaction processing solutions, relies on Basho’s Riak to power its PaymentVault(TM) solution for PCI compliance. Riak was chosen because of the simplicity by which it replicates data, including stored encrypted credit card tokenized data, its ability to automate the aging of data, and its availability as open source.
After spending half an hour on the pcisecuritystandards site, I still couldn’t figure out what Level 1 PCI compliance means, so I can’t tell what exactly Riak brought to the table.
If you thought all systems in the financial sector need transactions and use relational databases, then I guess you were wrong. Read also Card payment systems and the CAP theorem to see the requirements of another financial service.
Original title and link: Riak Used by Auric Systems to Meet PCI Compliance Requirements ( ©myNoSQL)
Old Quora question with very good answers.
- (pro) can (potentially) query live data
- (pro) can (conceptually) be highly efficient at joining data sets that are identically sharded on the join key (the joins can be pushed down into the key-value store itself)
- (con) full scans (the most common pattern for map-reduce) are most likely to be much faster with raw file system access
- (con) tolerating hot spots (resulting from MR jobs) is much easier in the GFS+Map-Reduce model, because it decouples computation and storage better
- (con) key-value stores are rarely arranged to have schemas optimized for analytics
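The second pro, joins pushed down into the store, is easier to see with a toy example. The sketch below assumes two data sets partitioned by the same function on the join key, so every shard can be joined locally and no data has to move between shards. The partitioner and all function names are illustrative assumptions; none of this is how HBase, Cassandra, or Riak actually implement anything.

```python
from collections import defaultdict

def shard_of(key, num_shards):
    """Toy deterministic partitioner; real stores use their own hash function."""
    return sum(key.encode("utf-8")) % num_shards

def partition(records, num_shards):
    """Group (key, value) records into shards, each a key -> values mapping."""
    shards = [defaultdict(list) for _ in range(num_shards)]
    for key, value in records:
        shards[shard_of(key, num_shards)][key].append(value)
    return shards

def local_join(left_shard, right_shard):
    """Join one shard of each data set; only keys present on both sides match."""
    return [
        (key, left_value, right_value)
        for key in left_shard
        if key in right_shard
        for left_value in left_shard[key]
        for right_value in right_shard[key]
    ]

def sharded_join(left, right, num_shards=4):
    """Inner join done shard by shard: matching keys always share a shard index."""
    joined = []
    for left_shard, right_shard in zip(partition(left, num_shards),
                                       partition(right, num_shards)):
        joined.extend(local_join(left_shard, right_shard))
    return joined

users = [("u1", "Ann"), ("u2", "Bob")]
orders = [("u1", "book"), ("u1", "pen"), ("u3", "cup")]
print(sorted(sharded_join(users, orders)))
```

Because both sides use the same partitioner, a key can only ever find its match in the shard with the same index, which is what lets each node do its part of the join on its own.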
Original title and link: Pros and Cons of Using MapReduce With Distributed Key-Value Stores: HBase, Cassandra, Riak ( ©myNoSQL)