Riak: All content tagged as Riak in NoSQL databases and polyglot persistence
Tuesday, 30 October 2012
Improvements and Benchmarks for LevelDB in Riak 1.2
The Basho team has started to investigate and optimize LevelDB, one of the supported storage engines for Riak and the engine behind Riak 2i, and the results are already impressive:
- reduced stalls (from 10-90s every 3-5min to 10-30s every 2h)
- increased throughput (from 400 ops/s to 2000 ops/s)
- a better solution for dealing with an infinite loop during compaction against a corrupted data block
- LevelDB bloom filter for quickly identifying keys that don’t exist in the data store
The original post also shows charts of the throughput and maximum latency measured for LevelDB in Riak 1.1 vs. Riak 1.2.
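The bloom filter idea from the list above can be sketched in a few lines of Python. This is not LevelDB's implementation: the class name, the bit-array size, and the salted SHA-256 hashing are assumptions made purely for illustration. The point is only that a negative answer is definitive, so a lookup for a missing key can skip the disk read.

```python
import hashlib


class BloomFilter:
    """Toy bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes positions by salting a single hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means the key is definitely absent; True means "maybe".
        return all(self.bits[pos] for pos in self._positions(key))


bloom = BloomFilter()
for key in ("user:1", "user:2", "user:3"):
    bloom.add(key)
```

A store would consult `might_contain` before touching its on-disk files and only pay for the read when the filter answers "maybe".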
Original title and link: Improvements and Benchmarks for LevelDB in Riak 1.2 (©myNoSQL)
via: http://basho.com/blog/technical/2012/10/30/leveldb-in-riak-1p2/
Friday, 26 October 2012
Alex Sicular's Recap of Ricon 2012, a Distributed Systems Conference for Developers
While in conference mode[1] I’m like a sponge, but I’m almost no good at putting all my chaotic notes into a format that is usable by anyone else.
Alex Sicular has done a great job writing down his thoughts about Basho’s fantastic Ricon 2012, and linking to his post makes me feel less guilty for not being able to post mine; I’m learning to get better for the next events:
Chatter by conference attendees left me convinced that Ricon was a success. Ricon was well-executed, well-attended and actually interesting. But more importantly, it was relevant. For those of us at the conference, we actually work in this space. We are interested in the ongoing development of distributed solutions to a number of problems. The conference delivered on creating a space that brought us together to share solutions and learn about continuing advancements. For a new conference to have a successful maiden voyage is no small feat in my book. I, for one, am looking forward to the next one.
My only contribution to Alex Sicular’s great recap is to provide some links to the talks his blog post refers to:
Joe Hellerstein: Programming Principles for a Distributed Era
The PDF can be downloaded from here
Eric Brewer: Advancing Distributed Systems
Russell Brown and Sean Cribbs: Data Structures in Riak
Bryan Fink: Riak Pipe: Distributed Processing System
Ryan Zezeski: Yokozuna: Riak + Solr
More presentation slides can be found on the official Ricon 2012 site.
[1] My thanks again to the Basho team for inviting me to Ricon 2012, and also to the DataStax team for the Cassandra Summit invitation.
Original title and link: Alex Sicular’s Recap of Ricon 2012, a Distributed Systems Conference for Developers (©myNoSQL)
Thursday, 25 October 2012
YCSB Benchmark Results for Cassandra, HBase, MongoDB, MySQL Cluster, and Riak
Put together by the team at Altoros Systems Inc., this time run on Amazon EC2 and including Cassandra, HBase, MongoDB, MySQL Cluster, sharded MySQL, and Riak:
After some of the results had been presented to the public, some observers said MongoDB should not be compared to other NoSQL databases because it is more targeted at working with memory directly. We certainly understand this, but the aim of this investigation is to determine the best use cases for different NoSQL products. Therefore, the databases were tested under the same conditions, regardless of their specifics.
Teaser: HBase got the best results in most of the benchmarks (with flush turned off though). And I’m not sure the setup included the latest HBase read improvements from Facebook.
Original title and link: YCSB Benchmark Results for Cassandra, HBase, MongoDB, MySQL Cluster, and Riak (©myNoSQL)
Wednesday, 24 October 2012
Rolling With Eventual Consistency or the Pros and Cons of a Dynamo Style Key-Value Store
Great educational post by Casey Rosenthal on Basho’s blog about the radically different approach to data modelling required when using non-relational storage engines or non-queryable data models.
In a previous post I wrote about the different mindset that a software engineer should have when building for a key-value database as opposed to a relational database. When working with a relational database, you describe the model first and then query the data later. With a key-value database, you focus first on what you want the result of the query to look like, and then work backward toward a model.
A different way to look at it is that the advantages of a Dynamo-style, highly available key-value store don’t come for free. In the world of distributed systems there’s always a trade-off: you need to choose each component of the architecture carefully to match the requirements, while staying aware of the concessions or complexity you’ll have to accept in other parts of the system.
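The "work backward from the query result" advice can be made concrete with a small sketch. Everything here is hypothetical: a plain Python dict stands in for the key-value store, and the `recent_orders:<customer>` key scheme and helper names are made up for the example. The desired read ("the latest orders for a customer") is decided first, and the write path is shaped so that read becomes a single key lookup.

```python
import json

# A plain dict stands in for a key-value store; values are opaque JSON blobs.
store = {}


def put(key, value):
    store[key] = json.dumps(value)


def get(key):
    raw = store.get(key)
    return None if raw is None else json.loads(raw)


def record_order(customer_id, order):
    """Write path: maintain the precomputed answer to 'recent orders'."""
    key = f"recent_orders:{customer_id}"
    recent = get(key) or []
    recent.insert(0, order)  # newest first
    put(key, recent[:10])    # keep only what the read needs


def recent_orders(customer_id):
    """Read path: one key lookup, no ad hoc query."""
    return get(f"recent_orders:{customer_id}") or []


record_order("c42", {"id": 1, "total": 30})
record_order("c42", {"id": 2, "total": 15})
```

The cost shows up on the write side: the read-modify-write in `record_order` is not safe under concurrent writers, and an eventually consistent store would need some form of conflict resolution that this sketch ignores.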
Original title and link: Rolling With Eventual Consistency or the Pros and Cons of a Dynamo Style Key-Value Store (©myNoSQL)
via: http://basho.com/blog/technical/2012/09/18/Rolling-with-Eventual-Consistency/
Monday, 1 October 2012
Using Riak as Cache Layer
Sean Cribbs explains how to use Riak as a caching solution:
- Bitcask or Memory backends
- The possibility of configuring the cluster for lower guarantees of per-key availability
Then benchmark the system for your scenario.
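As a rough illustration of the second point, one knob that trades per-key availability for less work per write is a bucket's replica count (`n_val`). The sketch below only builds the HTTP request that would set it through Riak's bucket-properties endpoint and does not send anything. The node address, the bucket name `cache`, the helper name, and the choice of `n_val=1` are all assumptions for the example, not recommendations taken from the linked post.

```python
import json
import urllib.request


def build_cache_bucket_request(base_url, bucket, n_val=1):
    """Build (but do not send) a request that lowers a bucket's replica count."""
    body = json.dumps({"props": {"n_val": n_val}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/buckets/{bucket}/props",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical local node and bucket name.
request = build_cache_bucket_request("http://127.0.0.1:8098", "cache")
# To apply it against a running node: urllib.request.urlopen(request)
```

Choosing the Bitcask or Memory backend is a node-level configuration decision and is not covered by this request.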
Original title and link: Using Riak as Cache Layer (©myNoSQL)
via: http://lists.basho.com/pipermail/riak-users_lists.basho.com/2012-October/009680.html
Tuesday, 25 September 2012
Doing Redundant Work to Speed Up Distributed Queries
Great post by Peter Bailis looking at how some systems are reducing tail latency by distributing reads across nodes:
Open-source Dynamo-style stores have different answers. Apache Cassandra originally sent reads to all replicas, but CASSANDRA-930 and CASSANDRA-982 changed this: one commenter argued that “in IO overloaded situations” it was better to send read requests only to the minimum number of replicas. By default, Cassandra now sends reads to the minimum number of replicas 90% of the time and to all replicas 10% of the time, primarily for consistency purposes. (Surprisingly, the relevant JIRA issues don’t even mention the latency impact.) LinkedIn’s Voldemort also uses a send-to-minimum strategy (and has evidently done so since it was open-sourced). In contrast, Basho Riak chooses the “true” Dynamo-style send-to-all read policy.
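The difference between the two read policies in the quote can be explored with a toy simulation. The latency model (mostly 3-7 ms, with a 5% chance of a 100 ms stall per replica), the replica counts, and all function names are invented for illustration; they are not measurements from Cassandra, Voldemort, or Riak.

```python
import random


def send_to_minimum(latencies, r):
    """Contact only r replicas; the read finishes when the slowest of them replies."""
    return max(latencies[:r])


def send_to_all(latencies, r):
    """Contact every replica; the read finishes when the r-th fastest replies."""
    return sorted(latencies)[r - 1]


def simulate(trials=10_000, n=3, r=2, seed=1):
    rng = random.Random(seed)
    minimum, everyone = [], []
    for _ in range(trials):
        # Hypothetical model: usually a few ms, occasionally a long stall.
        sample = [
            100.0 if rng.random() < 0.05 else rng.uniform(3.0, 7.0)
            for _ in range(n)
        ]
        minimum.append(send_to_minimum(sample, r))
        everyone.append(send_to_all(sample, r))
    return minimum, everyone


def percentile(values, pct):
    ordered = sorted(values)
    return ordered[min(len(ordered) - 1, int(len(ordered) * pct / 100))]


minimum, everyone = simulate()
print("p99 send-to-minimum:", percentile(minimum, 99))
print("p99 send-to-all:    ", percentile(everyone, 99))
```

With three replicas and two required responses, a single stalled replica never delays a send-to-all read, while it delays a send-to-minimum read whenever it is one of the two contacted. The price is the extra read work done on every replica, which is the "IO overloaded" concern mentioned in the quote.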
Original title and link: Doing Redundant Work to Speed Up Distributed Queries (©myNoSQL)
via: http://www.bailis.org/blog/doing-redundant-work-to-speed-up-distributed-queries/
Monday, 3 September 2012
From MongoDB to Riak at Shareaholic
Robby Grossman talked at the Boston Riak meetup about Shareaholic’s migration from MongoDB to Riak, their requirements, and their evaluation of the top contenders: HBase, Cassandra, and Riak.
Why not MongoDB?
- working set needs to fit in memory
- global write lock blocks all queries despite not having transactions/joins
- standbys not “hot”
Pros and cons for HBase, Cassandra, and Riak are in the slides, in bullet-point format.
via: http://blog.shareaholic.com/2012/08/migrating-to-riak-at-shareaholic/
Tuesday, 7 August 2012
Riak 1.2 Released: Operational Improvements
Here’s the tl;dr on what’s new and improved since the Riak 1.1 release:
- More efficiently add multiple Riak nodes to your cluster
- Stage and review, then commit or abort cluster changes for easier operations; plus smoother handling of rolling upgrades
- Better visibility into active handoffs
- Repair Riak KV and Search partitions by attaching to the Riak Console and using a one-line command to recover from data corruption/loss
- More performant stats for Riak; the addition of stats to Riak Search
- 2i and Search usage thru the Protocol Buffers API
- Official Support for Riak on FreeBSD
- In Riak Enterprise: SSL encryption, better balancing and more granular control of replication across multiple data centers, NAT support
More details in the official announcement.
Original title and link: Riak 1.2 Released: Operational Improvements (©myNoSQL)
via: http://basho.com/blog/technical/2012/08/07/Riak-1-2-released/
Tuesday, 17 July 2012
Congrats to Basho Team for the New Round of Funding
Beyond what the title says, the details are in the official announcement. Teaser: $6.1 million is coming from Yahoo! Japan Corporation, which will use Riak CS.
Original title and link: Congrats to Basho Team for the New Round of Funding (©myNoSQL)
Friday, 13 July 2012
Why We Chose Riak
Lei Gu on the 9 reasons that made them choose Riak:
- Ever-Evolving Object Model
- High Availability and Multi-Data Center Support
- Free Text Search
- Adjacency Link Walking
- Secondary Index Support
- Multi-Tenant Support
- Ad Hoc Query Support Through MapReduce
- Performance
- Operation and Monitoring Support
These are a lot of good reasons.
Original title and link: Why We Chose Riak (©myNoSQL)
via: http://2rdscreenretargeting.blogspot.com/2012/07/why-we-chose-riak-as-persistence.html
Thursday, 12 July 2012
A Brief Introduction to Riak: Links and MapReduce
While researching NoSQL databases recently, I stumbled upon Riak, found myself intrigued, and decided to dive a little deeper. The quick and dirty: Riak is a key-value, distributed database made up of multiple independent nodes which can be joined together to form Riak clusters. Generally, when data is written into a Riak cluster, it will be written to multiple nodes so that even if a single node does go down, the data will still be reachable. One of Riak’s trademarks is this focus on high availability.
As per the title: just a brief Riak intro.
Original title and link: A Brief Introduction to Riak: Links and MapReduce (©myNoSQL)
via: http://cloud.dzone.com/articles/brief-introduction-riak
Monday, 9 July 2012
Riak Metrics With Folsom
In previous versions of Riak the same process gathered metrics and calculated statistics. In certain situations, under load, reading statistics would slow down, or even timeout altogether. The call to read stats would block that same process that updates stats, leading to large message queue backlogs.
This is the sort of observation and improvement that can only come from a product running in production (heavy production).
Original title and link: Riak Metrics With Folsom (©myNoSQL)
via: http://basho.com/blog/technical/2012/07/02/folsom-backed-stats-riak-1-2/