


Riak: All content tagged as Riak in NoSQL databases and polyglot persistence

Riak 1.2 Released: Operational Improvements

Here’s the tl;dr on what’s new and improved since the Riak 1.1 release:

  • More efficiently add multiple Riak nodes to your cluster
  • Stage and review, then commit or abort cluster changes for easier operations; plus smoother handling of rolling upgrades
  • Better visibility into active handoffs
  • Repair Riak KV and Search partitions by attaching to the Riak Console and using a one-line command to recover from data corruption/loss
  • More performant stats for Riak; the addition of stats to Riak Search
  • 2i and Search usage thru the Protocol Buffers API
  • Official Support for Riak on FreeBSD
  • In Riak Enterprise: SSL encryption, better balancing and more granular control of replication across multiple data centers, NAT support

More details in the official announcement.

Original title and link: Riak 1.2 Released: Operational Improvements (NoSQL database©myNoSQL)


Congrats to Basho Team for the New Round of Funding

Besides the title, the details are in the official announcement. Teaser: $6.1 million is coming from Yahoo! Japan Corporation, which will use Riak CS.

Original title and link: Congrats to Basho Team for the New Round of Funding (NoSQL database©myNoSQL)

Why We Chose Riak

Lei Gu on the 9 reasons that made them choose Riak:

  1. Ever-Evolving Object Model
  2. High Availability and Multi-Data Center Support
  3. Free Text Search
  4. Adjacency Link Walking
  5. Secondary Index Support
  6. Multi-Tenant Support
  7. Ad Hoc Query Support Through MapReduce
  8. Performance
  9. Operation and Monitoring Support

These are a lot of good reasons.

Original title and link: Why We Chose Riak (NoSQL database©myNoSQL)


A Brief Introduction to Riak: Links and MapReduce

While researching NoSQL databases recently, I stumbled upon Riak, found myself intrigued, and decided to dive a little deeper. The quick and dirty: Riak is a key-value, distributed database made up of multiple independent nodes which can be joined together to form Riak clusters. Generally, when data is written into a Riak cluster, it will be written to multiple nodes so that even if a single node does go down, the data will still be reachable. One of Riak’s trademarks is this focus on high availability.

As per the title: just a brief Riak intro.
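
To make the "written to multiple nodes" idea concrete, here is a toy sketch in Python. It is not how Riak is implemented: real Riak hashes the bucket/key pair onto a ring split into partitions that are claimed by nodes, while this version maps a key hash directly onto a list of node names. The function names and node names are hypothetical; `n_val` borrows the name of Riak's replica-count setting, which defaults to 3.

```python
import hashlib
from typing import Iterable, List


def preference_list(key: str, nodes: List[str], n_val: int = 3) -> List[str]:
    """Pick up to n_val distinct nodes for a key by walking a toy hash ring."""
    # Hash the key to a position on the ring (here: an index into `nodes`).
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    start = int.from_bytes(digest, "big") % len(nodes)
    # Take the next n_val nodes clockwise, never more than the cluster has.
    count = min(n_val, len(nodes))
    return [nodes[(start + i) % len(nodes)] for i in range(count)]


def is_reachable(key: str, nodes: List[str], down: Iterable[str], n_val: int = 3) -> bool:
    """A key stays readable while at least one of its replicas is on a live node."""
    down_set = set(down)
    return any(node not in down_set for node in preference_list(key, nodes, n_val))


# Hypothetical five-node cluster.
cluster = ["riak1", "riak2", "riak3", "riak4", "riak5"]
replicas = preference_list("users/alice", cluster)
```

With three replicas per key, losing any single node in `cluster` still leaves two copies, which is the availability property the excerpt describes.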

Original title and link: A Brief Introduction to Riak: Links and MapReduce (NoSQL database©myNoSQL)


Riak Metrics With Folsom

In previous versions of Riak the same process gathered metrics and calculated statistics. In certain situations, under load, reading statistics would slow down, or even timeout altogether. The call to read stats would block that same process that updates stats, leading to large message queue backlogs.

This is the sort of observation and improvement that only comes from a product running in production (heavy production).

Original title and link: Riak Metrics With Folsom (NoSQL database©myNoSQL)


Reusable Patterns for Riak in Scala

Ray Jenkins sharing some cool Scala code for Riak on the Boundary blog:

I decided that I’d write this service in Scala and use Riak for persistence. I’m lazy and I can’t stand doing CRUD stuff so I looked around at our code at Boundary and on the internet and I didn’t find any simple reusable persistence layer in Scala for Riak so I decided to write my own.

Original title and link: Reusable Patterns for Riak in Scala (NoSQL database©myNoSQL)


Quick Guide to Riak HTTP API and Using Riak as Cache Service

A two-part article by Simon Buckle introducing the Riak HTTP API and using it with Riak's pluggable Memory back-end as a caching service for a web application. Somehow I missed that Riak has a pluggable in-memory (non-persistent) storage back-end. The only missing piece for making it a better caching solution would be the option to set a per-key expiry/time-to-live (TTL) value. It might be interesting to experiment with using Cache-Control and Last-Modified HTTP headers to simulate this behavior. Has anyone tried it?
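
A minimal sketch of the Last-Modified idea, in Python, assuming the client fetches an object over HTTP and gets a standard `Last-Modified` header back. The TTL is enforced entirely on the client side; nothing here expires the key inside Riak. The function name `is_fresh` and the TTL values are hypothetical, and the check itself is kept free of any network call so it can be tried in isolation.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def is_fresh(last_modified: str, ttl_seconds: int, now: Optional[datetime] = None) -> bool:
    """Return True if the object's Last-Modified time is younger than the TTL."""
    written_at = parsedate_to_datetime(last_modified)
    # Some date forms parse to a naive datetime; treat those as UTC.
    if written_at.tzinfo is None:
        written_at = written_at.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    return now - written_at < timedelta(seconds=ttl_seconds)


# Example: an object written 30 seconds before "now".
header = "Tue, 07 Aug 2012 12:00:00 GMT"
now = datetime(2012, 8, 7, 12, 0, 30, tzinfo=timezone.utc)
fresh_with_60s_ttl = is_fresh(header, 60, now)  # True: 30s old, 60s allowed
fresh_with_10s_ttl = is_fresh(header, 10, now)  # False: 30s old, 10s allowed
```

A caching client built this way would treat a stale object as a cache miss, and could optionally delete the key so the memory back-end does not keep holding it.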

Original title and link: Quick Guide to Riak HTTP API and Using Riak as Cache Service (NoSQL database©myNoSQL)

NoSQL and Relational Databases Podcast With Mathias Meyer

EngineYard’s Ines Sombra recorded a conversation with Mathias Meyer about NoSQL databases and their evolution towards friendlier functionality, relational databases and their steps towards non-relational models, and a bit more on what polyglot persistence means.

Mathias Meyer is one of the people I could talk with for days about NoSQL and databases in general, with different infrastructure toppings, and he has some of the most well-balanced thoughts on this exciting space. See this conversation I had with him in the early days of NoSQL. I strongly encourage you to download the mp3 and listen to it.

Original title and link: NoSQL and Relational Databases Podcast With Mathias Meyer (NoSQL database©myNoSQL)

Kobayashi: Historical Data Store with Riak

Speaking of networks, Boundary, the network monitoring company, is investigating an issue with their historical data storage solution built on top of Riak:

Kobayashi is the name we’ve bestowed upon a new historical data store for streaming data yet to be integrated into the Boundary stack. Every few seconds, a small chunk of the most recent data for each “stream” is remitted to kobayashi for longer-term storage. There are roughly 15-20 of these streams per customer to cover the necessary dimensionality and aggregation periods required by the Boundary dashboard. Kobayashi runs on 9 nodes paired with a riak cluster running on those same 9 nodes.

These days, every time I see an investigative monitoring dashboard, I think of Brendan Gregg’s great system visualizations.

Original title and link: Kobayashi: Historical Data Store with Riak (NoSQL database©myNoSQL)


Riak_mongo Makes Riak Look Like Mongo to Clients

By Pavlo Baron and Kresten Krab Thorup:

In the first step, it will allow Mongo drivers to seamlessly connect to it using Mongo Wire Protocol and to map to the underlying Riak data store. This can help migrate the data store of existing MongoDB based applications to Riak.

In the next step it also might be interesting to have a Mongo based Riak backend.

No need for the second step.

Original title and link: Riak_mongo Makes Riak Look Like Mongo to Clients (NoSQL database©myNoSQL)


NoSQL Releases and Announcements

Catching up after almost two weeks offline is no easy task, but I hope I won't miss any important events, releases, or posts. If I do, please email me.

Cassandra 1.0.9: Maintenance Release

The complete change notes for Cassandra 1.0.9 are here:

  • improve index sampling performance (CASSANDRA-4023)
  • always compact away deleted hints immediately after handoff (CASSANDRA-3955)
  • delete hints from dropped ColumnFamilies on handoff instead of erroring out (CASSANDRA-3975)
  • add CompositeType ref to the CLI doc for create/update column family (CASSANDRA-3980)
  • Avoid NPE during repair when a keyspace has no CFs (CASSANDRA-3988)
  • Fix division-by-zero error on get_slice (CASSANDRA-4000)
  • don’t change manifest level for cleanup, scrub, and upgradesstables operations under LeveledCompactionStrategy (CASSANDRA-3989, 4112)
  • fix race leading to super columns assertion failure (CASSANDRA-3957)
  • ensure that directory is selected for compaction for user-defined tasks and upgradesstables (CASSANDRA-3985)
  • allow custom types in CLI’s assume command (CASSANDRA-4081)
  • fix totalBytes count for parallel compactions (CASSANDRA-3758)
  • fix intermittent NPE in get_slice (CASSANDRA-4095)
  • remove unnecessary asserts in native code interfaces (CASSANDRA-4096)
  • Fix EC2 snitch incorrectly reporting region (CASSANDRA-4026)
  • Shut down thrift during decommission (CASSANDRA-4086)
  • Merged from 0.8: Fix ConcurrentModificationException in gossiper (CASSANDRA-4019)

  • Pig

    • support Counter ColumnFamilies (CASSANDRA-3973)
    • Composite column support (CASSANDRA-3684)
  • CQL

    • fix NPE on invalid CQL delete command (CASSANDRA-3755)
    • Validate blank keys in CQL to avoid assertion errors (CASSANDRA-3612)

Apache Hadoop User Impersonation vulnerability

This vulnerability, discovered by Cloudera’s Aaron T. Myers, affects several Hadoop versions, including 1.0.0 to 1.0.1 and 0.23.0 to 0.23.1, when Kerberos is enabled. Complete details available here.

CouchDB 1.2.0

This is the first important release after the start-of-the-year CouchDB hubbub with Damien Katz and Couchbase. The new version is a major release that deserves its own post: CouchDB 1.2.0: Performance, Security, API, Core and Replication Improvements.

Riak 1.1.2: Stabilization release

Just a maintenance release in the Riak 1.1 series. Complete release notes here.

Original title and link: NoSQL Releases and Announcements (NoSQL database©myNoSQL)

Here Is Why in Cassandra vs. HBase, Riak, CouchDB, MongoDB, It's Cassandra FTW

Brian ONeill:

Now, since choosing Cassandra, I can say there are a few other really important less tangible considerations. The first, is the code base. Cassandra has an extremely clean and well maintained code base. Jonathan and team do a fantastic job managing the community and the code. As we adopted NoSQL, the ability to extend the code-base and incorporate our own features has proven invaluable. (e.g. triggers, a REST interface, and server-side wide-row indexing)

Secondly, the community is phenomenal. That results in timely support, and solid releases on a regular schedule. They do a great job prioritizing features, accepting contributions, and cranking out features. (They are now releasing ~quarterly) We’ve all probably been part of other open source projects where the leadership is lacking, and features and releases are unpredictable, which makes your own release planning difficult. Kudos to the Cassandra team.

Everything sounds reasonable except for Riak being the “new kid on the block” and not finding support for it. Basho, where were you hiding?

Original title and link: Here Is Why in Cassandra vs. HBase, Riak, CouchDB, MongoDB, It’s Cassandra FTW (NoSQL database©myNoSQL)