riak: All content tagged as riak in NoSQL databases and polyglot persistence

Reusable Patterns for Riak in Scala

Ray Jenkins shares some cool Scala code for Riak on the Boundary blog:

I decided that I’d write this service in Scala and use Riak for persistence. I’m lazy and I can’t stand doing CRUD stuff so I looked around at our code at Boundary and on the internet and I didn’t find any simple reusable persistence layer in Scala for Riak so I decided to write my own.

Original title and link: Reusable Patterns for Riak in Scala (NoSQL database©myNoSQL)

via: http://blog.boundary.com/2012/07/09/reusable-patterns-for-riak-in-scala/


Quick Guide to Riak HTTP API and Using Riak as Cache Service

A two-part article by Simon Buckle introducing the Riak HTTP API and using it, together with Riak’s pluggable memory back-end, as a caching service for a web application. Somehow I missed that Riak has a pluggable in-memory (non-persistent) storage back-end. The only missing piece for making it a better caching solution would be the option to set a per-key expiry/time-to-live (TTL) value. It might be interesting to experiment with the Cache-Control and Last-Modified HTTP headers to simulate this behavior. Has anyone tried it?
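Here is a minimal Python sketch of how the Last-Modified part of that experiment might look on the client side. It assumes the client has already fetched the object over the Riak HTTP API and holds the raw Last-Modified header value. The helper names `object_url` and `is_expired` and the TTL values are made up for illustration; nothing here comes from Simon Buckle's article.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from urllib.parse import quote


def object_url(host, bucket, key, port=8098):
    """Build the HTTP API URL of a Riak object (8098 is Riak's default HTTP port)."""
    # Hypothetical helper: percent-encode bucket and key so "/" in a key is safe.
    return f"http://{host}:{port}/buckets/{quote(bucket, safe='')}/keys/{quote(key, safe='')}"


def is_expired(last_modified, ttl_seconds, now=None):
    """Return True when the cached object is older than ttl_seconds.

    last_modified is the raw Last-Modified header value of the GET response,
    e.g. "Wed, 11 Jul 2012 10:00:00 GMT".
    """
    now = now or datetime.now(timezone.utc)
    modified_at = parsedate_to_datetime(last_modified)
    return now - modified_at > timedelta(seconds=ttl_seconds)


# Usage with a fixed clock so the result is reproducible.
stamp = "Wed, 11 Jul 2012 10:00:00 GMT"
now = datetime(2012, 7, 11, 10, 5, tzinfo=timezone.utc)
print(is_expired(stamp, ttl_seconds=60, now=now))   # True: 5 minutes old, TTL of 1 minute
print(is_expired(stamp, ttl_seconds=600, now=now))  # False: still within 10 minutes
```

A client that sees `True` would treat the entry as a cache miss and delete or overwrite it. The store itself still never expires anything, so this only simulates a TTL.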

Original title and link: Quick Guide to Riak HTTP API and Using Riak as Cache Service (NoSQL database©myNoSQL)


NoSQL and Relational Databases Podcast With Mathias Meyer

EngineYard’s Ines Sombra recorded a conversation with Mathias Meyer about NoSQL databases and their evolution towards friendlier functionality, relational databases and their steps towards non-relational models, and a bit more on what polyglot persistence means.

Mathias Meyer is one of the people I could talk with for days about NoSQL and databases in general, with different infrastructure toppings, and he has some of the most well-balanced thoughts on this exciting space; see this conversation I had with him in the early days of NoSQL. I strongly encourage you to download the mp3 and listen to it.

Original title and link: NoSQL and Relational Databases Podcast With Mathias Meyer (NoSQL database©myNoSQL)


Kobayashi: Historical Data Store with Riak

Speaking of networks, Boundary, the network monitoring company, is investigating an issue with its historical data storage solution built on top of Riak:

Kobayashi is the name we’ve bestowed upon a new historical data store for streaming data yet to be integrated into the Boundary stack. Every few seconds, a small chunk of the most recent data for each “stream” is remitted to kobayashi for longer-term storage. There are roughly 15-20 of these streams per customer to cover the necessary dimensionality and aggregation periods required by the Boundary dashboard. Kobayashi runs on 9 nodes paired with a riak cluster running on those same 9 nodes.

These days, every time I see an investigative monitoring dashboard, I think of Brendan Gregg’s great system visualizations.

Original title and link: Kobayashi: Historical Data Store with Riak (NoSQL database©myNoSQL)

via: http://blog.boundary.com/2012/04/19/hungry-kobayashi-pt1/


Riak_mongo Makes Riak Look Like Mongo to Clients

By Pavlo Baron and Kresten Krab Thorup:

In the first step, it will allow Mongo drivers to seamlessly connect to it using Mongo Wire Protocol and to map to the underlying Riak data store. This can help migrate the data store of existing MongoDB based applications to Riak.

In the next step it also might be interesting to have a Mongo based Riak backend.

No need for the second step.

Original title and link: Riak_mongo Makes Riak Look Like Mongo to Clients (NoSQL database©myNoSQL)

via: https://github.com/pavlobaron/riak_mongo


NoSQL Releases and Announcements

Catching up after almost two weeks offline is no easy task, but I hope I won’t miss any important events, releases, or posts. If I do, please email me.

Cassandra 1.0.9: Maintenance Release

The complete change notes for Cassandra 1.0.9 are here:

  • improve index sampling performance (CASSANDRA-4023)
  • always compact away deleted hints immediately after handoff (CASSANDRA-3955)
  • delete hints from dropped ColumnFamilies on handoff instead of erroring out (CASSANDRA-3975)
  • add CompositeType ref to the CLI doc for create/update column family (CASSANDRA-3980)
  • Avoid NPE during repair when a keyspace has no CFs (CASSANDRA-3988)
  • Fix division-by-zero error on get_slice (CASSANDRA-4000)
  • don’t change manifest level for cleanup, scrub, and upgradesstables operations under LeveledCompactionStrategy (CASSANDRA-3989, 4112)
  • fix race leading to super columns assertion failure (CASSANDRA-3957)
  • ensure that directory is selected for compaction for user-defined tasks and upgradesstables (CASSANDRA-3985)
  • allow custom types in CLI’s assume command (CASSANDRA-4081)
  • fix totalBytes count for parallel compactions (CASSANDRA-3758)
  • fix intermittent NPE in get_slice (CASSANDRA-4095)
  • remove unnecessary asserts in native code interfaces (CASSANDRA-4096)
  • Fix EC2 snitch incorrectly reporting region (CASSANDRA-4026)
  • Shut down thrift during decommission (CASSANDRA-4086)
  • Merged from 0.8: Fix ConcurrentModificationException in gossiper (CASSANDRA-4019)

  • Pig

    • support Counter ColumnFamilies (CASSANDRA-3973)
    • Composite column support (CASSANDRA-3684)
  • CQL

    • fix NPE on invalid CQL delete command (CASSANDRA-3755)
    • Validate blank keys in CQL to avoid assertion errors (CASSANDRA-3612)

Apache Hadoop User Impersonation vulnerability

This vulnerability, discovered by Cloudera’s Aaron T. Myers, affects Hadoop versions 0.20.203.0, 0.20.204.0, 0.20.205.0, 1.0.0 to 1.0.1, and 0.23.0 to 0.23.1 when Kerberos is enabled. Complete details are available here.

CouchDB 1.2.0

This is the first important release after the start-of-the-year CouchDB hubbub with Damien Katz and Couchbase. The new version is a major release that deserves its own post: CouchDB 1.2.0: Performance, Security, API, Core and Replication Improvements.

Riak 1.1.2: Stabilization release

Just a maintenance release in the Riak 1.1 series. Complete release notes here.

Original title and link: NoSQL Releases and Announcements (NoSQL database©myNoSQL)


Here Is Why in Cassandra vs. HBase, Riak, CouchDB, MongoDB, It's Cassandra FTW

Brian O’Neill:

Now, since choosing Cassandra, I can say there are a few other really important less tangible considerations. The first, is the code base. Cassandra has an extremely clean and well maintained code base. Jonathan and team do a fantastic job managing the community and the code. As we adopted NoSQL, the ability to extend the code-base and incorporate our own features has proven invaluable. (e.g. triggers, a REST interface, and server-side wide-row indexing)

Secondly, the community is phenomenal. That results in timely support, and solid releases on a regular schedule. They do a great job prioritizing features, accepting contributions, and cranking out features. (They are now releasing ~quarterly) We’ve all probably been part of other open source projects where the leadership is lacking, and features and releases are unpredictable, which makes your own release planning difficult. Kudos to the Cassandra team.

Everything sounds reasonable except for Riak being the “new kid on the block” and not finding support for it. Basho, where were you hiding?

Original title and link: Here Is Why in Cassandra vs. HBase, Riak, CouchDB, MongoDB, It’s Cassandra FTW (NoSQL database©myNoSQL)

via: http://brianoneill.blogspot.com/2012/04/cassandra-vs-couchdb-mongodb-riak-hbase.html


Basho Announces Riak-Based Multi-Tenant, Distributed, S3-Compatible Cloud Storage Platform

Coverage of Basho’s announcement of a new product, Riak CS, a multi-tenant, distributed, S3-compatible cloud storage platform:

My notes about Riak CS will follow shortly.

Original title and link: Basho Announces Riak-Based Multi-Tenant, Distributed, S3-Compatible Cloud Storage Platform (NoSQL database©myNoSQL)


NoSQL Databases Adoption in Numbers

The data comes from download counts for Jaspersoft’s NoSQL connectors. RedMonk published a graphic and an analysis, and Klint Finley followed up with job trends:

NoSQL databases adoption

A couple of things I don’t see mentioned in the RedMonk post:

  1. if and how the data has been normalized based on each connector’s availability

    According to the post, the data was collected between Jan. 2011 and Mar. 2012, and I think that not all connectors were available from the beginning of that period.

  2. if and how the marketing push for each connector has been weighted

    Announcing the Hadoop connector at an event with 2000 attendees or the MongoDB connector at an event with 800 attendees could definitely influence the results (nb: keep in mind that the largest number is less than 7000, thus 200-500 downloads triggered by such an event have a significant impact)

  3. Redis and VoltDB are mostly OLTP-only databases
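
To put the second point in numbers, here is a quick back-of-the-envelope calculation in Python. The 7000 figure is only the upper bound quoted above ("less than 7000"), so the first two shares are lower bounds. The 2000-download connector is a made-up example, not a number from the RedMonk data.

```python
def share_of_total(event_downloads, total_downloads):
    """Percentage of a connector's total downloads attributable to one event."""
    return 100.0 * event_downloads / total_downloads


LARGEST_TOTAL = 7000  # upper bound: "the largest number is less than 7000"

for bump in (200, 500):
    print(f"{bump} of {LARGEST_TOTAL}: {share_of_total(bump, LARGEST_TOTAL):.1f}%")
# 200 of 7000: 2.9%
# 500 of 7000: 7.1%

# Hypothetical smaller connector with 2000 total downloads (assumed figure).
print(f"500 of 2000: {share_of_total(500, 2000):.1f}%")
# 500 of 2000: 25.0%
```

Even for the most downloaded connector a single event could account for several percent of the total, and for less popular connectors the distortion grows quickly.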

Original title and link: NoSQL Databases Adoption in Numbers (NoSQL database©myNoSQL)


Which NoSQL Databases Are Robust to Net-Splits?

Answered on Quora:

  • Dynamo (key-value)
  • Voldemort (key-value)
  • Tokyo Cabinet (key-value)
  • KAI (key-value)
  • Cassandra (column-oriented/tabular)
  • CouchDB (document-oriented)
  • SimpleDB (document-oriented)
  • Riak (document-oriented)

A couple of clarifications to the list above:

  1. Dynamo has never been available to the public. On the other hand, DynamoDB is not exactly Dynamo.
  2. Tokyo Cabinet is not a distributed database so it shouldn’t be in this list
  3. CouchDB isn’t a distributed database either, but one could argue that with its peer-to-peer replication it sits right at the border. On the other hand there’s BigCouch.

Original title and link: Which NoSQL Databases Are Robust to Net-Splits? (NoSQL database©myNoSQL)


Riak at Clipboard: Why Riak and How We Made Riak Search Faster

Gary William Flake:

For me, the two most important considerations are (1) how easy it is to write effective code and (2) how bulletproof the system is operationally. Others may argue that other attributes, like performance or the particulars of the data model, are more important, but I’ll pick simplicity and robustness every time [1]. A simple and robust store can usually be finessed to map to any data model and can be scaled outward to make up for performance.

The rest of the article focuses on the solution Clipboard employed to make Riak Search scale for multi-matching search queries across millions of documents. While the specific details apply only to Clipboard and Riak Search, the idea of precomputing results, or at least modeling data in ways that optimize the most frequent access scenarios, is generally applicable.
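
To illustrate the general idea only, and not Clipboard's actual implementation, here is a small Python sketch of precomputing results at write time. A plain in-memory dictionary stands in for the key-value store, "multi-matching" is assumed to mean queries that must match several terms at once, and the class and method names are invented for this example.

```python
from collections import defaultdict
from itertools import combinations


class PrecomputedIndex:
    """Toy in-memory stand-in for a key-value store holding precomputed query results."""

    def __init__(self, max_terms=2):
        self.max_terms = max_terms       # largest query size we precompute for
        self.results = defaultdict(set)  # query key -> ids of matching documents

    @staticmethod
    def query_key(terms):
        # Sort so that ("riak", "scala") and ("scala", "riak") share one key.
        return "+".join(sorted(set(terms)))

    def add(self, doc_id, tags):
        # Pay the cost at write time: file the document under every
        # combination of up to max_terms of its tags.
        unique = sorted(set(tags))
        for size in range(1, self.max_terms + 1):
            for combo in combinations(unique, size):
                self.results[self.query_key(combo)].add(doc_id)

    def search(self, *terms):
        # A multi-term query becomes a single key lookup.
        return sorted(self.results.get(self.query_key(terms), set()))


index = PrecomputedIndex()
index.add("clip1", ["riak", "search"])
index.add("clip2", ["riak", "scala"])
print(index.search("riak"))            # ['clip1', 'clip2']
print(index.search("search", "riak"))  # ['clip1']
```

The trade-off is more writes and more storage in exchange for reads that never have to intersect large result sets, which is reasonable when reads dominate and the set of common queries is known in advance.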


  1. My emphasis. I find these two principles to be the core of Riak. 

Original title and link: Riak at Clipboard: Why Riak and How We Made Riak Search Faster (NoSQL database©myNoSQL)

via: http://blog.clipboard.com/2012/03/18/0-Milking-Performance-From-Riak-Search


NoSQL Hosting Services

Michael Hausenblas put together a list of hosted NoSQL solutions including Amazon DynamoDB and SimpleDB, Google App Engine, Riak, Cassandra, CouchDB, MongoDB, Neo4j, and OrientDB. If you go through my posts on NoSQL hosting, you’ll find a couple more.

Original title and link: NoSQL Hosting Services (NoSQL database©myNoSQL)

via: http://webofdata.wordpress.com/2012/03/18/hosted-nosql/