riak: All content tagged as riak in NoSQL databases and polyglot persistence
After re-reading HyperDex’s comparison of Cassandra, MongoDB, and Riak backups, I’ve realized there are no links to the corresponding docs. So here they are:
Cassandra backs up data by taking a snapshot of all on-disk data files (SSTable files) stored in the data directory.
You can take a snapshot of all keyspaces, a single keyspace, or a single table while the system is online. Using a parallel ssh tool (such as pssh), you can snapshot an entire cluster. This provides an eventually consistent backup. Although no one node is guaranteed to be consistent with its replica nodes at the time a snapshot is taken, a restored snapshot resumes consistency using Cassandra’s built-in consistency mechanisms.
After a system-wide snapshot is performed, you can enable incremental backups on each node to backup data that has changed since the last snapshot: each time an SSTable is flushed, a hard link is copied into a /backups subdirectory of the data directory (provided JNA is enabled).
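As a rough illustration of the "snapshot an entire cluster" step quoted above, here is a minimal sketch that only builds the per-node commands instead of running them. It assumes `nodetool` is on each node's PATH and that the nodes are reachable over plain `ssh`; the host names, the tag, and the `snapshot_commands` helper are hypothetical, not something from the Cassandra docs.

```python
from typing import List, Optional


def snapshot_commands(
    hosts: List[str], tag: str, keyspace: Optional[str] = None
) -> List[List[str]]:
    """Build one `ssh <host> nodetool snapshot -t <tag> [keyspace]` argv per node."""
    commands = []
    for host in hosts:
        argv = ["ssh", host, "nodetool", "snapshot", "-t", tag]
        if keyspace is not None:
            # Without a keyspace argument, nodetool snapshots all keyspaces.
            argv.append(keyspace)
        commands.append(argv)
    return commands


if __name__ == "__main__":
    # Hypothetical host names; print the commands rather than executing them.
    hosts = ["cass-1.example.com", "cass-2.example.com"]
    for argv in snapshot_commands(hosts, "nightly", "app_data"):
        print(" ".join(argv))
```

Each argv list could be handed to `subprocess.run` or pasted into a parallel ssh tool; keeping the same tag on every node makes it easier to find the matching snapshot directories later.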
Basically, there are three ways to back up MongoDB:
- Using MMS
- Copying underlying files
Riak’s backup operations differ considerably between its two main storage backends, Bitcask and LevelDB:
Choosing your Riak backup strategy will largely depend on the backend configuration of your nodes. In many cases, Riak will conform to your already established backup methodologies. When backing up a node, it is important to backup both the ring and data directories that pertain to your configured backend.
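The advice to back up both the ring and the data directories can be sketched as a small archiving helper. This is only an assumption-laden illustration: the `archive_directories` function is hypothetical, the directory locations depend on your installation, and you should check the backend-specific guidance on whether a node has to be stopped before its files are copied.

```python
import os
import tarfile
from typing import Iterable


def archive_directories(directories: Iterable[str], archive_path: str) -> None:
    """Write every given directory into a single gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as archive:
        for directory in directories:
            # Store each directory under its own base name (e.g. "ring"),
            # so the archive does not depend on the absolute source path.
            name = os.path.basename(os.path.normpath(directory))
            archive.add(directory, arcname=name)
```

A call might look like `archive_directories(["/var/lib/riak/ring", "/var/lib/riak/bitcask"], "riak-node1.tar.gz")`, where both paths are assumed locations for the ring and a Bitcask data directory rather than values taken from the quoted docs.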
Note: I’d be happy to update this entry with links to docs on what tools and solutions other NoSQL databases (HBase, Redis, Neo4j, CouchDB, Couchbase, RethinkDB) are providing.
✚ A backup is only as useful as your ability to restore from it, so I’m wondering why there are no tools that can validate a backup without forcing a complete restore. Validation and a full restore are not equivalent, but for large databases such a tool could simplify the process a bit and increase users’ confidence.
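One very limited form of validation without a restore is a checksum manifest: record a hash of every file when the backup is taken, then re-check the copy later. The sketch below assumes the backup is a plain directory of files; the function names are hypothetical. It only detects missing or corrupted files and says nothing about whether the database could actually load them.

```python
import hashlib
import os
from typing import Dict, List


def file_sha256(path: str) -> str:
    """Hash a file in 1 MiB chunks so large data files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: str) -> Dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_sha256(full)
    return manifest


def verify_manifest(root: str, manifest: Dict[str, str]) -> List[str]:
    """Return the relative paths that are missing or whose digest changed."""
    problems = []
    for relative, expected in sorted(manifest.items()):
        full = os.path.join(root, relative)
        if not os.path.isfile(full) or file_sha256(full) != expected:
            problems.append(relative)
    return problems
```

The manifest would be stored next to (or apart from) the backup, and an empty list from `verify_manifest` means every recorded file is still present and byte-identical.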
Original title and link: Quick links for how to backup different NoSQL databases ( ©myNoSQL)
Basho, makers of Riak, recently published an article about the most common patterns to avoid when developing with Riak. Unsurprisingly, most of these rules must be applied to the majority of NoSQL databases.
Writing an application that can take full advantage of Riak’s robust scaling properties requires a different way of looking at data storage and retrieval. Developers who bring a relational mindset to Riak may create applications that work well with a small data set but start to show strain in production, particularly as the cluster grows.
What I’ve learned after experimenting and building apps with different NoSQL databases can be summarized in a few short, generic rules:
1. If you have the “disadvantage” of being experienced with relational databases and are working on an app that will use a NoSQL database, forget everything you know about the relational world. Take out that part of your brain and put it in a jar. Use the other side of your brain. Avoid any temptation to make comparisons or to ask yourself “how would I do this in a relational database?”. You’ll fail.
2. When using relational databases, we most often start with the data model. “What’s the best way to organize and store our data?” is one of the first questions we address. Only afterwards do we figure out, in the application, how to retrieve data in the format the app needs.
3. When using a NoSQL database, focus on your application. “How do I use data in my application?” must be the driving question. Then your NoSQL database API will tell you exactly how to store the data.
   This might make it sound too simple. Indeed, it’s not that simple. Some of the complexity you’ll face comes from figuring out how to keep multiple copies of the data to fit the different ways you need to access it, updating and deleting those copies, dealing with the consistency requirements of your app, and deciding which availability versus consistency trade-offs your app is OK with.
4. Take the time to learn the most common usage patterns and anti-patterns for the NoSQL database you have picked. If you cannot find the ones that fit your application, talk to the community and build a prototype. Do not ignore point 3 above at any stage.
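To make the "multiple copies of the data" part of the rules above concrete, here is a minimal sketch of access-pattern-driven writes. The `KeyValueStore` class is a hypothetical in-memory stand-in, not a real database client, and the key names are made up: one key serves "fetch a post by id", another serves "list posts by author", and both have to be maintained on every write and delete.

```python
from typing import Any, Dict


class KeyValueStore:
    """Hypothetical in-memory stand-in for a key-value database."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        self._data[key] = value

    def get(self, key: str, default: Any = None) -> Any:
        return self._data.get(key, default)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


def save_post(store: KeyValueStore, post_id: str, author: str, title: str) -> None:
    """Write the post itself plus the per-author list that the app reads."""
    store.put(f"post:{post_id}", {"author": author, "title": title})
    index_key = f"posts_by_author:{author}"
    post_ids = list(store.get(index_key, []))
    if post_id not in post_ids:
        post_ids.append(post_id)
    store.put(index_key, post_ids)


def delete_post(store: KeyValueStore, post_id: str) -> None:
    """Remove the post and its entry in the per-author list."""
    post = store.get(f"post:{post_id}")
    if post is None:
        return
    index_key = f"posts_by_author:{post['author']}"
    remaining = [p for p in store.get(index_key, []) if p != post_id]
    store.put(index_key, remaining)
    store.delete(f"post:{post_id}")
```

In a real distributed store the two writes are not atomic, which is exactly where the consistency and availability trade-offs mentioned above come in.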
Now go over the list of the anti-patterns when developing with Riak.
Original title and link: Anti-patterns for developing with NoSQL databases ( ©myNoSQL)
A 3-part article, a bit too high-level for me, about what is to be gained (and lost) when using Riak instead of a relational database:
What I always like about Basho’s posts is that they don’t shy away from covering the tradeoffs.
Original title and link: Relational to Riak ( ©myNoSQL)
The proposal for Riak’s security, discussed in the open:
Thus, I propose we add authentication/authorization/TLS and auditing to Riak, to make Riak more resilient to unauthorized access. In general, I took the design cues from PostgreSQL. Another goal was to make this applicable to riak_core, so any reliance on KV primitives or features are intentionally avoided.
Andrew Thompson, the author of the proposal, mentions PostgreSQL as a source of inspiration. Besides the usual topics of authentication, authorization, and auditing, the document has an Open questions section. If you care about Riak’s future security, go and help out.
Original title and link: The future of Riak’s Security ( ©myNoSQL)