


XA: All content tagged as XA in NoSQL databases and polyglot persistence

Scalable Atomic Visibility with RAMP Transactions

We’ve developed three new algorithms—called Read Atomic Multi-Partition (RAMP) Transactions—for ensuring atomic visibility in partitioned (sharded) databases: either all of a transaction’s updates are observed, or none are.

Still digesting Peter Bailis’s post and the accompanying Scalable Atomic Visibility with RAMP Transactions paper.

Original title and link: Scalable Atomic Visibility with RAMP Transactions (NoSQL database©myNoSQL)


Non-blocking transactional atomicity

Peter Bailis:

tl;dr: You can perform non-blocking multi-object atomic reads and writes across arbitrary data partitions via some simple multi-versioning and by storing metadata regarding related items.
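To make the tl;dr more concrete, here is a minimal single-process sketch of the general idea: every write carries a transaction timestamp plus the set of sibling keys written with it, and a reader uses that metadata to fetch any sibling version it missed. It is only an illustration under assumptions, not the pseudocode from the post or the paper. The names `Version`, `Partition`, `write_all` and `read_all` are hypothetical, and the "partitions" are plain in-memory objects, one per key.

```python
from dataclasses import dataclass, field


@dataclass
class Version:
    value: object
    ts: int              # transaction timestamp shared by all sibling writes
    siblings: frozenset  # keys written by the same transaction (the metadata)


@dataclass
class Partition:
    versions: dict = field(default_factory=dict)  # (key, ts) -> Version
    latest: dict = field(default_factory=dict)    # key -> newest committed ts

    def prepare(self, key, version):
        # Store the version without exposing it as the latest one yet.
        self.versions[(key, version.ts)] = version

    def commit(self, key, ts):
        # Expose the prepared version as the newest visible one.
        self.latest[key] = max(self.latest.get(key, 0), ts)

    def get_latest(self, key):
        ts = self.latest.get(key)
        return self.versions.get((key, ts)) if ts is not None else None

    def get_version(self, key, ts):
        return self.versions.get((key, ts))


def write_all(partitions, ts, updates):
    """Prepare every write first, then commit each one; no locks are held."""
    siblings = frozenset(updates)
    for key, value in updates.items():
        partitions[key].prepare(key, Version(value, ts, siblings))
    for key in updates:
        partitions[key].commit(key, ts)


def read_all(partitions, keys):
    """Two-round read that repairs a partially visible transaction."""
    # Round 1: take whatever each partition currently exposes.
    result = {k: partitions[k].get_latest(k) for k in keys}
    # Use sibling metadata to find the newest timestamp each key should have.
    needed = {}
    for version in result.values():
        if version is None:
            continue
        for sibling in version.siblings:
            if sibling in result:
                needed[sibling] = max(needed.get(sibling, 0), version.ts)
    # Round 2: fetch by timestamp any version that round 1 missed.
    for key, ts in needed.items():
        current = result[key]
        if current is None or current.ts < ts:
            result[key] = partitions[key].get_version(key, ts)
    return {k: (v.value if v else None) for k, v in result.items()}
```

Because every sibling is prepared before any of them is committed, a reader that sees one committed write can always find the others by timestamp, which is why the second round never has to wait.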

I haven’t had the time to go through all the details of the algorithm proposed by Peter Bailis, or the various distributed-system scenarios in which it would have to work, so my head kept cycling between two questions:

  1. could this actually be expanded to a read/write scenario? at what costs?
  2. isn’t this a form of a (weaker) XA implementation?

Luckily, Peter Bailis already answers some of these questions in his post[1]:

If you’re a distributed systems or database weenie like me, you may be curious how NBTA related to well-known problems like two-phase commit.

In case you are familiar with XA, you could start reading the post with the “So what just happened?” section and then dive into the details of the algorithm and possible extensions.

  1. Thanks, Peter, for stopping my head from spinning!

Original title and link: Non-blocking transactional atomicity (NoSQL database©myNoSQL)


Multi-Document Transactions in RavenDB vs Other NoSQL Databases

“We tried using NoSQL, but we are moving to Relational Databases because they are easier…”

This is how Oren Eini starts his post about RavenDB’s support for multi-document transactions and MongoDB’s lack of it:

  1. For a single server, we support atomic multi document writes natively. (note that this isn’t the case for Mongo even for a single server).
  2. For multiple servers, we strongly recommend that your sharding strategy will localize documents, meaning that the actual update is only happening on a single server.
  3. For multi server, multi document atomic updates, we rely on distributed transactions.
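The second point, localizing related documents on a single server, can be illustrated with a small routing sketch. This is not RavenDB’s sharding API. It assumes a hypothetical document id convention in which the first path segment (for example `customer-42`) names the group of documents that must be updated together, and the function `shard_for` is made up for the example.

```python
import hashlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Pick a shard from the first segment of the id (assumed convention)."""
    # "customer-42/orders/7" and "customer-42/profile" share the routing key
    # "customer-42", so both documents land on the same server.
    routing_key = doc_id.split("/", 1)[0]
    digest = hashlib.sha256(routing_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Documents of one customer map to one shard, so a multi-document update
# for that customer never needs a distributed transaction.
same_shard = shard_for("customer-42/profile", 4) == shard_for("customer-42/orders/7", 4)
```

With routing like this, only updates that span different routing keys fall into the third case and need distributed transactions.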

In the NoSQL space, there are a couple of other solutions that support transactions:

If you look at these from the perspective of distributed systems, the only distributed ones that support transactions are Megastore and RavenDB. There’s also VoltDB which is all transactions. Are there any I’ve left out?

Original title and link: Multi-Document Transactions in RavenDB vs Other NoSQL Databases (NoSQL database©myNoSQL)

Distributed Database Systems

After an intro about large-scale classical RDBMS setups, ☞ Will Fitch’s post started well:

What advantages and disadvantages will come with this new architecture [distributed database systems]? What hardware can I reuse efficiently with this new setup? What vendor do I choose to go with? What kind of code changes and culture shock will this introduce to the developers and DBAs?

But then it slowly turned into a lesson in how not to make a technical decision. There are two parts that make me think the decision had probably already been made:

While there are a few distributed solutions out there: Hadoop, Cassandra, Hypertable, Amazon SimpleDB, etc., one stands out in my opinion – VoltDB.

You cannot say you’re making an informed decision when you lump together a data processing framework, NoSQL databases, and a DaaS offering, while leaving out products like HBase, Riak, or Membase.

And it was this part that made me think VoltDB’s pre-sales team had already done its job:

We’re used to writing code that connects to a database and executes a stored procedure that lives in the database and is written in SQL. Introducing this new architecture would completely change our environment. Stored procedures would likely be written in Java or another JIT language. The CRUD functionality would then execute that instead.

There’s nothing fundamentally wrong with having preferences, but technical decisions should be based on a good understanding of the evaluated products and a lot of experimentation and prototyping. It shouldn’t be the other way around.

Original title and link: Distributed Database Systems (NoSQL databases © myNoSQL)


Distributed Transactions Matter: Google’s Jeff Dean

Via ☞ Yang: Jeff Dean in the ☞ Stanford EE380 lecture (windows video)[1]:

In retrospect I think that [not supporting distributed transactions] was a mistake. We probably should have added that in because what ended up happening is a lot of people did want distributed transactions, and so they hand-rolled their own protocols, sometimes incorrectly, and it would have been better to build it into the infrastructure. So in Spanner we do have distributed transactions. We don’t have a lot of experience with it yet.

  1. I hate that this video is Windows-only. Any ideas how to watch it on a Mac?

Original title and link: Distributed Transactions Matter: Google’s Jeff Dean (NoSQL databases © myNoSQL)

Neo4j Transactions and JTA

I’ve already told you about ☞ Chris Gioran’s series on Neo4j internals. He is now working on support for pluggable JTA-compliant transaction managers in Neo4j, and details about the current status can be found in his ☞ last post. Before that, he did a deep dive into Neo4j transactions, which resulted in 4 (quite long) articles:

  • ☞ Write Ahead Log and Deadlock Detection

    In this post I will write a bit about two different components that can be explained somewhat in isolation and upon which higher level components are build. The first is the Write Ahead Log (WAL) and the other is an implementation of a Wait-For graph that is used to detect deadlocks in Neo before they happen.

  • ☞ XaResources, Transactions and TransactionManagers

    This time we will look into a higher level than last time, discussing the Transaction class and its implementations, Commands and TransactionManagers, touching a bit first on the subject of XAResources.

  • ☞ Xa roundup and consistency

    This post covers Data sources and XA connections, management of XaResources, and putting all these together.

  • ☞ A complete run and a conclusion

    Here I will try to follow a path from the initialization of the db engine and through the begin() of a transaction and creation of a Node to the commit and shutdown.
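The first article in the list above mentions a wait-for graph used to detect deadlocks before they happen. A generic way to do that, sketched here without any claim that it matches Neo4j’s implementation, is to check for a cycle before a transaction is allowed to wait: adding the edge “waiter waits for holder” closes a cycle exactly when the waiter is already reachable from the holder. The function name `would_deadlock` and the dict-of-sets representation are assumptions made for the example.

```python
def would_deadlock(wait_for, waiter, holder):
    """Return True if adding the edge waiter -> holder would close a cycle.

    `wait_for` maps each transaction to the set of transactions it is
    currently waiting on (hypothetical representation).
    """
    stack, seen = [holder], set()
    while stack:
        tx = stack.pop()
        if tx == waiter:
            # The holder already (transitively) waits on the waiter.
            return True
        if tx in seen:
            continue
        seen.add(tx)
        stack.extend(wait_for.get(tx, ()))
    return False


# T1 waits on T2, and T2 waits on T3.
graph = {"T1": {"T2"}, "T2": {"T3"}}
closes_cycle = would_deadlock(graph, "T3", "T1")  # T3 -> T1 would loop back
harmless = would_deadlock(graph, "T4", "T1")      # T4 is not in any chain
```

A lock manager built this way would reject or abort the waiting transaction when the check returns True, instead of letting both sides block forever.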

As I predicted in my first mention of this series on Neo4j internals, Chris ended up giving up writing and started hacking on Neo4j:

Truth been told, I have reached a point where I no longer want to write about Neo but instead I want to start hacking it

Original title and link: Neo4j Transactions and JTA (NoSQL databases © myNoSQL)