Distributed Database Systems

After an intro about large-scale classical RDBMS setups, Will Fitch’s post started well:

What advantages and disadvantages will come with this new architecture [distributed database systems]? What hardware can I reuse efficiently with this new setup? What vendor do I choose to go with? What kind of code changes and culture shock will this introduce to the developers and DBAs?

But then it slowly turned into a lesson in how not to make a technical decision. Two parts make me think the decision had probably already been made:

While there are a few distributed solutions out there: Hadoop, Cassandra, Hypertable, Amazon SimpleDB, etc., one stands out in my opinion – VoltDB.

You cannot claim to be making an informed decision when you lump together a data processing framework (Hadoop), NoSQL databases (Cassandra, Hypertable), and a database-as-a-service offering (Amazon SimpleDB), while leaving out products like HBase, Riak, or Membase.

And then there is this part, which makes me think the VoltDB pre-sales team had already done its job:

We’re used to writing code that connects to a database and executes a stored procedure that lives in the database and is written in SQL. Introducing this new architecture would completely change our environment. Stored procedures would likely be written in Java or another JIT language. The CRUD functionality would then execute that instead.

There’s nothing fundamentally wrong with having preferences, but technical decisions should be based on a good understanding of the evaluated products and on a lot of experimentation and prototyping. It shouldn’t be the other way around, with the preference coming first and the evaluation fitted to it afterwards.

Original title and link: Distributed Database Systems (NoSQL databases © myNoSQL)