VoltDB: All content tagged as VoltDB in NoSQL databases and polyglot persistence
Friday, 19 November 2010
Integrating VoltDB and Hadoop
A paper on integrating VoltDB and Hadoop. From what I read, for now it works in a single direction only (exporting data from VoltDB to Hadoop):
It is possible to design and develop a complete business solution utilizing both VoltDB and Hadoop from scratch. But you do not need to. VoltDB simplifies the process by providing an export facility that lets you automatically archive selected data from the VoltDB database. And you can use this export functionality with Hadoop.
See the paper below:
Monday, 8 November 2010
VoltDB Release: Version 1.2 Featuring Data Availability Enhancements
VoltDB 1.2 released earlier this month:
New data availability features. Version 1.2 introduces two important data availability enhancements. The first is network partition tolerance, which allows VoltDB to automatically detect, isolate and manage network failures. This is a critical feature for distributed database infrastructures including those deployed into public clouds such as Amazon’s EC2. The second availability feature, node rejoin, allows VoltDB database nodes that have been taken offline (e.g., for maintenance or repair) to “rejoin” the cluster while the database is live. Node rejoin dynamically resynchronizes all node data.
I’d love to read more about the mechanisms used for automatically detecting, isolating and managing network failures. If I remember correctly, the topic of reliably determining partitions in a distributed system is a central part of Seth Gilbert and Nancy Lynch’s paper on the CAP theorem. It would also be interesting to understand how VoltDB deals with its strong consistency promise in these situations.
And some management tools (nb: from the announcement text I cannot tell if they are available only in the Enterprise version):
New consoles for provisioning, management and monitoring. New in the Enterprise Edition of version 1.2, the VoltDB Enterprise Manager (VEM) provides database and systems administrators with browser-based tools for managing production VoltDB databases. VEM offers a flexible suite of consoles for performing many common administrative and diagnostic activities.
Original title and link: VoltDB Release: Version 1.2 Featuring Data Availability Enhancements (NoSQL databases © myNoSQL)
via: https://voltdb.com/content/voltdb-releases-version-12-high-performance-oltp-database
Tuesday, 2 November 2010
Using MySQL as NoSQL: A Story for exceeding 750k qps
How many times do you need to run PK lookups per second? […] These are “SQL” overhead. It’s obvious that performance drops were caused by mostly SQL layer, not by “InnoDB(storage)” layer. MySQL has to do a lot of things like below while memcached/NoSQL do not need to do.
- Parsing SQL statements
- Opening, locking tables
- Making SQL execution plans
- Unlocking, closing tables
MySQL also has to do lots of concurrency controls.
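To make the layering argument concrete, here is a minimal sketch in Python. It is not the setup from the quoted article: SQLite (from the standard library) stands in for the SQL path and a plain dict stands in for a key-value path, and the table and function names are made up for illustration. Both functions return the same value for a primary-key lookup, but the first one has to pass through a SQL layer before any data is touched.

```python
import sqlite3

# Illustrative data only. SQLite stands in for "a database reached through
# SQL" and a dict for "a key-value store"; neither is MySQL/InnoDB or memcached.
rows = {i: f"user-{i}" for i in range(1000)}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", rows.items())


def lookup_via_sql(user_id):
    # The request is expressed as a SQL statement, so it passes through the
    # SQL layer (statement handling, plan execution, cursor management)
    # before the row is read from storage.
    row = conn.execute(
        "SELECT name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return row[0] if row else None


def lookup_via_key(user_id):
    # A key-value style lookup goes straight to the data structure.
    return rows.get(user_id)
```

Timing the two functions in a loop (for example with `timeit`) gives a rough feel for the per-request cost of the extra layer, although the absolute numbers say nothing about MySQL itself.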
The story has been out for a couple of weeks already, so I’ll not get into the details. But I felt like adding a couple of comments to the subject:
- existing RDBMS storage engines are, most of the time, very well thought out and tested over a long time
- some NoSQL databases have realized that and allow plugging such storage engines into their systems:
  - Project Voldemort supports Berkeley DB (and MySQL, but I am not sure it goes around the SQL engine)
  - Riak comes with Innostore, an InnoDB-based storage backend
- many of the findings in this article sound very close to the rationale behind VoltDB, including the pre-compiled, cluster-deployed stored procedures
via: http://yoshinorimatsunobu.blogspot.com/2010/10/using-mysql-as-nosql-story-for.html
Friday, 8 October 2010
VoltDB: An SQL Developer’s Perspective
Two hours of VoltDB. Planning to watch it over the weekend:
Friday, 25 June 2010
NoSQL benchmarks and performance evaluations
Some say it is the right time to start having these around. Others say it’s way too early to start the “battle”. Users do want to see them, and when they are lacking they create their own, most of the time using incomplete or wrong approaches.
But what am I talking about? As some of you might have guessed already:
NoSQL benchmarks and performance evaluations!
With their recent release of Riak 0.11.0, Basho guys have also published their internal ☞ benchmarking code. Similar internal benchmark code is ☞ available for MongoDB.
But users are more interested in seeing cross-product benchmarks, even if most of the time constructing these is extremely complicated and they end up comparing apples with oranges.
All this being said, and accepting that most of the time someone will figure out a way to invalidate the results, let’s see what cross-product benchmarks we have in the NoSQL space.
Yahoo! Cloud Serving Benchmark
The Yahoo! Cloud Serving Benchmark’s goal is to facilitate performance comparisons of the new generation of cloud data serving systems. The source code is available on ☞ GitHub and Yahoo! has also published ☞ the results of running this benchmark against Cassandra, HBase, Yahoo!’s PNUTS, and a simple sharded MySQL implementation.
VoltDB Benchmark
VoltDB, a new storage solution that calls itself the next-generation SQL RDBMS with ACID for fast-scaling OLTP applications, has recently ☞ published the results of its benchmark comparing VoltDB and Cassandra.
It is worth noting that, while this is one of those apples-to-oranges comparisons (nb: the authors are well aware of it), there are still a couple of interesting and useful things to be learned from it (e.g. the benchmarking procedure and the tested scenarios).
Unfortunately at this time the source code is not yet available, but hopefully we will see it soon:
Going forward, we’re planning to release the code we used to do these benchmarks. We’d also like to try a few other storage layers
Hypertable and HBase Performance Evaluation
The guys behind Hypertable ☞ have published the results of comparing Hypertable with HBase using a benchmark based on the Google BigTable paper[1], from which both HBase and Hypertable inherit their architecture. Unfortunately, the benchmark code is not available at this moment.
Update: thanks to Stu Hood, I now know that the code for this benchmark ships with the Hypertable distribution, available ☞ here (tar.gz), and that the configuration files are also available ☞ here (tar.gz).
So, as far as I could gather we have:
- ☞ Riak internal benchmark
- ☞ MongoDB internal benchmark
- ☞ Yahoo! Cloud Serving Benchmark
- VoltDB Benchmark comparing VoltDB and Cassandra (results only)
- BigTable-inspired benchmark comparing Hypertable and HBase
Did I miss any?
Thursday, 24 June 2010
VoltDB Don’ts Validating NoSQL Assumptions
Interesting to note that some VoltDB don’ts from the paper ☞ Do’s and Don’ts (pdf) are validating some major assumptions in the NoSQL space:
- Don’t create tables with very large rows (that is, lots of columns or large VARCHAR columns). Several smaller tables with a common partitioning key are better.
Basically, both wide-column stores (e.g. Cassandra, HBase, Hypertable) with their column families and document databases (e.g. CouchDB, MongoDB, RavenDB, Terrastore) with their schema-less approach address this issue.
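The "several smaller tables with a common partitioning key" advice can be sketched as follows. This is only an illustration using Python's built-in SQLite driver, not VoltDB DDL: the table names, columns and the choice of `customer_id` as the shared key are all hypothetical, and SQLite has no notion of partitioning.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: instead of one wide "customer" table, two narrower
# tables share the same key (customer_id), which would play the role of the
# common partitioning key in a partitioned database.
conn.executescript(
    """
    CREATE TABLE customer_core (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT,
        email       TEXT
    );
    CREATE TABLE customer_notes (
        customer_id INTEGER PRIMARY KEY,
        notes       TEXT
    );
    """
)
conn.execute("INSERT INTO customer_core VALUES (1, 'Ada', 'ada@example.com')")
conn.execute("INSERT INTO customer_notes VALUES (1, 'long free-form text')")


def core_for(customer_id):
    # Frequent lookups read only the narrow table.
    return conn.execute(
        "SELECT name, email FROM customer_core WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()


def full_for(customer_id):
    # The occasional full view joins the two tables on the shared key.
    return conn.execute(
        "SELECT c.name, c.email, n.notes "
        "FROM customer_core c "
        "JOIN customer_notes n ON c.customer_id = n.customer_id "
        "WHERE c.customer_id = ?",
        (customer_id,),
    ).fetchone()
```

The point of the split is that the common access path (`core_for`) never has to carry the large column around, while rows for the same key can still be brought together when needed.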
- Don’t use ad hoc SQL queries as part of a production application.
Firstly, this points to the mindset change required by the NoSQL space when doing data modeling: think about data access patterns.
Secondly, it pretty much validates the CouchDB and RavenDB approach of defining queries up front, which makes their reads extremely fast.
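A rough sketch of the "queries defined up front" idea, again in Python with SQLite. The `QueryCatalog` class, the `votes` table and the query name are hypothetical and are not the API of VoltDB, CouchDB or RavenDB; the sketch only shows the discipline of registering every query before the application runs and refusing anything that was not registered.

```python
import sqlite3


class QueryCatalog:
    """Hypothetical helper: only queries registered up front may run."""

    def __init__(self, conn):
        self._conn = conn
        self._queries = {}

    def register(self, name, sql):
        # Done once, at deployment time, for every access pattern the
        # application needs.
        self._queries[name] = sql

    def call(self, name, *params):
        # Ad hoc SQL has no way in: callers can only name a known query.
        if name not in self._queries:
            raise KeyError(f"unknown query: {name}")
        return self._conn.execute(self._queries[name], params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE votes (contestant INTEGER, phone TEXT)")
conn.executemany(
    "INSERT INTO votes VALUES (?, ?)",
    [(1, "555-0100"), (1, "555-0101"), (2, "555-0102")],
)

catalog = QueryCatalog(conn)
catalog.register("votes_for", "SELECT COUNT(*) FROM votes WHERE contestant = ?")
```

With this in place, `catalog.call("votes_for", 1)` runs the registered statement, while a call with an unregistered name raises `KeyError`. Knowing the full set of queries in advance is what lets a system plan, index or precompute for them.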