voltdb: All content tagged as voltdb in NoSQL databases and polyglot persistence
Tuesday, 22 January 2013
VoltDB Hits Proverbial Version 3.0
Andrew Brust for ZDNet:
In the software world, many people believe that products reach true maturity and value in their third version. A first release is about bringing an idea to market, and second releases act to stabilize the first. But it’s “v3” that really incorporates refinements that reflect user feedback and market lessons learned. It seems to me that “NewSQL” database player VoltDB is following that pattern with its own 3.0 release.
There is no such thing as a proverbial version 3.0, nor is there a pattern about the meaning of version 3. Anyway, congrats to the VoltDB guys!
If you want to know what’s in VoltDB 3.0, ignore the linked post and go read Introducing VoltDB 3.0 on the VoltDB blog. Short version: more performance, improved SQL support, support for JSON-encoded data, and indexes on JSON columns.
Original title and link: VoltDB Hits Proverbial Version 3.0 (©myNoSQL)
via: http://www.zdnet.com/voltdb-hits-proverbial-version-3-0-7000010099/
Monday, 1 October 2012
VoltDB 3.0 Will Include a New Transaction Coordination Architecture
From the VoltDB 3.0 preview notes:
VoltDB v3.0 includes a new transaction coordination architecture that reduces latency and improves transaction throughput.
In VoltDB versions 1.x and 2.x, transactions were globally ordered and ordered according to time. Each node on the cluster communicated with every other node to agree upon transaction ordering. Maintaining a global agreement based on time caused VoltDB to incur additional latency - intra-node communication agreement involving all nodes. This architecture required users to synchronize cluster node clocks using NTP, because any time drift between nodes would introduce artificial, unneeded latency to the system. This overhead, and the strict need to maintain clock synchronization has been eliminated from the VoltDB v3.0 code base.
So how does the new solution work?
Friday, 30 March 2012
Integrating VoltDB With the Spring Framework
There are two Java clients for VoltDB. One is a standard JDBC driver that executes all queries synchronously. The other is a specialized client library that can run queries either synchronously or asynchronously, along with a number of other features. Synchronous queries perform well enough but their throughput is no match for asynchronous queries. Asynchronous query throughput is approximately four times greater than synchronous queries in a two node VoltDB cluster. For example, an application using asynchronous queries can run over 200K TPS (transactions per second) in a two node server cluster using a single client running on a Macbook Pro; a synchronous client running the same queries will achieve around 56K TPS.
Could anyone explain what leads to such a difference in performance?
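One plausible explanation is latency: a synchronous caller waits a full round trip before issuing its next call, while an asynchronous client keeps many calls in flight at once. The back-of-the-envelope model below applies Little's law; the round-trip time, server capacity, and in-flight counts are hypothetical values chosen for illustration, not measurements from the quoted post.

```python
def client_throughput(in_flight, round_trip_us, server_capacity_tps):
    """Estimate transactions per second for one client.

    Little's law: throughput = calls in flight / round-trip time,
    capped by what the server cluster can absorb.
    """
    latency_bound_tps = in_flight * 1_000_000 // round_trip_us
    return min(latency_bound_tps, server_capacity_tps)


# All numbers below are assumptions made for the sake of the example.
ROUND_TRIP_US = 500            # 0.5 ms per procedure call
SERVER_CAPACITY_TPS = 200_000  # what the cluster could process

# Synchronous client: 10 threads, each blocked on exactly one call.
sync_tps = client_throughput(10, ROUND_TRIP_US, SERVER_CAPACITY_TPS)

# Asynchronous client: 1,000 calls outstanding before any response arrives.
async_tps = client_throughput(1_000, ROUND_TRIP_US, SERVER_CAPACITY_TPS)

print(sync_tps, async_tps)  # 20000 200000
```

In this model the synchronous client is limited by the network round trip, while the asynchronous client is limited only by the server, which is consistent with the direction of the gap reported above.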
via: http://voltdb.com/company/blog/integrating-voltdb-spring-framework
Tuesday, 27 March 2012
NoSQL Databases Adoption in Numbers
The data comes from downloads of Jaspersoft’s NoSQL connectors. RedMonk published a graphic and an analysis, and Klint Finley followed up with job trends:

A couple of things I don’t see mentioned in the RedMonk post:
- if and how the data has been normalized based on each connector’s availability. According to the post, the data was collected between Jan. 2011 and Mar. 2012, and I think that not all connectors were available from the beginning of that period.
- if and how the marketing push for each connector has been weighed in. Announcing the Hadoop connector at an event with 2000 attendees or the MongoDB connector at an event with 800 attendees could definitely influence the results (nb: keep in mind that the largest number is less than 7000, thus 200-500 downloads triggered by such an event have a significant impact).
- Redis and VoltDB are mostly OLTP-only databases
Thursday, 22 March 2012
VoltDB Assumptions: Memory vs Disk
These are the assumptions under which VoltDB was architected:
First, it should be noted that main memory is getting very cheap. It is straightforward to put 50 Gbytes of memory on a $5,000 server. Beefy servers these days have 10 times that amount. Moreover, many (but not all) transactional databases don’t require massive storage volumes. An OLTP application with more than a few Tbytes of data is quite rare. The same can be said for new OLTP “fire hose” applications that require ultra-high write throughput and ACID transactions (e.g., digital advertising, wireless, real-time monitoring) — these systems rarely need to manage more than a few Tbytes of hot state. Hence, it is plausible to buy enough main memory to store the data for the vast majority of OLTP applications.
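The sizing argument can be checked with a little arithmetic. The sketch below uses the quoted figure of 50 GB of memory per $5,000 server; the 2 TB dataset and the number of copies kept are hypothetical inputs, and the model ignores indexes and runtime overhead.

```python
import math


def servers_needed(data_gb, copies, ram_per_server_gb=50):
    """Servers required to hold `copies` full copies of the data in RAM."""
    return math.ceil(data_gb * copies / ram_per_server_gb)


def hardware_cost(servers, cost_per_server=5_000):
    """Total server cost in dollars at the quoted per-server price."""
    return servers * cost_per_server


# Hypothetical workload: 2 TB (2,000 GB) of hot OLTP state.
single_copy = servers_needed(2_000, copies=1)
two_copies = servers_needed(2_000, copies=2)

print(single_copy, hardware_cost(single_copy))  # 40 200000
print(two_copies, hardware_cost(two_copies))    # 80 400000
```

Under these assumptions, even a dataset at the upper end of "a few Tbytes" fits in the memory of a few dozen commodity servers.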
Thursday, 5 January 2012
VoltDB for Real-Time Network Monitoring
From the announcement of VoltDB being used by the Japanese ISP, Sakura Internet, for their real-time Internet traffic monitoring and analysis platform for detecting and mitigating large-scale distributed denial of service (DDoS) attacks:
Tamihiro Yuzawa: Our system needs to be capable of sifting through massive amounts of traffic flow data in real-time. VoltDB was our choice from the beginning because it’s a super-fast datastore that supports SQL.
Scott Jarr: Sakura’s security infrastructure requires a datastore that can scale massively and on demand, without sacrificing data accuracy.
Mark these VoltDB keywords:
- fast (read: in-memory)
- data consistency
- SQL
Friday, 8 July 2011
Comments on Urban Myths About NoSQL
Dan Weinreb comments on Michael Stonebraker’s Urban Myths about NoSQL (PDF):
Dr. Michael Stonebraker recently posted a presentation entitled “Urban Myths about NoSQL”. Its primary point is to defend SQL, i.e. relational, database systems against the claims of the new “NoSQL” data stores. Dr. Stonebraker is one of the original inventors of relational database technology, and has been one of the most eminent database researchers and practitioners for decades.
In fact, Michael Stonebraker bashes everything that is not his current product—this GigaOm interview is the latest example.
For now, I’m filing this away until VoltDB is sold.
Monday, 20 June 2011
Multi-Document Transactions in RavenDB vs Other NoSQL Databases
“We tried using NoSQL, but we are moving to Relational Databases because they are easier…”
This is how Oren Eini starts his post about RavenDB’s support for multi-document transactions and the lack of it in MongoDB:
- For a single server, we support atomic multi document writes natively. (note that this isn’t the case for Mongo even for a single server).
- For multiple servers, we strongly recommend that your sharding strategy will localize documents, meaning that the actual update is only happening on a single server.
- For multi server, multi document atomic updates, we rely on distributed transactions.
In the NoSQL space, there are a couple of other solutions that support transactions:
- Google Megastore
- Redis has two mechanisms that come close to transactions: MULTI/EXEC/DISCARD and pipelining (the latter is exemplified in this Redis-based triplestore database implementation)
- many of the graph databases (Neo4j, HyperGraphDB, InfoGrid)
If you look at these from the perspective of distributed systems, the only distributed ones that support transactions are Megastore and RavenDB. There’s also VoltDB, where everything is a transaction. Are there any I’ve left out?
Tuesday, 17 May 2011
Short Notes about VoltDB
Here are the notes I took while watching a webinar about building applications with VoltDB.
What I like:
- it forces you to think upfront about data partitioning by specifying partitioned or replicated tables
- it forces you to think about data access patterns by asking you to define Java-based stored procedures
- it provides both a synchronous and asynchronous API
- there’s an option to run any queries in development mode
What I don’t like:
- you need to compile and deploy the schema, queries, etc.
- you have to define the cluster topology in an XML file
- everything is transactional. Let’s say you have k-factor 2 and a materialized view: an insert will put your data on the 3 servers and in the materialized view within a single transaction.
- it’s not clear how you could evolve your schema
- the API doesn’t use timeouts
Tuesday, 22 March 2011
How Scalable is VoltDB?
The Percona guys[1] have run the numbers on VoltDB scalability and concluded:
VoltDB is very scalable; it should scale to 120 partitions, 39 servers, and 1.6 million complex transactions per second at over 300 CPU cores
Considering the definition: “A system whose performance improves after adding hardware, proportionally to the capacity added, is said to be a scalable system.”, the conclusion should be slightly updated:
VoltDB can scale up to 120 partitions on 39 servers with 300 CPU cores and 1.6 million TPS.
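To make the definition concrete, here is a minimal sketch. The per-server and per-partition averages follow directly from the quoted figures; the `scaling_efficiency` helper and the measurements passed to it are hypothetical, added only to show how "proportional to the capacity added" could be checked.

```python
# Figures quoted above.
TOTAL_TPS = 1_600_000
SERVERS = 39
PARTITIONS = 120

tps_per_server = TOTAL_TPS / SERVERS        # roughly 41,000 TPS per server
tps_per_partition = TOTAL_TPS / PARTITIONS  # roughly 13,300 TPS per partition


def scaling_efficiency(base_servers, base_tps, servers, tps):
    """Ratio of measured throughput to perfectly proportional throughput.

    1.0 means throughput grew exactly in proportion to the added hardware;
    lower values mean each added server contributed less than the first ones.
    """
    ideal_tps = base_tps * servers / base_servers
    return tps / ideal_tps


# Hypothetical measurements, for illustration only.
print(scaling_efficiency(1, 100, 2, 200))  # 1.0  (perfectly linear)
print(scaling_efficiency(1, 100, 4, 300))  # 0.75 (diminishing returns)
```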
Bottom line:
- if you can fit your data into 40 servers’ memory
- you need ACID and SQL
- you are OK with precompiled Java-based stored procedures
- you don’t need multi data center deployments
now you can estimate how far you can go with VoltDB.
[1] The company specializing in MySQL services and behind the MySQL Performance Blog
via: http://www.mysqlperformanceblog.com/2011/02/28/is-voltdb-really-as-scalable-as-they-claim/
Wednesday, 16 March 2011
MySQL Fork Drizzle Released
Drizzle aims to be different from MySQL, stripping out “unnecessary” features loved by enterprise and OEMs in the name of greater speed and simplicity and for reduced management overhead.
Drizzle has no stored procedures, triggers, or views […]
Aiming to provide a database for the cloud with support for massive concurrency and optimized for increased performance, the Drizzle team started by removing “non-essential” code and features. Michael Stonebraker’s VoltDB is focusing on a different set of optimizations for achieving performance: removing logging, locking, latching, and buffer management[1].
Anyway, it is not about whose approach is better, but about which scenarios are covered by a simplified MySQL-compatible database and which by an in-memory database with predefined queries.
[1] The “NoSQL” Discussion has Nothing to Do With SQL:
If one eliminates any one of the above overhead components, one speeds up a DBMS by 25%. Eliminate three and your speedup is limited by a factor of two. You must get rid of all four to run a lot faster.
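One way to read these figures is as an Amdahl-style calculation. The sketch below assumes, purely for illustration, that each of the four overhead components costs an equal 20% of run time and that useful work is the remaining 20%; that split is not stated in the quote, it is simply the equal split that reproduces the 25% figure.

```python
from fractions import Fraction

# Assumption: logging, locking, latching, and buffer management each take
# one fifth of total run time; the last fifth is useful work.
OVERHEAD_SHARE = Fraction(1, 5)


def speedup(components_removed, share=OVERHEAD_SHARE):
    """Speedup after eliminating `components_removed` equal-cost components."""
    remaining_time = 1 - components_removed * share
    return 1 / remaining_time


for removed in range(1, 5):
    print(removed, float(speedup(removed)))
# 1 1.25
# 2 1.6666666666666667
# 3 2.5
# 4 5.0
```

Under this assumption, removing one component gives the quoted 25% speedup, removing three gives 2.5x (in the neighborhood of the quoted "factor of two"), and only removing all four gives a substantially larger gain.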
via: http://www.channelregister.co.uk/2011/03/16/drizzle_released/
Monday, 17 January 2011
VoltDB: 3 Concepts that Make it Fast
John Hugg lists the 3 concepts that make VoltDB fast:
- Exploit repeatable workloads: VoltDB exclusively uses a stored procedure interface.
- Partition data to horizontally scale: VoltDB divides data among a set of machines (or nodes) in a cluster to achieve parallelization of work and near linear scale-out.
- Build a SQL executor that’s specialized for the problem you’re trying to solve: If stored procedures take microseconds, why interleave their execution with a complex system of row and table locks and thread synchronization? It’s much faster and simpler just to execute work serially.
Let’s take a quick look at these.
Using stored procedures, instead of allowing free-form queries, would allow the system:
- to completely skip query parsing and the creation and optimization of execution plans at runtime
- to analyze the set of stored procedures at deploy time and possibly generate the appropriate indexes
The benefits of horizontally partitioned data are well understood: parallelization and easier, more cost-effective hardware usage.
Single threaded execution can also help by removing the need for locking and reducing data access contention.
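The combination of partitioning and serial execution can be sketched in a few lines. This is a toy model, not VoltDB's implementation: the `Partition` and `Cluster` classes, the modulo routing rule, and the `add_vote` procedure are all hypothetical names introduced for illustration, and the partitions are drained one after another here, whereas a real system would give each partition its own thread or core.

```python
from collections import deque


class Partition:
    """One partition: private data plus a queue of work run one item at a time."""

    def __init__(self):
        self.data = {}
        self.queue = deque()

    def submit(self, procedure, *args):
        self.queue.append((procedure, args))

    def run(self):
        # Serial execution: each procedure has the partition to itself,
        # so no row locks, table locks, or latches are needed.
        results = []
        while self.queue:
            procedure, args = self.queue.popleft()
            results.append(procedure(self.data, *args))
        return results


class Cluster:
    def __init__(self, partition_count):
        self.partitions = [Partition() for _ in range(partition_count)]

    def partition_for(self, key):
        # Assumed routing rule: integer partitioning key modulo partition count.
        return self.partitions[key % len(self.partitions)]

    def call(self, procedure, key, *args):
        self.partition_for(key).submit(procedure, key, *args)

    def run_all(self):
        return [partition.run() for partition in self.partitions]


def add_vote(data, contestant_id, count):
    """A predefined "stored procedure": the only way to touch the data."""
    data[contestant_id] = data.get(contestant_id, 0) + count
    return data[contestant_id]


cluster = Cluster(4)
cluster.call(add_vote, 1, 5)
cluster.call(add_vote, 1, 2)
cluster.call(add_vote, 6, 1)
results = cluster.run_all()
print(results)  # [[], [5, 7], [1], []]
```

Both calls for contestant 1 land on the same partition and run back to back, so the second one sees the result of the first without any locking.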
While these 3 solutions make a lot of sense and can definitely make a system faster, there’s one major aspect of VoltDB that’s missing from the above list and which I think is critical to explaining its speed: VoltDB is an in-memory storage solution.
Here are a couple of examples of other NoSQL databases that benefit from being in memory (or as close as possible to it). MongoDB, while being a lot more liberal with the queries it accepts, can deliver very fast results by keeping as much data in memory as possible (remember what happened when it had to hit the disk more often?) and using appropriate indexes where needed. Redis and Memcached can deliver amazingly fast results because they keep all data in memory. And Redis is single-threaded while Memcached is not.