


VoltDB: All content tagged as VoltDB in NoSQL databases and polyglot persistence

5 Myths about NoSQL vs Relational Databases

Ryan Betts, the CTO of VoltDB, addresses an article by MongoDB’s CEO Max Schireson that seems to have struck a chord:

Recently Max Schireson, CEO of MongoDB, shared his thoughts on relational databases. His statements deserve a direct and frank opposing response. Let’s walk through the myths that Mr. Schireson promoted.

Compared with the post “Why the clock is ticking for MongoDB” by PostgreSQL’s Robert Haas, this one makes some debatable arguments, e.g. “All popular SQL systems support document types”: aside from SOA committees and MarkLogic, I’ve never heard of anyone enjoying XML. The arguments aren’t inaccurate, but they paint VoltDB’s space in too bright a color palette.

Original title and link: 5 Myths about NoSQL vs Relational Databases (NoSQL database©myNoSQL)


VoltDB raises $8M in Series B


VoltDB has raised $8 million from Sigma Ventures, Kepha Partners and three other “strategic investors”, bringing total venture capital investment to $18.7 million, said its CEO, Bruce Reading. Sigma and Kepha participated in an earlier round, in 2012, through which it raised $5.7 million.

I assume some will say it’s a small round. I’ll say congrats to the VoltDB team.

Original title and link: VoltDB raises $8M in Series B (NoSQL database©myNoSQL)


Benchmarking graph databases... with unexpected results

A team from MIT CSAIL set out to benchmark a graph database and 3 relational databases with different models: row-based (MySQL), in-memory (VoltDB), and column-based (Vertica). The results are interesting, to say the least:

We can see that relational databases outperform Neo4j on PageRank by up to two orders of magnitude. This is because PageRank involves full scanning and joining of the nodes and edges table, something that relational databases are very good at doing. Finding Shortest Paths involves starting from a source node and successively exploring its outgoing edges, a very different access pattern from PageRank. Still, we see from Figure 1(b) that relational databases match or outperform Neo4j in most cases. In fact, Vertica is more than twice faster than Neo4j. The only exception is VoltDB over Twitter dataset.

Being beaten at your own game is not a good thing. I hope this is just a fluke in the benchmark (misconfiguration) or a result particular to those data sets.
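
To make the “full scanning and joining of the nodes and edges table” point concrete, here is a minimal sketch of PageRank written as SQL joins, run through Python’s built-in sqlite3 module. This is not the benchmark’s code: the table names, column names, damping factor and iteration count are all assumptions chosen for illustration, and vertices without outgoing edges are not handled specially.

```python
import sqlite3


def pagerank_sql(edges, iterations=20, damping=0.85):
    """Illustrative PageRank over (src, dst) edge pairs using SQL scans and joins."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edges (src INTEGER, dst INTEGER)")
    conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, out_degree INTEGER, score REAL)")
    conn.executemany("INSERT INTO edges VALUES (?, ?)", edges)

    # Every vertex that appears in any edge becomes a row in the nodes table.
    conn.execute("""
        INSERT INTO nodes (id, out_degree, score)
        SELECT v.id, (SELECT COUNT(*) FROM edges e WHERE e.src = v.id), 0.0
        FROM (SELECT src AS id FROM edges UNION SELECT dst FROM edges) AS v
    """)
    n = conn.execute("SELECT COUNT(*) FROM nodes").fetchone()[0]
    conn.execute("UPDATE nodes SET score = 1.0 / ?", (n,))

    for _ in range(iterations):
        # One iteration: scan all edges, join each to its source node,
        # and sum the score flowing into every destination.
        conn.execute("DROP TABLE IF EXISTS contrib")
        conn.execute("""
            CREATE TEMP TABLE contrib AS
            SELECT e.dst AS id, SUM(n.score / n.out_degree) AS incoming
            FROM edges e JOIN nodes n ON n.id = e.src
            GROUP BY e.dst
        """)
        # Simplification: score held by vertices with no outgoing edges is dropped.
        conn.execute("""
            UPDATE nodes SET score = (1.0 - ?) / ? + ? * COALESCE(
                (SELECT incoming FROM contrib WHERE contrib.id = nodes.id), 0.0)
        """, (damping, n, damping))

    scores = dict(conn.execute("SELECT id, score FROM nodes"))
    conn.close()
    return scores


if __name__ == "__main__":
    # Vertex 3 is linked to by both 1 and 2, so it should end up with the top score.
    print(pagerank_sql([(1, 3), (2, 3), (3, 1), (3, 2)]))
```

Every iteration touches the whole edges table once and aggregates by destination, which is the bulk access pattern the quoted paragraph says relational engines handle well. Shortest-path search, by contrast, follows edges outward from a single vertex.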

Original title and link: Benchmarking graph databases… with unexpected results (NoSQL database©myNoSQL)


VoltDB Hits Proverbial Version 3.0

Andrew Brust for ZDNet:

In the software world, many people believe that products reach true maturity and value in their third version. A first release is about bringing an idea to market, and second releases act to stabilize the first. But it’s “v3” that really incorporates refinements that reflect user feedback and market lessons learned. It seems to me that “NewSQL” database player VoltDB is following that pattern with its own 3.0 release.

There’s no such thing as a proverbial version 3.0, nor is there a pattern to what version 3 means. Anyway, congrats to the VoltDB guys!

If you want to know what’s in VoltDB 3.0, ignore the linked post and go read Introducing VoltDB 3.0 on the VoltDB blog. Short version: better performance, improved SQL support, support for JSON-encoded data, and indexes on JSON columns.

Original title and link: VoltDB Hits Proverbial Version 3.0 (NoSQL database©myNoSQL)


VoltDB 3.0 Will Include a New Transaction Coordination Architecture

From the VoltDB 3.0 preview notes:

VoltDB v3.0 includes a new transaction coordination architecture that reduces latency and improves transaction throughput.

In VoltDB versions 1.x and 2.x, transactions were globally ordered and ordered according to time. Each node on the cluster communicated with every other node to agree upon transaction ordering. Maintaining a global agreement based on time caused VoltDB to incur additional latency - intra-node communication agreement involving all nodes. This architecture required users to synchronize cluster node clocks using NTP, because any time drift between nodes would introduce artificial, unneeded latency to the system. This overhead, and the strict need to maintain clock synchronization has been eliminated from the VoltDB v3.0 code base.

So how does the new solution work?

Original title and link: VoltDB 3.0 Will Include a New Transaction Coordination Architecture (NoSQL database©myNoSQL)

Integrating VoltDB With the Spring Framework

There are two Java clients for VoltDB. One is a standard JDBC driver that executes all queries synchronously. The other is a specialized client library that can run queries either synchronously or asynchronously, along with a number of other features. Synchronous queries perform well enough but their throughput is no match for asynchronous queries. Asynchronous query throughput is approximately four times greater than synchronous queries in a two node VoltDB cluster. For example, an application using asynchronous queries can run over 200K TPS (transactions per second) in a two node server cluster using a single client running on a Macbook Pro; a synchronous client running the same queries will achieve around 56K TPS.

Could anyone explain what leads to such a difference in performance?
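
One plausible explanation, offered as an assumption rather than something the quoted post states: a synchronous client waits out a full round trip before sending the next query, while an asynchronous client keeps many queries in flight and overlaps the waiting. The sketch below does not use the VoltDB client at all. It models each call as a fixed delay, and the delay, the function names and the in-flight limit are made up for illustration.

```python
import asyncio
import time

ROUND_TRIP_S = 0.005  # assumed latency of one call (illustrative value)


async def fake_call():
    """Stand-in for one query; only the waiting is modeled."""
    await asyncio.sleep(ROUND_TRIP_S)


async def run_one_at_a_time(n):
    """Synchronous style: the next call starts only after the previous one returns."""
    start = time.perf_counter()
    for _ in range(n):
        await fake_call()
    return n / (time.perf_counter() - start)


async def run_pipelined(n, max_in_flight):
    """Asynchronous style: up to max_in_flight calls wait on the server at once."""
    limit = asyncio.Semaphore(max_in_flight)

    async def one():
        async with limit:
            await fake_call()

    start = time.perf_counter()
    await asyncio.gather(*(one() for _ in range(n)))
    return n / (time.perf_counter() - start)


def compare(n=40, max_in_flight=8):
    """Return (one-at-a-time calls per second, pipelined calls per second)."""
    sync_tps = asyncio.run(run_one_at_a_time(n))
    async_tps = asyncio.run(run_pipelined(n, max_in_flight))
    return sync_tps, async_tps


if __name__ == "__main__":
    sync_tps, async_tps = compare()
    print(f"one at a time: {sync_tps:.0f} calls/s, pipelined: {async_tps:.0f} calls/s")
```

In this toy model throughput grows roughly with the number of calls in flight, because the client no longer sits idle for a full round trip between queries. A real cluster eventually hits server-side limits, which could be why the quoted gap is about four times rather than unbounded.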

Original title and link: Integrating VoltDB With the Spring Framework (NoSQL database©myNoSQL)


NoSQL Databases Adoption in Numbers

The source of the data is Jaspersoft’s NoSQL connector downloads. RedMonk published a graphic and an analysis, and Klint Finley followed up with job trends:

NoSQL databases adoption

A couple of things I don’t see mentioned in the RedMonk post:

  1. if and how data has been normalized based on each connector’s availability

    According to the post, data was collected between Jan. 2011 and Mar. 2012, and I think not all connectors were available from the beginning of that period.

  2. if and how the marketing push for each connector has been weighed in

    Announcing the Hadoop connector at an event with 2000 attendees or the MongoDB connector at an event with 800 attendees could definitely influence the results (nb: keep in mind that the largest number is less than 7000, so 200-500 downloads triggered by such an event would have a significant impact)

  3. Redis and VoltDB are mostly OLTP-only databases

Original title and link: NoSQL Databases Adoption in Numbers (NoSQL database©myNoSQL)

VoltDB Assumptions: Memory vs Disk

These are the assumptions under which VoltDB was architected:

First, it should be noted that main memory is getting very cheap.  It is straightforward to put 50 Gbytes of memory on a $5,000 server.  Beefy servers these days have 10 times that amount. Moreover, many (but not all) transactional databases don’t require massive storage volumes. An OLTP application with more than a few Tbytes of data is quite rare.  The same can be said for new OLTP “fire hose” applications that require ultra-high write throughput and ACID transactions (e.g., digital advertising, wireless, real-time monitoring) — these systems rarely need to manage more than a few Tbytes of hot state.  Hence, it is plausible to buy enough main memory to store the data for the vast majority of OLTP applications. 
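
As a back-of-the-envelope check on that argument, here is a small sketch using the quoted figures of 50 Gbytes per $5,000 server. The helper name is made up, and the calculation assumes decimal units and ignores replication, indexes and per-row overhead, so treat the output as a rough lower bound rather than a sizing guide.

```python
import math


def servers_needed(dataset_gb, memory_per_server_gb=50, cost_per_server_usd=5000):
    """Return (server count, total hardware cost) to hold dataset_gb in main memory.

    Defaults come from the quoted passage; replication and overhead are ignored.
    """
    servers = math.ceil(dataset_gb / memory_per_server_gb)
    return servers, servers * cost_per_server_usd


if __name__ == "__main__":
    # A 2 Tbyte (2,000 Gbyte) dataset on the $5,000 servers from the quote.
    print(servers_needed(2000))
    # The same dataset on a "beefy" server with 10 times the memory (price not given in the quote).
    print(servers_needed(2000, memory_per_server_gb=500)[0])
```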

Original title and link: VoltDB Assumptions: Memory vs Disk (NoSQL database©myNoSQL)


VoltDB for Real-Time Network Monitoring

From the announcement of VoltDB being used by the Japanese ISP, Sakura Internet, for their real-time Internet traffic monitoring and analysis platform for detecting and mitigating large-scale distributed denial of service (DDoS) attacks:

Tamihiro Yuzawa[1]: Our system needs to be capable of sifting through massive amounts of traffic flow data in real-time.  VoltDB was our choice from the beginning because it’s a super-fast datastore that supports SQL. 

Scott Jarr[2]: Sakura’s security infrastructure requires a datastore that can scale massively and on demand, without sacrificing data accuracy.

Mark these VoltDB keywords:

  1. fast (read: in-memory)
  2. data consistency
  3. SQL

  1. Tamihiro Yuzawa: Systems Engineer at Sakura Internet  

  2. Scott Jarr: VoltDB CEO  

Original title and link: VoltDB for Real-Time Network Monitoring (NoSQL database©myNoSQL)

Comments on Urban Myths About NoSQL

Dan Weinreb comments on Michael Stonebraker’s Urban Myths about NoSQL (PDF):

Dr. Michael Stonebraker recently posted a presentation entitled “Urban Myths about NoSQL”. Its primary point is to defend SQL, i.e. relational, database systems against the claims of the new “NoSQL” data stores. Dr. Stonebraker is one of the original inventors of relational database technology, and has been one of the most eminent database researchers and practitioners for decades.

In fact, Michael Stonebraker bashes everything that is not his current product—this GigaOm interview is the latest example.

For now, I’m filing this away until VoltDB is sold.

Original title and link: Comments on Urban Myths About NoSQL (NoSQL database©myNoSQL)


Multi-Document Transactions in RavenDB vs Other NoSQL Databases

“We tried using NoSQL, but we are moving to Relational Databases because they are easier…”

This is how Oren Eini starts his post about RavenDB’s support for multi-document transactions and MongoDB’s lack of it:

  1. For a single server, we support atomic multi document writes natively. (note that this isn’t the case for Mongo even for a single server).
  2. For multiple servers, we strongly recommend that your sharding strategy will localize documents, meaning that the actual update is only happening on a single server.
  3. For multi server, multi document atomic updates, we rely on distributed transactions.

In the NoSQL space, there are a couple of other solutions that support transactions:

If you look at these from the perspective of distributed systems, the only distributed ones that support transactions are Megastore and RavenDB. There’s also VoltDB which is all transactions. Are there any I’ve left out?

Original title and link: Multi-Document Transactions in RavenDB vs Other NoSQL Databases (NoSQL database©myNoSQL)

Short Notes about VoltDB

Here are the notes I made while watching a webinar about building applications with VoltDB.

What I like:

What I don’t like:

  • you need to compile and deploy the schema, queries, etc.
  • you have to define the cluster topology in an XML file
  • everything is transactional. Let’s say you have k-factor 2 and a materialized view: an insert will write your data to 3 servers and update the materialized view, all within a single transaction.
  • it’s not clear how you could evolve your schema
  • the API doesn’t use timeouts

Original title and link: Short Notes about VoltDB (NoSQL databases © myNoSQL)