voltdb: All content tagged as voltdb in NoSQL databases and polyglot persistence
From the VoltDB 3.0 preview notes:
VoltDB v3.0 includes a new transaction coordination architecture that reduces latency and improves transaction throughput.
In VoltDB versions 1.x and 2.x, transactions were globally ordered and ordered according to time. Each node on the cluster communicated with every other node to agree upon transaction ordering. Maintaining a global agreement based on time caused VoltDB to incur additional latency - intra-node communication agreement involving all nodes. This architecture required users to synchronize cluster node clocks using NTP, because any time drift between nodes would introduce artificial, unneeded latency to the system. This overhead, and the strict need to maintain clock synchronization has been eliminated from the VoltDB v3.0 code base.
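To make the clock-drift point concrete, here is a toy model of time-based ordering. It is not VoltDB's actual protocol: the `Node` class, the `clock_offset_ms` field and the release rule are assumptions made only for illustration, and network latency is ignored. The rule is that a transaction stamped with time T can run only once every other node's clock has also passed T.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    clock_offset_ms: float  # hypothetical drift relative to true time


def release_delay_ms(initiator: Node, others: list[Node]) -> float:
    """Extra wait before a transaction stamped by `initiator` may run.

    Toy rule: the transaction's timestamp is the initiator's local clock,
    and it is released once the slowest other clock has passed that value.
    """
    slowest = min(n.clock_offset_ms for n in others)
    return max(0.0, initiator.clock_offset_ms - slowest)


a = Node("a", 0.0)
b = Node("b", -5.0)   # clock runs 5 ms behind
c = Node("c", 20.0)   # clock runs 20 ms ahead

# A transaction stamped by the fast node waits for the slowest clock.
print(release_delay_ms(c, [a, b]))
# A transaction stamped by the slow node needs no extra wait in this model.
print(release_delay_ms(b, [a, c]))
```

In this model the added latency is exactly the drift between the initiator and the slowest clock, which is why the older versions needed NTP.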
So how does the new solution work?
Original title and link: VoltDB 3.0 Will Include a New Transaction Coordination Architecture ( ©myNoSQL)
A couple of things I don’t see mentioned in the RedMonk post:
- if and how the data has been normalized for each connector’s availability. According to the post, data was collected between Jan. 2011 and Mar. 2012, and I think not all connectors were available from the beginning of that period.
- if and how the marketing push for each connector has been weighed in. Announcing the Hadoop connector at an event with 2000 attendees or the MongoDB connector at an event with 800 attendees could definitely influence the results (nb: keep in mind that the largest number is less than 7000, thus 200-500 downloads triggered by such an event have a significant impact).
- Redis and VoltDB are mostly OLTP-only databases
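A quick back-of-the-envelope check of the event effect mentioned above. The only inputs taken from the post are the 200-500 range and the "less than 7000" ceiling; the function name is mine, and 7000 is used as an upper bound, so the real shares would be at least this large.

```python
def event_share(event_downloads: int, total_downloads: int) -> float:
    """Fraction of a connector's total downloads attributable to one event."""
    return event_downloads / total_downloads


TOTAL_UPPER_BOUND = 7000  # the post says the largest number is below this

for bump in (200, 500):
    share = event_share(bump, TOTAL_UPPER_BOUND)
    print(f"{bump} event downloads = {share:.1%} of {TOTAL_UPPER_BOUND}")
```

Even for the most downloaded connector, a single announcement could account for roughly 3% to 7% of the total, and proportionally more for the smaller ones.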
Original title and link: NoSQL Databases Adoption in Numbers ( ©myNoSQL)
From the announcement of VoltDB being used by the Japanese ISP, Sakura Internet, for their real-time Internet traffic monitoring and analysis platform for detecting and mitigating large-scale distributed denial of service (DDoS) attacks:
Tamihiro Yuzawa: Our system needs to be capable of sifting through massive amounts of traffic flow data in real-time. VoltDB was our choice from the beginning because it’s a super-fast datastore that supports SQL.
Scott Jarr: Sakura’s security infrastructure requires a datastore that can scale massively and on demand, without sacrificing data accuracy.
Mark these VoltDB keywords:
- fast (read: in-memory)
- data consistency
Original title and link: VoltDB for Real-Time Network Monitoring ( ©myNoSQL)
“We tried using NoSQL, but we are moving to Relational Databases because they are easier…”
This is how Oren Eini starts his post about RavenDB’s support for multi-document transactions and the lack of it in MongoDB:
- For a single server, we support atomic multi document writes natively. (note that this isn’t the case for Mongo even for a single server).
- For multiple servers, we strongly recommend that your sharding strategy will localize documents, meaning that the actual update is only happening on a single server.
- For multi server, multi document atomic updates, we rely on distributed transactions.
In the NoSQL space, there are a couple of other solutions that support transactions:
- Google Megastore
- Redis has two mechanisms that come close to transactions: MULTI/EXEC/DISCARD and pipelining (the latter is exemplified in this Redis-based triplestore database implementation)
- many of the graph databases (Neo4j, HyperGraphDB, InfoGrid)
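For readers who haven't used the Redis commands mentioned in the list above, here is a small in-memory imitation of the MULTI/EXEC/DISCARD idea. It is not a Redis client and does not talk to a server; the `ToyTransaction` class is hypothetical and only mimics the queuing behavior: commands issued after MULTI are held back, EXEC applies them in order, DISCARD throws them away.

```python
class ToyTransaction:
    """In-memory imitation of MULTI/EXEC/DISCARD-style command queuing."""

    def __init__(self, store: dict):
        self.store = store
        self.queue = None  # None means no MULTI is in progress

    def multi(self) -> None:
        self.queue = []

    def set(self, key, value) -> None:
        if self.queue is None:
            self.store[key] = value          # outside MULTI: apply at once
        else:
            self.queue.append((key, value))  # inside MULTI: only queue

    def exec(self) -> int:
        if self.queue is None:
            raise RuntimeError("EXEC without MULTI")
        queued, self.queue = self.queue, None
        for key, value in queued:            # apply queued writes in order
            self.store[key] = value
        return len(queued)

    def discard(self) -> None:
        self.queue = None                    # drop everything queued


store = {}
tx = ToyTransaction(store)

tx.multi()
tx.set("a", 1)
tx.set("b", 2)
print(store)   # nothing applied yet
tx.exec()
print(store)   # both writes applied together

tx.multi()
tx.set("a", 99)
tx.discard()
print(store)   # the discarded write never happened
```

Note what this leaves out: there is no rollback of writes that were already applied, which is part of why these mechanisms only "come close" to transactions.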
If you look at these from the perspective of distributed systems, the only distributed ones that support transactions are Megastore and RavenDB. There’s also VoltDB which is all transactions. Are there any I’ve left out?
Original title and link: Multi-Document Transactions in RavenDB vs Other NoSQL Databases (NoSQL database ©myNoSQL)
Here are the notes I’ve made while watching a webinar about building applications with VoltDB.
What I like:
- it forces you to think upfront about data partitioning by specifying partitioned or replicated tables
- it forces you to think about data access patterns by asking you to define the Java-based stored procedures
- it provides both a synchronous and asynchronous API
- there’s an option to run any queries in development mode
What I don’t like:
- you need to compile and deploy the schema, queries, etc.
- you have to define the cluster topology in an XML file
- everything is transactional
Let’s say you have a k-factor of 2 and a materialized view: an insert will put your data on the 3 servers and update the materialized view within a single transaction.
- it’s not clear how you could evolve your schema
- the API doesn’t use timeouts
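To illustrate the partitioned vs replicated distinction and the k-factor example from the notes above, here is a rough sketch of where a row could end up. The only thing taken from the webinar notes is that a k-factor of 2 means the data lands on 3 servers; the hashing scheme, the function names and the server names are my own assumptions, not how VoltDB actually places data.

```python
import zlib


def copies(k_factor: int) -> int:
    """A k-factor of k means every row is kept on k + 1 servers."""
    return k_factor + 1


def partitioned_placement(partition_key: str, servers: list[str], k_factor: int) -> list[str]:
    """Toy placement for a partitioned table: hash the key, then take
    k_factor + 1 consecutive servers (hypothetical scheme)."""
    n = copies(k_factor)
    if n > len(servers):
        raise ValueError("not enough servers for this k-factor")
    start = zlib.crc32(partition_key.encode()) % len(servers)
    return [servers[(start + i) % len(servers)] for i in range(n)]


def replicated_placement(servers: list[str]) -> list[str]:
    """A replicated table is copied to every server."""
    return list(servers)


servers = ["s1", "s2", "s3", "s4", "s5"]  # hypothetical cluster
print(partitioned_placement("user-42", servers, k_factor=2))
print(replicated_placement(servers))
```

With a k-factor of 2 the partitioned insert touches 3 of the 5 servers, and in VoltDB's model all of those writes (plus the materialized view update) happen inside one transaction.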