The next version of Neo4j will remove the dependency on ZooKeeper for high availability setups. In a post on the Neo4j blog, the team has announced the availability of the first milestone of Neo4j 1.9, which already contains the new implementation of the Neo4j High Availability Cluster:
With Neo4j 1.9 M01, cluster members communicate directly with each other, based on an implementation of the Paxos consensus protocol for master election.
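To make the master-election idea concrete, here is a deliberately simplified single-decree Paxos sketch in Python. This is illustrative only, not Neo4j's actual implementation; the class and function names are my own, and real Paxos deployments add failure handling, messaging, and multi-round state that are omitted here.

```python
# Simplified single-decree Paxos: acceptors promise to ignore ballots lower
# than the highest they have seen; a value is chosen once a majority accepts.

class Acceptor:
    def __init__(self):
        self.promised = -1          # highest ballot promised so far
        self.accepted = (-1, None)  # (ballot, value) last accepted

    def prepare(self, ballot):
        """Phase 1: promise to ignore lower ballots; report prior accepted value."""
        if ballot > self.promised:
            self.promised = ballot
            return self.accepted
        return None  # reject

    def accept(self, ballot, value):
        """Phase 2: accept unless a higher ballot was promised meanwhile."""
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted = (ballot, value)
            return True
        return False

def propose(acceptors, ballot, value):
    """Run one proposal round; returns the chosen value or None on failure."""
    quorum = len(acceptors) // 2 + 1
    promises = [p for p in (a.prepare(ballot) for a in acceptors) if p is not None]
    if len(promises) < quorum:
        return None
    # If any acceptor already accepted a value, we must re-propose that one.
    prior = max(promises, key=lambda p: p[0])
    if prior[1] is not None:
        value = prior[1]
    acks = sum(a.accept(ballot, value) for a in acceptors)
    return value if acks >= quorum else None

acceptors = [Acceptor() for _ in range(3)]
master = propose(acceptors, ballot=1, value="instance-2")  # elect a "master"
```

Note how a later, higher-ballot proposal for a different value still converges on the already-chosen one; that convergence is what lets a cluster agree on a single master.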
According to the updated documentation annotated with my own comments:
- Write transactions can be performed on any database instance in a cluster. (nb: writes are performed on the master first, but the cluster does the routing automatically)
- If the master fails, a new master is elected and started automatically within just a few seconds; during this time no writes can take place (writes will block or, in rare cases, throw an exception)
- If the master goes down, any running write transaction will be rolled back and new transactions will block or fail until a new master has become available.
- The cluster automatically handles instances becoming unavailable (for example due to network issues), and also makes sure to accept them as members in the cluster when they are available again.
- Transactions are atomic, consistent and durable but eventually propagated out to other slaves. (nb: a transaction includes only the write to the master)
- Updates to slaves are eventually consistent by nature but can be configured to be pushed optimistically from master during commit. (nb: writes to slaves will still not be part of the transaction)
- In case there were changes on the master that didn’t get replicated before it failed, there is a chance of ending up with two diverging versions of the data if the failed master recovers. This situation is resolved by having the old master dismiss its copy of the data (nb: the documentation says the old copy is moved away)
- Reads are highly available and the ability to handle read load scales with more database instances in the cluster.
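For orientation, a minimal HA configuration for one cluster member might look roughly like the fragment below. The property names are from the 1.9-era documentation; verify them against the manual for your exact version.

```properties
# Hypothetical minimal HA config for one cluster member (Neo4j 1.9-era names)
ha.server_id = 1
ha.initial_hosts = 192.168.0.1:5001,192.168.0.2:5001,192.168.0.3:5001
# Optimistically push committed transactions to this many slaves at commit time
ha.tx_push_factor = 1
```

The `ha.tx_push_factor` setting corresponds to the optimistic push behavior mentioned in the list above; the pushed updates are still outside the committing transaction.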
Original title and link: Next Neo4j Version Implementing HA Without ZooKeeper ( ©myNoSQL)
Scroll to minute 16:55 of this video to watch Jim Webber explain the benefits of polyglot persistence and how starting (again) the winner-takes-all war is just sending us back at least 10 years from the database Nirvana.
We’ve just come from the place where one-size-fits-all and we don’t want to go back there. There is a huge wonderful ecosystem of stores. Pick the right one. Don’t just assume that the one you find the easiest or the one that shouts the loudest is the one you’re going to use. Pick the one that suits your data model.
It doesn’t matter what flavor of relational or NoSQL database you prefer or have experience with, or whether a small or large database vendor is paying your bills. You really need to get this right, as otherwise we’re just going to destroy a lot of valuable options we’ve added to our toolboxes.
Original title and link: The Database Nirvana ( ©myNoSQL)
Very good slide deck from Max de Marzi introducing Neo4j’s Cypher query language. While you’ll have to go through the 50 slides yourself to get the details, I’ve extracted a couple of interesting bits:
- Cypher was created because Neo4j’s Java API was too verbose and Gremlin too prescriptive
- SPARQL was designed for a different data model and doesn’t work very well with a graph database
- Cypher design decisions:
- ASCII-art patterns (nb: when I first saw Cypher I didn’t think of this, but it is cool)
- external DSL
- SQL familiarity (nb: as much as it’s possible with a radically different data model and processing model)
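To show what “ASCII-art patterns” means, here is a small Cypher sketch in the 1.x-era syntax; the node id, relationship type, and property name are my own illustrative choices, not taken from the slides.

```cypher
// Find the names of people that the starting node KNOWS.
// The (node)-[:REL]->(node) pattern is drawn as ASCII art.
START me = node(1)
MATCH (me)-[:KNOWS]->(friend)
RETURN friend.name
```

The arrow in the MATCH clause is the whole point: the query visually resembles the graph pattern it matches, which is what distinguishes Cypher from both the Java API and Gremlin’s traversal style.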
Marko A. Rodriguez in Exploring Wikipedia with Gremlin Graph Traversals:
There are numerous ways in which Wikipedia can be represented as a graph. The articles and the href hyperlinks between them is one way. This type of graph is known as a single-relational graph because all the edges have the same meaning — a hyperlink. A more complex rendering could represent the people discussed in the articles as “people-vertices” who know other “people-vertices” and that live in particular “city-vertices” and work for various “company-vertices” — so forth and so on until what emerges is a multi-relational concept graph. For the purpose of this post, a middle ground representation is used. The vertices are Wikipedia articles and Wikipedia categories. The edges are hyperlinks between articles as well as taxonomical relations amongst the categories.
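Over a graph shaped like the one described above, Gremlin traversals read as chains of steps. The snippet below is an illustrative sketch in 1.x-era Gremlin syntax; the vertex id and edge labels are assumptions of mine, not taken from Rodriguez’s post.

```groovy
// Articles that article vertex 1 links to (hypothetical 'hyperlink' label)
g.v(1).out('hyperlink')

// Sibling articles: follow the article's category, then back to its members
g.v(1).out('category').in('category')
```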
Imagine the richness of the model you’d achieve if every piece of data and metadata became a vertex or an edge. It’s not just the wealth of data but also the connectivity. Time would be the only missing dimension.
Original title and link: The Richness of the Graph Model: The Sky Is the Limit ( ©myNoSQL)