


graphdb: All content tagged as graphdb in NoSQL databases and polyglot persistence

Using Treetop and Neo4j Cypher to Simulate Facebook Graph Search

Interesting as an exercise, especially since Max de Marzi shared all the code on GitHub, but it comes nowhere near the breadth and depth of Facebook Graph Search.
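To give a flavor of the idea (turning a natural-language phrase into a Cypher query), here is a minimal sketch in Python. It is not Max de Marzi's code, which uses Ruby and a Treetop grammar; a regular expression stands in for the grammar. The function name `to_cypher`, the single supported phrase shape, the `Person` label, the `FRIENDS` and `LIKES` relationship types, and the `name` property are all assumptions made for illustration.

```python
import re

# Hypothetical phrase shape handled by this sketch: "[my] friends who like <thing>".
PHRASE = re.compile(r"^(?:my\s+)?friends\s+who\s+like\s+(?P<thing>.+)$", re.IGNORECASE)


def to_cypher(phrase, me):
    """Translate one fixed phrase shape into a parameterized Cypher query.

    Returns (query, params). The labels, relationship types, and property
    names below are illustrative assumptions, not a real schema.
    """
    found = PHRASE.match(phrase.strip())
    if found is None:
        raise ValueError("unsupported phrase: %r" % phrase)
    query = (
        "MATCH (me:Person {name: $me})-[:FRIENDS]-(friend:Person)"
        "-[:LIKES]->(thing {name: $thing}) "
        "RETURN friend.name"
    )
    # Values travel as parameters instead of being spliced into the query text.
    return query, {"me": me, "thing": found.group("thing")}
```

Calling `to_cypher("My friends who like Neo4j", me="Max")` returns the query text plus a parameter map. A real grammar would need many more phrase shapes, which is where the gap to Facebook Graph Search shows.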

Original title and link: Using Treetop and Neo4j Cypher to Simulate Facebook Graph Search (NoSQL database©myNoSQL)


Neo Technology Is H… Wait, It’s Building Neo4j-As-A-Service

Neo Technology’s hiring announcement is clear about their intention:

“[…] you will be responsible for building, managing, and maintaining a 24x7 NOSQL Databases-as-a-Service operation […]”

In the graph database space, OrientDB already offers a hosted solution, NuvolaBase, but I have no numbers about their business so far.


Linkurious: Visualize Graph Data Easily

Nice tool for visualizing and exploring graph databases:


Currently it supports only Neo4j, but it can be extended to other graph databases.



On Graph Computing: Practical Applications and Graph Computing Technologies

Marko A. Rodriguez in a must-read-must-bookmark-must-print article about graphs, graph processing, their applicability, and related technologies:

The concept of a graph has been around since the dawn of mechanical computing and for many decades prior in the domain of pure mathematics. Due in large part to this golden age of databases, graphs are becoming increasingly popular in software engineering. Graph databases provide a way to persist and process graph data. However, the graph database is not the only way in which graphs can be stored and analyzed. Graph computing has a history prior to the use of graph databases and has a future that is not necessarily entangled with typical database concerns. There are numerous graph technologies that each have their respective benefits and drawbacks. Leveraging the right technology at the right time is required for effective graph computing.



A Comparison of 7 Graph Databases

The main page of InfiniteGraph, a graph database commercialized by Objectivity, features an interesting comparison of 7 graph databases (InfiniteGraph, Neo4j, AllegroGraph, Titan, FlockDB, Dex, OrientDB) based on 16 criteria: licensing, source, scalability, graph model, schema model, API, query method, platforms, consistency, concurrency (distributed processing), partitioning, extensibility, visualizing tools, storage back end/persistency, language, backup/restore.

7 graph databases

Unfortunately the image is almost unreadable, but Peter Karussell has extracted the data into a Google Docs spreadsheet (embedded in the original post).


Using Neo4j Graph Database With Ruby

A two-part article by Thiago Jackiw providing a brief explanation of what graph databases and Neo4j are, and a quick look at 3 Ruby libraries for Neo4j: Neo4j.rb, Neography, and Neoid.

This article demonstrated how to install Neo4j and the basic idea of how to integrate it with a Ruby/Rails application using the different solutions available. Even though the examples given here barely scratched the surface of Neo4j, it should hopefully give you enough knowledge and curiosity to start integrating it on your own projects.


Neo Technology Raises Another $11mil for Neo4j Graph Database

Derrick Harris for GigaOm:

Graph database startup Neo Technology has raised another $11 million, providing more fuel to the fire of specialized databases. Whether they’re graph databases organizing data by relationships, or geospatial databases concerned with where stuff is located, everyone is trying to capitalize on myriad new data sources available.

According to my calculations, this brings Neo Technology’s total funding to $24.1 million ($11M now, $10.6M in Sept. 2011, and $2.5M in Oct. 2009).



Next Neo4j Version Implementing HA Without ZooKeeper

The next version of Neo4j will remove the dependency on ZooKeeper for high availability setups. In a post on the Neo4j blog, the team announced the first milestone of Neo4j 1.9, which already contains the new implementation of the Neo4j High Availability cluster:

With Neo4j 1.9 M01, cluster members communicate directly with each other, based on an implementation of the Paxos consensus protocol for master election.

According to the updated documentation, annotated with my own comments:

  • Write transactions can be performed on any database instance in a cluster. (nb: writes are performed on the master first, but the cluster does the routing automatically)
  • If the master fails a new master will be elected automatically. A new master is elected and started within just a few seconds and during this time no writes can take place (the writes will block or in rare cases throw an exception)
  • If the master goes down any running write transaction will be rolled back and new transactions will block or fail until a new master has become available.
  • The cluster automatically handles instances becoming unavailable (for example due to network issues), and also makes sure to accept them as members in the cluster when they are available again.
  • Transactions are atomic, consistent and durable but eventually propagated out to other slaves. (nb: a transaction includes only the write to the master)
  • Updates to slaves are eventually consistent by nature but can be configured to be pushed optimistically from the master during commit. (nb: writes to slaves will still not be part of the transaction)
  • In case there were changes on the master that didn’t get replicated before it failed, it is possible to end up with two different versions of the data if the failed master recovers. This situation is resolved by having the old master dismiss its copy of the data (nb: the documentation says “move away”)
  • Reads are highly available and the ability to handle read load scales with more database instances in the cluster.
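The behavior in the list above can be sketched as a toy, single-process model. This is not Neo4j's API or implementation: the `Instance` and `Cluster` classes are hypothetical, the election rule (longest log wins) is a stand-in for the Paxos-based election, election is instantaneous rather than taking a few seconds, and replication to slaves happens only when `pull_updates` is called.

```python
class Instance:
    def __init__(self, name):
        self.name = name
        self.alive = True
        self.log = []  # committed transactions, in commit order


class Cluster:
    def __init__(self, names):
        self.instances = {name: Instance(name) for name in names}
        self.master = names[0]  # name of the current master, or None

    def _elect(self):
        # Stand-in for the real election (NOT Paxos): pick the live
        # instance with the longest log, breaking ties by name.
        live = [i for i in self.instances.values() if i.alive]
        self.master = max(live, key=lambda i: (len(i.log), i.name)).name if live else None

    def write(self, via, tx):
        # Any live instance accepts the write, but it commits on the master.
        if not self.instances[via].alive:
            raise RuntimeError("instance %s is down" % via)
        if self.master is None:
            raise RuntimeError("no master available, write fails")
        self.instances[self.master].log.append(tx)

    def pull_updates(self, name):
        # Slaves catch up eventually; here that is an explicit call.
        if self.master is not None and self.instances[name].alive:
            self.instances[name].log = list(self.instances[self.master].log)

    def fail(self, name):
        self.instances[name].alive = False
        if name == self.master:
            self._elect()

    def recover(self, name):
        self.instances[name].alive = True
        if self.master is None:
            self._elect()
        else:
            # A returning instance (including an old master with
            # unreplicated changes) drops its copy and follows the master.
            self.instances[name].log = list(self.instances[self.master].log)
```

In this model, a transaction committed on the master but never pulled by a slave is lost when the master fails and later rejoins, which is the "two different versions" case from the documentation.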


What Is the Most Promising Graph Datastore?

Very interesting answer on Quora from professor Josep Lluis Larriba Pey.

  1. for very large data sizes (TB): InfiniteGraph, DEX
  2. for query speed: DEX
  3. for transaction support: Neo4j



What's the Current State of Graph Databases?

Jim Webber[1] in an interview with Srini Penchikala for InfoQ:

The graph databases are odd, because they’ve actually decided to have a much more expressive data model compared to relational databases. So I think they are an oddity compared to the other three types of NoSQL stores, which means that when a developer first comes across them there is an awful lot of head scratching—you can see this haircut was completely caused by Neo4J. So I think compared to the other NoSQL stores, the graph database community is a little bit further behind in terms of adoption and penetration because they are a bit of an odd beast when you look at them first, “What would I use graphs for, they are those things I forgot from university, with that boring old guy doing math on the whiteboard”, on the blackboard even, I’m so old we had chalk, would you believe?

It’s almost always impossible for me to disagree with Jim. Expanding a bit on the quote above, I’d speculate that a bit of head scratching before adopting a new database is good as it means you’ll not see many improper use cases.

  1. Jim Webber: Chief Scientist at Neo Technology 



Paper: Efficient Subgraph Matching on Billion Node Graphs

Papers from VLDB 2012 are starting to surface. Authored by a Chinese team, the “Efficient Subgraph Matching on Billion Node Graphs” paper introduces a new algorithm optimized for large-scale graphs:

We present a novel algorithm that supports efficient subgraph matching for graphs deployed on a distributed memory store. Instead of relying on super-linear indices, we use efficient graph exploration and massive parallel computing for query processing. Our experimental results demonstrate the feasibility of performing subgraph matching on web-scale graph data.
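To make "graph exploration" concrete, here is a small single-machine sketch of subgraph matching by backtracking over adjacency sets. It is not the paper's distributed algorithm, only an illustration of matching by walking neighbors instead of consulting a precomputed index. The function name `match_subgraph` and the dict-based graph representation (undirected adjacency sets plus a label per vertex) are assumptions of this sketch.

```python
def match_subgraph(data_adj, data_labels, query_adj, query_labels):
    """Yield dicts mapping each query vertex to a distinct data vertex
    such that labels agree and every query edge exists in the data graph."""
    order = list(query_adj)  # fixed order in which query vertices are matched

    def extend(mapping):
        if len(mapping) == len(order):
            yield dict(mapping)  # copy, because mapping is mutated below
            return
        q = order[len(mapping)]
        matched_nbrs = [n for n in query_adj[q] if n in mapping]
        if matched_nbrs:
            # Exploration step: only neighbors of an already-matched
            # vertex can be candidates for q.
            candidates = set(data_adj[mapping[matched_nbrs[0]]])
        else:
            candidates = set(data_adj)
        for d in candidates:
            if d in mapping.values() or data_labels[d] != query_labels[q]:
                continue
            # Every already-matched query neighbor must also be a data neighbor.
            if all(mapping[n] in data_adj[d] for n in matched_nbrs):
                mapping[q] = d
                yield from extend(mapping)
                del mapping[q]

    yield from extend({})
```

The worst case is still exponential; the sketch only shows why no global index is needed to find candidates once one query vertex has been placed.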

Comparison of space and time complexity of other subgraph matching algorithms:

Subgraph Matching Methods

Rolling Upgrades in Upcoming Neo4j 1.8

Chris Gioran describes rolling upgrades, a new feature in the upcoming Neo4j 1.8:

So the rolling upgrade, actually, works exactly as you’d expect an upgrade would work. If there are no breaking changes between versions, you normally begin with the slaves, powering down, copying the store, migrating configuration if needed, then bringing that server back up. The new version would take over, communicate with the rest of the cluster and you wouldn’t notice anything.

A rolling upgrade offers that with versions that have incompatible protocols. Each slave, as it is brought up, detects the version running in the cluster and gracefully falls back into a compatibility mode that doesn’t allow it to become master, but allows it to continue to execute transactions.

Another thing I’ve found interesting is that the moment the master machine is upgraded is treated as confirmation that the upgrade is complete, and all machines switch to the new protocol. Clever.
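The upgrade sequence described above can be modeled in a few lines. This is a toy sketch, not Neo4j code: the `Node` class, the `upgrade` function, the `compat_mode` flag, and the version strings used with it are hypothetical, and the rule that all slaves must be upgraded before the master is an assumption of the sketch.

```python
class Node:
    def __init__(self, name, version, is_master=False):
        self.name = name
        self.version = version
        self.is_master = is_master
        self.compat_mode = False  # True: speaks the old protocol, cannot be master

    def can_become_master(self):
        return not self.compat_mode


def upgrade(cluster, name, new_version):
    """Upgrade one node of `cluster` (a dict of name -> Node) to `new_version`."""
    node = cluster[name]
    master = next(n for n in cluster.values() if n.is_master)
    if node.is_master:
        # Assumption of this sketch: slaves are upgraded before the master.
        if any(n.version != new_version for n in cluster.values() if not n.is_master):
            raise RuntimeError("upgrade the slaves before the master")
        node.version = new_version
        # Master upgraded: the upgrade is considered complete, so every
        # node leaves compatibility mode and uses the new protocol.
        for n in cluster.values():
            n.compat_mode = False
    else:
        node.version = new_version
        # The restarted slave detects an older cluster and falls back to
        # compatibility mode; it keeps running but cannot become master.
        node.compat_mode = master.version != new_version
```

Upgrading the slaves one by one leaves each in compatibility mode; upgrading the master last flips the whole cluster to the new protocol in one step.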
