Graph database: All content tagged as Graph database in NoSQL databases and polyglot persistence
Monday, 19 March 2012
NoSQL Hosting Services
Michael Hausenblas put together a list of hosted NoSQL solutions including Amazon DynamoDB and SimpleDB, Google App Engine, Riak, Cassandra, CouchDB, MongoDB, Neo4j, and OrientDB. If you go through my posts on NoSQL hosting, you’ll find a couple more.
Original title and link: NoSQL Hosting Services (©myNoSQL)
via: http://webofdata.wordpress.com/2012/03/18/hosted-nosql/
How Can Graphs Apply to IT Operations
Be it IT, devops, or no-ops, operations are a critical part of every fair-sized system with real SLAs, whether expressed or not. John E. Vincent’s post is an interesting look at what he feels is missing to make system operations more manageable:
What I feel like we’re missing is a way to express those relationships and then trigger on them all the way up and down the chain as needed. We’re starting to get into graph territory here.
We must be able to express and act on changes at the micro level (I changed a config, I must restart nginx) and even at the intranode level (something changed in my app tier, need to tell my load balancer) but now we need a way to handle it at that macro level. Not only do we need a way to handle it but we must also be able to calculate what is impacted by that change.
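The "calculate what is impacted by that change" part can be sketched as a breadth-first walk over a dependency graph. This is only an illustration: the component names and the `DEPENDENTS` map are made up for the example, and a real system would pull these relationships from its own inventory.

```python
from collections import deque

# Hypothetical dependency data: each key maps to the components that depend
# on it, so a change to the key may need to be propagated to those components.
DEPENDENTS = {
    "nginx.conf": ["nginx"],
    "nginx": ["load_balancer"],
    "app_tier": ["load_balancer"],
    "load_balancer": ["public_site"],
}


def impacted_by(changed, dependents):
    """Return every component reachable from `changed`, in breadth-first order."""
    seen = {changed}
    order = []
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for nxt in dependents.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order
```

Calling `impacted_by("nginx.conf", DEPENDENTS)` walks from the config file to nginx, then to the load balancer, then to the site behind it, which is the micro-to-macro chain the quote describes.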
via: http://blog.lusis.org/blog/2012/03/06/graphs-in-operations/
Friday, 16 March 2012
Graph Databases Updates: DEX Graph Database 4.5 and Neo4j 1.7 Milestone 1
Two new releases in the graph databases space:
DEX Graph Database 4.5
The new DEX Graph Database release comes with pre-packaged graph algorithms—breadth and depth first traversal, shortest path, Gabow connectivity—available for Java, .NET, and C++. You can get the new version from here.
Neo4j 1.7 Milestone 1
According to the Neo4j 1.7 Milestone 1 announcement, this version features:
- improved Cypher
- SSL support
- improved Neo4j documentation
- high availability improvements (nb: there are recommended maintenance releases for Neo4j 1.5 and 1.6)
- upgraded Blueprints and Gremlin support
You can get Neo4j 1.7 from here.
Thursday, 15 March 2012
Using Graph Theory to Predict Basketball Teams Rankings
A directed network is simply a connection of nodes (representing teams) and arrows connecting teams called directed edges. Every time a team defeated another, an arrow was drawn from the losing team’s node to the winning team’s node to represent this game.
Basketball and beers.
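A minimal sketch of ranking teams on such a loser-to-winner network follows. It uses a PageRank-style power iteration, which is an assumption chosen for illustration and not necessarily the exact method of the linked post. The team names, the damping factor, and the iteration count are all hypothetical.

```python
def rank_teams(games, damping=0.85, iterations=100):
    """Score teams from a list of (loser, winner) pairs.

    Each game is a directed edge from the losing team to the winning team,
    so score flows towards teams that beat strong opponents.
    """
    teams = sorted({team for game in games for team in game})
    out_edges = {team: [] for team in teams}
    for loser, winner in games:
        out_edges[loser].append(winner)

    n = len(teams)
    score = {team: 1.0 / n for team in teams}
    for _ in range(iterations):
        nxt = {team: (1.0 - damping) / n for team in teams}
        for team in teams:
            # An undefeated team has no outgoing edges; spread its score evenly.
            targets = out_edges[team] or teams
            share = damping * score[team] / len(targets)
            for target in targets:
                nxt[target] += share
        score = nxt
    return score


# Hypothetical results: A beat B and C, B beat C.
scores = rank_teams([("B", "A"), ("C", "A"), ("C", "B")])
ranking = sorted(scores, key=scores.get, reverse=True)
```

With these three made-up games the undefeated team A ends up first and the winless team C last.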
via: http://blog.biophysengr.net/2012/03/eigenbracket-2012-using-graph-theory-to.html
Wednesday, 14 March 2012
NoSQL Paper: The Trinity Graph Engine
Even though my first post about Trinity, the Microsoft Research graph database, dates back to March last year, I haven’t heard much about it since. Based on my tip, Klint Finley published an interesting speculation about Trinity, Dryad, Probase, and Bing. Since then, though, Microsoft has moved away from Dryad in favor of Hadoop, and I’m still not sure about the status of the Trinity project. But I have found a paper about the Trinity graph engine authored by Bin Shao, Haixun Wang, and Yatao Li. You can read it or download it after the break.
We introduce Trinity, a memory-based distributed database and computation platform that supports online query processing and offline analytics on graphs. Trinity leverages graph access patterns in online and offline computation to optimize the use of main memory and communication in order to deliver the best performance. With Trinity, we can perform efficient graph analytics on web-scale, billion-node graphs using dozens of commodity machines, while existing platforms such as MapReduce and Pregel require hundreds of machines. In this paper, we analyze several typical and important graph applications, including search in a social network, calculating PageRank on a web graph, and sub-graph matching on web-scale graphs without using index, to demonstrate the strength of Trinity.
Friday, 9 March 2012
Neo4j and D3.js: Visualizing Connections Over Time
Another great graph data visualization using Neo4j and D3.js from Max De Marzi:

- Max De Marzi has lately been my favorite source for graph data visualization posts
- Although the diagram looks amazing, I wonder whether it would scale to larger data sets
- Even though I gave it some thought, I’m still not sure how graph databases can record historical relationships or the evolution of relationships in a graph. If you have any ideas, I’d love to hear them.
Monday, 5 March 2012
Insolvent Sones GraphDB Available for Sale
An article in a German publication mentions (according to Google Translate) that sones GraphDB is up for sale:
The administrator of sones GmbH Hartig, Dr. Oliver lawyer, said that the graph database of insolvent sones GmbH will be sold.
Anyone interested?
Monday, 27 February 2012
Beer Recommendations With Graph Databases
Josh Adell explains how to extend a simple recommendation engine to similarity-based collaborative filtering:
Instead of basing recommendations off of one similar rating, I can calculate how similarly you and I rated all the things we have rated, and only get recommendations from you if I have determined we are similar enough in our tastes.
This is much closer to how the recommendation engines developed by sites like Amazon or Netflix work.
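The idea in the quote can be sketched in a few lines of plain Python, without a graph database. Everything here is an assumption made for illustration: the ratings, the choice of cosine similarity over co-rated items, and the `min_similarity` and `min_rating` thresholds. The linked post may use a different similarity measure.

```python
from math import sqrt

# Hypothetical ratings: user -> {beer: rating on a 1-5 scale}
RATINGS = {
    "me": {"stout": 5, "ipa": 4, "lager": 1},
    "alice": {"stout": 5, "ipa": 5, "lager": 1, "porter": 5},
    "bob": {"stout": 1, "ipa": 2, "lager": 5, "pilsner": 5},
}


def similarity(a, b):
    """Cosine similarity over the items both users rated (0.0 if none)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[item] * b[item] for item in shared)
    norm_a = sqrt(sum(a[item] ** 2 for item in shared))
    norm_b = sqrt(sum(b[item] ** 2 for item in shared))
    return dot / (norm_a * norm_b)


def recommend(user, ratings, min_similarity=0.9, min_rating=4):
    """Suggest unrated items that sufficiently similar users rated highly."""
    mine = ratings[user]
    picks = set()
    for other, theirs in ratings.items():
        if other == user or similarity(mine, theirs) < min_similarity:
            continue  # ignore users whose tastes differ too much
        picks.update(
            item
            for item, rating in theirs.items()
            if rating >= min_rating and item not in mine
        )
    return sorted(picks)
```

In this made-up data set, `recommend("me", RATINGS)` only takes suggestions from alice, whose ratings closely match mine, and ignores bob, who likes the opposite beers.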
via: http://blog.everymansoftware.com/2012/02/similarity-based-recommendation-engines.html
Thursday, 23 February 2012
Gremlin vs Cypher
Romiko Derbynew compares Gremlin and Neo4j Cypher:
- Simple graph traversals are much more efficient when using Gremlin
- Queries in Gremlin are 30-50% faster for simple traversals
- Cypher is ideal for complex traversals where back tracking is required
- Cypher is our choice of query language for reporting
- Gremlin is our choice of query language for simple traversals where projections are not required
- Cypher has an intrinsic table projection model, whereas Gremlin’s table projection model relies on AS steps, which can be cumbersome when backtracking (e.g. Back(), As() and _CopySplit); in Cypher it is just comma-separated matches
- Cypher is much better suited for outer joins than Gremlin: achieving similar results in Gremlin requires parallel querying with CopySplit, whereas Cypher uses the Match clause with optional relationships
- Gremlin is ideal when you need to retrieve very simple data structures
- Table projection in Gremlin can be very powerful; however, outer joins can be very verbose
So in a nutshell, we like to use Cypher when we need tabular data back from Neo4j, and it is especially useful for outer joins.
via: http://romikoderbynew.com/2012/02/22/gremlin-vs-cypher-initial-thoughts-neo4j/
Tuesday, 21 February 2012
InfiniteGraph 2.1 Features Gremlin Support and a Plugin Framework
A new version of InfiniteGraph, the graph database from Objectivity, was announced today. This release features:
- a plugin framework: Two kinds of plugins are supported. A navigator plugin bundles components that assist in navigation queries, such as result qualifiers, path qualifiers, and guides. The Formatter plugin formats and outputs results of graph queries.
- enhanced IG Visualizer: The advanced Visualizer is now tightly integrated with InfiniteGraph’s Plugin Framework, allowing indexing queries for edges; the Formatter plugin framework exports GraphML and JSON (built-in) or other user-defined plugin formats.
- support for Tinkerpop Blueprints and Gremlin: InfiniteGraph provides a clean integration with Blueprints that is well suited for applications that want to traverse and query graph databases using Gremlin
A few more details can be found in the InfiniteGraph 2.1 release notes.
Monday, 13 February 2012
What types of applications might a graph database be well suited for?
I found this list of use cases for graph databases in a follow-up to a Neo4j webinar:
- Social networks
- Collaboration programs
- Configuration Management
- Geo-Spatial applications
- Impact Analysis
- Master Data Management
- Network Management
- Product Line Management
- Recommendation Engines
The more generic answer would be that graph databases can be a great fit for problems handling highly connected data.
The examples above clearly involve highly connected data, but as of now I’m not aware of any social networks, network management systems, or large-scale recommendation engines built on top of one of the existing graph databases.
Monday, 6 February 2012
Calculating a Graph's Degree Distribution Using R MapReduce over Hadoop
Marko Rodriguez is experimenting with R on Hadoop and one of his exercises is calculating a graph’s degree distribution. I confess I had to use Wikipedia to remind myself of the definition of a node’s degree:
- The degree of a node in a network (sometimes referred to incorrectly as the connectivity) is the number of connections or edges the node has to other nodes. The degree distribution P(k) of a network is then defined to be the fraction of nodes in the network with degree k.
- The degree distribution is very important in studying both real networks, such as the Internet and social networks, and theoretical networks.
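The definition above translates into a short single-machine sketch (no R, MapReduce, or Hadoop involved). The function name and the sample edge list are made up for the example, and the graph is assumed to be undirected and given as a list of edges, so isolated nodes with degree 0 are not counted.

```python
from collections import Counter


def degree_distribution(edges):
    """Return {k: P(k)} for an undirected graph given as (u, v) edge pairs.

    P(k) is the fraction of nodes whose degree is exactly k.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    node_count = len(degree)
    nodes_per_degree = Counter(degree.values())
    return {k: count / node_count for k, count in sorted(nodes_per_degree.items())}


# Hypothetical graph: a has degree 3, b and c have degree 2, d has degree 1.
distribution = degree_distribution([("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")])
```

For this four-node graph, a quarter of the nodes have degree 1, half have degree 2, and a quarter have degree 3.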
As a thought exercise, imagine a graph database that actively maintains an internal degree distribution and uses it to suggest how to partition the graph, or to partition it automatically. Would that work?
via: http://groups.google.com/group/gremlin-users/browse_thread/thread/db50a72f92a26e06