


Graph database: All content tagged as Graph database in NoSQL databases and polyglot persistence

Cayley: an open-source graph database

From the GitHub repo:

Cayley is an open-source graph inspired by the graph database behind Freebase and Google’s Knowledge Graph.

  • Written in Go
  • Easy to get running (3 or 4 commands, below)
  • RESTful API
      • or a REPL if you prefer
  • Built-in query editor and visualizer
  • Multiple query languages:
      • JavaScript, with a Gremlin-inspired* graph object.
      • (simplified) MQL, for Freebase fans
  • Plays well with multiple backend stores:
      • LevelDB for single-machine storage
      • MongoDB
      • In-memory, ephemeral
  • Modular design; easy to extend with new languages and backends
  • Good test coverage
  • Speed, where possible.
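
To give a feel for what a Gremlin-inspired graph object looks like, here is a minimal Python sketch of an in-memory triple store with chainable traversal steps. It is not Cayley's API or implementation: the `TripleStore` and `Path` classes, their method names, and the sample data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class TripleStore:
    """Hypothetical in-memory store of (subject, predicate, object) triples."""

    def __init__(self):
        # Index: subject -> predicate -> set of objects
        self._out = defaultdict(lambda: defaultdict(set))

    def add(self, subject, predicate, obj):
        self._out[subject][predicate].add(obj)

    def v(self, *nodes):
        """Start a traversal at the given nodes."""
        return Path(self, set(nodes))


class Path:
    """A set of current nodes plus chainable traversal steps."""

    def __init__(self, store, nodes):
        self._store = store
        self._nodes = nodes

    def out(self, predicate):
        """Follow outgoing edges labelled `predicate` from every current node."""
        reached = set()
        for node in self._nodes:
            reached |= self._store._out.get(node, {}).get(predicate, set())
        return Path(self._store, reached)

    def all(self):
        """Return the current nodes as a sorted list."""
        return sorted(self._nodes)


# Made-up sample data: who follows whom.
store = TripleStore()
store.add("alice", "follows", "bob")
store.add("bob", "follows", "carol")
store.add("bob", "follows", "dave")

# Two hops along "follows": the people followed by those alice follows.
friends_of_friends = store.v("alice").out("follows").out("follows").all()
```

Each step returns a new `Path`, which is what makes the calls chainable. A real graph store would also index incoming edges and persist the triples to one of its backends.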

✚ What’s interesting is that even though the project lives under Google’s GitHub account, it is not backed by Google.

✚ The Hacker News thread focuses on the existing graph database market.

Original title and link: Cayley: an open-source graph database (NoSQL database©myNoSQL)

Neo4j unit testing with GraphUnit

Testing the state of an Embedded Neo4j database is now much easier if you use GraphUnit, a component of the GraphAware Neo4j Framework.

Interesting approach. The only downside I can see at first glance is that it might become a maintenance nightmare if your model evolves and the data changes.



Storing, processing, and computing with graphs

Marko Rodriguez is on a roll with yet another fantastic article about graphs:

To the adept, graph computing is not only a set of technologies, but a way of thinking about the world in terms of graphs and the processes therein in terms of traversals. As data is becoming more accessible, it is easier to build richer models of the environment. What is becoming more difficult is storing that data in a form that can be conveniently and efficiently processed by different computing systems. There are many situations in which graphs are a natural foundation for modeling. When a model is a graph, then the numerous graph computing technologies can be applied to it.

✚ If you missed it, the other recent article I’m referring to is “Knowledge representation and reasoning with graph databases”.



A Story of graphs, DBs, and graph databases

After Marko Rodriguez’s Knowledge representation and reasoning with graph databases, another great introductory resource on graph databases is Joshua Shinavier’s presentation:

Knowledge representation and reasoning with graph databases

A graph database and its ecosystem of technologies can yield elegant, efficient solutions to problems in knowledge representation and reasoning. To get a taste of this argument, we must first understand what a graph is.

And Marko Rodriguez delivers a dense but very readable intro to modeling with graphs.



Paper: Parallel Graph Partitioning for Complex Networks

Authored by a team from Karlsruhe Institute of Technology, the paper “Parallel graph partitioning for complex networks” presents a parallelized and adapted label propagation technique for partitioning graphs:

The graph partitioning problem is NP-complete [3], [4] and there is no approximation algorithm with a constant ratio factor for general graphs [5]. Hence, heuristic algorithms are used in practice.

A successful heuristic for partitioning large graphs is the multilevel graph partitioning (MGP) approach depicted in Figure 1, where the graph is recursively contracted to achieve smaller graphs which should reflect the same basic structure as the input graph.
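
As a rough illustration of the label propagation idea, here is a simplified, sequential Python sketch with a cluster size bound. It is not the paper's parallel algorithm and it leaves out the contraction and refinement steps of the multilevel scheme. The function name, the `max_cluster_size` and `rounds` parameters, the tie-breaking rule, and the sample graph are all assumptions made for this example.

```python
from collections import Counter


def label_propagation(adjacency, max_cluster_size, rounds=5):
    """Size-constrained label propagation (simplified, sequential sketch).

    adjacency: dict mapping each node to a list of neighbour nodes.
    Returns a dict mapping each node to a cluster label.
    """
    labels = {node: node for node in adjacency}  # every node starts alone
    sizes = Counter(labels.values())             # cluster label -> size

    for _ in range(rounds):
        changed = False
        for node in sorted(adjacency):           # fixed order keeps runs deterministic
            current = labels[node]
            counts = Counter(labels[nbr] for nbr in adjacency[node])
            # Only consider the node's own cluster or clusters with room left.
            candidates = [
                (count, label) for label, count in counts.items()
                if label == current or sizes[label] < max_cluster_size
            ]
            if not candidates:
                continue
            # Strongest connection wins; ties go to the smallest label.
            best_count, best = min(candidates, key=lambda c: (-c[0], c[1]))
            if best != current and best_count > counts.get(current, 0):
                sizes[current] -= 1
                sizes[best] += 1
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


# Made-up sample graph: two triangles joined by the single edge 2-3.
graph = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
clusters = label_propagation(graph, max_cluster_size=3)
```

On this sample graph each triangle ends up in its own cluster. In a multilevel scheme, clusters like these would then be contracted into single nodes to produce the smaller graph for the next level.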

Amazon Web Services Global Infrastructure Graph

Super-smart and impressive application of a graph database to a real domain:

Wouldn’t it be nice if you could slice and dice through the entire AWS domain of services, data centres and prices all in one spot to optimise your AWS bill? Enter the AWS Global Infrastructure Graph!



Powered by Neo4j: CrunchBase's Business Graph

A new highly visible project in Neo4j’s portfolio:

Our vision for CrunchBase is to create that Business Graph. We are not there yet, but we have the right ingredients in terms of technology, data and momentum to make it a reality. For starters, CrunchBase 2.0 is built on database specially designed for graph-based applications. We have an existing dataset of 530,000 people and companies, which in turn brings CrunchBase more than 2 million visitors and 10,000 individual data contributors each month.

While I’m sure you could generate a page like this from any database, I can also see why the data model of a graph database is appealing for this sort of data.



Getting started with Neo4j 2.0

Very good introductory post by Jim Webber about Neo4j and some of the new features in the 2.0 release:

In this article we’ve seen how Neo4j 2.0 and the new version of the Cypher query language can be used to store and query a range of retail data from product catalogue to customer purchases. We also saw how straightforward it was to quickly gain insight from that data, despite the domain being highly and intricately connected.



Neo4j trick: using remote shell combined with Neo4j embedded

Stefan Armbruster:

In cases where Neo4j is used in embedded mode, there is often a demand for having a maintenance channel to the database, e.g. for fixing wrong data. Nothing simpler than that, there’s an easy way to enable the remote shell together with embedded mode

Really nice trick!



MySQL is a great Open Source project. How about open source NoSQL databases?

In a post titled Some myths on Open Source, the way I see it, Anders Karlsson writes about MySQL:

As far as code, adoption and reaching out to create an SQL-based RDBMS that anyone can afford, MySQL / MariaDB has been immensely successful. But as an Open Source project, something being developed together with the community where everyone work on their end with their skills to create a great combined piece of work, MySQL has failed. This is sad, but on the other hand I’m not so sure that it would have as much influence and as wide adoption if the project would have been a “clean” Open Source project.

The article offers a very black-and-white perspective on open source versus commercial code. But that’s not why I’m linking to it.

The above paragraph made me think about how many of the most popular open source NoSQL databases would die without the companies (or people) that created them.

Here’s my list: MongoDB, Riak, Neo4j, Redis, Couchbase. And I could continue for quite a while, considering how many there are out there: RavenDB, RethinkDB, Voldemort, Tokyo, Titan.

Actually, if you reverse the question, the list gets extremely short: Cassandra, CouchDB (still struggling though), HBase. All of these were at some point driven by their communities. Probably the only special case is LevelDB.

✚ As a follow-up to Anders Karlsson’s post, Robert Hodges posted The Scale-Out Blog: Why I Love Open Source.



Neo4j 2.0 released - A graph browser and query language improvements

I might be wrong, but the Neo4j guys seem to be going back to making a big announcement in December. And it is a big one, as the version number says: Neo4j got a new data browser, and Cypher, Neo4j’s graph query language, got a round of improvements.

The official announcement contains more details about what’s new in Neo4j. There’s also an interview with Michael Hunger on InfoQ about the new version.

Finally, there’s also a slide deck about the changes and improvements in Cypher: