neo4j: All content on NoSQL databases and projects about neo4j, featuring the best daily NoSQL articles, news, and links on neo4j

Exploring Neo4j, the NoSQL Graph Database

by Alex Popescu

Rahul Sharma takes a look at Neo4j and some basic operations with graph databases:

Let us say we want to implement a use-case where there are persons and a person can be connected to other persons. In order to use Neo4J we must think about POJOs in terms of interfaces and corresponding implementations. This is because the database is a key-value store at the back, so it asks us to store the properties of the POJO as key-value pairs. Moreover, there are no foreign keys in Neo4J; objects in the db are connected with other objects using Relationships.
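To make the person-to-person model concrete, here is a minimal sketch in plain Python rather than against the Neo4j Java API. The `Node`, `Relationship`, `connect`, and `neighbours` names are hypothetical and chosen only for illustration; the sketch shows persons stored as key-value properties and linked by relationships instead of foreign keys.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """A graph node whose state is just key-value properties."""
    properties: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)


@dataclass(eq=False)
class Relationship:
    """A typed, directed link between two nodes; it can carry properties too."""
    start: Node
    end: Node
    rel_type: str
    properties: dict = field(default_factory=dict)


def connect(start, end, rel_type, **properties):
    """Link two nodes and register the relationship on both ends."""
    rel = Relationship(start, end, rel_type, properties)
    start.relationships.append(rel)
    end.relationships.append(rel)
    return rel


def neighbours(node, rel_type):
    """Nodes reachable from `node` over outgoing relationships of `rel_type`."""
    return [r.end for r in node.relationships
            if r.rel_type == rel_type and r.start is node]


# Two persons (illustrative data) connected by a KNOWS relationship.
alice = Node({"name": "Alice"})
bob = Node({"name": "Bob"})
connect(alice, bob, "KNOWS", since=2010)
print([n.properties["name"] for n in neighbours(alice, "KNOWS")])
```

The relationship is a first-class object with its own properties, which is the part a foreign key in a relational schema cannot express directly.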

Interestingly, he mentions getting some errors when trying to push 151K names. Sounds like he could use this Neo4j tip for handling long transactions.

Original title and link for this post: Exploring Neo4j, the NoSQL Graph Database (published on the NoSQL blog: myNoSQL)


Neo4j: Advanced Indexes Using Multiple Keys

by Alex Popescu

There’s a prototype implementation of a new index which solves this (and some other issues as well, for example indexing for relationships). The code is at https://svn.neo4j.org/laboratory/components/lucene-index/ and it’s built and deployed over at http://m2.neo4j.org/org/neo4j/neo4j-lucene-index/

The new index isn’t compatible with the old one so you’ll have to index your data with the new index framework to be able to use it.

Previously, you could search by only a single property.

Original title and link for this post: Neo4j: Advanced Indexes Using Multiple Keys (published on the NoSQL blog: myNoSQL)


Django and NoSQL Databases Revisited

by Alex Popescu

Django decided a long time ago that Ruby on Rails cannot be the only framework where people can have fun integrating with all NoSQL databases. During this year’s DjangoCon Europe there were several sessions dedicated to Django and NoSQL databases:

What NoSQL support in the Django ORM looks like, and how do we get there

Alex Gaynor speaks about what needs to change in the Django ORM to make it more NoSQL-friendly.

Reinout van Rees has a summary of the talk ☞ here.

Using MongoDB in your app

Peter Bengtsson talks about his experience of moving to MongoDB after using ZODB for the last 10 years.

Some notes from the talk are available ☞ here.

Relax your project with CouchDB

Benoît Chesneau talks about what makes CouchDB appealing to Python developers. He also covers the CouchDBkit Python framework.

Django and Neo4j: Domain Modeling that Kicks Ass

Not coming from DjangoCon, but still about Django and Neo4j, is Tobias Ivarsson’s presentation: “Django and Neo4j - Domain modeling that kicks ass”:

Derek Stainer summarizes the slide deck ☞ here.

Django and NoSQL Panel

A fantastic panel on the future of Django and NoSQL databases that you can watch on ☞ blip.tv. Reinout van Rees published a transcript of the panel ☞ here.

All in all a lot of NoSQL excitement in the Django world! Or should it be the opposite?

Update: Here is the latest Django and NoSQL Databases status update

Django and NoSQL Databases Revisited originally posted on the NoSQL blog: myNoSQL


Gephi: Visualization Library for Graph Databases

by Alex Popescu

You probably know by now that I love visualization tools:

Get the version of the Gephi app that can read Neo4j databases: bzr branch http://bazaar.launchpad.net/~bujacik/gephi/support-for-neo4j

Transport Route Planner Using Neo4j

by Alex Popescu

TransportDublin.ie:

It combines Neo4j, Google Maps API v3, and Spring 3.0 MVC-AJAX, with jQuery and JavaScript-parsed JSON for the presentation layer.


Neo4j Tips & Tricks: Handling Long Transactions

by Alex Popescu

An answer to the question: is write performance influenced by the size of transactions? (nb the “popular” question, though, is: why does my write performance drop off when performing many operations in a single transaction?):

The reason is that Neo4j keeps the transaction’s operations in memory until commit, so your JVM will eventually run out of memory and start paging to disk.

There are two solutions:

  1. split your transactions into groups of 30,000 or so (obviously you give up the ability to do a full rollback)
  2. skip the transaction part and use the BatchInserter, which writes directly to the persistence layer rather than keeping everything in memory.
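The first option can be sketched generically. This is a rough illustration in plain Python, not Neo4j code: `chunked` and `insert_in_batches` are hypothetical helpers, and `write_batch` stands in for whatever would open a transaction, write the items, and commit.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def insert_in_batches(items, write_batch, batch_size=30_000):
    """Commit `items` in separate batches and return the number of commits.

    `write_batch` is a hypothetical callback that would open a transaction,
    write the given items, and commit. A failure only rolls back the
    current batch, not the batches committed before it.
    """
    commits = 0
    for batch in chunked(items, batch_size):
        write_batch(batch)
        commits += 1
    return commits


# Stand-in for the database: each "commit" just records the batch.
committed = []
names = (f"name-{i}" for i in range(151_000))
print(insert_in_batches(names, committed.append))  # 6
```

With 151,000 items and a batch size of 30,000 this yields five full batches and one of 1,000, so only one batch of pending operations is held in memory at a time.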

Morpheus: A Web Admin for Neo4j

by Alex Popescu

Always good to see this sort of tool coming out:

Morpheus: This is a stand-alone, full feature Neo4j distribution that exposes a Neo4j database over REST together with a web-based administrative interface for said database.

Before starting to use it, though, make sure you check the license.[1]


  1. Neo4j is using an AGPL license and tools connecting directly to it might need to be distributed using the same license.

Video: Emil Eifrem about NoSQL and the Benefits of Graph Databases

by Alex Popescu

InfoQ[1] style!


  1. The presentation is great, but as a disclaimer please keep in mind I’m the co-founder of InfoQ.com. I also have a hint: InfoQ just added the possibility to watch presentations in both vertical and horizontal mode. Hope you’ll like it!

Neo4j and PHP and Probably More

by Alex Popescu

Protocols are extremely important and Neo4j has been opening up to a whole new world with the addition of its REST API. Now people using any programming language can try out this graph database.

Rob Olmost shows an example of using the Neo4j REST API[1] from PHP:

I was trying out Neo4j due to my curiosity of the graph specialization. Although Neo4j is not designed to run stand-alone like a database server there is a sub-project that adds a REST API to allow non-Java applications to make use of Neo4j. Neo4j is pretty simple, you basically have nodes, relationships, and properties on both. That’s about it.

The first time I heard of Neo4j, my first question was: how will I be able to define custom traversals? One part of the Neo4j REST API that I wasn’t aware of is the embedded JavaScript engine (they chose Rhino for its easy integration with Java), which allows users to define node traversals using JavaScript. Problem solved!

As a side note (and I haven’t checked the code), I think that integrating any language that runs on top of the Java VM would be possible, so imagine having your traversals in your preferred language like Groovy, Python, Ruby, or even Clojure. Pretty cool, isn’t it?


  1. You can read more about using the Neo4j REST API ☞ here.

Recommendation Engines with Neo4j

by Alex Popescu

Even if a bit hidden behind the Rails code, the slides embedded below and these ☞ code snippets use RESTful access to Neo4j to build a recommendation engine in a network of Star Wars heroes:

You Might Also Like: Implementing User Recommendations in Rails

Probably obvious, but graph databases are definitely a good tool for building recommendation engines.
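As a rough illustration of why graphs suit this problem, here is a minimal friends-of-friends sketch in plain Python. It is not the approach from the slides or the Neo4j REST API: the `recommend` function, the adjacency-set representation, and the sample network are all assumptions made for the example.

```python
from collections import Counter


def recommend(graph, person, limit=3):
    """Rank people `person` does not know yet by number of mutual friends."""
    friends = graph.get(person, set())
    scores = Counter()
    for friend in friends:
        for candidate in graph.get(friend, set()):
            if candidate != person and candidate not in friends:
                scores[candidate] += 1
    # Sort by score (descending), then by name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:limit]]


# Illustrative "knows" network; the data is made up for this example.
knows = {
    "Luke": {"Leia", "Han"},
    "Leia": {"Luke", "Han", "Chewbacca"},
    "Han": {"Luke", "Leia", "Chewbacca", "Lando"},
    "Chewbacca": {"Leia", "Han"},
    "Lando": {"Han"},
}
print(recommend(knows, "Luke"))  # ['Chewbacca', 'Lando']
```

The recommendation is just a two-step traversal from the starting node, which is the kind of query a graph database answers without joins.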


NoSQL Graph Database Matrix

by Alex Popescu

After triggering our quick review of graph databases, Pere Urbón came up with a nice comparison of these — Neo4j, HyperGraphDB, DEX, InfoGrid, Sones, VertexDB — in terms of License, Schema, Querying, Storage implementation, Utilities, Language and Operating system support.

Pere has made this very interesting NoSQL graph database matrix available as a ☞ PDF on his blog.


An Interesting Problem: Scaling Graph Databases

by Alex Popescu

One of the problems mentioned when discussing relational database scalability is that storage-enforced relationships, ACID, and scale do not play well together. In the NoSQL space there is a category of storage solutions that uses highly interconnected data: graph databases. (Note that some of these graph databases are also transactional.)

Lately there have been quite a few interesting discussions related to scaling graph databases. Alex Averbuch is working on a thesis on sharding Neo4j and his recent ☞ post presents some of the possible solutions. Alex’s article is a very good starting point for anyone interested in scaling graph databases.

Then there is also this ☞ article on InfoGrid’s blog that presents a different, web-like solution based on a custom protocol: ☞ XPRISO: eXtensible Protocol for the Replication, Integration and Synchronization of distributed Objects. While I haven’t had the chance to dig deeper into InfoGrid’s suggested approach, one thing caught my attention right away: while the association with web scale is definitely an interesting idea, requiring specific knowledge of the nodes’ location and a custom API to reach them doesn’t seem to be the best solution. The web addressed this by having URIs for each reachable resource; InfoGrid could try a similar idea and get rid of the separate APIs for accessing local vs. remote nodes.

Update: make sure you check the comment thread for more details about InfoGrid’s perspective on scaling graph databases.

Oren Eini concludes in ☞ his post:

After spending some time thinking about it, I came to the conclusion that I can’t envision any general way to solve the problem. Oh, I can think of several ways to reduce the problem:

  • Batching cross machine queries so we only perform them at the close of each breadth first step.
  • Storing multiple levels of associations (So “users/ayende” would store its relations but also “users/ayende”’s relation and “users/arik”’s relations).
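The first of these ideas can be sketched as a breadth-first search that groups each level’s frontier by shard before querying. This is a simplified, single-process illustration in Python, not Oren Eini’s implementation: `shard_of`, `batched_bfs`, the dict-per-shard layout, and the `users/example` node are assumptions, and each dict merely stands in for a remote machine.

```python
from collections import defaultdict


def shard_of(node, shard_count):
    """Hypothetical placement rule: a stable hash of the node id."""
    return sum(map(ord, node)) % shard_count


def batched_bfs(shards, start, max_depth):
    """Breadth-first search that sends one request per shard per level.

    `shards` is a list of dicts (node -> list of neighbours), each standing
    in for a remote machine. Returns (visited nodes, number of requests).
    """
    visited = {start}
    frontier = [start]
    requests = 0
    for _ in range(max_depth):
        # Group the whole frontier by owning shard before querying anything.
        by_shard = defaultdict(list)
        for node in frontier:
            by_shard[shard_of(node, len(shards))].append(node)
        next_frontier = []
        for shard_id, nodes in by_shard.items():
            requests += 1  # one batched round trip instead of len(nodes)
            for node in nodes:
                for neighbour in shards[shard_id].get(node, []):
                    if neighbour not in visited:
                        visited.add(neighbour)
                        next_frontier.append(neighbour)
        frontier = next_frontier
        if not frontier:
            break
    return visited, requests


# Illustrative data spread over two in-memory "machines".
edges = {
    "users/ayende": ["users/arik", "users/example"],
    "users/arik": ["users/example"],
    "users/example": [],
}
shards = [{}, {}]
for node, neighbours in edges.items():
    shards[shard_of(node, len(shards))][node] = neighbours

visited, requests = batched_bfs(shards, "users/ayende", max_depth=2)
print(sorted(visited), requests)
```

The number of round trips is bounded by the number of shards times the depth, rather than by the number of nodes visited.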

While I haven’t had enough time to think about this topic, my gut feeling is that possible solutions are to be found in a combination of unique identifiers for distributed nodes and a mapreduce-like approach. I cannot help wondering whether this is what Google’s ☞ Pregel is doing (nb I should have read the ☞ paper (pdf) first).