neo4j: All content on NoSQL databases and projects about neo4j, featuring the best daily NoSQL articles, news, and links on neo4j
Wednesday, 1 September 2010
Exploring Neo4j, the NoSQL Graph Database ☞
Rahul Sharma takes a look at Neo4j and some basic operations with graph databases:
Let us say we want to implement a use-case where there are persons and a person can be connected to other persons. In order to use Neo4J we must think about POJOs in terms of interfaces and corresponding implementations. This is so because the database is a key-value store at the back, so it asks us to store the properties of the POJO in terms of key-value pairs. Moreover, there are no foreign keys in Neo4J; objects in the db are connected with other objects using Relationships.
Interestingly, he mentions getting some errors when trying to push 151K names. Sounds like he could use this Neo4j tip for handling long transactions.
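The model described in the quote (persons stored as key-value properties and connected through relationships instead of foreign keys) can be sketched in a few lines. This is only an in-memory illustration, not the Neo4j API: the `Node`, `Relationship`, `connect`, and `neighbours` names are hypothetical and chosen for this example.

```python
class Node:
    """Hypothetical stand-in for a graph node: just a key-value property map."""

    def __init__(self, **properties):
        self.properties = dict(properties)
        self.relationships = []


class Relationship:
    """Hypothetical typed edge between two nodes; it can carry properties too."""

    def __init__(self, start, end, rel_type, **properties):
        self.start = start
        self.end = end
        self.rel_type = rel_type
        self.properties = dict(properties)


def connect(start, end, rel_type, **properties):
    """Link two nodes; no foreign keys, the relationship itself is the link."""
    rel = Relationship(start, end, rel_type, **properties)
    start.relationships.append(rel)
    end.relationships.append(rel)
    return rel


def neighbours(node, rel_type):
    """Nodes one hop away over relationships of the given type, either direction."""
    return [
        rel.end if rel.start is node else rel.start
        for rel in node.relationships
        if rel.rel_type == rel_type
    ]
```

For example, after `connect(Node(name="Alice"), Node(name="Bob"), "KNOWS")`, asking for the `"KNOWS"` neighbours of either node returns the other one.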
Original title and link for this post: Exploring Neo4j, the NoSQL Graph Database (published on the NoSQL blog: myNoSQL)
Wednesday, 25 August 2010
Neo4j: Advanced Indexes Using Multiple Keys ☞
There’s a prototype implementation of a new index which solves this (and some other issues as well, f.ex. indexing for relationships). The code is at https://svn.neo4j.org/laboratory/components/lucene-index/ and it’s built and deployed over at http://m2.neo4j.org/org/neo4j/neo4j-lucene-index/
The new index isn’t compatible with the old one so you’ll have to index your data with the new index framework to be able to use it.
Previously, you could only search by a single property.
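To make the difference concrete, here is a toy sketch of what searching by several keys at once means. It is not the Lucene-based prototype mentioned above and says nothing about its API; `MultiKeyIndex` and its methods are hypothetical names used only for illustration.

```python
from collections import defaultdict


class MultiKeyIndex:
    """Toy inverted index (hypothetical): (key, value) -> set of entity ids."""

    def __init__(self):
        self._entries = defaultdict(set)

    def add(self, entity_id, key, value):
        """Index one property of an entity."""
        self._entries[(key, value)].add(entity_id)

    def query(self, **criteria):
        """Return the ids that match ALL of the given key=value pairs."""
        if not criteria:
            return set()
        matches = [self._entries.get((k, v), set()) for k, v in criteria.items()]
        return set.intersection(*matches)
```

A single-key lookup is `index.query(name="Thomas")`; a multi-key lookup simply adds more criteria, for example `index.query(name="Thomas", title="Dr")`.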
Original title and link for this post: Neo4j: Advanced Indexes Using Multiple Keys (published on the NoSQL blog: myNoSQL)
Thursday, 5 August 2010
Django and NoSQL Databases Revisited
Django decided a long time ago that Ruby on Rails cannot be the only framework where people can have fun integrating with all NoSQL databases. During this year's DjangoCon Europe there were several sessions dedicated to Django and NoSQL databases:
- Alex Gaynor: What NoSQL support in the Django ORM looks like, and how do we get there
- Peter Bengtsson: Using MongoDB in your app
- Benoît Chesneau: Relax your project with CouchDB
- Tobias Ivarsson: Django and Neo4j: Domain Modeling that Kicks Ass
- Django and NoSQL Panel
What NoSQL support in the Django ORM looks like, and how do we get there
Alex Gaynor speaks about what needs to change in Django ORM to make it more NoSQL friendly:
Reinout van Rees has a summary of the talk ☞ here.
Using MongoDB in your app
Peter Bengtsson talks about his experience of moving to MongoDB after using ZODB for the last 10 years.
Some notes from the talk are available ☞ here.
Relax your project with CouchDB
Benoît Chesneau talks about what makes CouchDB appealing to Python developers. He also covers the CouchDBkit Python framework.
Django and Neo4j: Domain Modeling that Kicks Ass
Not coming from DjangoCon, but still about Django and Neo4j, is Tobias Ivarsson’s presentation: “Django and Neo4j - Domain modeling that kicks ass”:
Derek Stainer summarizes the slide deck ☞ here.
Django and NoSQL Panel
A fantastic panel on the future of Django and NoSQL databases that you can watch on ☞ blip.tv. Reinout van Rees published a transcript of the panel ☞ here.
All in all a lot of NoSQL excitement in the Django world! Or should it be the opposite?
Update: Here is the latest Django and NoSQL Databases status update
Django and NoSQL Databases Revisited originally posted on the NoSQL blog: myNoSQL
Tuesday, 3 August 2010
Gephi: Visualization Library for Graph Databases ☞
You probably know by now that I love visualization tools:
Get the version of the Gephi app that can read Neo4j databases: bzr branch http://bazaar.launchpad.net/~bujacik/gephi/support-for-neo4j
Wednesday, 28 July 2010
Transport Route Planner Using Neo4j ☞
TransportDublin.ie:
It combines Neo4j, Google Maps API v3, Spring 3.0 MVC-AJAX with jQuery, and JavaScript-parsed JSON for the presentation layer.
Friday, 23 July 2010
Neo4j Tips & Tricks: Handling Long Transactions ☞
An answer to the question: is write performance influenced by the size of transactions? (NB: the “popular” form of the question is: why does my write performance drop off when performing many operations in a single transaction?):
The reason is because Neo4j keeps the transaction’s operations in memory until commit, so your JVM will eventually run out of memory and start paging to disk.
There are two solutions:
- split your transactions into groups of 30,000 or so (obviously you give up the ability to do a full rollback)
- skip the transaction part and use the BatchInserter, which writes directly to the persistence layer rather than keeping everything in memory.
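The first option, splitting one huge transaction into groups of 30,000 or so operations, can be sketched as follows. This is only an outline of the batching logic: `begin`, `write`, and `commit` are hypothetical callbacks standing in for whatever transaction API is actually used, and nothing here is a Neo4j call.

```python
from itertools import islice


def in_batches(items, batch_size=30_000):
    """Yield successive lists of at most batch_size items from any iterable."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def load(items, begin, write, commit, batch_size=30_000):
    """Write items using one transaction per batch; returns the number of commits.

    begin, write and commit are hypothetical callbacks (assumptions of this
    sketch). A failure in a later batch cannot roll back earlier batches,
    which is the trade-off mentioned above.
    """
    commits = 0
    for batch in in_batches(items, batch_size):
        tx = begin()
        for item in batch:
            write(tx, item)
        commit(tx)
        commits += 1
    return commits
```

With 151,000 items and the default batch size this results in six commits: five full batches and a final one of 1,000 items.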
Friday, 16 July 2010
Morpheus: A Web Admin for Neo4j ☞
Always good to see these sorts of tools coming out:
Morpheus: This is a stand-alone, full feature Neo4j distribution that exposes a Neo4j database over REST together with a web-based administrative interface for said database.
Before starting to use it, make sure you check the license though[1].
- Neo4j is using an AGPL license and tools connecting directly to it might need to be distributed using the same license. (↩)
Wednesday, 14 July 2010
Tuesday, 13 July 2010
Neo4j and PHP and Probably More ☞
Protocols are extremely important and Neo4j has been opening up to a whole new world with its addition of the REST API. Now people using any programming language can try out this graph database.
Rob Olmost shows an example of using the Neo4j REST API[1] from PHP:
I was trying out Neo4j due to my curiosity of the graph specialization. Although Neo4j is not designed to run stand-alone like a database server there is a sub-project that adds a REST API to allow non-Java applications to make use of Neo4j. Neo4j is pretty simple, you basically have nodes, relationships, and properties on both. That’s about it.
The first time I heard of Neo4j, my first question was: how will I be able to define custom traversals? One part of the Neo4j REST API that I wasn’t aware of is the embedded JavaScript engine (they chose Rhino for its easy integration with Java), which allows users to define node traversals using JavaScript. Problem solved!
As a side note (and I haven’t checked the code), I think that integrating any language that runs on top of the Java VM would be possible, so imagine having your traversals in your preferred language like Groovy, Python, Ruby, or even Clojure. Pretty cool, isn’t it?
Friday, 11 June 2010
Recommendation Engines with Neo4j
Even if a bit hidden behind the Rails code, the slides embedded below and these ☞ code snippets use the RESTful access to Neo4j to build a recommendation engine in a network of Star Wars heroes:
Probably obvious, but graph databases are definitely a good tool for building recommendation engines.
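One reason graphs fit this problem is that a simple recommendation is just a two-hop traversal: look at the connections of your connections and rank the ones you are not linked to yet. The sketch below is a plain in-memory illustration, not the code from the slides or snippets; the `recommend` function and the sample connections between the heroes are made up for this example.

```python
from collections import Counter


def recommend(graph, person, limit=3):
    """Suggest nodes two hops away, ranked by number of shared connections.

    graph is a dict mapping a name to the set of names it is connected to
    (an assumption of this sketch, not a Neo4j data structure).
    """
    direct = graph.get(person, set())
    scores = Counter()
    for friend in direct:
        for candidate in graph.get(friend, set()):
            if candidate != person and candidate not in direct:
                scores[candidate] += 1
    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:limit]]


# Made-up sample data, for illustration only.
heroes = {
    "Luke": {"Leia", "Han"},
    "Leia": {"Luke", "Han", "Lando"},
    "Han": {"Luke", "Leia", "Chewbacca", "Lando"},
    "Chewbacca": {"Han"},
    "Lando": {"Leia", "Han"},
}
```

Here `recommend(heroes, "Luke")` ranks Lando ahead of Chewbacca, because Lando shares two of Luke's connections and Chewbacca only one.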
Friday, 21 May 2010
NoSQL Graph Database Matrix
After triggering our quick review of graph databases, Pere Urbón came up with a nice comparison of these — Neo4j, HyperGraphDB, DEX, InfoGrid, Sones, VertexDB — in terms of License, Schema, Querying, Storage implementation, Utilities, Language and Operating system support.
Pere has made this very interesting NoSQL graph database matrix available as a ☞ PDF on his blog.
Friday, 14 May 2010
An Interesting Problem: Scaling Graph Databases
One of the problems mentioned when discussing relational database scalability is that storage-enforced relationships, ACID, and scale do not play well together. In the NoSQL space there is a category of storage solutions built around highly interconnected data: graph databases. (Note that some of these graph databases are also transactional.)
Lately there have been quite a few interesting discussions related to scaling graph databases. Alex Averbuch is working on a thesis on sharding Neo4j and his recent ☞ post presents some of the possible solutions. Alex’s article is a very good starting point for anyone interested in scaling graph databases.
Then there is also this ☞ article on InfoGrid’s blog that presents a different, web-like solution based on a custom protocol: ☞ XPRISO: eXtensible Protocol for the Replication, Integration and Synchronization of distributed Objects. While I haven’t had the chance to dig deeper into InfoGrid’s suggested approach, one thing caught my attention right away: while the association with web-scale is definitely an interesting idea, requiring specific knowledge of node locations and a custom API for remote access doesn’t seem to be the best solution. The web addressed this by having URIs for each reachable resource (InfoGrid could try a similar idea and get rid of the separate API for accessing local vs. remote nodes, etc.).
Update: make sure you check the comment thread for more details about InfoGrid’s perspective on scaling graph databases.
Oren Eini concludes in ☞ his post:
After spending some time thinking about it, I came to the conclusion that I can’t envision any general way to solve the problem. Oh, I can think of several ways to reduce the problem:
- Batching cross machine queries so we only perform them at the close of each breadth first step.
- Storing multiple levels of associations (So “users/ayende” would store its relations but also “users/ayende”’s relation and “users/arik”’s relations).
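The first of these ideas, batching cross-machine queries at the close of each breadth-first step, can be sketched as below. This is a simplified, single-process illustration and not code from the post: the `Shard` class, the `shard_of` function, and the request counter are all hypothetical and only stand in for remote machines.

```python
from collections import defaultdict


class Shard:
    """Hypothetical stand-in for one machine holding part of the graph."""

    def __init__(self, adjacency):
        self.adjacency = adjacency  # node id -> list of neighbour ids
        self.requests = 0           # counts simulated round trips

    def neighbours_of(self, node_ids):
        """Answer for a whole batch of nodes in a single round trip."""
        self.requests += 1
        return {node: self.adjacency.get(node, []) for node in node_ids}


def batched_bfs(shards, shard_of, start, max_depth):
    """Breadth-first search issuing at most one request per shard per level."""
    visited = {start}
    frontier = [start]
    for _ in range(max_depth):
        if not frontier:
            break
        # Group the current frontier by the shard that owns each node.
        by_shard = defaultdict(list)
        for node in frontier:
            by_shard[shard_of(node)].append(node)
        next_frontier = []
        for shard_id, nodes in by_shard.items():
            for neighbours in shards[shard_id].neighbours_of(nodes).values():
                for neighbour in neighbours:
                    if neighbour not in visited:
                        visited.add(neighbour)
                        next_frontier.append(neighbour)
        frontier = next_frontier
    return visited
```

The number of round trips then grows with the number of levels and shards touched, not with the number of nodes visited, which is the reduction the quote describes.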
While I haven’t had enough time to think about this topic, my gut feeling is that possible solutions lie in a combination of unique identifiers for distributed nodes and a MapReduce-like approach. I cannot stop wondering whether this is what Google’s ☞ Pregel is doing (NB: I should have read the ☞ paper (pdf) first).


