
neo4j: All content tagged as neo4j in NoSQL databases and polyglot persistence

Neo4j Data Modeling: What Question Do You Want to Answer?

Mark Needham:

Over the past few weeks I’ve been modelling ThoughtWorks project data in neo4j and I realised that the way that I’ve been doing this is by considering what question I want to answer and then building a graph to answer it.

This same principle should be applied to modeling with any NoSQL database. Thinking in terms of access patterns is one of the major differences between doing data modeling in the NoSQL space and the relational world, which is driven, at least in the first phases and theoretically, by the normalization rules.

Original title and link: Neo4j Data Modeling: What Question Do You Want to Answer? (NoSQL database©myNoSQL)

via: http://www.markhneedham.com/blog/2012/05/05/neo4j-what-question-do-you-want-to-answer/


How to Import Large Graphs to Neo4j With Spring Data

In my case, I wanted to create a simple recommendation engine (the domain doesn’t matter so much). To do that, I had to import FAST 20 million nodes of one-to-many, sparse matrix data. This became a bit more complicated (and interesting) task than originally anticipated, so it became a mini-project itself.

Bulk insert is a scenario that every database should have covered.
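The linked post's import code is not reproduced here. As a rough sketch of one common bulk-loading idea, the snippet below groups records into fixed-size batches so that each write handles many records at once. The `write_batch` callback and the default batch size are hypothetical, not taken from the post or from Spring Data.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(records: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` records."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def bulk_import(records: Iterable[T],
                write_batch: Callable[[List[T]], None],
                size: int = 10_000) -> int:
    """Pass records to `write_batch` (a hypothetical writer) one batch at a time."""
    total = 0
    for batch in batched(records, size):
        write_batch(batch)  # assumption: one transaction or request per batch
        total += len(batch)
    return total
```

Because `batched` consumes its input lazily, the full data set never has to sit in memory, which matters at the 20-million-node scale the post mentions.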


via: http://iordanis.com/post/22677357894/import-large-graphs-to-neo4j-with-spring-data-fast


Neo4j REST API Tutorial

A detailed, language-agnostic intro to the Neo4j REST API:

In the above examples we have seen how nodes, relationships, and properties can be created, edited, updated, and deleted from the Neo4j HTTP terminal.
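The tutorial's own examples are not reproduced here. As a hedged sketch, the snippet below only builds the HTTP requests for the operations the quote lists, without sending anything. The endpoint paths follow the legacy Neo4j 1.x REST API as I recall it (`/db/data/node`, `/db/data/node/{id}/relationships`, `/db/data/node/{id}/properties/{key}`), and the base URL is an assumed local default; check both against the documentation for your server version.

```python
import json
from typing import Any, Dict, NamedTuple, Optional

BASE_URL = "http://localhost:7474/db/data"  # assumption: default local server


class Request(NamedTuple):
    method: str
    url: str
    body: Optional[str]


def create_node(properties: Dict[str, Any]) -> Request:
    """Create a node with the given properties."""
    return Request("POST", f"{BASE_URL}/node", json.dumps(properties))


def create_relationship(start_id: int, end_id: int, rel_type: str) -> Request:
    """Create a typed relationship from one node to another."""
    payload = {"to": f"{BASE_URL}/node/{end_id}", "type": rel_type}
    url = f"{BASE_URL}/node/{start_id}/relationships"
    return Request("POST", url, json.dumps(payload))


def set_property(node_id: int, key: str, value: Any) -> Request:
    """Set (create or update) a single property on a node."""
    url = f"{BASE_URL}/node/{node_id}/properties/{key}"
    return Request("PUT", url, json.dumps(value))


def delete_node(node_id: int) -> Request:
    """Delete a node; it carries no request body."""
    return Request("DELETE", f"{BASE_URL}/node/{node_id}", None)
```

Each `Request` can be handed to any HTTP client, which keeps the sketch as language-agnostic in spirit as the tutorial itself.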


via: http://www.hacksparrow.com/neo4j-tutorial-rest-api.html


NoSQL Databases Adoption in Numbers

The data comes from download counts of Jaspersoft’s NoSQL connectors. RedMonk published a graphic and an analysis, and Klint Finley followed up with job trends:

NoSQL databases adoption

A couple of things I don’t see mentioned in the RedMonk post:

  1. if and how the data has been normalized for each connector’s availability

    According to the post, the data was collected between January 2011 and March 2012, and I think that not all connectors were available from the beginning of that period.

  2. if and how marketing pushes for each connector have been factored in

    Announcing the Hadoop connector at an event with 2000 attendees or the MongoDB connector at an event with 800 attendees could definitely influence the results (nb: keep in mind that the largest number is less than 7000, so 200-500 downloads triggered by such an event have a significant impact)

  3. Redis and VoltDB are mostly OLTP-only databases
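To put the second point in numbers, here is a small worked calculation. It uses only the figures given above (a largest download count below 7000 and a hypothetical 200-500 event-driven downloads); the function name is illustrative.

```python
def share_of_downloads(event_downloads: int, total_downloads: int) -> float:
    """Return event-driven downloads as a percentage of a connector's total."""
    if total_downloads <= 0:
        raise ValueError("total_downloads must be positive")
    return 100.0 * event_downloads / total_downloads


# 7000 is an upper bound on the largest total, so these shares are lower bounds.
UPPER_BOUND_TOTAL = 7000

for event_downloads in (200, 500):
    pct = share_of_downloads(event_downloads, UPPER_BOUND_TOTAL)
    print(f"{event_downloads} downloads: at least {pct:.1f}% of the total")
```

Even for the most downloaded connector this comes to roughly 3-7% of its total, and the share is larger for every connector with fewer downloads.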



Intro to Neo4j Cypher Query Language

A very good slide deck from Max de Marzi introducing Neo4j’s Cypher query language. While you’ll have to go through the 50 slides yourself to get the details, I’ve extracted a couple of interesting bits:

  1. Cypher was created because the Neo4j Java API was too verbose and Gremlin is too prescriptive
  2. SPARQL was designed for a different data model and doesn’t work very well with a graph database
  3. Cypher design decisions:
    • declarative
    • ASCII-art patterns (nb: when I first saw Cypher I didn’t think of this, but it is cool)
    • pattern-matching
    • external DSL
    • closures
    • SQL familiarity (nb: as much as it’s possible with a radically different data model and processing model)
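To make the "declarative" and "ASCII-art patterns" points concrete, here is a toy matcher written for illustration only. It is not Cypher and does not use any Neo4j API: it understands a single-hop pattern such as `(a)-[:KNOWS]->(b)` and matches it against an in-memory edge list. The edge data and names are made up.

```python
import re
from typing import Dict, List, Tuple

Edge = Tuple[str, str, str]  # (start node, relationship type, end node)

# Recognises a single-hop pattern such as "(a)-[:KNOWS]->(b)".
PATTERN = re.compile(r"\((\w+)\)-\[:(\w+)\]->\((\w+)\)")


def match(pattern: str, edges: List[Edge]) -> List[Dict[str, str]]:
    """Return one variable binding per edge that fits the pattern."""
    parsed = PATTERN.fullmatch(pattern.strip())
    if parsed is None:
        raise ValueError(f"unsupported pattern: {pattern!r}")
    start_var, rel_type, end_var = parsed.groups()
    return [
        {start_var: start, end_var: end}
        for start, edge_type, end in edges
        if edge_type == rel_type
        # The same variable on both sides only matches self-loops.
        and (start_var != end_var or start == end)
    ]


edges = [("ann", "KNOWS", "bob"), ("bob", "KNOWS", "cid"), ("ann", "LIKES", "cid")]
print(match("(a)-[:KNOWS]->(b)", edges))
```

The caller describes the shape of the data it wants, drawn as a small picture of the graph, and leaves the traversal strategy to the matcher.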


NoSQL Hosting Services

Michael Hausenblas put together a list of hosted NoSQL solutions including Amazon DynamoDB and SimpleDB, Google App Engine, Riak, Cassandra, CouchDB, MongoDB, Neo4j, and OrientDB. If you go through my posts on NoSQL hosting, you’ll find a couple more.


via: http://webofdata.wordpress.com/2012/03/18/hosted-nosql/


Graph Databases Updates: DEX Graph Database 4.5 and Neo4j 1.7 Milestone 1

Two new releases in the graph databases space:

DEX Graph Database 4.5

The new DEX Graph Database release comes with pre-packaged graph algorithms—breadth and depth first traversal, shortest path, Gabow connectivity—available for Java, .NET, and C++. You can get the new version from here.
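For readers unfamiliar with the algorithms named above, here is a minimal sketch of an unweighted shortest path found with breadth-first traversal. It is plain Python over an adjacency dictionary and makes no claim about how DEX implements or exposes these algorithms.

```python
from collections import deque
from typing import Dict, List, Optional

Graph = Dict[str, List[str]]  # node -> directly reachable nodes


def shortest_path(graph: Graph, start: str, goal: str) -> Optional[List[str]]:
    """Return a path with the fewest hops from start to goal, or None."""
    if start == goal:
        return [start]
    previous: Dict[str, Optional[str]] = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor in previous:
                continue  # already reached by a path that is no longer
            previous[neighbor] = node
            if neighbor == goal:
                # Walk the predecessor links back to the start.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None
```

Breadth-first order guarantees that the first time the goal is reached, it is reached with the minimum number of hops.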

Neo4j 1.7 Milestone 1

As per the Neo4j 1.7 Milestone 1 update, this version features:

  • improved Cypher
  • SSL support
  • improved Neo4j documentation
  • high availability improvements (nb: there are recommended maintenance releases for Neo4j 1.5 and 1.6)
  • upgraded Blueprints and Gremlin support

You can get Neo4j 1.7 from here.



Neo4j and the Java Universal Network/Graph Framework

Max De Marzi [1]:

In the world of graph databases, one such stock room is the Java Universal Network/Graph Framework (JUNG) which contains a cache of algorithms from graph theory, data mining, and social network analysis, such as routines for clustering, decomposition, optimization, random graph generation, statistical analysis, and calculation of network distances, flows, and importance measures (centrality, PageRank, HITS, etc.).
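As an illustration of one of the importance measures listed in the quote, here is a small PageRank sketch using plain power iteration. It is written in Python for brevity and does not use or mirror JUNG's API; the damping factor of 0.85 and the fixed iteration count are conventional assumptions.

```python
from typing import Dict, List

Graph = Dict[str, List[str]]  # node -> nodes it links to


def pagerank(graph: Graph, damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Approximate PageRank scores by repeated redistribution of rank."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    n = len(nodes)
    if n == 0:
        return {}
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread over all nodes.
        dangling = sum(rank[node] for node in nodes if not graph.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for node, targets in graph.items():
            if not targets:
                continue
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank
```

The scores always sum to 1, so they can be read as the share of attention each node receives from a random walk over the graph.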

Update: there’s a second part, in which De Marzi looks into visualizing graphs with Node Quilt:

Node Quilt


  1. I’ve already told you that Max de Marzi became my favorite read on graph database subjects. There’s only one thing I don’t like, but it’s his content. 


via: http://architects.dzone.com/articles/how-implement-java-universal


Neo4j and D3.js: Visualizing Connections Over Time

Another great graph data visualization using Neo4j and D3.js from Max De Marzi:

Graph data visualization of connections over time

  • Max de Marzi has lately been my favorite source for graph data visualization posts
  • The diagram looks amazing, but I wonder whether it would scale to larger data sets
  • Even after giving it some thought, I’m still not sure how graph databases can record historical relationships or the evolution of relationships in a graph. If you have any ideas, I’d love to hear them.



Neo4j and JRuby: Expressive Graph Traversals With Jogger

Jogger gives you named traversals and is a little bit like named scopes. Jogger groups multiple Pacer traversals together and gives them a name. Pacer traversals are like pipes. What are pipes? Pipes are great!!

The most important conceptual difference is that the order in which named traversals are called matters, while it usually doesn’t matter in which order you call named scopes.
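The point about ordering can be sketched without Jogger or Pacer. The snippet below is a Python analogy, not their Ruby API: each named traversal is a step that maps a set of nodes to the next set, and a chain applies the steps in the order given. The graph data and traversal names are invented for the example.

```python
from typing import Callable, Dict, Iterable, List, Set

Graph = Dict[str, Dict[str, List[str]]]  # node -> relationship type -> neighbours
Step = Callable[[Graph, Set[str]], Set[str]]

GRAPH: Graph = {
    "ann": {"friend": ["bob"], "likes": ["jazz"]},
    "bob": {"friend": ["cid"], "likes": ["rock"]},
    "cid": {"likes": ["folk"]},
}


def follow(rel_type: str) -> Step:
    """Build a step that follows one relationship type from every current node."""
    def step(graph: Graph, nodes: Set[str]) -> Set[str]:
        return {other
                for node in nodes
                for other in graph.get(node, {}).get(rel_type, [])}
    return step


# Hypothetical named traversals, each wrapping a single pipe-like step.
NAMED_TRAVERSALS: Dict[str, Step] = {
    "friends": follow("friend"),
    "likes": follow("likes"),
}


def run(start: str, names: Iterable[str], graph: Graph = GRAPH) -> Set[str]:
    """Apply the named traversals in the given order, starting from one node."""
    current = {start}
    for name in names:
        current = NAMED_TRAVERSALS[name](graph, current)
    return current
```

Here `run("ann", ["friends", "likes"])` yields what Ann's friends like, while `run("ann", ["likes", "friends"])` yields nothing, because the things Ann likes have no friends. Filters like named scopes commute; traversal steps do not.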

Knowing how Gremlin and Cypher compare, the question is: how does Jogger compare to Cypher?



Beer Recommendations With Graph Databases

Josh Adell explains how to extend a simple recommendation engine to similarity-based collaborative filtering:

Instead of basing recommendations off of one similar rating, I can calculate how similarly you and I rated all the things we have rated, and only get recommendations from you if I have determined we are similar enough in our tastes.

This is much closer to how the recommendation engines developed by sites like Amazon or Netflix work.
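The idea in the quote can be sketched in a few lines. The version below is an in-memory Python illustration, not the post's graph queries: it assumes cosine similarity over the items two users have both rated, and the 0.9 similarity threshold and 4.0 rating cut-off are arbitrary choices for the example.

```python
from math import sqrt
from typing import Dict, List

Ratings = Dict[str, Dict[str, float]]  # user -> item -> rating


def similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity over items both users rated (0.0 if there are none)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[item] * b[item] for item in shared)
    norm_a = sqrt(sum(a[item] ** 2 for item in shared))
    norm_b = sqrt(sum(b[item] ** 2 for item in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(user: str, ratings: Ratings,
              min_similarity: float = 0.9,
              min_rating: float = 4.0) -> List[str]:
    """Suggest unrated items that sufficiently similar users rated highly."""
    mine = ratings[user]
    picks = set()
    for other, theirs in ratings.items():
        if other == user or similarity(mine, theirs) < min_similarity:
            continue  # skip myself and users whose tastes differ too much
        picks.update(item for item, rating in theirs.items()
                     if item not in mine and rating >= min_rating)
    return sorted(picks)


ratings: Ratings = {
    "me": {"ipa": 5, "stout": 1},
    "ann": {"ipa": 5, "stout": 1, "porter": 5},
    "bob": {"ipa": 1, "stout": 5, "pilsner": 5},
}
print(recommend("me", ratings))
```

Ann rated the shared beers the same way I did, so her porter is recommended; Bob's tastes run opposite to mine, so his pilsner is ignored even though he rated it highly.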


via: http://blog.everymansoftware.com/2012/02/similarity-based-recommendation-engines.html


Gremlin vs Cypher

Romiko Derbynew compares Gremlin and Neo4j Cypher:

  • Simple graph traversals are much more efficient when using Gremlin
  • Queries in Gremlin are 30-50% faster for simple traversals
  • Cypher is ideal for complex traversals where back tracking is required
  • Cypher is our choice of query language for reporting
  • Gremlin is our choice of query language for simple traversals where projections are not required
  • Cypher has an intrinsic table projection model, whereas Gremlin’s table projection model relies on AS steps, which can be cumbersome when backtracking (e.g. Back(), As() and _CopySplit); in Cypher it is just comma-separated matches
  • Cypher is much better suited for outer joins than Gremlin: achieving similar results in Gremlin requires parallel querying with CopySplit, whereas Cypher uses the Match clause with optional relationships
  • Gremlin is ideal when you need to retrieve very simple data structures
  • Table projection in gremlin can be very powerful, however outer joins can be very verbose

So in a nutshell, we like to use Cypher when we need tabular data back from Neo4j, and it is especially useful for outer joins.
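To show what "tabular data with outer joins" means for a graph, here is a small Python sketch of the result shape only. It is not Cypher or Gremlin: it projects rows of (node, related node) and keeps nodes that lack the relationship by filling the second column with `None`, the way an optional relationship would. The sample names are invented.

```python
from typing import Dict, List, Optional, Tuple

Row = Tuple[str, Optional[str]]


def project(nodes: List[str], edges: Dict[str, List[str]]) -> List[Row]:
    """Return one row per (node, related node), keeping unmatched nodes."""
    rows: List[Row] = []
    for node in nodes:
        related = edges.get(node, [])
        if related:
            rows.extend((node, other) for other in related)
        else:
            rows.append((node, None))  # kept, as in a left outer join
    return rows


works_for = {"ann": ["acme"], "cid": ["acme", "globex"]}
for row in project(["ann", "bob", "cid"], works_for):
    print(row)
```

An inner-join style traversal would silently drop `bob`; the outer-join style keeps him in the table with an empty employer column, which is usually what a report needs.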

Patrick Durusau


via: http://romikoderbynew.com/2012/02/22/gremlin-vs-cypher-initial-thoughts-neo4j/