orientdb: All content tagged as orientdb in NoSQL databases and polyglot persistence

Why relationships are cool… Relationship in RDBMS vs graph databases

I have to agree with Patrick Durusau on this:

I have been trying to avoid graph “intro” slides and presentations.

There are only so many times you can stand to hear “…all the world is a graph…” as though that’s news. To anyone.

This presentation by Luca is different from the usual introduction to graphs presentation.

Original title and link: Why relationships are cool… Relationship in RDBMS vs graph databases (NoSQL database©myNoSQL)

A Human-Readable Jackrabbit Persistence Manager Prototype for Orientdb

Jackrabbit still has a very special place in my heart. I’ve fought it many times, sometimes losing, most of the time winning. But for over 7 years now, it is still the main storage engine serving the content of InfoQ. So this OrientDB engine for Jackrabbit by Thomas Kratz caught my attention:

This has some limitations, as jackrabbit will still access only one node at a time, being able to traverse the graph at the storage level is simply not intended by the whole api. But it works, it’s readable, can be modified at the db level easily.



Neo Technology Is H… Wait, It’s Building Neo4j-As-A-Service

Neo Technology’s hiring announcement is clear about their intention:

“[…] you will be responsible for building, managing, and maintaining a 24x7 NOSQL Databases-as-a-Service operation […]”

In the graph database space, OrientDB offers a hosted solution, NuvolaBase, but I have no numbers about its business so far.


A Comparison of 7 Graph Databases

The main page of InfiniteGraph, a graph database commercialized by Objectivity, features an interesting comparison of 7 graph databases (InfiniteGraph, Neo4j, AllegroGraph, Titan, FlockDB, Dex, OrientDB) based on 16 criteria: licensing, source, scalability, graph model, schema model, API, query method, platforms, consistency, concurrency (distributed processing), partitioning, extensibility, visualizing tools, storage back end/persistency, language, backup/restore.

7 graph databases

Unfortunately the image is almost unreadable, but Peter Karussell has extracted the data in a GoogleDoc spreadsheet embedded below.


NoSQL Hosting Services

Michael Hausenblas put together a list of hosted NoSQL solutions including Amazon DynamoDB and SimpleDB, Google App Engine, Riak, Cassandra, CouchDB, MongoDB, Neo4j, and OrientDB. If you go through my posts on NoSQL hosting, you’ll find a couple more.



A Question About NoSQL Managed Hosting

It’s impossible to always have the right answers to all the questions. So this time I’ll have to ask you all: why are only some NoSQL databases present in managed hosting offers?

The first wave of NoSQL managed hosting services brought MongoDB, CouchDB, and some Redis. The second wave brought some more MongoDB, CouchDB, and just a bit more of Redis. It was only the third wave that brought some managed services for graph databases: Neo4j and OrientDB. Plus the first proposal for Cassandra managed hosting.

The first answer that comes to mind when thinking about NoSQL managed services is adoption. If a product is not in wide use then the chances for a company to run a profitable hosting business are very low. But I have the feeling that this is not the only or the complete answer.

Please chime in and share your thoughts.


Hosted and Managed NoSQL: Cassandra, Redis, OrientDB

In the last few days I’ve read about some new NoSQL hosting solutions:

  • Cassandra: managed hardware & software hosting:

    Per node:

    • Intel Dual Quad-core (8 CPUs), 16 GB of memory, 2 TB primary storage + 500 GB commitlog drive
    • 5 public IP addresses, 1000 Mbps private network port
    • Debian, CentOS, RedHat or FreeBSD
    • Cassandra setup, configuration and ongoing maintenance (repairs, cleanups, troubleshooting)
    • Cassandra upgrades (rolling restart)
    • 24x7 real-time monitoring (load, tcp, jmx and cassandra logs)
    • Multi-datacenter environment (we’ll spread your cluster across two or three geographic locations, based on your needs)
    • 30 days test drive

    Cost: $850/month per node (5 TB bandwidth, includes backups & monitoring)

  • OrientDB: NuvolaBase

    • Real-time replicated deployment
    • Managed
    • JSON over HTTP access
    • can offer VPN connections to the cluster
  • Redis: Cloudnode

    • is still in beta
    • “one Redis instance free with every Cloudnode account”, but no further details about the characteristics of the instance

Hosting for NoSQL databases has been available in some form or another for a while, but only for the most popular ones (MongoDB, CouchDB, Redis). Things are changing fast. Neo4j is heavily advertising its Heroku add-on, OrientDB got NuvolaBase, and so on.

This is the market that Amazon is targeting with Amazon RDS, SimpleDB, and DynamoDB: managed data services, as part of a bigger strategy. What should be clear is that Amazon is not after NoSQL database companies.

Anyone considering a business in the managed data services market should realize that Amazon will not get into supporting all the NoSQL databases out there. They’d also better take a deep look and learn from what Amazon is offering with SimpleDB and DynamoDB.


11 Document-Oriented Databases Which Are 8: CouchDB, Jackrabbit, MongoDB, RavenDB

Such a list would be even more useful with the following classification:

Production ready


Note: A special mention in this category goes to OrientDB and Terrastore, which may not be widely adopted but are still active projects, probably with a couple of production deployments.



An Intro to Gremlin the Graph Traversal Language

A nice intro to Gremlin, the Groovy-based graph traversal language supporting Neo4j, OrientDB, DEX, RDF Sail, TinkerGraph, and ReXster:

Next thing you should do is take your favorite graph database and try out Gremlin.
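If you have no graph database at hand, the idea of a traversal language can still be tried out in a few lines. The sketch below is plain Python, not Gremlin: the `TinyGraph` and `Traversal` classes, the `out` and `values` step names, and the sample data are all hypothetical and only imitate the style of chaining steps from a starting vertex.

```python
from collections import defaultdict


class TinyGraph:
    """Hypothetical in-memory property graph (not a Gremlin or Blueprints API)."""

    def __init__(self):
        self.props = {}                  # vertex id -> dict of properties
        self.edges = defaultdict(list)   # vertex id -> [(label, target id)]

    def add_vertex(self, vid, **props):
        self.props[vid] = props

    def add_edge(self, src, label, dst):
        self.edges[src].append((label, dst))


class Traversal:
    """A list of current vertex ids plus chainable steps."""

    def __init__(self, graph, ids):
        self.graph = graph
        self.ids = list(ids)

    def out(self, label):
        # Follow outgoing edges with the given label from every current vertex.
        nxt = [dst
               for vid in self.ids
               for (lbl, dst) in self.graph.edges[vid]
               if lbl == label]
        return Traversal(self.graph, nxt)

    def values(self, key):
        # Read one property from every current vertex.
        return [self.graph.props[vid][key] for vid in self.ids]


g = TinyGraph()
g.add_vertex(1, name="alice")
g.add_vertex(2, name="bob")
g.add_vertex(3, name="carol")
g.add_vertex(4, name="dave")
g.add_edge(1, "knows", 2)
g.add_edge(1, "knows", 3)
g.add_edge(2, "knows", 4)

friends = Traversal(g, [1]).out("knows").values("name")
friends_of_friends = Traversal(g, [1]).out("knows").out("knows").values("name")
print(friends, friends_of_friends)
```

Each step returns a new traversal, which is what makes the chained style possible. A real traversal language adds filtering, incoming edges, and lazy evaluation on top of this.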


OrientDB - Pure Java NoSQL Datastore

I analysed all the popular ones but none fitted my requirements. I had one criterion for selecting a database: I must be able to code in Java. Most available systems were non-Java based, which would be a significant issue for a one-man project. Even if they had a Java interface, the installation, setup, etc. were a tedious process. Having a database developed purely in Java has many advantages:

  1. Easy packaging with other applications
  2. Easy to install and run
  3. Can be embedded
  4. Can run in same or different VM
  5. Easy to debug
  6. Easy to test

After much searching, I came across OrientDB.

These are programming and deployment requirements more than storage requirements. Judging by the above points alone, quite a few other databases would make the list.



OrientDB Improves Performance Through Defrag

The problems I found with the HOLES were that small spaces weren’t reused at all and there was heavy fragmentation. This caused a global slowness and the growth of the database on disk (in some cases to many times the original size). After 2 weeks of work I’ve published to SVN and Maven the new version of the OrientDB storage with:

  • In-line defrag: similar to what some file systems already do, joining small holes together. In-line defrag works while the database is online and in use
  • Improved management of small changes to records
  • 2 configurable strategies for finding the best hole to join during the defrag process
  • Configurable hole distance to decide when to join multiple holes together
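
To make the idea of joining holes by distance concrete, here is a minimal Python sketch. It is not OrientDB’s implementation: the `coalesce_holes` function, the `(offset, size)` representation of a hole, and the `max_distance` parameter are assumptions made for illustration, and the sketch does not model the two hole-selection strategies, which the announcement does not describe.

```python
def coalesce_holes(holes, max_distance):
    """Merge free spans whose gap is at most max_distance bytes.

    holes: iterable of non-overlapping (offset, size) tuples of free space.
    Returns a new list of (offset, size) tuples sorted by offset.
    """
    merged = []
    for offset, size in sorted(holes):
        if merged:
            prev_offset, prev_size = merged[-1]
            gap = offset - (prev_offset + prev_size)
            if 0 <= gap <= max_distance:
                # Assume the live records sitting in the gap are shifted left
                # into the previous hole, so the free space becomes one
                # contiguous span ending where the current hole ends.
                end = offset + size
                total = prev_size + size
                merged[-1] = (end - total, total)
                continue
        merged.append((offset, size))
    return merged


holes = [(0, 10), (14, 6), (100, 8), (108, 4)]
joined = coalesce_holes(holes, max_distance=4)
adjacent_only = coalesce_holes(holes, max_distance=0)
print(joined)
print(adjacent_only)
```

A larger `max_distance` produces fewer, bigger holes at the price of moving more live data. That is the kind of trade-off a configurable distance lets an operator tune.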



New OrientDB Release: new memory model, new graph api, much more stable

On its way towards the 1.0 version, OrientDB announced a new release featuring:

  • Brand new memory model with level-1 and level-2 caches (Issue #242)
  • SQL prepared statement (Issue #49)
  • SQL Projections with the support of links (Issue #15)
  • Graphical editor for documents in OrientDB Studio app (Issue #217)
  • Graph representation in OrientDB Studio app
  • Support for JPA annotation by the Object Database interface (Issue #102)
  • Smart Console under bash: history, auto-completion, etc. (Issue #228)
  • Operations to work with GEO-spatial points (Issue #182)
  • @rid support in SQL UPDATE statement (Issue #72)
  • Range queries against Indexes (Issue #231)
  • 100% support of TinkerPop Blueprints 0.5

Regrettably the same comment thread shows that there are still some problems handling large amounts of data in OrientDB.
