orientdb: All content tagged as orientdb in NoSQL databases and polyglot persistence

Hosted and Managed NoSQL: Cassandra, Redis, OrientDB

In the last few days I’ve read about some new NoSQL hosting solutions:

  • Cassandra: managed hardware & software hosting:

    Per node:

    • Intel dual quad-core (8 CPUs), 16 GB of memory, 2 TB primary storage + 500 GB commitlog drive
    • 5 public IP addresses, 1000 Mbps private network port
    • Debian, CentOS, RedHat or FreeBSD
    • Cassandra setup, configuration and ongoing maintenance (repairs, cleanups, troubleshooting)
    • Cassandra upgrades (rolling restart)
    • 24x7 real-time monitoring (load, TCP, JMX and Cassandra logs)
    • Multi-datacenter environment (we’ll spread your cluster across two or three geographic locations, based on your needs)
    • 30-day test drive

    Cost: $850/month per node (5 TB bandwidth, includes backups & monitoring)

  • OrientDB: NuvolaBase

    • Real-time replicated deployment
    • Managed
    • JSON over HTTP access
    • Can offer VPN connections to the cluster

  • Redis: Cloudnode

    • Cloudno.de is still in beta
    • “one Redis instance free with every Cloudnode account”, but no further details about the characteristics of the instance

Hosting for NoSQL databases has been available in some form or another for a while, but only for the most popular ones (MongoDB, CouchDB, Redis). Things are changing fast. Neo4j is heavily advertising its Heroku add-on, OrientDB got NuvolaBase, and so on.

This is the market Amazon is targeting with Amazon RDS, SimpleDB, and DynamoDB: managed data services, as part of a bigger strategy. What should be clear is that Amazon is not after NoSQL database companies.

Anyone considering a business in the managed data services market should realize that Amazon will not get into supporting all the NoSQL databases out there. They would also do well to take a close look at what Amazon is offering with SimpleDB and DynamoDB and learn from it.

Original title and link: Hosted and Managed NoSQL: Cassandra, Redis, OrientDB (NoSQL database©myNoSQL)


11 Document-Oriented Databases Which Are 8: CouchDB, Jackrabbit, MongoDB, RavenDB

Such a list would be even more useful with the following classification:

Production ready

Experimental

Note: OrientDB and Terrastore deserve a special mention in this category: even if they are not widely adopted, they are still active projects, probably with a couple of production deployments.

Abandonware


An Intro to Gremlin the Graph Traversal Language

A nice intro to Gremlin, the Groovy-based graph traversal language supporting Neo4j, OrientDB, DEX, RDF Sail, TinkerGraph, and ReXster:

The next thing you should do is take your favorite graph database and try out Gremlin.


OrientDB - Pure Java NoSQL Datastore

I analysed all the popular ones but none fitted my requirements. I had one criterion for selecting a database: I must be able to code in Java. Most available systems were not Java-based, which would be a significant issue for a one-man project. Even if they had a Java interface, the installation, setup, etc. were a tedious process. Having a database developed purely in Java has many advantages:

  1. Easy packaging with other applications
  2. Easy to install and run
  3. Can be embedded
  4. Can run in same or different VM
  5. Easy to debug
  6. Easy to test

After much searching, I came across OrientDB.

These are programming and deployment requirements rather than storage requirements. Judging by the above points alone, quite a few other databases would make the list.

via: http://myjavaexp.blogspot.com/2011/06/orientdb-pure-java-nosql-datastore.html


OrientDB Improves Performance Through Defrag

The problems I found with the HOLES were that small spaces weren’t reused at all and there was heavy fragmentation. This caused overall slowness and growth of the database on disk (in some cases to many times the original size). After 2 weeks of work I’ve published to SVN and Maven the new version of the OrientDB storage with:

  • In-line defrag: similar to what some file systems already do, joining small holes together. In-line defrag works while the database is online and in use
  • Improved management of small changes to records
  • Two configurable strategies for finding the best hole to join during the defrag process
  • Configurable hole distance to decide when to join multiple holes together
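
The idea of joining nearby holes can be pictured with a small sketch. This is not OrientDB's storage code: the `Hole` class, the `coalesce` method and the `maxDistance` parameter are hypothetical names chosen for illustration, and the sketch assumes a hole is simply an offset and a size within the data file.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only; names and layout are assumptions, not OrientDB code.
final class HoleDefragSketch {

    /** A free region of the data file covering [offset, offset + size). */
    static final class Hole {
        final long offset;
        final long size;

        Hole(long offset, long size) {
            this.offset = offset;
            this.size = size;
        }

        long end() {
            return offset + size;
        }
    }

    /**
     * Joins holes that are at most maxDistance bytes apart into one larger hole.
     * A real storage engine would also have to relocate any records sitting in
     * the gap between two joined holes; this sketch only computes the ranges.
     */
    static List<Hole> coalesce(List<Hole> holes, long maxDistance) {
        List<Hole> sorted = new ArrayList<>(holes);
        sorted.sort(Comparator.comparingLong((Hole h) -> h.offset));

        List<Hole> merged = new ArrayList<>();
        for (Hole hole : sorted) {
            Hole last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (last != null && hole.offset - last.end() <= maxDistance) {
                // Close enough: extend the previous hole to cover this one.
                long end = Math.max(last.end(), hole.end());
                merged.set(merged.size() - 1, new Hole(last.offset, end - last.offset));
            } else {
                merged.add(hole);
            }
        }
        return merged;
    }
}
```

A larger `maxDistance` yields fewer, bigger holes at the cost of moving more data, which is presumably why the distance is configurable.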

via: http://zion-city.blogspot.com/2011/04/graphdb-benchmark-part-ii.html


New OrientDB Release: new memory model, new graph api, much more stable

On its way to version 1.0, OrientDB announced a new release featuring:

  • Brand new memory model with level-1 and level-2 caches (Issue #242)
  • SQL prepared statement (Issue #49)
  • SQL Projections with the support of links (Issue #15)
  • Graphical editor for documents in OrientDB Studio app (Issue #217)
  • Graph representation in OrientDB Studio app
  • Support for JPA annotation by the Object Database interface (Issue #102)
  • Smart Console under bash: history, auto-completion, etc. (Issue #228)
  • Operations to work with GEO-spatial points (Issue #182)
  • @rid support in SQL UPDATE statement (Issue #72)
  • Range queries against Indexes (Issue #231)
  • 100% support of TinkerPop Blueprints 0.5

Regrettably, the same comment thread shows that there are still some problems handling large amounts of data in OrientDB.

via: http://groups.google.com/group/orient-database/browse_thread/thread/7f2e53b1894fc9b7


NuvolaBase: OrientDB in the Cloud

Another interesting announcement coming out today is NuvolaBase, OrientDB in the cloud. Information about the service is very scarce on its website, so apart from the different account plans I couldn’t find out much. I hope to hear more about it from Luca Garulli, the creator of OrientDB and the guy behind NuvolaBase.


OrientDB New Release Featuring Sync and Async Replication

OrientDB, the document or graph store, has announced a new release, 0.9.24, featuring, among a few SQL support improvements, synchronous and asynchronous replication.

The complete list of changes can be found ☞ here. The ☞ official announcement lists the following new features:

  • Support for Clustering with synchronous and asynchronous replication
  • New SQL RANGE keyword: SELECT FROM ... WHERE ... RANGE <from> [,<to>]
  • New SQL LIMIT keyword: SELECT FROM ... WHERE ... LIMIT 20
  • Improved CREATE INDEX command
  • New REMOVE INDEX command
  • New console command INFO CLASS
  • New console command TRUNCATE CLASS and TRUNCATE CLUSTER
  • MRB+Tree is now faster and stable
  • Improved import/export commands
  • Improved JSON compliance
  • Improved TRAVERSE operator with the optional field list to traverse

I’ve contacted Luca Garulli, OrientDB’s main developer, for more details about the OrientDB replication.


Neo4j and OrientDB Performance Compared

Sort of a benchmark based on running the ☞ TinkerPop test suite against Neo4j and OrientDB (nb: we’ve learned recently that OrientDB is a document-graph database).

[Figure: OrientDB vs Neo4j performance benchmark]

A couple of notes:

  • I don’t think the test suite addresses the concurrency angle of these graph databases
  • Neo4j is fully ACID compliant and transactions can have a huge impact on the performance, at least for bulk operations

If I’m not mistaken, this is the first data comparing the performance of two graph databases. That doesn’t mean it is a relevant NoSQL benchmark or performance evaluation, though.

via: http://zion-city.blogspot.com/2010/09/orientdb-fastest-graphdb-available.html


Correction: OrientDB is a Document and Graph Store

Luca Garulli, ☞ OrientDB project lead, contacted me a couple of days ago offering some clarifications about OrientDB.

Luca Garulli: OrientDB is a document-graph DBMS with schema-less, schema-full or mixed modes. Why also graph? Because the relationships are all direct links between documents. No “JOIN” is used. This allows loading an entire graph of interconnected documents in a few ms!

The Graph interface is documented ☞ here and starting from v. 0.9.22 OrientDB is compliant with the TinkerPop stack of graph tools such as the Gremlin language. ☞ This is the link that shows the OrientDB usage from Gremlin.

Alex: Couple of questions:

  1. what is the format in which data is stored?
  2. how do you query data?

Luca: The document is stored in a compressed JSON-like format. Documents are contained in clusters. Clusters can be physical, logical or in-memory. A cluster is something close to a MongoDB collection and its aim is to group documents together. The first use of a cluster is to group documents of the same type, as a sort of TABLE in the relational world. But you can also create a cluster “UrgentInvoices” and put in it all the urgent invoices that are close to expiring.

A cluster can be browsed and queried using native queries and SQL queries. The SQL support is good enough and has extensions to handle the schema-free features, such as adding/removing items in collections and maps. This example adds the string ‘Luca’ to the collection “names”:

update Account add names = 'Luca'

There are also special operators for trees and graphs. This query crosses all the relationships, avoiding costly JOINs:

select from Profile where address.city.country.name = 'Rome'

This one is much more powerful and complex:

select from Profile where any() traverse( 0,3 ) ( 
    any().toUpperCase().indexOf( 'NAVONA' ) > -1 )

any() means any field, because each document can have different fields (it is schema-less). The traverse operator goes recursively from the current document (0) down to at most the 3rd level of nesting (3), checking the condition on the right.
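
To make the description of traverse concrete, here is a minimal sketch in plain Java of a depth-bounded search over any field. It is only one reading of the explanation above, not OrientDB's implementation: documents are modelled as nested `Map`s, a linked document is assumed to be a `Map` value, and the class and method names are hypothetical.

```java
import java.util.Locale;
import java.util.Map;

// Illustrative sketch only; not OrientDB's traverse implementation.
final class TraverseSketch {

    /**
     * Returns true if any field, in any document between minDepth and maxDepth
     * levels away from the starting document, contains the needle
     * (case-insensitive). Depth 0 is the starting document itself.
     */
    static boolean anyFieldMatches(Map<String, ?> doc, int minDepth, int maxDepth, String needle) {
        return visit(doc, 0, minDepth, maxDepth, needle.toUpperCase(Locale.ROOT));
    }

    private static boolean visit(Map<String, ?> doc, int depth, int minDepth, int maxDepth,
                                 String needle) {
        if (depth > maxDepth) {
            return false; // the depth bound also guarantees termination on cyclic links
        }
        for (Object value : doc.values()) {
            if (value instanceof Map) {
                // Assumption: a nested Map stands in for a linked document.
                @SuppressWarnings("unchecked")
                Map<String, ?> linked = (Map<String, ?>) value;
                if (visit(linked, depth + 1, minDepth, maxDepth, needle)) {
                    return true;
                }
            } else if (depth >= minDepth && value != null
                    && value.toString().toUpperCase(Locale.ROOT).contains(needle)) {
                return true;
            }
        }
        return false;
    }
}
```

With a profile whose address document holds the street "Piazza Navona", `anyFieldMatches(profile, 0, 3, "NAVONA")` finds the match one level down, while a maximum depth of 0 would inspect only the profile's own fields.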

Then you have native queries:

new ONativeAsynchQuery<ODocument, OQueryContextNativeSchema<ODocument>>(
        database, 
        "Profile",
        new OQueryContextNativeSchema<ODocument>(), this) {

      @Override
      public boolean filter(OQueryContextNativeSchema<ODocument> iRecord) {
        return iRecord.column("id").toInt().minor(10).go();
      }
}.run();

Alex: Thanks a lot!

Update: It looks like OrientDB is also seeing some speed improvements these days. You can read about it ☞ here.


OrientDB Schema Options: Schema-less, Schema-full, and Schema-Mixed

I’ve seen a similar way of handling different schema modes in the Java Content Repository spec:

Although OrientDB can work in schema-less mode, sometimes you need to enforce your data model using a schema. OrientDB supports schema-full or schema-mixed solutions, where the second one means setting such constraints only for certain fields and letting the user add custom fields to the records.
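
The difference between the three modes can be sketched as a small validation routine. This is not OrientDB's schema API: the `Mode` enum, the `isValid` method and the representation of a schema as a map from field name to Java type are all assumptions made for illustration, as is the reading that schema-full rejects undeclared fields and that declared fields are mandatory.

```java
import java.util.Map;

// Illustrative sketch only; names and rules are assumptions, not OrientDB's API.
final class SchemaModeSketch {

    enum Mode { SCHEMA_LESS, SCHEMA_FULL, SCHEMA_MIXED }

    /**
     * Checks a document against a schema under the given mode.
     * SCHEMA_LESS:  anything goes.
     * SCHEMA_MIXED: declared fields must be present with the declared type,
     *               extra custom fields are allowed.
     * SCHEMA_FULL:  as mixed, but undeclared fields are rejected (assumption).
     */
    static boolean isValid(Map<String, Object> doc, Map<String, Class<?>> schema, Mode mode) {
        if (mode == Mode.SCHEMA_LESS) {
            return true;
        }
        for (Map.Entry<String, Class<?>> field : schema.entrySet()) {
            Object value = doc.get(field.getKey());
            // isInstance(null) is false, so a missing field fails here too.
            if (!field.getValue().isInstance(value)) {
                return false;
            }
        }
        if (mode == Mode.SCHEMA_FULL) {
            return schema.keySet().containsAll(doc.keySet());
        }
        return true;
    }
}
```

Under these assumptions, a document with a declared `name` and `age` plus an undeclared `nickname` passes in schema-less and schema-mixed modes but fails in schema-full mode.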

Just in case you aren’t familiar with OrientDB:

via: http://code.google.com/p/orient/wiki/Schema


Release: OrientDB 0.9.20 Featuring Runtime Fetch Strategies

OrientDB, the mixed document/graph database with SQL flavor, ☞ has announced a new release featuring:

  • New run-time Fetch Plans
  • New database properties (Issue #54)
  • POJO callback on serialization/deserialization (Issue #56)
  • New annotation to use RAW binding (Issue #57)

You can read more about OrientDB fetch strategies ☞ here. OrientDB 0.9.20 can be downloaded from ☞ here.

I’ve only started looking at OrientDB quite recently, and apart from this presentation I couldn’t find many more details, so I’m wondering whether there are people using OrientDB in their projects and what some of its use cases are.