oodb: All content tagged as oodb in NoSQL databases and polyglot persistence

NoSQL Databases: HSS Database for .NET, Silverlight, and WP7

Stefan Edlich adds another object-oriented database to the long list of non-relational databases:

The HSS Database is an object oriented database management system (OODB or ODBMS) for Microsoft .NET, Silverlight and Windows Phone 7. HSS Database gives developers the ability to store and retrieve objects from their applications with extremely high speeds compared to other solutions, and with a small footprint (~100KB).

With so many non-relational databases and quite a few well-established object-oriented database management systems out there, I doubt a new, commercial offering will see any significant adoption.

Original title and link: NoSQL Databases: HSS Database for .NET, Silverlight, and WP7 (NoSQL database©myNoSQL)


MagLev NoSQL OODB With Smalltalk-Based Ruby VM

Monty Williams (VMWare/GemStone) interviewed by Werner Schuster:

  • MagLev VM takes full advantage of GemStone/S JIT to native code performance, distributed shared cache, fully ACID transactions, and enterprise class NoSQL data management capabilities to provide a robust and durable programming platform. It can transparently manage a much larger amount (terabytes) of data and code than will fit in memory.
  • I don’t think of MagLev only as a Ruby VM that has an integrated NoSQL database. I think of MagLev as a NoSQL database that uses Ruby for its data manipulation language.
  • The one thing I don’t think people have wrapped their heads around is MagLev provides a “single object space”. Nothing has to be sent/retrieved to/from a separate DB. All your code is executed “in the database.” You don’t even need to keep track of which objects have been modified so you can flush them to disk. MagLev handles that automatically. 
  • You can store any Ruby object, even procs, lambdas, threads, continuations. Here is an example of stopping, copying, saving, and restarting Threads in a different VM than they originated in. blog.bithug.org/2011/09/maglev-debug
  • MagLev persistence is akin to Image Persistence, i.e. objects are persisted to disk in the same format they are in shared cache. You don’t need to marshal them or convert them to JSON or another format.
  • MagLev transactions are ACID, which means that multiple VM’s can interact with the same repository and share state, objects, and code while maintaining referential integrity.
  • When you start a new MagLev VM, code loaded by another VM is likely to still be in the cache. So loading/requiring it can be quite fast.

Am I the only one confused by the mouthful that is the above “NoSQL OODB” description?

Original title and link: MagLev NoSQL OODB With Smalltalk-Based Ruby VM (NoSQL database©myNoSQL)

via: http://www.infoq.com/news/2011/11/ruby-maglev-10


Comparing NoSQL Databases with Object-Oriented Databases

Don White¹, in an interview on odbms.org:

The new data systems are very data-centric and are not trying to facilitate the melding of data and behavior. These new storage systems present specific model abstractions and provide their own specific storage structures. In some cases they offer schema flexibility, but it is basically used just to manage data and not for building sophisticated data structures with type-specific behavior.

Decoupling data from behavior allows both to evolve separately. Or differently put, it allows one to outlive the other.

Another interesting quote from the interview:

[…] why would you want to store data differently than how you intend to use it? I guess the simple answer is when you don’t know how you are going to use your data, so if you don’t know how you are going to use it then why is any data store abstraction better than another?

I guess this explains the 30 years dominance of relational databases. Not in the sense that we never knew how to use data, but rather that we always wanted to make sure we can use it in various ways.

And that explains also the direction NoSQL databases took:

To generalize it appears the newer stores make different compromises in the management of the data to suit their intended audience. In other words they are not developing a general purpose database solution so they are willing to make tradeoffs that traditional database products would/should/could not make. […] They do provide an abstraction for data storage and processing capabilities that leverage the idiosyncrasies of their chosen implementation data structures and/or relaxations in strictness of the transaction model to try to make gains in processing.


  1. Don White: senior development manager at Progress Software Inc., responsible for all feature development and engineering support for ObjectStore  

Original title and link: Comparing NoSQL Databases with Object-Oriented Databases (NoSQL databases © myNoSQL)


Using Object Database db4o as Storage Provider in Voldemort

Abstract: In this article I will show you how easy it is to add support for Versant’s object database, db4o, to Project Voldemort, an Apache-licensed distributed key-value storage system used at LinkedIn, useful for certain high-scalability storage problems where simple functional partitioning is not sufficient. (Note: Voldemort borrows heavily from Amazon’s Dynamo; if you’re interested in this technology, the available papers about Dynamo should be useful.)

Voldemort’s storage layer is completely mockable so development and unit testing can be done against a throw-away in-memory storage system without needing a real cluster or, as we will show next, even a real storage engine like db4o for simple testing.

Note: All db4o related files that are part of this project are publicly available on ☞ GitHub

This is an article contributed by German Viscuso from db4objects (a division of Versant Corporation).

Content:

  1. Voldemort in a Nutshell
  2. Key/Value Storage System
    • Pros and Cons
  3. Logical Architecture
  4. db4o API in a Nutshell
  5. db4o as Key/Value Storage Provider
  6. db4o and BerkeleyDB side by side
  7. Conclusion

Voldemort in a Nutshell

In one sentence, Voldemort is basically a big, distributed, persistent, fault-tolerant hash table designed to trade off consistency for availability.

It’s implemented as a highly available key-value storage system for services that need to provide an “always-on” experience. To achieve this level of availability, Voldemort sacrifices consistency under certain failure scenarios. It makes extensive use of object versioning and application-assisted conflict resolution.

Voldemort targets applications that operate with weak/eventual consistency (the “C” in ACID) if this results in high availability even under occasional network failures. It does not provide any isolation guarantees and permits only single key updates. It is well known that when dealing with the possibility of network failures, strong consistency and high data availability cannot be achieved simultaneously.
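The object versioning and application-assisted conflict resolution mentioned above are typically built on vector clocks. The sketch below is an illustrative simplification, not Voldemort’s actual VectorClock class: each version carries a per-node counter map, and comparing two versions per node tells us whether one supersedes the other or whether they are concurrent siblings the application must reconcile.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch (not Voldemort's actual classes): how vector clocks
// let replicas detect whether two versions are ordered or concurrent.
public class VectorClockSketch {
    public enum Occurred { BEFORE, AFTER, CONCURRENT, EQUAL }

    // A clock maps node id -> logical counter.
    public static Occurred compare(Map<String, Integer> a, Map<String, Integer> b) {
        boolean aBigger = false, bBigger = false;
        java.util.Set<String> nodes = new java.util.HashSet<>(a.keySet());
        nodes.addAll(b.keySet());
        for (String node : nodes) {
            int ca = a.getOrDefault(node, 0);
            int cb = b.getOrDefault(node, 0);
            if (ca > cb) aBigger = true;
            if (cb > ca) bBigger = true;
        }
        if (aBigger && bBigger) return Occurred.CONCURRENT; // needs app-assisted resolution
        if (aBigger) return Occurred.AFTER;
        if (bBigger) return Occurred.BEFORE;
        return Occurred.EQUAL;
    }

    public static void main(String[] args) {
        Map<String, Integer> v1 = new HashMap<>(Map.of("A", 1));
        Map<String, Integer> v2 = new HashMap<>(Map.of("A", 1, "B", 1));
        Map<String, Integer> v3 = new HashMap<>(Map.of("A", 2));
        System.out.println(compare(v1, v2)); // BEFORE: v2 dominates v1
        System.out.println(compare(v2, v3)); // CONCURRENT: siblings, client must reconcile
    }
}
```

The CONCURRENT case is exactly where Voldemort hands both values back to the application instead of silently picking a winner.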

The system uses a series of known techniques to achieve scalability and availability: data is partitioned and replicated using consistent hashing, and consistency is facilitated by object versioning. Consistency among replicas during updates is maintained by a quorum-like technique and a decentralized replica synchronization protocol. It employs a gossip-based distributed failure detection and membership protocol, and is a completely decentralized system with minimal need for manual administration. Storage nodes can be added and removed without requiring any manual partitioning or redistribution of data.
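To make the consistent-hashing step concrete, here is a minimal sketch (illustrative only; names are made up, and Voldemort’s real routing layer is more elaborate, with configurable replication factors): nodes are placed on a hash ring via virtual nodes, and a key is served by the next node clockwise, so adding or removing a node only remaps the keys adjacent to it.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Sketch of consistent hashing: nodes occupy many points on a hash ring
// (virtual nodes smooth out the distribution); a key belongs to the first
// node at or after its own hash position, wrapping around at the end.
public class ConsistentHashSketch {
    private final SortedMap<Integer, String> ring = new TreeMap<>();
    private final int virtualNodes;

    public ConsistentHashSketch(int virtualNodes) { this.virtualNodes = virtualNodes; }

    private static int hash(String s) { return s.hashCode() & 0x7fffffff; }

    public void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) ring.put(hash(node + "#" + i), node);
    }

    public void removeNode(String node) {
        for (int i = 0; i < virtualNodes; i++) ring.remove(hash(node + "#" + i));
    }

    public String nodeFor(String key) {
        SortedMap<Integer, String> tail = ring.tailMap(hash(key));
        // Wrap around to the first ring position if nothing lies clockwise.
        return tail.isEmpty() ? ring.get(ring.firstKey()) : tail.get(tail.firstKey());
    }

    public static void main(String[] args) {
        ConsistentHashSketch ch = new ConsistentHashSketch(100);
        ch.addNode("node-a"); ch.addNode("node-b"); ch.addNode("node-c");
        System.out.println("user:42 -> " + ch.nodeFor("user:42"));
    }
}
```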

Voldemort is not a relational database, it does not attempt to satisfy arbitrary relations while satisfying ACID properties. Nor is it an object database that attempts to transparently map object reference graphs. Nor does it introduce a new abstraction such as document-orientation.

However, given the current complex scenario of persistence solutions in the industry we’ll see that different breeds can be combined to offer solutions that become the right tool for the job at hand. In this case we’ll see how a highly scalable NoSQL solution such as Voldemort can use a fast Java based embedded object database (Versant’s db4o) as the underlying persistence mechanism. Voldemort provides the scalability while db4o provides fast key/value pair persistence.

Key/Value Storage System

To enable high performance and availability, Voldemort allows only very simple key-value data access. Both keys and values can be complex compound objects, including lists or maps, but nonetheless the only supported queries are effectively the following:

value = store.get( key )
store.put( key, value ) 
store.delete( key )

This is clearly not good enough for all storage problems; there are a variety of trade-offs:

Cons:

  • no complex query filters
  • all joins must be done in code
  • no foreign key constraints
  • no triggers/callbacks

Pros:

  • only efficient queries are possible, very predictable performance
  • easy to distribute across a cluster
  • service-orientation often disallows foreign key constraints and forces joins to be done in code anyway (because key refers to data maintained by another service)
  • using a relational db you need a caching layer to scale reads, the caching layer typically forces you into key-value storage anyway
  • often end up with XML or other denormalized blobs for performance anyway
  • clean separation of storage and logic (SQL encourages mixing business logic with storage operations for efficiency)
  • no object-relational mismatch, no mapping

Having a three operation interface makes it possible to transparently mock out the entire storage layer and unit test using a mock-storage implementation that is little more than a HashMap. This makes unit testing outside of a particular container or environment much more practical and was certainly instrumental in simplifying the db4o storage engine implementation.
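A mock storage layer of the kind described is genuinely tiny. The sketch below (the class name is made up, not Voldemort’s actual mock) shows the whole three-operation interface backed by little more than a map:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// A minimal in-memory "store" in the spirit of the mock storage layer
// described above: the full get/put/delete contract over a HashMap-like
// structure, enough to unit-test everything above the storage layer.
public class InMemoryStoreSketch<K, V> {
    private final Map<K, V> map = new ConcurrentHashMap<>();

    public V get(K key) { return map.get(key); }
    public void put(K key, V value) { map.put(key, value); }
    public boolean delete(K key) { return map.remove(key) != null; }

    public static void main(String[] args) {
        InMemoryStoreSketch<String, String> store = new InMemoryStoreSketch<>();
        store.put("user:1", "alice");
        System.out.println(store.get("user:1")); // alice
        store.delete("user:1");
        System.out.println(store.get("user:1")); // null
    }
}
```

Because every real engine (BDB, MySQL, db4o) implements this same contract, tests written against the mock run unchanged against any of them.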

Logical Architecture

In Voldemort, each storage node has three main software component groups: request coordination, membership and failure detection, and a local persistence engine. All these components are implemented in Java.

If we take a closer look we’ll see that each layer in the code implements a simple storage interface that does put, get, and delete. Each of these layers is responsible for performing one function such as TCP/IP network communication, serialization, version reconciliation, inter-node routing, etc. For example the routing layer is responsible for taking an operation, say a PUT, and delegating it to all the N storage replicas in parallel, while handling any failures.

Voldemort’s local persistence component allows for different storage engines to be plugged in. Engines that are in use are Oracle’s Berkeley Database (BDB), Oracle’s MySQL, and an in-memory buffer with persistent backing store. The main reason for designing a pluggable persistence component is to choose the storage engine best suited for an application’s access patterns. For instance, BDB can handle objects typically in the order of tens of kilobytes whereas MySQL can handle objects of larger sizes. Applications choose Voldemort’s local persistence engine based on their object size distribution. The majority of Voldemort’s production instances currently use BDB.

In this article we’ll provide details about the creation of a new storage engine (the pluggable storage-engine layer in the architecture described above) that uses db4o instead of Oracle’s Berkeley DB (BDB), Oracle’s MySQL, or plain memory. Moreover, we’ll show that db4o’s performance is on par with BDB (if not better) while introducing a significant reduction in implementation complexity.

db4o API in a Nutshell

db4o’s basic API is not far from Voldemort’s in terms of simplicity, but it’s not limited to the storage of key/value pairs. db4o is a general-purpose object persistence engine that can deal with the peculiarities of native POJO persistence, including arbitrarily complex object references.

The very basic API looks like this:

container.store( object ) 
container.delete( object ) 
objectSet = container.queryByExample( prototype )

Two more advanced querying mechanisms are available, and we will introduce one of them (SODA) in the following sections. As you can see, in order to provide an interface for key/value storage (what Voldemort expects), we’ll have to provide an intermediate layer that takes requests via the basic Voldemort API and translates them into db4o API calls.
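To illustrate the shape of that intermediate layer, here is a toy translation from the get/put/delete interface onto a container that only knows how to store, delete, and query objects. The container here is a plain list standing in for db4o’s ObjectContainer, and all names are illustrative, not the article’s actual code:

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch of the translation layer: Voldemort-style key/value calls
// mapped onto an object store that persists whole pair objects. A plain
// list stands in for the db4o container; the comments note which db4o
// call each step would correspond to.
public class KeyValueOverObjectStore {
    static class Pair {
        final String key; final String value;
        Pair(String k, String v) { key = k; value = v; }
    }

    private final List<Pair> container = new ArrayList<>(); // stand-in for db4o storage

    public void put(String key, String value) {
        delete(key);                         // replace any existing pair for this key
        container.add(new Pair(key, value)); // ~ container.store(pair) in db4o
    }

    public String get(String key) {          // ~ a SODA query constrained on the "key" field
        for (Pair p : container) if (p.key.equals(key)) return p.value;
        return null;
    }

    public void delete(String key) {
        container.removeIf(p -> p.key.equals(key)); // ~ container.delete(pair) in db4o
    }

    public static void main(String[] args) {
        KeyValueOverObjectStore store = new KeyValueOverObjectStore();
        store.put("k", "v1");
        store.put("k", "v2");
        System.out.println(store.get("k")); // v2
    }
}
```

The real provider described next does essentially this, with a b-tree index on the key field replacing the linear scan.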

db4o as Key/Value Storage Provider

db4o fits seamlessly into the Voldemort picture as a storage provider because:

  • db4o allows free-form storage of objects with no special requirements on object serialization, which results in a highly versatile storage solution that can easily be configured to store anything from low-level objects (e.g., byte arrays) to objects of arbitrary complexity (both for keys and values in key/value pairs)
  • db4o’s b-tree based indexing capabilities allow for quick retrieval of keys while acting as a key/value pair storage provider
  • db4o’s simple query system allows the retrieval of key/value pairs with one line of code
  • db4o can persist versioned objects directly which frees the developer from having to use versioning wrappers on stored values

db4o can act as a simpler but powerful replacement for Oracle’s Berkeley DB (BDB): the complexity of the db4o storage provider is closer to Voldemort’s “in memory” storage provider than to the included BDB provider, which results in faster test and development cycles (remember, Voldemort is a work in progress).

As a first step in the implementation we provide a generic class to expose db4o as a key/value storage provider that is not dependent on Voldemort’s API (also useful to db4o developers who need this functionality in any application). The class is voldemort.store.db4o.Db4oKeyValueProvider<Key, Value> and relies on Java generics so it can be used with any sort of key/value objects.

This class implements the following methods, which are more or less self-explanatory considering they operate on a store of key/value pairs:

voldemort.store.db4o.Db4oKeyValueProvider.keyIterator()
voldemort.store.db4o.Db4oKeyValueProvider.getKeys() 
voldemort.store.db4o.Db4oKeyValueProvider.getValues(Key) 
voldemort.store.db4o.Db4oKeyValueProvider.pairIterator()
voldemort.store.db4o.Db4oKeyValueProvider.get(Key) 
voldemort.store.db4o.Db4oKeyValueProvider.delete(Key) 
voldemort.store.db4o.Db4oKeyValueProvider.delete(Key, Value) 
voldemort.store.db4o.Db4oKeyValueProvider.delete(Db4oKeyValuePair<Key, Value>)
voldemort.store.db4o.Db4oKeyValueProvider.put(Key, Value) 
voldemort.store.db4o.Db4oKeyValueProvider.put(Db4oKeyValuePair<Key, Value>)
voldemort.store.db4o.Db4oKeyValueProvider.getAll()
voldemort.store.db4o.Db4oKeyValueProvider.getAll(Iterable<Key>) 
voldemort.store.db4o.Db4oKeyValueProvider.truncate()
voldemort.store.db4o.Db4oKeyValueProvider.commit() 
voldemort.store.db4o.Db4oKeyValueProvider.rollback() 
voldemort.store.db4o.Db4oKeyValueProvider.close() 
voldemort.store.db4o.Db4oKeyValueProvider.isClosed()

Second, we implement a db4o StorageEngine following Voldemort’s API, namely voldemort.store.db4o.Db4oStorageEngine, which defines the key (K) as ByteArray and the value (V) as byte[].

Since this class inherits from abstract class voldemort.store.StorageEngine<K, V> it must (and it does) implement the following methods:

voldemort.store.db4o.Db4oStorageEngine.getVersions(K) 
voldemort.store.db4o.Db4oStorageEngine.get(K) 
voldemort.store.db4o.Db4oStorageEngine.getAll(Iterable<K>) 
voldemort.store.db4o.Db4oStorageEngine.put(K, Versioned<V>) 
voldemort.store.db4o.Db4oStorageEngine.delete(K, Version)

We also implement the class that handles the db4o storage configuration: voldemort.store.db4o.Db4oStorageConfiguration. This class is responsible for providing all the initialization parameters used when the db4o database is created (e.g., index creation on the key (K) field for fast retrieval of key/value pairs).

Third and last we provide three support classes:

  • voldemort.store.db4o.Db4oKeyValuePair<Key, Value> : a generic key/value pair class. Instances of this class will be what ultimately gets stored in the db4o database.
  • voldemort.store.db4o.Db4oKeysIterator<Key, Value>: an Iterator implementation that allows iterating over the keys (K).
  • voldemort.store.db4o.Db4oEntriesIterator<Key, Value>: an Iterator implementation that allows iterating over the entries (key/value pairs).

and a few test classes: voldemort.CatDb4oStore, voldemort.store.db4o.Db4oStorageEngineTest (this is the main test class), voldemort.store.db4o.Db4oSplitStorageEngineTest.

db4o and BerkeleyDB side by side

Let’s take a look at db4o’s simplicity by comparing the matching methods side by side with BDB (both can now act as Voldemort’s low-level storage provider).

Constructors: BdbStorageEngine() vs Db4oStorageEngine()

The db4o constructor gets rid of the serializer object because db4o can store objects of any complexity with no transformation. The code:

this.versionSerializer = new Serializer<Version>() {
    public byte[] toBytes(Version object) {
        return ((VectorClock) object).toBytes();
    }

    public Version toObject(byte[] bytes) {
        return versionedSerializer.getVersion(bytes);
    }
};

is no longer necessary and, of course, this impacts all storage operations because db4o requires no conversions “to bytes” and “to object” back and forth:

Fetch by Key

BDB (fetch by key):

DatabaseEntry keyEntry = new DatabaseEntry(key.get());
DatabaseEntry valueEntry = new DatabaseEntry();
List<T> results = Lists.newArrayList();

// Walk all duplicate entries stored under this key.
for(OperationStatus status = cursor.getSearchKey(keyEntry, valueEntry, lockMode);
    status == OperationStatus.SUCCESS;
    status = cursor.getNextDup(keyEntry, valueEntry, lockMode)) {
    results.add(serializer.toObject(valueEntry.getData()));
}
return results;

db4o (fetch by key):

Query query = getContainer().query(); 
query.constrain(Db4oKeyValuePair.class); 
query.descend("key").constrain(key); 
return query.execute();

In this example we use SODA, a low-level but powerful graph-based query system in which you build a query tree and pass it to db4o for execution.

Store Key/Value Pairs

BDB (store key/value pair):

DatabaseEntry keyEntry = new DatabaseEntry(key.get());
boolean succeeded = false;
transaction = this.environment.beginTransaction(null, null);

// Check existing values:
// if there is a version obsoleted by this value, delete it;
// if there is a version later than this one, throw an exception.
DatabaseEntry valueEntry = new DatabaseEntry();
cursor = getBdbDatabase().openCursor(transaction, null);

for(OperationStatus status = cursor.getSearchKey(keyEntry, valueEntry, LockMode.RMW);
    status == OperationStatus.SUCCESS;
    status = cursor.getNextDup(keyEntry, valueEntry, LockMode.RMW)) {

    VectorClock clock = new VectorClock(valueEntry.getData());
    Occured occured = value.getVersion().compare(clock);

    if(occured == Occured.BEFORE)
        throw new ObsoleteVersionException();
    else if(occured == Occured.AFTER) // best-effort delete of obsolete previous value
        cursor.delete();
}

// All prior obsolete versions are cleaned up; now insert the new value.
valueEntry = new DatabaseEntry(versionedSerializer.toBytes(value));
OperationStatus status = cursor.put(keyEntry, valueEntry);
if(status != OperationStatus.SUCCESS)
    throw new PersistenceFailException("Put operation failed: " + status);
succeeded = true;

db4o (store key/value pair):

boolean succeeded = false;
candidates = keyValueProvider.get(key);

for(Db4oKeyValuePair<ByteArray, Versioned<byte[]>> pair: candidates) {
    Occured occured = value.getVersion().compare(pair.getValue().getVersion());
    if(occured == Occured.BEFORE)
        throw new ObsoleteVersionException();
    else if(occured == Occured.AFTER) // best-effort delete of obsolete previous value
        keyValueProvider.delete(pair);
}

// All prior obsolete versions are cleaned up; now insert the new value.
try {
    keyValueProvider.put(key, value);
    succeeded = true;
} catch(Db4oException de) {
    throw new PersistenceFailException("Put operation failed: " + de.getMessage());
}

Status of Unit Tests

Let’s take a look at the current status of the unit tests for the db4o storage provider (Db4oStorageEngineTest):

  1. testPersistence: single key/value pair storage and retrieval with a storage engine close operation in between (it takes some time because this is the first test and the Voldemort system is kick-started).
  2. testEquals: test that retrieval of storage engine instance by name gets you the same instance.
  3. testNullConstructorParameters: null constructors for storage engine instantiation are illegal. Arguments are storage engine name and database configuration.
  4. testSimultaneousIterationAndModification: threaded test of simultaneous puts (inserts), deletes and iteration over pairs (150 puts and 150 deletes before start of iteration).
  5. testGetNoEntries: test that an empty store returns zero pairs.
  6. testGetNoKeys: test that an empty store returns zero keys.
  7. testKeyIterationWithSerialization: test storage and key retrieval of 5 pairs serialized as Strings.
  8. testIterationWithSerialization: test storage and retrieval of 5 pairs serialized as Strings.
  9. testPruneOnWrite: stores 3 versions for one key and tests prune (overwriting of previous versions should happen).
  10. testTruncate: stores 3 pairs and issues a truncate database operation. Then verifies db is empty.
  11. testEmptyByteArray: stores 1 pair with zeroed key and tests correct retrieval.
  12. testNullKeys: test that basic operations (get, put, getAll, delete, etc.) fail with a null key parameter (5 operations on pairs total).
  13. testPutNullValue: test that put operation works correctly with a null value (2 operations on pairs total, put and get).
  14. testGetAndDeleteNonExistentKey: test that the key of a non-persistent pair doesn’t return a value.
  15. testFetchedEqualsPut: stores 1 pair with complex version and makes sure only one entry is stored and retrieved.
  16. testVersionedPut: tests that obsolete or equal versions of a value can’t be stored. Test correct storage of incremented version (13 operations on pairs total).
  17. testDelete: puts 2 pairs with conflicting versions and deletes one (7 operations on pairs total).
  18. testGetVersions: Test retrieval of different versions of pair (3 operations on pairs total).
  19. testGetAll: puts 10 pairs and tests if all can be correctly retrieved at once (getAll)
  20. testGetAllWithAbsentKeys: same as before but with non persistent keys (getAll returns 0 pairs).
  21. testCloseIsIdempotent: tests that a second close does not result in error.

Conclusion

db4o provides a low-level implementation of a key/value storage engine that is both simpler than BDB and on par (if not better) in performance. Moreover, the implementation shows that different persistence solutions, such as a typical highly scalable, eventually consistent NoSQL engine and an object database, can be combined into the right tool for the job at hand, which makes the typical “versus” arguments obsolete.

Finally, the short implementation path taken to make db4o act as a reliable key/value storage provider shows the power and versatility of native embedded object databases.

Original title and link for this post: Using Object Database db4o as Storage Provider in Voldemort (published on the NoSQL blog: myNoSQL)


What is HyperGraphDB?

Recently we’ve seen a lot of activity in the graph database world. Better understanding the space will help us make smarter decisions, so I’ve decided to reach out to the main players in the market and run a series of interviews about their projects and goals. The first in this series is about HyperGraphDB, and Borislav Iordanov, its creator, has been kind enough to answer my questions.

myNoSQL: What is HyperGraphDB?

Borislav Iordanov: HyperGraphDB is a storage framework based on generalized hypergraphs as its underlying data model. The unit of storage is a tuple made up of 0 or more other tuples. Each such tuple is called an atom. One could think of the data model as relational where higher-order, n-ary relationships are allowed or as graph-oriented where edges can point to an arbitrary set of nodes and other edges. Each atom has an arbitrary, strongly-typed value associated with it. The type system managing those values is embedded as a hypergraph and customizable from the ground up. HyperGraphDB itself is an embedded database with an XMPP-based distribution framework and it relies on a key-value store underneath, currently BerkeleyDB. In its present form, it is a full-fledged object-oriented database for Java as well. Storage layout, indexing and caching are designed to support graph traversals and pattern matching.
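To make the “tuple made up of other tuples” idea concrete, here is a tiny illustrative model in plain Java (not HyperGraphDB’s actual API; all names are made up): an atom holds a value and points at zero or more other atoms, so a relation can itself participate in higher-order relations.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative model of the generalized-hypergraph data model described
// above: a zero-arity atom acts as a "node", while an atom with targets
// acts as a hyperedge that may point at nodes AND at other edges.
public class HypergraphSketch {
    static class Atom {
        final String value;       // the strongly-typed payload, simplified to a String
        final List<Atom> targets; // 0 targets = node; n targets = n-ary relation
        Atom(String value, Atom... targets) {
            this.value = value;
            this.targets = Arrays.asList(targets);
        }
        int arity() { return targets.size(); }
    }

    public static void main(String[] args) {
        Atom alice = new Atom("Alice");
        Atom bob = new Atom("Bob");
        Atom carol = new Atom("Carol");
        // An n-ary relation over three nodes:
        Atom meeting = new Atom("met-at-conference", alice, bob, carol);
        // A higher-order relation pointing at another relation:
        Atom report = new Atom("reported-by", meeting, carol);
        System.out.println(meeting.arity() + " " + report.arity()); // 3 2
    }
}
```

In a plain graph database the ternary `meeting` relation would have to be reified into an extra node plus three binary edges; here it is a single atom.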

myNoSQL: How would you position HyperGraphDB inside the NoSQL space?

Boris: I think it is quite apart and I don’t see it fitting into any particular category. Because of the term “hypergraph”, it’s been categorized as a “graph database”, but strictly speaking it is not. The focus is on highly complex data and knowledge representation problems. It originated from an AI project (http://www.opencog.org) and its power lies partly in its data model and partly in its open-architecture framework.

myNoSQL: Would you mind explaining a bit more why you are placing HyperGraphDB closer to object databases than to graph databases?

Boris: Probably because object structures have the same kind of generality: arbitrary nesting and n-ary relations; and if you model a relation as an identifiable object, it is in effect reified, so you can have higher-order relationships. In addition, OO databases have well-developed type systems, as HyperGraphDB does (though HyperGraphDB’s is more general: you can model functional-style type systems in it, and you can have types of types of types, etc., ad infinitum).

Standard graphs are really just one kind of data structure that is conceptually simple and happens to be very well studied mathematically, so people use them a lot in modeling. A graph database is probably very good at dealing with graph-oriented problems on large datasets, but for general programming one would want a more versatile data model, and HyperGraphDB offers that, as do OO databases.

Obviously, one could model object structures as well as hypergraphs with classical graphs, but that doesn’t mean much: one could translate a C program into a Turing machine, and that doesn’t make the Turing machine a good choice for the problem the C program is solving.

myNoSQL: What are other solutions in this category/space?

Boris: I don’t know of any. The topic maps formalism (an RDF rival that sadly is not very popular) is very close to the HyperGraphDB data model. RDF itself, named graphs, etc. are close. Then graph and OO databases obviously touch on some of the functionality, with OO databases probably being closer. The database behind freebase.com is very similar in architecture, but relations have fixed arity there too.

myNoSQL: Could you identify a couple of unique features that are differentiating HyperGraphDB from the other solutions?

Boris: Probably the two most interesting ones are:

  1. Higher-order, n-ary relations are unique to HyperGraphDB
  2. Open architecture: there’s a very strong “frameworky” aspect to HyperGraphDB; it’s not a black box with a fixed, restrictive data model. The storage layout is open and documented. One can plug in customized indexing, customized type handling, customized back-end storage, customized distribution algorithms, etc.

myNoSQL: What’s coming next on HyperGraphDB’s roadmap and why?

Boris: The next release, 1.1, will come within the next month or so, containing many bug fixes and API polishing. In addition, it will contain an MVCC implementation to increase transaction throughput, out-of-the-box replication, and some optimizations for querying and graph traversals.

Following that, we will be focusing on developing a query language geared towards HyperGraphDB’s unique data model and developing more distribution algorithms for truly massive scalability.

People have also asked about full-text search so an integration with Lucene might happen some time within the next couple of months.

Nested graphs, RAM-only graphs, and a C++ port are also desirable features on our radar, as time and resources allow. We are an open-source, LGPL project and it all depends on how many people are willing to contribute and how much time they are willing to put in, so no definite dates yet.

myNoSQL: Thanks a lot Boris!

What is HyperGraphDB? originally posted on the NoSQL blog: myNoSQL


Ongoing comparison of OODBs and NoSQL

Last week I tried to briefly present the main differences between OODBs and NoSQL. Roberto Zicari, over at ☞ ODBMS Industry Watch, is looking into this topic in more detail; in his latest post he invited a few people to answer this question.

I am including below the ones I’ve found most interesting, but I’d encourage you to check the original article and let me know which ones are your favorites.

Peter Norvig (Director of Research at Google Inc.): “You should probably use what works for you and not worry about what people call it.”

Miguel-Angel Sicilia (University of Alcalá): “[…] I do not see NoSQL and ODBMS as overlapping, but as complementary solutions for non-traditional data management problems.”

Jan Lehnardt (CouchDB): “For me, NoSQL is about choice. OODBs give users a choice. By that definition though, Excel is a NoSQL storage solution and I wouldn’t support that idea :) I think, as usual, common sense needs to be applied. Prescriptive categorisation rarely helps but those who are in the business of categorising.”

Dwight Merriman, (CEO of 10gen): “A comparison of document-oriented and object-oriented databases is fascinating as they are philosophically more different than one might at first expect.”

Erik Falsken (db4o): “[…] The complexity of object relationships is their shared drawback. Being unable to handle things like inheritance and polymorphism is what stops them from becoming object databases. You can think of db4o as a “document-oriented database” which has been extended to support object-oriented principles.”

Tobias Downer (MckoiDB): “[…] I wouldn’t count on the word being in our lexicon for very long though or any products seriously branding themselves using this label, because these systems are likely to eventually come back around and support SQL like query languages in the future.”

I guess only time will tell which ones got this right and which didn’t.


Comparing OODB and NoSQL

In the last couple of hours I read two interesting comparisons of object-oriented databases and NoSQL stores.

The first one, published on the ☞ ODBMS Industry Watch blog, comes from Anat Gafni, VP of Engineering at db4objects, who suggests using the following criteria:

By each dimension of the purpose of OODBs:

  1. persistent (could be accomplished by other methods like replicating to other machines, using non-volatile caches, etc.)
  2. being queryable
  3. scalable (beyond what can be in cache, but could be distributed instead)
  4. objects vs. relations

Arguable properties:

  5. can express and query based on complex relationships among data items
  6. can be shared among multiple “clients”

Many of these other databases are similar to OODBs in items 1, 3, and 4. I am not sure they have capabilities 3, 5, and 6 above.

A couple of the criteria are either pretty vague or too generic (e.g., 2, 5, and 6), but I’d say the comparison is pretty fair.

The other short comparison was given by Mike Dirolf in a MongoDB presentation:

while they use fairly similar concepts, the main difference is that in an OODB you are saving instances, while in a document database you are saving data.

Anything else we should be adding to these?