


RavenDB: All content tagged as RavenDB in NoSQL databases and polyglot persistence

A map to reviewing RavenDB code base

Ayende answers a series of questions about specific details of the RavenDB code base. This could be a very good starting point for those interested in how RavenDB is implemented.

Original title and link: A map to reviewing RavenDB code base (NoSQL database©myNoSQL)


A practical comparison of Map-Reduce in MongoDB and RavenDB

Ben Foster looks at MongoDB’s Map-Reduce and aggregation framework and then compares them with RavenDB’s Map-Reduce:

I thought it would be interesting to do a practical comparison of Map-Reduce in both MongoDB and RavenDB.

There are more differences than similarities — I’m not referring to the API differences, but to fundamental differences to the ways they operate.

✚ RavenDB’s author has a follow-up post in which he underlines another major difference: RavenDB’s Map-Reduce operates as an index, while MongoDB’s Map-Reduce is an online operation.
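To make the index-versus-online distinction concrete, here is a minimal Python sketch. It is not either database's API: the `map_doc` function, the `category` and `amount` fields, and both classes are hypothetical names chosen for illustration. The online variant rescans every document each time it is asked; the index variant keeps the reduced totals around and adjusts them on each write, so a query is a lookup. For simplicity the index here is updated synchronously and relies on the reduce step being a plain sum.

```python
from collections import defaultdict


def map_doc(doc):
    """Map step (hypothetical): emit (key, value) pairs for one document."""
    yield doc["category"], doc["amount"]


def online_map_reduce(docs):
    """Online style: scan all documents every time the query runs."""
    totals = defaultdict(int)
    for doc in docs.values():
        for key, value in map_doc(doc):
            totals[key] += value
    return dict(totals)


class MapReduceIndex:
    """Index style: keep reduced totals and adjust them on every write."""

    def __init__(self):
        self._docs = {}
        self._totals = defaultdict(int)

    def put(self, doc_id, doc):
        # Undo the contribution of the previous version, if any.
        old = self._docs.get(doc_id)
        if old is not None:
            for key, value in map_doc(old):
                self._totals[key] -= value
        # Apply the contribution of the new version.
        self._docs[doc_id] = doc
        for key, value in map_doc(doc):
            self._totals[key] += value

    def query(self, key):
        # A query is a lookup; no documents are scanned here.
        return self._totals.get(key, 0)
```

The trade-off is where the work happens: the online version pays at query time, the index version pays at write time (or, in a real system, in the background after the write).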



RavenDB: The Road to Release

Ayende Rahien shares with RavenDB’s community a five-point plan for the future of RavenDB. One of the points caught my eye:

Second, we do acknowledge that we suffer from a typical blindness for how we approach RavenDB. Since we built it, we know how things are supposed to be, and that is how we usually test them. Even when we try to go for the edge cases, we are constrained by our own thinking. We are currently working on getting an external testing team to do just that. Actively work to make use of RavenDB in creative ways specifically to try to break it.

To say that testing a database is complicated is an understatement. Even more so if it’s a distributed database.



RavenDB 2.5 with Dynamic Aggregation and Query Streaming

Jan Stenberg summarizes on InfoQ the latest RavenDB release:

A stable version 2.5 of the document database RavenDB has been released with dynamic aggregation allowing for complex queries and an Unbounded results API using query streaming to retrieve large result sets in a single request.

While the Hadoop space has lately been about SQL and speed, NoSQL databases are starting to look into an area where users have high expectations: advanced queries over large amounts of data. If you remember the early days, pretty much everything was about key-based access, and then about sifting through data with map-reduce. Today we have many different query languages and data processing frameworks. And there’s still a lot to come.



RavenDB document indexing process

Itamar Syn-Hershko explains the indexing process in RavenDB:

RavenDB has a background process that is handed new documents and document updates as they come in, right after they were stored in the Document Store, and it passes them in batches through all the indexes in the system. For write operations, the user gets an immediate confirmation on their transaction—even before the indexing process started processing these updates—without waiting for indexing, but being 100 percent certain the changes were recorded in the database. Queries do not wait for indexing either—they just use the indexes that exist at the time the query was issued. This ensures both smooth operation on all fronts, and that no documents are left behind.

Asynchronous indexing is tricky. While it appears to remove the indexing penalty from both reads and writes, it has a few drawbacks:

  1. immediate inconsistency: with asynchronous indexes, there is no guarantee that a query reflects writes that have already been acknowledged.
  2. impossibility of defining unique indexes. With async indexes, by the time the index is updated it is too late to tell the client that the uniqueness constraint has been violated.
  3. complicated crash recovery. With async indexing, the server must be able to resume the indexing process from where it left off. If this information is not persisted, crash recovery might lead to permanent data inconsistencies.

Any other obvious ones I’ve missed?
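The first two drawbacks are easy to reproduce in a toy model. The Python sketch below is not RavenDB code: the class, its method names, and the `email` field are all hypothetical. Writes are acknowledged as soon as the document is stored, and a separate `index_batch` call stands in for the background indexer.

```python
from collections import deque


class AsyncIndexedStore:
    """Toy document store with an index that is updated after the write."""

    def __init__(self):
        self.documents = {}
        self.email_index = {}      # email -> set of document ids
        self._indexed_email = {}   # document id -> email currently indexed
        # Kept in memory only: a crash would lose track of what still
        # needs indexing, which is the crash-recovery drawback.
        self.pending = deque()

    def put(self, doc_id, doc):
        # The write is acknowledged before any index has seen it.
        self.documents[doc_id] = doc
        self.pending.append(doc_id)
        return True

    def index_batch(self, batch_size=100):
        """Stand-in for the background indexer: process one batch."""
        processed = 0
        while self.pending and processed < batch_size:
            doc_id = self.pending.popleft()
            email = self.documents[doc_id]["email"]
            old = self._indexed_email.get(doc_id)
            if old is not None:
                self.email_index[old].discard(doc_id)
            self.email_index.setdefault(email, set()).add(doc_id)
            self._indexed_email[doc_id] = email
            processed += 1
        return processed

    def query_by_email(self, email):
        # Uses whatever the index contains right now.
        return sorted(self.email_index.get(email, set()))

    def is_stale(self):
        return bool(self.pending)


store = AsyncIndexedStore()
store.put("users/1", {"email": "a@example.org"})
stale_result = store.query_by_email("a@example.org")   # index not updated yet
store.put("users/2", {"email": "a@example.org"})       # duplicate, still acknowledged
store.index_batch()
fresh_result = store.query_by_email("a@example.org")
```

Here `stale_result` is empty even though the first write was acknowledged, and `fresh_result` contains both ids: the duplicate email is only visible once both writes have long been confirmed to the client.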



NoSQL Bug Fix Releases: Redis 2.6.10 and RavenDB 2.01

The RavenDB team has released RavenDB 2.01, a new version that is mostly a bug fix release. The change log is here.

Redis also has a new bug fix release, 2.6.10, which includes non-critical fixes and 5 small improvements. The change log is here.


NoSQL Hosting: Redis and RavenDB

More service providers for hosted NoSQL solutions:

  1. Garantia Data to Offer its Redis & Memcached Hosting Services in Europe: “In-memory NoSQL Company extends Redis Cloud and Memcached Cloud to European Amazon Web Services users.”
  2. CloudBird Launch, now with RavenDB 2.0 support - The CloudBird Blog: “Today we’re cracking open the Champagne as we peel off the beta label and officially welcome production databases to our RavenDB hosting service. What’s more we’re also introducing support for the Raven 2.0 RTM.”

It’s no longer just “a database for every taste”, but steadily becoming more of “a database for every taste served from anywhere you like”.


RavenDB Bulk Inserts: Implementation Details

Ayende Rahien:

We stream the results to the server directly, so while the client is still sending results, we are already flushing them to disk.

To make things even more interesting, we aren’t using standard GZip compression over the whole request. Instead, each batch is compressed independently, which means we don’t have a dependency on the internals of the compression routine internal buffering system, etc. It also means that we get each batch much faster.

There are, of course, rate limits built in, to protect ourselves from flooding the buffers, but for the most part, you will have hard time hitting them.

Bulk inserts and data imports are two interesting topics in the world of NoSQL databases, where there are no ACID guarantees. What is the state of the database if the data stream is cut midway? What is the state of the database if the import fails midway? What is the state of the database if some insert/update operations fail? I’m not aware of any good answers to these questions.
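The per-batch compression described in the quote can be sketched in a few lines of Python. This is only an illustration of the idea, not RavenDB's wire format: the JSON encoding, the 4-byte length prefix, and the function names are all assumptions. Each batch is gzipped on its own and framed with its length, so the receiver can decompress and store a batch as soon as it arrives.

```python
import gzip
import json
import struct
from itertools import islice


def batched(iterable, size):
    """Yield lists of up to `size` items from any iterable."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def compress_batches(docs, batch_size):
    """Gzip each batch independently and prefix it with its byte length."""
    for batch in batched(docs, batch_size):
        payload = gzip.compress(json.dumps(batch).encode("utf-8"))
        yield struct.pack(">I", len(payload)) + payload


def read_batches(stream):
    """Decode complete frames from a bytes object, one batch at a time."""
    offset = 0
    while offset + 4 <= len(stream):
        (length,) = struct.unpack_from(">I", stream, offset)
        if offset + 4 + length > len(stream):
            break  # the last frame is incomplete: stop at the previous batch
        payload = stream[offset + 4:offset + 4 + length]
        offset += 4 + length
        yield json.loads(gzip.decompress(payload).decode("utf-8"))
```

In this sketch a stream that is cut midway still yields every batch that arrived in full, which makes the questions above concrete: the receiver ends up with a prefix of the import, and something has to decide whether that is an acceptable state.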



RavenDB 2.0 Is Out: Over 6 Months of Features, Improvements, and Bug Fixes

Briefly announced by Ayende yesterday, RavenDB 2.0 comes with quite a long list of improvements and bug fixes. Digging through his blog, I’ve found this old post summarizing the most interesting features in RavenDB 2.0:

  1. drastically improved RavenDB Management Studio
  2. improved operational support—more monitoring data exposed through performance monitors and logs
  3. core bundles
  4. Changes() API: a feature that allows subscribing to change events. If you are familiar with CouchDB, this sounds like _changes.
  5. Async API
  6. Eval patching: running JS scripts server-side against stored objects
  7. more authentication options & control
  8. Indexing optimizations
  9. Improved map/reduce, facets, IN queries, and sharding
  10. Support for JOINs


NoSQL and JOINs: RavenDB and RethinkDB

Daniel Lang:

One of the main differences between relational databases and document databases is the lack of native joining capabilities, right? This is no longer true for RavenDB.

This was never the case for RethinkDB [1], which launched with support for JOINs. But it’s great to see others doing it too.

  1. First and last time disclaimer here: I work for RethinkDB.



An Overview of RavenDB Replication

A good overview of the main characteristics of RavenDB replication by John Bennett:

  1. one-way
  2. push-based
  3. asynchronous
  4. secure
  5. batched



RavenDB vs MSSQL: Which to Choose?

Daniel Lang:

The question which database to choose obviously depend on your concrete scenario, the skills of your team, your environment (existing licenses), etc. but here is what I think could help you:

We choose RavenDB when

  • we can think of our data in terms of aggregates with mostly independent chunks of data (e.g. customer, order, product, etc.)
  • we need to have good performance on aggregation and calculation queries
  • we need to have complex searching (full-text, facets, etc.)
  • we need to be able to scale
  • we need high availability at low costs

We choose SQL Server when

  • when we need to support user generated reports and highly dynamical data analysis
  • we have to deal with mostly relation data (e.g. accounting, statistics)
  • we want to use Windows Azure
  • our customer definitely wants us to choose sql server without knowing better

My additional 2 cents:

  1. the easy part: don’t choose one or another based on feature lists. Feature lists should be used only in apples-to-apples comparisons.
  2. the more complicated part: don’t use a relational database just because you’ve always used one. Don’t use a NoSQL database just because it’s the shiny new toy you need on your portfolio/resumé. Don’t use both just because it might be fun.
