NoSQL Benchmarks NoSQL use cases NoSQL Videos NoSQL Hybrid Solutions NoSQL Presentations Big Data Hadoop MapReduce Pig Hive Flume Oozie Sqoop HDFS ZooKeeper Cascading Cascalog BigTable Cassandra HBase Hypertable Couchbase CouchDB MongoDB OrientDB RavenDB Jackrabbit Terrastore Amazon DynamoDB Redis Riak Project Voldemort Tokyo Cabinet Kyoto Cabinet memcached Amazon SimpleDB Datomic MemcacheDB M/DB GT.M Amazon Dynamo Dynomite Mnesia Yahoo! PNUTS/Sherpa Neo4j InfoGrid Sones GraphDB InfiniteGraph AllegroGraph MarkLogic Clustrix CouchDB Case Studies MongoDB Case Studies NoSQL at Adobe NoSQL at Facebook NoSQL at Twitter



RavenDB document indexing process

Itamar Syn-Hershko explains the indexing process in RavenDB:

RavenDB has a background process that is handed new documents and document updates as they come in, right after they were stored in the Document Store, and it passes them in batches through all the indexes in the system. For write operations, the user gets an immediate confirmation on their transaction—even before the indexing process started processing these updates—without waiting for indexing, but being 100 percent certain the changes were recorded in the database. Queries do not wait for indexing either—they just use the indexes that exist at the time the query was issued. This ensures both smooth operation on all fronts, and that no documents are left behind.

Asynchronous indexing is tricky. While it looks like addressing the performance penalty on both read and write, it actually has a few drawbacks:

  1. immediate inconsistency: with asynchronous indexes, there are no consistency guarantees.
  2. impossibility of defining unique indexes. When using async indexes, it’s impossible to define unique indexes as by the time the index would be updated it would be too late to acknowledge the client that the uniqueness constraint is not satisfied.
  3. complicated crash recovery. With async indexing, the server must be able to continue the indexing process from where it was left. If this information is not persistent, crash recovery might lead to permanent data inconsistencies.

Any other obvious ones I’ve missed?

Original title and link: RavenDB document indexing process (NoSQL database©myNoSQL)