NoSQL Benchmarks NoSQL use cases NoSQL Videos NoSQL Hybrid Solutions NoSQL Presentations Big Data Hadoop MapReduce Pig Hive Flume Oozie Sqoop HDFS ZooKeeper Cascading Cascalog BigTable Cassandra HBase Hypertable Couchbase CouchDB MongoDB OrientDB RavenDB Jackrabbit Terrastore Amazon DynamoDB Redis Riak Project Voldemort Tokyo Cabinet Kyoto Cabinet memcached Amazon SimpleDB Datomic MemcacheDB M/DB GT.M Amazon Dynamo Dynomite Mnesia Yahoo! PNUTS/Sherpa Neo4j InfoGrid Sones GraphDB InfiniteGraph AllegroGraph MarkLogic Clustrix CouchDB Case Studies MongoDB Case Studies NoSQL at Adobe NoSQL at Facebook NoSQL at Twitter



Using Solr and Hadoop as a NoSQL database

The combination of Hadoop and Solr makes it easy to crunch lots of data and then quickly serve up the results via a fast, flexible search & query API. Because Solr supports query-style requests, it’s suitable as a NoSQL replacement for traditional databases in many situations, especially when the size of the data exceeds what is reasonable with a typical RDBMS.

Using Solr and Hadoop as a NoSQL database

I think the first time I’ve heard about Solr and Lucene mentioned as NoSQL-like storages was from Grant Ingersoll and from the guys.

From a NoSQL perspective:

  • there’s no fixed schema
  • there’s key-value access — hopefully that’s very fast and scalable
  • even if not standardized, there’s an advanced querying language

But as the original article points out some characteristics are missing:

  • Updating the index works best as a batch operation. Individual records can be updated, but each commit (index update) generates a new Lucene segment, which will impact performance.
  • Current support for replication, fail-over, and other attributes that you’d want in a production-grade solution aren’t yet there in SolrCloud. If this matters to you, consider Katta instead.

Original title and link: Using Solr and Hadoop as a NoSQL database (NoSQL databases © myNoSQL)