HA: All content tagged as HA in NoSQL databases and polyglot persistence

How to Achieve the High Availability Imperative

Dr. John Busch (founder, Chairman, and CTO of Schooner):

Tightly-coupled database architectures utilizing parallel synchronous replication exploiting commodity multi-core servers can achieve 99.999% availability with full data integrity; unlimited scaling with exceptional performance and high data consistency; and greatly simplified administration including instantaneous, automatic fail-over and on-line scaling and upgrades.

Tightly-coupled architectures and high availability used in the same phrase.
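For a sense of scale, here is what a 99.999% target leaves as a yearly downtime budget. This is plain arithmetic, not anything taken from Schooner's material; the function name and the non-leap-year assumption are mine.

```python
# Downtime budget implied by an availability target (illustrative arithmetic only).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def downtime_minutes_per_year(availability: float) -> float:
    """Return the maximum yearly downtime, in minutes, for a given availability fraction."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


for label, target in [("99.9%", 0.999), ("99.99%", 0.9999), ("99.999%", 0.99999)]:
    print(f"{label}: {downtime_minutes_per_year(target):.2f} minutes/year")
```

Five nines works out to a little over five minutes of downtime per year, which is why the fail-over and on-line upgrade claims carry most of the weight in that quote.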

Original title and link: How to Achieve the High Availability Imperative (NoSQL databases © myNoSQL)


Google Paper: Availability in Globally Distributed Storage Systems

Google paper presented at the 9th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2010:

Highly available cloud storage is often implemented with complex, multi-tiered distributed systems built on top of clusters of commodity servers and disk drives. Sophisticated management, load balancing and recovery techniques are needed to achieve high performance and availability amidst an abundance of failure sources that include software, hardware, network connectivity, and power issues. While there is a relative wealth of failure studies of individual components of storage systems, such as disk drives, relatively little has been reported so far on the overall availability behavior of large cloud-based storage services. We characterize the availability properties of cloud storage systems based on an extensive one year study of Google’s main storage infrastructure and present statistical models that enable further insight into the impact of multiple design choices, such as data placement and replication strategies. With these models we compare data availability under a variety of system parameters given the real patterns of failures observed in our fleet.

Original title and link: Google Paper: Availability in Globally Distributed Storage Systems (NoSQL databases © myNoSQL)


Clustrix: Distribution, Fault Tolerance, and Availability Models

Using a comparison with MongoDB as a pretext (why MongoDB?), Sergei Tsarev provides some details about Clustrix's data distribution, fault tolerance, and availability models.

At Clustrix, we think that Consistency, Availability, and Performance are much more important than Partition tolerance. Within a cluster, Clustrix keeps availability in the face of node loss while keeping strong consistency guarantees. But we do require that more than half of the nodes in the cluster group membership are online before accepting any user requests. So a cluster provides fully ACID compliant transactional semantics while keeping a high level of performance, but you need majority of the nodes online.
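The "more than half of the nodes" rule is a standard majority quorum. A minimal sketch of the check follows; the function names are hypothetical and this is not Clustrix code, just the arithmetic the quote implies.

```python
def has_quorum(online_nodes: int, cluster_size: int) -> bool:
    """True when strictly more than half of the cluster members are online."""
    if cluster_size <= 0 or not 0 <= online_nodes <= cluster_size:
        raise ValueError("expected 0 <= online_nodes <= cluster_size and cluster_size > 0")
    # Integer comparison avoids floating-point division.
    return 2 * online_nodes > cluster_size


def max_tolerated_failures(cluster_size: int) -> int:
    """Largest number of nodes that can be lost while a majority remains online."""
    if cluster_size <= 0:
        raise ValueError("cluster_size must be positive")
    return (cluster_size - 1) // 2


for size in (3, 4, 5):
    print(f"{size} nodes: tolerates {max_tolerated_failures(size)} failure(s)")
```

Note that exactly half is not enough: a 4-node cluster split 2/2 would accept requests on neither side. That is the price of giving up partition tolerance in exchange for consistency.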

Clustrix Distribution Model

Original title and link: Clustrix: Distribution, Fault Tolerance, and Availability Models (NoSQL databases © myNoSQL)


Neo4j High Availability Cluster

Neo4j uses the Master-Slave replication model. All writes must go through the master and the slaves will be read only. Changes performed on the master will be pushed out to the slaves when the logical log is rotated (based on configured size or invoking a method on the master).

The online backup utility used to synchronize a destination Neo4j database from a source Neo4j database can be used to emulate “high availability” (HA) having the master replicating changes to read only slaves.
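To make the replication timing concrete, here is a toy model of the behavior described above: writes land on the master only, and slaves see them once the logical log is rotated, either because it reached a configured size or because rotation was invoked explicitly. The `Master` and `Slave` classes and the `rotate_after` parameter are hypothetical illustrations, not the Neo4j API, and the log "size" is simply counted in entries.

```python
class Slave:
    """Read-only replica: it only changes when the master pushes log entries."""

    def __init__(self):
        self._data = {}

    def apply(self, entries):
        for key, value in entries:
            self._data[key] = value

    def read(self, key):
        return self._data.get(key)


class Master:
    """Accepts all writes and pushes them to the slaves on log rotation."""

    def __init__(self, slaves, rotate_after=3):
        self._data = {}
        self._log = []                     # pending entries, not yet replicated
        self._slaves = list(slaves)
        self._rotate_after = rotate_after  # stand-in for the configured log size

    def write(self, key, value):
        self._data[key] = value
        self._log.append((key, value))
        if len(self._log) >= self._rotate_after:
            self.rotate_log()

    def rotate_log(self):
        """Push pending entries to every slave, then start a fresh log."""
        for slave in self._slaves:
            slave.apply(self._log)
        self._log = []

    def read(self, key):
        return self._data.get(key)


slave = Slave()
master = Master([slave], rotate_after=2)
master.write("a", 1)
print(slave.read("a"))  # not replicated yet: the log has not rotated
master.write("b", 2)    # second entry reaches the threshold and triggers rotation
print(slave.read("a"))  # now visible on the slave
```

The gap between a write and the next rotation is the window in which slaves serve stale reads, and it is also what would be lost if the master failed before pushing the log.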

I didn’t know Neo4j supports a highly available setup. Since when?[1]

  1. The official documentation lists Oct. 22nd as its last modification time.

Original title and link: Neo4j High Availability Cluster (NoSQL databases © myNoSQL)