


high availability: All content tagged as high availability in NoSQL databases and polyglot persistence

Microsoft SQL Server 2012 High Availability Solutions

The recent announcement of the Microsoft SQL Server 2012 release emphasized the high availability features added to this version. Here is what I could find after some digging through the documentation:

  • AlwaysOn Failover Cluster Instances: As part of the SQL Server AlwaysOn offering, AlwaysOn Failover Cluster Instances leverages Windows Server Failover Clustering (WSFC) functionality to provide local high availability through redundancy at the server-instance level—a failover cluster instance (FCI). An FCI is a single instance of SQL Server that is installed across Windows Server Failover Clustering (WSFC) nodes and, possibly, across multiple subnets. On the network, an FCI appears to be an instance of SQL Server running on a single computer, but the FCI provides failover from one WSFC node to another if the current node becomes unavailable.

    This is explained in more detail on AlwaysOn Failover Cluster Instances (SQL Server).

  • AlwaysOn Availability Groups: The AlwaysOn Availability Groups feature is a high-availability and disaster-recovery solution that provides an enterprise-level alternative to database mirroring. Introduced in SQL Server 2012, AlwaysOn Availability Groups maximizes the availability of a set of user databases for an enterprise. An availability group supports a failover environment for a discrete set of user databases, known as availability databases, that fail over together. An availability group supports a set of read-write primary databases and one to four sets of corresponding secondary databases. Optionally, secondary databases can be made available for read-only access and/or some backup operations.

    More documentation about AlwaysOn Availability groups can be found here.

  • Database mirroring: This feature will be removed in a future version of Microsoft SQL Server.

  • Log shipping: SQL Server Log shipping allows you to automatically send transaction log backups from a primary database on a primary server instance to one or more secondary databases on separate secondary server instances.

    This is the well-known master-slave setup. More details can be found here.
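To make the log shipping flow concrete, here is a minimal in-memory sketch. It is not SQL Server's implementation or API: the `Primary`, `Secondary`, and `LogBackup` classes are hypothetical names, and real log shipping runs separate backup, copy, and restore jobs rather than one synchronous call. The sketch only illustrates that writes happen on the primary and that secondaries catch up by restoring log backups in order.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class LogBackup:
    sequence: int          # position of this backup in the log chain
    statements: list[str]  # changes captured since the previous backup


@dataclass
class Secondary:
    name: str
    last_restored: int = 0
    data: list[str] = field(default_factory=list)

    def restore(self, backup: LogBackup) -> None:
        # Log backups must be applied in order, with no gaps in the chain.
        if backup.sequence != self.last_restored + 1:
            raise ValueError(
                f"{self.name}: expected backup {self.last_restored + 1}, "
                f"got {backup.sequence}"
            )
        self.data.extend(backup.statements)
        self.last_restored = backup.sequence


@dataclass
class Primary:
    secondaries: list[Secondary]
    pending: list[str] = field(default_factory=list)
    next_sequence: int = 1

    def write(self, statement: str) -> None:
        # Writes are accepted only on the primary.
        self.pending.append(statement)

    def ship_log(self) -> LogBackup:
        # Back up the pending log records, then restore them on every secondary.
        backup = LogBackup(self.next_sequence, list(self.pending))
        self.pending.clear()
        self.next_sequence += 1
        for secondary in self.secondaries:
            secondary.restore(backup)
        return backup
```

Between two calls to `ship_log()` the secondaries lag behind the primary, which is the main trade-off of this setup compared to synchronous replication.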

It is also worth checking the availability of these features per SQL Server 2012 edition:

SQL Server 2012 High Availability

Original title and link: Microsoft SQL Server 2012 High Availability Solutions (NoSQL database©myNoSQL)

40% Penetration for NoSQL: An Interview With Basho's CEO Don Rippert

Don Rippert interviewed by Derrick Harris (GigaOm):

Enterprises will start adopting NoSQL en masse, Rippert thinks, because the types of data they’re now dealing with require new technologies. “We are the data store for the new type of data being stored,” he explained. […]

That data is largely of the unstructured variety coming from web applications, machines and other sources that aren’t the traditional business-transaction data for which relational databases were created. Relational databases were the answer to almost everything previously, but now Rippert thinks NoSQL is “the answer to about 40 percent of business use cases today”.

A couple of follow-up questions for Don Rippert[1]:

  1. Is your prediction of 40% market share relative to scenarios for large scale, unstructured data with high availability requirements? That would basically mean a 40% market share for just a couple of products: Cassandra, HBase, Riak, Project Voldemort, and (probably) Couchbase.

  2. How is the remaining 60% of the market divided between the other NoSQL databases, NewSQL databases, and the traditional relational databases?

  3. Considering the current market structure, when do you think the shift towards large scale, highly available requirements happened?

  4. How long do you think it will take the market to remodel? What factors will accelerate this transition?

  [1] I’d really appreciate it if someone could forward these questions to him.

Original title and link: 40% Penetration for NoSQL: An Interview With Basho’s CEO Don Rippert (NoSQL database©myNoSQL)


Improvements to Hadoop Availability

Six areas to improve Hadoop availability when dealing with common scenarios like host maintenance, configuration changes, software upgrades, and host failures:

A number of efforts are under way to improve Hadoop availability, and implement missing functionality required by the above use cases. Tasks related to HDFS availability are tracked here, tasks related to MapReduce availability are tracked here.

The Hadoop community is already working to improve Hadoop availability, and these features will be available sooner than Yahoo’s next generation of Hadoop MapReduce.

Original title and link: Improvements to Hadoop Availability (NoSQL databases © myNoSQL)


Google Megastore: Scalable, Highly Available Storage for Interactive Services

A new paper from Google:

Megastore blends the scalability of a NoSQL datastore with the convenience of a traditional RDBMS in a novel way, and provides both strong consistency guarantees and high availability.
We provide fully serializable ACID semantics within fine-grained partitions of data. This partitioning allows us to synchronously replicate each write across a wide area network with reasonable latency and support seamless failover between datacenters.
This paper describes Megastore’s semantics and replication algorithm.

Megastore seems to be the solution behind the Google App Engine high replication datastore.

Emphases are mine.

Original title and link: Google Megastore: Scalable, Highly Available Storage for Interactive Services (NoSQL databases © myNoSQL)


Xeround: MySQL Elastic, Always-on Storage Engine for the Cloud

Xeround is a new MySQL storage engine offered as Database-as-a-Service.

What it promises sounds (a bit?) too good to be true (nb: this list has been extracted from their site):

  • seamless replacement of existing MySQL database
  • high availability (including schema changes)
  • automatic fault-detection and recovery
  • full consistency with low latency
  • elasticity

What’s the catch?

Original title and link: Xeround: MySQL Elastic, Always-on Storage Engine for the Cloud (NoSQL databases © myNoSQL)

High Availability MySQL at Yahoo!

Jay Janssen talks about Yahoo!’s approach:

Now, what makes our solution different? Not much. The layout is this: two master databases, one in each of our two colocations. These masters replicate from each other, but we would never have more than two masters in this replication loop for the same reason we don’t use token ring networks today: one master outage would break replication in a chain of size > 2. Our slaves replicate from one of the two masters, often half of the slaves in a given colocation replicate from one of the masters, and half from the other master.

But there is much more in the original article (e.g. allowing writes to a single master, dealing with failure, etc.). There are also three slide decks on infrastructure resiliency, high availability/business continuity planning, and application resiliency.

Infrastructure resiliency at Yahoo

High availability/Business continuity planning at Yahoo

Application resiliency at Yahoo

It doesn’t sound as exciting as what Google or Facebook are doing, but it is probably something many could learn from.
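The layout from the quote can be sketched in a few lines of Python. This is an illustration under assumptions, not Yahoo!'s tooling: `assign_slaves` and `ring_is_healthy` are hypothetical helpers, and the ring model simply assumes each master replicates from the previous one and that a failed master stops relaying changes. The first function spreads each colocation's slaves evenly over the two masters; the second shows why a replication loop with more than two masters breaks when a single master goes down.

```python
def assign_slaves(slaves_by_colo, masters):
    """Alternate each colo's slaves across the masters so about half follow each."""
    assignment = {}
    for slaves in slaves_by_colo.values():
        for i, slave in enumerate(slaves):
            assignment[slave] = masters[i % len(masters)]
    return assignment


def ring_is_healthy(ring, failed=None):
    """Check that every surviving master still receives every other survivor's writes.

    Assumption: ring[i] replicates from ring[i - 1], and a failed master
    no longer passes changes downstream.
    """
    survivors = [m for m in ring if m != failed]
    for source in survivors:
        reached = set()
        i = ring.index(source)
        while True:
            # Follow the replication direction around the loop.
            i = (i + 1) % len(ring)
            node = ring[i]
            if node == failed or node == source:
                break
            reached.add(node)
        if reached != set(survivors) - {source}:
            return False
    return True
```

With two masters, losing one leaves a single survivor and nothing to break; with three, `ring_is_healthy(["m1", "m2", "m3"], failed="m2")` returns `False` because writes on `m1` can no longer reach `m3`.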

Original title and link for this post: High Availability MySQL at Yahoo! (published on the NoSQL blog: myNoSQL)