EBS: All content tagged as EBS in NoSQL databases and polyglot persistence

Deploying Riak on EC2 - What to Pick?

Deepak Bala shares his recommendations for running Riak on EC2, based on his own experience:

There are a couple of problems to field when deploying Riak.

  1. The EC2 instances that are provisioned by default change the following on restart.

    • Private IP address
    • Public IP address
    • Private DNS
    • Public DNS
    • EBS instances provide stable durable storage while Ephemeral storage provides for better predictable performance at the cost of losing data on restarts.
  2. Performance.

Original title and link: Deploying Riak on EC2 - What to Pick? (NoSQL database©myNoSQL)


Amazon EBS, SSD, and Rackspace IOPS Per Dollar

Staying on the subject of IOPS in the cloud, Jeff Darcy did some testing with GlusterFS against Amazon EBS, Amazon SSD, Storm on Demand SSD, and Rackspace instance storage, and computed IOPS/$ for each:

  • Amazon EBS: 1000 IOPS (provisioned) for $225/month or 4.4 IOPS/$ (server not included)
  • Amazon SSD: 4300 IOPS for $4464/month or 1.0 IOPS/$ (that’s pathetic)
  • Storm on Demand SSD: 5500 IOPS for $590/month or 9.3 IOPS/$
  • Rackspace instance storage: 3400 IOPS for $692/month (8GB instances) or 4.9 IOPS/$
  • Rackspace with 4x block storage per server: 9600 IOPS for $811/month or 11.8 IOPS/$ (hypothetical, assuming CPU or network don’t become bottlenecks)
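
The ratios in the list above are simply IOPS divided by monthly price. A minimal sketch that recomputes them from the quoted figures (the dictionary keys and the function name are illustrative, not from Jeff Darcy's post):

```python
# IOPS and monthly price in dollars, as quoted in the list above.
offers = {
    "Amazon EBS (provisioned)": (1000, 225),
    "Amazon SSD": (4300, 4464),
    "Storm on Demand SSD": (5500, 590),
    "Rackspace instance storage": (3400, 692),
    "Rackspace, 4x block storage": (9600, 811),
}


def iops_per_dollar(iops, monthly_cost):
    """Return the IOPS obtained per dollar of monthly spend."""
    return iops / monthly_cost


# Round to one decimal place, matching the precision used in the list.
ratios = {
    name: round(iops_per_dollar(iops, cost), 1)
    for name, (iops, cost) in offers.items()
}

# Print the offers from best to worst value.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {ratio} IOPS/$")
```

Note that the EBS figure excludes the server, so the comparison is not strictly like for like.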


I/O Intensive Apps and Amazon Cloud Improvements: EBS Provisioned IOPS & Optimized Instance Types

James Hamilton puts in perspective the last two new I/O related features coming from Amazon: the high performance I/O EC2 instances and EBS provisioned IOPS together with EBS-optimized EC2 instances:

With the announcement today, EC2 customers now have access to two very high performance storage solutions. The first solution is the EC2 High I/O Instance type announced last week which delivers a direct attached, SSD-powered 100k IOPS for $3.10/hour. In today’s announcement this direct attached storage solution is joined by a high-performance virtual storage solution. This new type of EBS storage allows the creation of striped storage volumes that can reliably deliver 10,000 to 20,000 IOPS across a dedicated virtual storage network.

I’ve already said it, but this confirms once again that Amazon is addressing most of the complaints about running I/O-intensive applications on EC2 and EBS.
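
As a rough back-of-the-envelope sketch of what "striped storage volumes" implies: if each provisioned volume delivers 1,000 IOPS (an assumption here, borrowed from the provisioned EBS figure in the IOPS-per-dollar post above) and striping scaled perfectly linearly (also an assumption; real stripes rarely do), the number of volumes needed would be:

```python
import math


def volumes_needed(target_iops, iops_per_volume=1000):
    """Number of volumes to stripe, assuming ideal linear scaling.

    iops_per_volume defaults to 1000, which is an assumption and not
    a figure taken from the announcement quoted above.
    """
    return math.ceil(target_iops / iops_per_volume)


for target in (10_000, 20_000):
    print(target, "IOPS ->", volumes_needed(target), "volumes")
```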


The Total Cost of (Non) Ownership of a NoSQL Database Service

The Amazon team released a whitepaper comparing the total cost of ownership for 3 scenarios:

  1. on-premise NoSQL database
  2. NoSQL database deployed on Amazon EC2 and Amazon EBS
  3. Amazon DynamoDB

The Total Cost of Ownership of a NoSQL Database service

As you can imagine, DynamoDB comes out as the most cost-effective solution (79% more cost-effective than the on-premise NoSQL database and 61% more cost-effective than the AWS-hosted NoSQL database). Read or download the paper after the break.

MongoDB and Amazon Elastic Block Storage (EBS)

The topic of running MongoDB on Amazon Web Services using Elastic Block Storage came up again among the 10 tips for running MongoDB from Engine Yard:

you should know that the performance of Amazon’s Elastic Block Storage (EBS) can be inconsistent.

Following up on that Mahesh P-Subramanya aptly added:

Indeed!  I’d actually take it a step further and say Do not use EBS in any environment where reliability and/or performance characteristics of your disk-access are important.  Or, to put it differently, asynchronous backups - OK, disk-based databases - Not So Much.  

Interestingly though, some presentations earlier this year (MongoDB in the Amazon Cloud and Running MongoDB on the Cloud) left me, and others, with the impression that EBS should not be dismissed so fast.

MongoDB and Amazon: Why EBS?

After I linked to MongoDB in the Amazon Cloud, MongoDB and EC2, and the older MongoDB on Amazon EC2 with EBS volumes, Arnout Kazemier commented:

The only thing I dislike about that EC2 guide is that it’s suggesting to use EBS instead of the regular EC2 instance storage

This is an apt question in light of the prolonged Amazon outage, Reddit’s experience with EBS, the unpredictable EBS performance, and the explanation by Netflix’s Adrian Cockcroft of the impact of multi-tenancy on Amazon EBS performance. Maybe someone could answer it.

HBase on EC2 using EBS volumes : Lessons Learned

There lies the answer! We have a requirement of recreating the cluster in case we accidentally delete all the data or if we lose our master. In such a case a reliable backup can only be taken if your HDFS data does not reside on the root devices. A reliable backup of the root device cannot be taken without rebooting the device. Furthermore, it’s stored as an AMI, which means you have to create a new AMI every day and delete the old one. This means that to solve all of our problems we need both the HBase installation and the data stored on attached EBS volumes that are not the root devices.
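
One way the backup of non-root volumes described above might be scripted is with EBS snapshots. The following is a minimal sketch, not the post author's procedure: the function name and the `label` parameter are hypothetical, and `ec2_client` is assumed to behave like a boto3 EC2 client, whose `create_snapshot` call returns a dictionary containing `SnapshotId`. The sketch does not quiesce HBase or HDFS before snapshotting, which a real backup would need to consider.

```python
from datetime import date


def snapshot_data_volumes(ec2_client, volume_ids, label="hbase-backup"):
    """Request one EBS snapshot per data volume and return the snapshot IDs.

    ec2_client is assumed to expose create_snapshot(VolumeId=..., Description=...)
    in the way boto3.client("ec2") does.
    """
    snapshot_ids = []
    for volume_id in volume_ids:
        response = ec2_client.create_snapshot(
            VolumeId=volume_id,
            # The description format is an arbitrary choice for this sketch.
            Description=f"{label} {volume_id} {date.today().isoformat()}",
        )
        snapshot_ids.append(response["SnapshotId"])
    return snapshot_ids
```

With boto3 installed and credentials configured, this could be called as `snapshot_data_volumes(boto3.client("ec2"), ["vol-0123456789abcdef0"])`, where the volume ID is a made-up placeholder.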

Update: after reading the post both Bradford Stephens[1] and Andrew Purtell[2] recommended using instance store instead of EBS:

EBS adds complexity, failure risk, and cost

  1. CEO of Drawn to Scale  

  2. Systems architect and HBase committer, @akpurtell  


Membase on Amazon EC2 with EBS

The decision was made and we decided to go with a 2-server solution; each server has 16G of memory and a 100G EBS volume attached to it.

Both will have the latest stable version of Membase installed and run as a cluster in case one fails or anything happens, a fail-safe if you will.

In this post, I will walk you through what was done to achieve this and exactly how it was done on the Amazon cloud.

Wouldn’t it be easier if there would be an always up-to-date official Membase AMI and the corresponding guide (making sure important details about EBS are not left out)?


Neo4j REST Server Image in Amazon EC2

OpenCredo created it, Jussi Heinonen shares the details:

Neo4j EC2 Components Image


Multi-tenancy and Cloud Storage Performance

Adrian Cockcroft[1] has a great explanation of the impact of multi-tenancy on cloud storage performance. The connection with NoSQL databases is not necessarily in the Amazon EBS and SSD Price, Performance, QoS comparison, but:


If you ever see public benchmarks of AWS that only use m1.small, they are useless, it shows that the people running the benchmark either didn’t know what they were doing or are deliberately trying to make some other system look better. You cannot expect to get consistent measurements of a system that has a very high probability of multi-tenant interference.

  1. Adrian Cockcroft: Netflix, @adrianco  


MongoDB in the Amazon Cloud

A discussion on the MongoDB group about EBS snapshot backups of journaled MongoDB reminded me of Jared Rosoff’s slides “MongoDB on EC2 and EBS”, which cover many important aspects of running MongoDB on the Amazon cloud:

  • MongoDB components and their requirements

  • deployment options and corresponding Amazon EC2 instance types

  • operating systems, specific configurations, and operational advice:

    • deployment automation
    • backups and restoration
    • security
  • deployment scenarios:

    • 3-node replica set
    • 2-nodes + arbiter
    • multi-datacenter (availability zone) 3-node replica set
    • sharded MongoDB
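
The replica set scenarios in the list above can be sketched as configuration documents. This is a minimal sketch and is not taken from the slides: the set name `rs0`, the helper name, and the `example.internal` host names are hypothetical. The helper only builds a document in the shape MongoDB's `replSetInitiate` command expects; it does not connect to anything.

```python
def replica_set_config(name, data_hosts, arbiter_host=None):
    """Build a replica set configuration document.

    data_hosts are data-bearing members; arbiter_host, if given, is added
    as a voting member that holds no data (arbiterOnly).
    """
    members = [{"_id": i, "host": host} for i, host in enumerate(data_hosts)]
    if arbiter_host is not None:
        members.append(
            {"_id": len(members), "host": arbiter_host, "arbiterOnly": True}
        )
    return {"_id": name, "members": members}


# 3-node replica set: three data-bearing members.
three_node = replica_set_config(
    "rs0",
    [
        "mongo-a.example.internal:27017",
        "mongo-b.example.internal:27017",
        "mongo-c.example.internal:27017",
    ],
)

# 2 nodes + arbiter: two data-bearing members plus a tie-breaking arbiter.
two_plus_arbiter = replica_set_config(
    "rs0",
    ["mongo-a.example.internal:27017", "mongo-b.example.internal:27017"],
    arbiter_host="mongo-arb.example.internal:27017",
)
```

For the multi-availability-zone variant, the same three-member document would apply, with each host placed in a different zone.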

While tempting, running databases in the cloud is not as simple as Amazon makes it sound. Reddit felt that with their Cassandra and PostgreSQL deployment.
