RDS: All content tagged as RDS in NoSQL databases and polyglot persistence

Storage technologies at HipChat - CouchDB, ElasticSearch, Redis, RDS

As the list below shows, HipChat’s storage layer combines several different technologies:

  • Hosting: AWS EC2 East with 75 instances, currently all Ubuntu 12.04 LTS
  • Database: CouchDB currently for Chat History, transitioning to ElasticSearch. MySQL-RDS for everything else
  • Caching: Redis
  • Search: ElasticSearch
  1. This post made me wonder what led the HipChat team to use CouchDB in the first place. I’m tempted to say that it was the master-master replication and the early integration with Lucene.
  2. This is only the second time in quite a while that I’ve read an article mentioning CouchDB, the first being the February “no-releases-but-we’re-still-merging-BigCouch” report for the ASF. And according to the story, CouchDB is on the way out.

Original title and link: Storage technologies at HipChat - CouchDB, ElasticSearch, Redis, RDS (NoSQL database©myNoSQL)


MySQL in the Cloud: Discontinuing of Xeround Cloud Database Public Service

Cloud and MySQL related:

We are deeply sorry to announce that Xeround’s public cloud offering will be discontinued soon. All Xeround FREE database instances will be terminated on May 8th, and the paid plans terminated on May 15th.

This was announced on May 1st.

✚ This only means more for Amazon RDS.

Original title and link: MySQL in the Cloud: Discontinuing of Xeround Cloud Database Public Service (NoSQL database©myNoSQL)


Amazon Web Services Annual Revenue Estimation

Over the weekend, Christopher Mims published an article in which he derives a figure for Amazon Web Services’ annual revenue, $2.4 billion:

Amazon is famously reticent about sales figures, dribbling out clues without revealing actual numbers. But it appears the company has left enough hints to, finally, discern how much revenue it makes on its cloud computing business, known as Amazon Web Services, which provides the backbone for a growing portion of the internet: about $2.4 billion a year.

There’s no way to break this number down into the revenue of each AWS service. In the data space, I’d be interested in:

  1. S3 revenues. This is the space Basho’s Riak CS competes in.

    After writing my first post about Riak CS, I learned that in Japan, where Riak CS powers Yahoo!’s new cloud storage, Gemini Mobile Technologies has been offering local ISPs a similar S3-like service built on top of Cassandra.

  2. Redshift is pretty new, and while I’m not aware of immediate competitors (what am I missing?), I don’t think it accounts for a significant part of this revenue, even if some of the early users, like Airbnb, report getting very good performance and costs from it.

    Redshift is powered by ParAccel, which was acquired by Actian over the weekend.

  3. Amazon Elastic MapReduce. This is another interesting space from which Microsoft wants a share with its Azure HDInsight developed in collaboration with Hortonworks.

    In this space there’s also the combination of MapR and Google Compute Engine, which seems to be extremely performant.

  4. Interestingly, Amazon also makes money from some of the competitors of its DynamoDB and RDS services. That is the advantage of owning the infrastructure.

Original title and link: Amazon Web Services Annual Revenue Estimation (NoSQL database©myNoSQL)

Provisioned IOPS for Amazon RDS

Werner Vogels:

Following the huge success of being able to provision a consistent, user-requested I/O rate for DynamoDB and Elastic Block Store (EBS), the AWS Database Services team has now released Provisioned IOPS, a new high performance storage option for the Amazon Relational Database Service (Amazon RDS). Customers can provision up to 10,000 IOPS (input/output operations per second) per database instance to help ensure that their databases can run the most stringent workloads with rock solid, consistent performance.

Amazon is the first company I know of championing guaranteed performance SLAs. Until recently, most SLAs covered availability, resilience, and redundancy. But soon performance-based SLAs will become the norm for other service providers. I’d also expect appliance vendors to be asked for similar guarantees sooner rather than later.

Original title and link: Provisioned IOPS for Amazon RDS (NoSQL database©myNoSQL)


Amazon RDS: The Good and Bad of Hosted MySQL

Mostly it’s good:

RDS is pretty awesome — it’s basically a highly available MySQL setup with backups and optional goodness like read-slaves. RDS is one of the best services as far as Amazon Webservices are concerned: 90% of what anyone would need from RDS, Amazon allows you to do with a couple clicks.


Aside from the monitoring and backup quirks, one of the real pain points of Amazon RDS is that a lot of the collective MySQL knowledge is not available to us. The knowledge which is manifested in books, blogs, various monitoring solutions and outstanding tools like Percona’s backup tools are not available to people who run Amazon RDS setups.

Most of the time, it is difficult to get access to your preferred tools from small service providers. Amazon cannot afford to support every available MySQL tool in its operations, but I’m pretty sure they have a prioritized list of the most requested ones.

Original title and link: Amazon RDS: The Good and Bad of Hosted MySQL (NoSQL database©myNoSQL)


99designs: Powered by Amazon RDS, Redis, MongoDB, and Memcached

While the authoritative storage is Amazon RDS, 99designs is using Redis, MongoDB, and Memcached for transient data:

We log errors and statistics to capped collections in MongoDB, providing us with more insight into our system’s performance. Redis captures per-user information about which features are enabled at any given time; it supports our development strategy around dark launches, soft launches and incremental feature rollouts.
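
To make the two ideas in the quote concrete, here is a minimal sketch in plain Python. It is not 99designs’ code and it does not talk to MongoDB or Redis: `CappedLog` and `FeatureFlags` are hypothetical in-memory stand-ins that only mimic the behaviour described (a fixed-size log that drops its oldest entries, and per-user feature switches with a percentage-based rollout). The hashing scheme used for the rollout buckets is an assumption, chosen only because it keeps each user in a stable bucket.

```python
from collections import deque
import hashlib


class CappedLog:
    """In-memory stand-in for a capped collection: fixed size, oldest entries dropped first."""

    def __init__(self, max_entries):
        self._entries = deque(maxlen=max_entries)

    def append(self, entry):
        # Once the deque is full, appending silently evicts the oldest entry.
        self._entries.append(entry)

    def recent(self):
        # Entries come back in insertion order, oldest first.
        return list(self._entries)


class FeatureFlags:
    """In-memory stand-in for the per-user feature information the post keeps in Redis."""

    def __init__(self):
        self._enabled_users = {}    # feature name -> set of user ids (dark/soft launch)
        self._rollout_percent = {}  # feature name -> 0..100 (incremental rollout)

    def enable_for(self, feature, user_id):
        self._enabled_users.setdefault(feature, set()).add(user_id)

    def set_rollout(self, feature, percent):
        self._rollout_percent[feature] = percent

    def is_enabled(self, feature, user_id):
        # Explicitly enabled users always see the feature.
        if user_id in self._enabled_users.get(feature, set()):
            return True
        # Hash feature and user together so each user lands in a stable bucket 0..99.
        digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
        bucket = int(digest, 16) % 100
        return bucket < self._rollout_percent.get(feature, 0)
```

With a real deployment, the explicit per-user set would live in the shared store so that every application server sees the same flags; the in-memory dictionaries here only illustrate the lookup logic.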

It’s also worth noting the nice things they say about using Amazon RDS:

An RDS instance configured to use multiple availability zones provides master-master replication, providing crucial redundancy for our DB layer. This feature has already saved our bacon multiple times: the fail over has been smooth enough that by the time we realised anything was wrong, another master was correctly serving requests. Its rolling backups provide a means of disaster recovery. We load-balance reads across multiple slaves as a means of maintaining performance as the load on our database increases.
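
The read load-balancing mentioned at the end of the quote can be sketched as a small router that sends writes to the primary and spreads reads across replicas in round-robin order. This is only an illustration under stated assumptions: `ReadWriteRouter` and the endpoint names are hypothetical, the router returns endpoint names rather than opening connections, and treating every statement that starts with `SELECT` as a read is a simplification.

```python
import itertools


class ReadWriteRouter:
    """Picks an endpoint per statement: reads rotate over replicas, writes go to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        # itertools.cycle gives round-robin iteration; None means "no replicas configured".
        self._cycle = itertools.cycle(self.replicas) if self.replicas else None

    def endpoint_for(self, sql):
        # Simplification: anything starting with SELECT counts as a read.
        is_read = sql.lstrip().lower().startswith("select")
        if is_read and self._cycle is not None:
            return next(self._cycle)
        # Writes, and reads when no replica exists, fall back to the primary.
        return self.primary


router = ReadWriteRouter("primary.example", ["replica-1.example", "replica-2.example"])
```

Because replicas receive changes asynchronously, a read routed this way may briefly return stale data; reads that must see the latest write would still need to go to the primary.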

Original title and link: 99designs: Powered by Amazon RDS, Redis, MongoDB, and Memcached (NoSQL database©myNoSQL)


MongoDB vs MySQL: A DevOps point of view

Pierre Bailet and Mathieu Poumeyrol of fotopedia (a French photo site) share their experience of operating a small MongoDB cluster since September 2009, compared to a MySQL cluster.

Some details about fotopedia:

  • fotopedia is 100% on AWS
  • Amazon RDS for MySQL
  • 4-node MongoDB cluster
  • 150 million photo views

MongoDB advantages:

  • no alter table
  • background index creation
  • data backup & restoration
    • note: as far as I can tell MySQL is able to do the same
  • replica sets
  • hardware migration
    • note: the same procedure can be used for MySQL

Before leaving you with the slides, here is an interesting accepted trade-off:

Quietly losing seconds of writes is preferable to:

  • weekly minutes-long maintenance periods
  • minutes-long unscheduled downtime and manual failover in case of hardware failures

Get them by the data

Gavin Clarke and Chris Mellor about AWS Storage Gateway:

Once you’ve got them by the data, of course, their hearts and minds will follow, and Amazon’s using the AWS Storage Gateway beta as a sampler for the rest of its compute cloud.

The Storage Gateway is another piece, together with S3, DynamoDB, SimpleDB, and Elastic MapReduce, in Amazon’s grand strategic puzzle of a complete polyglot platform.

Original title and link: Get them by the data (NoSQL database©myNoSQL)


Amazon RDS TPC-C Benchmark

A paper by Md. Borhan Uddin, Bo He, and Radu Sion:

Experiments were performed to benchmark the Amazon Relational Database Service (RDS) within a TPC-C benchmarking framework. The TPC-C benchmark is one of the most widely adopted database performance benchmarking frameworks comparing OLTP performance of online transaction processing systems. Two types of Amazon RDS services were tested, namely the standard RDS (single availability zone) and the Multi-AZ RDS (synchronous ‘standby’ replica in multiple availability zones). For each service type, five different RDS instances were tested: Small, Large, Extra Large (XLarge), Double Extra Large (2XLarge), and Quadruple Extra Large (4XLarge).

Results are interesting to say the least:

Overall, we observed that at a very low load, the resulting throughput was also relatively low; at medium load, the throughput increased to a peak; at very high loads, the throughput decreased again.

The paper is available online, as are some independent comments on the results (independent in the sense that they come not from the authors, but from a company offering a scaling solution for MySQL).

Original title and link: Amazon RDS TPC-C Benchmark (NoSQL database©myNoSQL)