Redshift: All content tagged as Redshift in NoSQL databases and polyglot persistence
Over the weekend, Christopher Mims published an article in which he derives a figure of $2.4 billion for Amazon Web Services’ annual revenue:
Amazon is famously reticent about sales figures, dribbling out clues without revealing actual numbers. But it appears the company has left enough hints to, finally, discern how much revenue it makes on its cloud computing business, known as Amazon Web Services, which provides the backbone for a growing portion of the internet: about $2.4 billion a year.
There’s no way to decompose this number into the revenue of each AWS solution. For the data space I’d be interested in:
S3 revenues. This is the space Basho’s Riak CS competes in.
After writing my first post about Riak CS, I learned that in Japan, where Riak CS powers Yahoo!’s new cloud storage, Gemini Mobile Technologies has been offering local ISPs a similar S3-like service built on top of Cassandra.
Redshift is pretty new and, while I’m not aware of immediate competitors (what am I missing?), I don’t think it accounts for a significant part of this revenue, even if some of the early users, like Airbnb, report getting very good performance and costs from it.
Redshift is powered by ParAccel, which was acquired by Actian over the weekend.
Amazon Elastic MapReduce. This is another interesting space, one in which Microsoft wants a share with its Azure HDInsight, developed in collaboration with Hortonworks.
Interestingly, Amazon also makes money from some of the competitors of its Amazon DynamoDB and RDS services. That is the advantage of owning the infrastructure.
Original title and link: Amazon Web Services Annual Revenue Estimation ( ©myNoSQL)
As shown above the performance gain is pretty significant, and the cost saving is even more impressive: $13.60/hour versus $57/hour. This is hard to compare due to the different pricing models, but check out pricing here for more info. In fact, our analysts like Redshift so much that they don’t want to go back to Hive and other tools even though a few key features are lacking in Redshift. Also, we have noticed that big joins of billions of rows tend to run for a very long time, so for that we’d go back to hadoop for help.
If I’m not mistaken, this is the second story in the last week about the performance of Redshift. But here’s something I don’t understand (or I don’t see mentioned in this post):
- you use Hadoop to store your data. The reason is that 12 months ago, 6 months ago (and today) there was no more cost-effective and productive solution.
- in this time you learn about the data. You develop models and queries
- your analysts prefer SQL because that’s what makes them more productive
- you take the data and the knowledge you’ve built in this time, and you craft them to fit into a columnar analytic database
- then you write that the columnar analytic database is more performant than using Hive over Hadoop
To me this feels like saying that you are more efficient in your mother tongue than in a foreign language. Or am I missing something?
Original title and link: Redshift Performance & Cost at Airbnb ( ©myNoSQL)