Google is using Cassandra to show the performance and cost efficiency of Google Compute Engine:
- sustain one million writes per second to Cassandra with a median latency of
10.3 ms and 95% completing under 23 ms
- sustain a loss of 1/3 of the instances and volumes and still maintain the 1
million writes per second (though with higher latency)
- scale up and down linearly so that the configuration described can be used
to create a cost effective solution
- go from nothing in existence to a fully configured and deployed instances
hitting 1 million writes per second took just 70 minutes. A configured
environment can achieve the same throughput in 20 minutes.
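A back-of-the-envelope sketch of what the second point means for each surviving node. The cluster size of 300 is a hypothetical number picked only for illustration, and the helper name is mine; the arithmetic simply assumes writes are spread evenly across nodes:

```python
def per_node_writes(total_writes_per_sec: float, nodes: int) -> float:
    """Average write rate each node must absorb, assuming an even spread."""
    return total_writes_per_sec / nodes


TOTAL_WRITES_PER_SEC = 1_000_000  # throughput target quoted above
NODES = 300                       # hypothetical cluster size (assumption)

# Load per node while the whole cluster is healthy.
healthy = per_node_writes(TOTAL_WRITES_PER_SEC, NODES)

# Lose one third of the instances and keep the same total throughput.
survivors = NODES - NODES // 3
degraded = per_node_writes(TOTAL_WRITES_PER_SEC, survivors)

print(f"healthy:  {healthy:,.0f} writes/s per node")
print(f"degraded: {degraded:,.0f} writes/s per node")
print(f"load increase per surviving node: {degraded / healthy:.2f}x")
```

Whatever the real node count, losing a third of the cluster while holding total throughput constant means every remaining node takes on roughly 1.5 times its previous load, which fits the higher latency mentioned in the quote.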
Make sure you check the charts and read through to the conclusion. The other conclusion I’d suggest is: based on the real benchmarks I’ve seen over the years, Cassandra is the only system that has been proven to scale linearly and provide top performance.
Original title and link: Cassandra hits 1 million writes per second on Google Compute Engine
The day after my Google Compute Engine and Data post, Derrick Harris writes for GigaOm:
Google’s Compute Engine cloud doesn’t yet have a Hadoop offering of its own,
but the platform is making a name for itself as a viable, if not ideal,
place to run big data workloads.
Original title and link: Maybe big data is the killer app for Google’s cloud
Since the GA announcement a couple of weeks ago, I’ve been noticing quite a few data-related posts on the Google Compute Engine blog:
If you look at these, you’ll notice a theme: covering data from every angle, with Cassandra/DSE from DataStax for OLTP, DataTorrent for stream processing, Qubole for Hadoop, and MapR for their Hadoop-like solution. I can see this continuing for a while and making Google Compute Engine a strong competitor to Amazon Web Services.
One question remains though: will they be able to come up with a good integration strategy for all these third-party tools?
Original title and link: Google Compute Engine and Data
On the Google Cloud Platform blog:
The guide walks you through creating your nodes (instances), setting up
Java, and creating and configuring a firewall. Included in the guide are
several scripts that make the configuration and setup easy to understand and
execute. Once you are finished with your cluster, a simple call to a
teardown script cleans up your project’s environment.
Can you speculate why Cassandra is the first NoSQL database that gets mentioned on Google’s blog? (hint: maybe this?)
Original title and link: Get up and Running with Cassandra on Google Compute Engine (©myNoSQL)