cloud computing: All content tagged as cloud computing in NoSQL databases and polyglot persistence
Wednesday, 14 March 2012
Hadoop in the Cloud: Pros and Cons
Steve Loughran covers the arguments for and against running Hadoop in a cloud environment:
- If your data is stored in a cloud provider’s storage infrastructure, doing the analysis locally is the only rational action. It’s that “work near the data” philosophy.
- If you are only doing some computation, say nightly, then you can rent some cluster time. Even if compute performance is worse, you can just rent some more machines to compensate.
- You may be able to achieve better security through isolation of clusters (depends on your IaaS vendor’s abilities).
- No upfront capex; fund from ongoing revenue.
- Easier to expand your cluster; no need to buy more racks, find more rack space.
- You don’t need to care about the problems of networking.
- Less of a problem with heterogeneous clusters if you expand later.
Interestingly, the list of counter-arguments is much shorter, and the important bit, further detailed in the post, is: “Hadoop contains lots of assumptions about running in a static infrastructure; its scheduling and recovery algorithms assume this.”
Original title and link: Hadoop in the Cloud: Pros and Cons (©myNoSQL)
via: http://steveloughran.blogspot.com/2012/03/hadoop-in-cloud-infrastructures.html
Tuesday, 27 September 2011
OpenStack-based SDSC Cloud Storage Services
The San Diego Supercomputer Center (SDSC) at the University of California, San Diego announced a cloud storage solution based on OpenStack Swift Object Storage:
SDSC’s Cloud Storage provides academic and industry users with a convenient and affordable way to store, share, and archive data, including extremely large data sets. The object based storage system and multiple interface methods make the SDSC Cloud easy to use for the average user, but also provide a flexible, configurable, and expandable solution to meet the needs of more demanding applications.
Check out the project homepage for a short description of this new cloud offering’s characteristics.
Thursday, 22 September 2011
Tanuki: A 30000 Cores AWS Cluster
Sometimes the only valid comment is wow.
We have now launched a cluster 3 times the size of Tanuki, or 30,000 cores, which cost $1279/hour to operate for a Top 5 Pharma. It performed genuine scientific work — in this case molecular modeling — and a ton of it. The complexity of this environment did not necessarily scale linearly with the cores.
In fact, we had to implement a triad of features within CycleCloud to make it a reality:
- MultiRegion support: To achieve the mind boggling core count of this cluster, we launched in three distinct AWS regions simultaneously, including Europe.
- Massive Spot instance support: This was a requirement given the potential savings at this scale by going through the spot market. Besides, our scheduling environment and the workload had no issues with the possibility of early termination and rescheduling.
- Massive CycleServer monitoring & Grill GUI app for Chef monitoring: There is no way that any mere human could keep track of all of the moving parts on a cluster of this scale.
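For a sense of scale, the quoted figures work out to roughly 4 cents per core-hour. A quick back-of-the-envelope sketch, using only the numbers from the quote above (the variable names are mine, not CycleCloud's):

```python
# Back-of-the-envelope math using only the figures quoted above.
total_cores = 30_000        # size of the cluster, as quoted
cost_per_hour_usd = 1279    # quoted hourly cost of the whole cluster

# The cluster is described as 3 times the size of Tanuki.
tanuki_cores = total_cores // 3

# Hourly cost of the whole cluster spread evenly over its cores.
cost_per_core_hour = cost_per_hour_usd / total_cores

print(f"Tanuki size: {tanuki_cores} cores")
print(f"Cost per core-hour: ${cost_per_core_hour:.4f}")
```

This ignores anything not mentioned in the quote, such as storage or data transfer charges.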
Facebook runs a 30PB Hadoop analytic data warehouse and Yahoo! has a 100,000 cores/40,000 machines Hadoop cluster. I’m wondering what the largest Amazon Elastic MapReduce jobs ever run are. Any ideas?
Sunday, 11 September 2011
Will Oracle Win the NoSQL Competition
I agree this title is misleading, but the problem is clear: today Oracle does not provide any product that can compete with the new cloud computing needs and with the NoSQL movement. It is not possible to think that Oracle’s RAC technology can actually be used in a cloud environment, and a cloud service cannot be deployed over an Exadata either.
The real question though is whether Oracle is really interested in the market currently served by NoSQL databases and/or hybrid solutions. And judging by the latest versions of MySQL and MySQL Cluster[1], it looks like they are testing the waters.
[1] Latest versions of MySQL and MySQL Cluster are adding support for using the Memcached protocol. See NoSQL to MySQL with Memcached.
via: http://www.stefanocislaghi.eu/2011/09/will-oracle-win-the-nosql-competition/
Saturday, 27 August 2011
Running MongoDB on the Cloud
I’ve been posting a lot about deployments in the cloud and especially about deploying MongoDB in the Amazon cloud:
- MongoDB on Amazon EC2 with EBS Volumes
- MongoDB on EC2
- MongoDB in the Amazon Cloud
- Setting Up MongoDB Replica Sets on Amazon EC2
- MongoDB and Amazon: Why EBS?
- Amazon EBS vs SSD: Price, Performance, QoS
- Multi-tenancy and Cloud Storage Performance
In this video Jared Rosoff covers topics like scaling and performance characteristics of running MongoDB in the cloud and he also shares some best practices when using Amazon EC2.
Tuesday, 23 August 2011
Memcached in the Cloud: Amazon ElastiCache
Amazon announced today a new service: Amazon ElastiCache, or Memcached in the cloud. The new service is still in beta and available only in the US East (Virginia) Region.
While many will find this new service useful, it is a bit of a disappointment that Amazon took the safe route and went with pure Memcached. The only notable feature of Amazon ElastiCache is automatic failure detection and recovery. But compared with Membase (and the soon to be released Couchbase 2.0) it is missing clustering, replication, support for virtual nodes, etc. Even though it advertises push-button scaling, ElastiCache will lose cached data when instances are added or removed.
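Why would resizing a pool of cache nodes throw away cached data? Nothing above says how ElastiCache clients map keys to nodes, so what follows is only a sketch of one common cause, assuming clients pick a node with naive modulo hashing. The `node_for` and `moved_fraction` helpers and the key names are made up for illustration. It counts how many keys land on a different node when a pool grows from 4 to 5 nodes:

```python
import hashlib

def node_for(key: str, node_count: int) -> int:
    """Pick a node index with naive modulo hashing (an assumption, see above)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % node_count

def moved_fraction(keys, old_count: int, new_count: int) -> float:
    """Fraction of keys whose node changes when the pool is resized."""
    moved = sum(
        1 for k in keys if node_for(k, old_count) != node_for(k, new_count)
    )
    return moved / len(keys)

# Hypothetical cache keys, only used to exercise the hashing scheme.
keys = [f"user:{i}" for i in range(10_000)]
fraction = moved_fraction(keys, old_count=4, new_count=5)
print(f"{fraction:.0%} of keys map to a different node after adding one")
```

Under this assumption most keys end up on a different node, so the values cached for them become unreachable and have to be fetched again. Consistent hashing on the client side reduces the reshuffling, but it is not a substitute for replication.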
The pace at which Amazon is launching new services is indeed impressive. I’m wondering what will be the first NoSQL database that will get official Amazon support.
Monday, 22 August 2011
Reliable, Scalable, and Kinda Sorta Cheap: A Cloud Hosting Architecture for MongoDB
Using MongoDB replica sets:
At Famigo, we house all of our valuable data in MongoDB and we also serve all requests from Amazon EC2 instances. We’ve devoted many mental CPU cycles to finding the right architecture for our data in the cloud, focusing on 3 main factors: cost, reliability, and performance.
via: http://www.codypowell.com/taods/2011/08/a-cloud-hosting-architecture-for-mongodb.html
Friday, 12 August 2011
Data Integrity in the Cloud
Chris Marsh:
Cloud storage can be an attractive means of outsourcing the day-to-day management of data, but ultimately the responsibility and liability for that data falls on the company that owns the data, not the hosting provider. With this in mind, it is important to understand some of the causes of data corruption, how much responsibility a cloud service provider holds, some basic best practices for utilizing cloud storage safely, and some methods and standards for monitoring the integrity of data regardless of whether that data resides locally or in the cloud.
This reminded me of how the Adobe SaaS Infrastructure Team has tested HBase.
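One of the simplest ways to monitor integrity, wherever the data lives, is to record a checksum before handing the data over and compare it when the data comes back. A minimal sketch, assuming SHA-256 as the digest. The helper names and the sample payloads are hypothetical, and the actual upload and download steps are left out:

```python
import hashlib

def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()

def is_intact(data: bytes, expected: str) -> bool:
    """Compare the digest of the data with a previously recorded one."""
    return checksum(data) == expected

# Record the digest before the data leaves your hands,
# and keep it somewhere other than next to the data itself.
original = b"customer records, 2011-08-12"
recorded = checksum(original)

# Later, after reading the data back, verify it against the recorded digest.
unchanged_copy = b"customer records, 2011-08-12"
corrupted_copy = b"customer records, 2011-08-13"
print(is_intact(unchanged_copy, recorded))
print(is_intact(corrupted_copy, recorded))
```

A checksum only tells you that something changed, not what or why, so it complements backups rather than replacing them.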
Saturday, 6 August 2011
MongoDB Positioning: Big Data and Development Agility
Max Schireson positions MongoDB as a solution for Big Data and development agility:
Monday, 11 July 2011
The Server Architecture Debate Rages On
Big processors or little processors, scale-up or scale-out, on-premise or in the cloud […] The plethora of choices for application architecture and delivery model are great if you like variety, but I don’t envy anyone tasked with choosing which system on which to spend their limited budget dollars.
Too few options are bad[1]. Too many options are paralyzing[2]. Then what’s the solution? I think the only answer is to build experience: by trying, failing, learning, and sharing with everyone else.
via: http://gigaom.com/cloud/the-server-architecture-debate-rages-on/
Wednesday, 6 July 2011
MongoDB Hosting Matrix
Maurício Maia put together a price comparison for MongoDB hosting:
There’s also a matrix of MongoDB hosting features. While far from exhaustive, these MongoDB hosting matrices are meant to give you an idea of what options are out there.
Unfortunately, they don’t include MongoDB hosting on:
Each of these offers a free plan, but the pricing will depend on many factors. On the other hand, they also offer application hosting, which basically means colocating your app and data, and that is better than putting the whole internet between the two.
