cloud computing: All content tagged as cloud computing in NoSQL databases and polyglot persistence
Tuesday, 27 September 2011
OpenStack-based SDSC Cloud Storage Services
The San Diego Supercomputer Center (SDSC) at the University of California, San Diego announced a cloud storage solution based on OpenStack Swift Object Storage:
SDSC’s Cloud Storage provides academic and industry users with a convenient and affordable way to store, share, and archive data, including extremely large data sets. The object based storage system and multiple interface methods make the SDSC Cloud easy to use for the average user, but also provide a flexible, configurable, and expandable solution to meet the needs of more demanding applications.
Check out the project homepage for a short description of this new cloud offering's characteristics.
Original title and link: OpenStack-based SDSC Cloud Storage Services (©myNoSQL)
Friday, 23 September 2011
Tanuki: A 30000 Cores AWS Cluster
Sometimes the only valid comment is wow.
We have now launched a cluster 3 times the size of Tanuki, or 30,000 cores, which cost $1279/hour to operate for a Top 5 Pharma. It performed genuine scientific work — in this case molecular modeling — and a ton of it. The complexity of this environment did not necessarily scale linearly with the cores.
In fact, we had to implement a triad of features within CycleCloud to make it a reality:
- MultiRegion support: To achieve the mind boggling core count of this cluster, we launched in three distinct AWS regions simultaneously, including Europe.
- Massive Spot instance support: This was a requirement given the potential savings at this scale by going through the spot market. Besides, our scheduling environment and the workload had no issues with the possibility of early termination and rescheduling.
- Massive CycleServer monitoring & Grill GUI app for Chef monitoring: There is no way that any mere human could keep track of all of the moving parts on a cluster of this scale.
Facebook runs a 30PB Hadoop analytic data warehouse and Yahoo! has a Hadoop cluster with 100,000 cores across 40,000 machines. I’m wondering what the largest Amazon Elastic MapReduce jobs ever run are. Any ideas?
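As a back-of-the-envelope check on the quoted figures, the sketch below divides the hourly cluster cost by the core count. The function name is hypothetical, and the result ignores anything the quote does not state (storage, data transfer, spot price fluctuations across regions).

```python
def cost_per_core_hour(cluster_cost_per_hour: float, cores: int) -> float:
    """Return the average hourly cost of a single core."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return cluster_cost_per_hour / cores


# Figures from the quote above: $1279/hour for 30,000 cores.
per_core = cost_per_core_hour(1279, 30_000)
print(f"${per_core:.4f} per core-hour")
```

That works out to a bit over 4 cents per core-hour, which hints at why the spot market was called a requirement at this scale.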
Monday, 12 September 2011
Will Oracle Win the NoSQL Competition
I agree this title is misleading but the problem is clear: today Oracle does not provide any product that can compete with new cloud computing needs and with the NoSQL movement. It is not possible to think that Oracle's RAC technology can actually be used in a cloud environment, and a cloud service cannot be deployed over an Exadata either.
The real question, though, is whether Oracle is really interested in the market currently served by NoSQL databases and/or hybrid solutions. Judging by the latest versions of MySQL and MySQL Cluster[1], it looks like they are testing the waters.
[1] The latest versions of MySQL and MySQL Cluster are adding support for the Memcached protocol. See NoSQL to MySQL with Memcached.
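For readers who have not seen the Memcached protocol on the wire, here is a minimal sketch of what a client sends for the two most basic commands in the text protocol. The helper names and the `user:42` key are made up for illustration, and how MySQL maps such keys onto tables is not covered here.

```python
def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Encode a text-protocol 'set': a header line followed by the data block."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key: str) -> bytes:
    """Encode a text-protocol 'get' for a single key."""
    return f"get {key}\r\n".encode("ascii")


# Hypothetical key and value, shown as raw bytes.
print(build_set("user:42", b"alice"))
print(build_get("user:42"))
```

Any server that speaks this protocol should accept these bytes over a plain TCP connection, which is what makes the protocol attractive as a simple key/value front end.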
via: http://www.stefanocislaghi.eu/2011/09/will-oracle-win-the-nosql-competition/
Saturday, 27 August 2011
Running MongoDB on the Cloud
I’ve been posting a lot about deployments in the cloud and especially about deploying MongoDB in the Amazon cloud:
- MongoDB on Amazon EC2 with EBS Volumes
- MongoDB on EC2
- MongoDB in the Amazon Cloud
- Setting Up MongoDB Replica Sets on Amazon EC2
- MongoDB and Amazon: Why EBS?
- Amazon EBS vs SSD: Price, Performance, QoS
- Multi-tenancy and Cloud Storage Performance
In this video, Jared Rosoff covers the scaling and performance characteristics of running MongoDB in the cloud and shares some best practices for using Amazon EC2.
Tuesday, 23 August 2011
Memcached in the Cloud: Amazon ElastiCache
Amazon announced today a new service, Amazon ElastiCache, or Memcached in the cloud. The new service is still in beta and available only in the US East (Virginia) Region.
While many will find this new service useful, it is a bit of a disappointment that Amazon took the safe route and went with pure Memcached. The only notable feature of Amazon ElastiCache is automatic failure detection and recovery. Compared with Membase (and the soon to be released Couchbase 2.0), it is missing clustering, replication, support for virtual nodes, etc. Even though it advertises push-button scaling, ElastiCache will lose cached data when instances are added or removed.
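One common reason a plain Memcached pool effectively loses data on resize is client-side key distribution. The sketch below assumes clients pick a node with naive modulo hashing; that is an assumption made for illustration, not something stated about ElastiCache or any particular client library. It measures how many keys land on a different node when a pool grows from 4 to 5 nodes.

```python
import hashlib


def node_for(key: str, node_count: int) -> int:
    """Map a key to a node index with naive modulo hashing (assumed scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % node_count


def fraction_remapped(keys, old_count: int, new_count: int) -> float:
    """Return the share of keys whose node changes after resizing the pool."""
    moved = sum(1 for k in keys if node_for(k, old_count) != node_for(k, new_count))
    return moved / len(keys)


# Hypothetical keys; any reasonably uniform key set behaves similarly.
keys = [f"key-{i}" for i in range(10_000)]
print(f"{fraction_remapped(keys, 4, 5):.0%} of keys map to a different node")
```

With a uniform hash, about four in five keys are expected to move in this scenario, so most lookups miss right after the resize. Consistent hashing and virtual nodes, as mentioned for Membase, reduce that churn.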
The pace at which Amazon is launching new services is indeed impressive. I’m wondering what will be the first NoSQL database that will get official Amazon support.
Monday, 22 August 2011
Reliable, Scalable, and Kinda Sorta Cheap: A Cloud Hosting Architecture for MongoDB
Using MongoDB replica sets:
At Famigo, we house all of our valuable data in MongoDB and we also serve all requests from Amazon EC2 instances. We’ve devoted many mental CPU cycles to finding the right architecture for our data in the cloud, focusing on 3 main factors: cost, reliability, and performance.
via: http://www.codypowell.com/taods/2011/08/a-cloud-hosting-architecture-for-mongodb.html
Friday, 12 August 2011
Data Integrity in the Cloud
Chris Marsh:
Cloud storage can be an attractive means of outsourcing the day-to-day management of data, but ultimately the responsibility and liability for that data falls on the company that owns the data, not the hosting provider. With this in mind, it is important to understand some of the causes of data corruption, how much responsibility a cloud service provider holds, some basic best practices for utilizing cloud storage safely, and some methods and standards for monitoring the integrity of data regardless of whether that data resides locally or in the cloud.
This reminded me of how the Adobe SaaS Infrastructure Team has tested HBase.
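One basic way to monitor integrity, wherever the data lives, is to record a checksum when the data is written and compare it later. The sketch below uses SHA-256 over a local file; the function names are hypothetical, and a real setup would store the recorded digest somewhere separate from the data itself.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path: str, expected_digest: str) -> bool:
    """Compare the current checksum against the one recorded earlier."""
    return sha256_of_file(path) == expected_digest


# Demo: record a digest, then simulate corruption by appending a byte.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"important data")
    demo_path = tmp.name

recorded = sha256_of_file(demo_path)
print(is_intact(demo_path, recorded))  # True: file unchanged

with open(demo_path, "ab") as handle:
    handle.write(b"!")
print(is_intact(demo_path, recorded))  # False: contents changed

os.remove(demo_path)
```

The same comparison works for objects in cloud storage, as long as the checksum is computed over bytes you read back rather than taken on trust from the provider.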
Sunday, 7 August 2011
MongoDB Positioning: Big Data and Development Agility
Max Schireson positions MongoDB as a solution for Big Data and development agility:
Monday, 11 July 2011
The Server Architecture Debate Rages On
Big processors or little processors, scale-up or scale-out, on-premise or in the cloud […] The plethora of choices for application architecture and delivery model are great if you like variety, but I don’t envy anyone tasked with choosing which system on which to spend their limited budget dollars.
Too few options are bad[1]. Too many options are paralyzing[2]. So what’s the solution? I think the only answer is to build experience by trying, failing, learning, and sharing with everyone else.
via: http://gigaom.com/cloud/the-server-architecture-debate-rages-on/
Thursday, 7 July 2011
MongoDB Hosting Matrix
Maurício Maia put together a price comparison for MongoDB hosting:
There’s also a matrix of MongoDB hosting features. While far from exhaustive, these MongoDB hosting matrices are meant to give you an idea of what options are out there.
Unfortunately, they don’t include MongoDB hosting on:
Each of these offers a free plan, but the pricing will depend on many factors. On the other hand, they also offer application hosting, which basically means colocating your app and data, and that is better than putting the whole internet between the two.
Wednesday, 6 July 2011
Zynga, Data Centers, Polyglot Persistence, and Big Data
Cadir Lee (CTO Zynga) quoted in a VentureBeat post:
It’s not the amount of hardware that matters. It’s the architecture of the application. You have to work at making your app architecture so that it takes advantage of Amazon. You have to have complete fluidity with the storage tier, the web tier. We are running our own data centers. We are looking more at doing our own data centers with more of a private cloud.
A couple of thoughts:
- Zynga is going in the opposite direction from Netflix. While Netflix is focusing (by using Amazon for most of its infrastructure), Zynga is diversifying (building its own data centers).
- Zynga’s applications are great examples of where fully distributed NoSQL databases fit. Availability is key.
- My answer to the question “how many Zyngas are out there?” would be: “enough to ensure some good business for the most reliable and scalable distributed databases.”
- Zynga has contributed to and is an investor in Membase, the company that merged with CouchOne to form Couchbase. But Zynga was using a custom version of Membase.
- Zynga also operates a large MySQL cluster.
- Zynga processes over 15 terabytes of game data every day (according to their SEC filing). That’s Hadoop’s sweet spot.
PS: I’d love to talk to someone from Zynga about their data storage approach. If you have any connections I’d really appreciate an introduction.
