cloud: All content tagged as cloud in NoSQL databases and polyglot persistence

The Myth of Auto Scaling as a Capacity Planning Approach

An older but very instructive post by James Golick dissecting the myth of extra server capacity on demand:

There’s this idea floating around that we can scale out our data services “just in time”. Proponents of cloud computing frequently tout this as an advantage of such a platform. Got a load spike? No problem, just spin up a few new instances to handle the demand. It’s a great sounding story, but sadly, things don’t quite work that way.

This is the Mythical Man-Month of the IT department.

John Allspaw

Original title and link: The Myth of Auto Scaling as a Capacity Planning Approach (NoSQL database©myNoSQL)

via: http://jamesgolick.com/2010/10/27/we-are-experiencing-too-much-load-lets-add-a-new-server..html


CloudStack and Hadoop: A Match Made in the Cloud

Sheng [1]:

I am the most excited, however, about the prospect of integrating with Apache Hadoop project. Known primarily as the technology for Big Data applications, Hadoop has gained wide-spread adoption in the industry. Similar to CloudStack which is inspired by Amazon’s EC2 service, Hadoop is modeled after Google’s MapReduce and Google File System technologies. And just like CloudStack, Hadoop is implemented in Java.

Hortonworks shares the sentiment:

Today’s announcement also highlights the great synergies between CloudStack and Apache Hadoop. As the first cloud platform in the industry to join the ASF, CloudStack becomes the logical cloud choice for organizations that prefer an open source option for their cloud and big data infrastructure. Hortonworks is excited to work with the CloudStack project team to identify opportunities where Hadoop components can be used to back Cloud APIs and also where Cloud APIs can be used to deploy Hadoop.


  1. Sheng is the CEO and founder of Cloud.com 

Original title and link: CloudStack and Hadoop: A Match Made in the Cloud (NoSQL database©myNoSQL)

via: http://cloudstack.org/blog/121-cloudstack-and-hadoop-a-match-made-in-the-cloud.html


Two Sides of the OMGPOP Cloud and Couchbase Scalability Story

On Friday, many media sites (GigaOm, BusinessInsider) published the press release about OMGPOP’s growth story, which cites cloud services and Couchbase as the company’s scaling solution.

While reading it, I jotted down:

  1. The good: using a combination of cloud and a NoSQL database (Couchbase) allowed OMGPOP to scale
  2. The bad: OMGPOP had to call in people from Couchbase to help out with scaling

The question is: if you can throw in more iron and hire experts, wouldn’t many other database solutions be able to cope with OMGPOP’s growth?

Original title and link: Two Sides of the OMGPOP Cloud and Couchbase Scalability Story (NoSQL database©myNoSQL)


Research Shoots Hadoop, Clouds Out of the Sky

However, despite the buzz around the topic, one research group suggests that when it comes to actually deploying Hadoop and other parallel software frameworks, enterprises with big BI needs just aren’t biting. Even bleaker are the parallels between the cloud hype and reality. According to the Bloor Group, “cloud adoption appears lower than the hype with only 5 percent reporting extensive use and 51 percent with no plans for deployment.”

Filed for future claim chowder.

Original title and link: Research Shoots Hadoop, Clouds Out of the Sky (NoSQL database©myNoSQL)

via: http://www.datanami.com/datanami/2011-12-13/research_shoots_hadoop,_clouds_out_of_the_sky.html


Hadoop, Big Data Apps, Data Science Tools, Cloud Collision: Wikibon Big Data Predictions for 2012

Jeff Kelly for Wikibon Blog:

  1. 2012 Will Be the Year of Big Data Applications.
  2. Analytic Platform Vendors Add Improved Functionality, Social Capabilities for Data Scientists.
  3. The Cloud and Big Data Collide.
  4. Big Data Appliances Gain Steam.
  5. Industry Responds to Big Data Skills Gap with Training and Education Resources.
  6. The Big Data Privacy Discussion Begins In Earnest.

According to TRIGG that’s 6 Ts out of 6.

Original title and link: Hadoop, Big Data Apps, Data Science Tools, Cloud Collision: Wikibon Big Data Predictions for 2012 (NoSQL database©myNoSQL)

via: http://wikibon.org/blog/big-data-in-2012-hadoop-big-data-apps-data-science-tools-cloud-collision-and-more/


Building an Ad Network Ready for Failure

The architecture of a fault-tolerant ad network built on top of HAProxy, Apache with mod_wsgi and Python, Redis, a bit of PostgreSQL and ActiveMQ deployed on AWS:

The real workhorse of our ad targeting platform was Redis. Each box slaved from a master Redis, and on failure of the master (which happened once), a couple “slaveof” calls got us back on track after the creation of a new master. A combination of set unions/intersections with algorithmically updated targeting parameters (this is where experimentation in our setup was useful) gave us a 1 round-trip ad targeting call for arbitrary targeting parameters. The 1 round-trip thing may not seem important, but our internal latency was dominated by network round-trips in EC2. The targeting was similar in concept to the search engine example I described last year, but had quite a bit more thought regarding ad targeting. It relied on the fact that you can write to Redis slaves without affecting the master or other slaves. Cute and effective. On the Python side of things, I optimized the redis-py client we were using for a 2-3x speedup in network IO for the ad targeting results.
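To make the set unions/intersections idea concrete, here is a minimal sketch of my own, not the author's code. Plain Python sets stand in for Redis sets, and the key names, ad ids, and the `target_ads` helper are all hypothetical. On a real Redis the same shape could be built from SUNIONSTORE into temporary keys followed by SINTER, batched so the client pays for a single round trip. Those temporary keys would be the writes that depend on a writable slave, and note that newer Redis versions make replicas read-only by default.

```python
# Illustrative only: Python sets stand in for Redis sets.
# Key names and ad ids below are made up for the example.
SAMPLE_INDEX = {
    "geo:us": {1, 2, 3},
    "geo:ca": {3, 4},
    "topic:sports": {2, 3, 4, 5},
    "topic:news": {1, 5},
}


def target_ads(index, groups):
    """Return ad ids matching at least one key in every group.

    Each group is a list of targeting keys that are OR-ed together
    (a union); the per-group results are then AND-ed (an intersection).
    """
    unions = []
    for keys in groups:
        # Roughly what SUNION / SUNIONSTORE would compute for this group.
        unions.append(set().union(*(index.get(k, set()) for k in keys)))
    if not unions:
        return set()
    # Roughly what SINTER would compute over the per-group results.
    return set.intersection(*unions)


# Ads shown in the US or Canada that are also tagged as sports.
matches = target_ads(SAMPLE_INDEX, [["geo:us", "geo:ca"], ["topic:sports"]])
print(sorted(matches))
```

Arbitrary targeting parameters simply become more groups in the query, which is why the whole lookup can stay a single batched call.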

Original title and link: Building an Ad Network Ready for Failure (NoSQL database©myNoSQL)

via: http://dr-josiah.blogspot.com/2011/06/building-ad-network-ready-for-failure.html


Getting Started With VMware CloudFoundry, MongoDB and Node.js

A step-by-step intro to VMware CloudFoundry, MongoDB and Node.js.

However great Node.js is, this code still hurts my eyes.

Original title and link: Getting Started With VMware CloudFoundry, MongoDB and Node.js (NoSQL database©myNoSQL)

via: http://blog.mongodb.org/post/6587009156/cloudfoundry-mongodb-and-nodejs


Data Scientist and Cloud Architect: The 6 Hottest New Jobs in IT

InfoWorld published an unscientific survey of the hottest new jobs in IT, and data scientists and cloud architects both made it into the top 6.

About data scientists:

According to Norman Nie, CEO of Revolution Analytics, data science jobs will require workers with a spectrum of skills, from entry-level data cleaners to the high-level statisticians, yielding a range of opportunities for newcomers to the field. As the business world gets increasingly social, the demand for people to plumb the depths of all that social networking clickstream data will only increase. The cliché going around is that “data is the new oil.” A career in refining that raw material sounds like a good bet.

Cloud architects:

In addition to establishing and managing a private cloud infrastructure, Ron Gula, CEO of Tenable Network Security, says cloud architects will increasingly need to be experts in choosing public cloud services. “When you get into the nuances of SLAs, you become less of an IT person and more of a lawyer,” says Gula. The ultimate goal is the hybrid cloud, where cloud architects and business management decide which cloud services make the most sense to run internally and which should be farmed out on a pay-per-use basis.

Original title and link: Data Scientist and Cloud Architect: The 6 Hottest New Jobs in IT (NoSQL database©myNoSQL)

via: http://www.pcworld.com/businesscenter/article/230285/the_6_hottest_new_jobs_in_it.html


ParElastic: Cloud-Enabled Apps on Relational Databases

No idea how this would work:

ParElastic™ is an elastic database middle-ware that creates a virtual elastic relational database from a collection of unaffiliated, free-running, industry standard relational databases.

But they convinced some investors that it is possible.

Original title and link: ParElastic: Cloud-Enabled Apps on Relational Databases (NoSQL databases © myNoSQL)


MongoHQ Announces MongoDB Replica Set Support

MongoHQ, a MongoDB hosting solution:

Now, we are excited to announce that we offer high-availability multi-node replica set plans on our MongoHQ platform. We are pretty excited about this release, as it makes available some great features that our users have been asking for, including:

  • High availability databases with automatic failover
  • Nodes located in multiple availability zones
  • Dedicated volumes to maximize read/write performance
  • Slave nodes that can be used as read-slaves for enhanced read throughput

If scaling MongoDB with replica sets and auto-sharding is so easy, I wonder why it took MongoHQ so long to add support for replica sets alone.

Original title and link: MongoHQ Announces MongoDB Replica Set Support (NoSQL databases © myNoSQL)

via: http://blog.mongohq.com/post/6136673911/announcing-mongohq-replica-set-plans


Xeround: MySQL Elastic, Always-on Storage Engine for the Cloud

Xeround is a new MySQL storage engine offered as Database-as-a-Service.

What it promises sounds (a bit?) too good to be true (NB: this list has been extracted from their site):

  • seamless replacement of existing MySQL database
  • high availability (including schema changes)
  • automatic fault-detection and recovery
  • full consistency with low latency
  • elasticity

What’s the catch?

Original title and link: Xeround: MySQL Elastic, Always-on Storage Engine for the Cloud (NoSQL databases © myNoSQL)