ALL COVERED TOPICS

NoSQL Benchmarks NoSQL use cases NoSQL Videos NoSQL Hybrid Solutions NoSQL Presentations Big Data Hadoop MapReduce Pig Hive Flume Oozie Sqoop HDFS ZooKeeper Cascading Cascalog BigTable Cassandra HBase Hypertable Couchbase CouchDB MongoDB OrientDB RavenDB Jackrabbit Terrastore Amazon DynamoDB Redis Riak Project Voldemort Tokyo Cabinet Kyoto Cabinet memcached Amazon SimpleDB Datomic MemcacheDB M/DB GT.M Amazon Dynamo Dynomite Mnesia Yahoo! PNUTS/Sherpa Neo4j InfoGrid Sones GraphDB InfiniteGraph AllegroGraph MarkLogic Clustrix CouchDB Case Studies MongoDB Case Studies NoSQL at Adobe NoSQL at Facebook NoSQL at Twitter

NAVIGATE MAIN CATEGORIES

Close

yelp: All content tagged as yelp in NoSQL databases and polyglot persistence

Hadoop and Elastic MapReduce at Yelp

A story of using Hadoop at Yelp and migrating it to Amazon Elastic MapReduce:

We used to do what a lot of companies do, which is run a Hadoop cluster. We had a dozen or so machines that we otherwise would have gotten rid of, and whenever we pushed our code to our webservers, we’d push it to the Hadoop machines.

It was also not so cool. You couldn’t really tell if a job was going to work at all until you pushed it to production. But the worst part was, most of the time our cluster would sit idle, and then every once in a while, a really beefy job would come along and tie up all of our nodes, and all the other jobs would have to wait.

Yelp has released their Python library for running MapReduce jobs on Hadoop or Amazon Elastic MapReduce on ☞ GitHub.

Original title and link: Hadoop and Elastic MapReduce at Yelp (NoSQL databases © myNoSQL)

via: http://engineeringblog.yelp.com/2010/10/mrjob-distributed-computing-for-everybody.html