hRaven: the beginning of smart Hadoop schedulers?

While going through the slide deck A Bird’s-Eye View of Pig and Scalding with hRaven, I started to wonder whether hRaven might actually represent the beginning of smart Hadoop schedulers. Hadoop has a pluggable scheduler framework, and YARN will bring some improvements to the fair scheduler, but I don’t think these are yet on par with the resource management solutions available in MPP systems.

In a way, slide #29, titled Current uses, hints at something similar:

Current uses of hRaven

  • Pig reducer optimizations
  • Cluster utilization/capacity planning
  • Application performance trending over time
  • Identifying common job anti-patterns
  • Ad-hoc analysis for troubleshooting cluster problems
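To make the first item a bit more concrete, here is a minimal sketch of how historical job data could drive a reducer-count hint. This is not hRaven's API or the method from the slides: the function name `estimate_reducers`, the 1 GB-per-reducer target, and the cap of 999 are all assumptions picked for illustration. The only idea borrowed from the list above is that past runs of the same job can inform the next one.

```python
import math
from statistics import median


def estimate_reducers(past_map_output_bytes,
                      bytes_per_reducer=1_000_000_000,
                      max_reducers=999):
    """Suggest a reducer count from the map output sizes of past runs.

    past_map_output_bytes: map output sizes (in bytes) of earlier runs
        of the same job, e.g. as collected by a job-history store.
    bytes_per_reducer: assumed target input size per reducer (hypothetical).
    max_reducers: assumed upper bound on the suggestion (hypothetical).
    """
    if not past_map_output_bytes:
        # No history yet: fall back to a single reducer.
        return 1
    # The median is less sensitive to one unusually large or small run.
    typical = median(past_map_output_bytes)
    wanted = math.ceil(typical / bytes_per_reducer)
    # Clamp the suggestion to the range [1, max_reducers].
    return max(1, min(max_reducers, wanted))


if __name__ == "__main__":
    history = [4_200_000_000, 3_900_000_000, 4_500_000_000]
    print(estimate_reducers(history))  # prints 5
```

A scheduler that had this kind of per-job history available at submission time could, in principle, apply the same reasoning to slots and memory, which is what makes the "smart scheduler" question interesting.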

What do you think?

Original title and link: hRaven: the beginning of smart Hadoop schedulers? (NoSQL database©myNoSQL)