Hadoop Pluggable Scheduler Framework
M. Tim Jones takes a look at Hadoop schedulers and when to use each of them:
Up until 2008, Hadoop supported a single scheduler that was intermixed with the JobTracker logic. Luckily, a bug report (HADOOP-3412) was submitted for an implementation of a scheduler that was independent of the JobTracker. More importantly, the new scheduler is pluggable, which allows use of new scheduling algorithms to help optimize jobs that have specific characteristics. A further advantage to this change is the increased readability of the scheduler, which has opened it up to greater experimentation and the potential for a growing list of schedulers to specialize in Hadoop’s ever-increasing list of applications.

With this change, Hadoop is now a multi-user data warehouse that supports a variety of different types of processing jobs, with a pluggable scheduler framework providing greater control. This framework allows optimal use of a Hadoop cluster over a varied set of workloads (from small jobs to large jobs and everything in between). Moving away from FIFO scheduling (which treats a job’s importance relative to when it was submitted) allows a Hadoop cluster to support a variety of workloads with varying priority and performance constraints.
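To make the idea of a pluggable scheduler concrete, here is a minimal Python sketch. It is a toy model and not Hadoop's actual API: `Job`, `ToyJobTracker`, `fifo_policy`, and `priority_policy` are hypothetical names invented for illustration. The point it tries to show is that once the scheduling policy is separated from the tracker, the same set of jobs can be ordered by submission time (FIFO) or by some other criterion without touching the tracker code.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Job:
    name: str
    submitted_at: int  # logical submission time; lower means earlier
    priority: int      # hypothetical field; higher means more important


# A policy is any function that picks the next job from a non-empty queue.
Policy = Callable[[List[Job]], Job]


def fifo_policy(queue: List[Job]) -> Job:
    """Pick the job that was submitted first."""
    return min(queue, key=lambda job: job.submitted_at)


def priority_policy(queue: List[Job]) -> Job:
    """Pick the highest-priority job; break ties by submission time."""
    return min(queue, key=lambda job: (-job.priority, job.submitted_at))


class ToyJobTracker:
    """Toy stand-in for a tracker whose scheduling policy is plugged in."""

    def __init__(self, policy: Policy) -> None:
        self.policy = policy
        self.queue: List[Job] = []

    def submit(self, job: Job) -> None:
        self.queue.append(job)

    def run_all(self) -> List[str]:
        """Drain the queue and return job names in the order they ran."""
        order = []
        while self.queue:
            job = self.policy(self.queue)
            self.queue.remove(job)
            order.append(job.name)
        return order


def schedule(policy: Policy, jobs: List[Job]) -> List[str]:
    """Submit all jobs to a fresh tracker and return the execution order."""
    tracker = ToyJobTracker(policy)
    for job in jobs:
        tracker.submit(job)
    return tracker.run_all()


# Made-up example workload, submitted in this order.
JOBS = [
    Job("nightly-log-mining", submitted_at=1, priority=1),
    Job("ad-hoc-query", submitted_at=2, priority=5),
    Job("web-indexing", submitted_at=3, priority=3),
]

fifo_order = schedule(fifo_policy, JOBS)
priority_order = schedule(priority_policy, JOBS)
print("FIFO:    ", fifo_order)
print("Priority:", priority_order)
```

Under FIFO the long-running batch job goes first simply because it arrived first; with the priority policy plugged in, the ad hoc query jumps ahead. Real Hadoop schedulers weigh much more than a single priority number, so treat this only as an illustration of the plug-in boundary.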
Original title and link: Hadoop Pluggable Scheduler Framework (©myNoSQL)
via: http://www.ibm.com/developerworks/opensource/library/os-hadoop-scheduling/index.html?ca=drs-