In a conversation with Curt Monash, Arun Murthy (Hortonworks) explains what YARN (aka Hadoop MapReduce 2.0 or MRv2) is about:
YARN, as an aspect of Hadoop, has two major kinds of benefits:
- The ability to use programming frameworks other than MapReduce.
- Scalability, no matter what programming framework you use.
The central goal of YARN is to clearly separate two things that are unfortunately smushed together in current Hadoop, specifically in (mainly) JobTracker:
- Monitoring the status of the cluster with respect to which nodes have which resources available. Under YARN, this will be global.
- Managing the parallel execution of any specific job. Under YARN, this will be done separately for each job.
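The split described above can be sketched as a toy model. This is only an illustration under stated assumptions, not YARN's API: the class names `ClusterMonitor` and `JobManager`, the idea of integer "slots", and the node names are all hypothetical. They loosely mirror the two roles, one global resource tracker and one manager per job.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterMonitor:
    """Global role (hypothetical): knows which nodes have free slots."""
    free_slots: dict[str, int]

    def allocate(self, wanted: int) -> list[str]:
        """Grant up to `wanted` slots; return one node name per granted slot."""
        granted: list[str] = []
        for node in sorted(self.free_slots):
            while self.free_slots[node] > 0 and len(granted) < wanted:
                self.free_slots[node] -= 1
                granted.append(node)
        return granted

    def release(self, node: str) -> None:
        """Hand a slot back to the global pool."""
        self.free_slots[node] += 1


@dataclass
class JobManager:
    """Per-job role (hypothetical): tracks the tasks of exactly one job."""
    job_id: str
    pending: list[str]
    running: dict[str, str] = field(default_factory=dict)  # task -> node
    done: list[str] = field(default_factory=list)

    def schedule(self, monitor: ClusterMonitor) -> None:
        """Ask the global monitor for slots and start as many tasks as granted."""
        for node in monitor.allocate(len(self.pending)):
            task = self.pending.pop(0)
            self.running[task] = node

    def finish(self, task: str, monitor: ClusterMonitor) -> None:
        """Mark a task done and return its slot to the cluster."""
        node = self.running.pop(task)
        monitor.release(node)
        self.done.append(task)
```

The design point is that `ClusterMonitor` never sees a task list and `JobManager` never sees another job's state, so adding jobs adds job managers rather than loading a single central tracker.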
Original title and link: Hadoop YARN - Beyond MapReduce (©myNoSQL)