ALL COVERED TOPICS

NoSQL Benchmarks NoSQL use cases NoSQL Videos NoSQL Hybrid Solutions NoSQL Presentations Big Data Hadoop MapReduce Pig Hive Flume Oozie Sqoop HDFS ZooKeeper Cascading Cascalog BigTable Cassandra HBase Hypertable Couchbase CouchDB MongoDB OrientDB RavenDB Jackrabbit Terrastore Amazon DynamoDB Redis Riak Project Voldemort Tokyo Cabinet Kyoto Cabinet memcached Amazon SimpleDB Datomic MemcacheDB M/DB GT.M Amazon Dynamo Dynomite Mnesia Yahoo! PNUTS/Sherpa Neo4j InfoGrid Sones GraphDB InfiniteGraph AllegroGraph MarkLogic Clustrix CouchDB Case Studies MongoDB Case Studies NoSQL at Adobe NoSQL at Facebook NoSQL at Twitter

NAVIGATE MAIN CATEGORIES

Close

Hadoop Research Topics

If you feel like starting to hack Hadoop, a list of interesting Hadoop improvements from Dhruba Borthakur[1]. Just a couple to wet your taste:

  • Ability to make Hadoop scheduler resource aware, especially CPU, memory and IO resources. The current implementation is based on statically configured slots. *Ability to dynamically increase replicas of data in HDFS based on access patterns. This is needed to handle hot-spots of data.
  • Make map-reduce jobs work across data centers. In many cases, a single hadoop cluster cannot fit into a single data center and a user has to partition the dataset into two hadoop clusters in two different data centers. *High Availability of the JobTracker. In the current implementation, if the JobTracker machine dies, then all currently running jobs fail.

  1. Dhruba Borthakur: Hadoop Engineer at Facebook  ()

Original title and link: Hadoop Research Topics (NoSQL databases © myNoSQL)

via: http://hadoopblog.blogspot.com/2010/11/hadoop-research-topics.html