Prashanth Babu sent an older Hadoop ecosystem map as a follow-up to the Hadoop tools ecosystem and The components and their functions in the Hadoop ecosystem posts:
The map is not self-explanatory, so here's the legend:
- How did it all start: huge data on the web!
- Nutch built to crawl this web data
- Huge data had to be saved: HDFS was born!
- How to use this data?
- MapReduce framework built for writing and running analytics (Java natively, any other language through Streaming/Pipes); see the word-count sketch after this list
- How to import unstructured data such as web logs and click streams: FUSE, WebDAV, Chukwa, Flume, Scribe
- HIHO and Sqoop for loading data into HDFS: RDBMSs can join the Hadoop bandwagon!
- High-level interfaces required over low-level MapReduce programming: Pig, Hive, Jaql
- BI tools with advanced UI reporting (drill-down, etc.): Intellicus
- Workflow tools over MapReduce processes and the high-level languages
- Monitoring and managing Hadoop, running jobs/Hive queries, viewing HDFS from a high level: Hue, Karmasphere, the Eclipse plugin, Cacti, Ganglia
- Support frameworks: Avro (serialization), ZooKeeper (coordination)
- More high-level interfaces/uses: Mahout, Elastic MapReduce
- OLTP-style, row-level access also possible: HBase (see the second sketch below)
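
To make the MapReduce bullet concrete, here is a minimal sketch of the canonical word-count job written against the `org.apache.hadoop.mapreduce` Java API (Hadoop 2.x or later); the input and output paths are hypothetical command-line arguments pointing at HDFS directories:

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every token in each input line.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reducer: sums the counts collected for each word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable v : values) sum += v.get();
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // local pre-aggregation before the shuffle
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output directory
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Packaged into a jar, it would be launched with something like `hadoop jar wordcount.jar WordCount /logs/in /logs/out`; reusing the reducer as the combiner is the standard trick for cutting shuffle traffic.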
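And to illustrate the HBase bullet, here is a hedged sketch of row-level access through the HBase Java client (the 1.x+ Connection/Table API); the `clicks` table and its `d` column family are hypothetical and assumed to already exist, with cluster settings read from an `hbase-site.xml` on the classpath:

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseRowExample {
  public static void main(String[] args) throws Exception {
    // Cluster location (ZooKeeper quorum) comes from hbase-site.xml on the classpath.
    try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
         Table table = conn.getTable(TableName.valueOf("clicks"))) { // hypothetical table

      // Write a single row keyed by user id.
      Put put = new Put(Bytes.toBytes("user-42"));
      put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("last_url"), Bytes.toBytes("/home"));
      table.put(put);

      // Read the row back by key: a single keyed lookup, not a batch scan.
      Result row = table.get(new Get(Bytes.toBytes("user-42")));
      System.out.println(Bytes.toString(
          row.getValue(Bytes.toBytes("d"), Bytes.toBytes("last_url"))));
    }
  }
}
```

Unlike a MapReduce job, this is a single keyed read/write with low latency, which is what makes HBase usable for OLTP-style workloads on top of HDFS.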
Original title and link: Hadoop Ecosystem Map (©myNoSQL)