ALL COVERED TOPICS

NoSQL Benchmarks NoSQL use cases NoSQL Videos NoSQL Hybrid Solutions NoSQL Presentations Big Data Hadoop MapReduce Pig Hive Flume Oozie Sqoop HDFS ZooKeeper Cascading Cascalog BigTable Cassandra HBase Hypertable Couchbase CouchDB MongoDB OrientDB RavenDB Jackrabbit Terrastore Amazon DynamoDB Redis Riak Project Voldemort Tokyo Cabinet Kyoto Cabinet memcached Amazon SimpleDB Datomic MemcacheDB M/DB GT.M Amazon Dynamo Dynomite Mnesia Yahoo! PNUTS/Sherpa Neo4j InfoGrid Sones GraphDB InfiniteGraph AllegroGraph MarkLogic Clustrix CouchDB Case Studies MongoDB Case Studies NoSQL at Adobe NoSQL at Facebook NoSQL at Twitter

NAVIGATE MAIN CATEGORIES

Close

Hoop: All content tagged as Hoop in NoSQL databases and polyglot persistence

HttpFS: Another Hadoop File System Over HTTP

Just a new HTTP interface for Hadoop file system. The main differences between HttpFS and WebHDFS are that this one is created by Cloudera, not Hortonworks (on top of their previos Hoop library) and:

HttpFs is a proxy so, unlike WebHDFS, it does not require clients be able to access every machine in the cluster. This allows clients to to access a cluster that is behind a firewall via the WebHDFS REST API.

Question is: if they are API compatible and both open source, why not unifying them?

Original title and link: HttpFS: Another Hadoop File System Over HTTP (NoSQL database©myNoSQL)

via: http://www.cloudera.com/blog/2012/08/httpfs-for-cdh3-the-hadoop-filesystem-over-http/


Hoop - Hadoop HDFS Over HTTP

Cloudera has created a set of tools named Hoop allowing access through HTTP/S to HDFS. My first question was why would you use HTTP to access HDFS? Here is the answer:

  • Transfer data between clusters running different versions of Hadoop (thereby overcoming RPC versioning issues).
  • Access data in a HDFS cluster behind a firewall. The Hoop server acts as a gateway and is the only system that is allowed to go through the firewall.

Not sure though how many will use HTTP for transfering large amounts of data. But if you want to see how it is implemented, you can find the source code on GitHub.

Original title and link: Hoop - Hadoop HDFS Over HTTP (NoSQL database©myNoSQL)

via: http://www.cloudera.com/blog/2011/07/hoop-hadoop-hdfs-over-http/