Hadoop on top of… Intel adds Lustre support to Hadoop

Intel adds Lustre support to Hadoop:

We abstracted out an HDFS layer but underneath that it is actually talking to lustre.

This is not the first project built on the principle “we already have this distributed system, file system, or database, so why not reuse it for Hadoop?”. What would be the first step of such a project? Provide an HDFS API-compatible layer on top of your existing system. But what about the other assumptions in HDFS: large blocks, sequential access, local access, etc.? How do you guarantee that your integration addresses all of them?

If this trend continues, I could see one of the companies behind open source Hadoop, Cloudera or Hortonworks or both, coming up with a TCK (technology compatibility kit) sold to any company that claims HDFS compatibility.
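
To make the idea concrete, here is a rough sketch of what a couple of checks in such a kit might look like. Everything in it is an assumption for illustration: the `CandidateFS` interface, the `check_assumptions` function, the 64 MB threshold, and the toy `InMemoryFS` are all hypothetical names, only loosely modeled on the kind of block-size and block-location queries a Hadoop file system layer answers. It is not how Intel, Cloudera, or Hortonworks test anything.

```python
from dataclasses import dataclass
from typing import List, Protocol

MB = 1024 * 1024
# Assumed threshold for "large block"; a real kit would pick its own value.
MIN_BLOCK_SIZE = 64 * MB


@dataclass
class BlockLocation:
    offset: int       # byte offset of the block within the file
    length: int       # block length in bytes
    hosts: List[str]  # nodes that can serve this block from local storage


class CandidateFS(Protocol):
    """Hypothetical surface that a layer claiming compatibility would expose."""

    def default_block_size(self) -> int: ...

    def block_locations(self, path: str, start: int, length: int) -> List[BlockLocation]: ...


def check_assumptions(fs: CandidateFS, path: str, file_size: int) -> List[str]:
    """Return the names of the assumptions the candidate fails to meet."""
    failures = []

    # Large blocks: the scheduler expects few, big splits per file.
    if fs.default_block_size() < MIN_BLOCK_SIZE:
        failures.append("large blocks")

    # Sequential layout: reported blocks must tile the file with no gaps or overlaps.
    blocks = sorted(fs.block_locations(path, 0, file_size), key=lambda b: b.offset)
    expected = 0
    contiguous = True
    for block in blocks:
        if block.offset != expected:
            contiguous = False
            break
        expected = block.offset + block.length
    if not contiguous or expected != file_size:
        failures.append("sequential layout")

    # Local access: every block must name at least one host, or tasks
    # cannot be scheduled next to their data.
    if any(not block.hosts for block in blocks):
        failures.append("local access")

    return failures


class InMemoryFS:
    """Toy stand-in for a real adapter; it reports fixed-size blocks."""

    def __init__(self, block_size: int, hosts: List[str]):
        self.block_size = block_size
        self.hosts = hosts

    def default_block_size(self) -> int:
        return self.block_size

    def block_locations(self, path: str, start: int, length: int) -> List[BlockLocation]:
        locations = []
        offset, end = start, start + length
        while offset < end:
            size = min(self.block_size, end - offset)
            locations.append(BlockLocation(offset, size, list(self.hosts)))
            offset += size
        return locations


# A layer that reports block hosts passes; one that reports none does not.
print(check_assumptions(InMemoryFS(128 * MB, ["node1"]), "/data/file", 300 * MB))  # []
print(check_assumptions(InMemoryFS(128 * MB, []), "/data/file", 300 * MB))  # ['local access']
```

The second case is the interesting one: a layer can implement every call in the API and still quietly drop an assumption, here by returning blocks with no hosts, which is exactly the kind of gap an API-only compatibility claim would not reveal.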

