Madhu Reddy compares the commercial, not-yet-released Dryad with the open source, widely used Hadoop:
- While Hadoop has chosen to build these capabilities from scratch [management and administration of large clusters], Dryad has chosen to leverage the proven and tested cluster management capabilities already present in Windows HPC Server.
- Hadoop […] has focused on performance and scale. Dryad, building on the performance and scale of Windows HPC Server, has in addition focused on making big data easier to use for mainstream application developers.
- Dryad and DSC are based on the widely used and mature NTFS (New Technology File System), the file system that comes standard with Windows Server.
- Hadoop uses the MapReduce computational model, which provides support for expressing the application logic in two simple steps — map and reduce. However, to develop more complex applications, developers will have to manually string together a sequence of MapReduce steps. DryadLINQ offers a higher-level computational model where complex sequence of MapReduce steps can be easily expressed in a query language similar to SQL.
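To make the last point concrete, here is a minimal in-memory sketch of what "manually stringing together" MapReduce steps looks like. It uses neither the Hadoop nor the DryadLINQ API: the `map_reduce` helper, the sample `lines`, and the two-step job are all hypothetical, chosen only to show that the developer has to feed the output of one step into the next by hand.

```python
from collections import defaultdict
from itertools import chain


def map_reduce(records, mapper, reducer):
    """Run one in-memory MapReduce step: map, group by key, reduce.

    Hypothetical helper for illustration; not part of any real framework.
    """
    groups = defaultdict(list)
    for key, value in chain.from_iterable(mapper(r) for r in records):
        groups[key].append(value)
    return [reducer(key, values) for key, values in groups.items()]


# Assumed sample input.
lines = ["big data big clusters", "big data tools"]

# Step 1: count the occurrences of each word.
word_counts = map_reduce(
    lines,
    mapper=lambda line: [(word, 1) for word in line.split()],
    reducer=lambda word, ones: (word, sum(ones)),
)

# Step 2: group words by how often they occur.
# The developer wires step 1's output into step 2 by hand.
words_by_count = map_reduce(
    word_counts,
    mapper=lambda pair: [(pair[1], pair[0])],
    reducer=lambda count, words: (count, sorted(words)),
)

result = dict(words_by_count)
print(result)
```

With more steps, this hand-written wiring is the part that grows. The claim quoted above is that a query layer lets the whole sequence be written as one expression, leaving the staging to the runtime.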
A couple of aspects that were left out:
- licensing costs for Windows HPC Server, Microsoft Visual Studio, and the future Dryad
- Dryad's commercial, closed source model versus Hadoop's open source model (an example question: how soon could you get a bug fix or an improvement?)
- the Hadoop tools ecosystem
- third-party Hadoop tools like Karmasphere Studio, a graphical environment to develop, debug, deploy, and monitor MapReduce jobs
That’s not to say that Dryad and DryadLINQ are not interesting projects.
Original title and link: Comparing Dryad and Hadoop (NoSQL databases © myNoSQL)