In preparation for the EMC Hadoop-related announcement:
Shared DAS addresses the inevitable storage capacity growth requirements of Hadoop nodes in a cluster by placing disks in an external shelf shared by multiple directly attached hosts (aka Hadoop compute nodes). The connectivity from host to disk can be SATA, SAS, SCSI or even Ethernet, but always in a direct rather than networked storage configuration.
Therefore, the three dimensions of Shared DAS benefit are:
- NetApp E-Series Shared DAS solutions can dramatically reduce the number of background replication tasks by employing highly efficient RAID configurations that offload post-disk-failure reconstruction from the Hadoop cluster compute nodes and the cluster network.
- Compared with the single-disk I/O configuration of regular Hadoop nodes, NetApp E-Series Shared DAS enables significantly higher disk I/O bandwidth at lower latency due to wide striping within the shelf.
- NetApp E-Series Shared DAS improves storage efficiency by reducing the number of object replicas within a rack using low-overhead, high-performance RAID. Fewer replicas mean fewer disks to buy or more objects stored within the same infrastructure.
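To make the storage-efficiency argument concrete, here is a rough back-of-the-envelope sketch. The numbers are assumptions for illustration only: 100 TB of data, the HDFS default of 3 replicas on plain disks, and a hypothetical Shared DAS setup with 2 replicas on 8+2 RAID groups. The quoted material does not state the actual replica count or RAID layout, and `raw_capacity_tb` is a made-up helper name.

```python
def raw_capacity_tb(usable_tb, replicas, raid_data_disks=1, raid_parity_disks=0):
    """Raw disk capacity (TB) needed to hold usable_tb of data.

    replicas: number of copies Hadoop keeps of each block.
    raid_data_disks / raid_parity_disks: RAID group layout
    (1 + 0 means plain disks with no RAID overhead).
    """
    raid_overhead = (raid_data_disks + raid_parity_disks) / raid_data_disks
    return usable_tb * replicas * raid_overhead


# Assumed baseline: plain disks, HDFS default of 3 replicas.
plain = raw_capacity_tb(100, replicas=3)

# Hypothetical Shared DAS layout: 2 replicas on 8+2 RAID groups.
shared = raw_capacity_tb(100, replicas=2, raid_data_disks=8, raid_parity_disks=2)

print(f"plain disks, 3 replicas: {plain} TB raw")   # 300.0 TB raw
print(f"RAID 8+2, 2 replicas:    {shared} TB raw")  # 250.0 TB raw
```

Under these assumed figures the RAID-backed layout needs 50 TB less raw capacity for the same data. The real saving depends entirely on the replica count and RAID group size actually deployed.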
But it can also be connected to DataStax Brisk.
Original title and link: NetApp Hadoop Shared DAS (NoSQL databases © myNoSQL)