Conclusions from comparing SSDs and HDDs across different cluster scenarios, measured as cost per unit of performance and cost per unit of storage capacity:
- For a new cluster, SSDs deliver up to 70 percent higher MapReduce
performance compared to HDDs of equal aggregate IO bandwidth.
- For an existing HDD cluster, adding SSDs leads to more gains if configured properly.
- On average, SSDs show 2.5x higher cost-per-performance, a gap far narrower
than the 50x difference in cost-per-capacity.
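
To make the two metrics in the last bullet concrete, here is a minimal sketch of how such ratios could be computed. It assumes cost-per-performance means dollars per MB/s of IO bandwidth and cost-per-capacity means dollars per TB. The `Drive` class and every price, capacity, and bandwidth figure are hypothetical placeholders, chosen only so the ratios reproduce the 50x and 2.5x figures quoted above. They are not measurements from the post.

```python
from dataclasses import dataclass


@dataclass
class Drive:
    """Hypothetical drive description; all values are illustrative."""
    name: str
    price_usd: float        # purchase price (placeholder)
    capacity_tb: float      # usable capacity in TB (placeholder)
    bandwidth_mbps: float   # IO bandwidth in MB/s (placeholder)

    def cost_per_capacity(self) -> float:
        # Dollars paid per TB of storage.
        return self.price_usd / self.capacity_tb

    def cost_per_performance(self) -> float:
        # Dollars paid per MB/s of IO bandwidth (assumed metric).
        return self.price_usd / self.bandwidth_mbps


# Placeholder numbers, not taken from the original post.
hdd = Drive("HDD", price_usd=100.0, capacity_tb=2.0, bandwidth_mbps=100.0)
ssd = Drive("SSD", price_usd=2500.0, capacity_tb=1.0, bandwidth_mbps=1000.0)

# How many times more expensive the SSD is on each metric.
capacity_gap = ssd.cost_per_capacity() / hdd.cost_per_capacity()
performance_gap = ssd.cost_per_performance() / hdd.cost_per_performance()

print(f"cost-per-capacity gap: {capacity_gap:.1f}x")
print(f"cost-per-performance gap: {performance_gap:.1f}x")
```

With these placeholder inputs the SSD is 50x more expensive per TB but only 2.5x more expensive per MB/s, which is the shape of the trade-off the bullet describes.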
The post goes into detail on the tests that were run and their results, but the three bullets above should be enough to drive your decision.
Original title and link: SSDs and MapReduce performance