Monday, 4 April 2011
Big Data and Storage Area Networks
John Webster, reporting what he learned at the Structure Big Data event:
This conference only confirmed a suspicion that’s been building for the last few months as I’ve been following the big-data wave: Big-data practitioners are generally hostile to shared storage. They like direct-attached storage (DAS) in various forms from solid state disk (SSD) to high-capacity SATA disk buried inside parallel processing nodes. SANs (storage area networks) need not apply.
[…]
Why? There are two reasons that are interrelated. First, most if not all of the attendees here would include real- or near-real-time information delivery as one of the defining characteristics of big-data analytics. Latency is therefore avoided whenever and wherever possible. Data in memory is good. Data on spinning disk at the other end of a SAN connection is not, unless perhaps it’s a secondary copy of data. (I’ll get to that in a minute.) And while some here believed that it was theoretically possible to get high-performance shared storage to stand up to the low-latency requirement, the cost of such a SAN at the scale these people need was seen to be prohibitive.
Squeezing out every drop of performance is one aspect. Cost is the second. But I think there is also a CAP dimension, in the sense that data locality increases the reliability of a distributed system.
Original title and link: Big Data and Storage Area Networks (NoSQL databases © myNoSQL)
via: http://news.cnet.com/8301-21546_3-20049693-10253464.html