(Overheard) Kevin Weil
one petabyte for a trillion Tweets might become 10 petabytes for a trillion Tweets
So far I’ve ignored the topic of internal data formats. Now I’m wondering how important it is for midscale (N.B. non-Google, non-Facebook, non-Twitter scale) applications.
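To put numbers on the quote, here is a rough back-of-the-envelope sketch in Python. It assumes decimal petabytes (10^15 bytes), which the quote does not specify, and the 100 million record "midscale" dataset is a purely hypothetical figure chosen for illustration.

```python
# Back-of-the-envelope arithmetic for the quote above.
# Assumption: decimal units (1 PB = 10**15 bytes); the quote does not say which.
PB = 10**15
GB = 10**9
TRILLION = 10**12


def bytes_per_record(total_bytes: int, record_count: int) -> int:
    """Average storage cost of one record, in bytes."""
    return total_bytes // record_count


# 1 PB vs. 10 PB for a trillion Tweets, as in the quote.
compact = bytes_per_record(1 * PB, TRILLION)
verbose = bytes_per_record(10 * PB, TRILLION)

# Hypothetical midscale application: 100 million records at the same
# per-record sizes. The record count is an illustrative assumption.
midscale_records = 100_000_000
compact_total_gb = compact * midscale_records // GB
verbose_total_gb = verbose * midscale_records // GB

print(f"per record: {compact} B vs. {verbose} B")
print(f"midscale total: {compact_total_gb} GB vs. {verbose_total_gb} GB")
```

Under these assumptions the same tenfold difference per record still applies at midscale, but the absolute totals are in gigabytes rather than petabytes, which may be why the question feels less pressing below Twitter scale.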
Update: Make sure you check the comments below for more details on why data format is important.