I’m not implying that this question appeared on Quora after my link and comments about Redshift’s performance and costs at AirBnb, but Reynold Xin’s answer covers in a more formal way the reasons of Redshift being faster than Hive I’ve suggested in that post:
Some of the advantages you gain from massive scale and flexibility make it
challenging to build a more performant query engine. The following outlines how various features (or lack of
features) influences performance:
task launch overhead (nb: this can be optimized in Hive/Hadoop)
This blog is called myNoSQL and it is written by me, Alex Popescu, a software architect with a passion for open source and communities.
It records my readings, learnings, and opinions on NoSQL databases, polyglot persistence, and distributed systems -- subjects that I'm passionate about.
The opinions expressed here are my own, and no other party necessarily agrees with them.
If you feel I'm biased, I probably am.