More Integrations for Hive
Hive is a data warehouse infrastructure built on top of Hadoop. It offers tools for data ETL, a mechanism to put structure on the data, and the capability to query and analyze large data sets stored in Hadoop[1]. To better understand the benefits of Hive, you can check how Facebook is using it to run a petabyte-scale data warehouse.
Recently, John Sichi, a member of the Data Infrastructure team at Facebook, published an article on integrating Hive and HBase. There is also interest in having Hive work with Cassandra, and this is ☞ tracked in Cassandra JIRA (nb: I'm not sure any progress has been made on this yet).
Hypertable, another wide-column store, provides a way to integrate with Hive, described ☞ here:
Hypertable-Hive integration allows Hive QL statements to read and write to Hypertable via SELECT and INSERT commands. […] Currently the Hypertable storage handler only supports external, non-native tables.
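To make the "external, non-native table" idea a bit more concrete, here is a minimal sketch of what such a table definition could look like. It only builds the Hive QL text using Hive's general `CREATE EXTERNAL TABLE ... STORED BY` storage handler syntax. The handler class name, the property key, and the table and column names are all hypothetical placeholders, not the actual Hypertable ones; check the Hypertable documentation for the real values.

```python
def external_table_ddl(table, columns, handler_class,
                       serde_props=None, table_props=None):
    """Build a Hive QL CREATE EXTERNAL TABLE statement backed by a storage handler."""

    def props(mapping):
        # Render a dict as "key" = "value" pairs, as Hive property lists expect.
        return ", ".join(f'"{k}" = "{v}"' for k, v in mapping.items())

    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    ddl = f"CREATE EXTERNAL TABLE {table} ({cols})\nSTORED BY '{handler_class}'"
    if serde_props:
        ddl += f"\nWITH SERDEPROPERTIES ({props(serde_props)})"
    if table_props:
        ddl += f"\nTBLPROPERTIES ({props(table_props)})"
    return ddl


ddl = external_table_ddl(
    "ht_pages",                                   # hypothetical table name
    [("key", "STRING"), ("title", "STRING")],     # hypothetical columns
    handler_class="com.example.hive.WideColumnStorageHandler",   # hypothetical class
    serde_props={"example.columns.mapping": ":key,meta:title"},  # hypothetical key
)
print(ddl)

# Once the external table exists, reads and writes are ordinary Hive QL.
# staging_pages is another hypothetical table used only for illustration.
select_sql = "SELECT key, title FROM ht_pages"
insert_sql = "INSERT OVERWRITE TABLE ht_pages SELECT key, title FROM staging_pages"
```

The point of the sketch is that Hive only holds the metadata and the column mapping, while the data itself stays in the underlying store.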
Somehow, all this work on providing a common data warehouse infrastructure on top of existing NoSQL solutions (or at least the wide-column stores, which focus on large-scale datasets) seems to confirm that there's no need for a common NoSQL language.
- From the ☞ Hive wiki page