bigdata: All content tagged as bigdata in NoSQL databases and polyglot persistence
In 4 years of writing this blog I haven’t seen such a prolific month:
- Apache Hadoop 2.2.0
- Apache HBase 0.96
- Apache Hive 0.12
- Apache Ambari 1.4.1
- Apache Pig 0.12
- Apache Oozie 4.0.0
- Plus Presto.
Actually, I don’t think I’ve ever seen an ecosystem like the one created around Hadoop.
Original title and link: A prolific season for Hadoop and its ecosystem ( ©myNoSQL)
At the end of October, Hortonworks shared a new set of results:
Historically, even simple Hive queries could not run in less than 30 seconds, yet many of these queries are running in less than 10 seconds. How did that happen? The answer mainly boils down to Apache Tez and Apache Hadoop YARN, which proves that Hadoop is more than just batch. Tez features such as container pre-launch and re-use overcome Hadoop’s traditional latency barriers, and are available to any data processing framework running in Hadoop.
Original title and link: Status update on Project Stinger, the interactive query for Apache Hive ( ©myNoSQL)
I’ve learned that there’s an Apache Hadoop compatibility guide that covers API, wire, and Java binary compatibility, and many other such aspects.
✚ On Cloudera’s blog, Karthik Kambatla posted “Writing Hadoop programs that work across releases”, which looks at the Hadoop API annotations and compatibility policies.
Original title and link: Apache Hadoop Compatibility Guide ( ©myNoSQL)