hbase: All content about hbase in NoSQL databases and polyglot persistence
- We say we can do failover in a couple of seconds. We want to make it subsecond, but we can’t do that reliably yet. In HBase this story is much more mixed.
- We wanted to really reduce complexity. As a result, you can just apt-get install c5 on each node and you are done. It’s one daemon, one log file, and that’s it. No Xmx nonsense, and almost no tuning or config files. I don’t know if you have dealt with Hadoop before, but the complexity is high.
- Finally, we have a much more advanced wire format. In fact, it’s advanced by being simple (protobufs + HTTP). As a result, clients in languages other than Java become very easy, without a Thrift client.
Are we in a new stage of NoSQL databases: “X that doesn’t suck”?
Original title and link: OhmData C5: an improved HBase ( ©myNoSQL)
If you’ve never used Thrift (with or without HBase), the two articles written by Jesse Anderson and posted on Cloudera’s blog will give you a quick intro:
- How-to: Use the HBase Thrift Interface, Part 1: setting up, getting the language bindings, and connecting;
- How-to: Use the HBase Thrift Interface, Part 2: Inserting/Getting Rows: using HBase’s Thrift API from Python
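For a rough idea of what inserting and getting a row through the Thrift gateway can look like from Python, here is a minimal sketch. It is not taken from the articles. It assumes the `thrift` Python package is installed, that an `hbase` module has been generated from HBase's `Hbase.thrift` with `thrift --gen py`, that a Thrift server is listening on `localhost:9090`, and that your HBase version uses the `mutateRow`/`getRow` signatures with a trailing attributes map (older releases omit it). The helper names `connect`, `put_cell`, `row_to_dict`, and `get_row_dict` are made up for illustration.

```python
def connect(host="localhost", port=9090):
    """Open a buffered, binary-protocol connection to the HBase Thrift server."""
    # Imported lazily so the helpers below load even without Thrift installed.
    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from hbase import Hbase  # assumption: generated via `thrift --gen py Hbase.thrift`

    transport = TTransport.TBufferedTransport(TSocket.TSocket(host, port))
    client = Hbase.Client(TBinaryProtocol.TBinaryProtocol(transport))
    transport.open()
    return client, transport


def put_cell(client, table, row, column, value):
    """Write one cell; `column` has the form b"family:qualifier"."""
    from hbase.ttypes import Mutation  # assumption: same generated module as above

    # The trailing {} is the attributes map; drop it on releases that lack it.
    client.mutateRow(table, row, [Mutation(column=column, value=value)], {})


def row_to_dict(results):
    """Flatten the list of TRowResult objects from getRow into {column: value}."""
    cells = {}
    for result in results:  # an empty list means the row does not exist
        for column, cell in result.columns.items():
            cells[column] = cell.value
    return cells


def get_row_dict(client, table, row):
    """Read a whole row and return its cells as a plain dict."""
    return row_to_dict(client.getRow(table, row, {}))
```

Typical use would be `client, transport = connect()`, then `put_cell(client, b"messages", b"row1", b"cf:body", b"hello")`, then `get_row_dict(client, b"messages", b"row1")`, and finally `transport.close()`. The table and column names here are hypothetical, and the byte strings are a precaution for Python 3, where Thrift binary fields expect `bytes`.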
Original title and link: An intro to HBase’s Thrift interface ( ©myNoSQL)
A presentation by Todd Eisenberger about the archival system used by Dropbox based on MySQL and HBase:
MySQL:
- fast queries for known keys over a (relatively) small dataset
- high read throughput
HBase:
- high write throughput
- large suite of pre-existing tools for distributed computation
- easier to perform large processing tasks
✚ Both are consistent
✚ Most of the benefits in HBase’s section point in the direction of data processing benefits (and not data storage benefits)
This is an important release for HBase. Both Hortonworks and Cloudera have posts covering it:
- Hortonworks: Announcing Apache HBase 0.96.0, More than 2000 issues resolved!
- Cloudera: HBase 0.96.0 Released!
Original title and link: Apache HBase 0.96.0 released after more than 2000 issues resolved ( ©myNoSQL)
Hortonworks, eBay, and Scaled Risk have been collaborating on improving the mean time to recovery (MTTR) in HBase. After lengthy testing at eBay, some results are now available for two scenarios:
- Node/RegionServer failures while writing
- Node/RegionServer failures while reading
Original title and link: Results of collaboration on improving the Mean Time to Recovery in HBase ( ©myNoSQL)
In 4 years of writing this blog I haven’t seen such a prolific month:
- Apache Hadoop 2.2.0
- Apache HBase 0.96
- Apache Hive 0.12
- Apache Ambari 1.4.1
- Apache Pig 0.12
- Apache Oozie 4.0.0
- Plus Presto.
Actually, I don’t think I’ve ever seen an ecosystem like the one created around Hadoop.
Original title and link: A prolific season for Hadoop and its ecosystem ( ©myNoSQL)