Jeff Kolesky describes the data model his team uses with HBase, along with one (strange) trick to reduce roundtrips to the database:
The idea is to put all of the data about a single entity into a single row in HBase.
When you need to run a computation that involves that entity’s data,
you have quick access to it by
the row key, and all of the data is stored close together on disk.
Additionally, against many suggestions from the HBase community,
and general confusion about how timestamps work, we are using
timestamps with logical values. Instead of just letting the region server
assign a timestamp version to each cell, we are explicitly setting those
so that we can use timestamp as a true queryable dimension in our gets and scans.
In addition to the real timeseries data that is indexed using the cell
timestamp, we also have other columns that store metadata about the entity.
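To make the idea concrete, here is a minimal in-memory sketch of the model described in the quote: one row per entity, with each cell version keyed by a timestamp the writer chooses. It is a toy, not the HBase client API and not Kolesky's code. The `EntityRow` class, the column names, and the use of `YYYYMMDD` integers as logical timestamps are all assumptions made for illustration. In the real Java client the rough equivalents would be `Put.addColumn` with an explicit timestamp and `Get.setTimeRange`.

```python
from bisect import bisect_left
from collections import defaultdict


class EntityRow:
    """Toy stand-in for one HBase row: column -> {timestamp: value}."""

    def __init__(self, row_key):
        self.row_key = row_key
        self.cells = defaultdict(dict)

    def put(self, column, timestamp, value):
        # The caller supplies the timestamp (a logical value, here a
        # YYYYMMDD integer) instead of letting the server stamp the write.
        self.cells[column][timestamp] = value

    def get(self, column, min_ts, max_ts):
        # Return versions with min_ts <= timestamp < max_ts, newest first,
        # mirroring HBase's half-open time range and descending order.
        versions = self.cells.get(column, {})
        stamps = sorted(versions)
        lo = bisect_left(stamps, min_ts)
        hi = bisect_left(stamps, max_ts)
        return [(ts, versions[ts]) for ts in reversed(stamps[lo:hi])]


# Hypothetical entity: time series and metadata live in the same row.
row = EntityRow("user:42")
row.put("ts:logins", 20120101, 3)
row.put("ts:logins", 20120102, 5)
row.put("ts:logins", 20120103, 2)
row.put("meta:name", 0, "example")

# One lookup by row key, narrowed by the logical timestamp dimension.
window = row.get("ts:logins", 20120101, 20120103)
```

Because the timestamp carries meaning, a single read against one row key can return just the slice of the series that a computation needs, which is where the saving in roundtrips would come from.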
It’s amazing how many smart and weird tricks engineers put into their production systems when they have to deal with real requirements and SLAs.
Original title and link: HBase Data Modeling Tips & Tricks - Timeshifting (©myNoSQL)