HBase Data Modeling Tips & Tricks - Timeshifting

Jeff Kolesky describes the data model his team uses with HBase, along with one (strange) trick to reduce round trips to the database:

The idea is to put all of the data about a single entity into a single row in HBase. When you need to run a computation that involves that entity’s data, you have quick access to it by the row key, and all of the data is stored close together on disk.

Additionally, against many suggestions from the HBase community, and general confusion about how timestamps work, we are using timestamps with logical values. Instead of just letting the region server assign a timestamp version to each cell, we are explicitly setting those values so that we can use timestamp as a true queryable dimension in our gets and scans.

In addition to the real timeseries data that is indexed using the cell timestamp, we also have other columns that store metadata about the entity.
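To make the idea concrete, here is a minimal in-memory sketch of that layout. It is a toy model written for this post, not the HBase client API and not the code from the original article: the `EntityRow` class, the column names, and the date-style timestamps are all assumptions made for illustration. It keeps every cell of one entity in one row, stores each value under an explicitly chosen logical timestamp, and answers range queries with an inclusive lower bound and an exclusive upper bound, newest version first, which is how HBase time ranges behave.

```python
from collections import defaultdict


class EntityRow:
    """Toy stand-in for a single HBase row: column -> {timestamp: value}."""

    def __init__(self, row_key):
        self.row_key = row_key
        self._cells = defaultdict(dict)

    def put(self, column, timestamp, value):
        # The caller picks the timestamp (a logical value), instead of
        # letting the server stamp the cell with the write time.
        self._cells[column][timestamp] = value

    def get_range(self, column, min_ts, max_ts):
        """Return (timestamp, value) pairs with min_ts <= timestamp < max_ts,
        newest first."""
        versions = self._cells.get(column, {})
        return [
            (ts, versions[ts])
            for ts in sorted(versions, reverse=True)
            if min_ts <= ts < max_ts
        ]


# Hypothetical entity: time series values plus one metadata column.
row = EntityRow("entity-42")
row.put("ts:value", 20120101, 3.5)
row.put("ts:value", 20120102, 4.0)
row.put("ts:value", 20120103, 4.5)
row.put("meta:name", 0, "sensor A")

# One lookup by row key, then a slice along the timestamp dimension.
window = row.get_range("ts:value", 20120102, 20120104)
print(window)
```

Because the time series and the metadata share a row key, a single read of that row is enough to get both, which is where the saved round trips come from.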

It’s amazing how many smart, weird tricks engineers put into their production systems when they have to meet real requirements and SLAs.

Original title and link: HBase Data Modeling Tips & Tricks - Timeshifting (NoSQL database © myNoSQL)

via: http://www.heyitsopower.com/code/timeshifting-in-hbase/