NoSQL Benchmarks NoSQL use cases NoSQL Videos NoSQL Hybrid Solutions NoSQL Presentations Big Data Hadoop MapReduce Pig Hive Flume Oozie Sqoop HDFS ZooKeeper Cascading Cascalog BigTable Cassandra HBase Hypertable Couchbase CouchDB MongoDB OrientDB RavenDB Jackrabbit Terrastore Amazon DynamoDB Redis Riak Project Voldemort Tokyo Cabinet Kyoto Cabinet memcached Amazon SimpleDB Datomic MemcacheDB M/DB GT.M Amazon Dynamo Dynomite Mnesia Yahoo! PNUTS/Sherpa Neo4j InfoGrid Sones GraphDB InfiniteGraph AllegroGraph MarkLogic Clustrix CouchDB Case Studies MongoDB Case Studies NoSQL at Adobe NoSQL at Facebook NoSQL at Twitter



3 Hash Functions

All public domain:

  • Murmur Hash

    The name, if you’re wondering, comes from the simplest sequence of operations which will thoroughly mix the bits of a value - “x *= m; x = rotate_left(x,r);” - multiply and rotate. Repeat that about 15 times using ‘good’ values of m and r, and x will end up pseudo-randomized. Unfortunately multiply+rotate has a few major weaknesses when used in a hash function, so I used multiply+shift+xor. I liked the name Murmur better than Musxmusx, so I kept it.

    Wikipedia page lists Hadoop, libmemcached, and Kyoto Cabinet as using MurmurHash.

  • Fowler/Noll/Vo (FNV) hash

    FNV hashes are designed to be fast while maintaining a low collision rate. The FNV speed allows one to quickly hash lots of data while maintaining a reasonable collision rate. The high dispersion of the FNV hashes makes them well suited for hashing nearly identical strings such as URLs, hostnames, filenames, text, IP addresses, etc.

  • Jenkins hash function

    I offer you a new hash function for hash table lookup that is faster and more thorough than the one you are using now. I also give you a way to verify that it is more thorough.

    Jenkins hash is used in HBase too. I’m waiting for THE book to learn much more about HBase.

Original title and link: 3 Hash Functions (NoSQL databases © myNoSQL)