Haskell: All content tagged as Haskell in NoSQL databases and polyglot persistence

Writing a Simple Keyword Search Engine Using Haskell and Redis

A good guide on how to translate logical operators into Redis set commands:

Once the query is parsed we need to evaluate the AST and get the results. The leaf case is simple: given a “Contains” query, the only task is to create the search term from the text and look up the values in the inverted index. The Boolean operators are more interesting. As is obvious, “and” and “or” correspond to set intersection and set union. One way to implement this would be to look up the values from the left-hand side and the right-hand side and calculate the intersection, but this is a very poor choice because it involves pulling all matching documents from Redis to the application server. A better choice is to let a Redis command do the heavy lifting. Redis provides ZINTERSTORE and ZUNIONSTORE for just this purpose. Given two (or more) existing keys, they calculate the intersection or union of those keys and store the result in a new key. Storing intermediate results in new keys runs the risk that memory usage will grow unbounded, but it does allow query fragment caching (as long as the intermediate key represents the query). Although this doesn’t make much difference for such a small data set, it’s likely that on a large data set caching frequently used query fragments will give some benefit. Redis provides expiry commands such as TTL, EXPIRE and PERSIST to help address the memory consumption issue. I’ve made sure that intermediate keys have a short TTL, and that should (hopefully) mean that memory doesn’t grow without bound.
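To make the evaluation strategy concrete, here is a minimal sketch of the idea in Python rather than the article's Haskell, and without talking to Redis at all. `FakeStore` is a hypothetical in-memory stand-in whose `interstore`, `unionstore` and `expire` methods only mimic the store-into-a-new-key behaviour described above. The key naming scheme (`term:...`, `query:(...)`) and the 60 second TTL are assumptions made for illustration, not what the original code does.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Union


@dataclass(frozen=True)
class Contains:
    term: str


@dataclass(frozen=True)
class And:
    left: "Query"
    right: "Query"


@dataclass(frozen=True)
class Or:
    left: "Query"
    right: "Query"


Query = Union[Contains, And, Or]


class FakeStore:
    """Hypothetical in-memory stand-in for the Redis operations used here."""

    def __init__(self) -> None:
        self.sets: Dict[str, Set[str]] = {}
        self.ttls: Dict[str, int] = {}

    def add(self, key: str, *members: str) -> None:
        self.sets.setdefault(key, set()).update(members)

    def interstore(self, dest: str, keys: List[str]) -> None:
        # Stands in for ZINTERSTORE: intersect the inputs, store under dest.
        self.sets[dest] = set.intersection(*(self.sets.get(k, set()) for k in keys))

    def unionstore(self, dest: str, keys: List[str]) -> None:
        # Stands in for ZUNIONSTORE: union the inputs, store under dest.
        self.sets[dest] = set().union(*(self.sets.get(k, set()) for k in keys))

    def expire(self, key: str, seconds: int) -> None:
        # Only records the TTL; a real server would evict the key later.
        self.ttls[key] = seconds

    def members(self, key: str) -> Set[str]:
        return set(self.sets.get(key, set()))


def key_for(query: Query) -> str:
    """Derive a key from the query itself, so equal fragments share a key."""
    if isinstance(query, Contains):
        return "term:" + query.term.lower()
    op = "and" if isinstance(query, And) else "or"
    return f"query:({key_for(query.left)} {op} {key_for(query.right)})"


def evaluate(store: FakeStore, query: Query, ttl_seconds: int = 60) -> str:
    """Evaluate the AST inside the store and return the key holding the result."""
    if isinstance(query, Contains):
        return key_for(query)  # leaf: the inverted index entry already exists
    left = evaluate(store, query.left, ttl_seconds)
    right = evaluate(store, query.right, ttl_seconds)
    dest = key_for(query)
    if isinstance(query, And):
        store.interstore(dest, [left, right])
    else:
        store.unionstore(dest, [left, right])
    store.expire(dest, ttl_seconds)  # short TTL on every intermediate key
    return dest


store = FakeStore()
store.add("term:haskell", "doc1", "doc2")
store.add("term:redis", "doc2", "doc3")
store.add("term:search", "doc3")

query = And(Contains("Haskell"), Or(Contains("redis"), Contains("search")))
result_key = evaluate(store, query)
print(sorted(store.members(result_key)))  # ['doc2']
```

Because each intermediate key is derived from the query fragment it represents, a repeated fragment maps to the same key, which is what makes the caching mentioned above possible. This sketch always recomputes; a real implementation would first check whether the key still exists.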

Caveat: don’t get tricked into thinking search is about inverted indexes. It is about relevancy (i.e. ranking algorithms). For that, Redis sorted sets would come in handy, since each member has an associated score.
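As a rough illustration of how scores could feed ranking, the sketch below models each term's sorted set as a plain Python dict from document id to score. The helper names `intersect_scores` and `top_documents` are hypothetical, and the scores are made-up term weights. Summing scores across terms mirrors the default SUM aggregation of ZINTERSTORE, but the code runs entirely in memory and the tie-breaking rule is an arbitrary choice.

```python
from typing import Dict, List, Tuple


def intersect_scores(zsets: List[Dict[str, float]]) -> Dict[str, float]:
    """Keep documents present in every input and sum their per-term scores."""
    common = set(zsets[0]).intersection(*zsets[1:])
    return {doc: sum(z[doc] for z in zsets) for doc in common}


def top_documents(scores: Dict[str, float], limit: int) -> List[Tuple[str, float]]:
    """Return the highest-scoring documents first (ties broken by document id)."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


# Made-up per-term scores, e.g. how often the term occurs in each document.
haskell = {"doc1": 3.0, "doc2": 1.0}
redis = {"doc1": 1.0, "doc2": 4.0, "doc3": 2.0}

combined = intersect_scores([haskell, redis])
print(top_documents(combined, 10))  # [('doc2', 5.0), ('doc1', 4.0)]
```

How the per-term scores are computed is the actual ranking problem; sorted sets only give a convenient place to store and combine them.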

Original title and link: Writing a Simple Keyword Search Engine Using Haskell and Redis (NoSQL databases © myNoSQL)

via: http://www.fatvat.co.uk/2011/06/writing-simple-keyword-search-engine.html