


tokyo cabinet: All content tagged as tokyo cabinet in NoSQL databases and polyglot persistence

Presentation: Tokyo Cabinet / Tyrant @ Nosql Paris

Embedded below are the Tokyo Cabinet / Tyrant slides that Florent Solt (@florentsolt) presented at Nosql Paris.

Florent seems to be working at Netvibes, and his slides briefly present what kind of Tokyo Cabinet setup is in use there and how it is used.

I also liked the Tokyo Cabinet / Tyrant strengths and weaknesses slides:

Tokyo Cabinet / Tyrant Weaknesses

Tokyo Cabinet / Tyrant Strengths

  • Easy to deploy and setup
  • Easy to use
  • It’s not a black box
  • Good to very good performance most of the time
  • Small memory footprint
  • A single Tokyo Tyrant process can handle thousands of connections
  • Many command line tools
  • Lua extensions

I’d definitely be interested to hear much more about how Netvibes is using Tokyo Cabinet / Tyrant, so ping me if you are ready to share more with the Tokyo Cabinet community.

Introduction to Kyoto Products, Successors of Tokyo Products

I’ve just discovered these slides introducing Kyoto products, the successors of Tokyo products. The slides’ author is Mikio Hirabayashi, the creator and maintainer of Tokyo Cabinet, Tokyo Tyrant, Tokyo Promenade, Kyoto Cabinet, etc.

Now, what I have found really interesting is comparing these slides with some two-year-old slides about the Tokyo products by the same Mikio Hirabayashi.

More NoSQL-based Twitter apps

If you thought we were running out of NoSQL Twitter apps, you were definitely wrong, because I’ve just got a few more.


A Clojure-based solution to write the Twitter stream to Hadoop (by @ieure)


A simple Twitter clone written in Python and using MongoDB, by Michael Dirolf (@mdirolf). Michael has been featured on MyNoSQL a couple of times already.

Swordfish Twitter Clone

Swordfish — a key-value store built on top of Tokyo Cabinet and offering a RESTful HTTP interface — comes with a Twitter clone based on Django.

Another Tokyo Cabinet based Twitter app. There don’t seem to be many details about the project though. (via Matthew Ford)

Last, but not least, don’t forget to check the first series of NoSQL Twitter apps.

Tokyo Promenade: A Content Management System on top of Tokyo Cabinet

I didn’t know that Mikio Hirabayashi, the creator of Tokyo Cabinet and of its successor, Kyoto Cabinet, is also offering a GNU licensed content management system: ☞ Tokyo Promenade, which runs on top of Tokyo Cabinet.

According to the main page, Tokyo Promenade offers the following features:

  • simple and logical user interface : aims at conciseness like LaTeX
  • high accessibility : complying with XHTML 1.0 and considering WCAG 1.0
  • hybrid data structure : available as BBS, blog, and Wiki
  • sufficient functionality : supports user management and file management
  • high performance : uses embedded database, Tokyo Cabinet
  • lightweight : implemented by C99 and without any dependency on other libraries

Now I don’t know how many would still be willing to run a CGI-based content management system, but those who do will at least have Tokyo Cabinet as its storage.

Tokyo Cabinet Tutorial: Database Types and Configuration Options

A great piece of documentation for the 3 different storage types supported by Tokyo Cabinet: hash, B+ tree and fixed-length. The article also features a long list of tuning parameters.

Here are a couple of things that I’ve learned myself:

  • Tokyo Cabinet supports multi-operation transactions
  • the extension of the file determines the type of storage:
    • tch: Tokyo Cabinet Hash database
    • tcb: Tokyo Cabinet B+Tree database
    • tcf: Tokyo Cabinet Fixed-Length database
  • while Tokyo Cabinet B+Tree storage might be a bit slower than Tokyo Cabinet Hash storage, it brings new features:
    • keys are ordered (default lexical, but can be configured by passing a comparison function)
    • as a consequence it supports key ranges
    • allows duplicate values to be stored under the same key
  • Tokyo Cabinet Fixed-Length has some restrictions:
    • all keys are positive integers
    • as with Tokyo Cabinet B+Tree, keys are ordered (by their integer value), but the ordering is not configurable
    • all stored values have a fixed length
  • on the bright side, Tokyo Cabinet Fixed-Length supports some special keys: :min, :max, :prev and :next.

I think this is a great contribution by James Edward Gray II to the Tokyo Cabinet community, which is facing some problems, including the lack of documentation.
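
To make the extension rule and the ordered-key behaviour listed above a bit more concrete, here is a minimal Python sketch. It is a toy in-memory model and not the Tokyo Cabinet API: the names `storage_type`, `OrderedStore`, `put_dup` and `key_range` are hypothetical and exist only for this illustration. It assumes the default lexical key ordering.

```python
import bisect
from pathlib import Path

# Extension-to-storage mapping, as listed in the notes above.
STORAGE_BY_EXTENSION = {
    ".tch": "hash",
    ".tcb": "b+tree",
    ".tcf": "fixed-length",
}


def storage_type(path):
    """Return the storage type implied by the file extension, or None."""
    return STORAGE_BY_EXTENSION.get(Path(path).suffix)


class OrderedStore:
    """Toy model of an ordered store that allows duplicates under one key.

    This only imitates the behaviour described for the B+Tree storage;
    it does not read or write Tokyo Cabinet files.
    """

    def __init__(self):
        self._keys = []    # kept sorted (lexical order for strings)
        self._values = []  # values, parallel to self._keys

    def put_dup(self, key, value):
        # Insert after any existing records with the same key,
        # so duplicates keep their insertion order.
        idx = bisect.bisect_right(self._keys, key)
        self._keys.insert(idx, key)
        self._values.insert(idx, value)

    def key_range(self, low, high):
        """Return (key, value) pairs with low <= key <= high, in key order."""
        start = bisect.bisect_left(self._keys, low)
        stop = bisect.bisect_right(self._keys, high)
        return list(zip(self._keys[start:stop], self._values[start:stop]))


store = OrderedStore()
store.put_dup("banana", "first")
store.put_dup("apple", "only")
store.put_dup("cherry", "only")
store.put_dup("banana", "second")

in_range = store.key_range("apple", "banana")
```

Because the keys are kept sorted, a range lookup is just two binary searches, which is the property a hash storage cannot offer.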


Concerns in the Tokyo Cabinet Community

The Tokyo Cabinet community is starting to express its concerns about the future of the project. Back when I covered Kyoto Cabinet, the successor of Tokyo Cabinet, I expressed the same concerns. Unfortunately, even though I tried to contact the creator of these projects to shed some light on their future, I got no response.

I really hope this will not be an issue for the Tokyo Cabinet users/community and they will find a solution that will work well for everyone.



Quick Intro to Tokyo Cabinet and Oklahoma_Mixer Ruby Library

A nice article covering the basic ops in Tokyo Cabinet (getting and setting keys, counters and appended values, transactions, and database file management) with the help of the oklahoma_mixer ☞ Ruby library.

Those are the basics of using Tokyo Cabinet as a key-value store, but there’s really a lot more to what Tokyo Cabinet can do.


I am wondering if Kyoto Cabinet, the Tokyo Cabinet successor, will maintain API compatibility so that the upgrade path will not get too complicated.


Tokyo Cabinet Database Recovery

Even if you don’t hit this ugly issue, non-transactional Tokyo Cabinet is not crash safe. Toru Maesaka took the time to document the recovery process:

  1. confirm that the database is broken by using the command line tools:

    Look at the “additional flags” line on the output of tchmgr inform or tcbmgr inform depending on your database type. If it says, “fetal” then your file is really broken. If it says “open”, it means that your application died or exited without closing the database. A file in the “open” state is still usable but your most recent records are most likely unavailable.

    Note: Even if I wish you’ll never have to see it, I really hope that the term is “fatal” (instead of “fetal”).

  2. use the Tokyo Cabinet API

    1. open the database file without the lock option
    2. run tchdboptimize() or tcbdboptimize()

    […] If you’re lucky, the above would repair the database that is associated with TC’s database object.

    Note: the last sentence doesn’t sound too encouraging.

  3. use the Tokyo Cabinet command line tools (alternative solution)

    TC provides a utility program called tchmgr (for a hash database) and tcbmgr (for a b+tree database) which allows you to run optimize on a database file.
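
The decision in step 1 can be written down as a tiny helper. This is only a sketch under assumptions: it takes the text of the “additional flags” line as a plain string (the exact layout of the tchmgr inform output is not shown here, so no parsing of it is attempted), the function name `classify_flags` and the returned labels are hypothetical, and treating the absence of both words as a cleanly closed file is my own assumption.

```python
def classify_flags(flags):
    """Map the text of the "additional flags" line to a coarse state.

    "broken"   -> the file is really broken; try the optimize step
    "unclosed" -> the database was not closed; the file is still usable,
                  but the most recent records are most likely unavailable
    "clean"    -> neither word present (assumed to mean a clean close)
    """
    words = flags.lower().split()
    # Accept both spellings: the quoted text says "fetal",
    # while the intended term is presumably "fatal".
    if "fatal" in words or "fetal" in words:
        return "broken"
    if "open" in words:
        return "unclosed"
    return "clean"


state = classify_flags("open")
```

A file reporting both words is treated as broken, since that is the more serious of the two states.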

Bearing in mind that the main developer of Tokyo Cabinet is already working on the Tokyo Cabinet successor, I really hope for two things:

  1. somebody will translate the Tokyo Cabinet documentation into English
  2. the Tokyo Cabinet successor will have both better tools and better documentation.


Kyoto Cabinet: The successor of Tokyo Cabinet

It looks like Mikio Hirabayashi, the author of Tokyo Cabinet, is moving on and has started developing the successor of Tokyo Cabinet. The name of the new project is Kyoto Cabinet. The project web page [1] looks extremely similar to the one of Tokyo Cabinet [2].

Tokyo to Kyoto

By comparing the declared goals of the two projects and the rest of the (scarce) documentation, the only major differences I could find are that Kyoto Cabinet is written in C++ and that it aims to support non-POSIX systems.

While Kyoto Cabinet is still in alpha, I cannot help but wonder what the future of Tokyo Cabinet is. Is there a community behind it to at least take care of any major bugs and help with the migration when Kyoto becomes more solid? (note: I tried to contact Mikio Hirabayashi but I haven’t heard back).

Tokyo Cabinet and CouchDB as Mnesia backends

If you are somewhat familiar with Erlang, you already know that Mnesia is a distributed database system that was designed with the following goals in mind:

  • Fast real-time key/value lookup
  • Complicated non real-time queries mainly for operation and maintenance
  • Distributed data due to distributed applications
  • High fault tolerance
  • Dynamic re-configuration
  • Complex objects

Even if the presentation is not so great (see below), Rickard Cardell’s experiments with using Tokyo Cabinet and CouchDB as Mnesia backends sound like a new and interesting use case for NoSQL solutions.

Ugly 2GB limit in Tokyo Cabinet

When my hash database was reaching 2GB in size, the datafile would become corrupt. What’s scary is that at that stage, writes to the database seemingly disappeared into the void.

Oops! That’s definitely not something you’d like to see happening with your Tokyo Cabinet. Ugly issue!



Geo NoSQL: CouchDB, MongoDB, and Tokyo Cabinet

A lot of people say that location-enabled services will be the #### [*] of tomorrow, so is there any Geo NoSQL?

Populating a MongoDB with POIs

What I especially liked is the flexibility you get from this kind of databases (nb MongoDB) and the ease of installation and use. The downside for geographic applications is that at the moment there is no built-in support for geometries.

Using MongoDB to Store Geographic Data

Managing GIS data with NoSQL in circumstances where performances and scalability are a major issue could be the way for the win.

GeoCouch: The future

What I call “complex analytics” is things like: “return all apple trees that are located with a 10km range around buildings that have are over 100m high, but only in countries with a population over 50 million people” is not possible with GeoCouch as you would need the attribute values as well. Those are stored in CouchDB, so you would need to request them. What GeoCouch only supports is a simple: give me all IDs within a bounding box/polygon/radius.

Tokyo Cabinet: Loading and querying point data

I’m going to load 500.000 POIs in a database and query them with a bounding box query. I will use the table database from Tokyo Cabinet because it supports the most querying facilities. With a table database you can query numbers with full matched and range queries and for strings you can do full matching, forward matching, regular expression matching,…
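
A bounding box query of the kind described in the quote boils down to two numeric range conditions, one on latitude and one on longitude. The Python sketch below shows just that logic on an in-memory list. It does not use the Tokyo Cabinet table database; the `Poi` type, the function names and the three sample points (approximate city coordinates) are hypothetical and only serve the illustration. It also assumes the box does not cross the 180° meridian.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Poi:
    name: str
    lat: float  # degrees, positive north
    lon: float  # degrees, positive east


def in_range(value, low, high):
    """Numeric range condition: low <= value <= high."""
    return low <= value <= high


def bounding_box_query(pois, south, west, north, east):
    """Return the POIs inside the box, keeping the input order.

    Assumes west <= east, i.e. the box does not wrap around
    the 180 degree meridian.
    """
    return [
        p for p in pois
        if in_range(p.lat, south, north) and in_range(p.lon, west, east)
    ]


# Illustrative sample data only.
pois = [
    Poi("Paris", 48.86, 2.35),
    Poi("Tokyo", 35.68, 139.65),
    Poi("Kyoto", 35.01, 135.77),
]

hits = bounding_box_query(pois, south=34.0, west=135.0, north=36.0, east=140.0)
names = [p.name for p in hits]
```

With 500.000 points a linear scan like this one gets slow, which is exactly where the range query support of a real store matters.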

And so the answer is: yes, we do have some Geo NoSQL!

In some geo parts of the world we are celebrating Christmas today, so Merry Christmas to everyone!