


tokyo tyrant: All content tagged as tokyo tyrant in NoSQL databases and polyglot persistence

Kyoto Tycoon 0.9.24 Released: Featuring Auto Snapshot

New release for Kyoto Tycoon — the Tokyo Tyrant Plus Plus:

Kyoto Tycoon 0.9.24 has been released. It features “auto snapshot” for on-memory databases, inspired by Redis’s snapshot mechanism.

@antirez .

Original title and link: Kyoto Tycoon 0.9.24 Released: Featuring Auto Snapshot (NoSQL databases © myNoSQL)

New versions of Kyoto Cabinet and Kyoto Tycoon Released

In a post yesterday about NoSQL comparisons, I asked when the last Tokyo Cabinet release was. It looks like the products going forward from that family are Kyoto Cabinet and Kyoto Tycoon, as Mikio Hirabayashi ☞ has announced on Twitter the release of Kyoto Cabinet 1.2.25 and Kyoto Tycoon 0.9.9:

released Kyoto Cabinet 1.2.25 and Kyoto Tycoon 0.9.9, which feature asynchronous replication!

Frank Denis

Original title and link: New versions of Kyoto Cabinet and Kyoto Tycoon Released (NoSQL databases © myNoSQL)

Another NoSQL Comparison: Evaluation Guide

The requirements were clear:

  • Fast data insertion.
  • Extremely fast random reads on large datasets.
  • Consistent read/write speed across the whole data set.
  • Efficient data storage.
  • Scale well.
  • Easy to maintain.
  • Have a network interface.
  • Stable, of course.

The list of NoSQL databases to be compared (Tokyo Cabinet, Berkeley DB, MemcacheDB, Project Voldemort, Redis, and MongoDB): not so clear.

The evaluation methodology and the results: definitely not clear at all.

NoSQL Comparison Guide / A review of Tokyo Cabinet, Tokyo Tyrant, Berkeley DB, MemcacheDB, Voldemort, Redis, MongoDB

And the conclusion is quite wrong:

Although MongoDB is the solution for most NoSQL use cases, it’s not the only solution for all NoSQL needs.

There were a couple of people asking for more details about my comments on this NoSQL comparison, so here they are:

  1. the initial list of NoSQL databases to be evaluated looks a bit random at first glance. It includes some rarely used solutions (memcachedb), some that are not , while leaving aside others that, at least at a high level, would match the characteristics of those in the list (Riak, Membase)
  2. another reason for considering the initial choice a bit random is that, while scaling is listed as one of the requirements, the only truly scalable solution in the list would be Project Voldemort. The recently added auto-sharding and replica sets would make MongoDB a candidate too, but a search on the MongoDB group would show that the solution is still young
  3. even if the set of requirements is clear, there’s no indication of what kind of evaluation was done or how it was performed. Without knowing the what and the how, it is impossible to consider the results relevant.
  4. as Janl wrote about benchmarks, most of the time you are doing it wrong. Creating good, trustworthy, useful, relevant benchmarks is very difficult
  5. .
  6. the matrix lists characteristics that are difficult to measure. And there are no comments on how the thumbs up were given. Examples: what is manageability and how was that measured? Same questions for stability and feature set.
  7. because most of it sounds speculative, here are a couple of speculations:
    1. judging by the thumbs up MongoDB received for insertion/random reads on large data sets, I can assume that the data hasn’t exceeded the available memory. But on the other hand, Redis was dismissed and received fewer votes due to its “more” in-memory character
    2. Tokyo Cabinet and Redis project activity and community were ranked the same. When was the last release of Tokyo Cabinet?
  8. I’m leaving it up to you to decide why the conclusion, “Although MongoDB is the solution for most NoSQL use cases”, is wrong.

Original title and link: Another NoSQL Comparison: Evaluation Guide (NoSQL databases © myNoSQL)


Kyoto Tycoon: Tokyo Tyrant Plus Plus

Back in January I was writing about Kyoto Cabinet, the successor of Tokyo Cabinet, which reached the 1.0.0 stable version around May. Tokyo Cabinet needed Tokyo Tyrant for distributed environments.

So, if Tokyo Cabinet got Kyoto Cabinet as a successor, Tokyo Tyrant got Kyoto Tycoon as its successor. But this time it is not only an implementation language port, as Kyoto Tycoon also behaves as a cache system with support for auto expiration (something similar to memcached). Moreover, Kyoto Tycoon offers a RESTful-style interface.
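To make the "cache with auto expiration" idea concrete, here is a minimal in-memory sketch. It is not Kyoto Tycoon's implementation or API; the `ExpiringCache` class, its method names and the lazy eviction on read are all assumptions chosen for illustration.

```python
import time


class ExpiringCache:
    """Toy key-value cache whose records can carry a time-to-live (hypothetical API)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so expiry can be tested without sleeping
        self._items = {}      # key -> (value, expires_at or None)

    def set(self, key, value, ttl=None):
        # ttl is in seconds; None means the record never expires.
        expires_at = None if ttl is None else self._clock() + ttl
        self._items[key] = (value, expires_at)

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._items[key]  # lazily drop the expired record on access
            return default
        return value
```

A real server would also need a background sweep for records that are never read again; this sketch only evicts on access.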

You can read more about Kyoto Tycoon ☞ here.

Update: Brenden Grace has ☞ a post to which Mikio Hirabayashi, Tokyo and Kyoto creator, responded.

Original title and link: Kyoto Tycoon: Tokyo Tyrant Plus Plus (NoSQL databases © myNoSQL)

Netvibes: A Large Scale Tokyo Tyrant Deployment Case Study

Since the release of Kyoto Cabinet 1.0, which can be considered the successor of Tokyo Cabinet, I haven’t heard much from the Tokyo Cabinet/Tyrant world (except some political news or some furniture-related announcements on craigslist, but these are not really of interest to the NoSQL community).

Some time ago I had a chance to talk with Florent Solt (@florentsolt), Chief Architect at ☞ Netvibes, about their usage of the Tokyo family (Tokyo Cabinet and Tokyo Tyrant). While I don’t have enough details about the Tokyo market, I’d be ready to speculate that Netvibes is probably one of the biggest users of the Tokyo products family.

To give you a quick overview of the Netvibes system, here are some interesting points in random order:

  • Netvibes uses Tokyo Tyrant, never Tokyo Cabinet directly
  • Netvibes architecture is a master-slave architecture (due to weird things in master-master)
  • Netvibes is using its own sharding method
  • Netvibes makes use of Tokyo Cabinet hash, btree and table storage
  • only feed-related information is in Tokyo databases (feeds, items, read/unread, …)
  • other information is still in a MySQL database (accounts, tabs, pages, widgets, …)
  • to schedule crawling events, a queue has been implemented with a Tokyo Tyrant server and Lua
  • Netvibes is using a custom transparent proxy (Ruby + EventMachine) to move/migrate data between servers

And now the Q&A part:

nosql: It sounds like initially all data lived inside MySQL. What made you look to alternative storage solutions?

Florent: Exactly. We started looking at an alternative when we reached MySQL limits. It was mostly disk space fragmentation issues (with blobs) and raw speed for insert.

nosql: How did you choose Tokyo Cabinet and Tokyo Tyrant?

Florent: We did some research, but 1.5 years ago, there were fewer solutions than now.

So we did some benchmarks, based on our own data (very important) and our architecture. We tried: Hadoop, CouchDB, Tokyo Tyrant, file system only (just to have a raw comparison with what is, IMHO, one of the simplest ways to store data) and MySQL.

In terms of budget, responsiveness and knowledge gap, Tokyo was the winner.

nosql: What data has been moved to Tokyo?

Florent: We are using Tokyo for our feeds backend. Everything related to feeds such as feed items, enclosures, read/unread flags are stored in Tokyo. Same goes for the data structures we need to crawl all these feeds, such as a queue.

nosql: What criteria have you used to make this separation?

Florent: The separation was not clearly related to Tokyo, it was a product decision. We wanted to implement this feed backend as a standalone module. We only interact with it through an API.

nosql: How have you migrated existing data?

Florent: Indeed, initially feeds data was in MySQL tables.

The migration was simple in terms of logic, but long and difficult to achieve. The main idea was that when data unknown to the new backend was requested, a fallback query asked MySQL for it, and the result was then saved in Tokyo. It sounds easy, but in reality there were many specific cases and strange issues.
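The fallback logic Florent describes can be sketched roughly as below. This is an illustration only, not the Netvibes code: plain dictionaries stand in for Tokyo Tyrant and MySQL, and the `MigratingStore` name and its methods are hypothetical.

```python
class MigratingStore:
    """Read-through migration: serve from the new store, fall back to the legacy one."""

    def __init__(self, new_store, legacy_store):
        self.new_store = new_store        # stands in for the Tokyo backend
        self.legacy_store = legacy_store  # stands in for MySQL

    def get(self, key):
        value = self.new_store.get(key)
        if value is not None:
            return value
        # Unknown to the new backend: ask the legacy store, then copy the result over.
        value = self.legacy_store.get(key)
        if value is not None:
            self.new_store[key] = value
        return value

    def put(self, key, value):
        # New writes go only to the new backend.
        self.new_store[key] = value


legacy = {"feed:1": "Example feed"}
new = {}
store = MigratingStore(new, legacy)
first = store.get("feed:1")   # answered by the legacy store and copied over
second = store.get("feed:1")  # now answered by the new store
```

With this approach the data set migrates gradually as it is read; records that are never requested still need a separate batch copy.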

nosql: You are using Tokyo hash, btree and tables. Would you mind giving some examples for what kind of data lives in each of them and how have you decided that is the best option?

Florent: When you really understand each structure, it’s pretty easy to pick the best choice. For example:

  • When we need only raw speed, we use a hash.
  • When we need complex key strategies (based on prefix), we use btree.
  • When we need conditional queries, we use tables.

For example, feeds (url, title, author, …) are stored in a Table. Same goes for the feed items and enclosures.

The queue is a Hash, to keep the focus on speed. The first implementation was based on a BTree, but we improved our algorithms to have guessable keys only and prevent key scanning. There are also some Lua functions attached to hide the implementation and keep the whole thing fast too.

Flags (where we store read/unread data) are stored in a BTree with a Lua extension because we are scanning keys a lot.
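One way to get a queue with "guessable keys only" on a hash store is to keep two counters and derive every item key from them, so no key scan is ever needed. The sketch below assumes that design; the interview does not say how Netvibes actually builds its keys. The `KeyValueQueue` class and the key layout are hypothetical, and a dictionary stands in for the hash database.

```python
class KeyValueQueue:
    """FIFO queue on a plain key-value store using two counters and sequential keys."""

    def __init__(self, store, name="queue"):
        self.store = store
        self.name = name
        self.store.setdefault(f"{name}:head", 0)  # position of the next item to read
        self.store.setdefault(f"{name}:tail", 0)  # position of the next item to write

    def push(self, item):
        tail = self.store[f"{self.name}:tail"]
        self.store[f"{self.name}:{tail}"] = item  # the key is computed, never searched for
        self.store[f"{self.name}:tail"] = tail + 1

    def pop(self):
        head = self.store[f"{self.name}:head"]
        if head == self.store[f"{self.name}:tail"]:
            return None  # queue is empty
        item = self.store.pop(f"{self.name}:{head}")
        self.store[f"{self.name}:head"] = head + 1
        return item
```

Against a real server, each of `push` and `pop` would have to run atomically, which is presumably one reason to put such logic in server-side functions.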
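The reason a BTree suits prefix-based key strategies is that its keys are kept in order, so all keys sharing a prefix sit next to each other. The following sketch imitates that with a sorted list and `bisect`; the `SortedKeyStore` class and the `flags:<user>:<item>` key layout are assumptions for illustration, not the Netvibes schema.

```python
import bisect


class SortedKeyStore:
    """Toy ordered key-value store that supports prefix scans, like a B-tree would."""

    def __init__(self):
        self._keys = []    # kept sorted at all times
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan_prefix(self, prefix):
        # Jump to the first key >= prefix, then walk forward while the prefix matches.
        start = bisect.bisect_left(self._keys, prefix)
        result = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            result.append((key, self._values[key]))
        return result


flags = SortedKeyStore()
flags.put("flags:42:1001", "read")
flags.put("flags:7:1001", "unread")
flags.put("flags:42:1002", "unread")
user_42 = flags.scan_prefix("flags:42:")
```

A hash store cannot answer `scan_prefix` without visiting every key, which is why it only pays off when the exact key is known in advance.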

nosql: Can you speak a bit more about the in-house sharding solution you are using?

Florent: Sure. Tokyo does not come with a sharding or dynamic partitioning implementation, so we built our own solution. It’s feed or user centric. For example, we know that the feed table will always fit on one dedicated server, whatever the number of feeds. So, for each feed we store where (the id of the shard server) its items are.

For the flags, the same logic applies: for a given user we know where his flags are. It makes it easy to add new shards, because it’s a line in a configuration file. And we have created all the scripts we need to move data from one shard to another (migration, auto-balance, …).
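This kind of lookup-based sharding can be sketched as a small directory that remembers, per feed, which shard holds its items. Everything below is hypothetical: the shard names and addresses, the `ShardDirectory` class, and in particular the "least loaded shard" placement rule, which the interview does not describe.

```python
# Hypothetical shard configuration: adding a shard means adding one line here.
SHARDS = {
    "shard-1": "tyrant-1.example.internal:1978",
    "shard-2": "tyrant-2.example.internal:1978",
}


class ShardDirectory:
    """Keeps an explicit feed -> shard mapping instead of hashing keys to servers."""

    def __init__(self, shards):
        self.shards = dict(shards)
        self.location = {}  # feed id -> shard id (stored alongside the feed in practice)

    def add_shard(self, shard_id, address):
        self.shards[shard_id] = address

    def assign(self, feed_id):
        """Place a new feed on the shard holding the fewest feeds (assumed policy)."""
        if feed_id not in self.location:
            counts = {shard: 0 for shard in self.shards}
            for shard in self.location.values():
                counts[shard] += 1
            # sorted() makes ties resolve deterministically by shard name
            self.location[feed_id] = min(sorted(counts), key=counts.get)
        return self.location[feed_id]

    def address_for(self, feed_id):
        return self.shards[self.location[feed_id]]

    def move(self, feed_id, target_shard):
        # A real migration script would copy the data before updating the pointer.
        if target_shard not in self.shards:
            raise KeyError(target_shard)
        self.location[feed_id] = target_shard
```

Because the mapping is explicit, existing feeds stay where they are when a shard is added, and rebalancing becomes a matter of moving individual feeds and updating their entries.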

nosql: What lessons have you learned that you’d have liked to know before using Tokyo?

Florent: Very difficult to say as we have learned so much with this project.

Maybe the most important point would be to know how Tokyo Tyrant servers manage load and what the best practices are to prevent common speed issues. That was what we learned the hard way.

nosql: Any numbers about Netvibes Tokyo deployment you can share with us?

Florent: About numbers, you already know that it’s sensitive information :-). I can’t say more than the numbers in my slides.

nosql: Fair enough. Thank you so much Florent!

Japanese Blogs Post Benchmarks on Membase, Memcached, Tokyo Tyrant and Redis

Two Japanese blogs[1] have published some benchmarks comparing the newly released membase with memcached, Tokyo Tyrant and Redis.

Unfortunately both of them are just new examples of useless benchmarks:

  • only 1000 keys
  • the benchmark doesn’t vary the size of keys and values
  • no concurrency
  • no mixed reads/writes
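For contrast, a harness that avoids the four shortcomings above might look roughly like the sketch below. It is only a skeleton under stated assumptions: a dictionary guarded by a lock stands in for a real database client, and the `run_benchmark` function, its parameters and their defaults are all made up for illustration.

```python
import random
import threading
import time


def run_benchmark(store, num_keys=100_000, num_threads=8, ops_per_thread=10_000,
                  read_ratio=0.8, key_sizes=(16, 64), value_sizes=(16, 256, 4096),
                  seed=0):
    """Run a mixed, concurrent read/write workload with varied key and value sizes."""
    lock = threading.Lock()  # only needed because a plain dict stands in for a client
    counts = {"reads": 0, "writes": 0}

    def worker(worker_id):
        rng = random.Random(seed + worker_id)  # reproducible per-thread workload
        reads = writes = 0
        for _ in range(ops_per_thread):
            n = rng.randrange(num_keys)
            # Key length depends only on n, so the same n always maps to the same key.
            key = f"key:{n}".ljust(key_sizes[n % len(key_sizes)], "_")
            if rng.random() < read_ratio:
                with lock:
                    store.get(key)
                reads += 1
            else:
                value = b"x" * rng.choice(value_sizes)
                with lock:
                    store[key] = value
                writes += 1
        with lock:
            counts["reads"] += reads
            counts["writes"] += writes

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    started = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    elapsed = max(time.perf_counter() - started, 1e-9)
    total = counts["reads"] + counts["writes"]
    return {"reads": counts["reads"], "writes": counts["writes"],
            "elapsed_seconds": elapsed, "ops_per_second": total / elapsed}
```

Even this leaves out things a serious benchmark needs, such as warm-up, latency percentiles, and data sets larger than memory.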

I’d strongly suggest anyone planning to build a solid benchmark to take a look at these NoSQL benchmarks and performance evaluations to learn how to build useful/correct ones[2].

  1. The two benchmarks are published ☞ here and ☞ here. Unfortunately I don’t read Japanese and I’ve used Google Translator (which pretty much didn’t work)
  2. Another useful resource about building correct benchmarks is Jan Lehnardt’s ☞ Benchmarks: You are Doing it Wrong

Presentation: Tokyo Cabinet / Tyrant @ Nosql Paris

Embedded below are the slides from Florent Solt’s (@florentsolt) Tokyo Cabinet / Tyrant talk, presented at Nosql Paris.

Florent seems to be working at Netvibes, and his slides briefly present what kind of Tokyo Cabinet setup is in use there and how.

I also liked the Tokyo Cabinet / Tyrant strength and weaknesses slides:

Tokyo Cabinet / Tyrant Weaknesses

Tokyo Cabinet / Tyrant Strengths

  • Easy to deploy and setup
  • Easy to use
  • It’s not a black box
  • Good to very good performance most of the time
  • Small memory footprint
  • A single Tokyo Tyrant process can handle thousands of connections
  • Many command line tools
  • Lua extensions

I’d definitely be interested to hear much more about how Netvibes is using Tokyo Cabinet / Tyrant, so ping me if you are ready to share more with the Tokyo Cabinet community.