tokyo tyrant: All content tagged as tokyo tyrant in NoSQL databases and polyglot persistence
New release for Kyoto Tycoon — the Tokyo Tyrant Plus Plus:
Kyoto Tycoon 0.9.24 has been released. It features “auto snapshot” for on-memory databases, inspired by Redis’s snapshot mechanism.
Original title and link: Kyoto Tycoon 0.9.24 Released: Featuring Auto Snapshot (NoSQL databases © myNoSQL)
In a post yesterday about NoSQL comparisons, I asked when the last Tokyo Cabinet release was. It looks like the products going forward from that family are Kyoto Cabinet and Kyoto Tycoon, as Mikio Hirabayashi ☞ has announced on Twitter the release of Kyoto Cabinet 1.2.25 and Kyoto Tycoon 0.9.9:
released Kyoto Cabinet 1.2.25 and Kyoto Tycoon 0.9.9, which feature asynchronous replication!
Original title and link: New versions of Kyoto Cabinet and Kyoto Tycoon Released (NoSQL databases © myNoSQL)
So, if Tokyo Cabinet got Kyoto Cabinet as a successor, Tokyo Tyrant got Kyoto Tycoon as its successor. But this time it is not only an implementation language port, as Kyoto Tycoon also behaves as a cache system with support for auto expiration (something similar to memcached). Moreover Kyoto Tycoon is offering a RESTful-style interface.
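To make the "cache with auto expiration" idea concrete, here is a minimal Python sketch of a key-value store whose records disappear after a time-to-live. It is a toy in-memory model and not Kyoto Tycoon's API: the class name `ExpiringStore`, its methods, and the injectable clock are all assumptions made for illustration.

```python
import time


class ExpiringStore:
    """Toy in-memory key-value store with per-record expiration.

    Hypothetical illustration only; this is not Kyoto Tycoon's interface.
    """

    def __init__(self, clock=time.monotonic):
        self._clock = clock    # injectable so expiry can be exercised without sleeping
        self._records = {}     # key -> (value, expiry timestamp or None)

    def set(self, key, value, ttl=None):
        """Store value; if ttl (seconds) is given, the record expires after that long."""
        expires_at = None if ttl is None else self._clock() + ttl
        self._records[key] = (value, expires_at)

    def get(self, key, default=None):
        """Return the value, or default if the key is missing or has expired."""
        record = self._records.get(key)
        if record is None:
            return default
        value, expires_at = record
        if expires_at is not None and self._clock() >= expires_at:
            del self._records[key]   # lazily drop the expired record on access
            return default
        return value
```

Records without a TTL behave like ordinary database entries, while records with one behave like cache entries, which is roughly the memcached-like behaviour described above.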
You can read more about Kyoto Tycoon ☞ here.
Update: Brenden Grace has ☞ a post to which Mikio Hirabayashi, Tokyo and Kyoto creator, responded.
Since the release of Kyoto Cabinet 1.0, which can be considered the successor of Tokyo Cabinet, I haven’t heard much from the Tokyo Cabinet/Tyrant world (except some political news or some furniture-related announcements on craigslist, but these are not really of interest to the NoSQL community).
Some time ago I had a chance to talk with Florent Solt (@florentsolt), Chief Architect at ☞ Netvibes, about their usage of the Tokyo family (Tokyo Cabinet and Tokyo Tyrant). While I don’t have enough details about the Tokyo market, I’d be ready to speculate that Netvibes is probably one of the biggest users of the Tokyo product family.
To give you a quick overview of the Netvibes system, here are some interesting points in random order:
- Netvibes uses Tokyo Tyrant, never Tokyo Cabinet directly
- Netvibes architecture is a master-slave architecture (due to weird things in master-master)
- Netvibes is using its own sharding method
- Netvibes makes use of Tokyo Cabinet hash, btree and table storage
- only feed-related information is in Tokyo databases (feeds, items, read/unread, …)
- other information is still in a MySQL database (accounts, tabs, pages, widgets, …)
- to schedule crawling events, a queue has been implemented with a Tokyo Tyrant server and lua
- Netvibes is using a custom transparent proxy (ruby + eventmachine) to move/migrate data between servers
And now the Q&A part:
nosql: It sounds like initially all data lived inside MySQL. What made you look to alternative storage solutions?
Florent: Exactly. We started looking at an alternative when we reached MySQL limits. It was mostly disk space fragmentation issues (with blobs) and raw speed for insert.
nosql: How did you choose Tokyo Cabinet and Tokyo Tyrant?
Florent: We did some research, but 1.5 years ago, there were fewer solutions than now.
So we did some benchmarks, based on our own data (very important) and our architecture. We tried: Hadoop, CouchDB, Tokyo Tyrant, file system only (just to have a raw comparison with what is IMHO one of the simplest ways to store data) and MySQL.
In terms of budget, responsiveness and knowledge gap, Tokyo was the winner.
nosql: What data has been moved to Tokyo?
Florent: We are using Tokyo for our feeds backend. Everything related to feeds, such as feed items, enclosures and read/unread flags, is stored in Tokyo. Same goes for the data structures we need to crawl all these feeds, such as a queue.
nosql: What criteria have you used to make this separation?
Florent: The separation was not clearly related to Tokyo, it was a product decision. We wanted to implement this feed backend as a standalone module. We only interact with it through an API.
nosql: How have you migrated existing data?
Florent: Indeed, initially feed data was in MySQL tables.
The migration was simple in terms of logic, but long and difficult to achieve. The main point was that when data unknown to the new backend was requested, a fallback query asked MySQL for it and the result was then saved in Tokyo. It sounds easy, but in reality there were many specific cases and strange issues.
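The fallback logic Florent describes can be sketched in a few lines of Python. This is an illustration of the general read-through migration pattern, not Netvibes code: `MigratingBackend`, the dict standing in for the Tokyo-backed store, and the `legacy_lookup` callable standing in for the MySQL query are all hypothetical.

```python
class MigratingBackend:
    """Read-through migration: serve from the new store, fall back to the legacy one on a miss."""

    def __init__(self, new_store, legacy_lookup):
        self.new_store = new_store          # dict-like stand-in for the Tokyo-backed store
        self.legacy_lookup = legacy_lookup  # callable stand-in for the MySQL fallback query

    def get(self, key):
        if key in self.new_store:
            return self.new_store[key]
        # Unknown to the new backend: ask the legacy database.
        value = self.legacy_lookup(key)
        if value is not None:
            # Save the result so the next read is served by the new backend.
            self.new_store[key] = value
        return value
```

With this shape, data migrates as it is read, and the legacy database is only queried once per key. The "specific cases and strange issues" mentioned above (concurrent writes, deletions, partial records) are deliberately left out of the sketch.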
nosql: You are using Tokyo hash, btree and tables. Would you mind giving some examples of what kind of data lives in each of them and how you decided which is the best option?
Florent: When you really understand each structure it’s pretty easy to pick the best choice. For example:
- When we need only raw speed, we use a hash.
- When we need complex key strategies (based on prefix), we use btree.
- When we need conditional queries, we use tables.
For example, feeds (url, title, author, …) are stored in a Table. Same goes for the feed items and enclosures.
The queue is a Hash, to keep the focus on the speed. The first implementation was based on a BTree, but we improved our algorithms to have guessable keys only and prevent key scanning. There are also some lua functions linked to hide implementation and keep the whole thing fast too.
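As an illustration of "guessable keys", here is a minimal Python sketch of a FIFO queue layered on a plain key-value mapping. Every key is computed from two counters, so nothing ever has to be found by scanning. The class name `HashQueue`, the key layout, and the dict standing in for a hash database are assumptions of mine, not details from Netvibes.

```python
class HashQueue:
    """FIFO queue on top of a key-value mapping, using predictable sequential keys."""

    def __init__(self, store=None, name="queue"):
        self.store = {} if store is None else store  # dict stands in for a hash database
        self.name = name
        self.store.setdefault(f"{name}:head", 0)     # sequence number of the next item to pop
        self.store.setdefault(f"{name}:tail", 0)     # sequence number of the next item to push

    def push(self, item):
        tail = self.store[f"{self.name}:tail"]
        # The item key is derived from the counter, never discovered by scanning.
        self.store[f"{self.name}:item:{tail}"] = item
        self.store[f"{self.name}:tail"] = tail + 1

    def pop(self):
        head = self.store[f"{self.name}:head"]
        if head == self.store[f"{self.name}:tail"]:
            return None                              # queue is empty
        item = self.store.pop(f"{self.name}:item:{head}")
        self.store[f"{self.name}:head"] = head + 1
        return item
```

The sketch is single-threaded and makes no attempt at atomicity. With several clients, the read-modify-write steps in `push` and `pop` would have to run as one unit on the server side.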
Flags (where we store read/unread data) are stored in a BTree with a lua extension because we are scanning keys a lot.
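The reason ordered keys help with prefix scans can be shown with a sorted list and a binary search. This is a generic Python sketch, not how Tokyo Cabinet is implemented; the function name `keys_with_prefix` and the `user:feed:item` key layout in the usage note are hypothetical.

```python
import bisect


def keys_with_prefix(sorted_keys, prefix):
    """Return all keys starting with prefix from a sorted list, without scanning the whole list."""
    start = bisect.bisect_left(sorted_keys, prefix)  # index of the first key >= prefix
    matches = []
    for i in range(start, len(sorted_keys)):
        if not sorted_keys[i].startswith(prefix):
            break                                    # sorted order: no later key can match
        matches.append(sorted_keys[i])
    return matches
```

For example, if flag keys looked like `user1:feed1:item2`, all flags of one user for one feed would sit next to each other and could be read with `keys_with_prefix(keys, "user1:feed1:")`. A hash offers no such ordering, so the same query would have to visit every key.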
nosql: Can you speak a bit more about the in-house sharding solution you are using?
Florent: Sure. Tokyo does not come with a sharding or dynamic partitioning implementation, so we built our own solution. It’s feed or user centric. For example, we know that the feed table will always fit on one dedicated server, whatever the number of feeds. So, for each feed we store where its items are (the id of the shard server).
For the flags, same logic, for a given user we know where his flags are. It makes it easy to add new shards, because it’s a line in a configuration file. And we have created all the scripts we need to move data from one shard to another (migration, auto-balance, …)
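A directory-based scheme like the one described could look roughly like the Python sketch below. It is an illustration under stated assumptions, not the Netvibes implementation: the class `ShardDirectory`, its method names, and the server addresses in the usage note are hypothetical, and the actual copying of data between shards is out of scope.

```python
class ShardDirectory:
    """Directory-based sharding: record, per feed, which shard holds its items."""

    def __init__(self, shards):
        self.shards = dict(shards)   # shard id -> server address, as it might appear in a config file
        self.feed_to_shard = {}      # stand-in for the single table that maps feeds to shards

    def assign(self, feed_id, shard_id):
        if shard_id not in self.shards:
            raise KeyError(f"unknown shard: {shard_id}")
        self.feed_to_shard[feed_id] = shard_id

    def server_for(self, feed_id):
        """Resolve the server that stores the items of the given feed."""
        return self.shards[self.feed_to_shard[feed_id]]

    def add_shard(self, shard_id, address):
        self.shards[shard_id] = address  # adding capacity is one more directory entry

    def move(self, feed_id, new_shard_id):
        """Repoint a feed once its data has been copied to another shard."""
        self.assign(feed_id, new_shard_id)
```

A caller would create the directory with something like `ShardDirectory({1: "tt-a:1978"})`, assign feeds to shards, and later call `add_shard(2, "tt-b:1978")` when a new server is added. Because the mapping is stored explicitly rather than derived from a hash of the key, existing feeds stay where they are until a migration script moves them.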
nosql: What lessons have you learned that you’d have liked to know before using Tokyo?
Florent: Very difficult to say as we have learned so much with this project.
Maybe the most important point would be to know how Tokyo Tyrant servers manage the load and what the best practices are to prevent common speed issues. That is what we learned the hard way.
nosql: Any numbers about Netvibes Tokyo deployment you can share with us?
Florent: About numbers, you already know that it’s sensitive information :-). I can’t say more than the numbers in my slides.
nosql: Fair enough. Thank you so much Florent!
Unfortunately both of them are just new examples of useless benchmarks:
- only 1000 keys
- the benchmark doesn’t vary the size of keys and values
- no concurrency
- no mixed reads/writes
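To show what addressing some of these points might involve, here is a small Python sketch of a workload generator that mixes reads and writes over a larger key space and varies the value size. It is only a sketch: the function names, the key format, and the dict used as the store are my own assumptions, and it does not cover timing or concurrency.

```python
import random


def make_workload(n_ops, n_keys, read_ratio, value_sizes, seed=0):
    """Yield (op, key, value) tuples mixing reads and writes over values of varying size."""
    rng = random.Random(seed)                # seeded so runs are repeatable
    for _ in range(n_ops):
        key = f"key:{rng.randrange(n_keys)}"
        if rng.random() < read_ratio:
            yield ("get", key, None)
        else:
            size = rng.choice(value_sizes)   # vary the payload size per write
            yield ("set", key, b"x" * size)


def run(store, workload):
    """Apply the workload to a dict-like store and count hits, misses and writes."""
    counts = {"hit": 0, "miss": 0, "set": 0}
    for op, key, value in workload:
        if op == "get":
            counts["hit" if key in store else "miss"] += 1
        else:
            store[key] = value
            counts["set"] += 1
    return counts
```

A real benchmark would wrap `run` in timing code, point it at an actual server instead of a dict, and start several workers with different seeds to exercise concurrency.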
Embedded below are the slides from the Tokyo Cabinet / Tyrant talk that Florent Solt (@florentsolt) presented at NoSQL Paris.
Florent seems to be working at Netvibes, and his slides briefly present what kind of Tokyo Cabinet setup is in use there and how it is used.
I also liked the Tokyo Cabinet / Tyrant strengths and weaknesses slides:
Tokyo Cabinet / Tyrant Weaknesses
- No bug tracker, no public code repository
Note: not so long ago, I posted about these concerns in the Tokyo Cabinet community
- The documentation is not good enough
Note: I might have some good news here. Stay tuned!
- Under heavy load, master-master replication can fail
- Databases can be corrupted
Note: you might find Tokyo Cabinet database recovery useful for such unwanted situations
- With big tables, queries need a lot of RAM and time
- Tables seem slow & their configuration is not so clear
Note: you can learn more about the different Tokyo Cabinet database types and their configuration options
- No live backup, the copy function locks the database
Tokyo Cabinet / Tyrant Strengths
- Easy to deploy and setup
- Easy to use
- It’s not a black box
- Good to very good performance most of the time
- Small memory footprint
- A single Tokyo Tyrant process can handle thousands of connections
- Many command line tools
- Lua extensions
I’d definitely be interested to hear much more about how Netvibes is using Tokyo Cabinet / Tyrant, so ping me if you are ready to share more with the Tokyo Cabinet community.