Since the release of Kyoto Cabinet 1.0, which can be considered the successor of Tokyo Cabinet, I haven’t heard much from the Tokyo Cabinet/Tyrant world (except some political news and some furniture-related announcements on craigslist, but those are not really of interest to the NoSQL community).
Some time ago I had a chance to talk with Florent Solt (@florentsolt), Chief Architect at ☞ Netvibes, about their usage of the Tokyo family (Tokyo Cabinet and Tokyo Tyrant). While I don’t have detailed figures about the Tokyo user base, I’d be ready to speculate that Netvibes is probably one of the biggest users of the Tokyo product family.
To give you a quick overview of the Netvibes system, here are some interesting points in random order:
- Netvibes uses Tokyo Tyrant, never Tokyo Cabinet directly
- Netvibes runs a master-slave architecture (due to oddities in master-master replication)
- Netvibes is using its own sharding method
- Netvibes makes use of Tokyo Cabinet hash, btree and table storages
- only feed-related information lives in Tokyo databases (feeds, items, read/unread flags, …)
- other information is still in a MySQL database (accounts, tabs, pages, widgets, …)
- to schedule crawling events, a queue has been implemented with a Tokyo Tyrant server and Lua
- Netvibes uses a custom transparent proxy (Ruby + EventMachine) to move/migrate data between servers
And now the Q&A part:
nosql: It sounds like initially all data lived inside MySQL. What made you look to alternative storage solutions?
Florent: Exactly. We started looking at alternatives when we reached MySQL’s limits. It was mostly disk-space fragmentation issues (with blobs) and raw insert speed.
nosql: How did you choose Tokyo Cabinet and Tokyo Tyrant?
Florent: We did some research, but 1.5 years ago there were fewer solutions than there are now.
So we ran some benchmarks based on our own data (very important) and our architecture. We tried: Hadoop, CouchDB, Tokyo Tyrant, the file system alone (just to have a raw comparison with, IMHO, one of the simplest ways to store data) and MySQL.
In terms of budget, responsiveness and knowledge gap, Tokyo was the winner.
nosql: What data has been moved to Tokyo?
Florent: We are using Tokyo for our feeds backend. Everything related to feeds, such as feed items, enclosures and read/unread flags, is stored in Tokyo. The same goes for the data structures we need to crawl all these feeds, such as a queue.
nosql: What criteria have you used to make this separation?
Florent: The separation was not really related to Tokyo; it was a product decision. We wanted to implement the feed backend as a standalone module. We only interact with it through an API.
nosql: How have you migrated existing data?
Florent: Indeed, initially the feed data was in MySQL tables.
The migration was simple in terms of logic, but long and difficult to carry out. The main idea was that when the new backend was asked for data it didn’t know about, a fallback query fetched it from MySQL and then saved everything in Tokyo. It sounds easy, but in reality there were many special cases and strange issues.
nosql: You are using Tokyo hash, btree and tables. Would you mind giving some examples of what kind of data lives in each of them, and how you decided that was the best option?
Florent: When you really understand each structure, it’s pretty easy to pick the best choice. For example:
- When we need only raw speed, we use a hash.
- When we need complex key strategies (based on prefix), we use btree.
- When we need conditional queries, we use tables.
For example, feeds (url, title, author, …) are stored in a Table. Same goes for the feed items and enclosures.
The queue is a Hash, to keep the focus on speed. The first implementation was based on a BTree, but we improved our algorithms so that all keys are guessable, which avoids key scanning. There are also some Lua functions attached to hide the implementation and keep the whole thing fast.
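The “guessable keys” trick Florent mentions can be sketched generically: keep head and tail counters in the store itself, so every item key is fully predictable and no key scan is ever needed. Below is a minimal Python illustration over an in-memory dict standing in for a Tokyo Tyrant hash database; it is an assumed sketch of the technique, not Netvibes’ actual code.

```python
# A queue over a plain key-value hash, using only "guessable" keys:
# two counters ("head" and "tail") make every item key predictable,
# so push and pop never need to scan or iterate keys.

class KVQueue:
    def __init__(self):
        self.kv = {}          # stands in for a Tokyo Tyrant hash database
        self.kv["head"] = 0   # index of the next item to pop
        self.kv["tail"] = 0   # index where the next item is pushed

    def push(self, value):
        tail = self.kv["tail"]
        self.kv[f"item:{tail}"] = value   # key is fully predictable
        self.kv["tail"] = tail + 1

    def pop(self):
        head = self.kv["head"]
        if head == self.kv["tail"]:
            return None                   # queue is empty
        value = self.kv.pop(f"item:{head}")
        self.kv["head"] = head + 1
        return value

q = KVQueue()
q.push("http://example.com/feed.xml")
q.push("http://example.org/rss")
print(q.pop())  # -> http://example.com/feed.xml
```

In the real deployment the counter updates would live in Lua functions running server-side, so that push/pop stay atomic over the network.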
Flags (where we store read/unread data) are stored in a BTree with a Lua extension, because we scan keys a lot.
nosql: Can you speak a bit more about the in-house sharding solution you are using?
Florent: Sure. Tokyo does not come with a sharding or dynamic partitioning implementation, so we built our own solution. It’s feed- or user-centric. For example, we know that the feed table will always fit on one dedicated server, whatever the number of feeds. So, for each feed we store where (the id of the shard server) its items are.
For the flags, the same logic applies: for a given user we know where his flags are. This makes it easy to add new shards, because it’s just a line in a configuration file. And we have created all the scripts we need to move data from one shard to another (migration, auto-balancing, …).
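This style of directory-based sharding can be sketched in a few lines: a central lookup table records which shard owns each feed’s items, and adding a shard is just a new configuration entry. The server names, placement policy and helper functions below are hypothetical, chosen only to illustrate the idea.

```python
# Directory-based sharding sketch: a lookup table maps each feed id to
# the shard that owns its items, so queries go straight to one server
# and new shards are added by extending the configuration.

SHARDS = {                       # hypothetical config: shard id -> server
    "shard-1": "tt1.example:1978",
    "shard-2": "tt2.example:1978",
}

feed_location = {}               # feed id -> shard id (stored centrally)

def assign_feed(feed_id):
    # naive placement: spread new feeds round-robin over the known shards
    shard_ids = sorted(SHARDS)
    shard = shard_ids[len(feed_location) % len(shard_ids)]
    feed_location[feed_id] = shard
    return shard

def server_for(feed_id):
    # look up the owning shard, assigning one on first sight
    shard = feed_location.get(feed_id) or assign_feed(feed_id)
    return SHARDS[shard]
```

Because the mapping is explicit rather than hash-derived, migrating a feed to another shard only means moving its data and updating one lookup entry.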
nosql: What lessons have you learned that you’d have liked to know before using Tokyo?
Florent: Very difficult to say, as we have learned so much with this project.
Maybe the most important point would be knowing in advance how Tokyo Tyrant servers manage load and what the best practices are to prevent common speed issues; that is what we learned the hard way.
nosql: Any numbers about Netvibes Tokyo deployment you can share with us?
Florent: About numbers, you already know that’s sensitive information :-). I can’t say more than the numbers in my slides.
nosql: Fair enough. Thank you so much Florent!
A couple of weeks ago, I had the pleasure to sit down with Mathias Meyer, Chief Visionary at Scalarium, a Berlin startup, and discuss NoSQL adoption. Like myself, Mathias is really excited about NoSQL and he uses every opportunity to introduce more people to the NoSQL space. Recently he has given quite a few presentations around Europe about NoSQL databases.
The discussion focused on how someone would start learning and using NoSQL databases and what path to follow in this new ecosystem. Below is a transcript of our conversation.
Alex: How does one get started with NoSQL?
Mathias: Well, that’s a question I get quite a lot, but it is not that easy to answer. For me, I just pick one tool and start playing with it. If I see a use case for it, I add it to my toolbox. If not, I’ve still broadened my personal horizon. That alone is always a win in my book.
From a business perspective, you are probably going to find some use cases where storing your data in a relational database doesn’t make too much sense, and you’ll start looking for ways to get it out of the database. For example, think about storing log data, collecting historical data, or page impressions.
Alex: So, as a developer you should just give yourself a chance to play with the new shiny toys. As a business, a NoSQL database can be a viable solution for scenarios where you discover that your data doesn’t really fit the relational model.
Mathias: Indeed. You have stuff in your database and it is too much for your database, or it puts too much load on your database, and you’re looking for ways to get that out of your database. Load is a relative term, but consider data like logging or statistical data that grows somewhat exponentially. Relational databases are not a great fit to keep track of that kind of data, as it gets harder and harder to maintain or clean up as it grows.
As a developer playing with new tools and different ways of solving problems makes sense all by itself, simply because it adds to your toolbox, and it broadens your personal and professional horizon. That’s basically how I got into NoSQL. I stumbled upon tools, which in turn use databases that are more optimized to store data for their use case. It’s just fun playing with them, and new tools with different approaches of storing data always managed to make me curious. And who can resist a database that allows you to connect through telnet? I think that appeals to any geek I know.
Alex: There are quite a few NoSQL databases out there. Do you have any favorites or recommendations?
Mathias: If there’s any bunch of tools I’d recommend for anyone to start playing with, it’d probably be MongoDB, CouchDB or Redis. They are excellent candidates to take data off your main database, and they happily live alongside it.
If you just want to play with a NoSQL database and you’re coming from a relational background, your easiest bet would probably be MongoDB, as it’s a good mix of what you’re used to from relational databases and the best of schemaless storage. Redis makes sense to look at because it’s a good candidate for taking certain types of data out of your main database. Statistics, message queues and historical data are just some examples.
When you work with something like MongoDB and CouchDB you’ll get a good idea of what NoSQL is about, as MongoDB is halfway between a relational and a NoSQL database, while CouchDB is basically a totally different way of thinking all the way through. If all you’re looking for is scale, have a look at Riak or Cassandra. They follow pretty interesting models of scaling.
Alex: These NoSQL databases are proposing some new non-relational data models. Do you like one model more than the others?
Mathias: I’d say my favorite is the document database, as it is pretty much the most versatile of them all. You can put any data in a document database, and it leaves you all the freedom to model that data and some of the relationships between documents. It leaves all that up to you, and it is very flexible in how you can do that.
Personally I like looking into different solutions and maybe even combining them. That’s exactly what I do in practice: I usually have something like CouchDB as my main database and something like Redis as a really nice and handy small store on the side, where I put data that’s not suited for CouchDB.
Alex: Is there something that you should be aware of before trying any of these NoSQL projects?
Mathias: It depends on whether you are doing it for your business or for yourself, or if you are using it on greenfield projects, because that’s usually a lot easier. The thing I always like to tell people is that they need to look at what they think their data is going to be shaped like. Obviously you won’t know that right from the start, but you’ll still have an idea of how loose your data will be, and whether you need something like typed relationships, transactions, and so on.
You can’t really give a universal answer here. In the end you’ll have to get an idea of what your data will look like and how you’re going to read or write it. If a NoSQL database seemingly is a good fit for it, go for it. It’s just important to be aware of both the benefits and the potential downsides, but that should be common sense for any tool you pick for a particular use case.
Alex: Well, I’d say that based on my experience with relational databases there are at least 3 things I’ve really gotten used to: the relational model, the query model and transactions. So someone looking at NoSQL databases should be aware that all 3 of these concepts will take a different form.
Mathias: Yes, absolutely. You need to be aware that you’ll meet a different data model, which brings great power and flexibility. You’ll find that most of the tools in the NoSQL landscape have removed any kind of transactional means for the benefit of simplicity, making it a lot easier to scale up. We often don’t realize that transactions are not always needed. That is not to say they’re totally unnecessary; it’s merely that oftentimes their lack is not really a problem.
As for querying, for the most part you’re saying goodbye to ad-hoc queries. Most NoSQL databases have removed the means to run any kind of dynamic query on your data, MongoDB being the noteworthy exception here. Data is usually pre-aggregated, e.g. using Map/Reduce, or access is simply done by key. Is that a problem? Only you can make that decision, based on your requirements and features.
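The pre-aggregation style Mathias describes can be illustrated with a toy Map/Reduce in Python (the documents and field names are made up): a map step emits key/value pairs per document, a reduce step folds them, and the resulting view is then read by key instead of by ad-hoc query.

```python
# Toy Map/Reduce pre-aggregation: build a per-path pageview count once,
# then serve reads from the aggregated view by key.

from collections import defaultdict

docs = [
    {"type": "pageview", "path": "/home"},
    {"type": "pageview", "path": "/about"},
    {"type": "pageview", "path": "/home"},
]

def map_fn(doc):
    # emit (key, value) pairs for documents we care about
    if doc["type"] == "pageview":
        yield doc["path"], 1

def reduce_fn(values):
    # fold the emitted values for one key into a single result
    return sum(values)

grouped = defaultdict(list)
for doc in docs:
    for key, value in map_fn(doc):
        grouped[key].append(value)

view = {key: reduce_fn(vals) for key, vals in grouped.items()}
print(view["/home"])  # -> 2
```

In CouchDB the map and reduce functions would be stored as a design document and the view kept incrementally up to date; the access pattern, however, is the same: look up a precomputed key, don’t run a dynamic query.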
Either way, it does take a while to get used to these things, no doubt.
Alex: Once you start using NoSQL databases, will you have to get rid of RDBMS?
Mathias: No. If someone comes to me asking if they should switch to a NoSQL database without having a specific problem, my answer is always no. You should look for alternative solutions only when you need to solve a real problem: usually your current database is not able to keep up with all the types of data you throw at it, or you’re storing lots of data and it’s kind of a pain to get it out again (both in terms of querying and of simply removing stale data from large tables), or your data has simply reached a point where migrating your schema is too costly.
As Jan Lehnardt said, NoSQL is more about choice: you pick the tool that is right for the job, and if that tool is an RDBMS then you don’t need to look for a NoSQL database until you have a specific problem. While the new tools are shiny and tempting to throw at every problem, there’s always a learning curve involved, both in development and in operations. It makes more sense to start off slow and see how it goes by moving small parts at a time to a secondary database.
Alex: Thanks Mathias!
Last week, Terrastore saw a new release (0.5.0), and the new version brings quite a few new features:
- Multi-cluster (aka Ensemble) support: Terrastore can now be configured to join multiple clusters together and have them working as a single cluster ensemble, with data automatically partitioned and accessed among clusters and related nodes. This will greatly enhance scalability, because each cluster provides its own active master.
- Autowired conditions, functions, predicates and event listeners: users are now able to write their own conditions, functions and so on, annotate them, put them on the Terrastore classpath, and have them automatically detected and used without modifying the Terrastore configuration.
- Custom partitioning APIs: users can now easily write their own custom partitioning scheme, to suit their data-locality needs.
- Conditional get/put operations, very useful for implementing atomic CAS, optimistic locking and so on.
- Basic JMX monitoring of cluster status.
- Brand new Java client with fluent interfaces.
I had a chance to talk to Terrastore’s lead developer Sergio Bossa (@sbtourist) about some of these interesting features and here is the result.
NoSQL: Could you please expand a bit on the multi-cluster support? An example of how that works (is data relocated, is there data deduplication happening, etc.) would be really useful.
Sergio Bossa: Terrastore multi-cluster ensemble works by simply configuring server nodes belonging to multiple independent clusters to discover and communicate with each other, regardless of the original cluster they belong to.
This way, Terrastore masters are unaware of each other and don’t need to communicate or share data, thus achieving a kind of shared-nothing architecture between clusters and allowing independent scaling of different masters, while Terrastore servers are able to discover each other and partition data, spreading data load and computational power.
Data is partitioned following a two-level scheme: first, each document (depending on its bucket/key and the partitioning strategy used, more on this later) is uniquely assigned to a cluster, and then moved to a specific server node inside that cluster. Data is never relocated/repartitioned among different clusters; that is, each document is permanently assigned to one and only one owning cluster, and this piece of information is shared by all server nodes. As a consequence, if all server nodes of a given cluster die, documents owned by that cluster will be unavailable, and the Terrastore servers belonging to other clusters will clearly communicate to clients that some data is unavailable due to unavailable clusters.
This happens in order to preserve consistency even in case of severe failures or cluster partitions: available clusters, as well as both ends of two partitioned clusters, will continue working and just see part of their data as unavailable. In other words, this kind of setup keeps consistency and partition tolerance but sacrifices some data availability.
Please note that a whole cluster becomes unavailable only if all of its server nodes crash, which is unlikely if you have several server nodes, so the only other event that could make a cluster unavailable would be a network partition.
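The two-level scheme Sergio describes can be sketched as follows. This is an assumed simplification, not Terrastore’s source: a document is first pinned to a cluster by hashing its bucket/key, then hashed again to a live node inside that cluster; if every node of the owning cluster is gone, the lookup reports the data unavailable instead of relocating it.

```python
# Two-level partitioning sketch: document -> owning cluster -> node.
# Documents are never moved between clusters; a fully dead cluster makes
# its documents unavailable rather than reassigning them elsewhere.

import hashlib

def _bucket(value, n):
    # stable hash of a string into n slots
    digest = hashlib.md5(value.encode()).hexdigest()
    return int(digest, 16) % n

def locate(bucket, key, clusters):
    # clusters: ordered list of (cluster_name, [live_nodes])
    document = f"{bucket}/{key}"
    name, nodes = clusters[_bucket(document, len(clusters))]
    if not nodes:                        # whole owning cluster is down
        raise LookupError(f"data in cluster {name!r} is unavailable")
    return name, nodes[_bucket(document, len(nodes))]
```

Note that in this sketch the cluster assignment depends on the total number of clusters, so the cluster list must stay fixed; that matches the article’s point that documents are permanently assigned to one owning cluster.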
NoSQL: Conditions, functions and predicates sound a bit like basic components of Map/Reduce. Am I correct? How do they work?
Sergio: Terrastore doesn’t support Map/Reduce yet, but it’s on the issue tracker: the link is ☞ http://code.google.com/p/terrastore/issues/detail?id=4 and I invite the community to take part in the discussion.
Conditions are instead used for range/predicate queries and conditional get/put operations, while functions are currently used to atomically update a document’s value (things like atomic counters, generators and so on).
NoSQL: Could you explain to our readers the importance of having conditional get/put operations? Are you aware of how other NoSQL projects handle these scenarios?
Sergio: Conditional get/put operations with custom conditions are very useful for implementing a wide range of well-known techniques related to version control and conflict resolution, such as optimistic concurrency, as well as several other use cases, such as saving bandwidth by not fetching documents when certain conditions aren’t met.
As far as I know, only Riak, through HTTP headers, and Amazon SimpleDB support similar use cases.
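The optimistic-concurrency use case mentioned above can be sketched generically (this is not Terrastore’s API, just the underlying compare-and-swap idea): each stored value carries a version, and a conditional put succeeds only if the version the writer read is still current.

```python
# Optimistic locking via conditional put (compare-and-swap): a write is
# rejected if another writer bumped the version since our read.

class VersionedStore:
    def __init__(self):
        self._data = {}   # key -> (version, value)

    def get(self, key):
        # unknown keys start at version 0 with no value
        return self._data.get(key, (0, None))

    def put_if_version(self, key, expected_version, value):
        current_version, _ = self.get(key)
        if current_version != expected_version:
            return False                      # someone wrote in between
        self._data[key] = (current_version + 1, value)
        return True

store = VersionedStore()
version, _ = store.get("doc")
assert store.put_if_version("doc", version, {"n": 1})       # first write wins
assert not store.put_if_version("doc", version, {"n": 2})   # stale version rejected
```

Riak exposes the same pattern through HTTP conditional headers on vector-clock metadata, and SimpleDB through conditional puts on attribute values; the retry-on-failure loop around the write is the client’s responsibility in all cases.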
NoSQL: Afaik Cassandra provides support for custom partitioning, and a similar concept is present in Twitter’s Gizzard framework. Is there any default partitioning implementation provided? How did you come up with it?
Sergio: Well, Terrastore’s custom partitioning APIs are probably way easier to configure: it just boils down to implementing and annotating two simple interfaces.
Anyway, the default partitioning scheme is based on consistent hashing, as in many other products: this has the clear advantage of shielding users from partitioning details they may not want to be aware of, as well as of trying to balance data distribution evenly.
But some users may want full control over data partitioning. For example, provided they have two kinds of buckets, say Customers and Orders, they may want to allocate the former on a given server and the latter on another one; or they may want to allocate customers from A to M, with their related orders, on a given server, and the others on another one. This may happen for hardware reasons, maybe to allocate more data on bigger machines, or for pure data-locality reasons, maybe because they have a third-party order-processing application on a given server and they just want to keep data and processing as near as possible (which, as we all remember, is an essential principle). So it’s not a matter of creating a “better” partitioning scheme: it’s a matter of creating a partitioning scheme able to fit the users’ needs.
Please note that I was talking about allocating data to specific servers, but the same applies to clusters: Terrastore APIs allow users to have full control and decide which cluster, and which server in that cluster, a given document will be assigned to.
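Sergio’s Customers/Orders example can be sketched as a user-defined partitioning function. The function name, server names and placement rules below are hypothetical illustrations, not Terrastore’s actual interfaces: customers with names from A to M land on one server, everything else (including orders, kept near the order-processing application) on another.

```python
# Hypothetical custom partitioner: route documents by bucket and key
# prefix instead of by hash, to control data locality explicitly.

def custom_partition(bucket, key):
    if bucket == "customers":
        # split the customer space alphabetically across two servers
        return "server-1" if key[:1].upper() <= "M" else "server-2"
    if bucket == "orders":
        # keep all orders next to the order-processing application
        return "server-2"
    return "server-1"              # default placement for other buckets

print(custom_partition("customers", "alice"))   # -> server-1
print(custom_partition("customers", "zoe"))     # -> server-2
```

In Terrastore the equivalent logic would live in the two annotated partitioning interfaces Sergio mentions, returning a cluster and a node instead of a bare server name.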
NoSQL: Anything else you’d like to add?
Sergio: Well, a growing developer community is another important aspect, and something I’m striving for more and more.
There’s a guy from Sweden, Sven Johansson, working on a new Java client implementation, and Greg Luck, of Ehcache fame, will probably start contributing soon.
But we need more, so I invite everyone reading this to take the opportunity to participate: “just” subscribing to the mailing list and providing feedback would be crucial and is obviously warmly welcome.
NoSQL: Thank you Sergio!
CTO of 10gen, MongoDB creators: We are sort of similar to MySQL or PostgreSQL in terms of how you could use us
Some quotes and comments from ☞ (a quite long) interview with Eliot Horowitz, CTO of 10gen, creators of MongoDB:
I think the first question you have to ask about any database these days is, “What’s the data model?”
The only thing I’d add is: “… and how does that fit my problem?”.
That whole class of problems exists because there’s a very clunky mapping from objects to relational databases. With document databases, that mapping becomes much simpler.
I only partially agree with this. There are some scenarios where the mapping is indeed easier with document databases, but for very complex models (read: hierarchical, multi-relational) things will remain quite the same. I say “quite” because you can still take some shortcuts, but at the end of the day it will also depend on how you use that data.
I also think […] that the object databases before were actually more closely related to current graph databases than to document databases. The document database is really just taking MySQL, and instead of having a row, you have a document. So I think it’s a much simpler transition and it’s actually much closer to MySQL than a lot of people might think.
I assume the connection with graph databases is based on the following argument: the connectivity between objects can be very rich, and while all of that can be persisted, it is not accomplished in a transparent way.
If you look at our road map for this year, there’s no one big feature. I think the only big thing we’re doing right now is getting the auto-sharding to be fully production bulletproof.
I totally agree. That auto-sharding feature has been in alpha for too long.
We are sort of similar to MySQL or PostgreSQL in terms of how you could use us, and people want all the features that they’re used to in MySQL and PostgreSQL. These include things like full-text search, SNMP, and all the assorted add-ons providing special indexing.
Is this the market MongoDB is trying to reach? Is MongoDB trying to become the new MySQL? Definitely interesting.
If there was a feature that would hurt our performance, we would think long and hard about implementing it, and we are definitely more interested in making the basics work than we are in adding more features.
I really appreciate this sort of opinionated approach, even if sometimes you’ll have to tell users a bit more about the tradeoffs.