memcached: All content tagged as memcached in NoSQL databases and polyglot persistence
Wednesday, 3 April 2013
Memcached vs InnoDB Memcached in MySQL 5.6
Some numbers from comparing Memcached with InnoDB Memcached in MySQL 5.6:
Keep in mind that the entire data set fits into the buffer pool, so there are no reads from disk. However, there is write activity stemming from the fact that this is using InnoDB under the hood (redo logs, etc).
There is a significant impact on speed, so deciding which solution to use comes down to weighing the cost and complexity of maintaining another tool and the cost of Memcached warmup against the performance drop of using InnoDB Memcached.
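Both sides of this comparison speak the memcached text protocol, so a benchmark client can target either one by changing only the host and port. Here is a minimal sketch of how the two basic commands are framed on the wire. The helper names `build_set` and `build_get` are made up for illustration and are not part of any client library:

```python
def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Frame a memcached text-protocol 'set' command.

    Wire format: set <key> <flags> <exptime> <bytes>\r\n<data>\r\n
    """
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key: str) -> bytes:
    """Frame a memcached text-protocol 'get' command for a single key."""
    return f"get {key}\r\n".encode("ascii")


# Hypothetical key and value, only to show the framing.
set_command = build_set("greeting", b"hello")
get_command = build_get("greeting")
```

Sending these bytes over a TCP socket to either server is enough for a crude comparison, though a real benchmark would use an existing client and concurrent connections.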
Original title and link: Memcached vs InnoDB Memcached in MySQL 5.6 (©myNoSQL)
Wednesday, 6 March 2013
A Key-Value Cache for Flash Storage: Facebook's McDipper and What Preceded It
A post on Facebook Engineering’s blog:
The outgrowth of this was McDipper, a highly performant flash-based cache server that is Memcache protocol compatible. The main design goals of McDipper are to make efficient use of flash storage (i.e. to deliver performance as close to that of the underlying device as possible) and to be a drop-in replacement for Memcached. McDipper has been in active use in production at Facebook for nearly a year.
I know at least 3 companies that have attacked this problem with different approaches and different results:
- Couchbase (ex-Membase, ex-NorthScale) started as a persistent clustered Memcached implementation. It was not optimized for Flash storage though. Today’s Couchbase product is still based on the memcache protocol, but it is adding new features inspired by CouchDB.
- RethinkDB, a YC company and the company that I work for, built and released in 2011 a Memcache-compatible storage engine optimized for SSDs. Since then, RethinkDB has built and released an enhanced product, a distributed JSON store with advanced data manipulation support.
- Aerospike (ex-Citrusleaf) sells a storage engine for flash drives. Its API is not Memcache compatible though.
People interested in this market segment have something to learn from this.
Wednesday, 9 January 2013
Migrating Live Memcached Servers
Ian Winter describes the possible approaches to migrate their memcached server farm:
- Big Bang. Update all the configuration files and simply push that code live. This has a huge hit on performance as all the new instances are cold, it is however the quickest option.
- Migration instance by instance. Each instance would be moved and relevant configuration files pushed live. This way only 1 or X instances is “cold”. This limits impact, but, also takes time as only 1 instance can be done at any one time.
- Write to both. The application code could be altered to make all writes to both the existing set of instances and the new. This isn’t great as we’d need a full QA cycle to validate the code works. It’s also something we’d have to implement then pull back out down the line.
- Mirror traffic. Similar to the above, but, this time lower down the stack. Essentially duplicate all the TCP level traffic so it warms in parallel and keeps both instances in sync meaning new writes, deletes occur on both existing and new sets.
Can you tell which approach they used?
PS: For the first two approaches I’d use different names: Stop the World and Rolling upgrade respectively.
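The "write to both" option from the list above can be sketched as a thin wrapper around two cache pools. This is only an illustration under simplifying assumptions: plain dicts stand in for the old and new memcached instances, and the class name `DualWriteCache` is hypothetical, not something from Ian Winter's post.

```python
class DualWriteCache:
    """Send every write and delete to both pools; read from the old one only."""

    def __init__(self, old_pool, new_pool):
        self.old_pool = old_pool  # existing, warm instances
        self.new_pool = new_pool  # new, initially cold instances

    def set(self, key, value):
        self.old_pool[key] = value
        self.new_pool[key] = value

    def delete(self, key):
        self.old_pool.pop(key, None)
        self.new_pool.pop(key, None)

    def get(self, key):
        # Reads stay on the old pool until the new one has warmed up.
        return self.old_pool.get(key)


old_pool, new_pool = {"a": 1}, {}
cache = DualWriteCache(old_pool, new_pool)
cache.set("b", 2)
cache.delete("a")
```

Keys that were only ever written before the switch (like `"a"` here) never reach the new pool, which is why this approach warms the new instances gradually rather than copying them.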
Monday, 1 October 2012
Using Riak as Cache Layer
Sean Cribbs explains how to use Riak as a caching solution:
- Bitcask or Memory backends
- The possibility of configuring the cluster for lower guarantees of per-key availability
Then benchmark the system for your scenario.
via: http://lists.basho.com/pipermail/riak-users_lists.basho.com/2012-October/009680.html
Thursday, 27 September 2012
Redis and Memcached Benchmark on Amazon Cloud
Garantia Data, providers of Redis and Memcached as-a-Service-in-the-Amazon-Cloud, published the results of a throughput and latency benchmark for different AWS deployment models:
The first thing we looked at when putting together our benchmark was the various architectural alternatives we wanted to compare. Users typically choose the most economical AWS instance based on the initial size estimate of their dataset, however, it’s crucial to also keep in mind that other AWS users might share the same physical server that runs your data (as nicely explained by Adrian Cockcroft here). This is especially true if you have a small-to-medium dataset, because instances between m1.small and m1.large are much more likely to be shared on a physical server than large instances like m2.2xlarge and m2.4xlarge, which typically run on a dedicated physical server. Your “neighbours” may become “noisy” once they start consuming excess I/O and CPU resources from your physical server. In addition, small-to-medium instances are by nature weaker in processing power than large instances.
Only two comments:
- it’s not clear if there were multiple instances of Redis used per machine when the chosen instances had multi-cores
- I would have really liked to also have a pricing comparison in the conclusion section
via: https://garantiadata.com/blog/its-true-even-modest-datasets-can-enjoy-the-speediest-performance
Sunday, 5 August 2012
A Quick Test of the New MySQL Memcached Plugin With (J)Ruby
Gabor Vitez:
With a new post hitting Hacker News again on MySQL’s memcached plugin, I really wanted to do a quick-and-dirty benchmark on it, just to see what good it is – does this interface offer any extra speed when compared to SQL+ActiveRecord? Does it have its place in the software stack? How much work is needed to get this combination off the ground?
When running into micro-benchmarks, I really try my best to figure out if they hide any value. But it’s hard to find any in one that uses different libraries, runs both the benchmark and the server on the same machine and uses no concurrency.
via: http://elevat.eu/blog/2012/08/a-quick-test-of-the-new-mysql-memcached-plugin-with-ruby/
Monday, 23 July 2012
New JRuby Memcached Gem
Richard Huang writes about a new memcached gem for JRuby, built on top of spymemcached and compatible with Evan Weaver’s memcached gem:
Quickly I replaced xmemcached to spymemcached, and memcached get time decreased to 40+ ms and it only generates 2 threads, awesome. And its hash and distribution algorithms are 100% compatible to libmemcached 0.32.
I just went through a similar experience trying to find a memcached library that works with both Ruby MRI and JRuby. I ended up not finding one.
via: http://huangzhimin.com/2012/07/24/jruby-memcached-0-1-0-released/
Thursday, 24 May 2012
Pinterest Architecture Numbers
Todd Hoff caught some new numbers about Pinterest’s architecture. The ones that are interesting from the data point of view:
- 125 EC2 memcached instances, of which 90 are for production and 35 for internal usage:
Another 90 EC2 instances are dedicated towards caching, through memcache. “This allows us to keep a lot of data in memory that is accessed very often, so we can keep load off of our database system,” Park said. Another 35 instances are used for internal purposes.
- 70 master MySQL databases on EC2
- sharded at 50% capacity
- backup databases in different regions
Behind the application, Pinterest runs about 70 master databases on EC2, as well as another set of backup databases located in different regions around the world for redundancy.
In order to serve its users in a timely fashion, Pinterest sharded its database tables across multiple servers. When a database server gets more than 50% filled, Pinterest engineers move half its contents to another server, a process called sharding. Last November, the company had eight master-slave database pairs. Now it has 64 pairs of databases. “The sharded architecture has let us grow and get the I/O capacity we need,” Park said.
- 80 million objects (410TB) stored in S3
- no details about Redis
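The split rule described in the quote (move half of a server's contents elsewhere once it is more than 50% filled) can be sketched as follows. Everything here is an assumption made for illustration: shards are plain dicts, capacity is counted in rows, and `split_if_over_half` is a made-up helper, not Pinterest's actual tooling.

```python
def split_if_over_half(shards, index, capacity):
    """If shards[index] is more than 50% full, move half of its rows to a new shard.

    Returns True when a split happened. 'capacity' is the maximum number of
    rows a single shard is assumed to hold.
    """
    shard = shards[index]
    if len(shard) * 2 <= capacity:
        return False  # at or below 50%, nothing to do
    keys = sorted(shard)
    upper_half = keys[len(keys) // 2:]
    # Move the upper half of the key range to a brand new shard.
    shards.append({key: shard.pop(key) for key in upper_half})
    return True


shards = [{i: f"row{i}" for i in range(6)}]  # 6 rows, capacity 10 -> 60% full
did_split = split_if_over_half(shards, 0, capacity=10)
did_split_again = split_if_over_half(shards, 0, capacity=10)
```

A real deployment also needs a lookup that tells the application which server now holds each key range. The sketch leaves that part out.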
Tuesday, 8 May 2012
Algorithm for Automatic Cache Invalidation
Jakub Łopuszański describes in much detail and with examples an algorithm for cache invalidation:
Imagine a bipartite graph which on the left hand side has one vertex per each possible subspace of a write query, and on the right side has vertices corresponding to subspaces of read queries. Actually both sets are equal, but we will focus on edges.
Edge goes from left to right, if a query on the left side affects results of a query on the right side. As said before, both sets are infinite, but that’s not the problem. There are infinitely many edges, but it’s also not bad. What’s bad is that there are nodes on the left side with the infinite degree, which means, we need to invalidate infinitely many queries. What the above tricky algorithm does, is adding a third layer to the graph, in the middle between the two, such that the transitive closure of the resulting graph is still the same (in other words: you can still get by using two edges anywhere you could by one edge in the original graph), yet each node on the left, and each node on the right, have finite (actually constant) degree. This middle layer corresponds to the artificial subspaces with “?” marks, and serves as a connecting hub for all the mess. Now, when a query on the left executes, it needs to inform only its (small number of) neighbours about the change, moving the burden of reading this information to the right. That is, a query on the right side needs to check if there is a message in the “inbox” in the middle layer. So you can think about it as a cooperation where the left query makes one step forward, and the right query does a one step back, to meet at the central place, and pass the important information about the invalidation of cache.
I’m still in front of a piece of paper trying to understand how it works.
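To make the "middle layer" idea a bit more concrete, here is a heavily simplified sketch. It is not the algorithm from the post: it only handles writes that touch a single fully specified row, whereas the original also covers writes over whole subspaces. The class name, the use of generation counters, and the in-process dicts are all assumptions. A write bumps a counter for each of the 2^k wildcard subspaces that contain the row (its constant number of neighbours), and a read puts the current counter of its own subspace into the cache key, so a bumped counter makes the old entry unreachable.

```python
from itertools import product

WILDCARD = "?"


class GenerationCache:
    def __init__(self):
        self.generations = {}  # middle layer: subspace -> generation counter
        self.results = {}      # cached results keyed by (subspace, generation)

    def write(self, row):
        """Record a write to one fully specified row, e.g. ("alice", "photo")."""
        # Each column is either kept or replaced by "?", giving 2^k subspaces.
        options = [(value, WILDCARD) for value in row]
        for subspace in product(*options):
            self.generations[subspace] = self.generations.get(subspace, 0) + 1

    def read(self, subspace, compute):
        """Return the cached result for a subspace such as ("alice", "?")."""
        key = (subspace, self.generations.get(subspace, 0))
        if key not in self.results:
            self.results[key] = compute()  # miss: run the real query
        return self.results[key]


query_runs = []

def run_query():
    query_runs.append(1)
    return len(query_runs)

cache = GenerationCache()
first = cache.read(("alice", WILDCARD), run_query)
cache.write(("bob", "photo"))            # unrelated row, no invalidation
second = cache.read(("alice", WILDCARD), run_query)
cache.write(("alice", "photo"))          # bumps ("alice", "?") among others
third = cache.read(("alice", WILDCARD), run_query)
```

Stale entries are never deleted in this sketch. With memcached as the backing store they would simply age out through normal eviction.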
via: https://groups.google.com/d/topic/memcached/OiScvRbGaU8/discussion
Friday, 4 May 2012
Yahoo Patent Letter to Facebook Referring to Memcached and Other Open Source Technologies
Sarah Lacy:
The technologies in question include things like memcached which was created in 2003 by LiveJournal and has been used longer than Facebook has been alive.[…]
Other examples include Open Compute, an open hardware project started by Facebook that focuses on low-cost, energy efficient server and data center hardware; Tornado a Python-based web server used for building real-time Web services; and HPHP, a source code transformer that turns PHP into C++.
I have no other details about this patent letter Yahoo sent Facebook, but I seriously doubt it targets these technologies separately. Most probably it refers to some sort of combination of these, one that Facebook has mentioned as part of its IP.
Wednesday, 21 March 2012
In-Memory Key-Value Store in C, Go and Python
Graham King:
On paternity leave for my second child, I found myself writing an in-memory hashmap (a poor-man’s memcached), in Go, Python and C. I was wondering how hard it would be to replace memcached, if we wanted to do something unusual with our key-value store. I also wanted to compare the languages, and, well, I get bored easily!
Actually it’s very easy and doesn’t require any coding at all. Plus you’ll get a bit more than what you’d expect.
via: http://www.darkcoding.net/software/in-memory-key-value-store-in-c-go-and-python/
Tuesday, 13 March 2012
The Benefits of the MySQL Memcached Plugin
Mario Beck:
But the memcached plugin to MySQL is a replacement or addition to the SQL interface to MySQL. So instead of using SQL queries in your application to persist or retrieve data from MySQL you can use the memcached interface. And what are the benefits?
- Much higher performance (nb: reduced latency, higher throughput)
- Easier scalability via sharding
- Simpler application coding
Plus Baron Schwartz’s cogent addition:
I think a huge benefit you’re discussing, but not naming separately, is consistency. With memcached, you have two copies of the data, and the man with two watches never knows what time it is. With a memcached interface to MySQL, you have only one copy, and it is consistent. This is a huge win.
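The "two watches" problem is easy to reproduce in a few lines. In this toy sketch two dicts stand in for MySQL and a separate memcached tier, and `read_through` is a hypothetical helper. A write that updates only the database leaves the cached copy stale until it is invalidated or expires.

```python
database = {"user:1": "alice"}  # stand-in for MySQL, the source of truth
cache = {}                      # stand-in for a separate memcached tier


def read_through(key):
    """Read via the cache, loading from the database on a miss."""
    if key not in cache:
        cache[key] = database[key]
    return cache[key]


read_through("user:1")           # warms the cache with "alice"
database["user:1"] = "bob"       # the write reaches the database only
stale = read_through("user:1")   # served from the cache copy
fresh = database["user:1"]       # what a single-copy setup would return
```

With a single copy behind the memcached interface there is no second dict to drift out of date, which is the point of the quote.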
via: http://mablomy.blogspot.com/2012/03/why-should-i-consider-memcached-plugin.html