memcached as a replication delay buffer for MySQL

I was playing around with memcached the other day and realised that it perfectly fits into those situations where you have a write-only master database server and one or more replicated slaves.

Picture the classic use of master/slave replication where data is written to the master and read from the slave. But what happens when the delay between the write and read is shorter than the replication lag? Basically you get stale data, or none at all, depending on the application.

In steps memcached. This clever little tool can ease the situation by providing a caching mechanism that can be tuned to your expected worst case replication lag.

The flow is:

Write data to master and to memcached with timeout that covers the gap between expected request replay and expected replication delay.

Read data from memcached. On a cache miss, read the data from the database.

The key point here is making sure that memcached has the latest copy of the data. On read from database, you need to ensure that memcache is updated as well, and writes must go to both the database and memcached.

This system has limitations as memcached is a simple hash table lookup mechanism, so if you look up data by primary key in the database, all is fine. If you want to look up data by multiple keys, then you are going to have architectural issues and will either need to add a layer of code between your app and memcached or consider an alternative.

There are lots of cases where this works a charm. Consider updating a user profile. Typically you would enter the data into a form, write it to the database, then re-read the data and re-display the profile data either for confirmation or re-edit. All fine and dandy. But if you want to separate your writes out to the master and reads to the slaves, you have a real possibility of replication delay biting you on the bum. Alright, you can probably code around this, but what about if there is a step between the save and re-display. Maybe a redirect in the middle, due to capturing referrer data. Add to that a server farm where each request may be handled by a different server. Then the possibility is both real and potentially very embarrassing. Data appears not to be updated at best or a database error is shown at worst. So the use of memcached can be a real life saver in this case and make what is a hit-and-miss affair that only works when the planets are correctly aligned, to a solid application that is fault tolerant.

Take a look at memcached, it is trivial to install and code, and can have enormous performance and resilience gains for your application.

No feedback yet


Form is loading...