Layer Caching in App Engine with memcache and cachepy

The Khan Academy app needs a solid caching solution so it can grow new features without paying for them in performance.

All the best performing web apps in the world cache out the wazoo.

Whether they’re straight serving up static HTML files, caching everything possible into application memory, or shoving terabyte after terabyte into shared memcached, the top dev teams in the world are not happy dynamically generating a bunch of content while their servers are getting slammed.

Note: I’m not talking about caching static client-side content. I’m talking about caching the results of datastore requests and other server-side work.

So where does that put us?

While we’re not caching a ton yet, KA runs on Google App Engine, which means we get a nice implementation of memcache right out of the box thanks to GOOG. Sweet. With memcache we get shared state across all of our servers with access speeds that are much faster than the datastore since everything is sitting in memory on some memcache server just waiting to be served up.

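from google.appengine.api import memcache

def get_result(key):
    # Try the shared cache first; only do the slow work on a miss.
    result = memcache.get(key)
    if result is None:
        result = some_long_running_process()
        memcache.set(key, result)
    return result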


What’s the problem?

Memcache isn’t that fast. It’s faster than going to the datastore to reload a bunch of entities, sure, but it still involves a bunch of serialization, round-trips, deserialization, muck muck muck muck. Compare it to a cached entity sitting in the application server’s memory at the time of need, and it doesn’t stand a chance.
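To get a feel for the serialization piece alone, here's a rough sketch. It uses pickle as a stand-in for what happens to a Python object on its way in and out of a memcache-style store, and the video list is made-up data. There's no network hop in it at all, so treat the gap it shows as a lower bound rather than a benchmark.

```python
import pickle
import timeit

# Made-up stand-in for a cached list of video dicts.
videos = [{"title": "Video %d" % i, "id": i} for i in range(5000)]

# "In memory": the object is already sitting in a dict on this server.
in_memory = {"videos": videos}

# "Serialized": roughly what a memcache-style store holds (just bytes).
serialized = {"videos": pickle.dumps(videos)}

def read_from_memory():
    return in_memory["videos"]

def read_serialized():
    # Every read pays for deserialization, before any round-trip cost.
    return pickle.loads(serialized["videos"])

memory_seconds = timeit.timeit(read_from_memory, number=100)
pickle_seconds = timeit.timeit(read_serialized, number=100)
print("in-memory: %.6fs, unpickle: %.6fs" % (memory_seconds, pickle_seconds))
```

The exact numbers will vary by machine, which is why none are quoted here.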


I do still have an original Fog Creek Performance Stopwatch, handed to each and every intern at the beginning of their career. This isn’t it.

More importantly, memcache on GAE has quotas. Specifically, even the nicer GAE plans have a 640 GB limit of data transferred from memcache to your application per day.

So…say your non-profit’s founder has spent the last few years recording thousands and thousands of educational videos, you really want to include that full list of videos on your homepage at the moment, and the HTML-ified list of those videos clocks in around half a meg (uncompressed, which is what gets shoved into memcache). I’m rounding up here, but let’s just say. At half a meg per pageview, 640 GB * 1,000 MB/GB = 640,000 MB of quota, and 640,000 MB * 2 pageviews/MB = 1.28 million pageviews. That gives us breathing room of a little over 1.25 million pageviews before memcache’s quota is hit and we start to hit terrible performance issues. While this would be a sizable traffic spike for us in a single day, it’s not unreasonable.
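Here's the same back-of-the-envelope math as a few lines of Python, in case you want to plug in your own payload size. The constant names are mine, and the numbers are the rounded ones from the paragraph above.

```python
QUOTA_GB_PER_DAY = 640     # daily memcache transfer limit quoted above
MB_PER_GB = 1000           # rounded, same as the estimate above
PAGE_PAYLOAD_MB = 0.5      # HTML-ified video list, uncompressed

quota_mb = QUOTA_GB_PER_DAY * MB_PER_GB
pageviews_per_day = quota_mb / PAGE_PAYLOAD_MB

print(int(pageviews_per_day))  # 1280000
```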

What else ya’ got?

The global scope of each App Engine Python instance is maintained across requests. This means we can cache objects and results directly in memory on each GAE instance.

The cachepy framework has abstracted this nicely so it’s easy to cache objects on whatever GAE instance your request finds itself running on.

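import cachepy  # assumes cachepy.py is on the app's import path

def get_result(key):
    # Same pattern as memcache, but the value lives in this instance's memory.
    result = cachepy.get(key)
    if result is None:
        result = some_long_running_process()
        cachepy.set(key, result)
    return result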


This is great because it directly attacks memcache’s weaknesses.

  • Cachepy is blazing fast once the cache is primed (it doesn’t get much faster than just serving an object already stored in memory).
  • GAE can’t really put quota restrictions on cachepy, so that bugaboo is out the window.

Cachepy’s problem?

Unfortunately, a few big ones:

  • Cachepy is unreliable because you never know when a new GAE instance will be inserted or an old one removed from your application. When this happens (it will) you have to be ready to deal with an unprimed cache.
  • You can’t systematically flush the value for specific keys from all cachepy instances since they’re spread out over separate machines with no central communication.
  • Cachepy is fast because you’re filling up the GAE instance’s memory. There’s a limit to this, somewhere. We haven’t hit it yet, and we aren’t getting close (I don’t think), but if you start caching your entire datastore in cachepy I have a feeling App Engine is going to start throwing punches back.

And what computer science technique that’s been around for 50+ years is being blogged about as somehow new this time?

Layering these two caches on top of one another (cachepy is layer one, memcache is layer two) turns out to be pretty useful. I couldn’t find a great system that does this, so I added layer_cache to the Khan Academy code (open source, check it out, tell me what I’ve done wrong or where I lack python chops (I do)).

@layer_cache.cache_with_key(VIDEO_TITLE_MEMCACHE_KEY, expiration=CACHE_EXPIRATION_SECONDS)
def video_title_dicts():
    ...

@layer_cache.cache_with_key_fxn(lambda module: "module_html_%s" % module.key())
def module_html(module):
    ...


NASA had it so much harder in the ’60s. They really should’ve been blogging it out.

layer_cache gives you a nice little decorator or two to easily decorate any function you want cached. If you use the decorator that accepts an arbitrary function to generate the cache key, you’ll have access to all of the parameters passed to the original function call so you can customize the key based on the function’s input.

The function’s results will automatically be cached in both cachepy and memcache according to the key and settings you specify, and you don’t have to worry about the repetitive get/if None/set pattern shown above.
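For a sense of what a decorator like this does under the hood, here's a minimal sketch. It is not the actual layer_cache code: the decorator name is hypothetical, and plain dicts stand in for cachepy (layer one) and memcache (layer two) so it runs anywhere.

```python
import functools

# Assumption: dicts stand in for the two layers. In the real app, layer one
# would be cachepy and layer two would be memcache.
layer_one = {}   # per-instance memory
layer_two = {}   # shared cache

def layered_cache_with_key_fxn(key_fxn):
    """Hypothetical decorator: check layer one, then layer two, then compute."""
    def decorator(target):
        @functools.wraps(target)
        def wrapper(*args, **kwargs):
            # The key function sees the same arguments as the wrapped call.
            key = key_fxn(*args, **kwargs)

            result = layer_one.get(key)
            if result is not None:
                return result

            result = layer_two.get(key)
            if result is None:
                result = target(*args, **kwargs)
                layer_two[key] = result

            # Prime layer one so the next request on this instance is fast.
            layer_one[key] = result
            return result
        return wrapper
    return decorator

calls = []

@layered_cache_with_key_fxn(lambda module_id: "module_html_%s" % module_id)
def module_html(module_id):
    calls.append(module_id)
    return "<div>module %s</div>" % module_id

module_html(1)       # computed, stored in both layers
module_html(1)       # served from layer one
layer_one.clear()    # simulate landing on a fresh instance
module_html(1)       # served from layer two, which re-primes layer one
print(len(calls))    # 1
```

One thing the sketch shares with the get/if None/set pattern: a function that legitimately returns None never gets cached, so it reruns on every call.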

layer_cache benefits:

  • More reliable than cachepy. If GOOG decides to bring up a new GAE node, we’ll pull the data out of our second layer, memcache, and stuff it in cachepy for the next request.
  • No quota issues. The majority of cached requests will be handled by cachepy. Memcache just sits in the background and picks up any cache misses caused by new GAE nodes or individual server memory issues. We’ve had a less-generalized version of this layered cache solution running on our KA homepage for a few weeks now and our memcache quota usage is still extremely low.
  • Much faster than pure-memcache for all the same reasons.
  • Persistence across application versions. Memcache can keep its values even after you deploy a new version of your app, so you no longer need to blow away your entire cache after a deploy (unless desired).
  • Easy to use. In a new application that’s growing new functionality quickly, it’s important to have basic performance tools at the ready to stop performance bloaty bloat bloat. Handy cache decorators like these will continue to be useful until they inevitably become overused and cause a whole new set of problems.
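To make the quota point concrete, here's a toy simulation. Everything in it is made up for illustration: three dicts play three GAE instances, a read-counting dict plays the shared cache, and 300 requests get spread round-robin across the instances.

```python
class CountingStore(dict):
    """Dict that counts reads, standing in for a quota-metered shared cache."""
    def __init__(self):
        dict.__init__(self)
        self.reads = 0

    def get(self, key, default=None):
        self.reads += 1
        return dict.get(self, key, default)

shared = CountingStore()              # layer two, shared by all instances
instances = [{} for _ in range(3)]    # layer one, one dict per instance
stats = {"computed": 0}

def homepage_html(local):
    key = "homepage_html"
    result = local.get(key)
    if result is None:
        result = shared.get(key)
        if result is None:
            stats["computed"] += 1
            result = "<ul>...</ul>"   # stand-in for the expensive render
            shared[key] = result
        local[key] = result
    return result

for request_number in range(300):
    homepage_html(instances[request_number % 3])

print("%d shared reads, %d computes" % (shared.reads, stats["computed"]))
# 3 shared reads, 1 computes
```

Each instance touches the shared layer once to prime itself, and the other 297 requests never leave local memory. Real traffic won't be this tidy, but the shape is the same.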

layer_cache downsides:

  • You still can’t reliably flush specific values across all cachepy instances. We haven’t run into this issue yet — stricter cache key management is working fine for us.
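
One way to read "stricter cache key management" is to bake a version into the key, so a changed value gets a new key and stale entries are just never read again. The sketch below is my assumption of how that could look, not a description of what the KA code does; versioned_key and the video dict are hypothetical.

```python
instance_cache = {}   # stand-in for one instance's in-memory cache

def versioned_key(base_key, version):
    # Hypothetical helper: a new version means a new key, so old entries
    # are ignored rather than flushed.
    return "%s_v%s" % (base_key, version)

def cached_title(video):
    key = versioned_key("video_title_%s" % video["id"], video["version"])
    result = instance_cache.get(key)
    if result is None:
        result = video["title"].upper()   # stand-in for expensive work
        instance_cache[key] = result
    return result

video = {"id": 42, "version": 1, "title": "Intro to caching"}
cached_title(video)

# The title changes, and the version is bumped along with it.
video["title"] = "Intro to layer caching"
video["version"] = 2
print(cached_title(video))   # INTRO TO LAYER CACHING
```

The trade-off is that old entries hang around in memory until the instance goes away, and the current version has to be cheap to look up on every request.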

Posted 12/14/10, 7:47pm by bjk5