The Khan Academy app needs a solid caching solution so it can grow new features without paying for them in performance.
All the best performing web apps in the world cache out the wazoo.
Whether they’re straight serving up static HTML files, caching everything possible into application memory, or shoving terabyte after terabyte into shared memcached, the top dev teams in the world are not happy dynamically generating a bunch of content while their servers are getting slammed.
Note: I’m not talking about caching static client-side content. I’m talking about caching the results of datastore requests and other server-side work.
So where does that put us?
While we’re not caching a ton yet, KA runs on Google App Engine, which means we get a nice implementation of memcache right out of the box thanks to GOOG. Sweet. With memcache we get shared state across all of our servers with access speeds that are much faster than the datastore since everything is sitting in memory on some memcache server just waiting to be served up.
result = memcache.get(key)
if result is None:
    result = some_long_running_process()
    memcache.set(key, result)
return result
What’s the problem?
Memcache isn’t that fast. It’s faster than going to the datastore to reload a bunch of entities, sure, but it still involves a bunch of serialization, round-trips, deserialization, muck muck muck muck. Compare it to a cached entity sitting in the application server’s memory at the time of need, and it doesn’t stand a chance.
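To get a rough feel for that gap, here is a minimal sketch that only models the deserialization step, using pickle as a stand-in serializer. The video list and function names are made up for illustration, and a real memcache read would also pay for a network round-trip on top of this.

```python
import pickle
import timeit

# Hypothetical stand-in for a cached list of entities (not real Khan Academy data).
videos = [{"id": i, "title": "Video %d" % i} for i in range(5000)]

# Layer A: the object is already sitting in this process's memory.
in_process_cache = {"videos": videos}

# Layer B: the object has to be rebuilt from bytes on every read.
serialized_videos = pickle.dumps(videos)


def read_from_instance_memory():
    # A plain dict lookup: no copying, no deserialization.
    return in_process_cache["videos"]


def read_like_memcache():
    # Models only the deserialization cost of a memcache hit.
    return pickle.loads(serialized_videos)


in_memory_seconds = timeit.timeit(read_from_instance_memory, number=50)
deserialize_seconds = timeit.timeit(read_like_memcache, number=50)

print("in-memory reads:  %.6f s" % in_memory_seconds)
print("deserialize reads: %.6f s" % deserialize_seconds)
```

The exact numbers depend on the machine and the payload, so treat the output as a direction, not a benchmark.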

I do still have an original Fog Creek Performance Stopwatch, handed to each and every intern at the beginning of their career. This isn’t it.
More importantly, memcache on GAE has quotas. Specifically, even the nicer GAE plans have a 640 GB limit of data transferred from memcache to your application per day.
So…say your non-profit’s founder has spent the last few years recording thousands and thousands of educational videos, you really want to include the full list of those videos on your homepage, and the HTML-ified list clocks in around half a meg (uncompressed, which is what gets shoved into memcache). I’m rounding up here, but let’s just say. That gives us breathing room of 640 GB * (1000 MB / GB) = 640,000 MB, and 640,000 MB * (2 pageviews / 1 MB) = ~1.28MM pageviews: a little over 1.25 million pageviews before memcache’s quota is hit and we start to see terrible performance issues. While this would be a sizable traffic spike for us in a single day, it’s not unreasonable.
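The same back-of-the-envelope math as a few lines of Python, in case you want to plug in your own numbers. The function and constant names are mine, and the 640 GB and half-a-meg figures are just the ones from the paragraph above.

```python
MB_PER_GB = 1000  # same rounding as the estimate above


def pageviews_before_quota(quota_gb_per_day, payload_mb_per_pageview):
    """Pageviews served before the daily memcache transfer quota runs out."""
    quota_mb = quota_gb_per_day * MB_PER_GB
    return quota_mb / payload_mb_per_pageview


# 640 GB/day quota, ~0.5 MB pulled out of memcache per homepage view.
homepage_pageviews = pageviews_before_quota(640, 0.5)
print("%d pageviews per day" % homepage_pageviews)
```

This assumes the video list is the only thing being pulled out of memcache, so the real ceiling is lower.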
What else ya’ got?
The global scope of each App Engine Python instance is maintained across requests. This means we can cache objects and results directly in memory on each GAE instance.
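A minimal sketch of the idea, assuming nothing more than a module-level dict that outlives individual requests. The names (instance_get, instance_set, the now parameter) are hypothetical and only here for illustration; this is not how any particular library does it.

```python
import time

# Module-level state: it survives for as long as the process (instance) does,
# so every request handled by this instance sees the same dict.
_instance_cache = {}


def instance_set(key, value, ttl_seconds=None, now=None):
    """Store a value in this instance's memory, optionally with an expiry."""
    now = time.time() if now is None else now
    expires_at = None if ttl_seconds is None else now + ttl_seconds
    _instance_cache[key] = (value, expires_at)


def instance_get(key, now=None):
    """Return the cached value, or None if it is missing or has expired."""
    now = time.time() if now is None else now
    entry = _instance_cache.get(key)
    if entry is None:
        return None
    value, expires_at = entry
    if expires_at is not None and now >= expires_at:
        # Expired: drop it so the caller recomputes.
        del _instance_cache[key]
        return None
    return value
```

The now parameter exists only so expiry can be exercised without sleeping. Each instance gets its own copy of the dict, so nothing here is shared between servers.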
The cachepy framework has abstracted this nicely so it’s easy to cache objects on whatever GAE instance your request finds itself running on.
result = cachepy.get(key)
if result is None:
    result = some_long_running_process()
    cachepy.set(key, result)
return result
This is great because it directly attacks memcache’s weaknesses.
Cachepy’s problem?
Unfortunately, a few big ones:
And what computer science technique that’s been around for 50+ years is being blogged about as somehow new this time?
Layering these two caches on top of one another (cachepy is layer one, memcache is layer two) turns out to be pretty useful. I couldn’t find a great system that does this, so I added layer_cache to the Khan Academy code. It’s open source, so check it out and tell me what I’ve done wrong or where I lack Python chops (I do).
@layer_cache.cache_with_key(VIDEO_TITLE_MEMCACHE_KEY, expiration=CACHE_EXPIRATION_SECONDS)
def video_title_dicts():
    ...

@layer_cache.cache_with_key_fxn(lambda module: "module_html_%s" % module.key())
def module_html(module):
    ...

NASA had it so much harder in the 60’s. They really should’ve been blogging it out.
layer_cache gives you a nice little decorator or two to easily decorate any function you want cached. If you use the decorator that accepts an arbitrary function to generate the cache key, you’ll have access to all of the parameters passed to the original function call so you can customize the key based on the function’s input.
The function’s results will automatically be cached in both cachepy and memcache according to the key and settings you specify, and you don’t have to worry about the repetitive get/if None/set pattern shown above.
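To show the shape of the idea, here is a rough sketch of a two-layer caching decorator. It is not the actual layer_cache code: the two layers are plain dicts standing in for cachepy and memcache, there is no expiration, and layered_cache_with_key_fxn is a made-up name.

```python
import functools

# Stand-ins for the two layers so the sketch runs anywhere.
# Layer one plays the role of instance memory, layer two the shared cache.
layer_one = {}
layer_two = {}


def layered_cache_with_key_fxn(key_fxn):
    """Cache a function's result in both layers, keyed by key_fxn(*args)."""
    def decorator(target):
        @functools.wraps(target)
        def wrapper(*args, **kwargs):
            key = key_fxn(*args, **kwargs)

            # Fastest path: already in this instance's memory.
            result = layer_one.get(key)
            if result is not None:
                return result

            # Next: the shared layer. Only compute if that misses too.
            result = layer_two.get(key)
            if result is None:
                result = target(*args, **kwargs)
                layer_two[key] = result

            # Backfill layer one so the next call on this instance is cheap.
            layer_one[key] = result
            return result
        return wrapper
    return decorator


calls = []  # records how often the expensive function really runs


@layered_cache_with_key_fxn(lambda module_id: "module_html_%s" % module_id)
def module_html(module_id):
    calls.append(module_id)
    return "<div>module %s</div>" % module_id
```

Calling module_html(7) twice runs the body once. Clearing layer_one (think: a fresh instance) and calling it again still skips the body, because layer two answers. Like the get/if None/set pattern, this sketch never caches a None result.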
layer_cache benefits:
layer_cache downsides: