From Joel’s recent IAmA on reddit, w/ ellipses thrown in haphazardly:
“What kind of sacrifices do you make on stackoverflow.com in order to obtain speed/efficiency?”
“Heavy caching…lavish spending…a willingness to let developers spend time on optimization.”
The performance benefits of packaging up your bunches of javascript and css files into as few requests as possible are well known. We recently made a couple changes to the packaging system used for www.khanacademy.org that are worth mentioning, especially for anyone else on App Engine.
The whole point of a javascript/stylesheet packaging system is to be able to cleanly split code into a bunch of different files like this:
"exercises": {
    "files": [
        "ASCIIMathML.js",
        "raphael.js",
        "ASCIIsvg-wrapper.js",
        "fontdetect.js",
        "seedrandom.js",
        "metautil.js",
        "exerciseutil.js",
        "graphutil.js",
        "g.raphael-min.js",
        "g.pie-min.js",
        "g.line-min.js",
    ]
}
…while only issuing one request to grab all of these files, in their proper order, combined and minified into a single payload, like this:
<script src='/javascript/exercises/all.js'></script>
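As a rough illustration of the "combined, in their proper order" part, here is a minimal Python sketch. It is not our actual packager: `combine_package`, `read_file`, and the stand-in file contents are all made up for the example, and minification is left to whatever real minifier you prefer.

```python
def combine_package(file_names, read_file):
    """Join the package's files, in the listed order, into one payload."""
    sources = [read_file(name) for name in file_names]
    # Joining with ";\n" keeps a file that lacks a trailing semicolon
    # from running into the file that follows it.
    return ";\n".join(sources)


# Stand-in file contents, so the sketch runs without touching the disk.
fake_files = {
    "raphael.js": "var Raphael = {}",
    "g.raphael-min.js": "Raphael.g = {}",
}

payload = combine_package(
    ["raphael.js", "g.raphael-min.js"],  # order matters: the plugin needs Raphael
    fake_files.__getitem__,
)
```

The order of the list is the whole contract: anything that depends on raphael.js has to come after it in "files".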
As always, you’re immediately faced with a decision about how to version this file. You always want visitors to use a cached version of this file unless something has changed. If you leave the URL as-is above, you’ll be telling users to “just press Ctrl+R” a lot after you deploy a new version and their cached, stale javascript no longer works. A common, easy solution is to append the current application’s version to the URL via query string, like this:
<script src='/javascript/exercises/all.js?v={{ Your_App_Version }}'></script>
…and now, whenever you deploy a new version of your app, every browser that hits your site will be guaranteed to download the latest and greatest version of your javascript. Up until today, this is the technique we used for Khan Academy.
There are three problems here.
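1. Every deploy changes the version number, so every visitor re-downloads every package, even the ones whose contents have not changed.
2. Some proxies refuse to cache URLs that contain a query string, so anything from ?v=3 to ?hash=AB34 will ruin some perfectly good proxy caching opportunities.
3. Partial deploys combined with zealous CDN caching can leave a new URL serving old contents.
App Engine’s "all.js?v={{Version}}" dilemma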
The following is complete conjecture that is sure to be mocked by somebody with real knowledge of the situation.
When you deploy a new version on App Engine, the 5, 10, 100 different instances running your app don’t switch over to the new version all at once. They switch piecemeal.
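Consider the following scenario:
1. You deploy version 5 while some instances are still running version 4.
2. A browser loads a page from an instance that has already switched to v5, so the HTML references all.js?v=5.
3. The request for that script lands on an instance that is still running v4, which happily serves its v4 copy of all.js.
4. A CDN or proxy cache along the way stores that response, so all.js?v=5 will continue to serve v4 contents for quite a long time, and the browser is powerless to avoid it.
This bug started to bite us during what seemed like every other deploy after we crossed a certain traffic threshold (and therefore increased our race condition odds). The worst part is that geographically separate CDNs mean everything can seem fine on the east coast while the entire west coast is unable to use Khan Academy thanks to old javascript files.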
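Under the same conjecture, the failure is easy to reproduce in miniature. The Python sketch below is a toy model, not App Engine's behavior: `edge_cache`, `serve_script`, and `fetch` are invented names, and the only assumptions are that an instance ignores the ?v= value when serving all.js and that some cache in front of it keys on the full URL.

```python
def serve_script(instance_version):
    """What an instance returns for all.js, whatever ?v= the URL carries."""
    return "contents of all.js as of v%d" % instance_version


edge_cache = {}  # hypothetical CDN/proxy cache, keyed by the full URL


def fetch(url, instance_version):
    """Return the cached response if there is one, else ask an instance and cache it."""
    if url not in edge_cache:
        edge_cache[url] = serve_script(instance_version)
    return edge_cache[url]


# Mid-deploy: the HTML came from a v5 instance, but the script request hits a v4 one.
first = fetch("/javascript/exercises/all.js?v=5", instance_version=4)

# Later: every instance runs v5, but the cache answers before any of them can.
later = fetch("/javascript/exercises/all.js?v=5", instance_version=5)
```

Both `first` and `later` hold the v4 contents, and nothing the browser does to the ?v=5 URL will change that until the cache entry expires.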
All three of the above problems are solved by altering your javascript package’s filename with a hash of the package’s actual content. Now, our javascript references look like this:
<script src='/javascript/exercises/hashed-1424dd90e695125adeb0e14d24a113ac.js'></script>
…which means we never ask users to download scripts they already have, we take advantage of proxy caching, and we protect ourselves from bugs caused by partial deploys and zealous CDN caching.
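For anyone who wants to try this, here is a minimal Python sketch of naming a package after its contents. The function name `hashed_url` is made up, and MD5 is an assumption: the hash in the URL above is 32 hex characters, which is the length of an MD5 digest, but any stable content hash would do the same job.

```python
import hashlib


def hashed_url(package_name, payload):
    """Build a script URL whose filename depends only on the payload's contents."""
    digest = hashlib.md5(payload.encode("utf-8")).hexdigest()
    return "/javascript/%s/hashed-%s.js" % (package_name, digest)


before = hashed_url("exercises", "var a = 1;")
redeploy = hashed_url("exercises", "var a = 1;")  # new app version, same javascript
changed = hashed_url("exercises", "var a = 2;")   # the javascript itself changed
```

`before` and `redeploy` are the same URL, so a deploy that does not touch the package costs visitors nothing, while `changed` is a brand new URL that no cache has ever seen. An instance still running the old version has no file by that name to serve, so it cannot poison the cache with stale contents.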