It’s kinda cheating to call this Part II of App Engine Performance Hacks You’ll Probably Never Need, but I recently stumbled across a single tip that anybody in App Engine land will want to know about.
fetch() for datastore access, it’s slower than it used to be
If you’ve been on App Engine for a while, you’re either surprised by that tip or more informed than I was earlier today. Until somewhat recently, it’s been the case that running:
for monkey in models.Monkey.all().fetch(500):
    monkey.swing_from_vine()
…would’ve been much faster than:
for monkey in models.Monkey.all():
    monkey.swing_from_vine()
…assuming that you have at least a couple hundred Monkeys (who doesn’t?) and are willing to eat the memory required to hold all Monkeys at once (who wouldn’t be?). That’s because all() returns a query iterator which chunks up its roundtrips to the datastore into a bunch of small trips, while fetch() (it is hoped) only goes back’n’forth once.

Most people who once wrote fetch([some big number]) were trying to avoid RPC performance waterfalls like this. Now it can cause them.
So if you’re anything like us, your code is probably littered with fetch() calls all over the place wherever you felt comfortable grabbing a large number of entities and were willing to make a deal w/ the App Engine memory gods to get it done as quickly as possible.
This no longer has any performance benefit unless you use the new batch_size parameter.
fetch() and run() now accept an optional batch_size parameter, but fetch()’s is undocumented. You want something like: fetch(1000, batch_size=1000) or any reasonable batch_size for your use case. Without this parameter, fetch() simply returns list(run(...)) using run()’s default parameters, which happen to set a small batch size.
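To see why the batch size matters so much, here's a rough simulation. It is not App Engine code: FakeDatastore, get_batch, count_round_trips, and the default batch size of 20 are all made up for illustration. All it does is count how many round trips a list(run(...))-style fetch makes for a given batch_size.

```python
# Assumed stand-in for run()'s "small" default batch size; the real value may differ.
DEFAULT_BATCH_SIZE = 20


class FakeDatastore:
    """Hypothetical remote datastore that counts round trips."""

    def __init__(self, entity_count):
        self.entities = list(range(entity_count))
        self.round_trips = 0

    def get_batch(self, offset, size):
        # Each call models one RPC to the datastore.
        self.round_trips += 1
        return self.entities[offset:offset + size]


def run(store, limit, batch_size=DEFAULT_BATCH_SIZE):
    """Yield up to `limit` entities, asking for `batch_size` per round trip."""
    offset = 0
    while offset < limit:
        size = min(batch_size, limit - offset)
        batch = store.get_batch(offset, size)
        for entity in batch:
            yield entity
        offset += len(batch)
        if len(batch) < size:
            break  # the store ran out of entities


def fetch(store, limit, batch_size=DEFAULT_BATCH_SIZE):
    """Model of fetch() as described above: just list(run(...))."""
    return list(run(store, limit, batch_size=batch_size))


def count_round_trips(entity_count, limit, **kwargs):
    """Return (entities fetched, round trips made) for one simulated fetch."""
    store = FakeDatastore(entity_count)
    results = fetch(store, limit, **kwargs)
    return len(results), store.round_trips
```

In this toy model, pulling 1000 entities with the small default costs 50 round trips, while passing batch_size=1000 costs one. Real numbers will depend on the actual default and on entity size.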
However, since most people were using fetch() to get around the batch size issue and don’t actually need all entities in memory at once, I recommend following the recently updated docs and always using run() with a reasonable limit and batch_size:
# A grossly simplified couple of lines from our student reporting code:
for problem in models.Problem.all().run(limit=10000, batch_size=1000):
    problem.add_to_student_report()
…which protects you from unnecessary round trips and still gives you control over the amount of memory you’re willing to allocate at once. It feels like fetch() is going the way of the dodo in App Engine land. You can see similar hints dropped into the App Engine codebase right when their team made the ol’ switcheroo:

Critical docstring change right when the implementation switched.
If you’re working in App Engine land, you’re going to find countless examples all over the web of people advocating or just using fetch([some big number])-ish code that directly contradicts the above advice. I’m gonna go correct this Stack Overflow question as soon as I hit Publish. As of this change, all those old pieces of code suffer from the worst of both worlds: high memory requirements and small batch sizes causing lots of roundtrips.
I’ve verified the perf impact on our production server, and just as expected, keeping your roundtrips low can really pay off. While it’s possible we missed some big announcement somewhere, the fact that we just discovered this means there might be somebody else out there who can benefit.
If you have any other requests for App Engine performance-related topics to cover, I’d love to hear them.
Three of our twelve(‽) summer interns have arrived. It’s high time I share how we mentor interns before I’m spending all my time swatting sharks and helicopters out of the air. These tips also apply to new full-time hires, of course, but for now I’ve got interns on the brain.
At some point in a new employee’s first day they will hear some variation of the words, “I’m your official mentor. As we work through your first few projects together, you can interrupt me any time for any technical question, non-technical question, question about the rules of Bang!, or just because you want to order a specific keyboard. Don’t think twice.”
We don’t throw our interns into a room with 5 people and say “You’re surrounded by mentors! LEARN.” If you’re hiring right, they already know how to do that better than you. What they need is awareness of a specific mentor that will unflinchingly act as their outlet for any need, without frustration.
Good hires will find the right balance between diving into their new codebase to find their own answers and asking you for help. Some may need a little bit of guidance. We’d rather err on the side of interns asking too many questions (easily correctable) than them being scared or unsure of who to interrupt (unfortunate culture problem).

I wish this shark was remote control.
Gonna keep this short because I’ve been over this before. Code reviews form a huge part of our mentor/mentee relationship. Warning though! I’ve been bitten in the past when a class of interns wasn’t code reviewed consistently. If a few interns are getting great code reviews on every checkin and others…not so much…, then they’re going to notice and draw weird conclusions. I’ve been part of teams that have suffered from this in the past, and I publicly apologize. I’m less worried about this now that we have a simple review-all-the-code policy at Khan.
The most successful intern projects I’ve been part of began with months (no exaggeration) of preparation. At least for Fog Creek and Khan, intern summers are the most suddenly dramatic infusion of horsepower we ever have to absorb. We don’t want to waste a drop not knowing where we’re going. You have 9 months every year to stumble around not knowing what you’re doing…just get it together for the summer.

I’ve been scientifically tracking every intern class and found this to be the most significant variable.
What do you prep? I’ve seen a couple things work well:
Don’t worry about overdefining things just yet. This is about making sure people start with a tangible goal to aim for while understanding how that goal fits into your team’s deeper principles. Smart new hires will have absolutely no trouble stepping outside the bounds of your preparations, taking ownership, and persuading you away from all your stupid assumptions. That’s exactly what your culture should be encouraging.
Our mentors have weekly 1-on-1s with their interns to make sure they’re learning what they hoped to learn and are working on the types of projects they’re most passionate about. These days the top interns have tons of opportunity. It’s a battle for the best, so we make sure Khan interns get the most out of their time. If you think it should be the other way around, you’re probably already missing out on the best and it’s highly likely you never started reading this blog in the first place.
1-on-1s can be just a few minutes and may fall back to every other week when appropriate. Just enough to make sure our interns have a chance to self-direct.
“Keep learning” is on the short list of Khan Academy company values. Jamie Wong started his internship this summer and dropped this tech talk about meteorjs in the first week. We’ve had tech talks about fonts (typefaces? styles? ugh I can never remember the politically correct word, I’m so sorry Marcos), game design, a/b testing, performance, and how Khan Academy is being used in towns without internet connectivity. We’re inviting cool people like Steve Yegge to teach us about API-centric development. We want to learn more.
In fact, if you like Khan Academy and want to give us a treat, we’d love for you to come teach our team something. Email me - kamens@gmail.com — I promise a low-key atmosphere and maybe board games.
“Step three? FIXIT. Now repeat till it is FIXED.”
Wise words from Kenan, patron saint of the first Khan Academy fixit.
Fixit day is just one more in the long list of solid dev lessons I’ve been learning from the Googlers around here. Since I couldn’t find a reputable source explaining the fixit culture (who reads the NY Times?), I figured it’s my duty to share.

Who’s gonna fix it? They are. Because they broke it.

“The goal of a fixit is to address niggling concerns that bother you time and again, but never enough to actually fix them.”
That’s straight from somebody who has more experience with fixits than either of us. I prefer to view fixit day as that moment when you finally wake up in the middle of the night with enough consciousness to properly rearrange the blanket (or swipe away the breadcrumb, if you’re gross like me) that’s been just barely interrupting your sleep for the past five hours, but never enough to break you out of your dream. Man that’s a good feeling. I hate breadcrumbs in bed…but I also really like cinnamon toast.
Anyway, our first crack at the whole thing tackled two levels of fixits:
The “fixedit!” Trello column got quite full, fast. And while it’s easy to take potshots at fixit day by claiming that everybody should be fixing this stuff all the time, I was even happier with the team culture inspired by fixit day than with any individual change. Every single person on our team was having fun working on the same thing, and no fix was too small, too annoying, or too UggggghhhhWhatIsThisBrowserDoing to deserve absolutely anybody’s attention*. Our first fixit day felt very healthy. It was a little chaotic at the end due to our refactoring ambitions, but they were good ambitions that paid off. The next fixit will be even better. I look forward to it.
*Making shit work is everyone’s job, unless you’re the shit umbrella for ever-…wait…nevermind. Too much cursing. This one’s not gonna get past the censors on the official KA blog.
Today I was doing whatever it is I do when I ran across this link from Joel:
NYC teacher “effectiveness” ratings are bogus, and the data prove it garyrubinstein.teachforus.org/2012/02/28/ana…
— Joel Spolsky (@spolsky) March 1, 2012
…and my brain started pattern matching. Replace “NYC teacher” with “programmer” in this tweet, and you’d be in classic Joel on Software land. After all, one of Joel’s self-stated missions is to make programmers’ lives better (you can see his efforts played out in both the principles of Fog Creek and Stack Overflow), and he’s spent plenty of time trying to convince us of the stupidity of using automated metrics to assess programmers.
Before we go any further, here’s the gist of the post about NY’s teacher effectiveness ratings that were recently released to the public: they have major flaws. Read the post — but this chart says a lot:

Every point is a teacher that taught the same subject to two different grades in the same year. Think of a middle school teacher handling both 6th and 7th grade math. The x-axis is their effectiveness rating for one of the grades, the y-axis is the other grade.
You don’t have to stare at the graph long to see a surprising lack of correlation. If you’re effective at teaching 7th grade math, shouldn’t you be effective at 6th? Wait. Before you throw effectiveness ratings out the window, read the comment further down the page that points out the increasing usefulness of the published data as you look across multiple years of teachers’ past. Ok, that makes sense. But still, judging teachers on test score summaries alone is madness. Perhaps they’re useful feedback when given to teachers appropriately, with caveats, and all? Maybe the ratings need some tweaking?
None of this matters if the data is published to the public and uses a single, automated metric to reward or punish teachers.
There’s a good reason Bill Gates tore apart the decision to publish this data. He knows that a single metric is bound to be not only flawed, but, if used as an incentive system, also destructive to both teachers and any attempts to improve the metric. His nuanced argument for a system that combines data and highly trained teachers evaluating their peers sounds pretty similar to the belief that highly technical programmers should be the only ones managing other programmers:
But student test scores alone aren’t a sensitive enough measure to gauge effective teaching, nor are they diagnostic enough to identify areas of improvement. Teaching is multifaceted, complex work. A reliable evaluation system must incorporate other measures of effectiveness, like students’ feedback about their teachers and classroom observations by highly trained peer evaluators and principals.
— For Teachers, Shame is Not the Solution
Again, replace “teachers” with “software developers” up there and you’ll see Bill Gates is making the same crusade he made for developers — protection from the type of overly simplified management incentives that destroy your ability to focus on the tasks at hand when working in a complex, creative profession.
I was lucky enough to step into professional programming at a time when an exploding number of forward-thinking companies were starting to treat and recruit programmers effectively. I was never promoted or demoted based on the number of bugs I created or lines of code I wrote. But it’s clear that wasn’t always the way things worked, and when I was in college I remember reading Joel and Paul Graham, who both stood out as Defenders of The Programmer against destructive management.
This is why we’d never, ever use Khan Academy data to single-handedly “rank” teachers or anything else so ridiculous. Khan Academy data (and there’s a lot of it, we just passed 400 million practice problems done) is to be put in the hands of teachers, for teachers, as a powerful tool that lets them dive deep into their students’ individual levels of mastery. We aim to empower teachers with the best tools available and believe that the only people assessing them should be highly trained teachers who understand the nuances of their craft and work with them to improve. Sound familiar to you, devs?
I’ve never been a teacher, but I do know that I’ll never even consider working for a company that assesses my performance based on a single automated metric. I think I have Microsoft and Google and Joel and Paul Graham and co. to thank for the software world’s culture of respect for both data and individuals. And now it’s really cool to see Bill Gates and Joel taking a similar stand in defense of teachers.
Ten bucks says none of the teachers in The Academy for Software Engineering suffer from a single metric incentive system.
This is the story of the growing Khan Academy team converting me into a passionate fan of requiring a code review for every single changeset.
Those who have worked with me know that it’s a surprising position for me to take. On the spectrum of “Follows good development practices even if it slows down the product” to “Just ship the thing, code doesn’t matter, only users matter,” I tend to fall…right about…[furiously scribbling]…here:

Even though I’ve long been a fan of code reviews at both Fog Creek and Khan, I never would’ve suggested requiring them for every single changeset, no matter how small. At first glance it appears all P̝̂R̫͙̼̽ͪ̽̋Ö̝̹̿ͬ́̐̆̈CÈ̝̱S̮̜͙̩̠S̹͍̳̖͍̆̐Y̩̟̥̟̘̺̠ͭͫ̔, this is web development, not rocket science, and Hey what if there’s an emergency and Wait are you serious, you want me to review my single-line change to a trivial #comment???
Luckily for everyone, we’ve been hiring smarter and smarter people at Khan who can save me from myself. (What’s that theory about smart people and some lake? Lake Okeechobee, I think. With the crocodiles.) We told our team that we’re requiring code reviews for all pushes a couple weeks ago. Spoiler alert: it’s not that processy, and even a Just Ship It clown like me is already seeing immense value from the experiment.

Croc! No, sorry, wait…log!…Or, no! Wait! Sorry…Croc!

All changesets reviewed.
It’s not for every group. I’m not convinced it would’ve been right when Jason and I were hacking together alone. But requiring code reviews has already made for a better product and a healthier team, two things that I personally care about far more than a healthy codebase. That’s just a nice side-effect.
P.S. When working on Kiln, I was adamantly opposed to building code review requirements directly into the source control product. I stand by that belief, and I’m almost certain the Kiln team still agrees. Reviews should be required by your team’s dynamics and strongly encouraged by your tooling, not the other way around.
We have these emails hanging up all over our office, sent in from Khan Academy users with incredible, personal stories to tell. Every time I read a new one I’m emotionally affected, which means my robot emotion chip is faulty.
So when some curious soul (like a reporter) wanders in and asks me, “How will you know if Khan Academy is really successful?” I always answer their (totally valid) question with an explanation of our data, analytics, and fancy metrics — but what I’m really thinking is, “You haven’t read these letters.”
Let’s change that. These students’ and parents’ and teachers’ stories are now available for anyone to be inspired by. It is impossible to read them…go ahead, I challenge you…and not come away with the conclusion that a free educational resource like Khan Academy simply must exist.
Call me a softie. It’s not like I don’t believe in data as the final arbiter of any learning tool’s effectiveness. I do. But if our key data metric happened to be “# of page-long, authentic stories sent in from users who have turned their lives around in the face of drug addiction, unleashed their 2nd-grade son on the advanced math he’s fully qualified to handle, or earned acceptance into a university despite being stuck in a country that does not value education,” I don’t think I’d second guess seeing that number skyrocket.
Hopefully these stories inspire others as much as everybody on our team. We spent time designing the page to celebrate the authors, their letters, and the fact that these are real lives, not product testimonials.
Does the page accomplish this? Feedback welcome, good and bad.
I already can’t wait to drop some major challenges in the laps of our two incoming Fall interns to see what they can build.
- Khan Academy Internship, Summer ‘11
Check!
David Hu and Julian Pulgarin stepped up to the plate this Fall during their co-ops, er, internships from University of Waterloo. We call ‘em internships because we’re from Amurrrrica, you crazy Canadians.
When hanging with friends and family recently, I found myself shocked by how willing my brain is to completely forget stories that were surely once described as unforgettable. I don’t have any desire to forget what we’re doing at Khan Academy, and that’s kinda sorta why being as open as possible is one of our core dev principles.
The Summer ‘11 story is already a nice piece of shared history that helps me answer every intern candidate who asks, “What kind of project do interns work on at Khan Academy?”. Here’s my version of the Fall story.

If tales about battling a bison for the right to cross a road can get up and walk out of my head,
then…well…better keep blogging.
It doesn’t get much more open than David’s post about how we use machine learning to assess student mastery. If I tried to summarize it for you, you’d see my managerial hair start spiking up and up and up toward the ceiling, Pinocchio-style. I won’t insult the work by talking about the statistics. Instead, I’ll just say that we now have a much better understanding of how competent each Khan student is in each math subject. Thanks, David.
That’s not all he did. From a dashboard to emphasize the importance of our exercises to a much smoother way of asking students to review work they’ve done in the past, he covers it all in this Vi Hart-inspired internship post-mortem.

Keeping an eye on our students’ activity via David’s dashboard

Forgetful Ben from the future is grateful for this video. And he’s also super-jacked and smart.
If anybody reading this has ever beaten Julian Pulgarin in chess, please rub his nose in it. Julian wiped the floor with me so many times that he would beg me to play him while he was blindfolded. My ego can’t handle that sort of hit, so I usually just deleted some data from production and pretended to be disappointed at each newfound emergency’s particularly poor timing.
When he wasn’t humoring me, Julian made major contributions across the board.
He started by building a number of new exercises for students to learn from, including an experimental crack at a new way of teaching fraction intuition. While working on this, he realized that it was painful to test our open source contributor’s GitHub pull requests, so he disappeared for a few days and came back with Sandcastle. Sandcastle automatically tags every pull request with a link that lets our developers test out the requests’ new exercise content in one click.

It’s a shining example of the reason we hire smart people and set them free to get things done. The first time Julian told me about his Sandcastle idea, I didn’t even really understand the direction. Now it’s indispensable.
Julian also gave our first KA Friday Tech Talk about how to do gradual feature rollouts for various segments of a large userbase. He ended up bringing this full-circle at the end of his internship by building Gandalf, our tool for doing the following:

Gandalf lets us selectively roll out features to all sorts of different subsections of our userbase.

“YOU SHA — ok, you guys, Hey!, you guys over there, you can pass — BUT OTHER THAN THEM YOU SHALL NOT PASS.”
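For anyone wondering what a gate like this can look like, here's a minimal sketch of one common approach: hash each user into a stable bucket, let a percentage of buckets through, and let whole groups through regardless. This is not Gandalf's actual implementation. The names bucket_for and can_pass, the user dict shape, and the "staff" group are all hypothetical.

```python
import hashlib


def bucket_for(user_id, feature_name):
    """Map a user to a stable bucket in [0, 100) for one feature."""
    # Hashing feature name + user id keeps a user's bucket stable per feature
    # while giving different features different slices of the userbase.
    key = "{}:{}".format(feature_name, user_id).encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return int(digest, 16) % 100


def can_pass(user, feature_name, rollout_percent=0, allowed_groups=()):
    """Return True if this (hypothetical) user dict should see the feature."""
    # Whole groups can be waved through regardless of the percentage.
    if user.get("group") in allowed_groups:
        return True
    # Everyone else passes only if their bucket falls under the rollout percentage.
    return bucket_for(user["id"], feature_name) < rollout_percent


staff_member = {"id": "u1", "group": "staff"}
visitor = {"id": "u2"}

# Staff always pass; the visitor passes only once the rollout reaches their bucket.
staff_sees_it = can_pass(staff_member, "new_dashboard", 0, allowed_groups=("staff",))
visitor_sees_it = can_pass(visitor, "new_dashboard", rollout_percent=0)
```

Because the bucket comes from a hash rather than a random roll, the same user gets the same answer on every request, and raising rollout_percent only ever adds users.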
Julian even “accidentally” left his chess set in Mountain View for us to mail back to him, which I’m pretty sure was his way of dropping the mic and walking off stage. “You clean up.”
In conclusion, University of Waterloo is legit. I skipped over plenty of work done by both David and Julian, and it still makes for an impressive Fall…Winter…there are seasons in Mountain View? As argued in the previous internship’s summary, any team that’s not dedicating tons of resources to both recruiting and mentoring interns is plain old missing out. We’re loving every minute of working with our interns.
Much like the phoenix or a tyrannosaurus flying a fighter jet, Summer ‘11 intern Joel Burget has risen from out of nowhere and dropped entertaining stories about his summer’s work. If you read them, are impressed, and want to hire Joel…you can forget it. He’s now full-time Joel.
If you wanna tag in next, the Summer ‘12 internship class still has openings.
This is my new favorite jQuery trick. I just learned it this year and have mentioned it in enough code reviews to decide it’s worth sharing.
When manipulating the DOM with jQuery, you often see code that looks something like:
$("#container").show();
$("#container .error").hide();
$("#container .zoo").css("background-color", "white");
$("#container .zoo .monkeys").empty();
$("#container .zoo .title").text("The zoo is empty");
$("#container .zoo input").val("");
$("#container .zoo").animate({height: 250});
…or, if somebody gets concerned about performance, they might try to reduce DOM lookups:
var container = $("#container");
container.show();
container.find(".error").hide();
container.find(".zoo").css("background-color", "white");
...
…and so on. Odds are, when you’re manipulating a single element like container, you’ll probably be doing something to nearby elements in the next few lines of code.

Readers of this blog, meet Emma. Emma, meet the three readers of this blog.
Enter .end(), newlines, indentation, and chaining. When you have one jQuery chain going and modify it with, say, .find(), you’re actually pushing the new chained set of elements onto a stack. .end() pops the current jQuery chain off the stack, which lets you do stuff like this:
$("#container")
    .show()
    .find(".error")
        .hide()
        .end()
    .find(".zoo")
        .css("background-color", "white")
        .find(".monkeys")
            .empty()
            .end()
        .find(".title")
            .text("The zoo is empty")
            .end()
        .find("input")
            .val("")
            .end()
        .animate({height: 250});
It’s easy to read, because the indentation is significant and matches up with DOM nesting levels. It gets rid of unnecessary DOM lookups. But most importantly for me, it feels natural to indent in and out as I write the code, using small additional selectors to step deeper into the DOM and .end() to find my way back out.
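If the stack part feels fuzzy, here's a toy model of it. It's written in Python rather than JavaScript so it runs without a DOM, and it is not jQuery's source: Node, Chain, and do() are invented for illustration. The only idea it borrows is that find() hands back a new set that remembers the one before it, and end() steps back to that one.

```python
class Node:
    """Minimal fake DOM element: a name, children, and a log of actions."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.log = []

    def descendants(self):
        # Depth-first walk over everything below this node.
        for child in self.children:
            yield child
            yield from child.descendants()


class Chain:
    """Toy stand-in for a jQuery object: a matched set plus a link back."""

    def __init__(self, nodes, prev=None):
        self.nodes = list(nodes)
        self.prev = prev  # the set we were on before the last find()

    def find(self, name):
        # "Push": return a new set that remembers the current one.
        matches = [d for n in self.nodes for d in n.descendants() if d.name == name]
        return Chain(matches, prev=self)

    def end(self):
        # "Pop": step back to the previous set (an empty set if there is none).
        return self.prev if self.prev is not None else Chain([])

    def do(self, action):
        # Stand-in for show()/hide()/css()/etc.: record the action on each node.
        for node in self.nodes:
            node.log.append(action)
        return self


error = Node("error")
monkeys = Node("monkeys")
zoo = Node("zoo", [monkeys])
container = Node("container", [error, zoo])

(Chain([container])
    .do("show")
    .find("error")
        .do("hide")
        .end()
    .find("zoo")
        .do("whiten")
        .find("monkeys")
            .do("empty")
            .end()
        .do("animate"))
```

After this runs, the "animate" action lands on the zoo node rather than the monkeys node, because the last end() stepped back out one level.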
I’ve been destroying my old, ugly var’s left and right with this trick. I hope it helps somebody else.
Those who have worked with me will know that I’m an expert on this topic because my code gets laughed at all the time.
I’ve seen Good laughing and Bad laughing. Good is what I imagine happens when Robert De Niro sits next to Al Pacino as they’re watching Al’s cameo in some trainwreck of an Adam Sandler movie, and Bob turns to Al smiling and says, “This is awful.” It’s when you stare at some code and think, “Good grief. I can just imagine whatever took priority over making this code more reasonable.” There’s a knowing wink and a friendly jab between the laughed-at ghost-coder and the laugher:
Hah! I almost feel bad that you wound up solving it this way. I’ve been there before. You must be exhausted. Why don’t you sit down, I’ll take it from here.
Good laughter contains respect for the fact that this code exists at all and is being worked on by more than one programmer, which is more than you can say for what I’d bet is a whole boatload of perfectly refactored, unused files littering hard drives around the world right now.
That’s the type of laughter that’s erupting at Khan Academy these days, and it’s no coinkydink that now is right about when a team of top-notch coders are getting their first gazes into some of my previous creations…er, abysses. I can say with certainty that in these cases the knowing winks and playful jabs are well-deserved.

The rare (and best) third type of laughter.
A quick story before we get to the Bad laughter. Did you know that when Sal was a one-man show, the entire Khan Academy application was one big main.py file? At least the server-side stuff. All the request handlers, URL mappings, datastore models, data migrations, and even some HTML generation in one big 1000+ line file.
How funny is that! And it stayed that way for months! I mean, please. Who the heck wants to work on a codebase that’s one big file? If you haven’t detected the dripping sarcasm yet, recalibrate your sarcasm detector and start this paragraph again.
It’s easy to see how judging Sal by that one big file while he was busy making 2400+ free educational videos would be like judging a geek for wearing a t-shirt while she was just trying to look presentable enough to get to her computer to start writing code. Guess what? Sal’s code is still around, and it’s responsible for helping teach literally millions of students. Yet, as time goes on, it gets more and more likely that one day somebody will laugh at an out-of-place line with the type of judgment that can only come from being out-of-touch with what really mattered back then…and the fact that that line helped change many students’ lives.

You’ve gotta bend this brilliant tweet a bit to apply it to non-profit education,
but you’ll notice “good code” and “bad code” aren’t on the list.
Bad laughter doesn’t need much explanation, it just lacks respect. It happens too much in our industry, and I’m not sure why. I’m proud to say that it’s not a problem at Khan at all, but we haven’t always been 100% immune. We weren’t always 100% immune at Fog Creek. I’ve been guilty of this laughter myself, and I’d bet money it exists elsewhere. It’s most common among coders who feel like they need to prove themselves, and it can be combatted by a team that emphasizes shipping and the healthy laughter that comes from reminiscing about the last crappy hack you were responsible for when you decided to Just Ship It.
Here’s (just one of) ours:
PLAYLIST_STRUCTURE = [
    {
        "name": "Math",
        "items":
        [
            {
                "name": "Arithmetic",
                "playlist": "Arithmetic"
            },
            {
                "name": "Developmental Math",
                "items": [
                    ...
That’s our current Khan playlist structure, defined in code. When Sal adds or renames a playlist, we have to change the code. Laugh it up.
This will actually go away soon in favor of something much nicer, and I’m not arguing for crappy code. I’m putting this example here because it got us this far, without many problems, and who knows where we’d be if Sal had spent the first iteration of khanacademy.org trying to decide on the perfect playlist data structure.
…and a more rambling attempt to get a similar point across to a group of UIUC students by playing this powerful Thank You, Khan Academy video before revealing main.py.
We had some heated debates a while ago about what would happen if we opened up all of Khan Academy’s content for logged out users. Sal’s videos have always been open in this way, of course, but all the interactive exercises and statistical tracking and badges and stuff required an account.
It felt like the right move when we reconfirmed our belief that educational content should be as open and available as possible. We were also persuaded by Fred Wilson’s belief that giving logged out users more power is an effective way to empower your community. But we worried that registrations would drop, because by handing our 250+ exercises to logged out users, we’d be drastically shrinking the carrot on the other side of the “please login!” boundary.
We decided to go for it this summer. Figured I’d share some results.

A dog
4+ months after the change, we know a little more. Jace looked at the data and found:
So more visitors are trying exercises, but they’re not converting into more registered users…yet. We could have fun making up all sorts of explanations. Maybe we need to show off the badges and points unregistered users are accumulating a little bit more, maybe we’re not asking visitors to login forcefully enough, maybe doing math exercises makes people tired, maybe we don’t have enough pictures of golden retrievers on the login page.

Percentage of all visitors who use our exercises
Far more important than any random explanation is the fact that we’re now getting X% more data about how users learn (or don’t learn) thanks to these new exercise visitors. That data is powering some of the best work coming out of Khan Academy so far, so feeding more logged out users into our exercises is, as Jace puts it, a big win on data collection.
Plus, we’re confident that we could iterate and A/B test our way to higher registration rates for our new exercise users. It’s not our priority (for this week, at least). We’re busy cooking up important changes to the core learning experience.
*They don’t have access to everything. The ability to coach and communicate with other users is still restricted to users who have logged in.