Three minute quiz: App Engine datastore performance

As somebody now spending all his time in NoSQL land, my brain perked up when working through The Three Minute SQL Performance Quiz. It was a blast from the past for me, a chance to remember all the little ins’n’outs of SQL performance from a quaint old time when I used to be able to write JOIN statements.

So I thought it’d be fun to develop a similar performance challenge for this new NoSQL life of mine. Since we use App Engine at Khan Academy, we’ll focus on the App Engine datastore. Proceed.

Question 1


Query
Monkey.all().filter(
    "genus IN",
    ["Ateles", "Cebus", "Aotus"]
  ).fetch(10)

Model
class Monkey(db.Model):
  genus = db.StringProperty(
    indexed=True)

Hold on there — a major improvement is possible or All looks good to me — don’t go changin’ a thang

You’re right! Queries that use the IN operator may look like a single query, but they actually run multiple queries behind the scenes, one for each item in the list. That means if you ran the above query, opened Appstats, and looked at this request’s profile, you’d see three separate datastore queries instead of one.

That’s not ideal if you really care about this request’s performance. The requests run asynchronously and overlap as much as possible, which is great, but you’re still running three requests and increasing the likelihood that one of ’em will slow you down.

If you want this to be blazing fast, you almost certainly want to denormalize this set membership into a property that can be queried without an IN operator. Perhaps you’d add something like is_in_favorite_genus = db.BooleanProperty(indexed=True) to Monkey, set that property to True if the genus is one you’re interested in, and then change your query to Monkey.all().filter("is_in_favorite_genus =", True). That’d be a significant improvement — especially if your IN list contained many items. That’s the App Engine way.
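Here’s a minimal sketch of the bookkeeping that denormalization implies. FAVORITE_GENERA and compute_is_in_favorite_genus are made-up names for illustration, not anything App Engine provides, and the datastore calls only appear in comments so the snippet runs anywhere.

```python
# Hypothetical helper names; nothing here is part of the App Engine SDK.
FAVORITE_GENERA = frozenset(["Ateles", "Cebus", "Aotus"])


def compute_is_in_favorite_genus(genus):
    """Return the value to store in Monkey.is_in_favorite_genus."""
    return genus in FAVORITE_GENERA


# Where this would plug in on App Engine (shown as comments only):
#
#   monkey.is_in_favorite_genus = compute_is_in_favorite_genus(monkey.genus)
#   monkey.put()
#
#   Monkey.all().filter("is_in_favorite_genus =", True).fetch(10)

flag_for_cebus = compute_is_in_favorite_genus("Cebus")
flag_for_pan = compute_is_in_favorite_genus("Pan")
```

The catch with any denormalized flag is that it goes stale: if the list of favorite genera changes, existing Monkeys need to be re-put with a freshly computed value.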

Question 2


Query
Monkey.all().filter(
    "unique_name =",
    "bob_the_monkey"
  ).get()

Model
class Monkey(db.Model):
  unique_name = db.StringProperty(
    indexed=True)

Hold on there — a major improvement is possible or All looks good to me — don’t go changin’ a thang

You’re right! If you’re loading a single entity using a unique identifier, queries aren’t as fast as loading by key or key name. There are multiple ways to load an entity by key that we won’t get into here. Just know that if you have unique identifiers for your models, you should strongly consider constructing your entities with the unique identifier as part of the key name, Monkey(key_name="bob_the_monkey", **kwds), so you can quickly retrieve it later: Monkey.get_by_key_name("bob_the_monkey")

If you’re curious why this is faster than a query, it makes sense if you’re willing to swallow the gross oversimplification that a query is just quickly looking up the key in an index and then fetching the entity using the key. If you’ve already got the key, why query?
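To make that oversimplification concrete, here’s a toy in-memory model that counts reads. ToyDatastore is entirely made up for illustration and is not how the real datastore is implemented; it only shows why a query costs one more hop than a get by key name.

```python
class ToyDatastore(object):
    """Toy stand-in for the datastore; purely illustrative."""

    def __init__(self):
        self.entities = {}    # key name -> entity dict
        self.name_index = {}  # unique_name value -> key name
        self.reads = 0        # simulated storage reads

    def put(self, key_name, entity):
        self.entities[key_name] = entity
        self.name_index[entity["unique_name"]] = key_name

    def get_by_key_name(self, key_name):
        self.reads += 1  # one read: the entity itself
        return self.entities.get(key_name)

    def query_by_unique_name(self, unique_name):
        self.reads += 1  # first read: the index row
        key_name = self.name_index.get(unique_name)
        if key_name is None:
            return None
        return self.get_by_key_name(key_name)  # second read: the entity


store = ToyDatastore()
store.put("bob_the_monkey", {"unique_name": "bob_the_monkey"})

store.get_by_key_name("bob_the_monkey")
reads_for_get = store.reads

store.reads = 0
store.query_by_unique_name("bob_the_monkey")
reads_for_query = store.reads
```

In this toy, the get costs one read and the query costs two. Real numbers will differ, but the shape of the argument is the same.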

Question 3


One of the coolest dogs ever or Meh, she’s so-so

You’re right! Shouldn’t need explainin’.

Question 4


Putting an entity
m = Monkey(
    name="Bob the Monkey",
    favorite_color="blue",
    favorite_food="pizza",
    worst_enemy="honey_badger"
  )

m.put()

Model
class Monkey(db.Model):
  name = db.StringProperty()
  favorite_color = db.StringProperty()
  favorite_food = db.StringProperty()
  worst_enemy = db.StringProperty()

Hold on there — a major improvement is possible or All looks good to me — don’t go changin’ a thang

You’re right! Kinda. It depends how you’re querying this entity. All of the datastore properties on Monkey are indexed by default, and that means every time you put() a Monkey, you’re spending time writing new values to all of those indexes.

Now, if you need to be able to run queries to find Monkeys based on their favorite_color, favorite_food, worst_enemy, and so on — then you’re doing the right thing. You need the indexes to be able to query them. If your code base is anything like Khan Academy’s used to be, however, you may have tons of properties w/ automatic indexes that never, ever need to be queried. We fixed this by using db.StringProperty(indexed=False) in these cases (and writing a linter that requires us to explicitly specify indexed=True|False for every datastore property).

Write speed may not matter for your app, but if it does you can speed up writes by not wasting time writing to indexes you don’t need.
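For the curious, a linter like the one mentioned above could look roughly like this. It’s a sketch, not Khan Academy’s actual linter: find_unspecified_indexes is a made-up name, and it assumes properties are always written as db.SomethingProperty(...).

```python
import ast


def find_unspecified_indexes(source):
    """Return (line, property_type) for db.*Property(...) calls lacking indexed=."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_db_property = (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == "db"
            and func.attr.endswith("Property")
        )
        if not is_db_property:
            continue
        # Flag the call unless it passes indexed= explicitly.
        if not any(kw.arg == "indexed" for kw in node.keywords):
            problems.append((node.lineno, func.attr))
    return sorted(problems)


# The source is only parsed, never executed, so "db" needn't be importable.
EXAMPLE = '''
class Monkey(db.Model):
    name = db.StringProperty(indexed=True)
    favorite_color = db.StringProperty()
    worst_enemy = db.StringProperty(indexed=False)
'''

problems = find_unspecified_indexes(EXAMPLE)
```

Here only favorite_color gets flagged. A real linter would probably want to skip property types that are never indexed, such as db.TextProperty.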

Question 5


Query
query = Monkey.all()
query.filter("name =", "bob")
query.filter("zoo =", "Manhattan")
query.filter("hats >", 5)
query.get()

Model
class Monkey(db.Model):
  name = db.StringProperty(indexed=True)
  zoo = db.StringProperty(indexed=True)
  hats = db.IntegerProperty(indexed=True)

Hold on there — a major improvement is possible or All looks good to me — don’t go changin’ a thang

You’re right! This query will work. And when you first start out and only have a couple hundred Monkeys, it’ll be blazing fast. But as your family of Monkeys grows, you may find yourself with a Very Serious™ performance issue. Why? You have all of the properties indexed, you’re only asking for a single entity…what’s the problem?

You don’t have a perfect index defined. Without an index that perfectly covers all of the properties involved in the query’s filters, here’s what App Engine has to do when you ask it to fetch a single entity: start querying the datastore for, say, Monkeys with name == “bob”, retrieve them from the datastore, and filter through them to find any with zoo == “Manhattan” and more than 5 hats. That’s not what you want, and in Appstats it shows up as one datastore RPC after another for a single get().

That’s a disaster. Depending on the shape of your data, you could spend hundreds of RPCs just returning a single entity from the datastore. You want App Engine to query the datastore for “Monkeys named bob in the Manhattan zoo who own more than 5 hats” in one swift, single blow; it should only need one RPC. And it’ll do exactly this if you have a perfect index. You’ll need an entry in index.yaml, something like:

- kind: Monkey
  properties:
  - name: name
  - name: zoo
  - name: hats

If you aren’t careful and don’t keep an eye to make sure that your important queries have indices that perfectly cover ‘em, you could have a piece of code that runs snappy one day and, months later when your datastore fills up with Monkeys, it’ll suddenly take 5+ seconds to return a single entity. We’ve experienced this at Khan Academy more than once. It’s public enemy number one due to how easy it is to not notice the problem at first and then encounter brutal performance problems later.

Question 6


Query
query = Monkey.all()
query.filter("name =", "bob")
query.filter("zoo =", "Manhattan")
query.filter("hats >", 5)
query.get()

Model
# PolyModel lives in its own module:
# from google.appengine.ext.db import polymodel
class Animal(polymodel.PolyModel):
  name = db.StringProperty(indexed=True)
  zoo = db.StringProperty(indexed=True)

class Monkey(Animal):
  hats = db.IntegerProperty(indexed=True)

index.yaml
- kind: Animal
  properties:
  - name: name
  - name: zoo
  - name: hats

Hold on there — a major improvement is possible or All looks good to me — don’t go changin’ a thang

You’re right! What now?! You’ve got your perfect index on all the queried properties! Well, not really.

This’ll suffer from the exact same Very Serious™ performance issue from Question 5. By using PolyModel and querying specifically for Monkeys, not just any old Animal, you’re implicitly adding another filter to your query. This filter uses PolyModel’s built-in special property, class, to only return Monkey results. If you don’t have an index that covers all filtered properties including class, you ain’t got no perfect index. You need something like:

- kind: Animal
  properties:
  - name: class
  - name: name
  - name: zoo
  - name: hats
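If you want a quick sanity check for Questions 5 and 6, here’s a rough sketch. index_covers_query is a made-up helper, not an App Engine API, and it assumes a simplified rule: the index lists every equality-filtered property first (in any order), then the single inequality-filtered property. Sort orders and ancestor queries are ignored.

```python
def index_covers_query(index_properties, equality_filters, inequality_filter=None):
    """Rough check that a composite index lists every filtered property."""
    equality = set(equality_filters)
    head = index_properties[:len(equality)]
    tail = list(index_properties[len(equality):])
    expected_tail = [inequality_filter] if inequality_filter else []
    return set(head) == equality and tail == expected_tail


# Question 5: a plain Monkey kind filtered on name, zoo, and hats.
q5_ok = index_covers_query(["name", "zoo", "hats"], ["name", "zoo"], "hats")

# Question 6: PolyModel sneaks in an extra equality filter on "class".
q6_old_index = index_covers_query(
    ["name", "zoo", "hats"], ["class", "name", "zoo"], "hats")
q6_new_index = index_covers_query(
    ["class", "name", "zoo", "hats"], ["class", "name", "zoo"], "hats")
```

The Question 5 index passes, the original Question 6 index fails because nothing covers class, and the index with class added passes.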



That’s it, you’re done! I could keep going, but by now you’ve spotted a trend in the answers and I’m pretty sure we’re past the 3-minute mark.

If you want more or have other interesting quiz questions I’d love to know. And if you geek out on perf work like me, you know what to do.

3/5/14, 12:47am
Email transparency at Khan Academy

Whenever we mention that almost all Khan Academy email is visible to everybody on the team, people always wanna know more.

The idea is unapologetically copied from Stripe. Whether they originally came up with it or not I dunno (edit: they did). We certainly didn’t. But by now we’ve added enough of our own little tweaks to warrant contributing back. Here’s the how and why of “radical email transparency” at Khan.


How we got started


  • Step 1) Read Stripe’s post.
  • Step 2) Get forwarded Stripe’s post by at least 3 other devs within a couple days.
  • Step 3) Implement a super hacky version of Stripe’s post.

That all happened ~10 months ago. I’d absolutely do it again. I’m not gonna list all the ways this is different than Stripe’s — our system is just a bit simpler/smaller/stupider.


How our hacky version works


Every team has two email addresses: one for team members and one for the team’s “blackhole.” For the analytics team, that’s analytics-team@khanacademy.org and analytics-blackhole@khanacademy.org.

The -team@ address is for emailing all members of the team.
When you send email to analytics-team@, you expect everyone on the analytics team to read it.
Subscribing to analytics-team@ means analytics-related email will land in your priority inbox as soon as it’s sent, and you’re expected to read it.

The -blackhole@ address is for anything else that has anything to do with analytics.
When you CC:analytics-blackhole@, you don’t expect subscribers to immediately read it.
Subscribing to analytics-blackhole@ means you’ll receive analytics-related email, but it’ll get filtered out of your inbox and you’re not expected to read it unless you feel like it.

There’re two additional catch-all lists for the entire dev team: dev-team@ and dev-blackhole@. So we currently have ~24 lists. dev-team@, dev-blackhole@, analytics-team@, analytics-blackhole@, mobile-team@, mobile-blackhole@, i18n-team@, i18n-blackhole@, …

Anybody in the org can join any of these email lists. analytics-team@ is usually just team members, but analytics-blackhole@ has all sorts of lookie-loo subscribers who’re interested in analytics happenings.

All email to analytics-team@ gets forwarded to analytics-blackhole@, so lookie-loo -blackhole@ subscribers will get updates sent to -team@ automatically.

When you receive an email via a -blackhole@ list, it is automatically tagged as blackhole and filtered out of your inbox.

If an email is directly addressed TO:you@ or TO:your-team@, it’ll stay in your inbox regardless of any CC:-blackhole@.

Best practice when sending email: unless you have good reason not to, CC dev-blackhole@ or any of the other -blackhole@ lists.

Best practice when reading email: read the emails in your blackhole on your own schedule, if and when you want.

Between these buckets and best practices we have the tools we need to make the contents of any email open to anyone interested without burying ourselves in a deadly deluge.


Example emails


  • Jace emails Ms. Monkey (an analytics teammate of his) about a new experiment being run. TO:msmonkey@, CC:analytics-blackhole@
  • Tom emails the internationalization team about a change that may affect the way usage statistics are calculated for non-English users. TO:i18n-team@, CC:analytics-blackhole@
  • I email Mr. Gorilla asking when he’d like to demo his latest work for the company. TO:mrgorilla@, CC:dev-blackhole@
  • Craig emails the infrastructure team with a summary of upcoming priorities. TO:infrastructure-team@
  • I email Marcia about a personal career matter that shouldn’t involve others. TO:marcia@.

How it’s technically implemented


Google Groups ‘n’ Gmail filters. You’re about to see how hacky the rabbit hole goes.

Google Groups

  • Each team gets their own -team@ and -blackhole@ groups.
  • Each -blackhole@ group is a member of its respective -team@ group.
  • Group settings:
    • Permissions | “Who can join this group?” ==> “Anyone in the organization”
    • Description includes “(email-transparency)”
    • Information | Directory | Check “List this group in the directory”
  • We set up a short URL to a Google Groups search for “(email-transparency)”, so now anybody can go to khanacademy.org/r/email-transparency and subscribe to whatever lists they want.

Gmail Filters

  • Everybody has the following filters. Our setup doc has ‘em exported to xml, which everybody imports via Gmail | Settings | Filters | Import filters.
    • Matches: to:(*-blackhole.khanacademy.org)
      Do this: Apply label “blackhole”
    • Matches: from:(-me) to:(*-blackhole.khanacademy.org -me -*-team.khanacademy.org)
      Do this: Skip Inbox
  • Now emails received via blackhole lists are labeled and filtered out of your inbox unless they were addressed to you or your team.
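Here’s roughly what those two filters decide, written out as plain Python. It’s a sketch of the logic only, not anything Gmail runs: classify is a made-up function, it assumes `recipients` holds every To and CC address, and the jace@, tom@, and lookieloo@ addresses are invented for the example.

```python
def classify(sender, recipients, me):
    """Return (labels, skip_inbox) for one message, mimicking the two filters."""
    local_parts = [addr.split("@")[0] for addr in recipients]
    to_blackhole = any(part.endswith("-blackhole") for part in local_parts)
    to_team = any(part.endswith("-team") for part in local_parts)
    to_me = me in recipients

    # Filter 1: anything sent to a -blackhole list gets the label.
    labels = ["blackhole"] if to_blackhole else []
    # Filter 2: skip the inbox unless it's from me, to me, or to any -team list.
    skip_inbox = sender != me and to_blackhole and not to_me and not to_team
    return labels, skip_inbox


ME = "lookieloo@khanacademy.org"  # hypothetical -blackhole@ subscriber

# Jace's email: TO:msmonkey@, CC:analytics-blackhole@
jace = classify(
    "jace@khanacademy.org",
    ["msmonkey@khanacademy.org", "analytics-blackhole@khanacademy.org"],
    ME)

# Tom's email: TO:i18n-team@, CC:analytics-blackhole@
tom = classify(
    "tom@khanacademy.org",
    ["i18n-team@khanacademy.org", "analytics-blackhole@khanacademy.org"],
    ME)
```

Jace’s email gets labeled and skips the inbox. Tom’s gets labeled but stays in the inbox for everybody, because the filter can’t tell whether you’re actually on the i18n team.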

Why I’m happy


This experiment pushed my personal comfort zone at first — “wait, is it really ok for everyone to see this?” — but I feel great about an email culture that’s open by default and wouldn’t wanna go back.

Trusting devs to subscribe to whatever content they find helpful makes me happy. Trusting everyone to not waste time reading email that’s unimportant for them has worked out. We’ve seen a lot of value from -blackhole@ subscribers getting news they wouldn’t otherwise have seen.

You know those moments right before you send an email when you sit there and type names into CC, then delete ‘em, then re-type ‘em, all because you’re trying to figure out who cares? That’s gone. It’s up to the subscribers. Just send it to the critical people and CC a blackhole list. Fewer email decisions.

You know those once-in-a-while emails that’re full of insight? Bullet-point priority breakdowns and strong opinions and team schedule updates? Sometimes they’re only read by one or two people, maybe because the author doesn’t feel comfortable blasting all@company.com. That’s just silly. I like ‘em in an open, searchable spot. Blackholes are perfect.

You know how your inbox is already overloaded and the last thing you want is more email? No problem, don’t subscribe. Or only subscribe to -blackhole@ lists and only skim ‘em once in a blue moon. I do the latter.

You know how you keep saying you want to keep your team as flat as possible for as long as possible? Avoiding titles is one thing, sure. But flattening communication — trusted access for everybody — is huge.

Openness and trust have been a big part of Khan Academy’s dev team since the start. This experiment kinda fit right in.


Why I’m not so happy


Our biggest problem, by far, comes from imperfections in the Gmail filter setup. Filters have no way of knowing which lists you’re subscribed to, so if somebody sends an email to analytics-team@ and dev-blackhole@ and you receive the email, we don’t know if it should stay in your inbox or not. You may’ve received it via the blackhole list, but if you’re subscribed to analytics-team@ it shouldn’t disappear to the blackhole.

Right now we err on the side of safety by not removing anything from your inbox if it was sent to a -team@ address, but this means some messages that should be in your blackhole are not. Stripe has tried to work around this via a library to automatically generate complicated Gmail filter combinations, but that doesn’t feel quite right for us. It’s a bit complicated for newcomers to use, and it requires reconfiguring your filters every time you join or leave a list.

I’m sure there are tools out there that could solve this, probably by abandoning Google Groups altogether. Ideas welcome.

I’m also not in love with the Google Groups UX that lets people find/join/unjoin lists: khanacademy.org/r/email-transparency. No question there are tools that’d handle this use case better. Supposedly Stripe built a custom interface on the Groups API. ༼ つ ◕_◕ ༽つ giff open source plz!


Anybody else on this train?


Have others experimented w/ email transparency? We won’t go back. But as you now know our setup is far from perfect. Would be interested in learning from others.

1/1/14, 2:15pm
The most common feedback we give dev interns

I’ve been lucky to see ~70 interns pass through the dev teams at Khan Academy and Fog Creek. If you know me you know how much I enjoy internships. Infinitely better than an interview, enough time to get meaningful work done, a chance to sit side-by-side for months, and interns see real personal and career growth (not to mention compensation).

And we’ve been lucky, talent-wise. Age be-damned, we always wind up learning from our young bloods and wishing some would never leave. But we also invest a lot of energy mentoring them, and I’d like to share the two pieces of feedback that’re most often in our mentors’ mouths. Figured this’d be useful to future interns anywhere.


1. Write code with your code reviewer in mind


We love code reviews. I’m kinda tempted to set up a picture frame on each intern’s desk, with a photo of their code reviewer kindly smiling at them and “Be nice to your reviewer!” written along the bottom of the frame in fancy cursive font.

Avoid dumping massive blobs of code on folks. You’re probably hyper-productive as an intern with nothing else to do but write code. Don’t quietly assemble a big bomb of code over 5 days and then drop it on your poor, unsuspecting team lead. Granular checkin points — one conceptual change per commit — give your code reviewers a chance to engage easily and offer feedback as you go along.

Use TODOs liberally. A comment like “TODO(intern bob): transition to new API once backfill is done” gives your reviewer a sneak peek inside your head. You won’t be able to do everything you want in every changeset, especially if you’re focused on shipping and following our advice for granular commits. So throw these suckers around with pride. Heck, one of our past interns wrote a script that generates a TODO leaderboard from our codebase.
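A TODO leaderboard doesn’t take much. This is a guess at the idea, not our intern’s actual script: todo_leaderboard and TODO_PATTERN are made-up names, and it assumes every TODO follows the TODO(owner) convention.

```python
import re
from collections import Counter

# Matches TODO(owner) and captures the owner.
TODO_PATTERN = re.compile(r"TODO\(([^)]+)\)")


def todo_leaderboard(lines):
    """Count TODO(owner) markers per owner, most prolific first."""
    counts = Counter()
    for line in lines:
        for owner in TODO_PATTERN.findall(line):
            counts[owner.strip()] += 1
    return counts.most_common()


sample_lines = [
    "# TODO(intern bob): transition to new API once backfill is done",
    "x = 1  # TODO(alice): remove hack",
    "// TODO(intern bob): add tests",
]

leaderboard = todo_leaderboard(sample_lines)
```

With the sample lines above, intern bob leads with two TODOs and alice trails with one. Pointing it at a real codebase is just a matter of feeding it the lines of every source file.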


2. Talk about what you’re doing and how you’re gonna do it


After all those interns, I can count on half a hand the number who talked too much about what they were building.

Your work does not need to be complete to be worth sharing. Bums like me put a lot of effort into building communication channels for the team. Use them. At Khan we’ve got HipChat, transparent emails, weekly demo opportunities, dogfood days…all just begging for interns to post sketches, screenshots, design docs, and prototypes. The best communicators I know consistently face their fears by posting works-in-progress before being asked to do so (and clearly stating that more is on the way).


An excellent communicator

By the way, you’re not bragging about what you’ve done to “take credit” in some ugly political fashion. It’s part of your job to brag, to make sure your team knows your progress and what direction you’re heading.


The myth of the genius programmer


That’s a short list, I know. I wanted to title this post “You’ll Never Believe These Two Simple Tips That’ll Make You a More Valuable Intern,” but then I would’ve had to throw myself out a window and I’m pretty excited to see how 2014 turns out.

The truth is both of these pieces of feedback are about dispelling the myth of the genius programmer. Brian Fitzpatrick stopped by Khan Academy a while back to shed some light on this:

The ultimate geek fantasy is to go off into your cave and work and type and code and then shock the world with your brilliant new invention. You know, it’s a desire to be seen as a genius by your peers.

The fantasy, the myth that feverishly creating while all alone is genius, hits all of us, especially interns coming right out of school. It’s certainly lodged somewhere deep inside me. But Fitz and his longtime collaborator Ben Collins-Sussman know that only emerging from your cave when you’ve discovered perfection is insecurity, not genius. It’s what makes us accumulate huge unreviewed swaths of code and clam up when we should be sharing our progress.

The best programmers I know brave their imperfections. They limit themselves to granular changes that others can understand and constantly talk about whatever they’re doing. And that’s why we give interns the above advice. If you can somehow show up and apply your talents in this same way right out of school, you’re probably gonna be seen as a genius.



We’re still accepting applications for our next class of Khan Academy interns.

Due props to Tom Yedwab and all other mentors who directly or indirectly contributed.

12/29/13, 1:59pm
"Shipping beats perfection" explained

When we sat down to write Khan Academy’s company values in 2010, “shipping beats perfection” flew out of Sal’s mouth before our butts hit the chairs.

It’s almost 3 years later. I have one-on-ones with teammates new and old who wanna talk more about what “shipping beats perfection” means. When outsiders happen across our development principles, “shipping beats perfection” becomes a lightning rod for anything from compliments and respect to accusations and fury. Our internal chat bot, the culture cow, drops “shipping beats perfection” as occasional chat room reminders (between MOOs).

At some point along the way when the phrase passed between my ears for the NNth time, my brain started believing it was invented by Google or Facebook (nope, that’s “move fast, break things”) or some other big boy with so much success that small fish like us just naturally start repeating whatever they say.

But now that I sit down to explain it, I’m googling for "shipping beats perfection" -khan and not finding much. And while I love its simplicity, “shipping beats perfection” contains subtlety in need of explanation.

We’re willing to be embarrassed about what we haven’t done…


We’re willing to be embarrassed about the things we haven’t done yet. Did you know our mobile app only offers video viewing, not the rest of our interactive platform? Did you know we don’t yet completely cover loops in our programming tutorials and challenges? Did you know we don’t have fully immersive simulations for teaching students physics?

Well, that’s all true, and quite embarrassing…for now. Would we go back in time to delay our mobile app’s launch until we figured out how to support the entire Khan Academy platform in an app? Absolutely not. Would we go back and undo the launch of Khan Academy programming because it doesn’t yet contain all of the content it really needs to? Absolutely not.

Is this the right philosophy for all products? Absolutely not. But educational content is so badly needed right now, and students are so hungry, that it’d be vain of us to think satisfying our own hunger for perfection is worth more than students’ needs. We’ll get to the complete mobile app. We’ll get to better coverage of computer programming content. Maybe we’ll even get to a fully immersive physics simulation. One day.

…but not willing to be embarrassed about what we have done.


"Shipping beats perfection" doesn’t mean we should ship something we’re embarrassed of. Far from it. Would we ever put out physics content that felt crappy or didn’t help students learn? Absolutely not. Would we ever push out a mobile app with a frustrating video viewing experience? Absolutely not.

That would be embarrassment over what we have built. We don’t ship that.

I’m not taking advice from anybody who says ‘shipping beats perfection’…makes no sense anyway you slice it…stability, cache issues, etc.

Concerned Redditor

Concerned Redditor, you need not worry your pretty little karma-filled head. We work hard for performance and stability, and, well, I don’t even know why I’m defending us against this statement because “shipping beats perfection” doesn’t in any way mean “ship crappy code.”

Wait, what? “Shipping beats perfection” can play nice w/ high code quality?


We code review literally every change and demand clear, understandable code. We docstring almost everything. Unit tests are popping up everywhere. If a code reviewer is confused by anything, she can simply say, “I’m having trouble understanding this part,” and it’s on the coder’s head to fix their code or documentation so things are clearer for future readers.

Perfection? Far from it. We can only afford this level of quality and still ship like crazy because we’re willing to be embarrassed about plenty of other things we haven’t gotten to yet.

Bring out the strawman


Try this on for size.

You’re assigned the task of adding a brand new data report for teachers who need to know more about what their students have learned. No doubt in your mind, the report is going to be huge for them. Plus, you just did a design sprint, so you think you know exactly what needs building and how long it should take.

You crack your knuckles and wryly smile as you’re about to fire up your editor and do what you do best.

You crack open javascript/coach-reports/reports.js and can’t believe what you see. What is all this old cruft? Your new report would be soooooo much easier if the data was just bundled up a bit differently. Plus the logic for these other reports is real messy. If you take a couple weeks right now, you could clean things up, and then knocking out your new report will be a breeze a coupl’a days later.

What’s the right move here? Refactor everything and fix it, even though you’ll lose a couple weeks at first? Curse crappy code like a sailor and just hack the new report right on top?

Spoiler alert: we don’t have enough information to answer. How badly do teachers need this? When? Is the “cruft” a bunch of edge cases that really do matter and shouldn’t be thrown away willy-nilly? Is it really old code genuinely in need of replacement?

Every coder faces this demon. The good ones take a step back, ask the above questions, and choose appropriately for each situation. The bad ones dogmatically believe “any code I’m around must be perfect” or, equally as bad, “just ship it.”

Leave it better


You won’t always know the answer, so here’s something for your toolbelt: “leave it better.” Can you liberally add TODOs around the old code, explaining what you will do to fix the situation soon, write your new code such that it demonstrates the new pattern you proudly suggest, and at the same time solve the pressing problem for teachers?

If so, you left it better. You didn’t delay teachers’ needs for two weeks due to a refactor. You didn’t write lots of new code that you’re embarrassed of. Sure, you may need one messy hack to link your new pattern to the old code. That’s ok, you did so for a reason — for learners — and you added a helpful TODO and a Trello card just to be sure you’ll get back to it. Sure, you’re embarrassed that you didn’t do the full refactor yet. That’s ok, it’s the type of embarrassment — we just haven’t done the work yet — that we’re ok with.

If you’re the type who can’t “just” leave it better but must make code perfect, then you’re satisfying your own needs instead of learners’. You’re violating “shipping beats perfection.”

A story to end on


We’ve seen videos of Spanish-speaking students in South America using Khan Academy to learn math. If a UX guru walked into one of these classrooms in Peru and sat down next to a student, here’s what they’d report back:

  1. Spanish-speaking student goes to www.khanacademy.org.
  2. Student sees a bunch of text in English but clicks around enough to find the math problem she’s trying to practice.
  3. Student selects all of the text in the math problem, then opens another tab, goes to Google translate, pastes the text in, and reads the Spanish translation of the math problem.
  4. Student returns to the tab w/ Khan Academy open, writes her answer, gets the problem right.
  5. Student, seemingly unaware of the usability disaster she’s just been tortured by, turns to UX guru and smiles blissfully, thrilled by her success.

And then at this point the UX guru’s head would a-splode.

We’ve seen videos of this happening (minus head a-splosion). Many of us have felt deep embarrassment in the past over our lack of translated versions of Khan Academy. But shipping beats perfection. For a long time we weren’t ready to tackle translations. We had to swallow our embarrassment and move forward with the English platform.

So should we be satisfied? Absolutely not. In about a week, a fully internationalized Spanish version of Khan Academy will be out of alpha. Will it be perfect? Far from it. Will 100% of our content be translated? Absolutely not. Is our internationalization code free of TODOs and the occasional messy hack? Absolutely not.

Will our internationalization work leave our students, our product, our code quality, and hopefully our world in a better place than before? Absolutely.

You only have to watch one Spanish-speaking student joyfully use an English-only math resource to realize that high code quality and perfect UIs don’t matter for their own sake. They matter when they make a difference for learners. So we leave things better every day, are willing to be embarrassed about what we haven’t done, take pride in what we have, and ship great educational content to everybody as fast as we can.

Shipping beats perfection.

9/9/13, 9:25am
"Our team needs more people"

You know those puzzles with the matchsticks? Where you have some shape made out of ‘em and then you remove one and somehow, magically, you’re supposed to reform the original shape with one less matchstick?


"Remove one matchstick and rearrange the rest to reform the original shape."
(It’s a joke, it won’t work, don’t get nerd sniped.)

That’s the image that fills my brain when I try to make sure the right people are working on the right teams.

As a dev team scales it crosses the specialization barrier and forms internal teams. And every once in a rare while, for one reason or another, those teams need to be shifted around without disturbing the bigger picture.

Maybe a bunch of interns just left and certain teams are short-handed. Maybe somebody needs a team switch for their own personal sanity. Maybe a project that’d only been in the backs of your minds for the last two years is suddenly injected with a fever pitch of support, and you need to get things moving quickly on this difficult, previously unstaffed challenge or be remembered as the chump who was unwilling to disturb your oh-so-perfect team breakdown to make room for something new and far more important.

Doesn’t really matter how — it happens. It’s happening at Khan Academy right…wait for it…NOW due to our talented interns so rudely leaving us to finish school or something.

As we consider our priorities, I’ve heard the following from no less than 6 different teams:

We’re pretty stretched at the moment. Given how critical we are for the mission, our team needs more people.

For the first one or two, I was pretty optimistic. We could make a small change, help their team, and maintain the big picture. I was always kinda good at those matchstick puzzles.

Around request four or five I realized that the puzzle’s impossible. Need moar matchsticks. There’s no way we can give all the support we want to all the teams.

One terrible response to this would be to start growing so quickly that we abandon our hiring principles. That won’t be happening. We could also decide to have fewer teams slash do less stuff — a principle I almost always argue for — but Khan Academy has done a very good job recently defining our core mission and focusing on what matters. We don’t want to do any less.

Hence my conundrum. I was pacing around inside our office, then outside our office, then outside other people’s offices. At some point my pacing was interrupted by a lucky 1-on-1 with Craig. I bored him with my matchstick saga and he started laughing. He shared:

Eric Schmidt always said that if one of your teams isn’t asking for more people, something’s wrong. Teams should be able to do ~90% of what they want — not enough people to do everything, but not so few that they’re working unhealthily.

Relief washed over me in an awesome wave. It’s hard to listen to a team asking for help and not immediately help. Hearing that those asks are expected in a culture as respected as Google’s made me feel much better about Khan Academy’s current team breakdown.



Adam Wiggins’s guide to scaling a dev team has been a go-to article for me at many points, but none more so than when first splitting into specialized teams and weathering the associated storm.
8/25/13 — 11:43am Permalink
Team culture for free

The other day I was talking to David, learning whatever I could about how he works his VP Engineering magic over at Stack Exchange. Technical interviews came up. He was interested in new types of interview questions, not just the standard difficult programming challenges. He put it something like this:

We kinda get the whole “devs who can solve really hard technical problems” thing for free. We just hire the top Stack Overflow users.

Tough luck, I know.

Money for nothing and your devs for free


Stack’s entire company is built around surfacing experts who offer great answers to hard questions. It’s only natural their own dev team will have first pick from — and deep insight into — a pool of hard problem solvers they’ve been obsessively building since day one. I don’t know if I’d call that “free” (probably actually costs millions of dollars), but it’s enviable.

It’s team culture — extremely valuable culture — that just sorta falls out of their mission.


"No, wait wait. I’m being serious for a second. I really love you guys."

That had me thinking when our discussion shifted to encouraging a culture of employee growth.

Growth mindset for Khan Academy developers


At Khan, our entire company is obsessively focused on helping students grow. Compassion for students who don’t know how to divide decimals yet. Belief that a kid struggling near the bottom of the class can soon be at the top if given the right mentoring. Embodiment of a growth mindset, the idea that talent is not simply fixed but can be developed through hard work.

We try to build this growth mindset into our product — and hopefully our users — all day e’ry day. It’s only natural we’d bring the same attitude into our team’s inner workings.

I think we do a uniquely good job growing our own developers. I’ve noticed this growth in certain individuals since starting at KA (personally seen the impostor syndrome banished from a talented developer’s head at least three times). But it was really drilled into me during the 1-on-1 talks I just had with each of this summer’s interns. They raved about how much they’re learning, and not just technically. I attribute this to the growth-focused attitudes of their mentors.

This enviable culture — extremely valuable culture — has just sorta fallen out of our mission.

I feel proud that we’re becoming a place that turns good devs into great devs and great devs into free electrons. But I should probably just feel lucky, because this attitude has been sewn into Khan Academy from the get-go.



You should probably apply for a Khan Academy internship.

8/10/13 — 12:07pm Permalink
How we ran the second Khan Academy Healthy Hackathon

I heard from a number of people who got value out of last year’s “How We Hackathon" post (that number, for those curious, is ‘3’). Here goes round two.

Just like before, our hackathons’ defining characteristics are: act like a healthy hacker, "hacking" means creating (not just coding), and everyone demos. Those looking for more details and a strong sedative can read my email below.

diff last_year this_year


  • We stretched the hackathon over the whole weekend this time. Friday 4pm to Sunday 4pm instead of ending on Saturday. Teams just needed more time to hack.

  • We had more Roaming Hackers™ helping out anyone in need of advice. Was even more important this time ‘round since we were lucky to have visitors from the KA Lite team.

  • Instead of judges choosing winners, Marcia had the wise idea of giving everybody who demos one ticket to a raffle along with 3 extra tickets they must award to teams or persons behind especially impressive projects. I think this scheme went really well. It highly rewarded deserving hackers while giving everyone a chance to win. It left me feeling much better than last year’s judging.

What was built?


From video lessons of hand-soldering a working computer to multiplayer math games to drastic improvements to Khan Academy’s internationalization pipeline to proofs of concept that we had, weeks ago, decided were problems so difficult we couldn’t tackle ‘em for another 6 months…I can’t simply summarize what I just saw demo’d a few hours ago. I’ll be blogging about my project, so here’s where I ever so subtly encourage others to do the same.

The amount of value — value that’ll make its way to real students — created since 4pm Friday has humbled and energized me. Thank you to everyone involved.

Wanna run a hackathon of your own?


Feel free to borrow liberally from our planning communiqué:

To: all@khanacademy.org
Subject: Oh, the Features You’ll Hack!

You’ve heard all the whispers,
The rumors are true.
Two weekends from now,
Healthy Hackathon Part Deux.

You’ll think up some hacks. Consider the code.
About some you will say, “App Engine can’t handle that load.”
With your head full of brains and your scripts full of TODOs,
you’ve come to build what you want, to ship what you choose.

[Cue epic music.]

♤✌☭☂♘ (The Second) Khan Academy Healthy Hackathon. 7/19 - 7/21. ♘☂☭✌♤

I wasn’t here last year. WTF is a hackathon?

A chance to get together with teammates and build absolutely anything you want that’s connected to KA. You don’t have to be a coder! This is open to the entire company. We’re all about making, not just coding — and it doesn’t matter how wild or crazy your idea is.

Time to start letting your brain wander.

But WTF are we hacking on?

Anything related to KA. Below is a grab bag of examples, but the best ideas will come from you. Some of the following actually shipped last year.

…or whatever. Here’s last year’s board full of ideas, things that were demo’d, and stuff that was shipped.

Just WTF is so different about our hackathon?

It’s not just for coders. Draw a mural on our walls. Fill in a spreadsheet. Moderate our discussion boards. Write a KA song. As long as you’re creating something that you can show off to the rest of the group, you’re hacking and you’re in.

It’s healthy. We reject the whole Red Bull, 3am, alcohol-powered image of hackathons. We invest in employees for the long term. We’ll eat well and get sleep.

K. WTF am I supposed to do before Friday?

  • Add your ideas to the Trello board
  • Commit to an idea if you’re sold.
  • Start lobbying others to join your team. We humbly suggest using this as a chance to work with people you don’t work with every day.
  • Do not start hacking yet. Emily. Cheater.

WTF happened last year?

Well, for one, a shocking amount of cool stuff was built. For two, we wished we had a bit more time. So this one’s extended. For three, we needed another roaming hacker* or two to help out teams in need. If you’re interested in helping, please let me know.

Who TF is invited?

KA employees, past KA interns (lookin’ at you, David), and the whole KA Lite team.

WTF. Tell me more.

Rules

  • You have to create something connected to Khan Academy.
  • You have to demo or show off what you’ve created at the end of the hackathon.
  • You have to act like a healthy hacker (sleep, eat good food).

Schedule

  • 4:00pm Friday the 19th: Start. Our office. We’ll kick off by letting idea owners pitch others in an attempt to get help.
  • 11:45pm Friday: Everyone kicked out, doors locked. Get a healthy night’s sleep or we’ll steal your computers.
  • 9:30am Saturday: Doors reopen.
  • 11:45pm Saturday: Kicked out, doors locked.
  • 9:30am Sunday: Doors reopen.
  • 4:00pm Sunday: Pencils down. Keyboards disconnected. Every team will demo — expect for the whole thing to end around 6.

Eats

Dinner Friday night, lunch and dinner Saturday, and lunch Sunday. All included. Plus snackies the whole time.

Teams

We highly encourage hackers to work in teams, but try to keep team size limited to 3 or 4.

Prizes

Judging rules will be shared when the hackathon starts. Prizes include at least one [redacted] and one [redacted]. Maybe two or three [redacted]s.

Do I have to attend? This is my weekend. FOR THE LOVE OF THE GODS LEAVE ME ALONE.

No problem. We believe the weekend is your time, and you should use it without hesitation. This is a completely optional healthy hackathon that just happens to instantly make all attendees ridiculously cool and good-looking. The choice is yours.

Ben, WTF, do you really only send emails all day?

Yes. But not during the hackathon. Not. during. the. hackathon.

-Ben

* Last year Alpert was our “roaming hacker.” He went from team to team helping anybody who needed helping. Without him things probably would’ve devolved into a Lord of the Flies type situation, and quick. We need more roaming hackers this year. Email me.

7/21/13 — 10:56pm Permalink
The App Engine Way

The phrase “the App Engine way” is muttered around our office and wielded in our code reviews. As in, “Yeah, I know it’s counterintuitive, but that’s the App Engine way.”

It boils down to two principles about structuring data:

  • Denormalize
  • Do lots of work when writing to make reads fast

These aren’t obvious. It won’t be obvious to folks from SQL land that App Engine’s performance is tied so tightly to these ideas. It won’t be obvious that, since JOINs don’t exist, it’s really easy to write a bunch of code that 1) loads a Monkey entity, 2) looks at Monkey.zoo_id, then 3) queries for the monkey's Zoo…and that that’s bad.

So we use “the App Engine way” to remind ourselves of these fuzzy guidelines and spread ‘em to newcomers.

Denormalize your data — as in your Notification models should have all the data necessary to render themselves. They shouldn’t just point to the thing that triggered a notification. As in whenever you’re sitting around wishing you could just JOIN Zoo ON Zoo.id = Monkey.zoo_id you should actually be storing all of Zoo's properties on each and every Monkey. Got billions of Monkeys? Who cares, copy the Zoo data a billion times. “Storage is cheap” and all that.
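Here's a rough sketch of the difference, using plain Python dicts as a stand-in for the datastore. This is not the App Engine API: `FakeDatastore`, `zoo_name`, and `zoo_city` are hypothetical names made up for illustration. The normalized render needs two dependent reads, while the denormalized one gets by with a single read.

```python
class FakeDatastore(object):
    """Dict-backed stand-in for a datastore that counts reads (hypothetical)."""

    def __init__(self):
        self.entities = {}
        self.reads = 0

    def put(self, key, entity):
        self.entities[key] = dict(entity)

    def get(self, key):
        self.reads += 1
        return self.entities[key]


def render_normalized(store, monkey_key):
    # Two dependent reads: the Zoo key is unknown until the Monkey has loaded.
    monkey = store.get(monkey_key)
    zoo = store.get("zoo:%s" % monkey["zoo_id"])
    return "%s lives at %s in %s" % (monkey["name"], zoo["name"], zoo["city"])


def render_denormalized(store, monkey_key):
    # One read: the Zoo's properties were copied onto the Monkey at write time.
    monkey = store.get(monkey_key)
    return "%s lives at %s in %s" % (
        monkey["name"], monkey["zoo_name"], monkey["zoo_city"])


store = FakeDatastore()
store.put("zoo:1", {"name": "Oakland Zoo", "city": "Oakland"})
store.put("monkey:bob", {"name": "Bob", "zoo_id": 1})
store.put("monkey:bob_denorm", {
    "name": "Bob", "zoo_id": 1,
    "zoo_name": "Oakland Zoo", "zoo_city": "Oakland"})

normalized_page = render_normalized(store, "monkey:bob")
reads_normalized = store.reads

store.reads = 0
denormalized_page = render_denormalized(store, "monkey:bob_denorm")
reads_denormalized = store.reads
```

Both renders produce the same page. The only difference is how many trips to the store it took to get there.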

Do lots of work on write to make reads fast — as in your Post models should have their reply_count and vote_count and flag_count and monkey_count stored as properties on the Post itself. As in they should be updated every time a new reply, vote, flag, or monkey is brought into this wonderful world. As in any time you’re sitting around googling for App Engine’s version of SUM(), COUNT(), and AVG(), realize you should just be updating these aggregates every time they’re modified. As in whenever you write to a Zoo entity, kick off a task to update all denormalized copies on your Monkeys. Who cares if writes are a bit slower? This makes reads blazing fast.
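A minimal sketch of the write-heavy idea, again with plain dicts standing in for the datastore. Every function name here is hypothetical, and the fan-out loop in `update_zoo` is the part you'd kick off as a background task on App Engine. It also ignores concurrent writes entirely, which a real version couldn't.

```python
posts = {}
replies = []
zoos = {}
monkeys = {}


def create_post(post_id, title):
    # Aggregates live on the Post itself, starting at zero.
    posts[post_id] = {"title": title, "reply_count": 0, "vote_count": 0}


def add_reply(post_id, text):
    replies.append({"post_id": post_id, "text": text})
    # Extra work on write: keep the aggregate current so reads never count.
    posts[post_id]["reply_count"] += 1


def add_vote(post_id):
    posts[post_id]["vote_count"] += 1


def render_post_summary(post_id):
    # The read touches one entity and does no counting at all.
    post = posts[post_id]
    return "%s (%d replies, %d votes)" % (
        post["title"], post["reply_count"], post["vote_count"])


def update_zoo(zoo_id, **props):
    zoos[zoo_id].update(props)
    # Fan the change out to every denormalized copy on the Monkeys.
    for monkey in monkeys.values():
        if monkey["zoo_id"] == zoo_id:
            monkey["zoo_name"] = zoos[zoo_id]["name"]


create_post("p1", "Monkeys are great")
add_reply("p1", "Agreed")
add_reply("p1", "Strongly agreed")
add_vote("p1")
summary = render_post_summary("p1")

zoos["z1"] = {"name": "Oakland Zoo"}
monkeys["bob"] = {"name": "Bob", "zoo_id": "z1", "zoo_name": "Oakland Zoo"}
monkeys["sue"] = {"name": "Sue", "zoo_id": "z2", "zoo_name": "Other Zoo"}
update_zoo("z1", name="Oakland Zoo & Aquarium")
```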


Use the mini profiler or Appstats to see a waterfall graph for your data-loading RPCs.

A simple rule of thumb for knowing when you’re done:

If you really care about performance, you should be able to create all of the data queries for an entire web request just from incoming GET/POST data and the currently logged-in user’s properties — no query-to-get-the-data-for-another-query allowed.
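One way to picture that rule, sketched with a dict as the store. The key formats and the `keys_for_request` and `get_multi` helpers are assumptions invented for this example, not App Engine calls. Every key is computed up front from the request and the user, so everything can be fetched in one batch.

```python
def keys_for_request(params, user):
    """Build every key from request data and the logged-in user alone."""
    return [
        "monkey:%s" % params["monkey_id"],
        "notifications:%s" % user["user_id"],
        "settings:%s" % user["user_id"],
    ]


def get_multi(store, keys):
    """One batched fetch; keys that don't exist come back as None."""
    return [store.get(key) for key in keys]


store = {
    "monkey:bob": {"name": "Bob", "zoo_name": "Oakland Zoo"},
    "notifications:u42": {"unread": 3},
}

keys = keys_for_request({"monkey_id": "bob"}, {"user_id": "u42"})
monkey, notifications, settings = get_multi(store, keys)
```

None of the keys depends on the result of another fetch, which is the whole point of the rule.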

This is the price of entry for access to one big, enormous, whopper of a benefit: you get a datastore with consistent read/write performance no matter how big your data set becomes. Have a query that returns 10 items from a set of 10,000? Relax knowing that after your company blows up in 6 months, you’ll get the exact same speed when pulling 10 items from a set of 10,000,000,000.

6/29/13 — 1:16pm Permalink
Getting your team to adopt new technology

Recently two Khan Academy devs dropped into our team chat and said they were gonna use React to write a new feature. They even hinted that we may want to adopt it product-wide.

"The library is only a week old. It’s a brand new way of thinking about things. We’re the first to use it outside of Facebook. Heck, even the React devs were surprised to hear we’re using this in production!!!"

Great. And so my stodgy old brain entered Phase 1.

Here’s a sneak peek for you, brave proposer of new tech, inside the heads of your teammates.

The 8 phases your teammates’ brains go through when you propose a scary tech switch:


  1. Someone in chat is talking about some new thing called “React.” It’s probably used for mapreduces or something. I’m hungry, gonna grab dinner.

  2. N/M LOL, the library they’re interested in is client-side magic that’s only a week old and would represent an entirely new way of writing our app. Plus it’s the weekend. Nothing wrong w/ a little imagination and wishful thinking.

  3. OK I think someone just actually pushed this crazy javascript library to master. Surely you’re joking, Mr. Alpert.

  4. WTF, these people are serious. Don’t we have bigger problems to solve? Do we really need to inherit all the issues that come w/ a whole brand new set of client-side coding patterns right now? Time to demand some well-thought-out answers.

  5. BUT but but. I get that you love the library and it’s cool and it actually does make a difference for us and our users. But what about our i18n code? But what about our server-side templates? But what about our entire existing codebase? But what about our onboarding process? But have you thought about other alternatives? But why now?

  6. Hmm. I see value here. Lots of devs are affected by this, though — do they know about it? Who’s owning this transition? What’s the plan for old code? Is there a rule of thumb for new code?

  7. I see the big picture and how the transition might work. Good luck, you’re now directly responsible.

  8. I sure am glad I thought of this whole React thing. I am so, so brilliant.

Now you know what you’re up against.


Your job is to create a wormhole that transports a developer’s brain from Phase 1 to Phase 8 and then shove everyone on your team through it. Here’s how.

  • Know that your proposal is scary, but not because it’s new tech. It’s scary because of uncertainty about all your old code and habits. So you can’t bring somebody to Phase 8 just by selling them on benefits of the new tech. You have to sell how well you’ve prepared a transition for your team.

  • Schedule a plan to help others understand, share your proposal, listen to concerns, and then communicate a decision.

  • For every word you spend talking about the tech, spend two talking about your transition plans. Trying out the tech on one live feature before you want to suggest it for everybody? Say that. Tell the team exactly when you’ll either remove the experimental tech or talk more about team-wide adoption.

  • When doubters pile on, tell them exactly when you’ve scheduled time to discuss more. You’re trying to be brilliant and experiment — not the time to be burdened by early doubters. Sidestep them. “You’re absolutely right, there are a million problems. I’m working on a blog post right now to help others understand React, and I already scheduled a tech talk next Friday for us to go over all concerns before any big decision is made.”

  • Stomp down every conceivable objection before others even get a chance to raise their hand. By the time you want to propose a team-wide change you’ll have been thinking about this longer than anyone. Show your team that. Mention every objection upfront. Mention the alternatives you’ve considered. Honestly acknowledge pain points. Don’t wait for others to do it for you.

You’ll be seen as the clear owner of this decision. Nobody will want to waste their time agonizing over objections because you’ve already done that for them and provided answers.

Exactly how it should be.

Imagine you drop into your team’s chat one day…


…and you write this:

There’s a new JS library out called React. It’s promising but a bit scary. I’m playing with it for my new feature to try to learn more, and I think it’s possible we’ll want to use it everywhere in the future. Before making a team-wide decision like that, I scheduled a tech talk on Friday for us to discuss. I’m writing up a blog post before then that’ll teach others about the library and why I chose it. Read it, then bring all your concerns on Friday. I’ll have a plan for transitioning old code if we want to move forward, and if we decide this isn’t right for us I’ll undo my experiment.

…well. That’s a wormhole capable of transporting even your stodgiest, oldest teammates’ brains straight to “Phase 8: I thought of this and I am so brilliant.”

That reminds me: look’it our smooth math content editor



All the math formatting in this very serious example makes it hard to preview the content as it is being edited.

Our renderer, post-React, is on the left. A typical math editor’s preview is on the right. Ben Alpert and Joel Burget can tell you more about how they’re using React to make this so buttery. It’s absolutely brilliant. I’m just glad I thought of it.



Thanks go to Ben Alpert for draft-reading and to Fleetwood for bandana-wearing.

6/24/13 — 12:11am Permalink
Public speaking is a chance to make myself uncomfortable

Giving public talks is one of those few activities (see: flying) that causes an enormous surge of nervous adrenaline to course through my body.*

But at the same time I love sharing Khan Academy’s work. So I’ve been trying to get better. Thankfully, we now have this great “grillmasters” event in our office where anybody on the team can practice any speech and get constructive feedback from everyone, including one of the best public speakers around.

Here’re the highlights of Sal’s speaking advice for me so far:

  • "Smile so much people think you’re drunk." (later revised to "Smile w/ your eyes," presumably after I creeped him out trying to drunkenly smile)

  • "Know that you’re not wasting anybody’s time. The audience wants to be there."

Heh. Both are big challenges for me, but I put these tips to the test recently while giving a talk about how Khan Academy uses data — with a focus on mistakes we’ve made along the way — and ended up having a blast. If you watch the video you’ll see that I fail miserably at the smiling thing. But I’m getting a bit better at not speeding through my speech at a million words a minute for fear of wasting somebody’s precious time.

Still a long way to go. But I relish the fact that public speaking gives me a chance to be uncomfortable. Feels healthy.


The video of my dog was the talk’s biggest hit.

In case you’re preparing for a talk and in the same boat, I’m super grateful for these resources: What happens to our brains when we have stage fright, Slide Design for Developers, and Scott Hanselman’s Speaking Hacks.

Better yet, if you have tips, I’m all ears.



* Scott Hanselman is an inspiration in many ways. In this case I’m thanking him for his openness about being a type I diabetic — like me. The pre-speech adrenaline I refer to above causes my blood sugar to spike wildly. I only drank water the morning before my last speech, yet my blood sugar shot from ~100 mg/dl to 380+ in the minutes before I went on stage. This can have frustrating effects on your brain and body. Scott’s example as a terrific speaker and openness about being diabetic (he’s checked his blood sugar on stage before) helps and inspires me to be more open.

6/2/13 — 3:34pm Permalink