Premature Optimization

Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. – Donald Knuth

I think the above quote is one of the most abused quotes in real-life software development, mostly because the only part that ever gets quoted is “premature optimization is the root of all evil.”

What the quote is actually saying is that we should first profile our programs, and maybe run benchmarks against them, before we start optimising. Once we have that information, we should optimise only the places where the program spends the most time, and only to the extent that it doesn’t unnecessarily harm our ability to maintain the program.
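As a rough sketch of what that first measurement can look like in Node.js (the function and data below are made up for illustration, not from the original post):

```js
// Minimal benchmark sketch. `processOrders` and the sample data are
// hypothetical stand-ins for whatever code we suspect is slow.
const { performance } = require("node:perf_hooks");

function processOrders(orders) {
  // ... the code under suspicion ...
  return orders.filter((o) => o.total > 100).map((o) => o.id);
}

const orders = Array.from({ length: 100_000 }, (_, i) => ({
  id: i,
  total: Math.random() * 200,
}));

const start = performance.now();
processOrders(orders);
console.log(`processOrders took ${(performance.now() - start).toFixed(2)} ms`);
```

A one-off timing like this is not a substitute for a real profiler, but it is often enough to tell whether there is a problem worth talking about at all.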

Performance vs Product

I think a lot of people take this quote to mean, “don’t worry about optimizations now, let’s just develop the product, if it’s too slow we can optimise it later”. This sounds super “agile” and speaks to people who work on the product side of things. This way of thinking makes performance optimisations out to be something in the way of the real product that customers are after.

OK, so it’s a misinterpretation of the quote, but that doesn’t make it wrong, right? It’s true that at the end of the day, customers want fast software that doesn’t eat up all the memory on their computers. But a slow program that’s out there earning money is better than a fast one that’s still in development.

It could be that you tell the project manager (or whoever the decision maker is) that there is a performance issue with a new feature, and they decide that the value of the new feature is greater than the cost of improving performance. That would be a good outcome. But I don’t think that’s how things usually go in real life.

In real life, we want something done before the end of the sprint. We finish a task; if it’s too slow, we might say so and ask for some time to improve performance. The project manager, and maybe other teammates, see a finished task and push for it to be deployed. We decide to fix the performance issue later, and then never do.

This could be because of a particularly bullish project manager, but I think more often than not the fault lies with the developer. We need to get better at explaining the performance costs of a change. When we say we need time to improve “performance”, in the eyes of the decision maker “performance” is just something standing between the feature and the market.

We should:

  1. Measure performance. What part of the feature is hurting performance? Maybe we can drop that part of the change for now. More often than not, just running a profiler on the code will reveal an improvement so trivial that we don’t even have to talk about the problem.
  2. Tell others to measure performance. Often, we release code without thinking about the impact on performance, so no one knows how bad it is until the server crashes or the bill comes back from AWS.
  3. Make it clear what is affected by the change. How much slower will this endpoint get? How much longer will this page take to load if we add the feature? Will other existing features be affected by the change? Are we reducing the value of an existing feature by trying to add this new one? Could this cause memory leaks or timeouts? (A sketch of measuring this is shown after the list.)
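To make point 3 concrete, one way to get those numbers is to log how long each request takes, so an endpoint can be compared before and after a change. This is a sketch using Express-style middleware; the app, route, and port are assumptions for illustration, not part of the original post.

```js
// Hypothetical Express app: log request durations so we can see how much
// slower an endpoint gets once the new feature is in place.
const express = require("express");
const app = express();

app.use((req, res, next) => {
  const start = process.hrtime.bigint();
  res.on("finish", () => {
    const ms = Number(process.hrtime.bigint() - start) / 1e6;
    console.log(`${req.method} ${req.originalUrl} -> ${ms.toFixed(1)} ms`);
  });
  next();
});

app.get("/reports", (req, res) => {
  // ... the endpoint the new feature is about to make slower ...
  res.json({ ok: true });
});

app.listen(3000);
```

In practice you would feed these numbers into whatever metrics system you already have, but even a log line is enough to turn “it feels slower” into a figure a decision maker can weigh.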

If we look at all of that, then we have the information to decide whether or not optimisation would be “premature”. Until then, the person concerned about performance and the person warning about “premature” optimisation are both just wasting time talking about nothing.

Performance vs Maintainability

The other way I think the quote gets misinterpreted is by people saying “I optimise for readability” or “maintainability” rather than performance. This sounds like it’s a good thing, but once again it misses the point. It uses “maintainability” as an excuse for not even measuring performance. If your “readable” code means that the user is going to get a noticeable downgrade in performance, then you should at least be aware that you’re paying that price for maintainability.

Often, the worst performance problems come from things like pulling a lot of dependencies into your JavaScript bundle, making extra calls to the database, or writing inefficient joins to get to a specific column. In cases like this, the more performant code is almost always more readable.
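For instance, the fix for extra database calls is usually to replace an N+1 pattern with a single query, and the single query tends to be the easier one to read as well. A sketch, assuming a hypothetical `db.query(sql, params)` helper that returns rows (the table and column names are made up):

```js
// Slower and noisier: one query per user (N+1 calls to the database).
async function orderCountsSlow(db, userIds) {
  const counts = {};
  for (const id of userIds) {
    const rows = await db.query(
      "SELECT COUNT(*) AS n FROM orders WHERE user_id = ?",
      [id]
    );
    counts[id] = rows[0].n;
  }
  return counts;
}

// Faster and arguably clearer: one query for all users.
// Assumes the helper expands an array passed to an IN (?) placeholder.
async function orderCountsFast(db, userIds) {
  const rows = await db.query(
    "SELECT user_id, COUNT(*) AS n FROM orders WHERE user_id IN (?) GROUP BY user_id",
    [userIds]
  );
  return Object.fromEntries(rows.map((r) => [r.user_id, r.n]));
}
```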

Other times, you might run the profiler and find a solution that is equally readable but more performant. Using a native function rather than unnecessarily importing a library is one example. Another is using a single for loop instead of chained map, filter, or reduce calls in languages where each of those array functions runs its own loop over the data (PHP, JavaScript, etc.), as opposed to languages with features specifically designed to avoid this problem (streams in Haskell, iterators in Rust, etc.).
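A quick illustration of that last point in JavaScript: the chained version below walks the data once per array function and allocates intermediate arrays, while the plain loop does a single pass. Whether the difference matters is, as always, something to measure rather than assume.

```js
const orders = [
  { total: 120, shipped: true },
  { total: 80, shipped: true },
  { total: 300, shipped: false },
];

// Three passes over the data and two intermediate arrays.
const chained = orders
  .filter((o) => o.shipped)
  .map((o) => o.total)
  .reduce((sum, total) => sum + total, 0);

// One pass, no intermediate arrays.
let looped = 0;
for (const o of orders) {
  if (o.shipped) looped += o.total;
}

console.log(chained, looped); // 200 200
```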

Maintainability is worth sacrificing performance for. But we should still find out what the price of that maintainability is, and we should always measure performance with profilers and benchmarking tools, even if we decide to stick with the more maintainable solution.