This post is a bit of an experimental thought exercise, but I hope you will find it interesting. You might have run into the term “back-of-the-envelope calculation” in the context of system design, or at work when quickly checking whether an idea makes sense; it also goes by “back of a napkin.” In this post I would like to explore a few applications of this technique. Let’s look at some made-up examples, starting with a broken tool and ending with a mathematical formula for success.

Case 1: Broken Tool

Problem: Let’s say a validation tool at your big enterprise company keeps running into timeouts. It’s annoying, but a retry makes the problem go away, so engineers rarely bother reporting it. You’ve had enough of it and complained to the owners by filing bug reports, but there is no traction whatsoever, and things continue like that for a while.

Solution: Spend a mere 10–20 minutes on a back-of-the-envelope calculation to estimate the cost of the status quo. No need for precise data – just something plausible enough to get the point across. Say the tool slows down X engineers on a daily basis, each spending Y hours waiting for the job to complete. Assume a reasonable coefficient k (0 < k < 1) converting async waiting time into blocked (sync) time, and you get E = k·Y·X of engineering time lost per day. k, Y, and X can come from data if you have it right away, or from reasonable assumptions (org size, anecdotal reports). If E comes out at 8 hours, the cost of things continuing is one full-time engineer. That is probably worth fixing, and it gives you a good argument for quickly justifying the ROI.
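As a quick sketch of the arithmetic (every number here is a made-up assumption for illustration, not real data):

```python
# Back-of-the-envelope cost of the flaky tool.
engineers_affected = 40    # X: engineers hitting the timeout daily (assumed)
wait_hours_per_eng = 0.5   # Y: hours each spends waiting per day (assumed)
sync_coefficient = 0.4     # k: fraction of waiting that actually blocks work (assumed)

lost_hours_per_day = sync_coefficient * wait_hours_per_eng * engineers_affected
print(f"~{lost_hours_per_day:.1f} engineer-hours lost per day")
# 8 lost hours per day is roughly one full-time engineer.
```

Even if every input is off by a factor of two, the conclusion (“this costs us a meaningful fraction of an engineer”) usually survives.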

Case 2: Validating compatibility of statements

Let’s look at a more complicated example, borrowed from the book “The Art of Doing Science and Engineering” (a great read, btw). The author takes two statements and checks whether they are compatible, based on the premise that the amount of knowledge is roughly proportional (with some constant k) to the number of scientists:

  • Statement 1: knowledge doubles every 17 years;
  • Statement 2: 90% of the scientists who ever lived are now alive.

Now, the author starts with a formula for the number of scientists at any point in time t: y(t) = a·e^(b·t). Since knowledge is proportional (k) to the number of scientists, we can integrate to accumulate knowledge up until now (time T): k·∫_{−∞}^{T} a·e^(b·t) dt = (k·a/b)·e^(b·T), and up until 17 years ago: (k·a/b)·e^(b·(T−17)). If knowledge doubles every 17 years, we’ve got e^(17·b) = 2, and this can be solved to calculate b = ln(2)/17 ≈ 0.0408. The author then assumes that the typical working lifetime of a scientist is 55 years and calculates, for the second statement, the fraction of all scientists ever who are alive now: 1 − e^(−55·b) ≈ 0.89, so roughly the 90% suggested in statement 2 above. There we go – the two statements are fully compatible! Wow.
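Hamming’s arithmetic is easy to double-check in a few lines of Python:

```python
import math

DOUBLING_YEARS = 17   # statement 1: knowledge doubles every 17 years
CAREER_YEARS = 55     # assumed working lifetime of a scientist

# Exponential growth a*e^(b*t); doubling every 17 years fixes b:
b = math.log(2) / DOUBLING_YEARS
# Fraction of all scientists who ever lived that are alive now:
alive_fraction = 1 - math.exp(-b * CAREER_YEARS)
print(f"b ≈ {b:.4f}, alive fraction ≈ {alive_fraction:.1%}")
# → b ≈ 0.0408, alive fraction ≈ 89.4%
```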

But hold on! By now this might all sound very convincing, and it did seem to make sense in the post-WWII world, but current data no longer strongly supports the two statements. UNESCO counted about 8.8M researchers in 2018, but projections point to this plateauing: most of the growth in the 2000s and 2010s came from China, which is no longer growing at that rate, while the EU and US have already plateaued (there are other findings on this). The original knowledge-doubling statement is also questionable, though doubling of information may still be happening fairly frequently if we count material generated by AI.

My mini-conclusion for this example: you can get a reasonable confirmation of whether what someone is saying makes sense in general, but it may not hold true over a long period of time.

Case 3: System Design

Let’s say you are sketching an idea for a startup where the latency of data processing is critical. Without going into too many details, you can get a sense of how much time each stage of processing takes, and whether your solution is worth exploring further or you need to find a different one.

Some typical numbers:

  • Transatlantic network roundtrip: ~80–120 milliseconds
  • Reading 1MB from network (LAN): ~0.5–3 milliseconds
  • Reading 1MB from disk:
    • HDD: ~5–15 milliseconds
    • NVMe SSD: ~0.1–1 milliseconds
  • Reading 1MB from memory (DRAM): ~150–300 microseconds
  • Processor thread synchronization (mutex): ~20–300 nanoseconds
  • L1 cache access: ~0.5 nanoseconds
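With these numbers you can sum up a latency budget for a hypothetical request path in a few lines. The pipeline and the midpoint values below are my assumptions for illustration, not measurements:

```python
# Rough latency budget for one hypothetical request, in milliseconds.
# Each value is a midpoint picked from the ranges listed above.
stages_ms = {
    "transatlantic roundtrip": 100.0,
    "read 1MB over LAN": 1.5,
    "read 1MB from NVMe SSD": 0.5,
    "read 1MB from DRAM": 0.2,   # ~200 microseconds
}

total_ms = sum(stages_ms.values())
print(f"total ≈ {total_ms:.1f} ms")
# The network hop dominates; optimizing the SSD read first would be wasted effort.
```

The point of the exercise is exactly that last comment: a 5-minute estimate tells you which stage deserves your attention.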

Case 4: Formula for success

Switching gears. Let me run a silly thought experiment: you want to model what it takes to succeed as a mathematical formula. Maybe you want to increase the chances of your kids succeeding and being happy. You can come up with parameters you believe influence success, and then think about possible relationships between them that would determine success at time t based on what happened at earlier times t′:

  • B(t0): Birth conditions (location, socioeconomic status at birth).
  • E(t′): Quality and quantity of education at age t′, where t′<t.
  • R(t′): Availability and utilization of resources (money, time, social capital) at age t′.
  • H(t′): Individual’s health (physical & mental) at age t′.
  • N(t′): Personal and professional social network strength at age t′.
  • D(t′): Individual’s level of determination and perseverance at age t′.
  • L(t′): Luck or random positive/negative events at age t′.

Then, you might arrive at a generalized mathematical formula for success at age t:

S(t) = α·B(t₀) + ∫_{t₀}^{t} [β·E(t′) + γ·R(t′) + δ·H(t′) + ϵ·N(t′) + ζ·D(t′) + η·L(t′)] · f(t − t′) dt′

where:

  • α,β,γ,δ,ϵ,ζ,η are weighting factors, representing the relative impact of each component (these could be determined empirically or statistically).
  • f(t − t′) = e^(−λ·(t−t′)) is a discounting or decay function indicating that recent events often influence current success more than distant past events (λ is a decay parameter).

You might conclude that an individual’s level of determination has the greatest impact, and decide to focus on developing that trait in your children.
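For fun, here is one possible discrete sketch of such a formula in Python. The weights, decay rate, and scores are entirely made up, in keeping with the thought-experiment spirit:

```python
import math

# Made-up weights for the components E, R, H, N, D, L (β..η in the text).
WEIGHTS = {"E": 0.2, "R": 0.15, "H": 0.15, "N": 0.1, "D": 0.3, "L": 0.1}
ALPHA = 0.5   # weight of birth conditions B(t0), assumed
LAM = 0.05    # decay parameter: recent years count more
B0 = 0.6      # birth-conditions score in [0, 1], assumed

def success(t, history):
    """history: list of (t_prime, {component: score}) pairs with t_prime < t."""
    s = ALPHA * B0
    for t_prime, scores in history:
        decay = math.exp(-LAM * (t - t_prime))  # e^(-λ(t - t'))
        s += decay * sum(w * scores.get(name, 0.0) for name, w in WEIGHTS.items())
    return s

# A single burst of determination at age 20, evaluated at age 30:
print(round(success(30, [(20, {"D": 1.0})]), 3))  # → 0.482
```

Playing with the weights like this also exposes how arbitrary they are, which is itself a useful output of the exercise.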

Conclusion

You don’t always need precise data to move forward, even on something that seems overly complex or abstract. Quick estimates let you sanity-check assumptions, challenge vague claims, pressure-test ideas, and decide whether something is even worth exploring; they help you move fast and avoid getting lost in BS. Even rough mathematical formulas can help you structure your thinking about complex matters before you dive all the way in. Anyway, just some random thoughts from my side; curious what you think.