A couple of constants to get started.
Napkin math runs on a tiny set of constants. The one that matters most is 86,400 seconds in a day, and you should round it to 100,000 when doing this in your head. Dividing by 100k means dropping five zeros, which you can do mid-sentence in an interview. The rest of these tables are worth a single flashcard session. They let you sanity check any answer in seconds.
Time
| Quantity | Value |
|---|---|
| seconds in a day | 86,400 ≈ 10⁵ |
| seconds in a month | ≈ 2.6M |
| seconds in a year | ≈ 31.5M |
| 1M req/day | ≈ 12 req/s |
| 1B req/day | ≈ 12k req/s |
That last pair is the workhorse. Any "per day" number becomes "per second" by dropping five zeros. The shortcut gives 10 where the exact answer is about 12, which is close enough for a whiteboard.
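To see how much the five-zeros shortcut costs, here is a minimal sketch that compares it with the exact division. The function names are made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def per_second_exact(per_day: float) -> float:
    """Exact conversion: divide by the real number of seconds in a day."""
    return per_day / SECONDS_PER_DAY


def per_second_napkin(per_day: float) -> float:
    """Mental shortcut: treat a day as 100,000 seconds (drop five zeros)."""
    return per_day / 100_000


for per_day in (1_000_000, 1_000_000_000):
    exact = per_second_exact(per_day)
    rough = per_second_napkin(per_day)
    print(f"{per_day:>13,} req/day: exact {exact:,.0f}/s, napkin {rough:,.0f}/s")
```

The shortcut always lands about 14% low, because 86,400 is 86.4% of 100,000. That is well inside the error bars of every other assumption on the napkin.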
Sizes of things
| Object | Typical size |
|---|---|
| a UUID | 16 B |
| a tweet of text | ~300 B |
| a typical DB row | ~1 KB |
| a JSON API response | 1 to 10 KB |
| a compressed photo | ~200 KB |
| 1 min of 1080p video | ~50 MB |
Powers of ten, not 1024. KB, MB, GB, TB, PB. Each step is ×1,000 and nobody at a whiteboard cares about the 2.4%.
Latency
| Operation | Latency |
|---|---|
| main memory read | 100 ns |
| SSD random read | 100 µs |
| same-DC round trip | 0.5 ms |
| read 1 MB from SSD | 1 ms |
| HDD seek | 10 ms |
| cross-ocean round trip | 150 ms |
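The rows run fastest to slowest and span about six orders of magnitude, from 100 ns to 150 ms. The steps are uneven: memory to SSD is roughly 1,000×, while the SSD, network, and disk rows all sit within 100× of each other. Memory is fast, disks are slow, oceans are slower.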
Throughput, per node
| Component | Throughput |
|---|---|
| app server, simple JSON | ~1k req/s |
| Postgres, indexed reads | ~5k req/s |
| Postgres, writes | ~2k req/s |
| Redis, gets | ~100k op/s |
| Kafka, messages in | ~1M msg/s |
These swing 10× either way with hardware and query shape. They are starting points you say out loud, then adjust.
DAU to requests per second.
DAU by itself tells you nothing about load. A user who opens the app once and a user who scrolls for an hour are both one DAU. So the first real question is how many requests one user fires in a day. Count taps, page loads, and the API calls behind them. A light utility might be 5. A feed app is 20 to 50. A chat app can be hundreds. Multiply, divide by 86,400, and you have the average. Then remember that nobody's traffic sits at the average. Daily cycles peak at 2 to 3× the mean, and launch spikes or TV moments can hit 10×. Capacity is planned for the peak, so that multiplier is the number that costs money.
One more habit worth stealing: say the daily number and the per-second number together. "20 million requests a day, so about 230 a second, call it 700 at peak." Interviewers relax the moment they hear a candidate move between those units without a calculator.
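The multiply, divide, and peak steps can be sketched in a few lines. The 1 million DAU and 20 requests per user are assumed inputs chosen to reproduce the "20 million requests a day" example. The function names are illustrative.

```python
SECONDS_PER_DAY = 86_400


def average_rps(dau: int, requests_per_user: float) -> float:
    """Average requests per second over a full day."""
    return dau * requests_per_user / SECONDS_PER_DAY


def peak_rps(avg_rps: float, peak_factor: float = 3.0) -> float:
    """Peak load: 2 to 3x for a daily cycle, up to 10x for launch spikes."""
    return avg_rps * peak_factor


# Assumed inputs: 1M DAU at 20 requests each gives 20M requests a day.
avg = average_rps(dau=1_000_000, requests_per_user=20)
peak = peak_rps(avg, peak_factor=3.0)
print(f"average ~{avg:.0f} req/s, peak ~{peak:.0f} req/s")  # about 231 and 694
```

Rounded the way you would say it out loud, that is "about 230 a second, call it 700 at peak."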
Requests to app servers.
There is no universal answer for what one server handles, and anyone who quotes one without context is bluffing. A box returning cached JSON can do 5,000 req/s without breathing hard. The same box resizing images falls over at 50. The honest move is to pick a number, state it as an assumption, and load test later. Around 1,000 req/s per server is a fair default for simple CRUD work on modern hardware. Two corrections keep the estimate from being a lie. You run servers at about 70% so a traffic bump doesn't tip them over, and you add a spare so one box can die or take a deploy while the rest carry the load.
Notice how forgiving this math is. Being wrong by 2× on per-server capacity changes the fleet from 10 boxes to 20. Both answers lead to the same architecture: a load balancer and a stateless pool you can grow. The estimate exists to catch order-of-magnitude surprises, like discovering you need 4,000 servers when you budgeted for 12.
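A small sketch of the fleet estimate, using this page's defaults of 1,000 req/s per server, 70% target utilization, and one spare. The function name and the 6,900 req/s peak are assumptions picked for illustration.

```python
import math


def app_servers(
    peak_rps: float,
    per_server_rps: float = 1_000,  # assumed capacity; state it, then load test
    utilization: float = 0.7,       # headroom so a traffic bump doesn't tip a box over
    spares: int = 1,                # one box can die or take a deploy
) -> int:
    """Servers needed to carry the peak at the target utilization, plus spares."""
    carrying = math.ceil(peak_rps / (per_server_rps * utilization))
    return carrying + spares


print(app_servers(694))                        # 1 carrying + 1 spare = 2
print(app_servers(6_900))                      # 10 carrying + 1 spare = 11
print(app_servers(6_900, per_server_rps=500))  # 20 carrying + 1 spare = 21
```

Halving the assumed per-server capacity doubles the carrying fleet from 10 to 20. The architecture stays the same: a load balancer in front of a stateless pool.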
Split reads from writes.
Total traffic hides the structure that actually decides your architecture. Most products read far more than they write. A feed might be 95% reads, a URL shortener 99%, a chat app closer to 50/50. The split matters because reads and writes scale differently, and because reads can be absorbed by a cache before they ever touch the database. A cache hit costs a memory lookup. A miss costs a database query. So the database only sees the misses plus all the writes. For cache memory, the classic napkin rule is the 80/20: 20% of your objects serve 80% of reads, so size the cache for about 20% of a day's read volume.
Figure: every request at peak, by where it lands. With the hit rate at 0, the database inherits everything.
The cache memory number surprises people in both directions. Text objects are nearly free: caching a day of hot reads for a million-user app fits in a few GB, which is one Redis instance. Media flips it. At 200 KB per object the same math wants hundreds of GB, which is why images live on a CDN instead of in your cache.
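Here is a hedged sketch of the split and the cache-sizing rule. The inputs are assumptions for illustration: 700 req/s at peak, 95% reads, a 95% hit rate, and 19 million reads a day, which is 95% of 20 million requests. Function names are made up.

```python
def database_load(peak_rps: float, read_fraction: float, hit_rate: float):
    """Return (db_reads, db_writes) per second: cache misses plus all writes."""
    reads = peak_rps * read_fraction
    writes = peak_rps - reads
    db_reads = reads * (1 - hit_rate)  # only misses reach the database
    return db_reads, writes


def cache_bytes(daily_reads: float, object_bytes: int, hot_fraction: float = 0.2) -> float:
    """80/20 rule of thumb: cache about 20% of a day's read volume."""
    return daily_reads * hot_fraction * object_bytes


db_reads, db_writes = database_load(peak_rps=700, read_fraction=0.95, hit_rate=0.95)
print(f"database sees ~{db_reads:.0f} reads/s and ~{db_writes:.0f} writes/s")

daily_reads = 19_000_000  # assumed: 95% of 20M requests a day
print(f"text at 1 KB:    {cache_bytes(daily_reads, 1_000) / 1e9:.1f} GB")
print(f"media at 200 KB: {cache_bytes(daily_reads, 200_000) / 1e9:.0f} GB")
```

With these inputs the text cache comes to about 3.8 GB and the media cache to about 760 GB. Setting `hit_rate=0` sends all 665 reads per second straight to the database.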
How many database nodes.
Here is where most estimates go soft, because people treat "the database" as one mysterious box. Give it numbers instead. A tuned Postgres on solid hardware does around 5,000 indexed reads per second and around 2,000 writes before latency starts climbing. Compare those against the traffic that survived the cache. Reads over budget get replicas, since every replica adds another full copy that can serve reads. Writes over budget get shards, since splitting the data is the only way to split the write load. Shard last. Replicas are an afternoon of work. Sharding follows you for the life of the product, because cross-shard queries and transactions stop being free.
When the traffic that survives the cache fits inside those budgets, one primary handles everything. Add a standby anyway, because hardware dies.
Run the math both ways before quoting it. With a 95% cache hit rate a single Postgres carries a surprisingly large product, which is why "one big database plus Redis" runs most of the internet you use daily. Drop the hit rate to zero and the same traffic suddenly demands a replica fleet. That swing is the entire argument for caching, expressed in nodes.
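The node counts can be sketched as follows, using the page's budgets of 5,000 reads and 2,000 writes per node. As a simplification, `read_nodes` counts every node serving reads, primary included, and ignores the standby. The 66,500 reads/s example is an assumed workload: 70,000 req/s at peak with 95% reads.

```python
import math


def read_nodes(db_reads_per_s: float, per_node: int = 5_000) -> int:
    """Nodes needed to serve reads (primary plus replicas); never fewer than one."""
    return max(1, math.ceil(db_reads_per_s / per_node))


def write_shards(db_writes_per_s: float, per_node: int = 2_000) -> int:
    """Shards needed to absorb writes; one means no sharding at all."""
    return max(1, math.ceil(db_writes_per_s / per_node))


# Assumed workload: 66,500 reads/s and 3,500 writes/s at peak.
print(read_nodes(66_500 * 0.05))  # 95% hit rate leaves 3,325 reads/s: 1 node
print(read_nodes(66_500))         # no cache: 14 nodes
print(write_shards(3_500))        # over the 2,000 budget: 2 shards
```

The cache changes the read side from 14 nodes to 1, while the write side is untouched. That is why write-heavy products reach sharding long before read-heavy ones.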
The storage bill, and the pipe.
Storage compounds, which makes it the sneakiest line on the napkin. Traffic resets every second. Bytes written today are still on disk in five years. The math is just the write path again: writes per day, times object size, times a replication factor of 3, because production data lives on three machines so losing one is an incident instead of a tragedy. Bandwidth is the same multiplication applied to the read path, and it tells you when you have outgrown a single region's network or need a CDN doing the heavy lifting.
Figure: cumulative storage by year, with replication. The bars only ever go up.
Two practical notes. Text-heavy products discover their storage bill is a rounding error, so stop optimizing it and go back to traffic. Media products discover the opposite, and the answer is tiering: hot data on SSDs, everything older in object storage like S3 at a tenth of the price, and a CDN in front so egress comes off your origin entirely.
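Both multiplications fit in a short sketch. The inputs are assumptions for illustration: 1 million writes a day, 665 reads per second at peak, and sizes taken from the tables above. Units are powers of ten and the function names are made up.

```python
def storage_bytes(writes_per_day: int, object_bytes: int, days: int, replication: int = 3) -> int:
    """Bytes on disk after `days`, counting every replica."""
    return writes_per_day * object_bytes * replication * days


def egress_bits_per_s(peak_reads_per_s: int, response_bytes: int) -> int:
    """Outbound bandwidth at peak, in bits per second."""
    return peak_reads_per_s * response_bytes * 8


one_year_text = storage_bytes(1_000_000, 1_000, days=365)
one_year_media = storage_bytes(1_000_000, 200_000, days=365)
print(f"text rows:  {one_year_text / 1e12:.1f} TB per year")   # about 1.1 TB
print(f"photos:     {one_year_media / 1e12:.0f} TB per year")  # 219 TB

print(f"JSON egress:  {egress_bits_per_s(665, 10_000) / 1e6:.1f} Mbps")
print(f"photo egress: {egress_bits_per_s(665, 200_000) / 1e9:.2f} Gbps")
```

With identical traffic, the text workload stays near 1 TB a year and about 53 Mbps, while the photo workload reaches hundreds of TB and crosses 1 Gbps of egress.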
The whole napkin, end to end.
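Everything above is one chain, and this is the chain written out the way you would say it across a table: DAU to requests per second, peak to app servers, the read/write split through the cache, surviving traffic to database nodes, and writes to storage and bandwidth. Different products bend the same math, so change any input and run the whole chain again.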
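As one possible way to hold the whole chain in one place, here is a compact sketch that strings the steps together. The `Product` fields, the constants, and the example numbers are assumptions drawn from this page's defaults. Treat the output as a starting estimate to adjust.

```python
import math
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400
SERVER_RPS, UTILIZATION = 1_000, 0.7       # app server defaults from this page
DB_READ_RPS, DB_WRITE_RPS = 5_000, 2_000   # per-node Postgres budgets
HOT_FRACTION, REPLICATION = 0.2, 3         # 80/20 cache rule, 3 copies on disk


@dataclass
class Product:
    dau: int
    requests_per_user: int
    read_fraction: float
    hit_rate: float
    object_bytes: int
    peak_factor: float = 3.0


def napkin(p: Product) -> dict:
    """Walk the chain: traffic, servers, cache, database, storage, bandwidth."""
    daily = p.dau * p.requests_per_user
    peak = daily / SECONDS_PER_DAY * p.peak_factor
    peak_reads = peak * p.read_fraction
    peak_writes = peak - peak_reads
    db_reads = peak_reads * (1 - p.hit_rate)
    daily_reads = daily * p.read_fraction
    daily_writes = daily - daily_reads
    return {
        "avg_rps": round(daily / SECONDS_PER_DAY),
        "peak_rps": round(peak),
        "app_servers": math.ceil(peak / (SERVER_RPS * UTILIZATION)) + 1,
        "db_read_nodes": max(1, math.ceil(db_reads / DB_READ_RPS)),
        "db_write_shards": max(1, math.ceil(peak_writes / DB_WRITE_RPS)),
        "cache_gb": daily_reads * HOT_FRACTION * p.object_bytes / 1e9,
        "storage_tb_per_year": daily_writes * p.object_bytes * REPLICATION * 365 / 1e12,
        "egress_mbps": peak_reads * p.object_bytes * 8 / 1e6,
    }


# Assumed example: a 1M-DAU feed app with 1 KB text objects.
feed = Product(dau=1_000_000, requests_per_user=20,
               read_fraction=0.95, hit_rate=0.95, object_bytes=1_000)
for name, value in napkin(feed).items():
    print(f"{name:>20}: {value:,.2f}" if isinstance(value, float) else f"{name:>20}: {value:,}")
```

For this example the chain yields roughly 231 req/s average, 694 at peak, 2 app servers, a single database node, about 3.8 GB of cache, and about 1.1 TB of storage per year.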
The cheat sheet.
Every formula from this page in one table. In an interview, narrate the chain in this order, round aggressively, and state your assumptions before each step. The goal is never precision. The goal is showing you can move from a product description to a resource plan without hand waving.
| Question | Formula | Rule of thumb |
|---|---|---|
| Average traffic | DAU × req/user ÷ 86,400 | 1M/day ≈ 12/s. Drop five zeros. |
| Peak traffic | avg × peak factor | 2 to 3× daily cycle, 10× for spiky launches. |
| App servers | peak ÷ (capacity × 0.7) + 1 | ~1k req/s per box for simple CRUD. State it, then load test. |
| Database reads | reads × (1 − hit rate) | The DB only sees cache misses plus writes. |
| Cache memory | daily reads × 20% × object size | The 80/20 rule. Text is cheap, media is not. |
| Replicas | db reads ÷ 5k per node | Replicas scale reads. An afternoon of work. |
| Shards | writes ÷ 2k per node | Shards scale writes. A lifetime commitment. Do it last. |
| Storage | writes/day × size × 3 × days | Replication is 3×. Storage compounds, traffic resets. |
| Bandwidth | peak reads × response size | Past ~1 Gbps of egress, a CDN stops being optional. |
And the meta-rule that makes all of it work: round aggressively and keep moving. 86,400 is 100k. 694 req/s is 700. Nobody has ever lost an offer for saying "about 12 servers" instead of 11.6. People lose offers by going quiet when the big number lands. Now it never has to land on you. If you want the architecture that this math grows into, the System Evolution walkthrough picks up exactly where this page stops.