When designing a complex system, it would be great to build it in several different ways, compare how each variant performs under live load, and simply pick the best one. Of course, that's not possible.

The widely used alternative is called back-of-the-envelope estimation. The idea is to estimate various metrics of the system using thought experiments and known costs of common operations (like disk writes, disk reads, etc.), and then design the system based on those numbers.

Let's do an example: estimate QPS and storage requirements for a Twitter-like service, assuming:

  • 300 million active users monthly
  • 50% of users use the service every day
  • every active user sends 2 tweets a day
  • 10% of tweets contain media
  • each tweet has:
    • tweet_id — 64 bytes
    • text — 140 bytes
    • media — 1 MB
  • we want to store data for 5 years
  1. DAU (Daily Active Users): 300 million * 50% = 150 million
  2. QPS (Queries Per Second): 150 million * 2 tweets / 24 h / 3600 s ~= 3500 tweets/s
  3. Peak QPS: 2 * QPS = 7000 tweets/s
  4. Storage (for media only): 150 million * 2 tweets * 10% * 1 MB = 30 TB/day
  5. For 5 years: 5 years * 365 days/year * 30 TB/day ~= 55 PB
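
To double-check the arithmetic, here is a minimal Python sketch of the same calculation; the constants and variable names are my own and simply restate the assumptions above:

```python
# Back-of-the-envelope estimation for the Twitter-like service above.
# All constants restate the assumptions from the example.

MONTHLY_ACTIVE_USERS = 300_000_000
DAILY_ACTIVE_RATIO = 0.5           # 50% of users are active every day
TWEETS_PER_USER_PER_DAY = 2
MEDIA_RATIO = 0.10                 # 10% of tweets contain media
MEDIA_SIZE_MB = 1
RETENTION_YEARS = 5
SECONDS_PER_DAY = 24 * 3600

dau = MONTHLY_ACTIVE_USERS * DAILY_ACTIVE_RATIO         # 150 million
qps = dau * TWEETS_PER_USER_PER_DAY / SECONDS_PER_DAY   # ~3500 tweets/s
peak_qps = 2 * qps                                      # ~7000 tweets/s

media_mb_per_day = dau * TWEETS_PER_USER_PER_DAY * MEDIA_RATIO * MEDIA_SIZE_MB
media_tb_per_day = media_mb_per_day / 1_000_000               # 30 TB/day
storage_pb = media_tb_per_day * 365 * RETENTION_YEARS / 1_000 # ~55 PB

print(f"DAU: {dau / 1e6:.0f} million")
print(f"QPS: {qps:.0f} tweets/s, peak QPS: {peak_qps:.0f} tweets/s")
print(f"Media: {media_tb_per_day:.0f} TB/day, ~{storage_pb:.0f} PB over 5 years")
```

Running it reproduces the rounded numbers above: 3472 tweets/s rounds to ~3500, and 54.75 PB rounds to ~55 PB.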

This gives us a rough volume of the data that needs to be handled. There are a few important hints for using this technique in System Design Interviews:

  • Write those numbers down; you'll come back to them multiple times
  • Write down your assumptions
  • Write down the units in your calculations so you don't confuse yourself
  • Don't calculate exact numbers; approximations are enough
  • Commonly asked estimations: QPS, Peak QPS, storage, cache, number of servers.

There is a very helpful resource for understanding the trade-offs we make when designing a system: a page showing the average latency of the various operations a system performs, and how those numbers have changed over time.

For example:

  • accessing L1 cache takes ~1 ns
  • mutex lock/unlock takes ~17 ns
  • accessing main memory takes ~100 ns
  • compressing 1 KB takes ~2,000 ns
  • sequential read of 1 MB from memory takes ~3,000 ns
  • SSD random read takes ~16,000 ns
  • SSD sequential read of 1 MB takes ~49,000 ns
  • packet round trip in the same data center takes ~500,000 ns (0.5 ms)
  • sequential read of 1 MB from disk takes ~825,000 ns
  • packet round trip from CA to the Netherlands takes ~150,000,000 ns (150 ms)
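
These numbers are most useful when plugged into the same kind of arithmetic as before. As a sketch, here is a small Python snippet (with the per-MB sequential-read latencies hard-coded from the list above) comparing how long a 1 GB sequential read takes from memory, SSD, and spinning disk:

```python
# Compare sequential 1 GB reads using the per-MB latencies listed above.
# Values are the approximate numbers from the list, in nanoseconds per MB.

NS_PER_MB_SEQUENTIAL = {
    "memory": 3_000,
    "SSD": 49_000,
    "disk": 825_000,
}

MB_PER_GB = 1_000  # decimal units are fine for a rough estimate

for medium, ns_per_mb in NS_PER_MB_SEQUENTIAL.items():
    total_ms = ns_per_mb * MB_PER_GB / 1_000_000  # ns -> ms
    print(f"1 GB sequential read from {medium}: ~{total_ms:,.0f} ms")

# Prints roughly: memory ~3 ms, SSD ~49 ms, disk ~825 ms.
```

That gap, roughly 275x between memory and disk, is exactly the kind of trade-off that justifies adding a cache layer in front of slower storage.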