Back of Envelope

Back-of-envelope calculation is a technique used in software engineering to guide how a system should be designed. The method is best known from big tech companies and is often expected in system design interviews. The idea is to calculate some rough numbers first so that they can drive the design decisions that follow.

This article lists numbers I found useful for systems design. They reflect the time I did the research, and the article is no longer being updated.

Note: All the numbers in this post are heavily rounded as their purpose is to give a rough guide for design decisions in the moment. You should always do more precise calculations before starting on a project/feature.

Useful Calculations

x million users * y KB = xy GB. Example: 1M users * one 100KB document each per day = 100GB per day.

x million users * y MB = xy TB. Example: 200M users * one 2MB short video each per day = 400TB per day.
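The two shortcuts above can be checked with a few lines of Python. This is only a sketch: the function name `daily_volume_bytes` is made up for illustration, and the decimal unit constants follow the 1000-based conversions used in this article.

```python
# Decimal (1000-based) units, matching the conversions used in this article.
KB = 1_000
MB = 1_000 * KB
GB = 1_000 * MB
TB = 1_000 * GB


def daily_volume_bytes(users: int, bytes_per_user: int) -> int:
    """Total bytes produced per day if every user contributes one object."""
    return users * bytes_per_user


# 1M users, one 100KB document each per day.
docs = daily_volume_bytes(1_000_000, 100 * KB)
# 200M users, one 2MB short video each per day.
videos = daily_volume_bytes(200_000_000, 2 * MB)

print(docs / GB)    # 100.0 (GB per day)
print(videos / TB)  # 400.0 (TB per day)
```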

Byte Number Sizes

Each unit adds three more zeros than the one before it.

  • Thousands = KB (3 zeros)

  • Millions = MB (6 zeros)

  • Billions = GB (9 zeros)

  • Trillions = TB (12 zeros)

  • Quadrillions = PB (15 zeros)

Byte Conversions

  • 1B = 8 bits

  • 1KB = 1000B

  • 1MB = 1000KB

  • 1GB = 1000MB
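When doing these sums in code, a small formatter saves counting zeros by hand. The helper below is a hypothetical sketch (`humanize` is not from any library) and assumes the 1000-based units listed above.

```python
UNITS = ["B", "KB", "MB", "GB", "TB", "PB"]


def humanize(num_bytes: float) -> str:
    """Render a byte count using the largest 1000-based unit that fits."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop once the value is below 1000, or when we run out of units.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000


print(humanize(8))                  # 8 B
print(humanize(240_000_000))        # 240 MB
print(humanize(1_500_000_000_000))  # 1.5 TB
```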

Object Sizes

Data

The numbers vary depending on the language and implementation.

  • char: 1B (8 bits)

  • char (Unicode): 2B (16 bits)

  • Short: 2B (16 bits)

  • Int: 4B (32 bits)

  • Long: 8B (64 bits)

  • UUID/GUID: 16B
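These sizes are mostly useful for estimating how big a stored record is. The sketch below assumes a made-up record layout (the field names are hypothetical) and ignores per-row overhead such as headers, padding, and indexes, which real databases add.

```python
# Field sizes in bytes, taken from the list above.
SIZES = {"short": 2, "int": 4, "long": 8, "uuid": 16}

# Hypothetical record layout, purely for illustration.
RECORD = {
    "id": "uuid",
    "created_at": "long",
    "view_count": "int",
    "status": "short",
}


def record_bytes(layout: dict) -> int:
    """Sum the raw field sizes of one record (no storage overhead)."""
    return sum(SIZES[field_type] for field_type in layout.values())


row = record_bytes(RECORD)
total = row * 100_000_000  # 100M records

print(row)          # 30 (bytes per record)
print(total / 1e9)  # 3.0 (GB for 100M records)
```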

Objects

  • File: 100 KB

  • Web Page: 100 KB (not including images)

  • Picture: 200 KB

  • Short Posted Video: 2MB

  • Streaming Video: 50MB per minute

  • Long/Lat: 8B

Lengths

  • Maximum URL Size: ~2000 characters (depends on browser)

  • ASCII charset: 128 characters

  • Unicode charset: 143,859 characters

Per Period Numbers

The following numbers are heavily rounded and help determine how often something needs to happen over a period of time. For example, if a server receives a million requests per day, it will need to handle roughly 12 requests per second (1,000,000 requests / 86,400 seconds in a day).

Heavily rounded per time period numbers.
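The per-second figure for one million daily events can be reproduced as follows. This is a minimal sketch that assumes load is spread evenly across the day; real traffic usually has peaks well above the average.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(events_per_day: float) -> float:
    """Average events per second, assuming an even spread over the day."""
    return events_per_day / SECONDS_PER_DAY


rate = per_second(1_000_000)
print(round(rate, 2))  # 11.57, which this article rounds to 12
```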

More complex example:

100M photos (200KB) are uploaded daily to a server.

  • 100 (number of millions) * 12 (the number per second for 1M) = 1200 uploads a second.

  • 1200 (uploads) * 200KB (size of photo) = 240MB per second.

The web servers will need to handle a network bandwidth of 240MB per second, so you will need machines with high network performance. In AWS this would translate to at least an m4.4xlarge, but it would be better to spread the load across multiple smaller servers for fault tolerance.
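The same arithmetic as a sketch in Python. The function name `upload_bandwidth` is made up for illustration, and the default of 12 per second per million daily events is the rounded figure from this article. Network links are usually quoted in bits per second, so the result is also converted to Gbps.

```python
def upload_bandwidth(uploads_per_day: int, object_bytes: int,
                     per_million_rate: float = 12):
    """Return (uploads per second, bytes per second) for a daily upload count.

    per_million_rate is the rounded "12 per second per 1M per day" figure.
    """
    uploads_per_second = uploads_per_day / 1_000_000 * per_million_rate
    return uploads_per_second, uploads_per_second * object_bytes


# 100M photos of 200KB uploaded daily.
uploads, bytes_per_second = upload_bandwidth(100_000_000, 200_000)

print(uploads)                     # 1200.0 uploads per second
print(bytes_per_second / 1e6)      # 240.0 MB per second
print(bytes_per_second * 8 / 1e9)  # about 1.92 Gbps
```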

Usage

Users

  • Facebook: 2.27B | YouTube: 2B | Instagram: 1B

  • Pinterest: 332M | Twitter: 330M | OneDrive: 250M

  • TikTok: 3.7M

Visits

You can get a rough number for the visits a site receives from the Similarweb website.

  • Facebook: 26.12B | Twitter: 6.34B | Pinterest: 1.32B

  • Spotify: 293M | Ikea: 233M | Nike: 110M

  • Argos: 54M | John Lewis: 37M | Superdry: 3.5M

  • Virgin Money: 1.8M | Aviva: 1.61M

Cost of Operations

  • Read sequentially from HDD: 30 MB/s

  • Read sequentially from SSD: 1 GB/s

  • Read sequentially from memory: 4 GB/s

  • Read sequentially from 1Gbps Ethernet: 100MB/s

  • Cross continental network: 6–7 world-wide round trips per second.
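A quick way to use the sequential read figures is to ask how long it takes to move a given amount of data. The sketch below hard-codes the throughput numbers from the list above; the dictionary keys and the function name are made up for illustration.

```python
# Sequential throughput in bytes per second, from the list above.
THROUGHPUT_BYTES_PER_S = {
    "hdd": 30 * 10**6,
    "ssd": 1 * 10**9,
    "memory": 4 * 10**9,
    "ethernet_1gbps": 100 * 10**6,
}


def read_seconds(num_bytes: int, medium: str) -> float:
    """Seconds needed to read num_bytes sequentially from the given medium."""
    return num_bytes / THROUGHPUT_BYTES_PER_S[medium]


one_gb = 10**9
for medium in THROUGHPUT_BYTES_PER_S:
    print(f"{medium}: {read_seconds(one_gb, medium):.2f} s")
```

With these numbers, reading 1GB takes about 33 seconds from HDD, 10 seconds over 1Gbps Ethernet, 1 second from SSD, and a quarter of a second from memory.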

Systems

These numbers are not exact; they depend heavily on the implementation and on what is hosting the service. Their purpose is to give a general idea of the performance of different types of services.

SQL Databases

  • Storage: 60TB

  • Connections: 30K

  • Requests: 25K per second

Cache

[Redis — Requests][Redis — connections]

  • Storage: 300 GB

  • Connections: 10k

  • Requests: 100k per second

Web Servers

  • Requests: 5–10k requests per second

Queues/Streams

[Pub/Sub — limits][Kinesis — limits][SQS — limits]

  • Requests: 1000–3000 requests/s

  • Throughput: 1MB-50MB/s (Write) / 2MB-100MB/s (Read)

Scrapers

[Colly — go scraper]

  • Requests: 1000 per second
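The per-system limits above can be turned into a rough node count. This is a sketch under stated assumptions: the function name is hypothetical, and the 50% headroom factor is my own assumption rather than a number from this article, chosen so that a node is never planned to run at its limit.

```python
import math


def nodes_needed(peak_load: float, capacity_per_node: float,
                 headroom: float = 0.5) -> int:
    """Nodes required so each one runs at no more than `headroom` of capacity.

    headroom=0.5 (an assumed value) means we only plan to use half of the
    quoted per-node capacity.
    """
    usable = capacity_per_node * headroom
    return math.ceil(peak_load / usable)


# 60,000 requests per second against the rough limits listed above.
print(nodes_needed(60_000, 100_000))  # cache at 100k requests/s -> 2
print(nodes_needed(60_000, 25_000))   # SQL at 25K requests/s -> 5
print(nodes_needed(60_000, 5_000))    # web server at 5k requests/s -> 24
```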
