Design Chat App (WhatsApp)

Topics:

  1. Pros and cons of HTTP vs WebSocket

  2. How to scale Redis Pub/Sub?

  3. How to guarantee message delivery?

  4. What DB to use and why?

Functional Requirements

  1. Users can send and receive messages.

  2. Users can join group chats.

  3. Message delivery acknowledgement: sent, delivered and read.

  4. Push notifications.

  5. User status: whether users are online or offline.

Optional:

  1. Do we support sending images/files?

  2. Do we support recalling a message?

  3. Do we support group chat?

  4. How to add friends?

Non-functional Requirements

  1. Low latency

  2. Highly available

  3. Consistency: messages should be delivered in the order they were sent. Users must see the same chat history on all devices.

High Level Design

Drawing

E2E

  1. User A and user B establish a communication channel between their clients.

  2. User A sends a message to the chat server.

  3. The chat server sends an acknowledgement back to user A.

  4. The chat server sends the message to user B and stores the message in the DB.

  5. User B sends an acknowledgement to the chat server.

  6. The chat server notifies user A that the message has been successfully delivered.

  7. When user B reads the message, the application notifies user A that B has read the message.
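The three acknowledgement states in this flow (sent, delivered, read) can be modelled as a status that only moves forward. The following is a minimal sketch, not a prescribed implementation: the names `Status` and `advance` are hypothetical, and it assumes acknowledgements may arrive late or more than once.

```python
from enum import IntEnum


class Status(IntEnum):
    """Delivery states from the flow above, in the order they occur."""
    SENT = 1       # step 3: the server acknowledged the message to the sender
    DELIVERED = 2  # step 5: the recipient's client acknowledged the message
    READ = 3       # step 7: the recipient read the message


def advance(current: Status, incoming: Status) -> Status:
    """Return the new status, ignoring stale or duplicate acknowledgements.

    The status never moves backwards, so a late "sent" ack cannot
    overwrite "delivered" or "read".
    """
    return max(current, incoming)


status = Status.SENT
status = advance(status, Status.DELIVERED)
status = advance(status, Status.SENT)  # a late duplicate ack is ignored
print(status.name)  # DELIVERED
```

Keeping the status monotonic means the sender's client can apply acknowledgements in whatever order they arrive.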

API

HTTP

Remove a conversation

View conversation history

Get conversation Detail

Get Friends List

WebSocket

Create a conversation

Send message

Acknowledgement Handler on client

Data Schema

Message Table

User Table

Conversation Table
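The notes name the three tables but not their columns. As an illustration only, here is one possible shape, written as Python dataclasses. Every field name below is an assumption and not part of the original design.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class User:
    user_id: int  # hypothetical primary key
    name: str


@dataclass
class Conversation:
    conversation_id: int  # hypothetical primary key
    member_ids: List[int] = field(default_factory=list)  # 2 members for 1:1, more for a group


@dataclass
class Message:
    message_id: int       # assumed to increase within a conversation
    conversation_id: int  # which conversation the message belongs to
    sender_id: int
    body: str
    status: str = "sent"  # "sent", "delivered" or "read"


# Usage: rebuild a conversation's history in message_id order.
messages = [
    Message(2, 10, 7, "how are you?"),
    Message(1, 10, 7, "hi"),
]
history = sorted(messages, key=lambda m: m.message_id)
print([m.body for m in history])  # ['hi', 'how are you?']
```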

Scale

2B users, 100B messages per day.

QPS: 100*10^9 messages / 10^5 seconds per day (86,400 rounded up) = 10^6 = 1M QPS.

Storage:

100 bytes per message

100 bytes * 100*10^9 messages = 10^13 bytes = 10^10 KB = 10^7 MB = 10 TB per day

We keep messages for 30 days: 10 TB * 30 = 300 TB.

Bandwidth:

10 TB / 10^5 s = 10*1000*1000 MB / 10^5 s = 100 MB / second

Number of servers:

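Assume a single server can hold about 10M concurrent connections. This is optimistic: WhatsApp has publicly reported roughly 2M connections per server.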

2B / 10M = 200 servers
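The estimates in this section can be reproduced with a few lines of arithmetic. This sketch uses the inputs stated in these notes and rounds a day to 10^5 seconds. The constant names are made up for illustration.

```python
SECONDS_PER_DAY = 10**5              # 86,400 rounded up to keep the math simple
USERS = 2 * 10**9                    # 2B users
MESSAGES_PER_DAY = 100 * 10**9       # 100B messages per day
BYTES_PER_MESSAGE = 100
RETENTION_DAYS = 30
CONNECTIONS_PER_SERVER = 10 * 10**6  # 10M connections per server (assumed above)

bytes_per_day = MESSAGES_PER_DAY * BYTES_PER_MESSAGE

qps = MESSAGES_PER_DAY // SECONDS_PER_DAY                      # 1,000,000
storage_per_day_tb = bytes_per_day // 10**12                   # 10
storage_retained_tb = storage_per_day_tb * RETENTION_DAYS      # 300
bandwidth_mb_per_s = bytes_per_day // 10**6 // SECONDS_PER_DAY  # 100
servers = USERS // CONNECTIONS_PER_SERVER                      # 200

print(qps, storage_per_day_tb, storage_retained_tb, bandwidth_mb_per_s, servers)
```

The server count assumes every user is connected at the same time, so it is an upper bound for the connection layer under that per-server figure.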

Drawing

How to scale Redis Pub/Sub?

Modern Redis server capability:

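A server with 100 GB of memory and a gigabit network can handle about 100,000 subscriber pushes per second.

A maximum of 10k client connections (the default Redis limit).

Throughput: 1M QPS / 10^5 pushes per second per server = 10 Redis servers.

Memory: 2B users -> 20B channels (about 10 channels per user) * 20 bytes per channel = 400*10^9 bytes = 400 GB

For memory alone, we need 4 Redis servers with 100 GB each.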
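The two sizing constraints above, and one simple way to spread channels across servers, can be sketched as follows. Two things here are assumptions rather than part of the notes: taking the larger of the two server counts, and the hypothetical `shard_for` helper, which uses plain hash-modulo sharding. No Redis client calls are involved.

```python
import hashlib
import math

PUSHES_PER_SERVER = 10**5   # subscriber pushes per second per Redis server
MEMORY_PER_SERVER_GB = 100
PEAK_QPS = 10**6
CHANNELS = 20 * 10**9
BYTES_PER_CHANNEL = 20

servers_for_throughput = math.ceil(PEAK_QPS / PUSHES_PER_SERVER)  # 10
memory_gb = CHANNELS * BYTES_PER_CHANNEL // 10**9                 # 400
servers_for_memory = math.ceil(memory_gb / MEMORY_PER_SERVER_GB)  # 4

# Assumption: the cluster must satisfy both constraints, so take the larger.
redis_servers = max(servers_for_throughput, servers_for_memory)   # 10


def shard_for(channel: str, shard_count: int) -> int:
    """Map a channel name to a server index with a stable hash.

    hashlib is used instead of the built-in hash() because hash() is
    randomized per process, and every chat server must pick the same shard.
    """
    digest = hashlib.sha256(channel.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


print(redis_servers)  # 10
print(shard_for("user:42", redis_servers) == shard_for("user:42", redis_servers))  # True
```

Hash-modulo sharding moves most channels whenever the server count changes. Consistent hashing is a common alternative if servers are added or removed often.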

How to maintain message ordering?

the message might be sent

To adjust incorrect device clocks, one approach is to log three timestamps:

  1. The time at which the event occurred, according to the device clock.

  2. The time at which the event was sent to the server, according to the device clock.

  3. The time at which the event was received by the server, according to the server clock.

offset = timestamp 3 - timestamp 2

real time = timestamp 1 + offset
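The offset correction can be written as a small function. This is a minimal sketch: the function and parameter names are hypothetical, timestamps are taken to be in milliseconds, and network delay between sending and receiving is assumed to be negligible.

```python
def corrected_event_time(event_device_ms: int,
                         sent_device_ms: int,
                         received_server_ms: int) -> int:
    """Estimate when an event really happened, in server time.

    event_device_ms:    timestamp 1, when the event occurred (device clock)
    sent_device_ms:     timestamp 2, when it was sent (device clock)
    received_server_ms: timestamp 3, when it was received (server clock)
    """
    # How far the device clock is from the server clock (ignores network delay).
    offset_ms = received_server_ms - sent_device_ms
    return event_device_ms + offset_ms


# A device whose clock runs 300,000 ms (5 minutes) behind the server:
# the event occurs at 1,000 and is sent at 3,000 on the device clock,
# and the server receives it at 303,000 on its own clock.
print(corrected_event_time(1_000, 3_000, 303_000))  # 301000
```

Sorting messages by this corrected time, rather than by the raw device timestamp, reduces the effect of devices with badly set clocks.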

Decentralized approach regarding pub/sub message delivery (at-most-once), presence (join room, leave room) and caching.
