Design Chat App (WhatsApp)
Topics:
Pros and cons of HTTP vs Websocket
How to scale Redis Pub/Sub?
How to guarantee message delivery?
What DB to use and why?
Functional Requirements
User can send message, receive message.
User can join group chat.
Message delivery acknowledgement: sent, delivered and read.
Push notification
User status: whether users are online or offline.
Optional:
Do we support sending images/files?
Do we support recalling a message?
Do we support group chat?
How to add friends?
Non-functional Requirements
Low latency
Highly available
Consistency: messages should be delivered in the order they were sent. Users must see the same chat history on all devices.
High Level Design
E2E
User A and user B establish a communication channel through the chat server.
User A sends a message to the chat server.
The chat server acknowledges the message back to user A.
The chat server sends the message to user B and stores the message in the DB.
User B sends an acknowledgement to the chat server.
The chat server notifies user A that the message has been successfully delivered.
When user B reads the message, the application notifies user A that B has read the message.
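The flow above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions: `ChatServer`, `Status`, `send`, and `ack` are hypothetical names, the dictionary stands in for the DB, and the network hop to user B is left out. The point it shows is that the status only moves forward (sent, then delivered, then read), so a late or duplicate acknowledgement does not roll the status back.

```python
from enum import IntEnum


class Status(IntEnum):
    SENT = 1       # chat server acknowledged the sender
    DELIVERED = 2  # recipient's device acknowledged receipt
    READ = 3       # recipient opened the message


class ChatServer:
    """In-memory stand-in for the chat server and its message store."""

    def __init__(self):
        self.store = {}          # message_id -> dict, stands in for the DB
        self.notifications = []  # (user, message_id, status name) pushed to clients

    def send(self, message_id, sender, recipient, text):
        # Steps 2-4: persist the message and ack the sender.
        # Forwarding to the recipient is omitted in this sketch.
        self.store[message_id] = {
            "sender": sender, "recipient": recipient,
            "text": text, "status": Status.SENT,
        }
        self.notifications.append((sender, message_id, Status.SENT.name))

    def ack(self, message_id, status):
        # Steps 5-7: the recipient acks; the status only moves forward,
        # so late or duplicate acks are ignored.
        msg = self.store[message_id]
        if status <= msg["status"]:
            return False
        msg["status"] = status
        self.notifications.append((msg["sender"], message_id, status.name))
        return True
```

Usage: call `send("m1", "A", "B", "hi")`, then `ack("m1", Status.DELIVERED)` and `ack("m1", Status.READ)`; user A receives one notification per status change.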
API
HTTP
Remove a conversation
v1/conversation?id=xxx DELETE
response: 204 status
View conversation history
v1/conversation?uid=xxx GET
response {
"conversations": [
{
ConversationId string,
RecipientName string,
RecipientProfilePic string,
LatestMessage string,
},
...
],
"page_size": 10,
"page_number": 1,
}
Get conversation Detail
v1/conversation/{conversation_id}/details?uid=xxx GET
response {
"messages": [
{"author": "UserA", "content": "text1"},
{"author": "UserB", "content": "text2"},
{"author": "UserA", "content": "text3"},
]
}
Get Friends List
v1/friends?uid=xxx GET
WebSocket
Create a conversation
/connect
join(room, channel="xxx")
request: uid, recipient_id
response: emit("conversation created", channel="xxx", to=uid)
Send message
/send
request: uid, room_id, message, type?, media_object?, document?
response: emit("message sent", channel="xxx", to=uid)
/join_group
/leave_group
get_all_group(uid)
Acknowledgement Handler on client
socketio.on("ack") {
}
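One way to fill in the acknowledgement handler, and to approach the delivery-guarantee question from the topic list, is to resend any message that has not been acknowledged within a timeout. The sketch below is an assumption, not the design the notes prescribe: `PendingAcks`, `retry_after`, and `max_attempts` are hypothetical names, and time is passed in explicitly so the logic can be followed without a real socket. Because resends can produce duplicates, the receiver would also need to deduplicate by `message_id`.

```python
class PendingAcks:
    """Tracks sent messages until they are acknowledged (at-least-once delivery)."""

    def __init__(self, retry_after=5.0, max_attempts=3):
        self.retry_after = retry_after
        self.max_attempts = max_attempts
        self._pending = {}  # message_id -> [payload, attempts, next_retry_at]

    def track(self, message_id, payload, now):
        # Call right after the first send attempt.
        self._pending[message_id] = [payload, 1, now + self.retry_after]

    def on_ack(self, message_id):
        # Called from the "ack" handler; returns False for unknown or duplicate acks.
        return self._pending.pop(message_id, None) is not None

    def due(self, now):
        """Return (messages to resend, message ids that exhausted their attempts)."""
        resend, failed = [], []
        for message_id, entry in list(self._pending.items()):
            payload, attempts, next_retry_at = entry
            if now < next_retry_at:
                continue
            if attempts >= self.max_attempts:
                failed.append(message_id)
                del self._pending[message_id]
            else:
                entry[1] = attempts + 1
                entry[2] = now + self.retry_after
                resend.append((message_id, payload))
        return resend, failed
```

A message that exhausts its attempts could fall back to the push notification path listed in the functional requirements.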
Data Schema
Message Table
Message {
channel_id bigint,
bucket int, // time-based partition; (channel_id, bucket) is the partition key
message_id, // snowflake id
author_id,
content text,
status: SENT, DELIVERED, READ, RECALLED
created_at
is_deleted
PRIMARY_KEY((channel_id, bucket), message_id)
}
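The schema relies on two ideas: a snowflake `message_id` that sorts by creation time, and a time `bucket` derived from it so that one channel's messages spread across bounded partitions. A minimal sketch follows. The epoch, the 10-day bucket width, and the 41/10/12 bit layout are assumptions chosen for illustration, and all function names are hypothetical.

```python
EPOCH_MS = 1_420_070_400_000           # assumed custom epoch: 2015-01-01T00:00:00Z
BUCKET_MS = 10 * 24 * 60 * 60 * 1000   # assumed bucket width: 10 days


def make_snowflake(timestamp_ms, worker_id, sequence):
    """Pack time (high bits), a 10-bit worker id, and a 12-bit sequence into one int."""
    return ((timestamp_ms - EPOCH_MS) << 22) | ((worker_id & 0x3FF) << 12) | (sequence & 0xFFF)


def snowflake_timestamp_ms(message_id):
    """Recover the creation time (Unix ms) embedded in a snowflake id."""
    return (message_id >> 22) + EPOCH_MS


def bucket_for(message_id):
    """Time bucket of a message, counted in BUCKET_MS windows since EPOCH_MS."""
    return (message_id >> 22) // BUCKET_MS


def partition_key(channel_id, message_id):
    """(channel_id, bucket) pair matching PRIMARY_KEY((channel_id, bucket), message_id)."""
    return (channel_id, bucket_for(message_id))
```

Because the timestamp occupies the high bits, sorting by `message_id` within a partition also sorts by creation time, which is why `message_id` can serve as the clustering column.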
User Table
User {
uid
handle
profile_pic
bio
}
Conversation Table
Conversation {
id: uuid
members: []string
message_ids: []string
owner
}
UserConversation {
uid // partition key
conversation_id // sort key
}
ConversationUser {
conversation_id // partition key
uid // sort key
}
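The two mapping tables answer opposite questions: `UserConversation` lists a user's conversations, and `ConversationUser` lists a conversation's members. A small in-memory sketch of keeping both in sync is shown here; `Membership` and its method names are hypothetical, and a real store would write both rows in one transaction or batch.

```python
from collections import defaultdict


class Membership:
    """Keeps the uid -> conversations and conversation -> uids views consistent."""

    def __init__(self):
        self.user_conversations = defaultdict(set)  # uid -> {conversation_id}
        self.conversation_users = defaultdict(set)  # conversation_id -> {uid}

    def join(self, uid, conversation_id):
        # Write both directions so either lookup is a single-partition read.
        self.user_conversations[uid].add(conversation_id)
        self.conversation_users[conversation_id].add(uid)

    def leave(self, uid, conversation_id):
        self.user_conversations[uid].discard(conversation_id)
        self.conversation_users[conversation_id].discard(uid)

    def conversations_of(self, uid):
        return sorted(self.user_conversations.get(uid, ()))

    def members_of(self, conversation_id):
        return sorted(self.conversation_users.get(conversation_id, ()))
```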
Scale
2B users, 100B messages per day.
QPS: 100*10^9 / 10^5 = 100*10^4 = 1M QPS.
Storage:
100 bytes for a message
100 bytes * 100B messages = 10^13 bytes = 10^10 KB = 10^7 MB = 10 TB per day
We keep messages for 30 days = 300 TB per month.
Bandwidth:
10 TB / 10^5 = 10*1000*1000 MB / 10^5 = 100 MB / second
Number of servers:
Assume a single server can hold 10M connections (optimistic: WhatsApp's publicly reported figures are in the low millions per server)
2B / 10M = 200 servers
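The back-of-the-envelope numbers above can be checked with a few lines of arithmetic. The inputs are the ones stated in this section; rounding a day to 10^5 seconds and using decimal units (1 TB = 10^12 bytes) are simplifying assumptions, and the variable names are only illustrative.

```python
SECONDS_PER_DAY = 10**5             # rounded up from 86,400 to keep the math simple
MESSAGES_PER_DAY = 100 * 10**9      # 100B messages per day
MESSAGE_SIZE_BYTES = 100
USERS = 2 * 10**9
CONNECTIONS_PER_SERVER = 10 * 10**6  # assumed capacity of one chat server

qps = MESSAGES_PER_DAY // SECONDS_PER_DAY
bytes_per_day = MESSAGES_PER_DAY * MESSAGE_SIZE_BYTES
storage_per_day_tb = bytes_per_day // 10**12
storage_30_days_tb = storage_per_day_tb * 30
bandwidth_mb_per_s = bytes_per_day // SECONDS_PER_DAY // 10**6
servers = USERS // CONNECTIONS_PER_SERVER

print(f"QPS: {qps:,}")
print(f"Storage: {storage_per_day_tb} TB/day, {storage_30_days_tb} TB per 30 days")
print(f"Bandwidth: {bandwidth_mb_per_s} MB/s")
print(f"Chat servers: {servers}")
```

The server count assumes every user is connected at the same time, so it is an upper bound for the connection tier.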
How to scale Redis Pub/Sub?
Modern Redis server capability:
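A server with 100 GB of memory and a gigabit network can handle about 100,000 subscriber pushes per second.
Max 10k client connections (the default maxclients).
Throughput: 1M QPS / 10^5 = 10 Redis servers.
Memory: 2B users -> 20B channels * 20 bytes = 400*10^9 bytes = 400 GB.
At 100 GB per server, memory alone needs 4 Redis servers, so the 10 servers required for throughput are the binding constraint.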
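Once more than one Redis server is involved, publishers and subscribers must agree on which server owns a channel. The sketch below sizes the fleet from the figures above and maps a channel to a shard with a stable hash. Modulo hashing with `zlib.crc32` is an assumption made for brevity, not something the notes specify, and the function names are hypothetical; consistent hashing would remap fewer channels when the shard count changes.

```python
import math
import zlib


def shards_needed(qps, pushes_per_shard, total_bytes, bytes_per_shard):
    """Take the larger of the throughput-bound and memory-bound shard counts."""
    by_throughput = math.ceil(qps / pushes_per_shard)
    by_memory = math.ceil(total_bytes / bytes_per_shard)
    return max(by_throughput, by_memory)


def shard_for(channel, num_shards):
    """Map a channel name to a shard index using a hash that is stable across processes."""
    return zlib.crc32(channel.encode("utf-8")) % num_shards


num_shards = shards_needed(
    qps=10**6,
    pushes_per_shard=10**5,
    total_bytes=20 * 10**9 * 20,   # 20B channels * 20 bytes
    bytes_per_shard=100 * 10**9,   # 100 GB per server
)
```

Python's built-in `hash()` is avoided here because string hashes are randomized per process, which would send publishers and subscribers to different shards.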
How to maintain message ordering?
the message might be sent
To adjust for incorrect device clocks, one approach is to log three timestamps:
1. The time at which the event occurred, according to the device clock.
2. The time at which the event was sent to the server, according to the device clock.
3. The time at which the event was received by the server, according to the server clock.
offset = timestamp 3 - timestamp 2 (assuming network delay is negligible)
real time = timestamp 1 + offset
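The three-timestamp correction is a one-line calculation. The sketch below uses hypothetical parameter names and assumes all timestamps share one unit (for example Unix seconds) and that network delay is small compared with the clock error.

```python
def corrected_event_time(event_device_ts, sent_device_ts, received_server_ts):
    """Estimate when an event really happened, in server-clock time.

    offset is how far the device clock lags (positive) or leads (negative)
    the server clock, measured at the moment the event was sent.
    """
    offset = received_server_ts - sent_device_ts
    return event_device_ts + offset


# Example: a device clock running 60 seconds slow.
# The event occurs at device time 1000 and is sent at device time 1010;
# the server receives it at server time 1070.
estimated = corrected_event_time(1000, 1010, 1070)
```

In the example, the offset is 60, so the event is placed at server time 1060 rather than 1000.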
Decentralized approach to Pub/Sub message delivery (at-most-once), presence (join room, leave room), and caching.