Design Google Doc
Topics:
How to allow concurrent editing from multiple users?
How to detect and merge editing conflicts?
How to store the data?
How to guarantee the delivery of the editing changes from other users?
Functional Requirement
Users can create a Google Doc and share it.
Multiple users can edit the document.
User can revisit a document.
User can comment on document? (optional)
Non-functional Requirement
Editing changes from other users are rendered in real time.
Low latency
Highly available
Fault tolerant.
High Level Architecture
API
CRUD
Data Schema
Scale
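DAU: 1B users, each making 10 writes and 10 reads per day.
QPS: 1B * 10 / 10^5 s = 100k writes per second, and the same for reads (a day is roughly 10^5 seconds).
Storage: 2 * 365 * 1B * 100 KB = 7.3 * 10^13 KB ≈ 73 PB / year.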
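The estimate above can be checked with a few lines of arithmetic. This is only a sketch: the constant names are made up for illustration, the day is rounded to 10^5 seconds as in the QPS line, and decimal units are assumed (1 PB = 10^12 KB). Under those assumptions the storage formula evaluates to roughly 73 PB per year.

```python
# Back-of-envelope capacity estimate; every input is an assumption from the notes.
DAU = 1_000_000_000        # daily active users
WRITES_PER_USER = 10       # writes per user per day
SECONDS_PER_DAY = 10**5    # 86,400 rounded up to keep the math simple
ITEM_SIZE_KB = 100         # size of one stored item, in KB
FACTOR = 2                 # leading factor of 2 taken as-is from the storage formula

write_qps = DAU * WRITES_PER_USER // SECONDS_PER_DAY
storage_kb_per_year = FACTOR * 365 * DAU * ITEM_SIZE_KB
storage_pb_per_year = storage_kb_per_year / 10**12  # 1 PB = 10^12 KB (decimal)

print(write_qps)            # 100000
print(storage_pb_per_year)  # 73.0
```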
Final Diagram
What DB to use?
For the edit table, we need a DB that is optimized for write requests: Cassandra.
Edit table: doc_id + timestamp bucket as the partition key, message_id as the clustering key (together they form the primary key).
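A minimal sketch of how the doc_id + timestamp bucket part of the key could be computed. The one-hour bucket width and the function name `bucket_key` are assumptions for illustration, not something the notes specify. The point is that edits to one document are spread over many bounded groups instead of one ever-growing group.

```python
from datetime import datetime, timezone

BUCKET_SECONDS = 3600  # assumed bucket width: one hour


def bucket_key(doc_id: str, ts: datetime) -> tuple[str, int]:
    """Return (doc_id, bucket): edits in the same hour share the same key."""
    bucket = int(ts.timestamp()) // BUCKET_SECONDS
    return (doc_id, bucket)


t1 = datetime(2024, 1, 1, 10, 5, tzinfo=timezone.utc)
t2 = datetime(2024, 1, 1, 10, 55, tzinfo=timezone.utc)
t3 = datetime(2024, 1, 1, 11, 5, tzinfo=timezone.utc)

same_bucket = bucket_key("doc-1", t1) == bucket_key("doc-1", t2)  # same hour
next_bucket = bucket_key("doc-1", t3)[1] - bucket_key("doc-1", t1)[1]
```

A smaller bucket keeps each group smaller but means a reader has to query more buckets to replay a document's history.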
How to merge conflicts?
How do conflicts arise?
Operational Transformation (OT)
A technique that's widely used for conflict resolution in collaborative editing. It is lock-free and non-blocking: when concurrent operations from collaborators conflict, OT transforms them against each other and pushes the converged state to all users, so every user ends up seeing a consistent document.
Features
Causality preservation (order): If operation a happened before operation b, then operation a is executed before operation b.
Convergence: All document replicas at different clients will eventually be identical.
Cons:
Each character operation may require adjusting the positional indexes used by other operations, so operations are order-dependent on each other.
It's challenging to develop and implement from scratch.
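To make the positional-index adjustment concrete, here is a minimal sketch of transforming two concurrent inserts. It covers inserts only; a real OT system also has to handle deletes and longer operation histories. The `Insert` type, the `site` tie-breaker, and the function names are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int    # index in the document where the character is inserted
    char: str
    site: int   # hypothetical site id, used only to break ties


def transform(op: Insert, other: Insert) -> Insert:
    """Rewrite op so it can be applied after the concurrent insert `other`."""
    if other.pos < op.pos or (other.pos == op.pos and other.site < op.site):
        return Insert(op.pos + 1, op.char, op.site)  # shift right by one
    return op


def apply(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.char + doc[op.pos:]


base = "OT"
a = Insert(1, "A", site=1)  # site 1 inserts "A" between O and T
b = Insert(1, "B", site=2)  # site 2 concurrently inserts "B" at the same index

replica_1 = apply(apply(base, a), transform(b, a))  # applies a, then b
replica_2 = apply(apply(base, b), transform(a, b))  # applies b, then a
print(replica_1, replica_2)  # OABT OABT
```

Both replicas apply the operations in a different order yet converge to the same text, which is the convergence property listed above.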
Conflict-free Replicated Data Type (CRDT)
The conflict-free replicated data type (CRDT) was developed in an effort to improve on OT.
CRDTs are a family of data structures (sets, maps, ordered lists, counters) that can be concurrently edited by multiple users and that automatically resolve conflicts in sensible ways. CRDTs have been implemented in Riak 2.0.
There are two kinds of CRDT: operation-based and state-based.
For text editing, a CRDT:
Assigns a globally unique identity to each character.
Globally orders each character.
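Each character carries the following fields:
DocumentID: a UUID.
DocumentCounter: the sequence number of the operation in the doc, e.g. 87.
Value: the value of the character, e.g. "A".
PositionalIndex: a unique position, which can be a float, e.g. 4.5.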
For example, a user from site ID 123e4567-e89b-12d3 inserts a character with a value of A at a PositionalIndex of 1.5. Although a new character is added, the positional indexes of existing characters are preserved by using fractional indices, so the order dependency between operations is avoided: an insert() between O and T does not affect the position of T.
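A minimal sketch of the fractional-index idea, assuming each character is keyed by (PositionalIndex, site ID) and the document is rendered by sorting those keys. The helper names and the site ID "site-a" are hypothetical. `Fraction` is used instead of float so that repeated midpoints stay exact.

```python
from fractions import Fraction

# Maps (PositionalIndex, site_id) -> character value.
doc = {
    (Fraction(1), "site-a"): "O",
    (Fraction(2), "site-a"): "T",
}


def insert(doc, left, right, site_id, value):
    """Insert value at the midpoint of two existing positional indexes."""
    key = ((left + right) / 2, site_id)
    doc[key] = value
    return key


def render(doc):
    """Order characters by (PositionalIndex, site_id) and join them."""
    return "".join(doc[key] for key in sorted(doc))


new_key = insert(doc, Fraction(1), Fraction(2), "123e4567-e89b-12d3", "A")
print(new_key[0])   # 3/2
print(render(doc))  # OAT
```

The key for T is still (2, "site-a") after the insert, so no existing index had to be rewritten.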
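Cons (see https://core.ac.uk/download/pdf/189163265.pdf):
Interleaving anomalies: when two users concurrently insert text at the same position, their characters can end up interleaved in the merged document.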
Lock
Locking requires segmenting the document into small sections so that a user can lock one section and edit it. This gives developers a simple solution and avoids the complexity of OT and CRDTs.
Pros:
Easy to implement.
Cons:
Poor user experience: two users may want to add characters to the same section of the document even though their operations would not necessarily conflict.
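The section-lock approach can be sketched as a small lock table. This is an in-memory illustration only; the class name, section ids, and user names are hypothetical, and a real system would also need lock expiry and persistence.

```python
class SectionLocks:
    """Hypothetical lock table: section id -> user currently holding the lock."""

    def __init__(self):
        self._owner = {}

    def acquire(self, section, user):
        """Grant the lock if the section is free or already held by this user."""
        holder = self._owner.setdefault(section, user)
        return holder == user

    def release(self, section, user):
        """Release the lock only if this user actually holds it."""
        if self._owner.get(section) == user:
            del self._owner[section]


locks = SectionLocks()
alice_ok = locks.acquire("paragraph-3", "alice")   # True: section was free
bob_blocked = locks.acquire("paragraph-3", "bob")  # False: alice holds it
bob_other = locks.acquire("paragraph-4", "bob")    # True: different section
locks.release("paragraph-3", "alice")
bob_retry = locks.acquire("paragraph-3", "bob")    # True after the release
```

Bob is blocked from paragraph-3 even if his edit would not have touched the same characters as Alice's, which is the user-experience cost described above.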
Do we need a processing queue on the client or the server side?
Client
Pros:
Easy to implement
Cons:
No centralized QPS control: if a doc has millions of people editing it at the same time, the unthrottled write traffic could overload the backend.
Server side
Pros:
Centralized QPS throttle and control
Cons:
An additional component adds complexity to the system.
Possibly significant delay if changes pile up in the queue.
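A minimal sketch of a server-side queue with centralized throttling, assuming one bounded queue per document. The class name, the capacity limit, and the batch-draining method are illustrative choices, not part of the original design.

```python
from collections import deque


class EditQueue:
    """Hypothetical per-document queue that rejects edits once it is full."""

    def __init__(self, max_pending):
        self.max_pending = max_pending
        self.pending = deque()

    def submit(self, edit):
        """Accept the edit, or return False so the client can back off and retry."""
        if len(self.pending) >= self.max_pending:
            return False
        self.pending.append(edit)
        return True

    def drain(self, batch_size):
        """Remove and return up to batch_size edits in arrival order."""
        batch = []
        while self.pending and len(batch) < batch_size:
            batch.append(self.pending.popleft())
        return batch


queue = EditQueue(max_pending=2)
results = [queue.submit(e) for e in ("e1", "e2", "e3")]
batch = queue.drain(batch_size=10)
print(results)  # [True, True, False]
print(batch)    # ['e1', 'e2']
```

The capacity limit bounds how far the queue can fall behind, at the cost of pushing rejected edits back to clients.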