Software Engineer Interview Handbook

Design Google Doc


Last updated 1 year ago

Topics:

  1. How to allow concurrent editing from multiple users?

  2. How to detect and merge editing conflicts?

  3. How to store the data?

  4. How to guarantee the delivery of the editing changes from other users?

Functional Requirement

  1. Users can create a Google Doc and share it.

  2. Multiple users can edit the same document concurrently.

  3. Users can revisit a document.

  4. Users can comment on a document (optional).

Non-functional Requirement

  1. Edits from other users are rendered in real time.

  2. Low latency

  3. Highly available

  4. Fault tolerant.

High Level Architecture

API

CRUD

1. Create Document
v1/document POST
json payload {
  uid,
  id,
  folder_id,
  access,
}

Response: 201 Created

2. View Document
v1/document?id=xxx GET
Response {
  id,
  content,
  users,
  users_cursor_pos: {
    'user1': {
      'col': xx,
      'row': xx,
    }
  },
  user_access,
}

3. Share Document
v1/document/#id/share GET
Response {
  "url": ""
}

4. Edit Document
v1/document?id=xxx PUT
Request {
  uid,
  starting_pos_row,
  starting_pos_col,
  text,
}

Response: 200 OK

5. Delete Document
v1/document?id=xxx DELETE

Websocket API
/connect
Request data: uid, document_id

/edit
Request data: uid, document_id, pos_row, pos_col, text

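Websocket client handlers:
socket.on("/edit", (data) => {
  // Changes broadcast by the server for edits made by other users.
  const changes = data.changes;
  // Apply each change to the local copy of the document and re-render.
});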
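
As a rough illustration of what applying one /edit message could involve, here is a minimal Python sketch. It assumes the document is held as a list of lines and that pos_row and pos_col are zero-based; the function name apply_edit is hypothetical and not part of the API above.

```python
def apply_edit(lines: list[str], pos_row: int, pos_col: int, text: str) -> list[str]:
    """Insert `text` at (pos_row, pos_col) and return the updated lines."""
    if not 0 <= pos_row < len(lines):
        raise IndexError("pos_row is outside the document")
    line = lines[pos_row]
    if not 0 <= pos_col <= len(line):
        raise IndexError("pos_col is outside the line")
    merged = line[:pos_col] + text + line[pos_col:]
    # Inserted text may contain newlines, which split the row into several rows.
    return lines[:pos_row] + merged.split("\n") + lines[pos_row + 1:]


doc = ["hello world"]
doc = apply_edit(doc, pos_row=0, pos_col=5, text=",")
print(doc)
```

This only covers inserts applied in arrival order. Concurrent edits to the same position still need one of the conflict-resolution strategies discussed later on this page.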

Data Schema

User Table
{
  uid,
  password,
  email,
  profile_pic,
}

Document Table
{
  id,
  owner,
  user_access: string[],
  created_at,
  is_deleted,
}

Edit Table
{
  id,
  uid,
  pos_row,
  pos_col,
  text,
  created_at,
  screenshot,
}

Scale

DAU: 1B users, each making 10 writes and 10 reads per day.

QPS: 1B * 10 / 10^5 seconds per day ≈ 100k for writes, and the same for reads.

Storage: 2 * 365 * 1B * 100 KB ≈ 7.3 * 10^13 KB ≈ 73 PB / year.
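
The estimate above can be re-run as a few lines of Python. This is only a sketch. It treats "kb" as kilobytes, uses decimal units (1 PB = 10^12 KB), and keeps the factor of 2 exactly as written in the storage line without assuming what it stands for. With those assumptions the storage product comes to roughly 73 PB per year.

```python
SECONDS_PER_DAY = 86_400  # the estimate above rounds this to 10^5

dau = 1_000_000_000
writes_per_user_per_day = 10
reads_per_user_per_day = 10

write_qps = dau * writes_per_user_per_day / SECONDS_PER_DAY
read_qps = dau * reads_per_user_per_day / SECONDS_PER_DAY

# Storage: 2 * 365 * 1B * 100 KB, converted with 1 PB = 10^12 KB.
kb_per_year = 2 * 365 * dau * 100
pb_per_year = kb_per_year / 10**12

print(f"write QPS ~ {write_qps:,.0f}")
print(f"read QPS  ~ {read_qps:,.0f}")
print(f"storage   ~ {pb_per_year:.0f} PB/year")
```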

Final Diagram

What DB to use?

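For the edit table, we need a DB that is optimized for write requests: Cassandra.

Edit table: use (doc_id, timestamp bucket) as the partition key and message_id as the clustering key, so that all edits for a document within one time bucket are stored together and sorted.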
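
One way to read "doc_id + timestamp bucket" is a key that groups a document's edits by a fixed time window. The sketch below assumes daily buckets in UTC. The bucket size and the function name partition_key are illustrative choices, not something the design above specifies.

```python
from datetime import datetime, timezone


def partition_key(doc_id: str, created_at: datetime) -> tuple[str, str]:
    """Return (doc_id, day bucket) for an edit created at `created_at`."""
    # Normalise to UTC so every server derives the same bucket for the same instant.
    bucket = created_at.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return (doc_id, bucket)


key = partition_key("doc-1", datetime(2024, 1, 2, 23, 59, tzinfo=timezone.utc))
print(key)
```

Smaller buckets keep partitions for very active documents from growing without bound, at the cost of reading more partitions when replaying a long edit history.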

How to merge conflicts?

How we get the conflict?

  1. Operational Transformation (OT)

A technique that's widely used for conflict resolution in collaborative editing. It's a lock-free and non-blocking approach for conflict resolution. If operations between collaborators conflict, OT resolves conflicts and pushes the correct converged state to end users. As a result, OT provides consistency for users.

Features

  1. Causality preservation (order): If operation a happened before operation b, then operation a is executed before operation b.

  2. Convergence: All document replicas at different clients will eventually be identical.

Cons:

  • Every character-level operation may shift the positional index of later characters, so operations are order-dependent: each one must be transformed against the concurrent operations applied before it.

  • It's challenging to develop and implement from scratch.
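
To make the positional-index problem concrete, here is a minimal sketch of an OT transform for two concurrent inserts. It is a toy under stated assumptions: only insert operations, a plain string as the document, and a numeric site id used as a tie-breaker. The names Insert, transform, and apply are hypothetical, and production OT systems also handle deletes and server-side ordering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int    # index in the document where text is inserted
    text: str
    site: int   # tie-breaker when two sites insert at the same position


def transform(op: Insert, against: Insert) -> Insert:
    """Rewrite `op` so it can be applied after `against` has already been applied."""
    shifts = against.pos < op.pos or (against.pos == op.pos and against.site < op.site)
    if shifts:
        return Insert(op.pos + len(against.text), op.text, op.site)
    return op


def apply(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]


doc = "abc"
a = Insert(pos=1, text="X", site=1)  # made at site 1
b = Insert(pos=2, text="Y", site=2)  # made concurrently at site 2

site1 = apply(apply(doc, a), transform(b, a))  # site 1 applies a, then transformed b
site2 = apply(apply(doc, b), transform(a, b))  # site 2 applies b, then transformed a
print(site1, site2)
```

Both sites end with the same string even though they applied the operations in different orders, which is the convergence property listed above.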

  2. Conflict-free Replicated Data Type (CRDT)

  • Operation-based

  • State-based

The conflict-free replicated data type (CRDT) was developed in an effort to improve on OT.

CRDTs are a family of data structures (sets, maps, ordered lists, counters) that multiple users can edit concurrently and that automatically resolve conflicts in sensible ways. CRDTs have been implemented in Riak 2.0.

For text editing, a CRDT does two things:

  1. It assigns a globally unique identity to each character.

  2. It globally orders each character.

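| Data            | Explanation                          | Example |
|-----------------|--------------------------------------|---------|
| DocumentID      | UUID                                 |         |
| DocumentCounter | The sequence of operations in a doc  | 87      |
| Value           | Value of character                   | "A"     |
| PositionalIndex | Unique position, can be a float      | 4.5     |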

For example, suppose a user from site ID 123e4567-e89b-12d3 inserts a character with a value of A at a PositionalIndex of 1.5, between O and T. Although a new character is added, the positional indexes of existing characters are preserved because the new index is fractional. This removes the order dependency between operations: the insert() between O and T does not affect the position of T.
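
The fractional-index idea can be sketched in a few lines of Python. This is a simplified illustration, not a full CRDT. It assumes O sits at index 1.0 and T at index 2.0, orders characters by (positional_index, site_id), and uses hypothetical names (Char, index_between, insert, render).

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Char:
    positional_index: float  # fractional position; existing values never change
    site_id: str             # breaks ties between sites that pick the same index
    value: str


def index_between(left: float, right: float) -> float:
    """Pick an index strictly between two neighbouring characters."""
    return (left + right) / 2


def insert(chars: list[Char], new_char: Char) -> list[Char]:
    # Sorting by (positional_index, site_id) gives every replica the same order.
    return sorted(chars + [new_char])


def render(chars: list[Char]) -> str:
    return "".join(c.value for c in chars)


chars = [Char(1.0, "site-a", "O"), Char(2.0, "site-a", "T")]
new_char = Char(index_between(1.0, 2.0), "123e4567-e89b-12d3", "A")
chars = insert(chars, new_char)
print(render(chars), [c.positional_index for c in chars])
```

T keeps its index of 2.0 after the insert. Plain floats run out of precision after many inserts at the same spot, so a real implementation would need a position identifier that can always be subdivided.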

Cons:

  • Interleaving anomalies: when two users concurrently insert text at the same position, their characters can end up interleaved.

  3. Lock

Locking requires us to segment a document into small sections, so that a user can lock one section and edit it. This gives developers a simple solution and avoids the complexity of OT and CRDTs.

Pros:

  • Easy to implement.

Cons:

  • Poor user experience: two users may want to add characters to the same section of the document even though their operations do not necessarily conflict.
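
A minimal sketch of section-level locking, assuming sections are identified by an integer and the lock table lives in a single process. The class SectionLocks and its methods are hypothetical. A real deployment would need a shared store and lock expiry.

```python
class SectionLocks:
    """Tracks which user currently holds the lock on each document section."""

    def __init__(self) -> None:
        self._owners: dict[int, str] = {}

    def acquire(self, section_id: int, uid: str) -> bool:
        # setdefault records uid only if the section is not locked yet.
        owner = self._owners.setdefault(section_id, uid)
        return owner == uid

    def release(self, section_id: int, uid: str) -> bool:
        # Only the current owner may release the lock.
        if self._owners.get(section_id) == uid:
            del self._owners[section_id]
            return True
        return False


locks = SectionLocks()
print(locks.acquire(1, "alice"))  # alice locks section 1
print(locks.acquire(1, "bob"))    # bob is blocked on the same section
print(locks.acquire(2, "bob"))    # a different section is still free
```

The second call shows the user-experience cost noted above: bob is blocked from section 1 even if his edit would not have touched alice's characters.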

Do we need processing queue in client or server side?

Client

Pros:

  • Easy to implement

Cons:

  • No centralized QPS control: if millions of people edit the same doc at once, the server can be overwhelmed.

Server side

Pros:

  • Centralized QPS throttle and control

Cons:

  • An additional component adds complexity to the system.

  • Changes may be significantly delayed if they pile up in the queue.
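
The server-side trade-off can be illustrated with a bounded per-document queue. This is a sketch under assumptions: one in-memory queue per document, a fixed max_pending limit, and hypothetical names (EditQueue, submit, drain). It is not a description of how any particular product does it.

```python
from collections import deque


class EditQueue:
    """Bounded queue of pending edits for one document."""

    def __init__(self, max_pending: int) -> None:
        self.max_pending = max_pending
        self._pending: deque = deque()

    def submit(self, edit: dict) -> bool:
        # Centralized throttle: refuse new edits once the backlog is full.
        if len(self._pending) >= self.max_pending:
            return False
        self._pending.append(edit)
        return True

    def drain(self, batch_size: int) -> list:
        # Process edits in arrival order, at most batch_size at a time.
        batch = []
        while self._pending and len(batch) < batch_size:
            batch.append(self._pending.popleft())
        return batch


queue = EditQueue(max_pending=2)
accepted = [queue.submit({"uid": u, "text": "x"}) for u in ("u1", "u2", "u3")]
print(accepted)
print(queue.drain(batch_size=10))
```

Rejecting the third edit is the throttle from the pros list. Raising max_pending avoids rejections but lets the backlog, and therefore the delay, grow.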


Figma Multiplayer Doc
https://core.ac.uk/download/pdf/189163265.pdf