👽
Software Engineer Interview Handbook
  • README
  • Behavioral
    • Useful Links
    • Dongze Li
  • Algorithm
    • Segment Tree
    • Array
      • Product Of Array Except Self
      • Merge Strings Alternately
      • Increasing Triplet Subsequence
      • String Compression
      • Greatest Common Divisor Strings
      • Max Product Of Three
      • Find Duplicate Num
      • Valid Palindrome Ii
      • Next Permutation
      • Rearrange Array By Sign
      • Removing Min Max Elements
      • Find Original Array From Doubled
      • Reverse Words Ii
    • Backtracking
      • Letter Combination Phone Number
      • Combination Sum Iii
      • N Queens
      • Permutations
      • Combination Sum
    • Binary Search
      • Koko Eating Bananas
      • Find Peak Element
      • Successful Pairs Of Spells Potions
    • Binary Search Tree
      • Delete Node In BST
      • Validate Bst
      • Range Sum Bst
    • Binary Tree
      • Maximum Depth
      • Leaf Similar Trees
      • Maximum Level Sum
      • Binary Tree Right Side
      • Lowest Common Ancestor
      • Longest Zigzag Path
      • Count Good Nodes
      • Path Sum III
      • Maximum Path Sum
      • Move Zero
      • Diameter Binary Tree
      • Sum Root Leaf Number
      • Traversal
      • Binary Tree Vertical Order
      • Height Tree Removal Queries
      • Count Nodes Avg Subtree
      • Distribute Coins
      • Binary Tree Max Path Sum
    • Bit
      • Min Flips
      • Single Number
      • Pow
      • Find Unique Binary Str
    • BFS
      • Rotten Oranges
      • Nearest Exist From Entrance
      • Minimum Knight Moves
      • Network Delay Time
      • Minimum Height Tree
      • Knight Probability In Board
    • Design
      • LRU Cache
      • Get Random
      • LFU Cache
      • Moving Average
      • Rle Iterator
      • Design Hashmap
    • DFS
      • Reorder Routes Lead City
      • Evaluate Division
      • Keys And Rooms
      • Number Of Provinces
      • Disconnected Path With One Flip
      • Course Schedule Ii
      • Robot Room Cleaner
      • Word Break Ii
      • Number Coins In Tree Nodes
      • Maximum Increasing Cells
      • Number Coins In Tree Nodes
      • Detonate Maximum Bombs
      • Find All Possible Recipes
      • Min Fuel Report Capital
      • Similar String Groups
    • DP
      • Domino And Tromino Tiling
      • House Robber
      • Longest Common Subsequence
      • Trade Stock With Transaction Fee
      • Buy And Sell Stock
      • Longest Non Decreasing Subarray
      • Number Of Good Binary Strings
      • Delete And Earn
      • Minimum Costs Using Train Line
      • Decode Ways
      • Trapping Rain Water
      • Count Fertile Pyramids
      • Minimum Time Finish Race
      • Knapsack
      • Count Unique Char Substrs
      • Count All Valid Pickup
    • Greedy
      • Dota2 Senate
      • Smallest Range Ii
      • Can Place Flowers
      • Meeting Rooms II
      • Guess the word
      • Minimum Replacement
      • Longest Palindrome Two Letter Words
      • Parentheses String Valid
      • Largest Palindromic Num
      • Find Missing Observations
      • Most Profit Assigning Work
    • Hashmap
      • Equal Row Column Pairs
      • Two Strings Close
      • Group Anagrams
      • Detect Squares
    • Heap
      • Maximum Subsequence Score
      • Smallest Number Infinite Set
      • Total Cost Hire Workers
      • Kth Largest Element
      • Meeting Rooms III
      • K Closest Points Origin
      • Merge K Sorted List
      • Top K Frequent Elements
      • Meeting Room III
      • Num Flowers Bloom
      • Find Median From Stream
    • Intervals
      • Non Overlapping Intervals
      • Min Arrows Burst Ballons
    • Linkedlist
      • Reverse Linked List
      • Delete Middle Node
      • Odd Even Linkedlist
      • Palindrome Linkedlist
    • Monotonic Stack
      • Daily Temperatures
      • Online Stock Span
    • Random
      • Random Pick With Weight
      • Random Pick Index
      • Shuffle An Array
    • Recursion
      • Difference Between Two Objs
    • Segment Fenwick
      • Longest Increasing Subsequence II
    • Stack
      • Removing Stars From String
      • Asteroid Collision
      • Evaluate Reverse Polish Notation
      • Building With Ocean View
      • Min Remove Parentheses
      • Basic Calculator Ii
      • Simplify Path
      • Min Add Parentheses
    • Prefix Sum
      • Find The Highest Altitude
      • Find Pivot Index
      • Subarray Sum K
      • Range Addition
    • Sliding Window
      • Max Vowels Substring
      • Max Consecutive Ones III
      • Longest Subarray Deleting Element
      • Minimum Window Substring
      • K Radius Subarray Averages
    • String
      • Valid Word Abbreviations
    • Two Pointers
      • Container With Most Water
      • Max Number K Sum Pairs
      • Is Subsequence
      • Num Substrings Contains Three Char
    • Trie
      • Prefix Tree
      • Search Suggestions System
      • Design File System
    • Union Find
      • Accounts Merge
    • Multithreading
      • Basics
      • Web Crawler
  • System Design
    • Operating System
    • Mocks
      • Design ChatGPT
      • Design Web Crawler
      • Distributed Search
      • News Feed Search
      • Top K / Ad Click Aggregation
      • Design Job Scheduler
      • Distributed Message Queue
      • Google Maps
      • Nearby Friends
      • Proximity Service
      • Metrics monitoring and alert system
      • Design Email
      • Design Gaming Leaderboard
      • Facebook New Feed Live Comments
      • Dog Sitting App
      • Design Chat App (WhatsApp)
      • Design Youtube/Netflix
      • Design Google Doc
      • Design Webhook
      • Validate Instacart Shopper Checkout
      • Design Inventory
      • Design donation app
      • Design Twitter
    • Deep-Dive
      • Back of Envelope
      • Message Queue
      • Redis Sorted Set
      • FAQ
      • Geohash
      • Quadtree
      • Redis Pub/Sub
      • Cassandra DB
      • Collaborative Concurrency Control
      • Websocket / Long Polling / SSE
    • DDIA
      • Chapter 2: Data Models and Query Languages
      • Chapter 5: Replication
      • Chapter 9: Consistency and Consensus
  • OOD
    • Overview
    • Design Parking
  • Company Tags
    • Meta
    • Citadel
      • C++ Fundamentals
      • 面经1
      • Fibonacci
      • Pi
      • Probability
    • DoorDash
      • Similar String Groups
      • Door And Gates
      • Max Job Profit
      • Design File System
      • Count All Valid Pickup
      • Most Profit Assigning Work
      • Swap
      • Binary Tree Max Path Sum
      • Nearest Cities
      • Exployee Free Time
      • Tree Add Removal
    • Lyft
      • Autocomplete
      • Job Scheduler
      • Read4
      • Kvstore
    • Amazon
      • Min Binary Str Val
    • AppLovin
      • TODO
      • Java Basic Questions
    • Google
      • Huffman Tree
      • Unique Elements
    • Instacart
      • Meeting Rooms II
      • Pw
      • Pw2
      • Pw3
      • Expression1
      • Expression2
      • Expression3
      • PW All
      • Expression All
      • Wildcard
      • Free forum tech discussion
    • OpenAI
      • Spreadsheet
      • Iterator
      • Kv Store
    • Rabbit
      • Scheduler
      • SchedulerC++
    • [Microsoft]
      • Min Moves Spread Stones
      • Inorder Successor
      • Largest Palindromic Num
      • Count Unique Char Substrs
      • Reverse Words Ii
      • Find Missing Observations
      • Min Fuel Report Capital
      • Design Hashmap
      • Find Original Array From Doubled
      • Num Flowers Bloom
      • Distribute Coins
      • Find Median From Stream
Powered by GitBook
On this page
  • Questions:
  • Functional Requirement
  • Non-functional Requirement
  • Scale
  • Data
  • API
  • Data Schema
  • High Level Diagram
  • With Offline processing:
  • Item Availability ML Pipeline
  • Deep Dive
  • How user make an order?
  • How to prevent the inconsistent state when updating inventory?
  • Different shoppers
  • How to use cache?
  • How to do DB sharding?
  • What if data size is too large for a single database?
  • How to show accurate item availability?
  • Initial model for item availability
  • New Model
  1. System Design
  2. Mocks

Design Inventory

PreviousValidate Instacart Shopper CheckoutNextDesign donation app

Last updated 1 year ago

System Design:不是最常见的visa payment system, 而是inventory management, 设计当有客户,卖菜人,和买菜人同时需要更新菜的数量和买卖的情况。全程面试官毫无交流,感觉对墙说话。我每五分钟都停下来确认是不是再聊他想讨论的内容,面试官都是一句敷衍带过good good。最后草草收场去开别的会了。。

问如果做一个能显示各种category (food,clothes..)的页面, 然后点开每个category 的话还能显示sub- category,应该怎么设计database。感觉这里也答得一般

System Design: 是没见过的心题 , 就算是老面镜题payment都不知道能不能过关。给我来了一个全新的题。 题目: 有用户, 有快递小哥, 有厂商, 围绕着从库里的库存 的数量,更新数据。 api要求返回在1s之内。 问了, 选什么database, 怎么monitor, single point failure有哪些 ‍‍‍‍‍‍‍‌‍‌‌‌‌‍‍‌‌‌之前没怎么看system design, 属于半裸面。 真不敢相信自己真的就扯了1个小时。

shopping experience for customer

  • retailer app check 库存 {商店, 商品, 薯量}

  • customer app 架入 商品 {商店, 商品, 薯量,盯单}

  • 跑腿小哥 checkout {商店, 商品, 薯量,盯单} 没什么特别好的想法,主要问了问库存怎么扣出‍‍‍‍‍‍‍‌‍‌‌‌‌‍‍‌‌‌,怎么确认不超存量

Questions:

  1. How frequent does retailer update the inventory?

Functional Requirement

  1. Retailers are able to check inventory, add/update inventory, delete inventory...

  2. Customers are able to check inventory, add item to cart and place an order.

  3. Shopper can update, checkout order.

Non-functional Requirement

  1. Highly available - user will be able to do grocery shopping at any time.

  2. Highly scalable - the system can handle large amount of read/write

  3. Low latency

  4. Consistency - If multiple users place orders on the same item, we need to accurately reflect on the item availability?

Scale

How much scale we are looking at? and read:write ratio?

75k stores, 500M products on the shelves.

2M active users place multiple orders of tens of items in their cart every month.

600k Shoppers.

One month -> 10M order -> 10M / 30 = 0.3M order / day = 0.3*10^6 / 10^5 = 15 QPS

15*3 = 45 QPS for checkout validation requests.

Data

75k store, 500M items, 200 locations

On average a user will make 20 searches per basket before they checkout.

500*10^6 / (75*10^3) = 500/75*10^3 = 10^4 items per store.

1 store -> items

Retailer partners update inventory once a day for availability of all items on instacart.

Despite estimation about availability, purchased items are not reserved until our shoppers pick them up in the store.

API

API

Retailer Update inventory
POST v1/inventory
Request:
json {
  inventory_name
  inventory_id,
  encoded_blob(base64): []
}

Response:201 Updated OK
500 internal server error

Customer
CRUD
create order
POST v1/order
Request
{
  user_id,
  retailer_id,
  created_at
}
Response

view order
GET v1/order?id=xxx
Request: 
Response: {
  order_id,
  items: [
    "item_name1": {
       "quantity": 10,
    }
   "item_name1": {
       "quantity": 10,
    }
   "item_name1": {
       "quantity": 10,
    }
  ],
  status: CREATED, SHOPPER_ASSIGNED, PICKED_UP, CHECKED_OUT, DELIVERED
  amount: xxx,
}

update order
PUT v1/order
Reqeust:
{
  user_id,
  order_id,
  item_id,
  item_name,
  item_quantity,
  price, [optional]
  created_at
}

Response:
201 OK

remove order
DELETE /v1/order?id=xxx

Shopper
POST v1/orderitem
Reqeust:
{
  order_id,
  orderitem_id,
  shopper_id,
  item_id,
  item_name,
  item_quantity
}

Data Schema

Shopper Table
id (primary key)
phone
email
zipcode
first_name
last_name

Order Table
id, (primary key)
shopper_id
retailer_id
user_id,
created_at,
amount,
checkout_amount
status

Order Item Table
id (primary key)
order_id
retailer_id
item_id
item_name
item_quantity,
price,
status: PENDING/FOUND/REPLACED/NOT_FOUND
is_deleted,
updated_at

Retailer Table
id (primary key)
store_name
store_address
geo_hash
zipcode

Item Availability Table
retailer_id + item_id + date (composite primary key)
retailer_id
item_id
item_quantity,
version, (optimistic lock)
created_at,
updated_at

Item Table
id
retailer_id
item_name
item_price
item_pic: url in S3

User Table
id
phone
email
location
first_name
last_name

Question:
how to quickly update quantity for each product item?
is it necessary to keep a separate table for stock quantity?

High Level Diagram

With Offline processing:

Item Availability ML Pipeline

Deep Dive

How user make an order?

  1. Create cart order

  2. Fetch item availability from Cache.

  3. Create order and order items.

  4. After shopper update order item status:

    1. Fetch item availability and version number.

    2. Update Item Availability table with version number for optimistic lock.

    3. Update order item status.

How to prevent the inconsistent state when updating inventory?

Different shoppers

shopper1 and shopper2 both mark the same item as found in the same retail store, we might ended up with inaccurate quantity for item availability table.

The isolation property in ACID means database transactions must complet their tasks independently from other transactions. So data changes made by transaction 1 are not visible to transaction 2 until transaction 1 is completed.

Option 1: Pessimistic Locking

Pessimistic concurrency control, prevents simultaneous updates by placing a lock on a record as soon as one user starts to update it. Other users who attempt to update the record have to wait until the first user has released the lock.

For MySQL:

SELECT ... FOR UPDATE

works by locking the rows returned by a selection query.

Pros:

  • Prevents applications from updating data that is being changed.

  • Easy to implement and it avoids conflicts by serializing updates.

  • Useful for write heavy application when data-contention is heavy.

Cons:

  • Deadlocks may occur when multiple resources are locked. Writing deadlock-free application code could be challenging.

  • If a transaction is locked for too long, other transactions cannot access the resource. This has a significant impact on database performance, especially when transactions are long lived.

Option 2: Optimistic Locking

Optimistic concurrency control, allows multiple concurrent users to attempt to update the same resource.

  1. A new column called version is added to the database table.

  2. Before a user modifies the database row, the application read the version number of the row.

  3. When the user updates the row, the application increases the version number by 1 and write back to db.

  4. A database validation check is put in place: the next version number should exceed the current version number by 1. Otherwise, validation fails and user tries again from step 2.

Pros:

  • Prevents applications from updating stale data.

  • We don't need to lock db resources. version number control is on application level.

  • Optimistic lock is good when data-contention and concurrency is low.

Cons:

  • Poor performance when data contention is heavy.

  • Why? When there are a lot of shoppers try to mark the same grocery item, only one of them will succeed and the rest of client requests will have to retry.

Option 3: Database constraints

Similar to optimistic locking.

Instead of using is_available, we can use availability initialized to 1, and we add the following DB constraint:

CONSTRAINT `check_availability` CHECK ( total inventory - total sold >= 0)

Pros:

  • Easy to implement, works well when data contention is low.

Cons:

  • The db constraint cannot be version controlled easily like application code.

  • Not all db support constraints, if we do data migration in the future, it might cause problems.

How to use cache?

For customer browsing item catalogs within a retailer store and placing an order, we can actually put all the read requests onto the cache, since the item is not sold until shopper picked it up from store.

Only when shopper pick up item from store, we start update item availability and solve the conflict by using optimistic lock.

After DB updated, we update the cache.

How to do DB sharding?

There are normally two ways we can scale the DB. Either we add replica or we shard the data.

  • Add Read Replicas:

Pros:

  1. Can help ease up read load.

Cons:

  1. Higher chance of change conflicts if user read stale data from version column within the table.

  • Partition data into multiple databases

What if data size is too large for a single database?

  1. Move data to cold storage, for example item availability table from past dates.

  2. Shard by retailer_id

hash(retailer_id) % number of servers

Each retailer is unique and location-based, so we are effectively splitting request traffic from different retailers.

How to show accurate item availability?

Ingesting scores generated by ML models into DB storage for fast and bulk retrieval. In both method, we store score in DB and update them to address low latency use cases.

  1. Full sync

The ML availability scoring service updates a table multiple times a day in Snowflake with the refreshed availability score of items. The DB ingestion workers read the snowflake table periodically and upsert the availability score for refreshed items to ensure no scores are stale.

  1. Lazy Score Refresh

Scores are updated on demand based on an item appearing in the search results. The refresh activity happens in background jobs and used kinesis to aggregate the updates to DB.

Initial model for item availability

New Model

The new model combines three components for general, trending and real-time availability scores.

  • General Score: describes typical item availability patterns and is recalculated weekly.

  • Trending Score: Quantifies short-term deciations from typical patterns and is recomputed daily or hourly.

  • Real-time score: based on latest observations, use a event driven streaming achitecture that employs Kafka and Flink to deliver signals sourced from customer applications and retailer systems.

大萝卜 fullstack vo|instacart面经|一亩三分地海外面经版一亩三分地 1Point3Acres
Logo
Drawing
Drawing
Drawing
Drawing