Cassandra DB
https://discord.com/blog/how-discord-stores-billions-of-messages
2017 Discord Use Case
50/50 read/write ratio.
Voice chat heavy channel:
Fewer than 1,000 messages a year.
Returning even a small amount of data involves random seeks on disk, causing disk cache evictions.
Private text chat heavy channel:
100k to 1M messages a year.
The read request rate is low, so the requested data is unlikely to be in the disk cache.
Overall, reads are very random.
About Cassandra
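It's a KKV store: the two Ks together form the primary key.
The first K is the partition key, which determines which partition (and node) the data lives on.
The second K is the clustering key, which identifies one row within that partition.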
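To make the KKV idea concrete, here is a minimal in-memory sketch, not Cassandra itself. The class `KKVStore` and its method names are hypothetical; the integers stand in for a channel ID (partition key) and a message ID (clustering key).

```python
from collections import defaultdict


class KKVStore:
    """Toy model of a KKV store: partition key -> {clustering key -> row}."""

    def __init__(self):
        self._partitions = defaultdict(dict)

    def upsert(self, partition_key, clustering_key, **columns):
        # Locate (or create) the row inside its partition, then merge columns.
        row = self._partitions[partition_key].setdefault(clustering_key, {})
        row.update(columns)

    def read_partition(self, partition_key, limit=50):
        # Rows inside one partition are returned ordered by clustering key,
        # newest (largest key) first.
        rows = self._partitions.get(partition_key, {})
        newest_first = sorted(rows.items(), key=lambda kv: kv[0], reverse=True)
        return newest_first[:limit]


store = KKVStore()
store.upsert(1, 100, author_id=7, content="hello")
store.upsert(1, 101, author_id=8, content="world")
store.upsert(2, 102, author_id=7, content="other channel")

for message_id, row in store.read_partition(1):
    print(message_id, row)
```

Reading one partition touches only the rows stored under that partition key, which is why partition size matters so much in the issues below.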
Schema
Issues:
Began to see warnings that partitions over 100MB in size were found.
Large partitions put a lot of GC pressure on Cassandra during compaction.
Solution:
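Decided to bucket messages by time, storing about 10 days of messages in one bucket so that partitions stay under 100MB. The bucket has to be derivable from the message ID or a timestamp.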
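A sketch of how a 10-day bucket could be derived from a message ID. It assumes message IDs are Discord snowflakes, whose bits above the low 22 hold milliseconds since the Discord epoch (1420070400000, i.e. 2015-01-01 UTC). The function and constant names are illustrative.

```python
import time

DISCORD_EPOCH_MS = 1420070400000           # 2015-01-01T00:00:00Z (assumed epoch)
BUCKET_SIZE_MS = 1000 * 60 * 60 * 24 * 10  # 10 days in milliseconds


def make_bucket(snowflake=None):
    """Return the bucket number for a snowflake ID, or for 'now' if None."""
    if snowflake is None:
        timestamp_ms = int(time.time() * 1000) - DISCORD_EPOCH_MS
    else:
        # Dropping the low 22 bits leaves milliseconds since the epoch.
        timestamp_ms = snowflake >> 22
    return timestamp_ms // BUCKET_SIZE_MS


def make_buckets(start_id, end_id=None):
    """Return every bucket between two message IDs (end defaults to now)."""
    return range(make_bucket(start_id), make_bucket(end_id) + 1)
```

A query for recent messages would walk these buckets from newest to oldest until it has collected enough rows.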
New Schema
Concerns
Eventual Consistency
Last write wins, resolved per column.
Read-before-write is an anti-pattern: reads are more expensive than writes in Cassandra.
Every write is an upsert: if the row exists it is updated, otherwise it is inserted.
Concurrency Issues
If user A deletes a message just before user B edits it, we end up with a row that is missing all data except the primary key and the updated column, because the edit is applied as an upsert.
Two solutions:
Write the whole message back when editing it. This could resurrect deleted messages and adds more chances for conflicts between concurrent writes.
Detect that the message is corrupt and delete it from the DB.
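The delete/edit race and the second solution can be simulated with a toy last-write-wins row. This is a simplified model, not Cassandra's storage engine. The `Row` class, the `is_corrupt` helper, and the choice of `author_id` as the required column are assumptions for illustration.

```python
class Row:
    """Toy last-write-wins row: each cell carries its own write timestamp."""

    def __init__(self, primary_key):
        self.primary_key = primary_key
        self.cells = {}         # column name -> (value, write timestamp)
        self.deleted_at = None  # timestamp of the row tombstone, if any

    def upsert(self, ts, **columns):
        # No read-before-write: only the supplied columns are touched.
        for name, value in columns.items():
            current = self.cells.get(name)
            if current is None or ts >= current[1]:
                self.cells[name] = (value, ts)

    def delete(self, ts):
        # A delete is itself a write: it records a tombstone.
        if self.deleted_at is None or ts > self.deleted_at:
            self.deleted_at = ts

    def read(self):
        # The tombstone hides only cells written at or before it.
        cutoff = self.deleted_at if self.deleted_at is not None else -1
        live = {n: v for n, (v, ts) in self.cells.items() if ts > cutoff}
        return live or None


def is_corrupt(row_data, required="author_id"):
    """A row that exists but lacks a required column is a leftover fragment."""
    return row_data is not None and required not in row_data


row = Row(primary_key=(1, 100))
row.upsert(ts=1, author_id=7, content="hello")
row.delete(ts=2)                    # user A deletes the message
row.upsert(ts=3, content="hello!")  # user B's edit lands afterwards
partial = row.read()
print(partial, is_corrupt(partial))
```

The edit survives the tombstone because it carries a later timestamp, so the read returns a fragment with no author. Checking a required column on read is one way to spot such fragments and delete them.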
Tombstone issues
Avoid writing null values to Cassandra; each null write creates an unnecessary tombstone.
A popular channel had only 1 message left in it after the owner deleted millions of messages, each delete leaving a tombstone. It took 20 seconds to load this channel.
Cause:
Cassandra effectively had to scan millions of message tombstones, generating garbage faster than the JVM could collect it.
Solution:
Lowered the lifespan of tombstones from 10 days to 2 days.
Changed the application query code to track empty buckets and avoid them in the future. If a user triggers this query again, at worst Cassandra scans only the most recent bucket.
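A rough sketch of what tracking empty buckets could look like on the application side. Everything here is hypothetical: `fetch_recent`, the `query_bucket` callable (standing in for a per-bucket Cassandra query), and the in-memory `empty_buckets` set are illustrative names, not the actual implementation.

```python
def fetch_recent(channel_id, newest_bucket, oldest_bucket, limit,
                 query_bucket, empty_buckets):
    """Collect up to `limit` messages, walking buckets from newest to oldest."""
    messages = []
    for bucket in range(newest_bucket, oldest_bucket - 1, -1):
        if (channel_id, bucket) in empty_buckets:
            continue  # known to be empty: skip the tombstone scan entirely
        rows = query_bucket(channel_id, bucket, limit - len(messages))
        # The newest bucket can still receive messages, so never mark it empty.
        if not rows and bucket != newest_bucket:
            empty_buckets.add((channel_id, bucket))
        messages.extend(rows)
        if len(messages) >= limit:
            break
    return messages


# Demo: a channel whose only surviving message sits in the oldest bucket.
data = {(1, 0): ["first message"]}
calls = []


def query_bucket(channel_id, bucket, limit):
    calls.append(bucket)  # record which buckets were actually queried
    return data.get((channel_id, bucket), [])[:limit]


empty = set()
first = fetch_recent(1, 5, 0, 50, query_bucket, empty)
first_calls = list(calls)
calls.clear()
second = fetch_recent(1, 5, 0, 50, query_bucket, empty)
second_calls = list(calls)
print(first_calls, second_calls)
```

The first load pays for every bucket once. Later loads query only the most recent bucket and the buckets known to hold data.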