Facebook News Feed
Last updated
Last updated
Facebook Timeline: Fanout read
Fanout write = Megafeed
Pro:
just one thing to read
we can do pre-rank.
Con:
If you have N friends, your post will be repeatedly wrote N times. It's time and space comsuming.
Fanout read = Multifeed
Facebook chose Fanout read (Multifeed).
Write amplification makes the storage needs expensive in Megafeed
Write amplification: suppose you have 300 followers, you sending a new post will result in 300 writes to the system. This changes the problem from a tractable dataset that can be saved in memory to a big disk storage problem.
Developing with read-time aggregation is flexible
The developer can simply aggregate the data, apply their algorithms, and see the output right away. It would take longer with fanout write.
Memory and network easier to engineer around
News feed launches in 2005 with fanout write. At that time memory and network weren't that good. The fanout read approach started 2007. Now the hardware is more powerful to support fanout read.
Never have huge fan-out write to do, only bounded (<10k) fan-out read
Justin Bieber problem: with fanout write, when JB sends a new post, we need to write to a million different places. And because facebook's friend relationship is bidirectional, JB's news feed pool will be write a million times if each of his friends sends a new post. But nobody would read one million posts.
In-memory (mostly) databases
Do ~40 requests per feed query
10 billion querys per day to the aggregator. So ~400 billion requests per day, about 4.6 million QPS.
About 50% of the total LOC.
Must store a number of users on each leaf
Once we find a user, we want to scan his/her activity in time order
Want to have an easy way of adding new activity without locking
Most natural data structure is a hashtable to a linked list (map from userId to posts)
If FB is using fanout read for news feed generation, how does it handle new posts?
Facebook News Feed: Social Data at Scale (NOV 26, 2012)
Talked about how feed data is stored.
Scale at Facebook
Scalling Memcache at Facebook