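How to design a Twitter-like system is a question many of you will run into in interviews, so 包子君 is sharing a quick take on it today.
First of all, Twitter is definitely not something we can design in 45 minutes, unless you are a Twitter engineer out interviewing... So the point is mainly to talk about where the hard parts are. Of course, everyone can come up with a bunch of tables joined against each other, with queries that are just SELECT plus indexes. But you will quickly find that things get tricky when a celebrity account (a "大V") posts a tweet. So what is the solution?
In short, each user maintains a queue, and that queue lives in Redis. The English notes below should help; I also recommend watching the video by Raffi, after which it should all make sense.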
User Timeline -> displays the tweets you posted -> select + index seems to work
Home Timeline -> displays the tweets from the people you follow
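To make the two timelines concrete, here is a minimal sketch of the naive "select + index" approach using Python's built-in sqlite3. The table names, column names, and sample rows are made up for illustration and are not Twitter's actual schema. The user timeline is a single indexed lookup, while the home timeline needs a join against the follow graph on every read, which is the part that stops scaling.

```python
import sqlite3

# In-memory database; schema and data are hypothetical, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tweets  (id INTEGER PRIMARY KEY, author_id INTEGER, body TEXT);
    CREATE TABLE follows (follower_id INTEGER, followee_id INTEGER);
    CREATE INDEX idx_tweets_author    ON tweets (author_id, id);
    CREATE INDEX idx_follows_follower ON follows (follower_id);
""")
conn.executemany(
    "INSERT INTO tweets (id, author_id, body) VALUES (?, ?, ?)",
    [(1, 10, "hello from 10"), (2, 20, "hello from 20"),
     (3, 10, "second from 10"), (4, 30, "hello from 30")],
)
# User 99 follows users 10 and 20, but not 30.
conn.executemany("INSERT INTO follows VALUES (?, ?)", [(99, 10), (99, 20)])


def user_timeline(user_id, limit=20):
    """Tweets posted by one user, newest first: one indexed SELECT."""
    rows = conn.execute(
        "SELECT id FROM tweets WHERE author_id = ? ORDER BY id DESC LIMIT ?",
        (user_id, limit),
    )
    return [row[0] for row in rows]


def home_timeline(user_id, limit=20):
    """Tweets by everyone the user follows, computed at read time via a join."""
    rows = conn.execute(
        """
        SELECT t.id
        FROM tweets AS t
        JOIN follows AS f ON f.followee_id = t.author_id
        WHERE f.follower_id = ?
        ORDER BY t.id DESC
        LIMIT ?
        """,
        (user_id, limit),
    )
    return [row[0] for row in rows]
```

Calling `user_timeline(10)` reads only that author's rows, whereas `home_timeline(99)` has to touch every followee's tweets on each request.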
A few points to keep in mind and discuss with your interviewer:
- Twitter is consumption heavy, not production heavy
- Celebrity tweets (fanout) are challenging
- You will quickly realize that simply locking a table or row is too slow
- A delay of around 5 seconds before a tweet shows up in followers' timelines is acceptable
- Think of each user as maintaining a queue in Redis, like a mini table. When a user with N followers tweets, the tweet fans out as N inserts, one into each follower's queue, with no central point of locking. One problem is that tweets may arrive out of order when people reply or retweet; sorting by tweet id is one mitigation
- For search, the basic idea is an inverted index plus sharding
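The fanout and search bullets above can be sketched roughly as follows. This is a toy, in-memory version: plain Python deques stand in for the per-user Redis queues, and the names `follow`, `post_tweet`, `index_tweet`, `search`, the cap of 100 entries, and the choice to shard the index by tweet id are all assumptions made for illustration, not details from the talk.

```python
from collections import defaultdict, deque

TIMELINE_CAP = 100  # assumed cap per home queue; arbitrary illustrative value
NUM_SHARDS = 4      # assumed number of search shards; arbitrary

followers = defaultdict(set)  # author_id -> ids of users following that author
# user_id -> most recent tweet ids; stands in for a per-user queue in Redis
home_queues = defaultdict(lambda: deque(maxlen=TIMELINE_CAP))
# One inverted index per shard: term -> set of tweet ids
shards = [defaultdict(set) for _ in range(NUM_SHARDS)]


def follow(follower_id, author_id):
    followers[author_id].add(follower_id)


def post_tweet(author_id, tweet_id):
    """Fanout on write: one insert per follower, with no shared lock."""
    for follower_id in followers[author_id]:
        # When the deque is full, appendleft drops the oldest entry on the right.
        home_queues[follower_id].appendleft(tweet_id)


def home_timeline(user_id, limit=20):
    """Read the precomputed queue; sort by tweet id to repair arrival order."""
    return sorted(home_queues[user_id], reverse=True)[:limit]


def index_tweet(tweet_id, text):
    """Add a tweet to the inverted index of the shard that owns its id."""
    shard = shards[tweet_id % NUM_SHARDS]
    for term in text.lower().split():
        shard[term].add(tweet_id)


def search(term):
    """Scatter the query to every shard and gather the matching tweet ids."""
    hits = set()
    for shard in shards:
        hits |= shard.get(term.lower(), set())
    return sorted(hits, reverse=True)
```

The trade-off is that a tweet from an account with N followers costs N writes, but each home timeline read becomes a cheap lookup of a precomputed queue, which suits a consumption-heavy workload.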
Reposted from: 包子铺里聊IT