FB Technical Phone Round 1

cf36987 · 2018-11-11 23:31

10/10 initial technical phone interview. There was an HR screening before this. Good luck to everyone who's recruiting!

SQL:
-- Q1: How many posts were reported yesterday for each report reason?
-- Table: user_actions
-- ds (date, String) | user_id | post_id | action ('view', 'like', 'reaction', 'comment', 'report', 'reshare') | extra (extra reason for the action, e.g. 'love', 'spam', 'nudity')
-- Q2: Introduce a new table, reviewer_removals. Calculate what percent of the daily content that users view on FB is actually spam.
-- No need to consider whether the removal happened on the same date as the post.
-- Table: reviewer_removals
-- ds (date, String) | reviewer_id | post_id

Product:
How would you test if this spam filter works?
List some metrics.

If we experiment, how would you conduct it?
A/B testing.

How would you select a sample group?
Randomly, to avoid bias.

How many people would you select for your sample group?
Use the formula for n (minimum sample size).

After getting results from the A/B test, what would you do next?
Run a t-test on the metrics to see if there's a difference.

What's a t-test? What's a t-score? What's a p-value? Explain p-value to someone who doesn't know stats.

Let's say the filter worked but revenue went down. What would be your hypothesis?
If it's revenue per user that dropped, perhaps the number of users increased because of the effective spam filter, but the new users don't spend money.
But if total revenue also dropped, not just revenue per user, perhaps the user distribution changed and there are now more users who don't spend money.
Perhaps the filter demotes the spam posts, so all the spam is now clustered in the same spot (e.g. the 20th post onward) and users stop scrolling after the 20th post. Check users' active time to validate.

Given the revenue decrease, what would you recommend? (It doesn't have to be a yes-or-no answer.)
Short term vs. long term: how much does revenue drop? Weigh user experience against revenue, and the short-term revenue drop against long-term brand perception and long-term revenue gain.
If the user distribution changed: find the cause and address that segment of non-spending users.
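
The t-test step mentioned above can be sketched roughly as follows. This is only an illustration, not what was asked for in the interview: it assumes Welch's unequal-variance t-test, uses made-up metric values, and approximates the two-sided p-value with a normal distribution, which is only reasonable for large samples. The function name `welch_t_test` and the sample lists are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def welch_t_test(a, b):
    """Return (t statistic, Welch-Satterthwaite df, approximate two-sided p-value)."""
    # Squared standard error of each group's mean (sample variance / n).
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)

    # t-score: difference in means measured in standard errors.
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)

    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )

    # ASSUMPTION: normal approximation to the t distribution (large samples only).
    # An exact p-value would need the t distribution's CDF with `df` degrees of freedom.
    p_approx = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, df, p_approx


# Made-up per-user metric values for the control and treatment groups.
control = [1.0, 2.0, 3.0, 4.0, 5.0]
treatment = [2.0, 3.0, 4.0, 5.0, 6.0]

t_stat, df, p_value = welch_t_test(control, treatment)
print(f"t = {t_stat:.3f}, df = {df:.1f}, approx. p = {p_value:.3f}")
```

In plain terms, the p-value is the chance of seeing a difference at least this large if the filter actually had no effect; a small p-value makes "no effect" hard to believe.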

0572C · 2018-11-11 23:32

OP, would you mind sharing what position this was for? Is it product-leaning?

cf36987 · 2018-11-11 23:33

General Data Science/Analytics Intern

0518 · 2018-11-11 23:34

Here is my solution to Q1:
select
action,
extra,
count(distinct post_id) as reported_posts
from
user_actions
where
user_id is not null and
action = 'report' and
date(ds) = date_sub(current_date, interval 1 day)  -- ds is the date column, stored as a string

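But I didn't quite understand the second SQL question. Have all the posts in the added table been removed? Is reviewer_id a separate kind of ID, or can it be matched to user_id? Does a post being removed confirm that it really was problematic (i.e. if a post is reported as spam and is removed, then it's actual spam)? I would probably clarify these questions with the interviewer before starting on the problem.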

Addendum (2018-10-14 06:36):
Sorry, the SQL was missing the GROUP BY part:
group by
action,
extra

cq999999 · 2018-11-11 23:35

Thanks for sharing

cf36987 · 2018-11-11 23:36

Everything in reviewer_removals is removed. Reviewers are FB employees, so reviewer_id doesn't matter.
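
Under that clarification, one possible reading of Q2 can be sketched as below. This is a hedged illustration, not the interviewer's expected answer: it assumes a post counts as spam whenever it appears in reviewer_removals (on any date), it counts view events rather than distinct posts, and it uses SQLite through Python's sqlite3 module with made-up rows just so the query can be run. A stricter reading might also require a report with extra = 'spam'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_actions (ds TEXT, user_id INTEGER, post_id INTEGER, action TEXT, extra TEXT);
CREATE TABLE reviewer_removals (ds TEXT, reviewer_id INTEGER, post_id INTEGER);
""")

# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO user_actions VALUES (?, ?, ?, ?, ?)",
    [
        ("2018-11-10", 1, 1, "view", None),
        ("2018-11-10", 1, 2, "view", None),
        ("2018-11-10", 2, 3, "view", None),
        ("2018-11-10", 2, 4, "view", None),
        ("2018-11-10", 2, 2, "report", "spam"),
        ("2018-11-11", 3, 2, "view", None),
        ("2018-11-11", 3, 5, "view", None),
    ],
)
conn.executemany(
    "INSERT INTO reviewer_removals VALUES (?, ?, ?)",
    [
        ("2018-11-11", 100, 2),
        ("2018-11-12", 101, 2),  # same post removed twice; must not double-count views
    ],
)

# ASSUMPTION: a viewed post is spam if it was ever removed by a reviewer.
QUERY = """
SELECT
    v.ds,
    100.0 * SUM(CASE WHEN r.post_id IS NOT NULL THEN 1 ELSE 0 END) / COUNT(*) AS pct_spam_views
FROM user_actions AS v
LEFT JOIN (SELECT DISTINCT post_id FROM reviewer_removals) AS r
    ON v.post_id = r.post_id
WHERE v.action = 'view'
GROUP BY v.ds
ORDER BY v.ds
"""

rows = conn.execute(QUERY).fetchall()
for ds, pct in rows:
    print(ds, pct)
```

The subquery with DISTINCT is there so that a post removed more than once does not duplicate its view rows in the join. The removal date is ignored on purpose, since the question says it does not matter.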

stn5755 · 2018-11-11 23:38

Thanks for sharing, OP! Could you explain what each attribute in the two SQL tables means?

-- Table: user_actions
-- ds (date, String) | user_id | post_id | action ('view', 'like', 'reaction', 'comment', 'report', 'reshare') | extra (extra reason for the action, e.g. 'love', 'spam', 'nudity')
Is post_id a post in user_id's News Feed? In other words, not something this user posted, but something posted by someone else. And how is action = 'view' counted: does the user need to click into the post for it to count as a view?
Does every action have an extra reason? Could a user report a post without giving a reason?

-- Q2: Introduce a new table, reviewer_removals. Calculate what percent of the daily content that users view on FB is actually spam.
-- No need to consider whether the removal happened on the same date as the post.
-- ds (date, String) | reviewer_id | post_id
I don't understand the relationship between this table and the previous one. For example, if user1 reports post1 as spam and post1 is then reviewed by an FB employee, does post1 show up in the second table? Or does post1 only appear in the second table after it has been confirmed as spam?

Thanks a lot!

stn5755 · 2018-11-11 23:38

I'd also like to ask the OP: what formula is used for the A/B test sample size n? I usually just use an online calculator, where you only need to enter alpha, beta, and the change. Thanks!
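
For reference, a commonly used textbook approximation for the per-group sample size when comparing two means is n = 2 * (z_(1-alpha/2) + z_(1-beta))^2 * sigma^2 / delta^2, where sigma is the metric's standard deviation and delta is the minimum change to detect. The thread does not say which formula the OP had in mind, so the sketch below is only one plausible version; the function name and the example inputs are hypothetical.

```python
import math
from statistics import NormalDist


def min_sample_size_per_group(sigma, delta, alpha=0.05, power=0.80):
    """Approximate users needed in EACH group for a two-sided test on means.

    sigma: assumed standard deviation of the metric (same in both groups)
    delta: minimum difference in means worth detecting
    alpha: significance level (type I error rate)
    power: 1 - beta, where beta is the type II error rate
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = NormalDist().inv_cdf(power)           # quantile matching the desired power
    n = 2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2
    return math.ceil(n)


# Example: detect a change of 0.1 standard deviations at alpha = 0.05 with 80% power.
n_per_group = min_sample_size_per_group(sigma=1.0, delta=0.1)
print(n_per_group)
```

The alpha, beta, and change inputs of an online calculator map onto `alpha`, `1 - power`, and `delta` here. Halving `delta` roughly quadruples the required sample size, which is why the minimum detectable change matters so much.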

mqm · 2018-11-11 23:39

So many product sense questions…

cf36987 · 2018-11-11 23:40

Yeah, haha. The SQL part lasted only 5-10 minutes, so the rest of the 40 minutes was all product and basic stats questions.