Expedia 2019 Summer Intern: Successful Interview Experience

Got the OA through the GHC database.
Interviewed for both SDE and DS.

SDE OA

1. An OOP question: Dog/Cat/Duck classes extend Animal

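2. Unique Array by Increment

[2,2,4,5] → [2,3,4,5]. The input is an array. Each step you may pick any element and add 1 to it, until every element in the array is unique. Find the minimum possible sum of all numbers in the resulting unique array.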
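One way to approach question 2 is a greedy pass over the sorted array: each element is raised only as far as needed to clear the previous one. This is a minimal sketch, not the official solution; the function name `min_unique_sum` is made up for illustration.

```python
def min_unique_sum(nums):
    """Smallest sum reachable by incrementing elements until all are distinct."""
    total = 0
    next_free = None  # smallest value the next element is allowed to take
    for value in sorted(nums):
        # Bump a duplicate (or too-small value) up to the next free slot.
        if next_free is not None and value < next_free:
            value = next_free
        total += value
        next_free = value + 1
    return total


print(min_unique_sum([2, 2, 4, 5]))  # 14, i.e. 2 + 3 + 4 + 5
```

Sorting dominates the cost, so this runs in O(n log n) time.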

3. Ticket sellers

DS OA

A 10-question digital OA

1. The model does well on training data but badly on testing data. What to do? (overfitting)

1) Increase the number of training samples (a small training set overfits easily)

2) Increase the regularization coefficient

3) Reduce the number of features

4) Reduce polynomial features
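To illustrate option 2, here is a toy sketch of L2 (ridge) regularization on a one-feature model with no intercept. The data points and the name `ridge_slope` are made up for illustration. The point is only that a larger coefficient `lam` shrinks the fitted weight toward zero, which makes the model less able to chase noise.

```python
def ridge_slope(xs, ys, lam):
    """Closed-form slope minimizing sum((y - w*x)**2) + lam * w**2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


# Hypothetical toy data, roughly y = 2x with a little noise.
xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]

for lam in (0.0, 1.0, 10.0):
    # The slope shrinks as the regularization coefficient grows.
    print(lam, ridge_slope(xs, ys, lam))
```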

2. Explain ensemble algorithms

Bagging: Random Forest; models trained in parallel, independently

Boosting: AdaBoost; models trained serially

Majority Voting, Weighted Voting

Simple Average, Weighted Average

Make the training samples of each individual model as diverse as possible, and likewise diversify the input features of each individual model. Sometimes the individual models are also given different weights.
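The four combination rules named above can be sketched in a few lines. This is only an illustration of how the outputs of already-trained models might be merged; the function names and example labels are assumptions, not part of the OA.

```python
from collections import Counter, defaultdict


def majority_vote(labels):
    """Return the class label predicted by the most models."""
    return Counter(labels).most_common(1)[0][0]


def weighted_vote(labels, weights):
    """Return the class label with the largest total model weight."""
    scores = defaultdict(float)
    for label, weight in zip(labels, weights):
        scores[label] += weight
    return max(scores, key=scores.get)


def simple_average(preds):
    """Plain mean of numeric predictions (regression or probabilities)."""
    return sum(preds) / len(preds)


def weighted_average(preds, weights):
    """Mean of numeric predictions, each scaled by its model's weight."""
    return sum(p * w for p, w in zip(preds, weights)) / sum(weights)


print(majority_vote(["spam", "ham", "spam"]))                   # spam
print(weighted_vote(["spam", "ham", "ham"], [0.7, 0.2, 0.2]))   # spam
print(weighted_average([1.0, 3.0], [1.0, 3.0]))                 # 2.5
```

Note that weighted voting can overturn a plain majority when one model carries much more weight, as in the second call.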

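3. A SQL question

4. NLP and Deep Learning packages and tools you are familiar with

5. Data Science packages you are familiar with

6. Select your level of familiarity with Python

7. Select your level of familiarity with R

8. Which of the following is not a technique used in NLP? (LGA, N-gram)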
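For reference on question 8, an N-gram is just a sliding window of n consecutive tokens. A minimal sketch, assuming simple whitespace tokenization (the helper name `ngrams` and the sample sentence are illustrative):

```python
def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    if n <= 0:
        raise ValueError("n must be positive")
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = "the hotel was great".split()
print(ngrams(tokens, 2))
# [('the', 'hotel'), ('hotel', 'was'), ('was', 'great')]
```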

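Onsite: two candidates per group, coding and answering questions together
SDE Onsite:
Five scenario questions asking which data structure you would use for each. Fairly simple; they covered 2d-array, prefix-tree, and so on
One whiteboard question: LeetCode 20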
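Assuming the whiteboard question was LeetCode problem 20 (Valid Parentheses), the usual approach is a stack: push every opening bracket and check each closing bracket against the top. A minimal sketch, assuming the input contains only bracket characters as in the LeetCode statement:

```python
def is_valid(s):
    """Return True if every bracket in s is closed in the correct order."""
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []
    for ch in s:
        if ch in pairs:
            # A closer must match the most recent unclosed opener.
            if not stack or stack.pop() != pairs[ch]:
                return False
        else:
            stack.append(ch)
    # Leftover openers mean something was never closed.
    return not stack


print(is_valid("()[]{}"))  # True
print(is_valid("(]"))      # False
```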

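DS Onsite:
Went through two projects, solving the problems step by step: 1. Spam Detection; 2. Priority Ranking
The interviewer works on NLP, and the candidate paired with me was a third-year PhD student. I had never done NLP before, so it was very stressful
Questions that came up along the way: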

Sequence labeling

How to fix overfitting

How to process/store data

MapReduce

How to extract features

All kinds of techniques you would use to obtain and store data
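For the MapReduce topic in the list above, the classic warm-up is word count. The sketch below imitates the map, shuffle, and reduce phases in a single process, just to show the data flow. It is not how a real cluster job is written, the post does not say what was actually asked, and all function names and sample messages are made up.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework would do."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


print(word_count(["free offer now", "Free trip"]))
```

Counts like these could serve as simple bag-of-words features for something like the spam detection project.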

Received the offer in mid-October with a two-week deadline; already declined