Dropbox Onsite

Scarlet · 2018 年11 月 12 日 06:38

補發 onsite 15 Mar 2018
感覺是備胎了二星期，後來也跪了。。。。

1 Sharpness Value
2 Max island size
Follow up: what if we can flip some cell, return the current max number of island size after the flip…
Union find ba… check leetcode, number of islands series
3. Web Crawler

面經有提這，但一直没人詳細講這題，面完終於清楚問題及他expect multi threading 怎樣，打算之後寫個blog 關於這給出Solution。現在還未有空寫
如果想要solution 請，及聯系我

补充内容 (2018-9-14 22:41):
https://paper.dropbox.com/doc/In … 7npLjRBjrAKA5AOBsaM

其他準備或答案在此，不用找我了
也歡迎在文件中留言或更正。

Scarlet · 2018 年11 月 12 日 06:44

當時掛了太傷心，好像忘了補貼這個，那個是Detailed Prep &Solutions：

Q1: FindSharpness Val

The same question but change the way it asks… into water level, land elevation.
But, my interview seems not very familiar with the solution, I need to explain him the dp. and the logics all these things…
the i, j , all these indexing make it hard to understand as well…
I make one logic bug and find out during testing… it is I switch the two nested loop
I made wrong…
for(int i=0; i<m…)
for(int j=1; j<n; …)
This is wrong, the outer loop should be the col loop as we move from left to right…

Talk a bit about my proposal on optimised space usage… just store one prev column and use one var called prev…
There is no ask on what if the input is very large etc…

Problem in details:
Let say, I have a water level x;

in the matrix, left to right, find a path that is not get flooded by the water…(means the height/elevation that cell represents should be > x)///
well, in the path, we pick the smallest val, and that smallest val has to be >x…
Then we find the path which give that best value… so we can assue there is always a path if x is not greater or equal to that value…
(same problem as sharpness val then)

No follow up… the space optimised is suggested by me…

Lunch
Demo

Q2: Find the largest island…

Just leetcode question,
I got one small bug regarding visited marking,… I didn’'t mark the starting point as visited… find it out during testing and corrected.

Follow up:

What if we start with board with all 0… and there is fun can flip the cell (i, j) value;
each time after the function call, return the current largest island…
like leetcode addLand… num of island II

//no need to write exact working code…
//explained Union find … and
// assumed api for find set… and set set … etc…
just write a piece of logic…

Ask about the running time…

I gave some explanation over the worst case of UF, and path compression and weighting etc… saying hard to prove the running time … it is nlog*n after optimised etc

Q3: Web Crawler

First, single thread version…
Talked a bit about DFS, BFS, stack overflow… endless depth etc problem…
Implemented simple BFS, using result set to do visisted checking as well

Multithreading:
Which part is bottle neck: the getURL part… network latency etc… parser etc
I mentioned Master / slave model, and how it can be achieved using ThreadPool, Callable, and check futures…
and metioned it is even more efficient if we don’'t do the syn of these queue, set checking logic just let master manage all these… he seems like this…

Then, he asked, in some language/machine, there is no support of threadPool, how would we do it …
He gave an API… to let you avoid writing some setup code///
setThread(THREAD_COUNT, method)…
implement the method… actually it is the BFS method… but with right locking

There is some back and forth about where to put lock, and
also asked the terminating conditions… (queue is empty and no working thread)

public void crawl() {
        while(true){ //this simplify my code for locking / unlock
                lock.lock();
                if(q.isEmpty() &amp;&amp; workingThread<1) {
                        lock.unlock();
                        break;
                }

                //need to use while to guard it
                while(q.isEmpty()) {//queue is empty but cannot terminate yet
                        wait(); //wait will release the lock...
                }
                String currURL = q.poll();
                lock.unlock();

                //you don''t want to lock the part that takes most time and that part is not accessing common data
                String newURLGathered = getURL(currURL);

                lock.lock();
                for(each){
                        add to q &amp; add to result set
                }
                notifyAll(); //in case there is someone waiting due to empty queue.
                lock.unlock();

        }  

}

Scarlet · 2018 年11 月 12 日 06:45

他假設有這樣的一個API，你只需要implement 那method，
setThread(THREAD_COUNT, method)…

ad1586 · 2018 年11 月 12 日 06:46

求solution wang2579@umn.edu

Scarlet · 2018 年11 月 12 日 06:47

sent you

bm18369 · 2018 年11 月 12 日 06:47

多谢楼主分享求crawler的solution 下周也要面这家 crawler没写过有点不稳 yuzhoul dot eecs @ gmail.com

Scarlet · 2018 年11 月 12 日 06:48

發你了，

stn5755 · 2018 年11 月 12 日 06:49

楼主求一发crawler的solution谢谢！~

Scarlet · 2018 年11 月 12 日 06:51

sent you

Scarlet · 2018 年11 月 12 日 06:51

sent you

ppstudy · 2018 年11 月 12 日 06:54

vet65hao@126.com 处于兴趣求一份答案感谢

song · 2018 年11 月 12 日 07:24

求solution. 多谢 raloxifene@outlook.com

lichenga2404 · 2018 年11 月 12 日 07:56

求链接access approval

mymail.rai · 2020 年8 月 10 日 07:41

Could you send/share web crawler multithreaded solution to mymail.rai@gmail.com?
-Thanks

mymail.rai · 2020 年8 月 11 日 04:49

Hi, Could you share with me the crawler (multithreaded) code at mymail.rai@gmail.com

-Thanks

nyd · 2020 年12 月 1 日 07:14

求solution 谢谢 nydkim@gmail.com