Dropbox 三番上门

职位是 Sr. Software Engineer

店面 :

  • https://leetcode.com/problems/find-duplicate-file-in-system/
    Code was not supposed to run/compile but it was expected to write whole code and making sure that it would run.
    I had to ask for different api like listFiles(), openFile(), isDirectory(), SHA256(file).
    I wrote the solution where you dont store the content of the file in HashMap, you store hashCode[SHA256] of the file.Followup 1:
    1 What would take more time in this algorith?
    Ans: Reading the whole file and calcluating hash of that file.
    2 How will you modify your solution to improve this?
    Ans: You can build a map on size of the files Map<Integer, Set> (<size,filepath>) those are candidates that could be the same.
    Size that has only one file can be removed without calculating SHA256() of the file.
    If there are more than 2 files, compute SHA256() of every file.
    I had to write this code too. I just modified the solution I gave before.Followup 2:
    Files with the same size have the same hash, how do you make sure those are duplicates?
    There is no way you have to compare those files byte by byte.

Onsite :

Round 1: 1hr

  • System Design : Logging system where all dropbox teams would call this api and log the event/something.
    The log will be list of key value pairs from differnet teams.
    Later on all these teams can query on logs saying give all the entries where key=value. You have to return team specific results.
    Every log can have different keys and values.
    For eg:
    team1 entry 1 : { id : “user1”, key1 : 123, key2:12.67 }
    team2 entry 1 : { userguid : “HAK8916”, key22 : 123.34}

Round 2: 1hr

  • You have token bucket initialized with TokenBucket(int tokens, int refreshrate, int capaciy);
    You have initial tokens then for every second refreshrate tokens are added and the capacity is max tokens a bucket can have.
    Write a void get(int amount) multithreaded function for TokenBucket class.
    I said we cant have anotehre method that system thread can call and fill up bucket but interviewer said, as it is expensive to have system thread we dont have that.
    So the thread which calls get() also suppose to fill tokens in the bucket.
    Another way to look at it : The problem is of producer and consumer but same thread consumes and produces via one get() method.
    Following is sample code that sort of does it but it can definitely be improved.
Class TokenBucket{
  float tokens;
  float capacity;
  float refreshrate;
  int timestamp;
  Object o = new new Object():
  TokenBucket(int tokens, int refreshrate, int capaciy){
    this.tokens = tokens;
    this.refreshrate = refreshrate/1000; (convert refresh rate from sec to miliseconds)
    this.capaciy = capaciy;
    timestamp = System.getCurrentMilisec();
  void get(int amount){
        int diff = amount - tokens;
        // calculate how long it should wait in miliseconds to get additional diff tokens 
        // OR calculate tokens to be refrehed here and then wait 
        // when thread wakes up fill up the tokens 
        int curTime =  System.getCureentMilisec();
        int timePassed = curTime-timestamp;
        float addTokens = timepassed*refreshrate;
        tokens = tokens+addTokens > capaciy ? capacity : tokens+addTokens ;
        timestamp = curTime;
  • I am sure there is better solution. Please consider helping community by commenting your approach and ans for this one.

Round 3 : 1hr

  • It was an hr lunch with one of the members. No technical question but work related talk.

Round 4 : 1hr

  • Id allocator. Capacity 0-N.
    int Allocate() : returns avilable ID within 0-N range.
    int Release(int ID) : releases given ID and that ID becomes available for allocation.
    Write O(1) solution. After doing that the result was space (N).
  • I had solved this one so gave the solution of O(1) where you don’t inititializa with N numbers in the queue.
    Total bytes = N * 4(int ID) + overhead of set + overhead of queue.
  • Now interviewer asked:
    We want to save space we dont care about runtime time complexisty how would you save space?
    Here I allocated N bits in byte array and every bit represents the ID from 0-N. (didn’t have to code)
    bit value 0 represents avilable and 1 represent taken. The index/place of the bit in byte array represents int ID.
    The time complexity became O(N) because we have to traverse all byte array bit by bit
    and figure out which bit is availale and return the index/place of that bit in byte array.
  • Interviewer then asked :
    Can we add some more space and improve rumtime instead of O(N).
    It was back and forth with some approaches then I came up with complete binary tree aaproach.
    node will represnet the range of bits .
    – ROOT node (int val, range 0-N)
    – ROOT.left (int val, range 0 - mid)
    – ROOT.right (int val, range mid+1 - N)
    If the value of the node is 1 then that whole range is taken (IDs are allocated)
    If the value of the node is 0 then there is a zero in the Node’s range.
    Then leaf node can contain the actual byte array.
    and then you do recursive traversal to left/right to find zero bit.
    The time complexity becomes O(logN)
    Space - the height is logN of the tree so :
    Node memory : 2^logN * ( 4 (bytes left pointer) + 4 (bytes right) +
    4 (bytes val) + 8 ( bytes for range) )
    Later I realized that node could represent one BYTE instead of every bit.
    so byte array can be presented in complete binary tree form.
    It was already 1 hr so we didnt discuss this approach further.

Round 5 : 1hr (Bar Raiser)

  • Deep dive into the projects that I did.
  • BTW, You have to give paragraph to recruiter before onsite about what project you gonna talk about in deep dive interview.
  • We talked about why the was project needed, DB design, UI, backend and timeline of the projects.

Round 6 : 1hr

  • It was behavioral question for an hr.

Round 7 : 30 min (Manager who sponsored you for onsite)

  • Question and Answer with manager.
  • This is not considered as an interview as such.
  • General discussion about the projects that manager is handling.