2021/08/16~2021/09/15【Back-End Software Engineer】未錄取
由獵頭提供,詳見 JD。
朋友的朋友在該公司本薪 1.8m/annu,因待遇佳且情報可靠而決定投遞。
此關為純技術面試,題目只有一題 Anagram 題型,由面試官一邊在線上編輯器打出題目,一邊口述講解。由於無法在網路上找到完全相同的題目原型,以下的題目說明僅憑筆者個人理解進行記錄。
當字串 與 存在相同的字母集合,且每個字母出現次數相同,稱 與 互為 Anagram。此題目標是從輸入字串 中尋找合法字謎,並除去無法匹配到任何字謎的字串,不必在意答案順序。
def find_anagrams(strings: List[str]) -> List[str]: pass
assert set(find_anagrams(['aabc', 'abca', 'xyz'])) == set(['aabc', 'abca'])assert set(find_anagrams(['aabc', '123', 'abca', 'xyz', 'zyx'])) == set(['aabc', 'abca', 'xyz', 'zyx'])
相似題目可參考 LeetCode 49. Group Anagramsopen_in_new。
此關為純技術面試,題目是一系列與日常工作相關的題組,由面試官一邊在線上編輯器打出題目,一邊口述講解。
"""Input: Infinite stream of float latenciesOutput: p25, p50, and p75 latencies
Latency is within [0, 10k]"""
def get_latencies(): """ Yields: int """ for r in [10, 20, 30, 40, 50]: yield r
def print_percentiles(stream: iter(int)): """ Args: stream: iter(int) Yields: tuple(int,int,int) """ # 10 yield (10, 10, 10) # 10, 20 yield (10, 10, 20) # or (10, 20, 20) # 10, 20, 30 yield (10, 10, 20) # or (10, 20, 20) # 10, 20, 30, 40 yield (20, 20, 30) # or (20, 30, 30) # 10, 20, 30, 40, 50 yield (20, 30, 40)
stream = get_latencies()results = print_percentiles(stream)for i, r in enumerate(results): print(i,r)
此關為最後一關技術面試,前半段口頭詢問 Technical Questions,後半段進行 Live Coding 測驗。
Process vs. Thread
假設有一個 Worker,你會使用 Process 還是 Thread 來實作?
題目由面試官事前準備好直接在線上編輯器貼上題目,一邊口述講解。功能實作其實不算太難,但是此題測驗對於極值與容錯的處理有十分嚴格的要求,原始題目詳見下列。
Given a JSON blob, implement the function that retrieves a single value for the path given as a string:
config = { a: 1, b: { c: { d: 2 }, e: { f: { h: 3 } } }, h: ['a', 'b', 'c']};getByPath(config, '$.b.e'); // { f: { h: 3 } }getByPath(config, '$.h.2'); // 'c'
def string_to_pattern(s): count_map = {} for c in s: if c not in count_map: count_map[c] = 0 count_map[c] += 1 sorted_letters = sorted(count_map.keys()) pattern = ''.join([f'{c}{count_map[c]}' for c in sorted_letters]) return pattern
def find_anagrams(strings): pattern_map = {} for s in strings: pattern = string_to_pattern(s) if pattern not in pattern_map: pattern_map[pattern] = [] pattern_map[pattern].append(s) return [v for values in pattern_map.values() if len(values) > 1 for v in values]
一開始題目沒有限制 latency 的範圍,且型態為 float,因此先嘗試最 Naive 的寫法,使用 priority queue 來維護所有看過的數字。後來面試官加入 的限制,並且型態改成 integer,就能用 counting map 在有限空間及常數時間內完成計算了。
def print_percentiles(stream): total_count = 0 count_map = [0 for _ in range(10000)]
def get_percentile_latency(p: float): k = int(total_count * p) # k-th acc_count = 0
for i in range(len(count_map)): acc_count += count_map[i] if acc_count > k: break return i + 1
for latency in stream: total_count += 1 count_map[latency - 1] += 1 yield ( get_percentile_latency(0.25), get_percentile_latency(0.5), get_percentile_latency(0.75), )
接著 followup 是詢問當 latency 的範圍不受控的情況,latency 型態仍為 integer,此時只需要將使用的資料結構更改為 sorted map 即可。
def print_percentiles(stream): total_count = 0 count_map = {} # sorted map
def get_percentile_latency(p: float): k = int(total_count * p) # k-th acc_count = 0
for latency in count_map.keys(): acc_count += count_map[latency] if acc_count > k: break return latency
for latency in stream: total_count += 1 if latency not in count_map: count_map[latency] = 0 count_map[latency] += 1 yield ( get_percentile_latency(0.25), get_percentile_latency(0.5), get_percentile_latency(0.75), )
過程中也有被詢問時間複雜度,但都不會太難回答。
功能性本身在數分鐘內即可完成實作,但是後續花費非常多的時間與面試官來回討論各種極端情況與錯誤處理,最終在時間限制內得到以下版本:
def getByPath(config, raw_path): paths = raw_path.split('.')
if len(paths) == 0: print('no path') return None elif paths[0] != '$': print('no root') return None
value = config paths = paths[1:] while len(paths) > 0: path = paths[0] paths.pop(0)
if isinstance(value, bool) or isinstance(value, int) or isinstance(value, float) or isinstance(value, str): # print('not found') return None elif isinstance(value, list): index = int(path) if index < len(value): value = value[index] else: print('index out of bound') return None elif isinstance(value, object): if path in value: value = value[path] else: print('key not found') return None else: # print('not found') return None
return value
print(getByPath({ 'a': { 'b': 'c' }}, '$.a.b')) # c
print(getByPath({ 'a': { 'b': 'c' }}, '$.a.b.c')) # None
print(getByPath({ 'd': [True, False]}, '$.d.1')) # False
print(getByPath({ 'd': [True, False]}, '$.d.1.2')) # None
print(getByPath({ 'd': 'abc'}, '$.d.2')) # None
此題目有被追問 Follow-up:路徑裡有該怎麼辦?具體例子如下:
getByPath({ 'a': { 'b': 'c', 'd': 1 }}, '$.a.*')) # ['c', 1]
由於時間已不太充足,當下僅口述並指出哪部分程式碼需要進行修改,沒有完整實作。