Interview Notes: Buyandship (香港商易購全球有限公司台灣分公司)

2023/02/07~2023/02/24 [Senior Backend Software Engineer] Not hired

Job Description

See the listing on Yourator.

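From what I learned in the first-round interview, Buyandship previously acquired a company that specialized in a Path Comparison System and now plans to hire a new team to rewrite that system. Salary is guaranteed at 13 months per year, and they expect to hire 1~2 BE.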

Company Overview

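The development team in the Taiwan office currently consists of 2 FE, 2 App, 2 BE, and 2 QA, while Hong Kong has 3 BE and 1 Data Engineer. Management is split into Product Management and Project Management, and each sprint lasts one week. Financially, the company has already reached break-even.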

Pre-interview Test

See the original questions.

First-round Online Interview (HR: Una Chen, CTO: Tsz Ming Wong)

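The interview opened with the expected salary that had been discussed beforehand; I indicated that I could accept 5%~10% of the offer as stock options.

The CTO then started asking all sorts of questions at random, alternating between technical and non-technical ones: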

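  1. Following up on the test question: if the network between the MQ and the API Server also goes down, how would you handle it?

    I did not quite understand the question at the time, so I just answered based on my own understanding. My answers ranged from HA MQ and DB replicas all the way to implementing a WAL with a local file, but in the end I still had not understood the question XD.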

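  2. What is the most challenging technical achievement you have accomplished?

  3. What is your career path?

  4. What do you think is the difference between your undergraduate and master's studies?

  5. When did you start programming?

  6. How would you choose between Python and Node.js?

  7. What is your past experience with monorepos?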
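For question 1, the "local file as a WAL" direction can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not something that was discussed in the interview: `LocalWAL`, `publish_or_buffer`, and the `publish` callback are hypothetical names, and the MQ client is assumed to raise `ConnectionError` when the broker is unreachable.

```python
import json
import os


class LocalWAL:
    """Append-only local log that buffers events while the MQ is unreachable."""

    def __init__(self, path):
        self.path = path

    def append(self, event):
        # One JSON document per line; fsync so the record survives a crash.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(event) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def replay(self, publish):
        """Publish buffered events in order; keep the ones that still fail."""
        if not os.path.exists(self.path):
            return 0
        with open(self.path, encoding="utf-8") as f:
            events = [json.loads(line) for line in f if line.strip()]
        sent = 0
        for event in events:
            try:
                publish(event)
            except ConnectionError:
                break  # MQ is still down: stop here and retry later
            sent += 1
        # Atomically rewrite the log with the events not published yet.
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            for event in events[sent:]:
                f.write(json.dumps(event) + "\n")
        os.replace(tmp, self.path)
        return sent


def publish_or_buffer(event, publish, wal):
    """Try the MQ first; fall back to the local log on a connection error."""
    try:
        publish(event)
        return True
    except ConnectionError:
        wal.append(event)
        return False
```

A periodic job on the API Server would call `replay` until the log is empty. Note that this sketch does not preserve ordering between buffered events and events published directly after the link recovers; a stricter variant would route every event through the log.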

Finally, the CTO mentioned that I was the first candidate, so it would take roughly 3 weeks before they could give me an answer.

Application Result and Timeline

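  • 2023/02/07 Applied online via Yourator
  • 2023/02/09 Invited to the pre-interview test
  • 2023/02/10 Invited to the interview
  • 2023/02/13 First-round online interview
  • 2023/02/23 Thank-you letter (rejection)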

Appendix: Reference Answers to the Pre-interview Test

  1. Background

    a. List some of your favourite software/tools/SaaS to be used during development (e.g. Mac/Window/Ubuntu, VS Code/Vim/IntelliJ IDEA, GitHub/Bitbucket, Vercel, etc.).

    • OS: Mac
    • Editor: VS Code
    • VCS: Git
    • UML: Draw.io
    • Code Review: GitHub, GitLab, Bitbucket
    • CI/CD: CircleCI, GitLab CI
    • Language/Library/Framework: Python, FastAPI, Flask, SQLAlchemy, Alembic, Node.js, React.js, Next.js, Sequelize
    • Database: PostgreSQL, DBeaver
    • MQ: RabbitMQ
    • Monitoring/Logging: Prometheus, Grafana
    • Container Orchestration: Docker Compose, Kubernetes
    • IaaS: GCP, AWS, OCI (Oracle Cloud)
    • SaaS: Vercel, Heroku, Netlify, CloudAMQP, Supabase, ElephantSQL, Airtable, Sentry, SendGrid, Twilio, Stripe, ECPay, Retool, Grafana Cloud
    • Project Management/Issue Tracking: Jira, ClickUp, Notion
    • Documentation: Notion, Confluence
    • Design Spec: Figma
    • Communication: Slack, Gmail, Google Meet

    b. Describe a workflow for deploying code from development into production system.

    1. [Local] Create a new feature/bug-fix branch.
    2. [Local] Edit and commit the code.
    3. [Local] Push local branch to remote.
    4. [CI] Lint.
    5. [CI] Build image for current feature branch.
    6. [CI] Test the built image.
    7. [PR/MR] Create a pull/merge request to ask peers to review.
    8. [PR/MR] Once the PR/MR is accepted, the following steps start automatically or manually.
    9. [CI] Tag the built image with a unique version number to release the accepted code change.
    10. [CI] Trigger the deployment of the released image to environments such as dev, staging, SIT, and UAT, depending on the context (QA, alpha release, beta release, etc.). During the deployment, database schemas may be upgraded and services may go through a rolling update.
    11. [CI] Once testing passes, trigger the deployment of the released image to production.
    12. [CI] Clean up caches and stale artifacts.

    c. What would be the top 3 things you would check when doing a code review for a junior developer?

    1. Functionality

      First of all, I make sure the implementation matches the requirements. I pay special attention to logical and runtime errors, because they cannot be caught statically at build time. For example: Is the operation transactional? Are there edge cases that could crash the service?

    2. Performance

      Then I analyze both time complexity and space complexity, and check whether a better solution or a better trade-off between time and space exists. Classic issues include N+1 queries, full-table scans, etc.

    3. Scalability

      The third thing I check is whether the current implementation is easy to scale. Does the coding style conform to the project's style guide? Is the code readable and easy to understand? Is the interface open for extension but closed for modification? Will this implementation cause side effects in a distributed system?
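To make the N+1 query issue from the Performance point concrete, here is a small sketch using an in-memory SQLite database. The `authors` table and both function names are hypothetical and exist only for illustration; `posts` mirrors the table used in the Design question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO posts VALUES (1, 1, 'a1'), (2, 1, 'a2'), (3, 2, 'b1');
""")


def titles_n_plus_one(conn):
    """1 query for the authors plus 1 query per author (N+1 round trips)."""
    result = {}
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id",
            (author_id,),
        ).fetchall()
        result[name] = [title for (title,) in rows]
    return result


def titles_single_join(conn):
    """A single JOIN returns the same data in one round trip."""
    result = {}
    rows = conn.execute(
        "SELECT a.name, p.title FROM authors a "
        "LEFT JOIN posts p ON p.author_id = a.id "
        "ORDER BY a.id, p.id"
    )
    for name, title in rows:
        titles = result.setdefault(name, [])
        if title is not None:  # authors without posts keep an empty list
            titles.append(title)
    return result
```

Both functions return the same mapping; the difference a reviewer would flag is that the first one issues a number of queries that grows with the number of authors.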

  2. Design

    Suppose you have an API service. The single source of truth for this system is a relational database (MySQL). For certain queries, you want to optimise query performance using a full-text search database, i.e. Elasticsearch.
    The API server and MySQL are located in the same datacentre, but for Elasticsearch we use a hosted service (elastic.co), so network failures across datacentres need to be considered.
    The current implementation is as follows: when some fields are updated in MySQL (e.g. title of a posts table), the API synchronously duplicates the write to Elasticsearch, so when there is a network issue the data in MySQL and Elasticsearch becomes inconsistent.

    try:
        # Update MySQL
        cursor.execute("UPDATE posts SET title = 'test' WHERE id = 1")
        db.commit()
        # Update ES, when network issue happen the data will become inconsistent
        res = es.index(index="posts-index", id=1, document={'title': 'test'})
    except Exception as err:
        print("Error when updating title")

    If we allow the data update to Elasticsearch to become async with a max delay of 1 to 2 mins, but we want the recovery process (re-sync) to happen fully automatically when network connectivity between data centres is restored, can you propose a design that supports this requirement and illustrate it with a diagram (no code required)?

    To support this requirement, I propose introducing a "Message Queue" and a "Worker Process" into the system. The system then comprises four components:

    1. API Server & MySQL

      Once an update has been committed to MySQL, the API Server publishes an event to the Message Queue instead of updating Elasticsearch directly.

    2. Message Queue

      This component needs no special behavior. Its role is to buffer all updates during network failures.

    3. Worker Process

      The Worker Process consumes events from the Message Queue. By reducing each event or batch of events, it prepares a snapshot of each affected post and then syncs these snapshots to Elasticsearch.

    4. Elasticsearch

      Nothing has to change in this component.

    Because the Message Queue is durable, update events survive a network outage and can be applied once connectivity is restored. The following diagram illustrates the design:

    +------------+                    +---------+
    |            |  1. Commit update  |         |
    | API Server | -----------------> |  MySQL  |
    |            |                    |         |
    +------------+                    +---------+
          |
          | 2. Publish update event
          |
          V
    +---------------+
    |               |
    | Message Queue |
    |               |
    +---------------+
          |
          | 3. Process messages
          |
          V
    +----------------+
    |                |
    | Worker Process |
    |                |
    +----------------+
          |
          | 4. Sync snapshots
          |
          V
    +---------------+
    |               |
    | Elasticsearch |
    |               |
    +---------------+
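Although the question asks for no code, steps 3 and 4 of the diagram can be sketched in a few lines. This is a simplified, in-memory illustration under stated assumptions: a `deque` stands in for the durable Message Queue, `FlakySearchIndex` is a hypothetical stand-in for Elasticsearch that raises `ConnectionError` while the cross-datacentre link is down, and each event is assumed to carry the full set of indexed fields (here only `title`), so reducing a batch means keeping the latest event per post.

```python
from collections import deque


class FlakySearchIndex:
    """Hypothetical stand-in for Elasticsearch; fails while `online` is False."""

    def __init__(self):
        self.online = True
        self.docs = {}

    def index(self, doc_id, document):
        if not self.online:
            raise ConnectionError("cross-datacentre link is down")
        self.docs[doc_id] = document  # overwrite by id, so retries are idempotent


def reduce_to_snapshots(events):
    """Keep only the latest event per post id (last write wins)."""
    snapshots = {}
    for event in events:
        snapshots[event["id"]] = {"title": event["title"]}
    return snapshots


def run_worker_once(mq, search_index):
    """Process one batch; acknowledge it only after every snapshot is synced."""
    batch = list(mq)  # peek at the batch without removing anything
    if not batch:
        return 0
    try:
        for post_id, document in reduce_to_snapshots(batch).items():
            search_index.index(post_id, document)
    except ConnectionError:
        return 0  # nothing acknowledged: the same events are retried next run
    for _ in batch:
        mq.popleft()  # "ack" the events that were just synced
    return len(batch)
```

Running `run_worker_once` on a short interval (for example, every few seconds, well within the 1 to 2 minute budget) gives automatic recovery: while the link is down the events simply stay in the queue, and the first successful run after the link is restored syncs the accumulated snapshots.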