System design trends:

Failure-oriented

Error-tolerant

Any component can be killed at any time without impacting the others (microservice architecture)

How

The process of defining the architecture, components, modules, interfaces, and data for a system to satisfy specified requirements

General Process

Conceptual design: black-box transparency, macro level

Logical design: grey-box transparency

Physical design: white-box transparency, micro level

5 Steps for Cracking a Design (SNAKE)

Based on the example of Netflix

  • Scenario: ask about features / interfaces / DAU or QPS
    • Enumerate functions: register/login, play movie, movie recommendation
    • Sort top functions: play movie (get channels, get movies in a channel, play a movie in a channel)
    • Provide APIs: method names
  • Necessary: constraints / hypotheses
    • Ask for daily active users
    • Predict users
      • Average concurrent users = daily active users / seconds per day * average online time (in seconds)
      • Predicted peak users = 6 * average concurrent users
      • Predicted peak users in 3 months = predicted peak users * 2
    • Predict traffic (bandwidth usage)
      • Traffic per user = 3 Mbps
      • Max peak traffic = predicted peak users * traffic per user
    • Predict Memory
      • Memory per user: 10KB
      • Max daily memory = daily active users * memory per user
    • Predict storage
      • Total movies: 10,000
      • Average movie size = average length (90 to 120 min) * size per minute
      • Movie storage = total movies * average movie size
  • Application: split application / service / module / algorithm
    • Replay the case, add a service for each request
    • Merge the services
  • Kilobit: data
    • Append dataset for each request below a service
    • Choose storage types: MySQL, MongoDB, Files?
  • Evolve (improve or scale): sharding, optimize, special case
    • Analyze (consider some)
      • with Better: constraints
      • with Broader: new cases
      • with Deeper: details
      • from Performance
      • from Scalability
      • from Robustness (reliability)
    • Go back by evolving accordingly
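
The estimates in the "Necessary" step can be turned into a small calculator. This is only a sketch: the peak factor of 6, the 3-month growth factor of 2, 3 Mbps per user, 10 KB per user, and 10,000 movies come from the notes above, while the function name, the 105-minute average movie length (midpoint of 90 to 120 min), and all input values in the usage example are assumptions made for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def estimate_capacity(
    daily_active_users: int,
    avg_online_seconds: float,
    size_per_minute_mb: float,          # assumed input, not given in the notes
    avg_movie_minutes: float = 105,     # assumption: midpoint of 90-120 min
    total_movies: int = 10_000,
    traffic_per_user_mbps: float = 3.0,
    memory_per_user_kb: float = 10.0,
) -> dict:
    """Back-of-the-envelope numbers following the formulas in the notes."""
    avg_concurrent = daily_active_users / SECONDS_PER_DAY * avg_online_seconds
    peak_users = 6 * avg_concurrent
    return {
        "avg_concurrent_users": avg_concurrent,
        "peak_users": peak_users,
        "peak_users_3_months": 2 * peak_users,
        "max_peak_traffic_mbps": peak_users * traffic_per_user_mbps,
        "max_daily_memory_kb": daily_active_users * memory_per_user_kb,
        "movie_storage_mb": total_movies * avg_movie_minutes * size_per_minute_mb,
    }


# Hypothetical inputs: 864,000 DAU, 30 minutes online per day, 50 MB per minute.
estimate = estimate_capacity(864_000, 30 * 60, size_per_minute_mb=50)
```

With these made-up inputs the average concurrent users come out at 18,000 and the predicted peak at 108,000, which is the number that sizes the bandwidth estimate.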

Compare Pull/Push Model

Use Push Model

  • fewer resources
  • less coding
  • low real-time requirement
  • few user posts
  • bidirectional follower/following relationships, without superstars (similar to broadcast)

Use Pull Model

  • more resources
  • high real-time requirement
  • many user posts
  • unidirectional follower/following relationships, with superstars (similar to broadcast)

Pull Model (needs synchronous DB reads):

On every news feed request, the server queries and merges all related data, then returns the result

Get News Feed => n DB reads (n = number of followings), synchronous, so the user has to wait

Post Feed => 1 DB write

Process:

  • Request my news feed
  • Get followings
  • Get the feeds of every following (synchronous DB reads; the user has to wait)
  • Merge and return
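
The pull steps above can be sketched as follows. The dictionaries stand in for DB tables, and every name here (`FOLLOWINGS`, `POSTS`, `get_news_feed_pull`) is hypothetical; the sketch also assumes each user's posts are already stored newest first.

```python
import heapq
from itertools import islice

# Hypothetical in-memory stand-ins for DB tables.
FOLLOWINGS = {"alice": ["bob", "carol"]}
# Each user's posts as (timestamp, text), assumed to be stored newest first.
POSTS = {
    "bob": [(5, "bob-2"), (1, "bob-1")],
    "carol": [(4, "carol-2"), (2, "carol-1")],
}


def get_news_feed_pull(user, limit=3):
    """Pull model: one read per following, then a k-way merge by timestamp."""
    # n synchronous reads, one per followed user.
    timelines = [POSTS.get(f, []) for f in FOLLOWINGS.get(user, [])]
    # Inputs are sorted newest first, so merge in descending timestamp order.
    merged = heapq.merge(*timelines, key=lambda post: post[0], reverse=True)
    return list(islice(merged, limit))


feed = get_news_feed_pull("alice")
```

The cost sits entirely on the read path: the request cannot return until all n timelines have been fetched and merged.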

Optimize:

  • Cache each user's news feeds (for example, the latest 1000 feeds): n synchronous DB reads -> n cache reads
    • Trade-off: do not cache everything
  • Memcached QPS / MySQL QPS ≈ 100
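
One way to picture the cache optimization is a read-through cache that keeps only the latest 1000 feeds per user. This is a rough sketch, not a Memcached client: `TimelineCache` and the `db_read` callback are hypothetical names, and a plain dictionary plays the role of the cache.

```python
MAX_CACHED_FEEDS = 1000  # per-user cap taken from the note above


class TimelineCache:
    """Read-through cache holding at most MAX_CACHED_FEEDS feeds per user."""

    def __init__(self, db_read):
        # db_read is a hypothetical function: user -> feeds, newest first.
        self._db_read = db_read
        self._cache = {}
        self.db_reads = 0  # counts how often the slow path is taken

    def get(self, user):
        if user not in self._cache:
            # Cache miss: one DB read, then keep only the newest feeds.
            self.db_reads += 1
            self._cache[user] = self._db_read(user)[:MAX_CACHED_FEEDS]
        return self._cache[user]


# Hypothetical data: bob has 1500 feeds, numbered newest (1500) to oldest (1).
db = {"bob": list(range(1500, 0, -1))}
cache = TimelineCache(lambda user: db[user])
first = cache.get("bob")
second = cache.get("bob")
```

The second call is served from the cache, so only one DB read happens; the trade-off is that feeds older than the cap are not available from the cache.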

Push Model:

The server maintains a news feed table with rows for every user, holding all related data. On a news feed request, it just fetches and returns the existing rows.

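struct NewsFeedTable {
    var id: Int      // primary key
    var userId: Int  // foreign key: the user who receives this feed entry
    var feedId: Int  // foreign key: the feed (post) being delivered
}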

Get News Feed => 1 DB read

Post News Feed => n DB writes (n = number of followers), asynchronous

Process:

  • Post a feed
  • Server inserts the feed into the DB
  • Server asynchronously sends the feed to the poster's followers
    • Get the poster's followers
    • Fanout: insert the feed into each follower's news feed

Optimize

  • Disk is cheap, so storing NewsFeedTable is affordable
  • Inactive users waste storage
    • Sort by last login time
    • Skip some inactive users when inserting feeds
  • A superstar (tons of followers) is slow to push
    • Let fans of superstars use the pull model
    • Others use the push model
    • Trade-off: how to determine a superstar?
      • < 1M followers: push
      • 1M - 10M: pull + push (the user merges when pulling)
      • > 10M: pull
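
The follower thresholds above map directly onto a small selector. The function and constant names are hypothetical, and treating exactly 1M and exactly 10M followers as part of the mixed range is an assumption, since the notes do not say which side the boundaries fall on.

```python
PUSH_MAX_FOLLOWERS = 1_000_000    # below this: push
PULL_MIN_FOLLOWERS = 10_000_000   # above this: pull


def delivery_strategy(follower_count: int) -> str:
    """Pick a delivery model for a poster based on follower count."""
    if follower_count < PUSH_MAX_FOLLOWERS:
        return "push"
    if follower_count > PULL_MIN_FOLLOWERS:
        return "pull"
    # Assumption: the boundaries themselves belong to the mixed range.
    return "pull+push"
```

For example, `delivery_strategy(500)` selects push, while an account with 50 million followers is served by pull only.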

Design News Feed Summary

  • Requirement Analysis: Scenario
  • Data/Scale Prediction: Needs
  • Application Graph
  • Schema Design
  • Push Model vs Pull Model
    • Super star problem
    • Inactive users
  • Following & unfollowing
  • Normalize & de-normalize
    • Like counts: de-normalize; keep the count in the feed model and calibrate it once a day
    • Comments: normalize; count in the DB
  • Hot spot
    • What if the cache drops a hot-spot item? Use lease-get
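
The de-normalized like count with a daily calibration can be illustrated like this. It is a minimal sketch: the `Feed` class, the `LIKES` set standing in for a normalized likes table, and the `like` and `calibrate` functions are all hypothetical names.

```python
class Feed:
    def __init__(self, feed_id):
        self.feed_id = feed_id
        self.like_count = 0  # de-normalized counter kept on the feed itself


# Normalized source of truth: one (user_id, feed_id) row per like.
LIKES = set()


def like(feed, user_id):
    """Record a like and bump the cached counter; repeat likes are ignored."""
    row = (user_id, feed.feed_id)
    if row not in LIKES:
        LIKES.add(row)
        feed.like_count += 1


def calibrate(feed):
    """Daily job: recompute the counter from the like rows to fix drift."""
    feed.like_count = sum(1 for _, fid in LIKES if fid == feed.feed_id)
    return feed.like_count


feed_item = Feed(1)
like(feed_item, "alice")
like(feed_item, "bob")
like(feed_item, "alice")  # duplicate, does not change the count
```

Reads only touch `like_count`, so showing a feed never needs a count query; the calibration job is what keeps the shortcut honest over time.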
