type
status
date
slug
summary
tags
category
icon
password
SLA - 99.9% 99.99%

Distributed Computation

Processes

Abstract clusters, physical/virtual machines, processors, cores…
Assume:
  • Finite set of processes
    • the set of processes in the system is denoted by Π, where N = |Π|
    • each process is modeled as a state machine
  • Unique identities
  • Message passing (p2p links)

Local steps

进程计算按步骤进行
  • Send event: put a message in outbufi[j], for every neighbor pj
  • Receive event: get messages m from inbufi[*]
  • Computation event: Perform local computation → Change state
State S'=f(S,msg)

Message passing

Inbuf Outbuf:输入输出缓冲区

Execution and Configuration

一次execution是一系列步骤都执行
一个configuration是一个vector of individual process states(S1,S2,…)
每一步产生一个新的conf:C0, s1, C1, s2, C2….
一个conf是一个全局状态,任何节点都不可见

JobHandler Specification

notion image
jobhandler → jh
  • Request: < jh, Submit | job >: Requests a job to be processed.
  • Indication: < jh, Confirm | job >: Confirms that the given job has been (or will be) processed ⇒ callback function
  • Properties:
    • JH1. Guaranteed response: Every submitted job is eventually confirmed 分配的job一定要返还一个确认

Algorithm: Synchronous JH

(event-driven style事件驱动算法)
>>>
upon event < jh, Submit | job > do
process(job);
trigger < jh, Confirm | job >;
 
Synchronous 同步
Asynchronous 异步

Algorithm: Asynchronous JH

>>>
upon event < jh, Init > do
buffer := ∅;
upon event < jh, Submit | job > do
buffer := buffer ∪ {job};
trigger < jh, Confirm | job >;
upon buffer ≠ ∅ do
job := selectjob(buffer);
process(job);
buffer := buffer \ {job};
 
一个线程在处理submit,一个线程在处理process

Abstracting Failures

Process failures

A failure occurs when a process does not behave according to the algorithm
Failure processes:
notion image
  • Crash-stop
    • Process stops execution of the algorithm: crashes
  • Crash-recovery
    • Process might temporarily crash, but then recover its state and proceed taking steps
  • Omission
    • The process omits(省略) to send/receive messages it is supposed to send/receive
  • Byzantime
    • Unrestricted, arbitrary
    • Malicious behavior allowed
本课中只考虑crash-stop模型

Abstracting Communication

Assume:It is possible for messages to be lost,But, the probability for a message to reach its destination is nonzero

Fair-loss links(链路)

fair-loss链路是最弱的链路(利用丢失概率不为100%)
  • Request: < fll, Send | q, m >: Requests to send message m to process q
  • Indication: < fll, Deliver | p, m >: Delivers message m sent by process p.
  • Properties:
    • Fair-loss: If a correct process p infinitely often sends a message m to a correct process q, then q delivers m an infinite number of times.正确发送无限次,也会传递消息无限次
    • Finite duplication: If a correct process p sends a message m a finite number of times to process q, then m cannot be delivered an infinite number of times by q.优先次发送不会无限次传递
    • No creation: If some process q delivers a message m with sender p, then m was previously sent to q by process p.如果q传递了p发送的m,那么m一定是先由p发送的

Stubborn Links

  • Properties
    • Stubborn delivery:如果正确的进程p向正确的进程q发送消息m,那么q将传递m无限次。该属性确保了每条消息被接收方传递无数次(这是由发送方发送无数次而导致的)。
    • No creation
fair-loss links可以作为底层构建stubborn links,每过delta时间就重发,无限次发会无限次传递

Perfect Links

  • Request: < pl, Send | q, m >: Requests to send message m to process q.
  • Indication: < pl, Deliver | p, m >: Delivers message m sent by process p
  • Properties
    • PL1. Reliable delivery: If a correct process p sends a message m to a correct process q,then q eventually delivers m.(双边正确一定传递)
    • PL2. No duplication: No message is delivered by a process more than once.(只会传递一次)
    • PL3. No creation: If some process q delivers a message m with sender p, then m was previously sent to q by process p.
算法:Eliminate Duplicates消除重复
正确进程通过记录过去传递的消息集合,当收到一条消息时,只有当它不重复的时候,该消息才会被传递

Timing Assumption

是否有时间上界:
  • communication delay
  • process speeds
  • clock drift

Sync message passing

规定每轮的传递时延+处理时延+时钟差异,将算法切分轮次

Async

不做时间假设

Eventually sync model

Formally, propagation delay(delta delay) and process speed( delta proc ) are bounded after some unknown time(时不时会抖动,但抖动会消失)
 

Abstracting time

Failure detector

故障检测器:心跳机制
Async情况下经过2delta后由于抖动可能无法收到ACK
  • Indication: < P, Crash | p >: Detects that process p has crashed(自动发起)
  • Properties
    • Completeness
    • Acc
分类:
  • Perfect:
    • Strong Completeness:出错一定会被怀疑
    • Strong ACC:没有出错不会被怀疑
  • Eventually P(菱形P)
    • Strong Completeness
    • Eventual Strong ACC:最终没有正确进程被怀疑
算法:Exclude on Timeout
notion image
notion image

Eventually perfect failure detector

  • Indication: < ◇P, Suspect | p >: Notifies that process p is suspected to have crashed.
  • Indication: < ◇P, Restore | p >: Notifies that process p is not suspected anymore
notion image
notion image

Leader election

在一组进程中选择一个进程作为当前进程组的唯一代表,即领导者。如果当前的领导者崩溃,则应该选择一个新的领导者。
  • Indication: < le, Leader | p >: indicates that process p is elected as leader
属性:
  • Eventual detection:除非不存在正确的进程,否则一些正确的进程最终被选举为领导者。
  • ACC:只有领导者崩溃的情况下才能够选举新的领导者,不会有两个leader
算法:Monarchical(君主制) Leader Election
notion image
利用了进程间的先验排序(a-priori ranking)。一个进程被选择成为领导者的前提是所有具有更高秩的进程都已经崩溃。
notion image

Eventual leader detector

  • Indication: < Ω, Trust | p >: Indicates that process p is trusted to be leader.
  • 属性
    • ELD1. Eventual accuracy: 在某个时间点后,每个正确进程都会信任一些正确进程。
    • ELD2. Eventual agreement: 在某个时间点后,不存在两个正确进程信任不同的正确进程。
  • 算法:每次都选择未被怀疑的、rank 最大的进程作为当前的领导者。
    • notion image
 
notion image
由于错误检测器最终完美,因此选举最终正确
Consensus Variants Consensus