5 The Raft consensus algorithm
Raft is an algorithm for managing a replicated log of the form described in Section 2. Figure 2 summarizes the algorithm in condensed form for reference, and Figure 3 lists key properties of the algorithm; the elements of these figures are discussed piecewise over the rest of this section.
Raft implements consensus by first electing a distinguished leader, then giving the leader complete responsibility for managing the replicated log. The leader accepts log entries from clients, replicates them on other servers, and tells servers when it is safe to apply log entries to their state machines. Having a leader simplifies the management of the replicated log. For example, the leader can decide where to place new entries in the log without consulting other servers, and data flows in a simple fashion from the leader to other servers. A leader can fail or become disconnected from the other servers, in which case a new leader is elected.
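The leader's role described above can be illustrated with a small sketch. This is not the Raft protocol itself: the `Server` and `Leader` classes and their method names are hypothetical, every follower is assumed to be reachable, and terms, elections, failures, and the actual rules for deciding when an entry is safe to apply (Sections 5.3 and 5.4) are all omitted. It only shows the one-way flow of entries from the leader to the other servers.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical server: a replicated log plus a trivial state machine."""
    name: str
    log: list = field(default_factory=list)      # replicated log of commands
    applied: list = field(default_factory=list)  # commands applied so far

    def apply_up_to(self, safe_index: int) -> None:
        # Apply, in log order, every entry the leader has declared safe.
        while len(self.applied) < min(safe_index, len(self.log)):
            self.applied.append(self.log[len(self.applied)])


class Leader(Server):
    """Hypothetical leader: the only server that accepts client commands."""

    def __init__(self, name, followers):
        super().__init__(name)
        self.followers = followers

    def handle_client_command(self, command) -> int:
        # The leader alone decides where the entry goes: the end of its log.
        self.log.append(command)
        index = len(self.log)  # 1-based position of the new entry
        # Data flows in one direction only: leader -> followers.
        # Assumption: replication to every follower always succeeds.
        for follower in self.followers:
            follower.log.append(command)
        # The leader then tells every server the entry is safe to apply.
        for server in [self, *self.followers]:
            server.apply_up_to(index)
        return index


leader = Leader("s1", [Server("s2"), Server("s3")])
leader.handle_client_command("x=1")
leader.handle_client_command("y=2")
```

Because only the leader appends to the log, no server ever has to negotiate the position of an entry with its peers, which is the simplification the paragraph above refers to.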
Given the leader approach, Raft decomposes the consensus problem into three relatively independent subproblems, which are discussed in the subsections that follow:
- Leader election: a new leader must be chosen when an existing leader fails (Section 5.2).
- Log replication: the leader must accept log entries from clients and replicate them across the cluster, forcing the other logs to agree with its own (Section 5.3).
- Safety: the key safety property for Raft is the State Machine Safety Property in Figure 3: if any server has applied a particular log entry to its state machine, then no other server may apply a different command for the same log index. Section 5.4 describes how Raft ensures this property; the solution involves an additional restriction on the election mechanism described in Section 5.2.
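The State Machine Safety Property stated in the last item can be phrased as a simple check over what each server has applied. The sketch below is an illustrative test helper, not part of Raft: the function name and the input shape (a mapping from server name to a dictionary of applied log index to command) are assumptions made for this example.

```python
def find_safety_violation(applied_by_server):
    """Return (index, command_a, command_b) for the first conflict, else None.

    applied_by_server maps a server name to {log_index: applied_command}.
    The property holds when no two servers have applied different commands
    at the same log index.
    """
    seen = {}  # log index -> command first observed as applied there
    for applied in applied_by_server.values():
        for index, command in applied.items():
            if index in seen and seen[index] != command:
                return (index, seen[index], command)
            seen.setdefault(index, command)
    return None


# A server that lags behind does not violate the property.
consistent = {"s1": {1: "x=1", 2: "y=2"}, "s2": {1: "x=1"}}
# Two servers applying different commands at index 1 does.
conflicting = {"s1": {1: "x=1"}, "s2": {1: "x=9"}}
```

Note that the property constrains only entries that have been applied; servers may still differ in how far they have progressed.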
After presenting the consensus algorithm, this section discusses the issue of availability and the role of timing in the system.
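5 The Raft consensus algorithm
Raft is an algorithm for managing the replicated log described in Section 2. Figure 2 gives a condensed summary of the algorithm for reference, and Figure 3 lists its key properties; the elements of these figures are discussed one by one in the remainder of this section.
Raft achieves consensus by first electing a distinguished leader and then giving that leader full responsibility for managing the replicated log. The leader accepts log entries from clients, replicates them to the other servers, and tells the servers when it is safe to apply log entries to their state machines. Having a single leader simplifies management of the replicated log. For example, the leader can decide where to place a new entry in the log without consulting the other servers, and data flows in a simple way from the leader to the other servers. A leader may fail or lose its connection to the other servers, in which case a new leader is elected.
With the leader approach in place, Raft breaks the consensus problem into three relatively independent subproblems, discussed in the subsections below:
- Leader election: a new leader must be elected when the current leader fails (Section 5.2).
- Log replication: the leader must accept log entries from clients and replicate them across the cluster, forcing the other servers' logs to agree with its own (Section 5.3).
- Safety: the key safety property of Raft is the State Machine Safety Property in Figure 3: if any server has applied a particular log entry to its state machine, then no other server may apply a different command for the same log index. Section 5.4 describes how Raft guarantees this property; the solution adds an extra restriction to the election mechanism described in Section 5.2.
After presenting the consensus algorithm, this section discusses availability and the role of timing in the system.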