Paxos: A majority to lead us all

Wait, that’s the wrong Paxos! credit

A new system constraint

Safety, Liveness, and Fault-Tolerance

  1. safety: The guarantee that there will never be a “bad result”, or more specifically, that the result will be identical from any node (which has not failed)
  2. liveness: The guarantee that a request directed to a node (which has not failed) will yield a result in a finite amount of time
  3. fault-tolerance: The guarantee that the system can survive the failure of a node at any point

What is Paxos?

Preparing a Vote

Making a Promise

Starting a Vote

Voting on a Proposal

Committing a Vote

Goodbye to unanimous votes, hello majority rule

Multi-Round Paxos and Side Tasks

Multi-Round Paxos

Leader Election

Catchup

Key Takeaways

Fault Tolerance

Accepted Actions are always Committed

  1. Consider a system of 2n + 1 processes, where a round of Paxos is initiated, with no proposals having been accepted so far.
  2. The first proposal accepted, named A, had to have been both promised by a majority of processes, and accepted by a majority of processes, which is a minimum of n + 1, meaning the majority of processes will now have A as their last accepted vote.
  3. At this point, assume the action fails to be committed.
  4. For any proposal going forward, the proposing node will have to get a promise back from a majority of processes before initiating a vote. Since there can be no 2 disjoint majorities in a set of processes, the majority set of processes which respond back to the initiator with a promise, and the majority set of processes which accepted A are guaranteed to overlap in at least 1 process. This means that the initiating process will get back A in the promise from one of the processes, and will change the action for the proposal to A.

Algorithm Analysis

Safety

Liveness

Close! But not perfect

Fault Tolerance

  1. Consider a proposal which is accepted by all followers, with the leader broadcasting to subsequently commit.
  2. Now, consider a follower which has attempted to commit, and responds back to the leader with the result.
  3. At this point, if both the leader and that follower crash, no processes in the system know the result of the operation on the crashed follower, and so even if a new leader comes up, it will not know whether to commit or abort the transaction, since it’s unsure what the crashed follower did, and the operation may have irreversible effects

Looking Forward

--

--

--

Working on storage management and infrastructure at Azure SQL

Love podcasts or audiobooks? Learn on the go with our new app.

Recommended from Medium

Programming and Interviews, and Algorithms, oh my!

8 Ways in which Robotic Process Automation (RPA) Will Change IT Operations

Automatic Dynamic Secrets Retrieval in Microsoft Azure VMs with HashiCorp Vault

Hunting for Malware with Falco

Lifecycle Management of DBpedia Neural QA Models

Our experience with the Classic Retrospective format

You’ll see someone you truly love staring back at you every morning if you can do that.

Users Don’t Like Web Apps

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Aashray Anand

Aashray Anand

Working on storage management and infrastructure at Azure SQL

More from Medium

How to Make Your Account More Secure

Where’s my train?

Developing a Path Shortcuts Manager

A GOOD SOFTWARE IS NOT JUST ABOUT SLOGANS