Think Distributed Systems

Dominik Tornow

August 2025
ISBN 9781633436176
192 pages

Included with a Manning Online subscription

printed in black & white

available in Russian, Simplified Chinese

Table of contents

1 Thinking in distributed systems: Models, mindsets, and mechanics

1.1 Software engineering and mental models

1.1.1 Mental models: The foundation of reasoning

1.1.2 Correct mental models

1.1.3 Complete mental models

1.2 Mental model of software systems

1.3 Different types of models

1.3.1 Different models describing the same aspects

1.3.2 Different models describing different aspects of a system

1.4 Thinking about distributed systems

1.4.1 Correctness

1.4.2 Scalability and reliability

1.4.3 Responsiveness

1.5 Two big ideas

1.5.1 Systems of systems

1.5.2 Global view vs. local view

1.6 Distributed Systems Incorporated

1.7 Navigating complexity

1.7.1 Simple yet complex

1.7.2 Emergent behavior

1.7.3 Changing perspective

1.7.4 Think globally; act locally

1.8 Thinking above the code

2 System models, order, and time

2.1 System models

2.1.1 Theory and practice

2.1.2 Synchronous distributed systems

2.1.3 Asynchronous distributed systems

2.1.4 Partially synchronous systems

2.1.5 Component and network behavior

2.1.6 Realistic system models

2.2 Order and time

2.2.1 The happened-before relationship

2.2.2 Time and clocks

2.2.3 Physical time and physical clocks

2.2.4 Logical time and logical clocks

2.2.5 Physical clocks vs. logical clocks

3 Failure tolerance

3.1 In theory

3.2 Types of failure tolerance

3.2.1 Masking failure tolerance

3.2.2 Nonmasking failure tolerance

3.2.3 Fail-safe failure tolerance

3.2.4 None of the above

3.3 In practice

3.3.1 System model

3.3.2 Failure handling

3.3.3 Failure classification

3.3.4 Failure detection

3.3.5 Failure mitigation

3.3.6 Putting everything together

4 Message delivery and processing

4.1 Exchanging messages

4.2 The uncertainty principle of message delivery and processing

4.2.1 Before sending the request

4.2.2 After sending the request and before receiving a response

4.2.3 After receiving a response

4.3 Silence and chatter

4.4 Exactly-once processing semantics

4.5 Idempotence

4.6 Case study: Charging a credit card

5 Transactions

5.1 Abstractions

5.2 The magic of transactions

5.2.1 Concurrency

5.2.2 Failure

5.3 The model of transactions

5.3.1 Correctness

5.3.2 Serializability

5.3.3 Completeness

5.3.4 Application-level abort

5.3.5 Platform-level abort

6 Distributed transactions

6.1 Atomic commitment: From a single RM to multiple RMs

6.1.1 Transaction on a single RM

6.1.2 Transaction on multiple RMs

6.1.3 Blocking and nonblocking

6.2 The essence of distributed transactions

6.3 Two-Phase Commit protocol

6.3.1 In the absence of failure

6.3.2 In the presence of failure

6.3.3 Improvement

7 Partitioning

7.1 Encyclopedias and volumes

7.2 Thinking in partitions

7.3 The mechanics of partitioning and balancing

7.4 (Re)partitioning

7.4.1 Types of partitioning

7.4.2 Data item to partition assignment strategies

7.5 Common item-based assignment strategies

7.5.1 Range partitioning

7.5.2 Hash partitioning

7.6 Repartitioning

7.6.1 Range partitioning

7.6.2 Hash partitioning

7.7 Consistent hashing

7.8 (Re)balancing and overpartitioning

8 Replication

8.1 Redundancy

8.2 Thinking about replication and consistency

8.3 Replication

8.4 The mechanics of replication

8.4.1 System model

8.4.2 Replication lag

8.4.3 Synchronous vs. asynchronous replication

8.4.4 State-based vs. log-based replication

8.4.5 Single-leader, multileader, and leaderless systems

9 Consistency

9.1 Consistency models

9.1.1 Common consistency models

9.1.2 Virtues and limitations

9.2 Linearizability

9.2.1 Queue and stack

9.2.2 Formal definition of linearizability

9.3 Eventual consistency

9.3.1 The shopping cart

9.3.2 Variants of eventual consistency

9.3.3 Implementation

9.4 Consistency, availability, and partition tolerance

9.4.1 History

9.4.2 Conjecture vs. theorem

9.4.3 CAP theorem

10 Distributed consensus

10.1 The challenge of reaching agreement

10.2 System model

10.3 State machine replication

10.4 The origin—and irony—of consensus

10.5 Implementing consensus

10.5.1 Leader-based consensus

10.5.2 Quorum-based consensus

10.5.3 Combining leader and quorum

10.6 Raft

10.6.1 The log

10.6.2 Terms

10.6.3 Leader Election protocol

10.6.4 Log Replication protocol

10.6.5 State machine safety

10.7 Raft puzzles

10.7.1 Puzzle 1

10.7.2 Puzzle 2

10.7.3 Puzzle 3

11 Durable executions

11.1 The pitfalls of partial executions

11.2 System model

11.2.1 Process definition

11.2.2 Process execution

11.3 The concept of failure-transparent recovery

11.4 Strategies of failure-transparent recovery

11.4.1 Restart

11.4.2 Resume

11.5 Implementation of failure-transparent recovery

11.5.1 Application-level implementation: Sagas

11.5.2 Platform-level implementation: Durable execution

12 Cloud and services

12.1 From proactive to reactive

12.2 Cloud computing

12.3 Cloud-native computing

12.4 Serverless computing

12.4.1 Traditional

12.4.2 Serverless

12.4.3 Cold path vs. hot path

12.5 Service

12.5.1 Global view vs. local view

12.5.2 Example recommendation service

12.6 Final thoughts

Overview

6 Distributed transactions

Distributed transactions coordinate changes across multiple resource managers so that a set of local transactions behaves as a single atomic unit. The chapter frames atomic commit as the core concern: all participants must either commit or abort together. It introduces a clear mental model in which each local transaction advances through working, prepared, and finally committed or aborted states, and defines correctness via safety (no conflicting outcomes among participants) and liveness (everyone eventually decides). Blocking versus non-blocking commit protocols are contrasted, with non-blocking protocols tolerating a single participant failure without preventing a decision.

The chapter then presents Two-Phase Commit (2PC), the most widely used atomic commit protocol. A client drives work at each resource manager but delegates the final decision to a single coordinator, which runs two phases. In the Prepare phase, the coordinator asks all participants to vote; each either votes to commit (after durably logging that intent) or unilaterally aborts. In the Commit phase, the coordinator decides: if all vote to commit, it logs and broadcasts commit; otherwise, it logs and broadcasts abort, and each participant durably records and applies the outcome. This design ensures atomicity across systems and, in failure-free conditions, provides both safety and liveness.

Under failures, 2PC remains safe but can block. Participant failures are recoverable without blocking: before voting, recovery implies abort; after voting to commit, a participant queries the coordinator; after recording the final outcome, it performs REDO or UNDO. Coordinator failures are the crux: once any participant has voted to commit, those participants may be stuck until the coordinator recovers, because they cannot know the global decision. Variants try to reduce blocking—such as letting participants consult each other—but subtle issues (for example, timeouts and imperfect clocks) can reintroduce safety risks. The takeaway is that 2PC is fundamentally a blocking protocol whose careful logging, state transitions, and recovery rules uphold safety while leaving liveness vulnerable to coordinator failure.

From a single RM to multiple RMs

How do we coordinate and guarantee multiple commits?

A distributed transaction consists of two or more non-distributed transactions.

State Machine of non-distributed transactions

Global transaction (outstanding messages not illustrated)

Two-Phase Commit protocol

A resource manager fails before persistently recording ⟨Vote-To-Commit⟩ or ⟨Abort⟩.

A resource manager fails after persistently recording ⟨Vote-To-Commit⟩.

A resource manager fails after persistently recording ⟨Commit⟩.

A resource manager fails after persistently recording ⟨Abort⟩.

Failure of transaction coordinator after the first commit

Failure of transaction coordinator before the first commit

Summary

Distributed transactions extend non-distributed transactions to span multiple resource managers.
A distributed transaction, also referred to as a global transaction, consists of two or more non-distributed transactions, also referred to as local transactions.
Atomic Commit Protocols ensure distributed transactions achieve a unanimous commit or abort decision, upholding atomicity across resource managers.
Blocking commit protocols guarantee safety but not liveness in the presence of failure.
Non-blocking commit protocols guarantee safety and liveness in the presence of failure.
The Two-Phase Commit (2PC) protocol is the best-known and most-studied atomic commit protocol.
2PC divides participants into one transaction coordinator and two or more resource managers, and operates in two phases: the Prepare Phase and the Commit Phase.
2PC guarantees safety and liveness in the case of a resource manager failure.
2PC guarantees safety, but not liveness, in the case of a transaction coordinator failure.

FAQ

What is a distributed transaction and what is a resource manager (RM)?

A distributed transaction spans changes across multiple systems. Each participating system is called a resource manager (RM). RMs include databases and other systems such as message queues. In this chapter, “resource manager” and “database system” are used interchangeably.

Why do we need atomic commit protocols across multiple RMs?

When a logical operation (like a money transfer) touches data on different RMs, we must prevent disagreement where one side commits while the other aborts. Atomic commit protocols ensure all sub-transactions unanimously commit or unanimously abort.

How is atomicity ensured on a single RM versus across multiple RMs?

On a single RM, atomicity is achieved by one atomic write to its local log: write Commit or Abort. If the RM fails before writing either, it recovers as if Abort was written. Across multiple RMs, atomicity is achieved by running an atomic commit protocol that coordinates all participants.
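The single-RM rule can be sketched in a few lines. This is a minimal illustration, not a real database log: the list of strings stands in for a durable log, and the function name `outcome_after_recovery` and the record names are assumptions made for the example.

```python
def outcome_after_recovery(log):
    """Return the outcome of a transaction on a single RM after a restart.

    `log` is a list of record names standing in for a durable log
    (a hypothetical encoding chosen for this sketch).
    """
    for record in log:
        # The first decision record is the single atomic write that
        # determines the outcome.
        if record in ("Commit", "Abort"):
            return record
    # No decision record was written before the failure:
    # recover as if Abort had been written.
    return "Abort"
```

For example, a log that contains only update records recovers as aborted, while a log that ends in a Commit record recovers as committed.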

What do safety and liveness mean, and what distinguishes blocking from non-blocking commit protocols?

Safety: no two participants reach conflicting decisions (no one commits while another aborts). Liveness: every participant eventually reaches a final decision (commit or abort). Blocking protocols guarantee safety but not liveness under participant failures. Non-blocking protocols guarantee both safety and liveness in the presence of a single participant failure.

What are the states of a local (non-distributed) transaction during a distributed transaction?

Local transactions move through: Working (executing operations), Prepared (waiting for commit/abort decision), and then a final state: Committed or Aborted. From Working, a transaction can abort (e.g., constraint violation) or prepare; from Prepared, it commits or aborts upon instruction.
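The states and transitions described above can be written down as a small state machine. The following is a sketch under assumptions: the class and table names (`State`, `TRANSITIONS`, `LocalTransaction`) are hypothetical and chosen only for illustration.

```python
from enum import Enum


class State(Enum):
    WORKING = "working"
    PREPARED = "prepared"
    COMMITTED = "committed"
    ABORTED = "aborted"


# Allowed transitions of a local transaction (hypothetical encoding).
TRANSITIONS = {
    State.WORKING: {State.PREPARED, State.ABORTED},
    State.PREPARED: {State.COMMITTED, State.ABORTED},
    State.COMMITTED: set(),  # final state
    State.ABORTED: set(),    # final state
}


class LocalTransaction:
    def __init__(self):
        self.state = State.WORKING

    def move_to(self, target):
        """Move to `target`, rejecting transitions the model does not allow."""
        if target not in TRANSITIONS[self.state]:
            raise ValueError(
                f"illegal transition: {self.state.name} -> {target.name}"
            )
        self.state = target
        return self.state
```

Note that there is no edge from Working directly to Committed: a local transaction must pass through Prepared before it can commit, and the final states have no outgoing edges.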

How does Two-Phase Commit (2PC) work when there are no failures?

2PC has a coordinator and two or more RMs. Phase 1 (Prepare): the coordinator logs Prepare and asks all RMs to vote. Each RM either logs Vote-to-Commit and replies yes, or logs Abort, replies Abort, and aborts locally. Phase 2 (Commit): if all vote yes, the coordinator logs Commit and instructs all to commit; otherwise (any Abort or timeout), it logs Abort and instructs all to abort. Each RM logs the final decision and applies it.
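The failure-free run can be simulated in a single process. This is a simplified sketch, not a production protocol: in-memory lists stand in for durable logs, method calls stand in for messages, there are no timeouts, and the names `ResourceManager`, `Coordinator`, and `can_commit` are assumptions made for the example.

```python
class ResourceManager:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # hypothetical switch to force an abort
        self.log = []                 # stand-in for a durable log

    def prepare(self):
        """Phase 1: log the vote, then reply with it."""
        if self.can_commit:
            self.log.append("Vote-To-Commit")
            return "Vote-To-Commit"
        self.log.append("Abort")  # unilateral abort
        return "Abort"

    def decide(self, decision):
        """Phase 2: log the coordinator's decision."""
        # An RM that already aborted unilaterally keeps its Abort record.
        if "Abort" not in self.log:
            self.log.append(decision)


class Coordinator:
    def __init__(self, rms):
        self.rms = rms
        self.log = []  # stand-in for a durable log

    def run(self):
        # Phase 1 (Prepare): log Prepare and collect votes.
        self.log.append("Prepare")
        votes = [rm.prepare() for rm in self.rms]
        # Phase 2 (Commit): commit only if every vote is Vote-To-Commit.
        if all(vote == "Vote-To-Commit" for vote in votes):
            decision = "Commit"
        else:
            decision = "Abort"
        self.log.append(decision)
        for rm in self.rms:
            rm.decide(decision)
        return decision
```

Running `Coordinator([ResourceManager("a"), ResourceManager("b")]).run()` yields a Commit decision, while a single RM constructed with `can_commit=False` turns the global decision into Abort.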

Why can participants abort unilaterally but not commit during 2PC’s Prepare phase?

Commit must be coordinated so all participants make the same decision; a single RM cannot safely commit alone. However, any RM can always abort safely on its own (e.g., on error or doubt). Hence the asymmetry: Vote-to-Commit versus Abort.

How does 2PC handle resource manager failures?

2PC remains safe and live despite RM failures. On recovery: if an RM failed before logging Vote-to-Commit or Abort, it logs Abort and informs the coordinator. If it failed after logging Vote-to-Commit, it must ask the coordinator for the outcome. If it failed after logging Commit, it performs REDO; after logging Abort, it performs UNDO.
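These recovery rules amount to a lookup on the RM's own log. The sketch below assumes the log is a list of record names; the function name `recovery_action` and the returned action labels are hypothetical.

```python
def recovery_action(log):
    """Return what a recovering RM does, based on the records in its log."""
    if "Commit" in log:
        return "REDO"             # final outcome known: reapply the changes
    if "Abort" in log:
        return "UNDO"             # final outcome known: roll the changes back
    if "Vote-To-Commit" in log:
        return "ASK_COORDINATOR"  # voted yes, outcome unknown
    # Failed before voting: abort unilaterally and tell the coordinator.
    return "LOG_ABORT_AND_NOTIFY"
```

Only the third case depends on another participant: an RM that has voted to commit cannot decide on its own and needs the coordinator's answer.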

Why is 2PC considered a blocking protocol, and what happens if the coordinator fails?

2PC can block if the transaction coordinator fails after one or more RMs have voted to commit. Any RM that voted yes is in the Prepared state and cannot decide by itself; it must wait for the coordinator (or a reliable substitute) to learn whether to commit or abort. Example: the coordinator crashes after telling only one RM to commit, and the others are stuck.

What improvements reduce 2PC blocking, and why can naive timeouts violate safety?

Variants let RMs consult each other: if any RM has already committed or aborted, others can follow, reducing blocking. However, some executions still block. Simply timing out and aborting when all voted yes but no decision arrived can break safety due to unsynchronized clocks—some RMs might commit while others time out and abort.
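The peer-consultation idea can be sketched as a function a Prepared RM might run while the coordinator is unreachable. This is an illustration under assumptions: the function name `resolve_by_consulting_peers` and the state labels are hypothetical, and the sketch covers only the rule stated above (follow any peer that has already decided).

```python
def resolve_by_consulting_peers(peer_states):
    """Try to learn the global decision from the peers that can be reached.

    `peer_states` lists the states reported by reachable peers, for example
    "Committed", "Aborted", or "Prepared". Returns "Commit", "Abort", or
    None when the RM remains blocked.
    """
    if "Committed" in peer_states:
        return "Commit"  # a peer has already applied the decision: follow it
    if "Aborted" in peer_states:
        return "Abort"   # a peer has aborted: the global outcome is Abort
    # Every reachable peer is also waiting, so the outcome is still unknown.
    # Guessing here (for example, aborting on a timeout) could contradict
    # a Commit the coordinator has already sent to another RM.
    return None
```

A `None` result corresponds to the executions that still block: if every reachable peer is Prepared, nobody holds the decision and the RMs must keep waiting.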
