Think Distributed Systems

Dominik Tornow

August 2025
ISBN 9781633436176
192 pages

Included with a Manning Online subscription

printed in black & white

available in Russian, Simplified Chinese

Table of contents

1 Thinking in distributed systems: Models, mindsets, and mechanics

1.1 Software engineering and mental models

1.1.1 Mental models: The foundation of reasoning

1.1.2 Correct mental models

1.1.3 Complete mental models

1.2 Mental model of software systems

1.3 Different types of models

1.3.1 Different models describing the same aspects

1.3.2 Different models describing different aspects of a system

1.4 Thinking about distributed systems

1.4.1 Correctness

1.4.2 Scalability and reliability

1.4.3 Responsiveness

1.5 Two big ideas

1.5.1 Systems of systems

1.5.2 Global view vs. local view

1.6 Distributed Systems Incorporated

1.7 Navigating complexity

1.7.1 Simple yet complex

1.7.2 Emergent behavior

1.7.3 Changing perspective

1.7.4 Think globally; act locally

1.8 Thinking above the code

2 System models, order, and time

2.1 System models

2.1.1 Theory and practice

2.1.2 Synchronous distributed systems

2.1.3 Asynchronous distributed systems

2.1.4 Partially synchronous systems

2.1.5 Component and network behavior

2.1.6 Realistic system models

2.2 Order and time

2.2.1 The happened-before relationship

2.2.2 Time and clocks

2.2.3 Physical time and physical clocks

2.2.4 Logical time and logical clocks

2.2.5 Physical clocks vs. logical clocks

3 Failure tolerance

3.1 In theory

3.2 Types of failure tolerance

3.2.1 Masking failure tolerance

3.2.2 Nonmasking failure tolerance

3.2.3 Fail-safe failure tolerance

3.2.4 None of the above

3.3 In practice

3.3.1 System model

3.3.2 Failure handling

3.3.3 Failure classification

3.3.4 Failure detection

3.3.5 Failure mitigation

3.3.6 Putting everything together

4 Message delivery and processing

4.1 Exchanging messages

4.2 The uncertainty principle of message delivery and processing

4.2.1 Before sending the request

4.2.2 After sending the request and before receiving a response

4.2.3 After receiving a response

4.3 Silence and chatter

4.4 Exactly-once processing semantics

4.5 Idempotence

4.6 Case study: Charging a credit card

5 Transactions

5.1 Abstractions

5.2 The magic of transactions

5.2.1 Concurrency

5.2.2 Failure

5.3 The model of transactions

5.3.1 Correctness

5.3.2 Serializability

5.3.3 Completeness

5.3.4 Application-level abort

5.3.5 Platform-level abort

6 Distributed transactions

6.1 Atomic commitment: From a single RM to multiple RMs

6.1.1 Transaction on a single RM

6.1.2 Transaction on multiple RMs

6.1.3 Blocking and nonblocking

6.2 The essence of distributed transactions

6.3 Two-Phase Commit protocol

6.3.1 In the absence of failure

6.3.2 In the presence of failure

6.3.3 Improvement

7 Partitioning

7.1 Encyclopedias and volumes

7.2 Thinking in partitions

7.3 The mechanics of partitioning and balancing

7.4 (Re)partitioning

7.4.1 Types of partitioning

7.4.2 Data item to partition assignment strategies

7.5 Common item-based assignment strategies

7.5.1 Range partitioning

7.5.2 Hash partitioning

7.6 Repartitioning

7.6.1 Range partitioning

7.6.2 Hash partitioning

7.7 Consistent hashing

7.8 (Re)balancing and overpartitioning

8 Replication

8.1 Redundancy

8.2 Thinking about replication and consistency

8.3 Replication

8.4 The mechanics of replication

8.4.1 System model

8.4.2 Replication lag

8.4.3 Synchronous vs. asynchronous replication

8.4.4 State-based vs. log-based replication

8.4.5 Single-leader, multileader, and leaderless systems

9 Consistency

9.1 Consistency models

9.1.1 Common consistency models

9.1.2 Virtues and limitations

9.2 Linearizability

9.2.1 Queue and stack

9.2.2 Formal definition of linearizability

9.3 Eventual consistency

9.3.1 The shopping cart

9.3.2 Variants of eventual consistency

9.3.3 Implementation

9.4 Consistency, availability, and partition tolerance

9.4.1 History

9.4.2 Conjecture vs. theorem

9.4.3 CAP theorem

10 Distributed consensus

10.1 The challenge of reaching agreement

10.2 System model

10.3 State machine replication

10.4 The origin—and irony—of consensus

10.5 Implementing consensus

10.5.1 Leader-based consensus

10.5.2 Quorum-based consensus

10.5.3 Combining leader and quorum

10.6 Raft

10.6.1 The log

10.6.2 Terms

10.6.3 Leader Election protocol

10.6.4 Log Replication protocol

10.6.5 State machine safety

10.7 Raft puzzles

10.7.1 Puzzle 1

10.7.2 Puzzle 2

10.7.3 Puzzle 3

11 Durable executions

11.1 The pitfalls of partial executions

11.2 System model

11.2.1 Process definition

11.2.2 Process execution

11.3 The concept of failure-transparent recovery

11.4 Strategies of failure-transparent recovery

11.4.1 Restart

11.4.2 Resume

11.5 Implementation of failure-transparent recovery

11.5.1 Application-level implementation: Sagas

11.5.2 Platform-level implementation: Durable execution

12 Cloud and services

12.1 From proactive to reactive

12.2 Cloud computing

12.3 Cloud-native computing

12.4 Serverless computing

12.4.1 Traditional

12.4.2 Serverless

12.4.3 Cold path vs. hot path

12.5 Service

12.5.1 Global view vs. local view

12.5.2 Example recommendation service

12.6 Final thoughts

Overview

8 Replication

This chapter motivates replication through the lens of durability in transactional systems: once a system “makes a promise,” it must not backtrack—even in the face of crashes and network faults. To avoid single points of failure, we add redundancy, which is not just duplication but duplication plus coordination. Redundancy can increase reliability and sometimes scalability, yet the relationship is nuanced. The text distinguishes static (unchanging) from dynamic (evolving) redundancy, and uses classic majority voting to show how coordinated replicas can mask individual failures while preserving the behavior of a single logical component.

Before formal models, the chapter reframes replication as representing “one logical thing” with multiple physical instances, highlighting the ambiguity of identity and equivalence. Using the example of books, editions, and copies, it shows how different perspectives yield different answers to what counts as “the same,” a question at the heart of replication and consistency. Replication transparency—the system’s ability to hide many replicas behind the illusion of one—requires careful balance between concealing and exposing details, trading off among consistency, availability, and latency.

The mechanics center on stateful replication, where complexity arises from change: updates must be propagated, and not all changes are equal (monotonic updates preserve knowledge; non-monotonic ones can invalidate it). The system model assumes partial synchrony, component failures, and network partitions, which together introduce both inherent and imposed replication lag. The chapter then contrasts three sets of design choices: synchronous versus asynchronous replication, along with common quorum-based hybrids; state-based versus log-based replication (assuming deterministic state machines), noting the industry preference for log-based; and single-leader, multi-leader, and leader-less topologies. Multi-leader and leader-less topologies admit concurrent updates and therefore require conflict resolution (for example, last-write-wins or CRDTs), each with practical pitfalls. Finally, it cautions that follower reads may lag, so fresh reads often require the leader, underscoring the core trade-offs that shape replication strategies.

Redundancy as duplication and coordination

Library inventory of Structure and Interpretation of Computer Programs

Replication represents a single logical object by multiple, identical physical objects.

A replicated key-value store

The network as point-to-point communication links between components

Replication lag: Instantaneous propagation of changes is impossible, resulting in an inherent lag.

Synchronous replication

Asynchronous replication

Single-leader, multi-leader, and leader-less replication

Summary

Redundancy aims to improve the reliability of a system, growing beyond the reliability limits of a single resource.
Redundancy refers to the duplication and coordination of subsystems, so that an increase in the duplication factor results in increased reliability.
Static redundancy refers to redundancy where the set of components and their interactions do not change, while dynamic redundancy refers to redundancy where they do change.
Replication, the employment of multiple instances of “the same thing,” is the most common implementation of duplication.
Replication improves the reliability of distributed systems by distributing data across multiple resources, overcoming the limitations of a single resource.
Replication lag is an inherent aspect of distributed systems and complicates replication transparency and consistency.
Synchronous replication ensures consistency but may impact latency and availability, while asynchronous replication improves latency and availability but may impact consistency.
State-based replication propagates the current state of the system, while log-based replication propagates the sequence of operations leading to the state.

FAQ

What is redundancy in distributed systems, and what are its main types?

Redundancy is the duplication and coordination of subsystems to improve reliability and/or scalability. There are two types: (1) Static redundancy: the set of components and their interactions do not change during the system’s lifetime (common in hardware). (2) Dynamic redundancy: components and their interactions can change over time (common in software).

How does redundancy relate to scalability?

Redundancy often aids scalability by distributing data across multiple nodes and spreading load across replicas (for example, round-robin load balancing). However, the relationship is not straightforward: coordination overhead and consistency requirements can also reduce scalability.
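As a rough illustration of round-robin across replicas, the sketch below rotates requests over a fixed list of replicas. The class name `RoundRobinBalancer` and the replica names are hypothetical and chosen only for this example; a real balancer would also have to handle replicas joining, leaving, or failing.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical helper: hands out replicas in a fixed rotating order."""

    def __init__(self, replicas):
        replicas = list(replicas)
        if not replicas:
            raise ValueError("at least one replica is required")
        # cycle() repeats the sequence endlessly: a, b, c, a, b, c, ...
        self._rotation = cycle(replicas)

    def next_replica(self):
        """Return the replica that should serve the next request."""
        return next(self._rotation)


# Illustrative replica names, not taken from the book.
balancer = RoundRobinBalancer(["replica-a", "replica-b", "replica-c"])
assignments = [balancer.next_replica() for _ in range(6)]
print(assignments)
```

Each replica receives every third request, so the load per replica drops as replicas are added, provided the replicas need no extra coordination to serve those requests.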

Why is durability important, and how does redundancy help achieve it?

Durability ensures that once a transaction is committed, its effects are permanent—no “backtracking.” In real systems subject to failures, a single component can fail and break promises (for example, shipping goods without actually capturing payment). Redundancy avoids single points of failure and helps uphold durable promises despite crashes or recoveries.

What does “duplication and coordination” mean in practice?

Duplicating components alone does not create a reliable system; the duplicates must be coordinated so they behave like one coherent component. Coordination can be simple (load balancing) or complex (consensus). For example, three replicated logic gates coordinated by a majority vote can tolerate one failure while preserving correct output.
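The majority-vote example can be sketched in a few lines. This is a minimal illustration, assuming an AND gate as the replicated logic gate and a "stuck at True" output as the failure; the function names (`and_gate`, `faulty_gate`, `majority_vote`, `redundant_and`) are invented for this sketch.

```python
from collections import Counter


def and_gate(a, b):
    """A correctly working gate."""
    return a and b


def faulty_gate(a, b):
    """Simulated failure (an assumption of this sketch): output stuck at True."""
    return True


def majority_vote(outputs):
    """Coordination step: return the value produced by more than half the replicas."""
    value, count = Counter(outputs).most_common(1)[0]
    if count <= len(outputs) // 2:
        raise RuntimeError("no majority among replica outputs")
    return value


def redundant_and(a, b, gates):
    """Duplication step: run every replica, then vote on the results."""
    return majority_vote([gate(a, b) for gate in gates])


# Three replicas, one of which has failed.
gates = [and_gate, faulty_gate, and_gate]
results = {
    (a, b): redundant_and(a, b, gates)
    for a in (False, True)
    for b in (False, True)
}
print(results)
```

With one faulty replica out of three, the voted output matches a single correct AND gate for every input. With two faulty replicas the vote follows the faulty majority, which is why three replicas tolerate only one failure.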

What is replication and what is replication transparency?

Replication represents a single logical object by multiple, identical physical objects (replicas) to improve reliability. Replication transparency is the system’s ability to hide the fact that multiple replicas exist and present the illusion of a single object, balancing consistency, availability, and latency.

What system model assumptions matter for replication?

The system is partially synchronous and components can suffer Crash-Stop, Omission, and Crash-Recovery failures. Communication happens over unreliable, point-to-point links, which makes it natural to reason about network partitions (temporary, intermittent, or permanent link failures leading to message loss).

What is replication lag, and why is it unavoidable?

Replication lag is the delay between applying a change on one replica and that change being visible on others. Because the system advances one step at a time, with either a single component or the network taking each step, an update cannot be applied to all replicas simultaneously. There is inherent lag (fundamental to distribution) and imposed lag (from partitions and failures). Lag can break replication transparency, especially for reads on followers.
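A toy simulation can make the lag visible. The sketch below assumes a single leader, a single follower, and a link that holds changes in flight until they are explicitly delivered; `Replica` and `LaggingLink` are hypothetical names used only here.

```python
from collections import deque


class Replica:
    """A replica holding a plain key-value mapping."""

    def __init__(self):
        self.data = {}


class LaggingLink:
    """Hypothetical link: changes stay in flight until deliver_one() is called."""

    def __init__(self, follower):
        self.follower = follower
        self.in_flight = deque()

    def send(self, key, value):
        self.in_flight.append((key, value))

    def deliver_one(self):
        # The network takes one step: the oldest change reaches the follower.
        if self.in_flight:
            key, value = self.in_flight.popleft()
            self.follower.data[key] = value


leader, follower = Replica(), Replica()
link = LaggingLink(follower)

# Step 1: the leader applies the change and hands it to the network.
leader.data["x"] = 1
link.send("x", 1)

# Step 2: a read on the follower happens before the change arrives.
stale_read = follower.data.get("x")

# Step 3: the network delivers the change; later reads see it.
link.deliver_one()
fresh_read = follower.data.get("x")

print(stale_read, fresh_read)
```

Between steps 1 and 3 the two replicas disagree, and any read served by the follower in that window exposes the fact that there is more than one copy.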

How do synchronous, asynchronous, and quorum (hybrid) replication differ?

Synchronous: an operation completes only after all replicas acknowledge the change. Pro: immediate consistency. Con: higher latency and reduced availability if any replica is slow or unreachable.
Asynchronous: an operation completes after the initial node processes it; replication happens in the background. Pro: low latency and higher availability. Con: replicas can be stale.
Quorum (hybrid): wait for acknowledgments from a majority (quorum) in the foreground; replicate to others asynchronously. Balances latency, availability, and consistency.
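The three modes differ mainly in how many acknowledgments a write waits for before it completes. The sketch below encodes that rule, assuming the node that first processes the write counts as one acknowledgment; the function names and mode strings are illustrative, not an API from the book.

```python
def required_acks(mode, replica_count):
    """Number of acknowledgments to wait for before a write completes."""
    if replica_count < 1:
        raise ValueError("replica_count must be at least 1")
    if mode == "synchronous":
        return replica_count           # every replica must acknowledge
    if mode == "asynchronous":
        return 1                       # only the initial node (assumption: it counts as one ack)
    if mode == "quorum":
        return replica_count // 2 + 1  # a strict majority
    raise ValueError(f"unknown mode: {mode}")


def write_completes(mode, acks_received, replica_count):
    """True if enough replicas have acknowledged for the chosen mode."""
    return acks_received >= required_acks(mode, replica_count)


# Five replicas, two of them unreachable, so only three acknowledge.
outcome = {
    mode: write_completes(mode, acks_received=3, replica_count=5)
    for mode in ("synchronous", "asynchronous", "quorum")
}
print(outcome)
```

In this scenario the synchronous write blocks on the unreachable replicas, while the asynchronous and quorum writes complete. The quorum write would block too if a third replica became unreachable.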

What’s the difference between state-based and log-based replication?

Assuming a deterministic state machine: (1) State-based replicates the current state (or a diff), regardless of the operations that produced it. (2) Log-based replicates the sequence of operations that led to the state. In practice, log-based is often preferred because it preserves ordering and enables deterministic replays.
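The difference can be sketched with a small deterministic state machine. In this illustration the operations (`set`, `add`) and the keys are made up for the example: the state-based follower receives a copy of the leader's state, while the log-based follower receives the operations and replays them.

```python
def apply_op(state, op):
    """Deterministic transition: the same state and operation always give the same result."""
    kind, key, value = op
    new_state = dict(state)
    if kind == "set":
        new_state[key] = value
    elif kind == "add":
        new_state[key] = new_state.get(key, 0) + value
    else:
        raise ValueError(f"unknown operation: {kind}")
    return new_state


def replay(log, initial=None):
    """Rebuild a state by applying every logged operation in order."""
    state = dict(initial or {})
    for op in log:
        state = apply_op(state, op)
    return state


# The leader's operation log (illustrative values).
log = [("set", "stock", 10), ("add", "stock", -3), ("set", "price", 5)]
leader_state = replay(log)

# State-based: ship the resulting state, however it was produced.
state_based_follower = dict(leader_state)

# Log-based: ship the operations and let the follower replay them.
log_based_follower = replay(log)

print(leader_state, state_based_follower, log_based_follower)
```

Both followers end up in the same state as the leader, but the log-based follower reaches it only because `apply_op` is deterministic and the operations are replayed in the same order.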

How do single-leader, multi-leader, and leader-less replication compare, and how are conflicts handled?

Single-leader: one node accepts operations and propagates to followers—simple “chain of command,” but the leader can become a single point of failure.
Multi-leader: multiple leaders accept operations—avoids a single bottleneck but introduces concurrent updates that can conflict.
Leader-less: any node accepts operations—maximizes availability but requires conflict resolution.

Conflicts are resolved via strategies like last-write-wins (simple but can cause unexpected overwrites) or CRDTs (designed to converge without conflicts). Also note: reads from followers can be stale due to replication lag; read from the leader for the freshest data.
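To illustrate the two strategies, the sketch below merges concurrent counter updates from two leaders, once with last-write-wins and once with a grow-only counter (G-Counter), a simple and widely used CRDT. The timestamps, node names, and function names are assumptions made for this example.

```python
def lww_merge(a, b):
    """Last-write-wins: a and b are (timestamp, value); the later timestamp wins."""
    return a if a[0] >= b[0] else b


def gcounter_merge(a, b):
    """G-Counter merge: keep the per-node maximum of the two replicas' counts."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


def gcounter_value(counter):
    """The counter's value is the sum of all per-node counts."""
    return sum(counter.values())


# Both leaders start from 0. Leader A adds 1, leader B concurrently adds 2.

# Last-write-wins on a plain value: B's later write overwrites A's.
lww_a = (1, 1)  # (timestamp 1, value 0 + 1)
lww_b = (2, 2)  # (timestamp 2, value 0 + 2)
lww_result = lww_merge(lww_a, lww_b)[1]

# G-Counter: each leader increments only its own slot, so nothing is lost.
g_a = {"A": 1}
g_b = {"B": 2}
merged = gcounter_merge(g_a, g_b)

print(lww_result, gcounter_value(merged))
```

Last-write-wins yields 2 and silently drops leader A's increment, which is the kind of unexpected overwrite mentioned above. The G-Counter yields 3, and its merge gives the same result regardless of the order in which replicas exchange state.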
