Overview

16 Designing multithreaded programs

This chapter explains how to design Python programs that run multiple threads safely and predictably. It clarifies concurrency versus parallelism, noting that while modern CPUs enable true parallel execution, CPython’s global interpreter lock (GIL) generally restricts threads to concurrent execution. The key design challenge is that threads share code, data, and external resources, so correct programs must coordinate access to shared state. Using Python’s threading module, the chapter shows how to structure threaded work, when the runtime naturally switches threads (e.g., during I/O), and why explicit synchronization is required to avoid subtle timing bugs.

The chapter first tackles race conditions and critical regions, introducing mutexes (locks) to enforce mutual exclusion. A simple printing example demonstrates how unprotected shared output interleaves unpredictably, and how a Lock (especially via the with statement) serializes access and prevents deadlock-prone mistakes. It then expands to semaphores to model the classic reader–writer problem: multiple readers may access a resource simultaneously, but writers need exclusive access. The solution combines a mutex to serialize writers and block readers during writes, a semaphore to admit multiple readers, and an Event to signal that the first write has occurred, illustrating the interplay of blocking, sleeping, and busy-waiting in realistic coordination.

For tighter coordination, the chapter introduces condition objects, which bundle a lock with signaling so threads can wait for specific state changes. Using a bounded queue, it implements the producer–consumer pattern where producers wait when the queue is full, consumers wait when it’s empty, and threads notify each other precisely when the queue transitions from empty to not empty or from full to not full. Crucially, waiting releases the lock, and awakened threads must recheck conditions to handle races. The chapter closes with cautions: thread scheduling is nondeterministic, debugging race conditions and deadlocks is hard, and synchronization and context switching add overhead—so more threads don’t automatically mean faster programs. It briefly notes alternatives like asyncio for I/O-bound concurrency and multiprocessing for true parallelism.

A multithreaded application with three threads. The threads share the application’s internal resources of code and data in memory and external resources such as printers and files. The runtime system gives each thread its own runtime stack and the appearance of having its own set of machine registers.
The critical regions of Thread A and Thread B modify variable X, the shared resource. A mutex object guards the critical regions to protect the shared resource. A thread that successfully locks the mutex can proceed into its critical region. Attempting to lock a mutex that’s already locked causes the thread to block. The thread that’s exiting its critical region must unlock the mutex to allow another thread to lock it.
How a mutex guards a critical region. It only allows one thread at a time to be in its critical region.
Three threads simultaneously print to the shared print stream, the unprotected shared resource.
With a mutex guarding the critical regions of the threads, only one thread at a time can print to the print stream, which is the shared resource. When a thread wants to go into its critical region, it attempts to lock the mutex. If it succeeds, it can enter its critical region and print. If it doesn’t succeed, it blocks until another thread unlocks the mutex. Then the thread can attempt to lock the mutex again. A thread must unlock the mutex before it exits its critical region. Whenever the mutex unlocks, the runtime system determines which blocked thread can lock it.
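The mutex-guarded printing described above can be sketched as follows. This is a minimal illustration, not the chapter's listing: the names print_mutex, stream, and print_message are assumptions, and an in-memory io.StringIO stands in for the shared print stream so the result can be inspected.

```python
import io
import threading

print_mutex = threading.Lock()   # guards the shared print stream
stream = io.StringIO()           # stand-in for the shared print stream (assumption)


def print_message(name: str, count: int) -> None:
    """Write `count` whole lines to the shared stream."""
    for i in range(count):
        with print_mutex:        # lock on entry, unlock on exit
            # Critical region: the line is written piece by piece, so
            # without the mutex another thread could interleave its pieces.
            for piece in (name, " line ", str(i), "\n"):
                stream.write(piece)


threads = [threading.Thread(target=print_message, args=(f"thread-{n}", 50))
           for n in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

lines = stream.getvalue().splitlines()
```

Each thread holds the mutex only while it writes one complete line, which keeps the critical region short while still preventing interleaved output.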
Three technicians simultaneously attempt to set the meter’s value. Meanwhile, three loggers attempt to read and log the meter’s current setting. Therefore, the meter is the shared resource. Only one technician at a time should be setting the meter, and no logger should read the meter while a technician is setting it. However, as long as no technician is setting the meter, multiple loggers can be reading it at the same time.
The technician threads and the logger threads each loop and attempt to go into their critical regions to access the meter, which is the shared resource. The setting_mutex prevents more than one technician thread at a time from setting the meter. The same mutex prevents a logger from reading and logging the meter’s value while a technician thread is setting the meter. The logging_semaphore allows multiple reader threads to simultaneously read the meter, and it will block technician threads from setting the meter when any reader thread is reading it.
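One way to realize the roles of setting_mutex and logging_semaphore is sketched below. It is an assumed design, not necessarily the chapter's: a technician holds the mutex and then collects every semaphore permit, which waits out any loggers that are still reading. MAX_LOGGERS, the meter dictionary, and the stats bookkeeping are hypothetical names added so the exclusion rules can be checked. The startup Event is left out here for brevity, so a logger may read the initial value 0.

```python
import threading

MAX_LOGGERS = 3                                   # assumed limit on simultaneous readers
setting_mutex = threading.Lock()
logging_semaphore = threading.Semaphore(MAX_LOGGERS)

meter = {"value": 0}                              # the shared resource
stats_lock = threading.Lock()                     # guards only the bookkeeping below
stats = {"reading": 0, "setting": False, "violations": 0}
log = []


def technician(new_value: int) -> None:
    with setting_mutex:                           # one technician; new loggers held back
        for _ in range(MAX_LOGGERS):              # collect every permit, which waits
            logging_semaphore.acquire()           # for loggers that are still reading
        try:
            with stats_lock:
                stats["setting"] = True
                if stats["reading"] > 0:          # a logger is active: rule broken
                    stats["violations"] += 1
            meter["value"] = new_value
            with stats_lock:
                stats["setting"] = False
        finally:
            for _ in range(MAX_LOGGERS):
                logging_semaphore.release()


def logger() -> None:
    with setting_mutex:                           # blocked while a technician is setting
        logging_semaphore.acquire()               # one permit per active logger
    try:
        with stats_lock:
            stats["reading"] += 1
            if stats["setting"]:                  # a technician is active: rule broken
                stats["violations"] += 1
        value = meter["value"]
        with stats_lock:
            log.append(value)
            stats["reading"] -= 1
    finally:
        logging_semaphore.release()


threads = [threading.Thread(target=technician, args=(v,)) for v in (1, 2, 3)]
threads += [threading.Thread(target=logger) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Loggers release their permits without needing the mutex, so a technician waiting for permits cannot deadlock with a logger waiting for the mutex.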
Simultaneously, multiple producers enter values into the bounded queue and multiple consumers remove values from the queue. The queue, the shared resource, has limited capacity.
The producer and consumer threads each attempt to acquire queue.condition at the top of their loops. The condition object synchronizes the producer and consumer threads. The producer threads loop to enter values into the shared bounded queue, and the consumer threads loop to remove values from the queue. A producer thread cannot enter a value into the queue when the queue is full. A consumer thread cannot remove a value from an empty queue.
In order by time from top to bottom, each line shows the action of one of the producer or consumer threads, depending on which column the line begins in. The circled integer is the value entered into the shared bounded queue by a producer thread, or the value (shown as a negative value) removed from the queue by a consumer thread. The list in square brackets after each circled integer is the contents of the queue after the action. A zero value is the end sentinel.
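The bounded queue and its condition object might look roughly like the sketch below. The BoundedQueue class, its put and get methods, and the peak counter are assumptions made for illustration. The sketch also calls notify_all() after every change, a simplification of notifying only on the empty/not-empty and full/not-full transitions.

```python
import threading
from collections import deque

END = 0   # end sentinel, matching the zero value in the trace described above


class BoundedQueue:
    """A fixed-capacity queue guarded by one Condition (hypothetical class)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.items = deque()
        self.peak = 0                                # largest size ever observed
        self.condition = threading.Condition()

    def put(self, value: int) -> None:
        with self.condition:
            while len(self.items) >= self.capacity:  # full: wait, then recheck
                self.condition.wait()                # releases the lock while waiting
            self.items.append(value)
            self.peak = max(self.peak, len(self.items))
            self.condition.notify_all()              # wake any waiting consumers

    def get(self) -> int:
        with self.condition:
            while not self.items:                    # empty: wait, then recheck
                self.condition.wait()
            value = self.items.popleft()
            self.condition.notify_all()              # wake any waiting producers
            return value


queue = BoundedQueue(capacity=3)
consumed = []
consumed_lock = threading.Lock()


def producer(values) -> None:
    for v in values:
        queue.put(v)


def consumer() -> None:
    while (value := queue.get()) != END:
        with consumed_lock:
            consumed.append(value)


producers = [threading.Thread(target=producer, args=(range(1, 11),)),
             threading.Thread(target=producer, args=(range(11, 21),))]
consumers = [threading.Thread(target=consumer) for _ in range(2)]
for t in producers + consumers:
    t.start()
for t in producers:
    t.join()
for _ in consumers:          # one end sentinel per consumer
    queue.put(END)
for t in consumers:
    t.join()
```

Sending one sentinel per consumer after all producers finish lets every consumer loop terminate without a separate shutdown flag.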

Summary

  • Knowing how to design multithreaded programs is important for applications that inherently have concurrent operations. Designing, developing, and debugging multithreaded applications are major challenges.
  • Each thread is an execution path through the program.
  • In standard CPython, the global interpreter lock (GIL) allows only one thread at a time to execute Python bytecode, so a program’s threads run concurrently but not in parallel. The runtime engine switches rapidly among the threads.
  • Multiple threads may attempt to access shared resources of the application simultaneously. We must protect the shared resources.
  • The code in each thread that accesses the shared resource is the thread’s critical region associated with that resource.
  • Use a mutex to guard each critical region with mutual exclusion.
  • Python implements a mutex with a Lock object.
  • A thread attempts to acquire (lock) a mutex. A mutex that’s already locked blocks the thread from proceeding into a critical region. Once the mutex is released (unlocked), the thread can try again to acquire the mutex.
  • A thread that successfully acquires a mutex does not allow any other thread to acquire that mutex. Therefore, only the thread that acquires the mutex can be in the critical region.
  • A Semaphore object admits up to a fixed number of threads at a time. In the reader–writer design, multiple reader threads acquire the semaphore simultaneously to read a shared resource without modifying it, and writer threads are blocked from modifying the resource while any reader thread holds the semaphore.
  • To allow another thread to run, a thread must release a mutex or a semaphore before it exits its critical region. Otherwise, a deadlock may occur.
  • Deadlocks, race conditions, and nondeterministic thread scheduling are major debugging challenges.
  • An Event object enables a thread to signal another thread.
  • A Condition object combines the functionalities of a mutex and an Event object to synchronize the simultaneous operation of multiple threads.
  • A thread waits on a Condition object as long as a blocking condition (for example, “the queue is full”) remains true. When another thread notifies the Condition object, a waiting thread unblocks, reacquires the lock, and checks whether the blocking condition is still true. If it is now false, the thread can proceed into its critical region. Otherwise, it resumes its wait.

FAQ

What’s the difference between concurrency and parallelism in Python, and how does the GIL affect them?
Concurrency means threads make progress by time-slicing (context switching) on one or more CPUs; parallelism means threads literally run at the same time on separate cores. In standard CPython, the Global Interpreter Lock (GIL) allows only one thread to execute Python bytecode at a time, so CPU-bound Python threads are concurrent but not parallel. Threads are still useful for I/O-bound or inherently concurrent tasks, and the chapter assumes the GIL is enabled.

When should I use threads in Python?
Use threads when your design inherently requires simultaneous activities (for example, producers and consumers sharing a queue, or readers and a writer coordinating over a shared resource) or when your workload is I/O-bound and benefits from overlapping waits. Avoid threading for pure CPU-bound parallel speed-ups under the GIL; consider multiprocessing or native extensions for that.
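To make "overlapping waits" concrete, here is a small sketch in which time.sleep() stands in for a blocking I/O call. The fake_download function and the WAITS list are hypothetical. Because a sleeping thread does not hold the GIL, the four waits overlap and the total elapsed time is close to the longest single wait rather than the sum.

```python
import threading
import time

WAITS = [0.2, 0.2, 0.2, 0.2]          # simulated I/O latencies in seconds (assumption)
results = [None] * len(WAITS)


def fake_download(index: int, seconds: float) -> None:
    """Stand-in for a blocking I/O call; sleeping lets other threads run."""
    time.sleep(seconds)
    results[index] = seconds          # each thread writes only its own slot


start = time.perf_counter()
threads = [threading.Thread(target=fake_download, args=(i, w))
           for i, w in enumerate(WAITS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

serial_estimate = sum(WAITS)          # what running the waits one by one would cost
```

The same structure gives no such benefit for CPU-bound work under the GIL, since only one thread executes Python bytecode at a time.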

What is a critical region, and how does a mutex (Lock) protect it?
A critical region is code that accesses a shared resource (memory, file, device). A mutex ensures only one thread enters that region at a time. Threads acquire the lock before the region and release it after; if the lock is already held, a thread blocks until it can acquire it. This prevents race conditions and keeps shared state consistent.

How should I use Lock correctly (acquire/release vs with)?
Prefer the with statement: with lock: ... It acquires the lock on entry and reliably releases it even if an exception occurs. Manually calling acquire() and release() works but is easier to misuse and can cause deadlocks if release is skipped.
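The two styles can be compared side by side. In this sketch, update_manual, update_with, and the balance dictionary are hypothetical. The manual version needs try/finally to match what the with statement does automatically.

```python
import threading

lock = threading.Lock()
balance = {"total": 0}          # shared state for illustration (assumption)


def update_manual(amount: int) -> None:
    lock.acquire()
    try:
        balance["total"] += amount
    finally:
        lock.release()          # must run even if the update raises


def update_with(amount: int) -> None:
    with lock:                  # acquire on entry, release on exit
        balance["total"] += amount


update_manual(5)
update_with(7)

# An exception raised inside the with block still releases the lock.
try:
    with lock:
        raise ValueError("simulated failure inside the critical region")
except ValueError:
    pass

released_after_error = not lock.locked()
```

If the try/finally were omitted from update_manual, an exception in the update would leave the lock held, and every later acquire() would block.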

What is a race condition, and how did the printing example fix it?
A race condition occurs when a result depends on the unpredictable timing/order of thread operations, such as multiple threads writing interleaved characters to stdout. The fix is to guard the printing critical region with a single shared Lock so only one thread writes a whole message at a time. Adding time.sleep() can demonstrate scheduling effects without breaking the protection.

What problem does a semaphore solve in the reader–writer scenario?
A semaphore lets multiple readers enter concurrently while still excluding writers. Initialize it to the maximum allowed concurrent readers. Each reader acquires (decrements) before reading and releases (increments) after. A writer waits until the semaphore indicates no active readers, then uses a mutex to ensure exclusive write access. This enforces “many readers or one writer, but not both.”

Why use an Event in the reader–writer example?
The Event coordinates startup so readers don’t log before the meter has ever been set. Technicians set the Event the first time they establish a value; reader threads wait() on the Event before attempting to read. Once set, subsequent set() calls have no extra effect.
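The startup coordination can be shown in isolation. In this sketch the name first_setting_done and the value 42 are assumptions. The loggers are deliberately started before the technician to show that wait() holds them back until set() is called.

```python
import threading

first_setting_done = threading.Event()   # hypothetical name for the startup signal
meter = {"value": None}                  # not yet set by any technician
logged = []
logged_lock = threading.Lock()


def logger() -> None:
    first_setting_done.wait()            # blocks until some thread calls set()
    with logged_lock:
        logged.append(meter["value"])


def technician(value: int) -> None:
    meter["value"] = value               # establish the value first...
    first_setting_done.set()             # ...then signal; repeated set() calls are harmless


loggers = [threading.Thread(target=logger) for _ in range(3)]
for t in loggers:
    t.start()                            # loggers start first and must wait

tech = threading.Thread(target=technician, args=(42,))
tech.start()
tech.join()
for t in loggers:
    t.join()
```

The Event only handles the one-time startup signal; the mutex and semaphore are still needed to protect the meter during later reads and writes.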

What is a Condition object, and why is it ideal for producer–consumer with a bounded queue?
A Condition combines a lock with a signaling mechanism. Producers and consumers acquire the condition, check the queue state, and if they can’t proceed (full or empty), they wait(), which atomically releases the lock and blocks. When state changes (queue becomes not empty or not full), a thread calls notify() to wake a waiter. This pattern cleanly coordinates “wait until state changes, then proceed,” avoiding busy waiting.

Why must a thread re-check the condition after being notified?
Because a notification doesn’t guarantee that the thread can now proceed: another thread may have run first and restored the “full” or “empty” state. The correct pattern is to wait in a loop while the blocking condition is true: with condition: while condition_is_true: condition.wait(). This guards against timing races and spurious wakeups.
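Python's Condition.wait_for(predicate) packages this recheck loop: it calls wait() repeatedly until the predicate returns true. The sketch below uses it as an alternative spelling of the while loop. Note that the predicate states when the thread may proceed, the opposite of the blocking condition. The names items, CAPACITY, add, and take are hypothetical.

```python
import threading

condition = threading.Condition()
items = []
CAPACITY = 2                     # assumed queue capacity
taken = []


def add(value: int) -> None:
    with condition:
        # Equivalent to: while len(items) >= CAPACITY: condition.wait()
        condition.wait_for(lambda: len(items) < CAPACITY)
        items.append(value)
        condition.notify_all()


def take() -> int:
    with condition:
        # Equivalent to: while not items: condition.wait()
        condition.wait_for(lambda: len(items) > 0)
        value = items.pop(0)
        condition.notify_all()
        return value


def produce() -> None:
    for v in range(1, 6):
        add(v)


def consume() -> None:
    for _ in range(5):
        taken.append(take())


threads = [threading.Thread(target=consume), threading.Thread(target=produce)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With one producer and one consumer the values come out in the order they went in, regardless of how the scheduler interleaves the two threads.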

What are common pitfalls of multithreaded design in Python, and how can I mitigate them?
  • Deadlocks: Always release locks (use with), avoid circular lock ordering, and minimize time spent in critical regions.
  • Nondeterminism and debugging difficulty: Expect different executions to interleave differently; add logging, reduce shared state, and design for determinism where possible.
  • Performance overheads: Locks, condition variables, and context switches have costs; more threads don’t always mean faster. Balance thread count with workload, prefer I/O-bound threading, and avoid busy waiting (use sleep or proper waits instead).
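As an illustration of avoiding circular lock ordering, the sketch below always acquires two locks in one fixed global order. The Account class and transfer function are hypothetical examples, not taken from the chapter. Without the sorting step, two opposite transfers could each hold one lock and wait forever for the other.

```python
import threading


class Account:
    """Hypothetical shared object with its own lock."""

    def __init__(self, name: str, balance: int) -> None:
        self.name = name
        self.balance = balance
        self.lock = threading.Lock()


def transfer(src: Account, dst: Account, amount: int) -> None:
    # Lock in a fixed global order (here, by account name) so that no two
    # threads can each hold one lock while waiting for the other.
    first, second = sorted((src, dst), key=lambda account: account.name)
    with first.lock, second.lock:
        src.balance -= amount
        dst.balance += amount


a = Account("a", 1000)
b = Account("b", 1000)


def repeat_transfer(src: Account, dst: Account) -> None:
    for _ in range(1000):
        transfer(src, dst, 1)


threads = [threading.Thread(target=repeat_transfer, args=(a, b)),
           threading.Thread(target=repeat_transfer, args=(b, a))]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Any consistent ordering key works, as long as every thread that needs both locks uses the same one.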
