Overview

10 Asynchronous processing

This chapter explains how asynchronous processing hides latency when further reductions aren’t feasible. It contrasts synchronous, blocking workflows with async designs that initiate work without waiting for results, improving perceived responsiveness by overlapping I/O and computation. The chapter frames async as complementary to prior optimization techniques: instead of only shrinking absolute latency, it keeps systems responsive while slow operations proceed, and it introduces the core building blocks you need to make that practical at scale.

The event loop is presented as the heart of async systems: a dispatcher that polls OS multiplexing interfaces (such as epoll, kqueue, io_uring, or IOCP), processes readiness events, and runs scheduled tasks without blocking. Around this core, the chapter surveys async I/O techniques to hide or amortize latency—multiplexing many connections per thread, batching requests, hedging duplicates against tail spikes (with idempotency caveats), buffered I/O, and memory mapping. It then covers deferring non-critical work via scheduling, priority queues (with anti-starvation), and work stealing for load balance. Resource management is treated as essential to low latency: thread pools versus thread-per-core runtimes, careful memory buffer pooling, and connection pooling combined with asynchronous database queries to increase parallelism while controlling overhead.

The chapter also addresses the hard parts: complexity, race conditions, and resource blow-ups if concurrency is left unchecked. It advocates backpressure to regulate producers—TCP window-based throttling, bounded buffering, and last-resort dropping or rate limiting—to keep service latency predictable. Robust error handling is required for partial failures, retries with exponential backoff, and safe idempotent operations, plus timeouts and cancellation with thorough cleanup. Finally, it emphasizes observability for async systems: distributed tracing with propagated context, and metrics that capture concurrency, queue depths, error categories, retry behavior, and resource utilization, along with latency decomposition across wait, queue, and processing stages. The takeaway is that async processing can dramatically improve perceived latency, but only with disciplined flow control, resilient error handling, and strong visibility.

Synchronous vs. asynchronous processing. With synchronous processing (at the top of the diagram), a task runs to completion before the next one starts, so the total time to run tasks A, B, and C is the sum of the individual task times. In contrast, with asynchronous processing (at the bottom of the diagram), all the tasks start at the same time, so the total time equals the time of the slowest task. Asynchronous processing reduces latency when the tasks can execute in parallel.
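The sum-versus-max effect is easy to demonstrate with `asyncio`. The sketch below simulates three I/O-bound tasks with `asyncio.sleep` (a stand-in for real network calls; the task names and durations are made up for illustration):

```python
import asyncio
import time

async def task(name: str, seconds: float) -> str:
    # Simulate an I/O-bound task such as a network call.
    await asyncio.sleep(seconds)
    return name

async def run_sequentially() -> float:
    start = time.monotonic()
    for name, secs in [("A", 0.3), ("B", 0.2), ("C", 0.1)]:
        await task(name, secs)          # wait for each task in turn
    return time.monotonic() - start     # ~0.6s: sum of all task times

async def run_concurrently() -> float:
    start = time.monotonic()
    await asyncio.gather(               # start all three together
        task("A", 0.3), task("B", 0.2), task("C", 0.1))
    return time.monotonic() - start     # ~0.3s: time of the slowest task

sequential = asyncio.run(run_sequentially())
concurrent = asyncio.run(run_concurrently())
```

Nothing runs on extra threads here; the speedup comes purely from overlapping the waits.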
The event loop breaks work down into individual tasks that execute when an event happens. In this example, the event loop processes three tasks (accept connection, process request, and send response) as part of handling a request arriving from the network. Each task runs when an event, such as a socket becoming readable, occurs.
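The three stages from the figure can be modeled with a toy dispatcher. This is a deliberately simplified sketch: a real event loop would poll an OS interface such as epoll or kqueue rather than draining an in-memory queue, and the event names here are hypothetical:

```python
from collections import deque

class EventLoop:
    """Toy event loop: handlers registered per event type are
    dispatched one at a time as events arrive."""

    def __init__(self):
        self.handlers = {}
        self.events = deque()

    def on(self, event, handler):
        self.handlers[event] = handler

    def emit(self, event, payload=None):
        self.events.append((event, payload))

    def run(self):
        while self.events:
            event, payload = self.events.popleft()
            self.handlers[event](payload)   # run the task for this event

loop = EventLoop()
log = []

# Tasks for the three request-processing stages from the figure.
# Each task does its work, then emits the event that triggers the next stage.
loop.on("accept", lambda _: (log.append("accepted"), loop.emit("readable")))
loop.on("readable", lambda _: (log.append("processed"), loop.emit("writable")))
loop.on("writable", lambda _: log.append("responded"))

loop.emit("accept")   # a connection arrives from the network
loop.run()
```

The key property is that no handler ever blocks waiting for the next event; each stage finishes and returns control to the loop.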
Request hedging is a latency-hiding technique where the client sends two or more copies of the same request. The client uses whichever response arrives first and ignores the rest. Hedging helps when, for example, the network path for some requests and responses is slower than for others, or messages get queued somewhere along the path.
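A minimal hedging sketch with `asyncio`, assuming an idempotent, hypothetical `fetch` call whose latency varies per replica (the replica names and latency range are illustrative):

```python
import asyncio
import random

async def fetch(replica: str) -> str:
    # Hypothetical backend call with variable latency.
    await asyncio.sleep(random.uniform(0.01, 0.2))
    return f"response from {replica}"

async def hedged_fetch() -> str:
    # Send the same (idempotent!) request to two replicas and keep
    # whichever answer arrives first.
    tasks = [asyncio.create_task(fetch(r)) for r in ("replica-1", "replica-2")]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()                 # the late response is ignored
    return done.pop().result()

result = asyncio.run(hedged_fetch())
```

Cancelling the straggler matters: without it, hedging doubles the load on the backend for no benefit once a winner exists.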
Backpressure controls the flow of work from producers to a consumer to avoid overwhelming the consumer. Clients push work to a server, which buffers it; the service pulls work from the buffer. The service also signals buffer capacity limits back to the clients so that they know when to slow down and avoid overwhelming the system.
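A bounded queue gives you this signaling for free. In the sketch below (a toy single-process model, not a networked server), `asyncio.Queue(maxsize=3)` suspends the producer whenever the buffer is full, which is exactly the backpressure signal:

```python
import asyncio

async def producer(queue: asyncio.Queue, n: int) -> None:
    for i in range(n):
        # put() suspends when the queue is full: the bounded buffer
        # itself tells the producer to slow down (backpressure).
        await queue.put(i)

async def consumer(queue: asyncio.Queue, out: list) -> None:
    while True:
        item = await queue.get()      # pull work from the buffer
        out.append(item)
        queue.task_done()

async def main() -> list:
    queue = asyncio.Queue(maxsize=3)  # bounded, never unbounded
    out: list = []
    worker = asyncio.create_task(consumer(queue, out))
    await producer(queue, 10)
    await queue.join()                # wait until all buffered work is done
    worker.cancel()
    return out

processed = asyncio.run(main())
```

An unbounded queue would accept everything and hide the overload until memory runs out; the bound turns overload into a visible slowdown at the producer.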

Summary

  • In synchronous processing, tasks run one after another, each waiting for the previous task to complete before starting. In contrast, asynchronous processing is primarily about structuring your application so that tasks can start independently, which addresses the problem of some tasks taking a long time to complete.
  • The event loop is a fundamental concept in asynchronous processing: a dispatcher at the core of the system polls for events, such as data arriving from the network, and reacts to them.
  • Although asynchronous processing can improve performance and reduce latency, it has downsides too: resource management and error handling are often more complex.
  • I/O multiplexing is a fundamental OS primitive enabling the event loop approach. It lets the event loop efficiently monitor thousands of event sources so that the application can react to events as they happen.
  • Asynchronous processing enables various efficient latency-hiding techniques, such as request hedging, deferred work, and more.
  • Managing concurrency with backpressure is critical in asynchronous systems to avoid overwhelming the system.
  • Asynchronous processing requires special attention to error handling. For example, handling partial failures and recovering from them can be tricky. Timeouts and cancellation are also essential for dealing with asynchronous task errors.
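The timeout-and-cancellation pattern from the last point can be sketched with `asyncio.wait_for`. The operation and its cleanup step are hypothetical; the point is that cancellation gives the task a chance to release resources before it is torn down:

```python
import asyncio

cleanup_log: list = []

async def slow_operation() -> str:
    try:
        await asyncio.sleep(10)       # simulate an operation that hangs
        return "done"
    except asyncio.CancelledError:
        # Cancellation runs our cleanup before the task is torn down.
        cleanup_log.append("released resources")
        raise                         # always re-raise after cleanup

async def main() -> str:
    try:
        # Bound the wait; on timeout, wait_for cancels the task for us.
        return await asyncio.wait_for(slow_operation(), timeout=0.05)
    except asyncio.TimeoutError:
        return "timed out"

outcome = asyncio.run(main())
```

Re-raising `CancelledError` after cleanup is important: swallowing it makes the task uncancellable and can hang shutdown.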

FAQ

What is asynchronous processing and how does it reduce perceived latency compared to synchronous processing?
Asynchronous processing lets tasks progress independently without blocking for results. Instead of completing A, then B, then C in sequence (the sum of all latencies), an async system starts them together and completes when the slowest finishes (the maximum latency), reducing perceived wait time. This hides unavoidable I/O delays by overlapping work and reacting to events when they’re ready, keeping the system responsive.
When does asynchronous processing help, and when can it make things slower?
Async helps when tasks are independent and can overlap, such as multiple network/database calls. It reduces idle time waiting on I/O. It can hurt when tasks are inherently sequential or tightly dependent; the coordination overhead (scheduling, state machines, callbacks/futures) may add latency without any parallelism to exploit.
What is an event loop and how does it enable efficient async I/O?
An event loop is a dispatcher that multiplexes many I/O sources (sockets, files, timers) on one thread. It repeatedly: 1) polls for events, 2) processes events, 3) runs scheduled tasks, then repeats. Using OS interfaces (e.g., io_uring, epoll, kqueue, IOCP), it registers interest and wakes only when sources are ready, avoiding blocking and enabling a single thread to manage thousands of connections efficiently.
How does an event-driven server differ from a synchronous server?
Key differences: 1) Non-blocking operations: rather than calling blocking recv()/send(), it registers interest in readiness and only acts when readable/writable. 2) Resource efficiency: one event loop thread can handle many connections by polling multiple sources at once. 3) Structure: work is split into small tasks (accept, read/process, write) triggered by readiness events instead of a linear request lifecycle per thread.
What are the main challenges of building asynchronous systems?
Common pitfalls include: 1) Complexity: managing dependencies, race conditions, and non-linear control flow. 2) Resource management: many concurrent tasks can exhaust memory/handles; throttling is needed. 3) Debuggability: non-deterministic ordering and limited stack traces complicate diagnosis. 4) Error handling: coordinating partial failures and deciding how one task’s error affects others.
What is I/O multiplexing and why is it central to async servers?
I/O multiplexing (epoll, kqueue, io_uring, IOCP) lets one thread monitor many connections and act only when specific events occur (readable, writable, timers). It removes the need for a thread-per-connection, cutting context switches and improving scalability for high-connection workloads like web servers and real-time systems.
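Python's `selectors` module wraps these OS interfaces behind one API, which makes the idea easy to demonstrate. The sketch below uses a `socketpair` as a toy stand-in for a server watching many client connections: the selector wakes only when a registered socket is actually readable:

```python
import selectors
import socket

# One selector can monitor many sockets; here we register just one
# end of a socket pair and wait for it to become readable.
sel = selectors.DefaultSelector()      # picks epoll/kqueue/etc. per OS
left, right = socket.socketpair()
left.setblocking(False)
right.setblocking(False)

sel.register(right, selectors.EVENT_READ)

left.sendall(b"ping")                  # make `right` readable

received = b""
for key, _events in sel.select(timeout=1):
    received = key.fileobj.recv(1024)  # act only on the ready socket

sel.close()
left.close()
right.close()
```

A real server would keep looping over `sel.select()` with thousands of registered connections; the cost of monitoring stays roughly constant per ready event rather than per connection.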
How do request batching and buffered I/O hide latency?
  • Request batching: send multiple operations in a single network round-trip to amortize RTT; useful when responses can be processed asynchronously or are not latency-critical. Tune the batch size or send responses individually to avoid response latency spikes.
  • Buffered I/O: accumulate reads/writes in memory and issue larger system calls to reduce syscall overhead. It can be paired with readahead to fetch data before it’s needed.
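The amortization effect of batching can be shown with a small sketch. `BatchingClient` is a hypothetical client whose (simulated) round-trip counter stands in for real network calls:

```python
class BatchingClient:
    """Hypothetical client that amortizes round-trips by sending many
    operations per call instead of one call per operation."""

    def __init__(self, batch_size: int):
        self.batch_size = batch_size
        self.pending: list = []
        self.round_trips = 0

    def send(self, op) -> None:
        self.pending.append(op)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if self.pending:
            # One (simulated) network round-trip for the whole batch.
            self.round_trips += 1
            self.pending.clear()

client = BatchingClient(batch_size=8)
for i in range(20):
    client.send(i)
client.flush()                        # flush the final partial batch

# 20 operations cost 3 round-trips instead of 20.
```

The trade-off named above shows up in the batch size: larger batches amortize more but delay the first operations in each batch.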
What is request hedging and what trade-offs does it involve?
Hedging sends duplicate requests and uses the first response, masking latency variance from networks or third-party services. Trade-offs: higher load on the service (which can worsen latency), the need for idempotent operations, and more complex error policies (e.g., accept the first response vs. wait for specific success conditions). Use selectively where variability is high and capacity permits.
How do deferring work, task scheduling, and priority queues improve user-perceived latency?
  • Deferring work: run only user-visible, time-critical tasks immediately; postpone non-critical tasks (e.g., analytics, reconciliation) to reduce foreground latency.
  • Scheduling: balance immediate vs. deferred execution to preserve resources.
  • Priority queues: assign importance levels, prevent starvation via aging, and adjust priorities dynamically (e.g., off-peak boosts for batch tasks) to keep critical paths fast.
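Aging can be sketched on top of `heapq`. In this toy model (one possible aging scheme, not a production scheduler), every pop boosts the priority of everything still waiting, so a low-priority batch task eventually beats a steady stream of fresh high-priority arrivals instead of starving:

```python
import heapq
import itertools

class AgingPriorityQueue:
    """Toy priority queue with aging: each pop boosts everything
    still waiting by one step, so old tasks are never starved."""

    def __init__(self):
        self.heap: list = []
        self.counter = itertools.count()  # tie-breaker keeps FIFO order

    def push(self, priority: int, task: str) -> None:
        # Lower number = higher priority.
        heapq.heappush(self.heap, [priority, next(self.counter), task])

    def pop(self) -> str:
        _priority, _, task = heapq.heappop(self.heap)
        # Age: boost every task left behind. A uniform decrement
        # preserves heap order, so no re-heapify is needed.
        for entry in self.heap:
            entry[0] -= 1
        return task

pq = AgingPriorityQueue()
pq.push(5, "batch-report")            # low priority (higher number)
order = []
for i in range(8):
    pq.push(1, f"user-{i}")           # a steady stream of urgent tasks
    order.append(pq.pop())
```

Without aging, `batch-report` would never run while urgent tasks keep arriving; with it, the report runs after a bounded wait (here, on the fifth pop).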
How do backpressure, buffering, and rate limiting keep async systems stable?
  • Backpressure: consumers signal producers to slow down (e.g., honoring the TCP receive window or delaying reads until capacity exists).
  • Buffering: use bounded queues to absorb bursts; size carefully (large buffers tolerate spikes but add latency; small buffers reduce latency but may drop work). Distinguish this from buffered I/O, which reduces syscall/I/O overhead.
  • Dropping/rate limiting: as a last resort, reject or limit requests per client/time window to prevent overload and protect tail latency.
