Overview

1 Introduction to Apache Kafka

This introduction explains why modern organizations need to handle data as events in real time and how Apache Kafka addresses that need. As customer expectations and data volumes rise, traditional batch architectures fall short. Kafka provides a durable, distributed log that enables both streaming and replay, scales horizontally for massive throughput, and supports fault tolerance, making it a backbone for event-driven systems across industries. The chapter also sets expectations for the book: it offers practical guidance for architects, operators, and developers on using Kafka effectively, rather than serving as a language-specific developer manual.

The chapter positions Kafka as the “central nervous system” for enterprise data, where every significant event is published to topics and consumed by downstream systems asynchronously. This model decouples producers and consumers, easing integration between legacy systems and modern services, and supports independently evolving microservices through resilient, event-driven communication. Kafka’s design embraces standard hardware and tolerates component failures, helping organizations transition from batch to real time while maintaining reliability and operational simplicity.

An architectural overview introduces core components—producers, topics, partitions, brokers, leaders and followers, consumers, and consumer groups—and explains how coordination is handled by KRaft (replacing ZooKeeper). Through an example flow (such as bank transfers), the chapter illustrates partitioned parallelism, durable storage, and failover behavior that enable scalable, accurate processing. It closes with guidance on running Kafka: right-sizing infrastructure, planning topics and partitions, building producer/consumer applications, monitoring and operations, and considering managed services alongside in-house expertise. The learning path ahead emphasizes hands-on, step-by-step examples that build practical skills for designing, operating, and integrating Kafka in real-world environments.
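
To make the bank-transfer example concrete, the sketch below shows what publishing such an event could look like with Kafka's Java producer client. It is a minimal illustration under stated assumptions: the topic name bank-transfers, the broker address localhost:9092, and the JSON payload are made up for this example. Keying the record by account ID means all transfers for one account land in the same partition and are therefore read back in order.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TransferProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed local broker
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        // try-with-resources closes the producer and flushes pending records on exit
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Key = account ID: all events for this account go to the same partition
            producer.send(new ProducerRecord<>("bank-transfers", "account-42",
                    "{\"from\":\"account-42\",\"to\":\"account-7\",\"amount\":100.00}"));
        }
    }
}
```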

Figure: Kafka as the central nervous system for data in a company. Every event that takes place in the enterprise is stored in Kafka; other services can react to these events asynchronously and process them further.
Figure: The components of Apache Kafka and the data flow.

Summary

  • Kafka is a powerful distributed streaming platform operating on a publish-subscribe model, allowing seamless data flow between producers and consumers.
  • Widely adopted across industries, Kafka excels in real-time analytics, event sourcing, log aggregation, and stream processing, supporting organizations in making informed decisions based on up-to-the-minute data.
  • Kafka's architecture prioritizes fault tolerance, scalability, and durability, ensuring reliable data transmission and storage even in the face of system failures.
  • From finance to retail and telecommunications, Kafka finds applications in real-time fraud detection, transaction processing, inventory management, order processing, network monitoring, and large-scale data stream processing.
  • Beyond its core messaging system, Kafka offers an ecosystem with tools like Kafka Connect and Kafka Streams, providing connectors to external systems and facilitating the development of stream processing applications, enhancing its overall utility.
  • Kafka can serve as a central hub for diverse system integration.
  • Producers send messages to Kafka for distribution.
  • Consumers receive and process messages from Kafka.
  • Topics organize messages into channels or categories.
  • Partitions divide topics to parallelize and scale processing (see the topic-creation sketch after this list).
  • Brokers are Kafka servers managing storage, distribution, and retrieval.
  • ZooKeeper/KRaft coordinates and manages tasks in a Kafka cluster.
  • Kafka ensures data resilience through replication.
  • Kafka scales horizontally by adding more brokers to the cluster.
  • Kafka can run on general-purpose hardware.
  • Kafka is implemented in Java and Scala, but clients also exist for other programming languages, such as Python.
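
As referenced in the partition bullet above, topics, partitions, and replication come together when a topic is created. The following sketch uses Kafka's Java AdminClient to create a topic; the topic name bank-transfers, the partition count, the replication factor, and the broker address are illustrative assumptions, not recommendations from the chapter.

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTransfersTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed local broker

        try (AdminClient admin = AdminClient.create(props)) {
            // 6 partitions allow up to 6 consumers in one group to work in parallel;
            // replication factor 3 keeps each partition's data on three brokers.
            NewTopic topic = new NewTopic("bank-transfers", 6, (short) 3);
            admin.createTopics(Collections.singleton(topic)).all().get(); // block until created
        }
    }
}
```

With this layout, losing a single broker does not lose data: one of the remaining replicas of each affected partition takes over as leader.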

FAQ

What is Apache Kafka and why is it relevant today?
Apache Kafka is an open-source distributed streaming platform that serves as a persistent, distributed log for data events. It enables organizations to handle high-volume, real-time data flows, supporting both immediate processing and reliable replay. This helps companies meet modern expectations for instant responses and continuous data-driven operations.

How does Kafka act as the “central nervous system” for enterprise data?
Kafka stores business events from across the enterprise in topics, allowing services to publish (produce) and subscribe (consume) asynchronously. By centralizing events, it decouples systems, lets teams evolve services independently, and enables real-time reactions without brittle point-to-point integrations.

What problems does Kafka solve compared to batch processing and point-to-point integrations?
Kafka addresses real-time processing needs, reducing delays inherent in batch systems. It avoids the complexity and fragility of point-to-point connections by providing a scalable, durable event backbone, enabling independent team deployments and reliable data flow across many services.

How does Kafka’s publish-subscribe (producer-consumer) model work, and what sets it apart?
Producers write messages to topics; consumers read them as needed. Unlike many traditional messaging systems, Kafka persists data for configurable periods, so multiple consumers can read the same data at different times and systems can replay history after failures or for new use cases.
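
Because Kafka retains messages rather than deleting them on consumption, a consumer can rewind and re-read history. The sketch below is a minimal illustration assuming the hypothetical bank-transfers topic and a local broker: it assigns one partition explicitly and seeks to its beginning before polling, which is the replay case; a typical application would instead subscribe to the whole topic and read from its committed offsets.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class ReplayConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed local broker
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition partition = new TopicPartition("bank-transfers", 0);
            consumer.assign(Collections.singleton(partition));          // manual assignment, no group needed
            consumer.seekToBeginning(Collections.singleton(partition)); // replay retained history
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("offset=%d key=%s value=%s%n",
                        record.offset(), record.key(), record.value());
            }
        }
    }
}
```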

What are the core components of Kafka’s architecture?
Key elements include messages (records), producers, topics, partitions, consumers, consumer groups, brokers, leaders, followers (replicas), and a coordination layer (KRaft). Together they provide horizontal scalability, fault tolerance, and efficient parallel processing of data streams.

What are topics and partitions, and why are partitions important?
Topics group related messages, similar to tables in a database. They are split into partitions to enable parallelism and scale. Partitions are replicated across brokers for resiliency, and a leader handles reads/writes while followers replicate data for failover.
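
Leader and replica placement can also be inspected programmatically. The sketch below, again assuming the hypothetical bank-transfers topic, a local broker, and a reasonably recent Java client (allTopicNames requires roughly Kafka 3.1 or later), prints which broker leads each partition and where its follower replicas live.

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.TopicDescription;
import org.apache.kafka.common.TopicPartitionInfo;

public class DescribeTransfersTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed local broker

        try (AdminClient admin = AdminClient.create(props)) {
            TopicDescription description = admin
                    .describeTopics(Collections.singleton("bank-transfers"))
                    .allTopicNames().get()
                    .get("bank-transfers");
            for (TopicPartitionInfo partition : description.partitions()) {
                // The leader serves reads and writes; replicas are the brokers holding copies.
                System.out.printf("partition=%d leader=%s replicas=%s%n",
                        partition.partition(), partition.leader(), partition.replicas());
            }
        }
    }
}
```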

How do consumer groups provide scalable and fault-tolerant processing?
A consumer group shares the workload of a topic’s partitions so that each partition is processed by exactly one consumer in the group. This enables parallel processing and, if a consumer fails, the remaining consumers in the group take over, ensuring continuity.
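
A minimal group member could look like the sketch below; the group id transfer-processors and the topic name are assumptions for illustration. Every instance started with the same group.id shares the topic's partitions, and when an instance fails its partitions are rebalanced to the remaining members.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class TransferGroupConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed local broker
        props.put("group.id", "transfer-processors");     // all instances share this group id
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singleton("bank-transfers"));
            while (true) {
                // Each instance is assigned a disjoint subset of the topic's partitions.
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                    System.out.printf("partition=%d key=%s value=%s%n",
                            record.partition(), record.key(), record.value());
                }
            }
        }
    }
}
```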

What is KRaft and how does it differ from ZooKeeper?
KRaft is Kafka’s built-in coordination layer based on the Raft protocol, replacing the need for an external ZooKeeper ensemble. It simplifies operations and improves scalability by keeping coordination inside Kafka itself.

What do I need to run Kafka effectively?
You need a set of servers (brokers) with low-latency, high-bandwidth networking, appropriate topic and partition design, well-built producer/consumer applications, and comprehensive monitoring and management. Kafka runs on the JVM (JRE required) and offers clients for languages like Java, Scala, and Python.

Should I self-manage Kafka or use a managed service?
Self-managing is possible but operationally demanding. Managed options include AWS MSK, Azure HDInsight, and specialized providers like Confluent and Aiven; Kafka-compatible platforms such as Redpanda and Warpstream also exist. Regardless of choice, in-house expertise is still essential to succeed with streaming architectures.
