Kafka for Architects you own this product

Event-driven architecture, logs, microservices, real-time event processing

Katya Gorshkova

MEAP began November 2024
Last updated September 2025
Publication in December 2025 (estimated)

ISBN 9781633436411
375 pages (estimated)

Included with a Manning Online subscription

printed in black & white

available in Russian

catalog / Data Science / Big Data / Stream Processing

table of content

1 Getting to know Kafka as an Architect

1.1 How an architect sees Kafka

1.1.1 Event-driven architecture

1.1.2 Handling myriads of data

1.2 Field notes: Journey of an event-driven project

1.3 Key players in the Kafka ecosystem

1.3.1 Brokers and clients

1.3.2 Managing cluster metadata

1.4 Architectural principles

1.4.1 The Publish-Subscribe pattern

1.4.2 Reliable delivery

1.4.3 The commit log

1.5 Designing and managing data flows

1.5.1 Schema Registry: handling data contracts

1.5.2 Kafka Connect: data replication without code

1.5.3 Data transformation: streaming frameworks

1.6 Impacting operations and infrastructure

1.6.1 Kafka tuning and maintenance

1.6.2 On-premise and cloud options

1.6.3 Solutions from other cloud providers

1.7 Applying Kafka in enterprise

1.7.1 Using Kafka for sending messages

1.7.2 Using Kafka for storing data

1.7.3 How Kafka is different

1.8 Field notes: Getting started with a Kafka project

1.9 Online Resources

1.10 Summary

2 Kafka Cluster Data Architecture

2.1 Field notes: From sketch to project—topics, partitions, and keys

2.2 Inside the Kafka cluster

2.3 Core concepts of data processing

2.3.1 Partitioning the topic

2.3.2 Processing data concurrently

2.3.3 Ordering within a topic

2.4 Field notes: Capturing the architecture of topics, partitions, and beyond

2.4.1 Introducing AsyncAPI

2.5 Replicating partitions

2.5.1 Replica leaders and followers

2.5.2 Choosing replication factor and minimal number of in-sync replicas

2.5.3 Field notes: Extending topic configuration with replication information

2.5.4 Architecture Notes: Configuring Topics

2.6 Inside the topic

2.6.1 Messages: keys, values and headers

2.6.2 Field notes: First draft for documenting messages in AsyncAPI

2.6.3 Message batches and offsets

2.6.4 Physical representation of a topic

2.6.5 Data retention

2.6.6 Selecting the number of partitions

2.6.7 Field notes: Configuring topic metadata

2.6.8 Architecture Points: Advanced Topic Configuration

2.7 Compacted topics

2.7.1 The idea of compaction

2.7.2 How compaction works

2.7.3 Making decisions about compaction policy

2.7.4 Architecture Points: Compaction

2.8 Online resources

2.9 Summary

3 Kafka Clients and Message Production

3.1 Field notes: Patterns and pitfalls for Kafka clients

3.2 Communicating with Kafka

3.2.1 Configuring clients

3.2.2 Connecting to Kafka

3.2.3 Serializing and deserializing data

3.2.4 Setting quotas

3.2.5 Field notes: Setting up Customer360 project

3.2.6 Architecture Points: Initial Producer Configuration

3.3 Sending a message

3.3.1 Partitioning strategy

3.3.2 Field notes: Partitioning strategy for Customer360

3.3.3 Acknowledgement strategies

3.3.4 Field notes: Acknowledgment strategy for Customer360

3.3.5 Batches and timeouts

3.3.6 Field notes: Configuring producer for Customer360

3.3.7 Common producer issues

3.3.8 Architecture Points: advanced producer configuration

3.4 Online resources

3.5 Summary

4 Creating Consumer Applications

4.1 Field notes: Consumer patterns and trade-offs

4.2 Organizing consumer applications

4.3 Receiving a message

4.3.1 Reading data in parallel

4.3.2 Field notes: Setting initial consumer configuration for Customer 360 project

4.3.3 Group Leader and Group Coordinator

4.3.4 Committing the offsets

4.3.5 Field notes: Specifying the strategy for committing offsets for Customer360

4.3.6 Creating batches

4.3.7 Timeouts and partition rebalance

4.3.8 Static Group Membership

4.3.9 Partition assignment strategies

4.3.10 The Next-Gen Consumer Rebalance Protocol

4.3.11 Subscriptions and assignment

4.3.12 Reading data from compacted topics

4.3.13 Field notes: Consumer considerations for the Customer360 project

4.4 Common consumer issues

4.5 Data compression

4.6 Accessing Kafka through REST Proxy

4.7 Online resources

4.8 Summary

5 Kafka in Real World Use Cases

5.1 Field notes: When to choose Kafka—and when not to

5.2 Navigating real-world implementation

5.2.1 Event-driven microservices

5.2.2 Data integration

5.2.3 Collecting logs

5.2.4 Real-time data processing

5.3 Differences with other messaging platforms

5.3.1 Publish-subscribe model

5.3.2 Partitioned data

5.3.3 Lack of broker-side logic

5.3.4 Sequential data access

5.3.5 Message persistence

5.3.6 Limitations in handling large messages

5.3.7 Scalability and high throughput

5.3.8 Fault tolerance

5.3.9 Batch processing

5.3.10 Lack of global ordering

5.4 Kafka Alternatives

5.4.1 RabbitMQ

5.4.2 Apache Pulsar

5.4.3 Solutions from cloud providers

5.5 Online Resources

5.6 Summary

6 Defining Data Contracts

6.1 Field notes: Turning business facts into schemas

6.2 How Kafka handles event structure

6.3 Designing events

6.3.1 Challenges in event design

6.3.2 Fact and delta events: representing state changes

6.3.3 Composite, atomic, and aggregate events: representing event structure

6.3.4 Pulling state on notification

6.3.5 Evolution of types

6.3.6 Mapping events to Kafka messages

6.3.7 Field notes: Data strategies for Customer 360 project

6.4 Event Governance

6.4.1 Data Formats

6.4.2 Field notes: selecting data format for Customer 360 project

6.4.3 Data ownership

6.4.4 Organizing data and communicating changes

6.4.5 Field notes: Designing events for Customer 360 project

6.5 Schema Registry

6.5.1 Schema Registry in Kafka ecosystem

6.5.2 Registering schemas

6.5.3 A Concept of Subject

6.5.4 Compatibility Rules

6.5.5 Alternatives to Schema Registry

6.5.6 Handling data contracts without the centralized server

6.5.7 Commercial extensions for data contracts

6.6 Common problems in handling data contracts

6.6.1 Absence of server-side validation

6.6.2 Handling incompatible changes for non-compacted topics

6.6.3 Migrating state

6.6.4 Automatic registration of schemas

6.7 Online Resources

6.8 Summary

7 Kafka Interaction Patterns

7.1 Field notes: When Kafka helps—and when it hurts

7.1 Using Kafka in Microservices

7.1.1 Smart endpoints and dumb pipes

7.1.2 Request-response pattern

7.1.3 CQRS pattern

7.1.4 Event sourcing with snapshotting

7.1.5 Having “hot” and “cold” data

7.2 Field notes: Implementing data mesh with Kafka

7.2.1 Motivations for data mesh

7.2.2 Domain ownership

7.2.3 Data as a product

7.2.4 Federated governance

7.2.5 Self-Serve platform

7.3 Using Kafka Connect

7.3.1 Kafka Connect at a glance

7.3.2 Internal Kafka Connect architecture

7.3.3 Converters

7.3.4 Single message transformations

7.3.5 Source connectors

7.3.6 Sink connectors

7.3.7 Changes in incoming data structure

7.3.8 Integrating Kafka and databases

7.3.9 Field notes: Creating connector for Customer 360

7.3.10 Common Kafka Connect problems

7.4 Ensuring delivery guarantee

7.4.1 Producer idempotence

7.4.2 Understanding Kafka transactions

7.4.3 Transactional outbox pattern

7.5 Online resources

7.6 Summary

8 Designing Streaming Applications

8.1 Field notes: Transforming data in motion

8.2 Introducing Kafka Streams

8.2.1 ETL, ELT and Stream Processing

8.2.2 Introduction to the Kafka Streams framework

8.2.3 Benefits of using Kafka Streams

8.3 Field notes: Sketching out Customer360 with Kafka Streams

8.4 Processing data

8.4.1 Stateless Operations

8.4.2 Stateful operations

8.4.3 Processing API

8.4.4 Kafka Streams internal architecture

8.4.5 Windowing operations

8.4.6 Joining streams

8.4.7 Field notes: Implementing CustomerJoinService

8.4.8 Interactive Queries

8.5 Alternative Solutions

8.5.1 Confluent ksqlDB

8.5.2 Apache Flink

8.5.3 Solutions from Cloud Providers

8.6 Common streaming application issues

8.6.1 Memory and disk capacity planning

8.6.2 Incorrect topic partitioning

8.6.3 Out-of-order data

8.6.4 Late arriving data

8.6.5 State Store initialization

8.6.6 Monitoring and debugging challenges

8.7 Online resources

8.8 Summary

9 Managing Kafka within the Enterprise

9.1 Field notes: From prototype to deployment

9.2 Managing metadata

9.2.1 Introducing KRaft controllers

9.2.2 Example of cluster configuration

9.2.3 Failover scenarios

9.2.4 Using Zookeeper

9.3 Choosing a deployment solution

9.3.1 Choosing between on-premises and cloud Kafka deployment

9.3.2 Hybrid Approach

9.3.3 Choosing the Right Deployment for a Customer 360 Project

9.4 Protecting Kafka

9.4.1 Kafka security overview

9.4.2 Encrypting using TLS

9.4.3 Authentication

9.4.4 Authorization

9.4.5 Protecting data at rest

9.4.6 Enabling security in the Customer360 project

9.5 Online resources

9.6 Summary

10 Organizing a Kafka Project

10.1 Defining Kafka Project Requirements

10.1.1 Field notes: Use-Case Intake and Requirements

10.1.2 Identifying event-driven workflows

10.1.3 Turning business workflows into events

10.1.4 Gathering functional requirements for Kafka topics

10.1.5 Identifying non-functional requirements

10.2 Maintaining Cluster Structure

10.2.1 Using tools

10.2.2 Using GitOps for Kafka configurations

10.2.3 Using the Kafka Admin API

10.2.4 Setting up environments

10.2.5 Field notes: Choosing a solution for Customer360 project

10.3 Testing Kafka applications

10.3.1 Unit testing

10.3.2 Integration testing

10.3.3 Performance tests

10.4 Online Resources

10.5 Summary

11 Operating Kafka

11.1 Cluster evolution and upgrades

11.1.1 Adding brokers and distributing the load

11.1.2 Removing a broker from the cluster

11.1.3 Upgrading clients

11.1.4 Data mobility

11.2 Monitoring Kafka cluster

11.2.1 Types of metrics in monitoring

11.2.2 Kafka monitoring objects

11.2.3 Ownership of Monitoring Responsibilities

11.2.4 Monitoring Stacks and Tools

11.3 Performance Tuning Clinic

11.3.1 Balancing throughput and latency

11.3.2 Balancing data safety and up-time

11.4 Disaster Recovery & failover

11.4.1 RTO/RPO Engineering

11.5 Online Resources

11.6 Summary

12 What’s next for Kafka

12.1 Kafka’s origins: A path to event backbone

12.2 Kafka as an orchestration platform

12.3 Integration with new runtimes

12.3.1 Kafka with WebAssembly

12.3.2 Serverless Kafka

12.3.3 Kafka at the edge

12.4 Diskless Kafka: decoupling storage from brokers

12.5 Kafka in AI/ML world

12.5.1 Incremental learning

12.5.2 Feature engineering in motion

12.5.3 Kafka and AI agents

12.6 Summary

Overview

4 Creating Consumer Applications

This chapter focuses on the consumer side of Kafka: how applications read at scale, coordinate safely, and stay correct. It clarifies subscribing to topics versus explicitly positioning within a stream, shows how batching and timeouts shape throughput and latency, makes consumer groups and rebalances concrete, and surfaces typical pitfalls such as lag, duplicates, and ordering. It also touches on using a REST proxy as a lightweight integration option so teams can access Kafka over HTTP. The outcome is a set of practices for building consumer logic that’s reliable, efficient, and easy to operate.

Receiving messages from Kafka: Explains how consumers read records, subscribe to topics or explicitly control their position in the stream (offsets), and how batching and timeouts influence throughput and latency.
Principles of parallel message reception: Describes partition-based parallelism via consumer groups, assignment of partitions to consumers, and how rebalances occur as members join or leave.
Common challenges in Kafka consumer handling: Highlights the most frequent issues—consumer lag, handling duplicates, and maintaining ordering—and frames how design choices affect each.
Accessing Kafka via HTTP: Introduces Kafka REST Proxy as a lightweight way for applications to consume data without native clients.
Utilizing data compression in Kafka: Summarizes why and when to use compression to reduce data size and improve efficiency, noting the impact on performance characteristics.

4.1 Field notes: Consumer patterns and trade-offs

The team shifts focus to building the consumer application that aggregates data from two Kafka topics—customer profiles (keyed by customer ID) and transactions (keyed by transaction ID)—with aggregation centered on customer ID. What initially seemed simple reveals multiple operational and architectural trade-offs.

Consumer design challenges: determining the right number of service instances, controlling how much data is fetched per request to avoid memory pressure, and ensuring timely failure detection in the cluster.
Configuration burden: like producers, consumers require careful tuning to work reliably and efficiently.
Fallback integration path: REST Proxy offers a familiar HTTP interface for producing and consuming, useful as a pragmatic backup or for diverse environments.
Trade-offs: REST is approachable but adds HTTP overhead; native Kafka libraries provide full features with better throughput and lower latency.
Aggregation approach: feasible directly in the application using Java stream programming, slated for the next phase.

Outcome: the team commits to exploring consumer configurations and scaling strategies, keeps REST Proxy as a contingency, and plans to implement in-app Java-based aggregation next.

4.2 Organizing consumer applications

This section explains how Kafka consumers read data using a pull model and how to scale consumption with consumer groups. Consumers connect to the cluster, subscribe to topics, and repeatedly poll for new records. Processing is decoupled from offset updates: committing offsets marks records as seen, not necessarily fully processed. Parallelism is achieved via consumer groups, where partitions are exclusively assigned to one consumer instance within a group, while multiple groups can consume the same topic independently.

Subscribe to one or more topics and poll regularly for new records.
Receive batches when data is available or a timeout elapses.
Deserialize records.
Process records according to application logic.
Update offsets in Kafka to acknowledge receipt (not guaranteed processing).

Consumer message reception involves periodic polling, receiving available batches, deserializing, processing, and acknowledging via offsets.

Consumers coordinate in consumer groups to split work across partitions for parallel processing. Within a group, each partition is handled by only one consumer instance at a time, ensuring ordered processing per partition. Different services can run as separate consumer groups and consume the same topic independently without affecting each other.

Each service can consume the same topic independently as its own consumer group.

Overall, the section frames the consumer loop, clarifies offset semantics, and introduces consumer groups as the foundation for scalable, partition-aligned parallel consumption.

4.3 Receiving a message

This section explains how Kafka consumers fetch, process, and track progress, and how to scale, coordinate, and tune them for reliability and throughput.

Consumer read cycle: A consumer joins a group, receives partition assignments, polls for new records, deserializes and processes them, then advances its position by committing offsets. A commit records where to resume; it doesn’t acknowledge records individually.
Parallelism with consumer groups: Partitions enable parallel reads. Within a consumer group, each partition is owned by a single consumer instance, ensuring each record is processed once by the group. Multiple services needing the same data use distinct groups. Consumers can subscribe to multiple topics; assignment strategy is configurable.
Customer360 initial setup: Consumers connect via bootstrap servers, use string deserializers for JSON, share group.id=cust360, and apply topic-specific client.id prefixes. The team documents operations and IDs with AsyncAPI.
Coordination roles: A broker-side group coordinator manages group membership and distributes assignments. A client-side group leader computes assignments using the configured partition.assignment.strategy and reports to the coordinator. Consumers only talk to the coordinator and fetch from partition leaders.
Offset management: Offsets are tracked per group in __consumer_offsets. enable.auto.commit controls periodic auto-commits; many systems prefer manual commits for stronger delivery guarantees. If no offset exists, auto.offset.reset governs where to start (e.g., earliest or latest).
Customer360 offset policy: Manual commits (enable.auto.commit=false) to avoid loss; start from earliest if no offsets, requiring idempotent processing.
Batch sizing and flow control: Tune throughput vs latency and memory via:
- fetch.min.bytes and fetch.max.wait.ms (wait to build bigger batches),
- fetch.max.bytes and max.partition.fetch.bytes (caps per fetch/partition),
- max.poll.records (limit by record count).
Align settings to avoid conflicts and test in context.
Rebalances and timeouts: Group membership changes (join/leave/fail) trigger partition redistribution. Two styles:
- Eager: revoke all, then reassign (pause all consumers).
- Cooperative: move only what’s needed (minimize pauses).
Health is tracked with heartbeat.interval.ms, session.timeout.ms (broker-driven removal if missed), and max.poll.interval.ms (consumer self-removal if processing takes too long).
Static group membership: Set a stable group.instance.id to preserve assignments across brief restarts (within session timeout) and avoid unnecessary rebalances.
Assignment strategies:
- RangeAssignor (default): sequential per consumer; aligns same-index partitions across topics (useful for co-partitioned joins).
- RoundRobinAssignor: evenly distributes partitions; can cause larger movement on rebalance.
- StickyAssignor: balances while preserving existing placements where possible.
- CooperativeStickyAssignor: incremental changes to minimize disruption during rebalance.
Next-gen rebalance protocol: Moves assignment logic to brokers for consistency and simpler operations. A combined ConsumerHeartBeat API handles heartbeats and assignment with server-side assignors (even distribution, stickiness, rack awareness). Broker configs include group.consumer.assignors and heartbeat/session limits. No group leader; available only in KRaft clusters.
Explicit positioning: Beyond subscriptions from last commit, consumers can programmatically set positions by exact offsets or by timestamp-derived offsets. With direct assignment, group.id is unused and positioning is fully controlled in code.
Compacted topics: Client API is unchanged. Expect at least the latest record per key; older records may appear until compaction catches up. Choose keys carefully to match compaction needs.
Customer360 implementation notes: Two consumer sets read profiles and transactions with distinct client.id prefixes within the same group. Manual commits with earliest reset. To avoid lost updates when multiple consumers update the same aggregate, use optimistic locking (version/timestamp checks and retries). An alternative is a single stream-processing service to centralize updates, trading simplicity in concurrency for potential scaling complexity.

4.4 Common consumer issues

This section summarizes frequent challenges in building Kafka consumer applications and the configuration and design choices that mitigate them.

Consumer scalability challenges

Consumer-side scalability is bounded by the number of partitions; each partition in a group can be processed by only one consumer instance. Choose partition counts carefully (tuned via experimentation).
Scale by running multiple instances in the same consumer group; monitor whether they keep up using consumer lag.
Consumer lag reflects how many produced records haven’t been processed by the group yet; it can be read from JMX or approximated as latest produced offset minus latest committed offset (gaps can exist due to compaction and other factors).

Figure Calculating consumer lag.

Another scaling model is dispatching polled records to a bounded worker pool sharded by key (preserves per-key ordering). Offset commits must advance only to the highest contiguous completed offset (a watermark) because workers finish out of order.

Optimizing batch size configuration

Tune batch size for your workload and memory limits; max.poll.records caps the number of records per fetch but actual bytes vary by message size.
Trade-off: larger batches increase throughput but can raise latency; monitor and adapt over time.

Timeout management strategies

For long-running processing, tune max.poll.interval.ms and max.poll.records to avoid unnecessary rebalances.
Health-related timeouts—heartbeat.interval.ms and session.timeout.ms—must balance fast failure detection with stability (avoiding churn).

Error-proof deserialization processes

Brokers don’t validate payloads; consumers must handle deserialization errors robustly to prevent a single bad message from halting progress.
Use client-side error handlers; a Schema Registry (see Chapter 6) enforces contracts and compatibility across producers and consumers.

Offset initialization strategies for new consumers

Decide whether to read historical data or only new messages. The default auto.offset.reset is latest; many use cases need earliest.

Figure Different configurations of auto.offset.reset property.

Offsets have retention (offset.retention.minutes, default 7 days). If they expire while a group is inactive, auto.offset.reset dictates the restart position. Set retention to cover expected downtime to avoid loss or replay.

Accurate offset commitment practices

With enable.auto.commit=true, offsets may be committed before processing completes, risking loss on failure.
For critical paths, manage commits manually and only after successful processing to ensure at-least-once semantics.

Coordinating transactions across external systems

Kafka doesn’t provide distributed transactions. If you write to a database and commit Kafka offsets, ordering matters:
- Commit offsets first: risk of data loss if DB write fails.
- Write DB first: risk of duplicates if offset commit fails.
Chapter 6 discusses strategies to manage these trade-offs.

4.5 Data compression

Kafka supports message compression to improve throughput and reduce storage by compressing batches of records. The recommended approach is producer-side compression: the producer compresses each batch, the broker stores and transmits it as-is, and consumers automatically decompress without extra configuration.

Producer level:
- Configure via compression.type in the producer: gzip, snappy, lz4, zstd, or none.
- Compression happens per batch, yielding better ratios due to data similarity.
- Data remains compressed end-to-end until the consumer reads it.
Broker level:
- Configure via compression.type in the broker (can be per topic): uncompressed, producer (default), or a specific codec (zstd, lz4, snappy, gzip).
- producer retains the producer’s codec (preferred). If the broker enforces a different codec, it must decompress and recompress, adding overhead.
Compression is more effective on text formats (e.g., JSON) and less on compact binary formats (e.g., Avro). Encrypted data does not compress meaningfully.
There is a small CPU cost; it’s typically a good trade-off for aggregated workloads (like logs) where throughput matters more than latency.

Figure Compressing the data on producer side.

4.6 Accessing Kafka through REST Proxy

The REST Proxy enables applications to interact with Kafka over HTTP/HTTPS instead of Kafka’s native protocol, acting as an intermediary that translates REST calls into Kafka operations and returns results via HTTP responses. This approach is useful when native Kafka client libraries are unavailable for a chosen language, for rapid prototyping, and for administrative tasks where ultra-low latency is not essential.

How it works: clients call REST endpoints; the proxy forwards requests to Kafka and returns responses over HTTP.
Implementations:
- Confluent REST Proxy: HTTP interface for producing/consuming messages and administering topics, brokers, and configurations.
- Kafka Bridge (Strimzi): similar capabilities, optimized for cloud-native and Kubernetes environments.
Cloud integration: can be fronted by cloud API gateways (e.g., Amazon API Gateway) to simplify authentication/authorization via cloud identity services and to publish APIs for broader access within the ecosystem.
Best fit: heterogeneous language environments, quick experiments, and operational tooling; not ideal for latency-critical workloads.

Communicating to Kafka through REST.

4.7 Online resources

This section curates authoritative references to deepen understanding and improve practice when building Kafka consumer applications.

Kafka Consumer Configuration Reference: Comprehensive catalog of consumer properties, defaults, and trade-offs affecting reliability and performance, including polling, heartbeats, commit modes, fetch tuning, isolation levels, deserialization, and security.
Understanding Kafka Partition Assignment Strategies: Explains how assignments are computed and maintained; contrasts range, round-robin, sticky, and cooperative-sticky strategies; and offers guidance for designing custom assignors.
Consumer Group Protocol: Walkthrough of group membership and rebalance flows (join, sync, heartbeat), failure handling, and the mechanics behind cooperative rebalancing.
Kafka Consumer Groups and Offsets: Practical guidance on managing groups and offsets, understanding delivery guarantees, choosing synchronous vs. asynchronous commits, and inspecting lag and offsets in operations.
Apache Kafka Message Compression: Details codec choices and their end-to-end impact on throughput, latency, and CPU; when to enable compression and where (producer vs. broker) for optimal results.
Confluent REST Proxy for Apache Kafka: Describes producing and consuming over HTTP, schema handling, and operational considerations for non-JVM integrations and polyglot environments.
Strimzi Kafka Bridge Documentation: Presents a Kubernetes-friendly bridge for producing and consuming over REST/AMQP, including capabilities and limitations in cloud-native setups.
What Are Apache Kafka Consumer Group IDs?: Clarifies dynamic vs. static membership, stable group identity, and techniques to reduce disruptive rebalances during deployments and scaling.
KIP-848: The Next Generation of the Consumer Rebalance Protocol: Outlines protocol improvements for faster, safer, and more incremental rebalances to enhance scalability and availability.

Together, these resources support configuring consumers for correctness and efficiency, selecting assignment and commit strategies, integrating via REST bridges, and preparing for upcoming changes in the consumer protocol.

4.8 Summary

Consumers continuously request data, deserialize incoming messages, process them according to application-specific logic, and then commit the offsets back to Kafka. To facilitate parallel processing, consumers operate within a consumer group, where partitions are equitably distributed among group members. Consumers can either subscribe to topics to dynamically receive partition assignments or explicitly assign specific partitions for targeted data processing. Each consumer group has a group coordinator, which is a broker, responsible for managing the members of the group and facilitating rebalances. Within each consumer group, a group leader is designated to manage partition assignments and coordinate with the group coordinator. Timeouts are set to efficiently detect and manage consumer inactivity or failures. Messages can be compressed by either the producer or the broker to enhance network throughput and reduce storage space. By default, compression is handled by the producer, which specifies the compression type in its configuration. When utilizing the Confluent Rest Proxy, clients can interact with Kafka via HTTP REST. The Rest Proxy serves as a mediator, converting requests and responses between the HTTP REST format and Kafka's native protocol.

pro $24.99 per month

access to all Manning books, MEAPs, liveVideos, liveProjects, and audiobooks!
choose one free eBook per month to keep
exclusive 50% discount on all purchases
renews monthly, pause or cancel renewal anytime

lite $19.99 per month

access to all Manning books, including MEAPs!

team

5, 10 or 20 seats+ for your team - learn more

eBook

pdf, ePub, online

$55.99 $41.99

you save $14.00 (25%)

pro $24.99 per month

access to all Manning books, MEAPs, liveVideos, liveProjects, and audiobooks!
choose one free eBook per month to keep
exclusive 50% discount on all purchases
renews monthly, pause or cancel renewal anytime

lite $19.99 per month

access to all Manning books, including MEAPs!

team

5, 10 or 20 seats+ for your team - learn more

eBook

$55.99 $41.99

you save $14.00 (25%)

eBook

pdf, ePub, online

$55.99 $41.99

you save $14.00 (25%)

pro $24.99 per month

access to all Manning books, MEAPs, liveVideos, liveProjects, and audiobooks!
choose one free eBook per month to keep
exclusive 50% discount on all purchases
renews monthly, pause or cancel renewal anytime

lite $19.99 per month

access to all Manning books, including MEAPs!

team

5, 10 or 20 seats+ for your team - learn more