How to Create and Manage Kafka Topics Using Command Line

This expert guide provides a comprehensive, step-by-step tutorial on managing Kafka topics exclusively through the command line interface (CLI). Learn the essential commands using the `kafka-topics.sh` utility for crucial administrative tasks: creating topics with defined partitions and replication factors, verifying configurations via description, safely increasing partition counts, and performing topic deletion. Master these actionable commands to ensure robust and efficient Kafka cluster administration.

Apache Kafka is a distributed event streaming platform often utilized for high-throughput data pipelines, real-time analytics, and microservices communication. The fundamental organizational unit within Kafka is the Topic, a category or feed name to which records are published.

While graphical tools exist, the most robust, reliable, and common way to interact with and manage Kafka infrastructure is directly through the command-line interface (CLI). Mastering these essential commands is crucial for administrators and developers responsible for maintaining a healthy and efficient Kafka cluster. This guide provides a step-by-step tutorial on using the kafka-topics.sh script to perform the most common topic management tasks.

Prerequisites and Setup

To execute the commands in this guide, you must have access to a machine where the Kafka binaries are installed. All topic management operations are performed using the kafka-topics.sh utility, typically found in the bin directory of your Kafka installation.

All commands require the address of at least one Kafka broker, specified using the --bootstrap-server flag. If you are using an older Kafka version (pre-2.2), you might still rely on the --zookeeper flag, but --bootstrap-server is the recommended and modern standard.
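For reference, the two connection styles look like this (the ZooKeeper address localhost:2181 is a placeholder for legacy setups):

```shell
# Modern (Kafka 2.2+): connect to a broker directly
kafka-topics.sh --list --bootstrap-server localhost:9092

# Legacy (pre-2.2): connect via ZooKeeper instead
kafka-topics.sh --list --zookeeper localhost:2181
```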

For the examples below, we will assume the broker is running locally on the default port:

# Standard Broker Address Placeholder
BROKER_ADDRESS="localhost:9092"

1. Creating a New Kafka Topic

Creating a topic requires defining its name, along with two critical parameters that dictate its behavior and fault tolerance: the number of partitions and the replication factor.

Essential Parameters

  • --topic <name>: The name of the topic.
  • --partitions <N>: The number of partitions the topic will be split into. Partitions are the units of parallelism and ordering within a topic.
  • --replication-factor <N>: The number of copies of the data that will be maintained across different brokers. A replication factor of 1 means no redundancy.

Command Example: Creating sales-data

This command creates a topic named sales-data with 3 partitions and a replication factor of 2 (meaning 2 copies of every partition will exist across the cluster).

kafka-topics.sh --create --topic sales-data \
  --bootstrap-server $BROKER_ADDRESS \
  --partitions 3 \
  --replication-factor 2

Tip: In a production environment, a replication factor of 3 is commonly recommended for high availability, since the topic can then tolerate the loss of up to two brokers without losing data. The number of partitions should be tuned based on anticipated throughput and consumer parallelism needs.
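Following that tip, a production-style creation command might look like this (the topic name and partition count are illustrative placeholders):

```shell
# Hypothetical production topic: replication factor 3 tolerates
# the loss of two brokers; 12 partitions allow up to 12 parallel
# consumers in a single consumer group
kafka-topics.sh --create --topic sales-data-prod \
  --bootstrap-server "$BROKER_ADDRESS" \
  --partitions 12 \
  --replication-factor 3
```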

2. Listing All Topics

To view all topics currently available in the Kafka cluster, use the --list flag.

Command Example

kafka-topics.sh --list --bootstrap-server $BROKER_ADDRESS

Output Example:

sales-data
logistics-stream
__consumer_offsets
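Note that __consumer_offsets is an internal topic Kafka uses to track consumer group offsets. To hide internal topics from the listing, kafka-topics.sh supports the --exclude-internal flag:

```shell
# List only user-created topics, hiding internal ones
# such as __consumer_offsets
kafka-topics.sh --list --exclude-internal \
  --bootstrap-server "$BROKER_ADDRESS"
```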

3. Describing Topic Configuration

Checking the existing configuration, partition count, and broker assignment for a specific topic is essential for troubleshooting and verification. Use the --describe flag.

Command Example: Describing sales-data

kafka-topics.sh --describe --topic sales-data \
  --bootstrap-server $BROKER_ADDRESS

Output Interpretation:

The output shows the configuration at both the topic level and the partition level:

Topic: sales-data  PartitionCount: 3 ReplicationFactor: 2 Configs:
  Topic: sales-data  Partition: 0 Leader: 1 Replicas: 1,2 Isr: 1,2
  Topic: sales-data  Partition: 1 Leader: 2 Replicas: 2,0 Isr: 2,0
  Topic: sales-data  Partition: 2 Leader: 0 Replicas: 0,1 Isr: 0,1

  • Leader: The broker currently responsible for handling reads/writes for that partition.
  • Replicas: The list of brokers holding a copy of that partition.
  • Isr (In-Sync Replicas): The subset of replicas that are fully synchronized with the Leader. High availability requires the Leader to be in the ISR.
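A quick health check built on this output: the --under-replicated-partitions flag of kafka-topics.sh --describe prints only partitions whose ISR has fallen behind the replica list, so empty output means everything is in sync.

```shell
# Show only partitions whose ISR is smaller than the replica set;
# no output means all replicas are fully in sync
kafka-topics.sh --describe \
  --bootstrap-server "$BROKER_ADDRESS" \
  --under-replicated-partitions
```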

4. Altering Existing Topics

Kafka provides limited mechanisms for altering topics after creation. The two most common alteration tasks are increasing the partition count and overriding default broker configuration settings.

A. Increasing Partitions

Partitions can only be increased, never decreased. Increasing partitions helps scale out consumer parallelism.

Warning: Increasing partitions changes how messages are mapped (hashed) to partitions. If your producers rely on key-based ordering guarantees, increasing partitions may break ordered delivery for existing keys.

If sales-data currently has 3 partitions, we can increase it to 5:

kafka-topics.sh --alter --topic sales-data \
  --bootstrap-server $BROKER_ADDRESS \
  --partitions 5

B. Altering Topic-Specific Configuration

You can override global broker settings (like message retention time or cleanup policies) for individual topics. When connecting with --bootstrap-server, topic configuration overrides are managed with the kafka-configs.sh utility; kafka-topics.sh accepts --config only at creation time (the legacy --zookeeper mode allowed --alter --config, but that path is deprecated).

Example: Setting a message retention time of 24 hours (86400000 milliseconds) for sales-data.

kafka-configs.sh --alter \
  --bootstrap-server $BROKER_ADDRESS \
  --entity-type topics --entity-name sales-data \
  --add-config retention.ms=86400000

To remove a specific configuration override and revert to the default broker setting, use the --delete-config flag:

kafka-configs.sh --alter \
  --bootstrap-server $BROKER_ADDRESS \
  --entity-type topics --entity-name sales-data \
  --delete-config retention.ms

5. Deleting a Kafka Topic

Topics that are no longer in use should be properly deleted to reclaim disk space and maintain cluster hygiene.

Enabling Topic Deletion

In modern Kafka releases (1.0.0 and later), topic deletion is enabled by default. On older clusters, or if your brokers have been configured otherwise, ensure the following setting is present in the server.properties file on all brokers:

delete.topic.enable=true

Command Example: Deleting old-stream

Use the --delete flag to initiate the topic removal. Topic deletion is often asynchronous, meaning the command submits the request, and the deletion happens in the background.

kafka-topics.sh --delete --topic old-stream \
  --bootstrap-server $BROKER_ADDRESS

Confirmation:

The command typically completes silently on success; older versions print a message that the topic is marked for deletion. Because deletion runs in the background, the topic may remain visible for a short time before it is fully removed.
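To confirm the deletion completed, list the topics again and filter for the name; once the background deletion finishes, grep finds no match:

```shell
# Verify old-stream no longer appears in the topic list
kafka-topics.sh --list --bootstrap-server "$BROKER_ADDRESS" \
  | grep old-stream || echo "Topic deleted"
```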

Summary of Topic Management Commands

| Action | Flag(s) | Purpose | Example Parameters |
|---|---|---|---|
| Create | `--create` | Initialize a new topic. | `--partitions 5 --replication-factor 3` |
| List | `--list` | Show all topics in the cluster. | N/A |
| Describe | `--describe` | View current configuration and layout. | `--topic my-topic` |
| Alter (Partitions) | `--alter` | Increase the number of partitions. | `--partitions N` (N > current count) |
| Alter (Config) | `--alter` via `kafka-configs.sh` | Override broker defaults for a specific topic. | `--add-config retention.ms=...` |
| Delete | `--delete` | Remove a topic permanently. | `--topic my-topic` |

Conclusion and Next Steps

The command line remains the most powerful and flexible interface for managing your Kafka cluster. By mastering the kafka-topics.sh utility, you gain granular control over topic creation parameters, configuration overrides, and necessary administrative actions like deletion and description.

Next Steps:

  1. Practice these commands in a development or staging environment.
  2. Explore advanced topic configuration properties (e.g., cleanup.policy, max.message.bytes) using the kafka-configs.sh --describe command.
  3. Learn the corresponding CLI commands for producer and consumer testing (kafka-console-producer.sh and kafka-console-consumer.sh).
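As a starting point for step 3, these are the basic invocations of the console tools, using the sales-data topic from this guide:

```shell
# Produce: type messages line by line, Ctrl+C to exit
kafka-console-producer.sh --bootstrap-server localhost:9092 \
  --topic sales-data

# Consume in another terminal, reading from the earliest offset
kafka-console-consumer.sh --bootstrap-server localhost:9092 \
  --topic sales-data \
  --from-beginning
```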