Understanding Kafka Command Line Tools: CLI Reference Guide
Unlock the power of Apache Kafka with this comprehensive command-line interface (CLI) reference guide. Learn essential Kafka commands for managing topics (`kafka-topics.sh`), sending messages (`kafka-console-producer.sh`), consuming data (`kafka-console-consumer.sh`), and inspecting consumer groups (`kafka-consumer-groups.sh`). This guide details practical use cases, arguments, and best practices for effective Kafka administration and troubleshooting.
Kafka's command-line tools are the quickest way to answer basic operational questions: does this topic exist, which broker leads this partition, what is inside the topic, why is this consumer group behind, and can this client authenticate with the cluster? You do not need them for every task, and most production changes should still go through automation, but during a broken deploy or a late-night data question, the CLI is often the shortest path to facts.
The examples below assume the scripts are on your PATH. In many installations they live under Kafka's bin/ directory, so the same command may be run as bin/kafka-topics.sh. For secured clusters, the admin tools also need --command-config client.properties, where that file contains SASL, SSL, and other client settings; the console producer and consumer have traditionally taken the same file through --producer.config and --consumer.config.
Core Kafka CLI Tools
Kafka distributions typically include a bin/ directory containing various scripts and executables. We will focus on the most frequently used ones for managing Kafka effectively.
1. kafka-topics.sh
This is arguably the most frequently used command-line tool. It allows you to create, list, describe, delete, alter, and manage Kafka topics. Topic management is fundamental to organizing data streams within Kafka.
Common Subcommands and Arguments:
- --create: Creates a new topic.
- --list: Lists all topics in the cluster.
- --describe: Provides detailed information about specific topics.
- --delete: Deletes one or more topics.
- --alter: Changes the partition count or replica assignment of an existing topic. In current Kafka releases, topic configuration changes go through kafka-configs.sh instead.
- --topic <topic_name>: Specifies the topic name.
- --partitions <num_partitions>: Sets the number of partitions for a topic (used with --create, or with --alter to increase the count).
- --replication-factor <factor>: Sets the replication factor for a topic (used with --create).
- --bootstrap-server <host:port>: Specifies one or more Kafka brokers to connect to.
Examples:
Create a topic named my_topic with 3 partitions and a replication factor of 2:
kafka-topics.sh --create --topic my_topic --partitions 3 --replication-factor 2 --bootstrap-server kafka-broker-1:9092,kafka-broker-2:9092
List all topics in the cluster:
kafka-topics.sh --list --bootstrap-server kafka-broker-1:9092
Describe a topic named my_topic:
kafka-topics.sh --describe --topic my_topic --bootstrap-server kafka-broker-1:9092
This shows details such as partitions, leader, replicas, and ISRs (in-sync replicas).
Delete a topic named old_topic:
kafka-topics.sh --delete --topic old_topic --bootstrap-server kafka-broker-1:9092
Note: Topic deletion must be enabled in the Kafka broker configuration (delete.topic.enable=true, which has been the default since Kafka 1.0).
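Topic creation is often scripted, and a plain --create fails when the topic already exists. The following is a minimal sketch of a rerunnable wrapper, assuming a bash-compatible shell. The function name create_topic and the KAFKA_TOPICS override variable are hypothetical; --if-not-exists is a standard kafka-topics.sh flag.

```shell
# KAFKA_TOPICS is a hypothetical override so the function can point at
# bin/kafka-topics.sh (or at a stub while testing). Defaults to the PATH lookup.
KAFKA_TOPICS="${KAFKA_TOPICS:-kafka-topics.sh}"

# Usage: create_topic <bootstrap> <topic> <partitions> <replication-factor>
# --if-not-exists turns "topic already exists" into a no-op instead of an error.
create_topic() {
  "$KAFKA_TOPICS" \
    --bootstrap-server "$1" \
    --create --if-not-exists \
    --topic "$2" \
    --partitions "$3" \
    --replication-factor "$4"
}
```

For example, create_topic kafka-broker-1:9092 my_topic 3 2 can be run repeatedly from a provisioning script without failing on the second run.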
2. kafka-console-producer.sh
This tool allows you to send messages to a Kafka topic from standard input. It's invaluable for testing producers, injecting sample data, or manually publishing messages.
Common Arguments:
- --topic <topic_name>: Specifies the target topic.
- --bootstrap-server <host:port>: Specifies the Kafka broker to connect to.
- --property <key>=<value>: Passes settings to the message reader that parses your input (e.g., parse.key, key.separator).
- --producer-property <key>=<value>: Sets producer client configurations (e.g., acks, compression.type).
Examples:
Send messages to my_topic:
kafka-console-producer.sh --topic my_topic --bootstrap-server kafka-broker-1:9092
After running this, you can type messages line by line. Press Ctrl+C to exit.
Send messages with keys:
kafka-console-producer.sh --topic my_topic --bootstrap-server kafka-broker-1:9092 --property parse.key=true --property key.separator=':'
Now you can type key:value pairs, and the text before the first separator is sent as the record key.
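With parse.key=true, the console producer by default rejects an input line that lacks the separator, so it can help to check a file before piping it in. This is a small sketch, assuming awk is available; check_keyed_lines and events.txt are hypothetical names.

```shell
# Usage: check_keyed_lines [separator] < file
# Prints every line that lacks the separator and returns non-zero if any exist.
check_keyed_lines() {
  local sep="${1:-:}"
  awk -v sep="$sep" '
    index($0, sep) == 0 {
      printf "line %d has no \"%s\" separator: %s\n", NR, sep, $0
      bad = 1
    }
    END { exit bad }
  '
}
```

A possible use is check_keyed_lines ':' < events.txt, followed by redirecting the same file into kafka-console-producer.sh only when the check succeeds.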
3. kafka-console-consumer.sh
This tool subscribes to one or more Kafka topics and prints the messages it receives to standard output. It's essential for testing consumers, inspecting data in topics, and debugging producer/consumer applications.
Common Arguments:
- --topic <topic_name>: Specifies the topic to consume from.
- --bootstrap-server <host:port>: Specifies the Kafka broker to connect to.
- --group <group_id>: Specifies the consumer group ID. This is important for managing offsets and allowing multiple consumers to share the consumption load.
- --from-beginning: Reads from the earliest available offset when the group has no committed offset yet.
- --offset <offset>: Starts consuming from a specific offset (requires --partition).
- --partition <partition_id>: Consumes from a specific partition.
- --property <key>=<value>: Passes settings to the output formatter (e.g., print.key, value.deserializer).
Examples:
Consume new messages from my_topic (without --from-beginning, the consumer starts at the latest offset):
kafka-console-consumer.sh --topic my_topic --bootstrap-server kafka-broker-1:9092
Consume messages from the beginning of my_topic for consumer group my_group:
kafka-console-consumer.sh --topic my_topic --group my_group --from-beginning --bootstrap-server kafka-broker-1:9092
Consume messages with keys, offsets, and headers printed (the default key separator is a tab):
kafka-console-consumer.sh --topic my_topic --bootstrap-server kafka-broker-1:9092 --property print.key=true --property key.separator=: --property print.offset=true --property print.headers=true
4. kafka-consumer-groups.sh
This tool is used to manage and inspect consumer groups. It's vital for understanding consumer lag, resetting offsets, and troubleshooting consumption issues.
Common Subcommands and Arguments:
- --list: Lists all consumer groups in the cluster.
- --describe: Provides details about specific consumer groups, including lag.
- --bootstrap-server <host:port>: Specifies the Kafka broker to connect to.
- --group <group_id>: Specifies the consumer group ID.
- --reset-offsets: Resets offsets for a consumer group.
- --topic <topic_name>: Specifies the topic for offset reset.
- --to-earliest: Resets offsets to the earliest available message.
- --to-latest: Resets offsets to the latest available message.
- --dry-run: Shows the offsets that would be set without applying them.
- --execute: Executes the offset reset operation.
Examples:
List all consumer groups:
kafka-consumer-groups.sh --list --bootstrap-server kafka-broker-1:9092
Describe a consumer group my_group and show its lag:
kafka-consumer-groups.sh --describe --group my_group --bootstrap-server kafka-broker-1:9092
The output shows the topic, partition, current offset, log end offset, and lag.
Reset offsets for my_group on my_topic to the earliest available message:
kafka-consumer-groups.sh --group my_group --topic my_topic --reset-offsets --to-earliest --execute --bootstrap-server kafka-broker-1:9092
Use this command with caution, as it affects where consumers will start reading from. The group must have no active members when the reset runs.
5. kafka-log-dirs.sh
This tool helps to inspect the log directories on Kafka brokers. It can be useful for understanding disk usage and locating topic data.
Common Arguments:
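- --bootstrap-server <host:port>: Specifies the Kafka broker to connect to.
- --describe: Required; prints the log directory details as JSON.
- --broker-list <broker_id1,broker_id2,...>: Limits the query to specific broker IDs (all brokers by default).
- --topic-list <topic1,topic2,...>: Filters the output to show directories for specific topics.
Examples:
List log directories on all brokers:
kafka-log-dirs.sh --describe --bootstrap-server kafka-broker-1:9092
Show log directories for a specific topic:
kafka-log-dirs.sh --describe --bootstrap-server kafka-broker-1:9092 --topic-list my_topic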
6. kafka-leader-election.sh (formerly kafka-preferred-replica-election.sh)
This script triggers preferred leader elections. The preferred replica is the first broker in a partition's replica list. If a broker fails and a non-preferred replica becomes the leader, this tool can be used to move leadership back to the preferred replica once it is in sync again. The older kafka-preferred-replica-election.sh script was deprecated in Kafka 2.4 and removed in 3.0.
Common Arguments:
- --bootstrap-server <host:port>: Specifies the Kafka broker to connect to.
- --election-type <preferred|unclean>: Selects the type of election to run.
- --topic <topic_name> and --partition <partition_id>: Select a single partition; the two are used together.
- --all-topic-partitions: Runs the election for every eligible partition in the cluster.
- --path-to-json-file <file>: Reads the list of partitions from a JSON file.
Examples:
Elect the preferred leader for partition 0 of my_topic:
kafka-leader-election.sh --election-type preferred --topic my_topic --partition 0 --bootstrap-server kafka-broker-1:9092
Elect preferred leaders for all partitions in the cluster:
kafka-leader-election.sh --election-type preferred --all-topic-partitions --bootstrap-server kafka-broker-1:9092
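To target a specific set of partitions across several topics, both the legacy script and kafka-leader-election.sh accept --path-to-json-file with a {"partitions": [...]} document. The sketch below builds that file from topic:partition arguments; the function name election_json and the file name election.json are hypothetical, and topic names are assumed to contain no characters that need JSON escaping.

```shell
# Usage: election_json topic1:0 topic2:3 > election.json
# Emits the JSON document expected by --path-to-json-file.
election_json() {
  local first=1 tp
  printf '{"partitions":['
  for tp in "$@"; do
    [ "$first" = 1 ] || printf ','
    # ${tp%:*} is the topic name, ${tp##*:} is the partition number
    printf '{"topic":"%s","partition":%d}' "${tp%:*}" "${tp##*:}"
    first=0
  done
  printf ']}\n'
}
```

The resulting file can then be passed as --path-to-json-file election.json together with --bootstrap-server.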
Important Considerations and Best Practices
- --bootstrap-server is Key: Always ensure you specify the correct --bootstrap-server argument to connect to your Kafka cluster. This is usually a comma-separated list of host:port pairs for your brokers.
- Environment: These commands are typically found in the bin/ directory of your Kafka installation. You'll need to navigate to this directory or ensure Kafka's bin directory is in your system's PATH.
- Permissions: Ensure the user running these commands has the necessary network access to reach the Kafka brokers.
- Configuration: The console tools accept client configurations via --producer-property and --consumer-property, and reader or formatter settings via --property. This is useful for overriding default serializers/deserializers or setting other specific client configurations.
- Security: For secure Kafka clusters (e.g., with SSL/TLS or SASL authentication), you'll need to pass a client properties file to these tools: --command-config for the admin tools, and --producer.config or --consumer.config for the console tools.
- Topic Deletion: Remember that topic deletion is a sensitive operation. Brokers honor it only when delete.topic.enable=true is set in server.properties, which is the default in current releases.
A Safe Way to Use the CLI in Production
Use the CLI as an inspection tool first and a mutation tool second. --list, --describe, and short console reads are low-risk. --delete, --alter, partition increases, and offset resets change cluster behavior and should go through the same review path as application changes whenever possible.
A practical production session usually starts with a client config file:
cat client.properties
# security.protocol=SASL_SSL
# sasl.mechanism=SCRAM-SHA-512
# sasl.jaas.config=...
Then every command includes it:
kafka-topics.sh --bootstrap-server kafka-1:9093 --command-config client.properties --describe --topic orders
For console consumers, avoid accidentally joining a real application group. Use a temporary group id when you are inspecting data, use --max-messages so the command exits, and pass the client file with --consumer.config:
kafka-console-consumer.sh \
--bootstrap-server kafka-1:9093 \
--consumer.config client.properties \
--topic orders \
--group debug-orders-$(date +%s) \
--from-beginning \
--max-messages 5 \
--property print.key=true \
--property print.offset=true
That small habit prevents a debug command from stealing partitions from a live service. It also leaves a cleaner audit trail because the group name makes the intent obvious.
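The habit is easier to keep when it is wrapped in a function. This is a minimal sketch, assuming a bash-compatible shell; peek_topic, KAFKA_CONSOLE_CONSUMER, and CLIENT_CONFIG are hypothetical names, while the flags themselves are standard console consumer options.

```shell
# Hypothetical override so the function can target bin/kafka-console-consumer.sh.
KAFKA_CONSOLE_CONSUMER="${KAFKA_CONSOLE_CONSUMER:-kafka-console-consumer.sh}"

# Usage: peek_topic <bootstrap> <topic> [count]
# Reads a few records with a throwaway group id, then exits.
# If CLIENT_CONFIG is set, it is passed as --consumer.config.
peek_topic() {
  local bootstrap="$1" topic="$2" count="${3:-5}"
  local group="debug-${topic}-$(date +%s)"
  "$KAFKA_CONSOLE_CONSUMER" \
    --bootstrap-server "$bootstrap" \
    ${CLIENT_CONFIG:+--consumer.config "$CLIENT_CONFIG"} \
    --topic "$topic" \
    --group "$group" \
    --from-beginning \
    --max-messages "$count" \
    --timeout-ms 10000 \
    --property print.key=true \
    --property print.offset=true
}
```

The --timeout-ms value of 10 seconds is an arbitrary choice; it makes the command exit on an empty topic instead of waiting indefinitely.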
The CLI is best when it is boring: one command to inspect, one command to confirm, and a clear record of any command that changes state.
Everyday Troubleshooting Recipes
If a producer says it is writing successfully but the consumer team sees nothing, start with the topic:
kafka-topics.sh --bootstrap-server kafka-1:9092 --describe --topic orders
Confirm the topic name, partition count, leader availability, and in-sync replicas. A typo in a topic name can look exactly like a broken pipeline when auto topic creation is enabled in a development cluster. In production, a topic with offline partitions or shrinking ISR points to a broker or replication problem before it points to application code.
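On topics with many partitions, scanning the describe output by eye is error-prone. The sketch below filters it down to partitions without a leader or with fewer in-sync replicas than assigned replicas. It assumes the tab-separated "Label: value" layout that kafka-topics.sh --describe prints in recent releases, which may differ in your version; find_unhealthy_partitions is a hypothetical name.

```shell
# Usage: kafka-topics.sh --bootstrap-server <host:port> --describe --topic <topic> \
#          | find_unhealthy_partitions
# Assumes partition lines made of tab-separated "Label: value" fields.
find_unhealthy_partitions() {
  awk -F '\t' '
    /\tPartition: / {
      topic = part = leader = replicas = isr = ""
      for (i = 1; i <= NF; i++) {
        pos = index($i, ": ")
        if (pos == 0) continue
        key = substr($i, 1, pos - 1)
        val = substr($i, pos + 2)
        if (key == "Topic") topic = val
        else if (key == "Partition") part = val
        else if (key == "Leader") leader = val
        else if (key == "Replicas") replicas = val
        else if (key == "Isr") isr = val
      }
      nr = split(replicas, r, ",")   # number of assigned replicas
      ni = split(isr, s, ",")        # number of in-sync replicas
      if (leader == "none" || ni < nr)
        printf "%s-%s leader=%s isr=%d/%d\n", topic, part, leader, ni, nr
    }
  '
}
```

kafka-topics.sh also offers --under-replicated-partitions and --unavailable-partitions alongside --describe, which cover the same ground without parsing when those flags are enough.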
Next, read a small sample with a temporary group:
kafka-console-consumer.sh \
--bootstrap-server kafka-1:9092 \
--topic orders \
--group debug-orders-$(date +%s) \
--max-messages 10 \
--property print.key=true \
--property print.timestamp=true \
--property print.offset=true
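If records appear there, Kafka has the data and the issue is probably the real consumer group, its offsets, its subscriptions, or its processing logic. If no records appear, remember that a new group starts at the latest offset by default, so only records produced after the command starts will show up; add --from-beginning to include older records. If the topic still looks empty, check the producer topic, serializers, authentication, and whether the producer is writing to a different cluster.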
For lag questions, go straight to the group:
kafka-consumer-groups.sh --bootstrap-server kafka-1:9092 --describe --group orders-writer
Do not stop at total lag. Compare partitions. A single partition with large lag means a different problem than every partition with moderate lag. Single-partition lag often means key skew or one bad consumer assignment. Even lag usually means the whole application is slower than the input rate.
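The per-partition comparison can be sketched as a small filter over the describe output. It assumes the usual header row containing TOPIC, PARTITION, and LAG columns and looks the columns up by name, since the exact column set varies between Kafka versions; lag_by_partition is a hypothetical name.

```shell
# Usage: kafka-consumer-groups.sh --bootstrap-server <host:port> --describe --group <group> \
#          | lag_by_partition
# Prints "<lag> <topic>-<partition>", largest lag first.
lag_by_partition() {
  awk '
    $1 == "GROUP" {
      # Remember the position of each column from the header row
      for (i = 1; i <= NF; i++) col[$i] = i
      next
    }
    # Skip rows whose lag is "-" (no committed offset) or otherwise non-numeric
    ("LAG" in col) && $(col["LAG"]) ~ /^[0-9]+$/ {
      print $(col["LAG"]), $(col["TOPIC"]) "-" $(col["PARTITION"])
    }
  ' | sort -rn
}
```

One partition far ahead of the rest at the top of this list is the skew case; a flat list of similar values is the "whole application is slow" case.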
For "what changed?" questions, inspect topic configuration:
kafka-configs.sh \
--bootstrap-server kafka-1:9092 \
--entity-type topics \
--entity-name orders \
--describe
This is where you catch retention changes, cleanup policy surprises, compression overrides, and message size settings that differ from the service's assumptions.
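Answering "what changed?" is easier with something to compare against. The sketch below saves each describe output and diffs it against the previous snapshot. The function name snapshot_topic_config, the KAFKA_CONFIGS override, and the snapshot file layout are all hypothetical.

```shell
# Hypothetical override so the function can target bin/kafka-configs.sh.
KAFKA_CONFIGS="${KAFKA_CONFIGS:-kafka-configs.sh}"

# Usage: snapshot_topic_config <bootstrap> <topic> [snapshot-dir]
# Prints a unified diff and returns 1 when the config differs from the last snapshot.
snapshot_topic_config() {
  local bootstrap="$1" topic="$2" dir="${3:-.}"
  local old="$dir/$topic.config" new="$dir/$topic.config.new"
  local changed=0
  "$KAFKA_CONFIGS" \
    --bootstrap-server "$bootstrap" \
    --entity-type topics \
    --entity-name "$topic" \
    --describe > "$new" || return 2
  if [ -f "$old" ]; then
    # diff exits non-zero when the two snapshots differ
    diff -u "$old" "$new" || changed=1
  fi
  mv "$new" "$old"
  return "$changed"
}
```

Run on a schedule or before a deploy, this leaves a record of when a retention or cleanup policy override first appeared.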
CLI output is not a replacement for monitoring, but it is excellent for reducing uncertainty. In a real incident, a few command outputs pasted into the ticket can save everyone from debating whether the topic exists, whether records are present, and whether the group is actually moving.
Commands Worth Treating Carefully
Some Kafka CLI commands look harmless because they are short. They are not harmless.
kafka-topics.sh --alter --partitions only increases partition count; it does not shrink it later if you regret the change. More partitions can help consumer parallelism, but they can also change key distribution for new records and break assumptions in systems that expected all events for a key range to land in a smaller set of partitions.
kafka-consumer-groups.sh --reset-offsets --execute changes where a group will read next. Use --dry-run first, stop the affected consumers, and record the old offsets. Resetting to earliest can replay data into systems that are not idempotent. Resetting to latest can skip data that the business still expects to process.
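The dry-run-first routine can be sketched as a wrapper that records the current offsets and only executes when explicitly confirmed. The names reset_to_earliest, KAFKA_CONSUMER_GROUPS, and CONFIRM, and the backup file name, are hypothetical; stopping the consumers remains a manual step.

```shell
# Hypothetical override so the function can target bin/kafka-consumer-groups.sh.
KAFKA_CONSUMER_GROUPS="${KAFKA_CONSUMER_GROUPS:-kafka-consumer-groups.sh}"

# Usage: reset_to_earliest <bootstrap> <group> <topic>
# Saves the current offsets, then previews the reset with --dry-run.
# Only when CONFIRM=yes is set does it run with --execute.
reset_to_earliest() {
  local bootstrap="$1" group="$2" topic="$3" mode="--dry-run"
  # Record the old offsets before touching anything
  "$KAFKA_CONSUMER_GROUPS" --bootstrap-server "$bootstrap" \
    --describe --group "$group" > "offsets-before-${group}.txt" || return 1
  if [ "${CONFIRM:-no}" = "yes" ]; then
    mode="--execute"
  fi
  "$KAFKA_CONSUMER_GROUPS" --bootstrap-server "$bootstrap" \
    --group "$group" --topic "$topic" \
    --reset-offsets --to-earliest "$mode"
}
```

A first call prints the planned offsets; repeating it as CONFIRM=yes reset_to_earliest kafka-1:9092 orders-writer orders applies them.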
kafka-topics.sh --delete depends on cluster configuration and policy, but when deletion is allowed, it should be treated like dropping a database table. Check the cluster, check the topic, and check whether another environment uses the same naming convention. A production topic called orders-test is still production if real services depend on it.
For repeatable operations, put the command in a runbook or script with explicit variables for cluster, topic, group, and command config. The CLI is great for investigation, but production mutation should be boring, reviewed, and easy to audit.