Understanding Kafka Command Line Tools: CLI Reference Guide

Unlock the power of Apache Kafka with this comprehensive command-line interface (CLI) reference guide. Learn essential Kafka commands for managing topics (`kafka-topics.sh`), sending messages (`kafka-console-producer.sh`), consuming data (`kafka-console-consumer.sh`), and inspecting consumer groups (`kafka-consumer-groups.sh`). This guide details practical use cases, arguments, and best practices for effective Kafka administration and troubleshooting.

Kafka's command-line tools are the quickest way to answer basic operational questions: does this topic exist, which broker leads this partition, what is inside the topic, why is this consumer group behind, and can this client authenticate with the cluster? You do not need them for every task, and most production changes should still go through automation, but during a broken deploy or a late-night data question, the CLI is often the shortest path to facts.

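The examples below assume the scripts are on your PATH. In many installations they live under Kafka's bin/ directory, so the same command may be run as bin/kafka-topics.sh. For secured clusters, the admin tools (kafka-topics.sh, kafka-consumer-groups.sh, kafka-configs.sh, and similar) also need --command-config client.properties, where that file contains SASL, SSL, and other client settings. In most releases the console producer and console consumer take the same file through --producer.config and --consumer.config instead.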

Core Kafka CLI Tools

Kafka distributions typically include a bin/ directory containing various scripts and executables. We will focus on the most frequently used ones for managing Kafka effectively.

1. kafka-topics.sh

This is arguably the most frequently used command-line tool. It allows you to create, list, describe, delete, alter, and manage Kafka topics. Topic management is fundamental to organizing data streams within Kafka.

Common Subcommands and Arguments:

  • --create: Creates a new topic.
  • --list: Lists all topics in the cluster.
  • --describe: Provides detailed information about specific topics.
  • --delete: Deletes one or more topics.
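  • --alter: Changes an existing topic. In current releases this mainly means increasing the partition count; topic-level configs are changed with kafka-configs.sh.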
  • --topic <topic_name>: Specifies the topic name.
  • --partitions <num_partitions>: Sets the number of partitions for a topic (used with --create).
  • --replication-factor <factor>: Sets the replication factor for a topic (used with --create).
  • --bootstrap-server <host:port>: Specifies the Kafka broker to connect to.

Examples:

  • Create a topic named my_topic with 3 partitions and a replication factor of 2:

    kafka-topics.sh --create --topic my_topic --partitions 3 --replication-factor 2 --bootstrap-server kafka-broker-1:9092,kafka-broker-2:9092
    
  • List all topics in the cluster:

    kafka-topics.sh --list --bootstrap-server kafka-broker-1:9092
    
  • Describe a topic named my_topic:

    kafka-topics.sh --describe --topic my_topic --bootstrap-server kafka-broker-1:9092
    

    This will show details like partitions, leader, replicas, and ISRs (In-Sync Replicas).

  • Delete a topic named old_topic:

    kafka-topics.sh --delete --topic old_topic --bootstrap-server kafka-broker-1:9092
    

    Note: Topic deletion requires delete.topic.enable=true on the brokers. This has been the default since Kafka 1.0, so it only matters if your cluster disables it explicitly.
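For scripted setups, a creation step that can safely be re-run is often useful. The following is a minimal sketch, assuming your Kafka version accepts --if-not-exists together with --bootstrap-server. The function name ensure_topic and the KAFKA_TOPICS variable are hypothetical helpers introduced here, not part of Kafka.

```shell
#!/bin/sh
# KAFKA_TOPICS is a hypothetical override hook; set it to "echo" to preview
# the command instead of running it.
KAFKA_TOPICS="${KAFKA_TOPICS:-kafka-topics.sh}"

# ensure_topic TOPIC PARTITIONS REPLICATION_FACTOR BOOTSTRAP
# Creates the topic only when it does not exist yet.
ensure_topic() (
  topic="$1"; partitions="$2"; rf="$3"; bootstrap="$4"
  "$KAFKA_TOPICS" --bootstrap-server "$bootstrap" \
    --create --if-not-exists \
    --topic "$topic" \
    --partitions "$partitions" \
    --replication-factor "$rf"
)

# Example call (needs a reachable cluster):
#   ensure_topic my_topic 3 2 kafka-broker-1:9092
```

Setting KAFKA_TOPICS=echo before the call prints the argument list, which is a cheap way to review the command before it touches a cluster.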

2. kafka-console-producer.sh

This tool allows you to send messages to a Kafka topic from standard input. It's invaluable for testing producers, injecting sample data, or manually publishing messages.

Common Arguments:

  • --topic <topic_name>: Specifies the target topic.
  • --bootstrap-server <host:port>: Specifies the Kafka broker to connect to.
  • --property <key>=<value>: Sets properties for the input reader, such as parse.key and key.separator.
  • --producer-property <key>=<value>: Sets producer client configurations (e.g., acks, compression.type).

Examples:

  • Send messages to my_topic:

    kafka-console-producer.sh --topic my_topic --bootstrap-server kafka-broker-1:9092
    

    After running this, you can type messages line by line; each line is sent as one message. Press Ctrl+D (end of input) or Ctrl+C to exit.

  • Send messages with keys, using ':' as the key/value separator:

    kafka-console-producer.sh --topic my_topic --bootstrap-server kafka-broker-1:9092 --property parse.key=true --property key.separator=':'
    

    Now you can type key:value pairs. The part before the separator becomes the record key and the rest becomes the value.
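The console producer reads standard input, so keyed records can also be piped in rather than typed. The sketch below assumes a '|' separator, chosen only because it does not appear in the sample keys. The helper names (sample_records, produce_keyed), the KAFKA_PRODUCER variable, and the sample records are all made up for illustration.

```shell
#!/bin/sh
# KAFKA_PRODUCER is a hypothetical override hook for the producer script.
KAFKA_PRODUCER="${KAFKA_PRODUCER:-kafka-console-producer.sh}"

# Print three illustrative key|value records, one per line (made-up data).
sample_records() {
  printf '%s\n' \
    'user-1|{"event":"login"}' \
    'user-2|{"event":"login"}' \
    'user-1|{"event":"logout"}'
}

# produce_keyed TOPIC BOOTSTRAP
# Sends each stdin line as one record, splitting key and value on '|'.
produce_keyed() (
  topic="$1"; bootstrap="$2"
  "$KAFKA_PRODUCER" --bootstrap-server "$bootstrap" --topic "$topic" \
    --property parse.key=true \
    --property 'key.separator=|'
)

# Example call (needs a reachable cluster):
#   sample_records | produce_keyed my_topic kafka-broker-1:9092
```

Records with the same key (user-1 here) go to the same partition under the default partitioner, which makes this a quick way to test ordering per key.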

3. kafka-console-consumer.sh

This tool subscribes to one or more Kafka topics and prints the messages it receives to standard output. It's essential for testing consumers, inspecting data in topics, and debugging producer/consumer applications.

Common Arguments:

  • --topic <topic_name>: Specifies the topic(s) to consume from.
  • --bootstrap-server <host:port>: Specifies the Kafka broker to connect to.
  • --group <group_id>: Specifies the consumer group ID. This is important for managing offsets and allowing multiple consumers to share the consumption load.
  • --from-beginning: Starts from the earliest available message when the group has no committed offset yet.
  • --offset <offset>: Starts consuming from a specific offset (a number, earliest, or latest); requires --partition.
  • --partition <partition_id>: Consumes from a specific partition.
  • --property <key>=<value>: Sets message formatter properties that control the output (e.g., print.key, key.separator).

Examples:

  • Consume new messages from my_topic (without --from-beginning, the consumer starts at the latest offset):

    kafka-console-consumer.sh --topic my_topic --bootstrap-server kafka-broker-1:9092
    
  • Consume messages from the beginning of my_topic for consumer group my_group:

    kafka-console-consumer.sh --topic my_topic --group my_group --from-beginning --bootstrap-server kafka-broker-1:9092
    
  • Consume messages with keys, offsets, and headers printed (print.offset and print.headers need a reasonably recent Kafka release):

    kafka-console-consumer.sh --topic my_topic --bootstrap-server kafka-broker-1:9092 --property print.key=true --property key.separator="\t" --property print.offset=true --property print.headers=true
    

4. kafka-consumer-groups.sh

This tool is used to manage and inspect consumer groups. It's vital for understanding consumer lag, resetting offsets, and troubleshooting consumption issues.

Common Subcommands and Arguments:

  • --list: Lists all consumer groups in the cluster.
  • --describe: Provides details about specific consumer groups, including lag.
  • --bootstrap-server <host:port>: Specifies the Kafka broker to connect to.
  • --group <group_id>: Specifies the consumer group ID.
  • --reset-offsets: Resets offsets for a consumer group.
  • --topic <topic_name>: Specifies the topic for offset reset.
  • --to-earliest: Resets offsets to the earliest available message.
  • --to-latest: Resets offsets to the latest available message.
  • --execute: Executes the offset reset operation. Use --dry-run instead to print the planned offsets without changing anything.

Examples:

  • List all consumer groups:

    kafka-consumer-groups.sh --list --bootstrap-server kafka-broker-1:9092
    
  • Describe a consumer group my_group and show its lag:

    kafka-consumer-groups.sh --describe --group my_group --bootstrap-server kafka-broker-1:9092
    

    The output will show the topic, partition, current offset, log end offset, and the lag.

  • Reset offsets for my_group on my_topic to the earliest available message:

    kafka-consumer-groups.sh --group my_group --topic my_topic --reset-offsets --to-earliest --execute --bootstrap-server kafka-broker-1:9092
    

    Use this command with caution, as it affects where consumers will start reading from. The group must have no active members for the reset to succeed, so stop its consumers first.

5. kafka-log-dirs.sh

This tool helps to inspect the log directories on Kafka brokers. It can be useful for understanding disk usage and locating topic data.

Common Arguments:

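  • --describe: Required; prints log directory information as JSON.
  • --bootstrap-server <host:port>: Specifies the Kafka broker to connect to.
  • --broker-list <broker_id1,broker_id2,...>: Limits the query to the given broker IDs (all brokers if omitted).
  • --topic-list <topic1,topic2,...>: Filters the output to the given topics.

Examples:

  • Describe log directories on all brokers:

    kafka-log-dirs.sh --describe --bootstrap-server kafka-broker-1:9092
    
  • Show log directories for a specific topic:

    kafka-log-dirs.sh --describe --bootstrap-server kafka-broker-1:9092 --topic-list my_topic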
    

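6. kafka-leader-election.sh (formerly kafka-preferred-replica-election.sh)

This script triggers leader elections for partitions. The preferred replica is the first broker in a partition's replica list, and Kafka tries to keep leadership there so that load stays balanced across brokers. If a broker fails and another replica becomes the leader, this tool can move leadership back once the preferred replica is in sync again. The older kafka-preferred-replica-election.sh was deprecated in Kafka 2.4 and removed in 3.0; kafka-leader-election.sh replaces it.

Common Arguments:

  • --election-type <preferred|unclean>: Selects the type of election; use preferred to restore preferred leaders.
  • --all-topic-partitions: Runs the election for every eligible partition in the cluster.
  • --topic <topic_name> with --partition <partition_id>: Runs the election for a single partition.
  • --path-to-json-file <file>: Reads the list of partitions from a JSON file.
  • --bootstrap-server <host:port>: Specifies the Kafka broker to connect to.

Examples:

  • Elect the preferred leader for partition 0 of my_topic:

    kafka-leader-election.sh --election-type preferred --topic my_topic --partition 0 --bootstrap-server kafka-broker-1:9092
    
  • Elect preferred leaders for all partitions in the cluster:

    kafka-leader-election.sh --election-type preferred --all-topic-partitions --bootstrap-server kafka-broker-1:9092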
    

Important Considerations and Best Practices

  • --bootstrap-server is Key: Always ensure you specify the correct --bootstrap-server argument to connect to your Kafka cluster. This is usually a comma-separated list of host:port for your brokers.
  • Environment: These commands are typically found in the bin/ directory of your Kafka installation. You'll need to navigate to this directory or ensure Kafka's bin directory is in your system's PATH.
  • Permissions: Ensure the user running these commands has network access to the Kafka brokers and, on clusters with authorization enabled, the ACLs needed for the operation.
  • Configuration: The console tools accept client configuration overrides via --producer-property and --consumer-property, while --property controls how the console producer parses input and how the console consumer formats output.
  • Security: For secure Kafka clusters (e.g., with SSL/TLS or SASL authentication), point the tools at a client properties file: --command-config for the admin tools, and in most releases --producer.config or --consumer.config for the console producer and consumer.
  • Topic Deletion: Topic deletion is a sensitive operation. It is controlled by delete.topic.enable in the broker's server.properties, which defaults to true since Kafka 1.0, so do not rely on the setting alone to protect a topic.

A Safe Way to Use the CLI in Production

Use the CLI as an inspection tool first and a mutation tool second. --list, --describe, and short console reads are low-risk. --delete, --alter, partition increases, and offset resets change cluster behavior and should go through the same review path as application changes whenever possible.

A practical production session usually starts with a client config file:

cat client.properties
# security.protocol=SASL_SSL
# sasl.mechanism=SCRAM-SHA-512
# sasl.jaas.config=...

Then each admin command passes it with --command-config:

kafka-topics.sh --bootstrap-server kafka-1:9093 --command-config client.properties --describe --topic orders

For console consumers, avoid accidentally joining a real application group. Use a temporary group id when you are inspecting data, and use --max-messages so the command exits:

kafka-console-consumer.sh \
  --bootstrap-server kafka-1:9093 \
  --consumer.config client.properties \
  --topic orders \
  --group debug-orders-$(date +%s) \
  --from-beginning \
  --max-messages 5 \
  --property print.key=true \
  --property print.offset=true

That small habit prevents a debug command from stealing partitions from a live service. It also leaves a cleaner audit trail because the group name makes the intent obvious.
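Temporary groups leave committed offsets behind until they expire, so it can be worth cleaning them up. The following is a rough sketch: debug_group_name, delete_debug_group, and the KAFKA_CONSUMER_GROUPS variable are hypothetical names, and the debug- prefix check is only a convention assumed here. On a secured cluster the delete command would also need --command-config.

```shell
#!/bin/sh
# KAFKA_CONSUMER_GROUPS is a hypothetical override hook for the script name.
KAFKA_CONSUMER_GROUPS="${KAFKA_CONSUMER_GROUPS:-kafka-consumer-groups.sh}"

# debug_group_name TOPIC
# Builds a throwaway group id of the form debug-<topic>-<unix time>.
debug_group_name() {
  printf 'debug-%s-%s\n' "$1" "$(date +%s)"
}

# delete_debug_group GROUP BOOTSTRAP
# Deletes a group, but only if its name starts with "debug-".
delete_debug_group() (
  group="$1"; bootstrap="$2"
  case "$group" in
    debug-*) ;;
    *) echo "refusing to delete non-debug group: $group" >&2; exit 1 ;;
  esac
  "$KAFKA_CONSUMER_GROUPS" --bootstrap-server "$bootstrap" --delete --group "$group"
)

# Example flow (needs a reachable cluster):
#   group=$(debug_group_name orders)
#   ... run kafka-console-consumer.sh with --group "$group" ...
#   delete_debug_group "$group" kafka-1:9093
```

The prefix guard is deliberate: a cleanup helper that can delete any group name is one typo away from removing a live service's offsets.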

The CLI is best when it is boring: one command to inspect, one command to confirm, and a clear record of any command that changes state.

Everyday Troubleshooting Recipes

If a producer says it is writing successfully but the consumer team sees nothing, start with the topic:

kafka-topics.sh --bootstrap-server kafka-1:9092 --describe --topic orders

Confirm the topic name, partition count, leader availability, and in-sync replicas. A typo in a topic name can look exactly like a broken pipeline when auto topic creation is enabled in a development cluster. In production, a topic with offline partitions or shrinking ISR points to a broker or replication problem before it points to application code.
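For the replication part of that check, kafka-topics.sh can filter the describe output itself. The sketch below assumes the --under-replicated-partitions and --unavailable-partitions filters are available in your version. topic_health and the KAFKA_TOPICS variable are hypothetical names used only for this example.

```shell
#!/bin/sh
# KAFKA_TOPICS is a hypothetical override hook for the script name.
KAFKA_TOPICS="${KAFKA_TOPICS:-kafka-topics.sh}"

# topic_health TOPIC BOOTSTRAP
# Prints only the partitions that look unhealthy; an empty section means
# nothing was reported for that filter.
topic_health() (
  topic="$1"; bootstrap="$2"
  echo "== under-replicated partitions for $topic =="
  "$KAFKA_TOPICS" --bootstrap-server "$bootstrap" --describe \
    --topic "$topic" --under-replicated-partitions
  echo "== unavailable partitions for $topic =="
  "$KAFKA_TOPICS" --bootstrap-server "$bootstrap" --describe \
    --topic "$topic" --unavailable-partitions
)

# Example call (needs a reachable cluster):
#   topic_health orders kafka-1:9092
```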

Next, read a small sample with a temporary group:

kafka-console-consumer.sh \
  --bootstrap-server kafka-1:9092 \
  --topic orders \
  --group debug-orders-$(date +%s) \
  --max-messages 10 \
  --property print.key=true \
  --property print.timestamp=true \
  --property print.offset=true

If records appear there, Kafka has the data and the issue is probably the real consumer group, its offsets, its subscriptions, or its processing logic. If no records appear, check the producer topic, serializers, authentication, and whether the producer is writing to a different cluster.

For lag questions, go straight to the group:

kafka-consumer-groups.sh --bootstrap-server kafka-1:9092 --describe --group orders-writer

Do not stop at total lag. Compare partitions. A single partition with large lag means a different problem than every partition with moderate lag. Single-partition lag often means key skew or one bad consumer assignment. Even lag usually means the whole application is slower than the input rate.
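Comparing partitions by eye gets tedious on wide topics, so a small filter can help. The following is a sketch, not a supported tool: it assumes the describe output has a header row containing PARTITION and LAG columns, which may differ between Kafka versions. The function names and the sample rows are invented for illustration.

```shell
#!/bin/sh
# lag_by_partition: reads `kafka-consumer-groups.sh --describe` output on
# stdin and prints per-partition lag plus a one-line summary.
lag_by_partition() {
  awk '
    # Until the header row is seen, look for the PARTITION and LAG columns.
    !found {
      p = 0; l = 0
      for (i = 1; i <= NF; i++) {
        if ($i == "PARTITION") p = i
        if ($i == "LAG") l = i
      }
      if (p && l) found = 1
      next
    }
    # Data rows: keep only rows with a numeric lag (skips "-" and headers).
    $l ~ /^[0-9]+$/ {
      lag = $l + 0
      total += lag
      if (lag > max) { max = lag; maxp = $p }
      printf "partition=%s lag=%d\n", $p, lag
    }
    END { printf "total=%d max=%d max_partition=%s\n", total, max, maxp }
  '
}

# Made-up sample in the assumed column layout, for trying the filter offline.
sample_describe_output() {
  cat <<'EOF'

GROUP          TOPIC   PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG   CONSUMER-ID  HOST       CLIENT-ID
orders-writer  orders  0          1000            1010            10    c-1          /10.0.0.1  app
orders-writer  orders  1          500             5500            5000  c-2          /10.0.0.2  app
orders-writer  orders  2          900             905             5     c-3          /10.0.0.3  app
EOF
}

# Example call (needs a reachable cluster):
#   kafka-consumer-groups.sh --bootstrap-server kafka-1:9092 \
#     --describe --group orders-writer | lag_by_partition
```

With the made-up sample, the summary line shows that nearly all of the lag sits on partition 1, which is the single-partition pattern described above.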

For "what changed?" questions, inspect topic configuration:

kafka-configs.sh \
  --bootstrap-server kafka-1:9092 \
  --entity-type topics \
  --entity-name orders \
  --describe

This is where you catch retention changes, cleanup policy surprises, compression overrides, and message size settings that differ from the service's assumptions.

CLI output is not a replacement for monitoring, but it is excellent for reducing uncertainty. In a real incident, a few command outputs pasted into the ticket can save everyone from debating whether the topic exists, whether records are present, and whether the group is actually moving.

Commands Worth Treating Carefully

Some Kafka CLI commands look harmless because they are short. They are not harmless.

kafka-topics.sh --alter --partitions only increases partition count; it does not shrink it later if you regret the change. More partitions can help consumer parallelism, but they can also change key distribution for new records and break assumptions in systems that expected all events for a key range to land in a smaller set of partitions.

kafka-consumer-groups.sh --reset-offsets --execute changes where a group will read next. Use --dry-run first, stop the affected consumers, and record the old offsets. Resetting to earliest can replay data into systems that are not idempotent. Resetting to latest can skip data that the business still expects to process.
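Those precautions can be folded into a small wrapper. This is a sketch under assumptions: reset_to_earliest, SNAPSHOT_DIR, and KAFKA_CONSUMER_GROUPS are hypothetical names, the snapshot is simply the saved --describe output, and stopping the consumers is still left to you.

```shell
#!/bin/sh
# Hypothetical override hooks: the script name and where snapshots are stored.
KAFKA_CONSUMER_GROUPS="${KAFKA_CONSUMER_GROUPS:-kafka-consumer-groups.sh}"
SNAPSHOT_DIR="${SNAPSHOT_DIR:-.}"

# reset_to_earliest GROUP TOPIC BOOTSTRAP [execute]
# Saves the current offsets, then runs the reset as a dry run unless the
# fourth argument is exactly "execute".
reset_to_earliest() (
  group="$1"; topic="$2"; bootstrap="$3"; mode="${4:-dry-run}"

  # Record the old offsets so the previous position can be restored by hand.
  "$KAFKA_CONSUMER_GROUPS" --bootstrap-server "$bootstrap" \
    --describe --group "$group" \
    > "$SNAPSHOT_DIR/offsets-before-$group.txt" || exit 1

  case "$mode" in
    execute) flag="--execute" ;;
    *)       flag="--dry-run" ;;
  esac

  "$KAFKA_CONSUMER_GROUPS" --bootstrap-server "$bootstrap" \
    --group "$group" --topic "$topic" \
    --reset-offsets --to-earliest "$flag"
)

# Example flow (needs a reachable cluster and a stopped consumer group):
#   reset_to_earliest orders-writer orders kafka-1:9092          # preview
#   reset_to_earliest orders-writer orders kafka-1:9092 execute  # apply
```

Making the dry run the default means the destructive path has to be requested explicitly, which matches the review-first approach this section recommends.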

kafka-topics.sh --delete depends on cluster configuration and policy, but when deletion is allowed, it should be treated like dropping a database table. Check the cluster, check the topic, and check whether another environment uses the same naming convention. A production topic called orders-test is still production if real services depend on it.

For repeatable operations, put the command in a runbook or script with explicit variables for cluster, topic, group, and command config. The CLI is great for investigation, but production mutation should be boring, reviewed, and easy to audit.
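One lightweight way to get that audit trail is to route commands through a logging wrapper. The sketch below is illustrative only: run_logged and the AUDIT_LOG variable are hypothetical, and the log is a plain local file rather than a real audit system.

```shell
#!/bin/sh
# AUDIT_LOG is a hypothetical setting: where executed commands are recorded.
AUDIT_LOG="${AUDIT_LOG:-kafka-cli-audit.log}"

# run_logged COMMAND [ARGS...]
# Appends a UTC timestamp and the full command line to the audit log,
# then runs the command unchanged.
run_logged() {
  printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" >> "$AUDIT_LOG"
  "$@"
}

# Example with explicit variables (needs a reachable cluster):
#   BOOTSTRAP=kafka-1:9093
#   COMMAND_CONFIG=client.properties
#   TOPIC=orders
#   run_logged kafka-topics.sh --bootstrap-server "$BOOTSTRAP" \
#     --command-config "$COMMAND_CONFIG" --describe --topic "$TOPIC"
```

Because the wrapper logs before it runs the command, a failed or interrupted command still leaves a record of what was attempted.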