Understanding Kafka Command Line Tools: CLI Reference Guide

Unlock the power of Apache Kafka with this comprehensive command-line interface (CLI) reference guide. Learn essential Kafka commands for managing topics (`kafka-topics.sh`), sending messages (`kafka-console-producer.sh`), consuming data (`kafka-console-consumer.sh`), and inspecting consumer groups (`kafka-consumer-groups.sh`). This guide details practical use cases, arguments, and best practices for effective Kafka administration and troubleshooting.



Apache Kafka is a powerful distributed event streaming platform that enables high-throughput, fault-tolerant, and scalable real-time data pipelines. While Kafka can be managed and interacted with programmatically through its APIs, its command-line interface (CLI) tools offer a direct and efficient way to perform essential administrative tasks, manage topics, interact with consumers, and monitor cluster health. This guide provides a comprehensive reference to the most commonly used Kafka CLI tools, detailing their purpose, essential arguments, and practical use cases.

Understanding these tools is crucial for Kafka administrators, developers, and anyone involved in managing or troubleshooting Kafka clusters. They allow for quick inspection, manipulation, and diagnostics without the need to write custom scripts or applications for every simple operation.

Core Kafka CLI Tools

Kafka distributions typically include a bin/ directory containing various scripts and executables. We will focus on the most frequently used ones for managing Kafka effectively.
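Since all of these scripts live under the installation's bin/ directory, it is convenient to put that directory on your PATH once per shell session. A minimal sketch, assuming Kafka is installed at /opt/kafka (adjust the path for your environment):

```bash
# Assumed install location; change to match your environment.
export KAFKA_HOME=/opt/kafka
export PATH="$PATH:$KAFKA_HOME/bin"

# Every script below can now be invoked by name, e.g. kafka-topics.sh.
echo "$PATH" | grep -q '/opt/kafka/bin' && echo "Kafka bin on PATH"
```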

1. kafka-topics.sh

This is arguably the most frequently used command-line tool. It allows you to create, list, describe, delete, alter, and manage Kafka topics. Topic management is fundamental to organizing data streams within Kafka.

Common Subcommands and Arguments:

  • --create: Creates a new topic.
  • --list: Lists all topics in the cluster.
  • --describe: Provides detailed information about specific topics.
  • --delete: Deletes one or more topics.
  • --alter: Modifies the configuration of an existing topic.
  • --topic <topic_name>: Specifies the topic name.
  • --partitions <num_partitions>: Sets the number of partitions for a topic (used with --create).
  • --replication-factor <factor>: Sets the replication factor for a topic (used with --create).
  • --bootstrap-server <host:port>: Specifies the Kafka broker to connect to.

Examples:

  • Create a topic named my_topic with 3 partitions and a replication factor of 2:
    kafka-topics.sh --create --topic my_topic --partitions 3 --replication-factor 2 --bootstrap-server kafka-broker-1:9092,kafka-broker-2:9092

  • List all topics in the cluster:
    kafka-topics.sh --list --bootstrap-server kafka-broker-1:9092

  • Describe a topic named my_topic:
    kafka-topics.sh --describe --topic my_topic --bootstrap-server kafka-broker-1:9092
    This will show details like partitions, leader, replicas, and ISRs (In-Sync Replicas).

  • Delete a topic named old_topic:
    kafka-topics.sh --delete --topic old_topic --bootstrap-server kafka-broker-1:9092
    Note: Topic deletion needs to be enabled in Kafka broker configurations (delete.topic.enable=true).
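One rule of thumb when creating topics: the replication factor cannot exceed the number of brokers in the cluster, or the --create call fails. A tiny sanity-check sketch (the counts below are assumed values, not read from a live cluster):

```bash
brokers=2              # assumed number of brokers in the cluster
replication_factor=3   # the factor you are about to request

if [ "$replication_factor" -gt "$brokers" ]; then
  echo "invalid: replication factor $replication_factor exceeds broker count $brokers"
else
  echo "ok"
fi
```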

2. kafka-console-producer.sh

This tool allows you to send messages to a Kafka topic from standard input. It's invaluable for testing producers, injecting sample data, or manually publishing messages.

Common Arguments:

  • --topic <topic_name>: Specifies the target topic.
  • --bootstrap-server <host:port>: Specifies the Kafka broker to connect to.
  • --property <key>=<value>: Sets message-reader properties that control how input lines are interpreted (e.g., parse.key, key.separator).
  • --producer-property <key>=<value>: Passes configuration to the underlying producer client (e.g., acks, linger.ms).

Examples:

  • Send messages to my_topic:
    kafka-console-producer.sh --topic my_topic --bootstrap-server kafka-broker-1:9092
    After running this, you can type messages line by line. Press Ctrl+C to exit.

  • Send keyed messages by enabling key parsing with a separator:
    kafka-console-producer.sh --topic my_topic --bootstrap-server kafka-broker-1:9092 --property parse.key=true --property key.separator=':'
    Each input line of the form key:value is then split on the separator, and the record is sent with that key.
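Under the hood, the console producer splits each input line on the first occurrence of key.separator. The same split is easy to reproduce in plain shell, which is handy when pre-processing data before piping it in (the sample line below is made up):

```bash
line='user42:clicked_checkout'   # a sample key:value input line
key="${line%%:*}"                # everything before the first ':'
value="${line#*:}"               # everything after the first ':'
echo "key=$key value=$value"     # prints: key=user42 value=clicked_checkout
```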

3. kafka-console-consumer.sh

This tool subscribes to one or more Kafka topics and prints the messages it receives to standard output. It's essential for testing consumers, inspecting data in topics, and debugging producer/consumer applications.

Common Arguments:

  • --topic <topic_name>: Specifies the topic(s) to consume from.
  • --bootstrap-server <host:port>: Specifies the Kafka broker to connect to.
  • --group <group_id>: Specifies the consumer group ID. This is important for managing offsets and allowing multiple consumers to share the consumption load.
  • --from-beginning: Reads messages from the beginning of the topic's log.
  • --offset <offset>: Starts consuming from a specific offset (requires --partition; also accepts earliest or latest).
  • --partition <partition_id>: Consumes from a specific partition.
  • --property <key>=<value>: Allows setting consumer properties (e.g., value.deserializer).

Examples:

  • Consume all messages from my_topic:
    kafka-console-consumer.sh --topic my_topic --bootstrap-server kafka-broker-1:9092

  • Consume messages from the beginning of my_topic for consumer group my_group:
    kafka-console-consumer.sh --topic my_topic --group my_group --from-beginning --bootstrap-server kafka-broker-1:9092

  • Consume messages with offsets and keys printed:
    kafka-console-consumer.sh --topic my_topic --bootstrap-server kafka-broker-1:9092 --property print.key=true --property key.separator="\t" --property print.offset=true --property print.headers=true

4. kafka-consumer-groups.sh

This tool is used to manage and inspect consumer groups. It's vital for understanding consumer lag, reassigning partitions, and troubleshooting consumption issues.

Common Subcommands and Arguments:

  • --list: Lists all consumer groups in the cluster.
  • --describe: Provides details about specific consumer groups, including lag.
  • --bootstrap-server <host:port>: Specifies the Kafka broker to connect to.
  • --group <group_id>: Specifies the consumer group ID.
  • --reset-offsets: Resets offsets for a consumer group.
  • --topic <topic_name>: Specifies the topic for offset reset.
  • --to-earliest: Resets offsets to the earliest available message.
  • --to-latest: Resets offsets to the latest available message.
  • --execute: Applies the offset reset; without it, the tool only prints the planned new offsets (a dry run).

Examples:

  • List all consumer groups:
    kafka-consumer-groups.sh --list --bootstrap-server kafka-broker-1:9092

  • Describe a consumer group my_group and show its lag:
    kafka-consumer-groups.sh --describe --group my_group --bootstrap-server kafka-broker-1:9092
    The output will show the topic, partition, current offset, log end offset, and the lag.
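The lag column in that output is simply LOG-END-OFFSET minus CURRENT-OFFSET per partition. The sketch below recomputes it with awk from a mocked-up describe output (column layout simplified; the real output has additional columns such as CONSUMER-ID and HOST):

```bash
# Mocked-up excerpt of `kafka-consumer-groups.sh --describe` output.
cat > describe.txt <<'EOF'
TOPIC     PARTITION  CURRENT-OFFSET  LOG-END-OFFSET
my_topic  0          120             150
my_topic  1          200             200
EOF

# Lag per partition = log end offset - current offset.
awk 'NR > 1 { print $1, "partition", $2, "lag =", $4 - $3 }' describe.txt
# prints: my_topic partition 0 lag = 30
#         my_topic partition 1 lag = 0
```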

  • Reset offsets for my_group on my_topic to the earliest available message:
    kafka-consumer-groups.sh --group my_group --topic my_topic --reset-offsets --to-earliest --execute --bootstrap-server kafka-broker-1:9092
    Use this command with caution: it changes where consumers resume reading, and the consumer group must be inactive (no running members) for the reset to succeed.

5. kafka-log-dirs.sh

This tool helps to inspect the log directories on Kafka brokers. It can be useful for understanding disk usage and locating topic data.

Common Arguments:

  • --describe: Describes the log directories and the partitions stored in each (required).
  • --bootstrap-server <host:port>: Specifies the Kafka broker to connect to.
  • --topic-list <topics>: Comma-separated list of topics to restrict the output to.
  • --broker-list <broker_ids>: Comma-separated list of broker IDs to query.

Examples:

  • Describe log directories on the cluster:
    kafka-log-dirs.sh --describe --bootstrap-server kafka-broker-1:9092
    The output is JSON listing each log directory and the partitions (with sizes) it holds.

  • Show log directories for a specific topic:
    kafka-log-dirs.sh --describe --bootstrap-server kafka-broker-1:9092 --topic-list my_topic

6. kafka-preferred-replica-election.sh

This script triggers a preferred replica election. For each partition, the "preferred" replica is the first broker in the partition's replica assignment list, and Kafka normally places the partition leader there. After broker failures and restarts, leadership can end up on non-preferred replicas, skewing load across the cluster; this tool moves leadership back to the preferred replicas. Note that newer Kafka releases deprecate this script in favor of kafka-leader-election.sh.
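Because the preferred replica is simply the first broker ID in a partition's replica list, you can read it straight off the Replicas column of kafka-topics.sh --describe. A sketch with an assumed replica assignment:

```bash
replicas="2,0,1"              # Replicas column from --describe (assumed value)
preferred="${replicas%%,*}"   # first broker in the list is the preferred leader
echo "preferred leader: broker $preferred"   # prints: preferred leader: broker 2
```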

Common Arguments:

  • --bootstrap-server <host:port>: Specifies the Kafka broker to connect to.
  • --path-to-json-file <file>: A JSON file listing the specific partitions to run the election for; if omitted, the election runs for all partitions.

Examples:

  • Run a preferred replica election for all partitions:
    kafka-preferred-replica-election.sh --bootstrap-server kafka-broker-1:9092

  • Run the election only for partitions listed in a JSON file:
    kafka-preferred-replica-election.sh --bootstrap-server kafka-broker-1:9092 --path-to-json-file partitions.json
    The file has the form {"partitions": [{"topic": "my_topic", "partition": 0}]}.

Important Considerations and Best Practices

  • --bootstrap-server is Key: Always ensure you specify the correct --bootstrap-server argument to connect to your Kafka cluster. This is usually a comma-separated list of host:port for your brokers.
  • Environment: These commands are typically found in the bin/ directory of your Kafka installation. You'll need to navigate to this directory or ensure Kafka's bin directory is in your system's PATH.
  • Permissions: Ensure the user running these commands has the necessary network access to reach the Kafka brokers.
  • Configuration: Many CLI tools can accept Kafka client configurations via --property or --producer-property/--consumer-property arguments. This is useful for overriding default serializers/deserializers or setting other specific client configurations.
  • Security: For secure Kafka clusters (e.g., with SSL/TLS or SASL authentication), you'll need to pass additional security-related arguments (like --command-config pointing to a client properties file) to these tools.
  • Topic Deletion: Remember that topic deletion is a sensitive operation and must be explicitly enabled in the Kafka broker's server.properties file using delete.topic.enable=true.
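For the --command-config case mentioned above, the client properties file is an ordinary Kafka client configuration. A minimal sketch for a SASL_SSL cluster; every value here is a placeholder, and the exact keys depend on your cluster's security setup:

```bash
# Write a placeholder client config (credentials here are fake examples).
cat > client.properties <<'EOF'
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
  username="alice" password="changeme";
EOF

# Then pass it to any of the tools, e.g.:
# kafka-topics.sh --list --bootstrap-server kafka-broker-1:9093 --command-config client.properties
```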

Conclusion

Kafka's command-line tools provide a robust and accessible interface for managing and interacting with your Kafka cluster. Mastering tools like kafka-topics.sh, kafka-console-producer.sh, kafka-console-consumer.sh, and kafka-consumer-groups.sh is essential for efficient Kafka operations, troubleshooting, and development. By understanding their arguments and use cases, you can significantly streamline your workflow and gain deeper insights into your event streaming infrastructure.

Regularly referring to these commands will not only help you perform daily administrative tasks but also empower you to diagnose and resolve issues more effectively. As you become more familiar with Kafka, you can explore other utility scripts available in the bin/ directory for more advanced operations.