Apache Kafka: A Quick Overview

Thiago Marsal Farias
3 min read · Apr 13, 2023

Apache Kafka is an open-source distributed streaming platform that allows you to build real-time data pipelines and streaming applications. Kafka is designed to handle high volumes of data in real time, making it an ideal choice for applications that require fast and reliable data processing.

It provides a distributed architecture for handling data streams and supports use cases such as real-time data processing, messaging, and log aggregation. Kafka is built around three capabilities: publishing (and subscribing to) streams of records, storing those records durably, and processing them as they arrive. Producers write records to Kafka topics, which are essentially named streams of records, and consumers subscribe to those topics and read the records in real time.
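To make the publish, store, and process idea concrete, here is a minimal in-memory sketch in Java. It is a toy stand-in for a single-partition topic, not the Kafka API: the class name `TopicLogSketch` and its methods are hypothetical, and the only point is that records are appended to a log and that each reader keeps its own offset.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy in-memory stand-in for a single-partition topic; this is NOT the Kafka API. */
class TopicLogSketch {
    private final List<String> log = new ArrayList<>();

    /** Publish: append a record to the end of the log and return its offset. */
    public long publish(String record) {
        log.add(record);
        return log.size() - 1;
    }

    /** Store/read: return a copy of every record at or after the given offset. */
    public List<String> readFrom(long offset) {
        int start = (int) Math.min(Math.max(offset, 0), log.size());
        return new ArrayList<>(log.subList(start, log.size()));
    }

    public static void main(String[] args) {
        TopicLogSketch topic = new TopicLogSketch();
        topic.publish("order-created");
        topic.publish("order-paid");

        // Process: a consumer reads from its own offset and then advances it.
        long consumerOffset = 0;
        List<String> batch = topic.readFrom(consumerOffset);
        consumerOffset += batch.size();
        for (String record : batch) {
            System.out.println("processed: " + record);
        }

        // Reading does not remove records, so a second reader
        // starting at offset 0 still sees the full history.
        System.out.println(topic.readFrom(0));
    }
}
```

Real Kafka adds partitioning, replication, and durable storage on disk, but the append-and-read-by-offset model is the same.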

This quick start guide will cover the basics of setting up and using Apache Kafka for your data streaming needs.

Prerequisites

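Before getting started with Apache Kafka, you will need a Java runtime installed, because both Kafka and the bundled ZooKeeper run on the JVM. Check the Kafka documentation for the Java versions supported by the release you download.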

Setting up Apache Kafka

  1. Download the Apache Kafka binary distribution and extract it to a directory of your choice.
  2. Navigate to the Kafka directory and, in a terminal window, start the ZooKeeper server with the “zookeeper-server-start” script (the first command listed for your operating system below).
  3. In a new terminal window, start the Kafka server with the “kafka-server-start” script (the second command listed for your operating system below).

Windows:

bin\windows\zookeeper-server-start.bat config\zookeeper.properties
bin\windows\kafka-server-start.bat config\server.properties

Linux:

bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties

This will start the Kafka server, which will listen on port 9092 by default.
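Before moving on, it can help to confirm that something is actually accepting connections on that port. The sketch below is one way to do that from Java using only the standard library; the class and method names are hypothetical, and it only checks TCP reachability, so it cannot tell whether the listener is really a Kafka broker.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class BrokerPortCheck {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host could not be resolved.
            return false;
        }
    }

    public static void main(String[] args) {
        // localhost:9092 is the default listener address used in this guide.
        boolean up = isListening("localhost", 9092, 2000);
        System.out.println(up
                ? "Something is listening on localhost:9092"
                : "Nothing is listening on localhost:9092 yet");
    }
}
```

If the check fails right after startup, give the broker a few seconds and look at the terminal output of the Kafka server for errors.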

Creating a Topic

Before sending data to Kafka, you need to create a topic. A topic is a named stream of records and is Kafka's primary unit for organizing and storing data.

To create a topic, run the following command. It creates a topic named “test-topic” with one partition and a replication factor of one.

Windows:

bin\windows\kafka-topics.bat --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic test-topic

Linux:

bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic test-topic

Sending and Receiving Messages

Once you have created a topic, you can send and receive messages using the Kafka command line tools.

Run the following command to send the message “Hello World” to the “test-topic” topic.

Windows:

echo "Hello World" | bin\windows\kafka-console-producer.bat --bootstrap-server localhost:9092 --topic test-topic

Linux:

echo "Hello World" | bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic test-topic

Run the following command to receive messages from the “test-topic” topic. It starts a console-based consumer that reads the topic's messages, beginning with the earliest one, and prints them to the console.

Windows:

bin\windows\kafka-console-consumer.bat --bootstrap-server localhost:9092 --topic test-topic --from-beginning

Linux:

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test-topic --from-beginning

Simple Java Producer and Consumer

The repository linked below contains a simple Java program that produces and consumes messages using Apache Kafka.

https://github.com/thiagomarsal/java-kafka
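As a rough idea of what such a program needs, the sketch below builds the client configuration a Java producer and consumer would typically start from. It is not taken from the linked repository: the class name, method names, and the group id “demo-group” are assumptions for illustration, and only the standard library is used so that it runs without the kafka-clients dependency. The property keys and the String (de)serializer class names are the standard Kafka client settings.

```java
import java.util.Properties;

class KafkaClientConfigSketch {
    /** Minimal settings for a producer that sends String keys and values. */
    static Properties producerProps(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    /** Minimal settings for a consumer that reads String keys and values. */
    static Properties consumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        // For a group with no committed offsets, start from the earliest record,
        // similar in spirit to --from-beginning on the console consumer.
        props.setProperty("auto.offset.reset", "earliest");
        return props;
    }

    public static void main(String[] args) {
        // With kafka-clients on the classpath, these objects would be passed to
        // new KafkaProducer<String, String>(...) and new KafkaConsumer<String, String>(...).
        System.out.println(producerProps("localhost:9092"));
        System.out.println(consumerProps("localhost:9092", "demo-group"));
    }
}
```

The producer would then send records to “test-topic”, and the consumer would subscribe to the same topic and poll for records in a loop.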

Conclusion

Apache Kafka is a powerful tool for building real-time data pipelines and streaming applications. With its distributed architecture and support for high volumes of data, it is an ideal choice for applications that require fast and reliable data processing.

This quick start guide covered the basics of setting up and using Apache Kafka: starting the servers, creating a topic, and sending and receiving messages. With this knowledge, you can start building your own data streaming applications with Apache Kafka.

I will cover key features and configurations for high-volume data handling in a dedicated article, since that topic deserves a more in-depth treatment.

References

https://kafka.apache.org/documentation/
