Apache Kafka for Beginners

YouTube Tutorial

Apache Kafka is a distributed publish-subscribe messaging system.

Producer -> Broker -> Consumer


Producer

  • write msg to partition (in Topic)


Consumer

  • read msg from partition (in Topic)
  • 2 ways to read: wait for new msg only, or read from the beginning

Kafka Broker

  • receive msg from producer
  • store msg
  • give ability for consumers to read msg
  • 1 broker can have many topics


Topic

  • has a specific name
  • default retention (how long msg are kept): 7 days
  • queue structure (msg)
  • can be in many brokers at the same time
  • the part of a topic (in a broker) is called a partition
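
The 7-day default above is usually written in configuration as hours or milliseconds. A minimal sketch of the arithmetic, assuming the value is wanted in milliseconds as in Kafka's topic-level `retention.ms` setting; the helper name `days_to_ms` is made up for these notes:

```shell
#!/bin/sh
# Hypothetical helper: convert a retention period in days to milliseconds.
days_to_ms() {
  echo $(( $1 * 24 * 60 * 60 * 1000 ))
}

days_to_ms 7   # prints 604800000 (7 days = 168 hours)
```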


Partition

  • 1 partition has 2+ nodes (2 types): leader & follower
  • leader: handles partition read/write operations
  • if the leader fails, one of the followers becomes the leader
  • has many nodes to back up msg
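
The failover idea above can be sketched as a toy function. This is only an illustration, not how Kafka implements it: the real election is done by the cluster controller, and the function name `elect_leader` and the "first live replica wins" rule are assumptions made for this sketch.

```shell
#!/bin/sh
# Toy model of leader failover (hypothetical, simplified).
# $1: space-separated replica broker ids, current leader first
# $2: space-separated ids of brokers that are down
elect_leader() {
  replicas=$1
  failed=$2
  for id in $replicas; do
    case " $failed " in
      *" $id "*) ;;                # this replica is down, skip it
      *) echo "$id"; return 0 ;;   # first live replica becomes leader
    esac
  done
  return 1                          # no live replica left
}

elect_leader "1 2 3" ""    # prints 1 (leader is healthy)
elect_leader "1 2 3" "1"   # prints 2 (a follower takes over)
```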

Message (structure)

  • timestamp
  • offset number (unique within a partition)
  • key (optional)
  • value (sequence of bytes)
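
To make the message fields concrete, here is a small sketch that uses a plain text file as a stand-in for one partition. The tab-separated line layout and the helper name `append_record` are assumptions for illustration only; Kafka's real on-disk format is binary and different. The point is that each partition hands out its own offsets, starting from 0.

```shell
#!/bin/sh
# Hypothetical helper: append a record to a file standing in for one partition.
# Each line holds: offset, timestamp, key, value (tab-separated).
append_record() {
  log=$1
  key=$2
  value=$3
  # The next offset equals the number of records already stored.
  offset=$(wc -l < "$log" | tr -d ' ')
  printf '%s\t%s\t%s\t%s\n' "$offset" "$(date +%s)" "$key" "$value" >> "$log"
  echo "$offset"
}

partition=$(mktemp)                        # empty file = empty partition
append_record "$partition" "" "hello"      # prints 0 (the key is optional)
append_record "$partition" "user-1" "hi"   # prints 1
```
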

Broker <-> Zookeeper


Zookeeper

  • maintains list of active brokers
  • manages config of topics and partitions
  • elects controller

Zookeeper Cluster (ensemble)

  • many zookeepers
  • recommended to have odd number of zookeepers in an ensemble
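
The odd-number recommendation follows from majority voting: the ensemble keeps working only while more than half of its servers are up. A minimal sketch of that arithmetic (the function names are made up for these notes):

```shell
#!/bin/sh
# Smallest majority of an ensemble of n servers.
quorum() { echo $(( $1 / 2 + 1 )); }
# Number of servers that can fail while a majority is still up.
tolerated_failures() { echo $(( ($1 - 1) / 2 )); }

for n in 3 4 5; do
  echo "servers=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

With 3 servers one failure is tolerated, and with 4 servers it is still only one, so the extra even-numbered server adds cost without adding fault tolerance.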


Setup Ubuntu on VirtualBox

  • open VirtualBox, click "New"
  • Set up the following:
    • Name: Ubuntu Kafka
    • Type: Linux
    • Version: Ubuntu (64 bit)
    • memory: 2048 MB (2 GB)
    • hard disk: create a virtual hard disk, VDI, Dynamically allocated

After saving, a folder is created for the VM

note. /Users/ZaGA/VirtualBox VMs/Ubuntu Kafka/

  • click "Settings", go to Storage
  • click Empty (Controller IDE)
    • Optical Drive: choose the downloaded ubuntu-xxx-desktop-amd64.iso file
  • in the VM (small window), click "Install Ubuntu", finish the settings (it will restart)

Setup Ubuntu on Droplet (Digital Ocean)

On your computer, connect to the Droplet

$ ssh root@<IP_ADDRESS>
# enter the password (sent by email)

Install Apache Kafka on Server

# install java
$ sudo apt install openjdk-11-jdk

# check java version
$ java --version

# create Downloads if it does not exist
$ mkdir -p Downloads

# download Apache Kafka
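# (the download URL is missing from these notes; put the Kafka .tgz link in its place)
$ curl -o Downloads/kafka.tgz <KAFKA_DOWNLOAD_URL>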

# extract file
$ mkdir kafka
$ cd kafka
$ tar -xvzf ~/Downloads/kafka.tgz --strip 1
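
A quick sanity check after extracting can save time later. The sketch below only assumes what these notes already use, namely that the extracted folder contains `bin/` and `config/`; the helper name `check_kafka_dir` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the directory looks like an extracted Kafka
# distribution, i.e. it contains the bin/ and config/ folders used below.
check_kafka_dir() {
  [ -d "$1/bin" ] && [ -d "$1/config" ]
}

if check_kafka_dir "$HOME/kafka"; then
  echo "kafka directory looks complete"
else
  echo "bin/ or config/ is missing; re-check the download and extract steps"
fi
```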

Zookeeper and Broker

A Kafka cluster can have multiple brokers

Minimal Kafka cluster = 1 Zookeeper + 1 Broker

Run Zookeeper

# default -> localhost:2181
$ bin/ config/

Run Server/Broker (keep Zookeeper running)

# default -> localhost:9092
$ bin/ config/


# create, delete, describe, or change a topic
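$ bin/ --create --bootstrap-server localhost:9092 --topic <TOPIC_NAME>

# list topics registered in Zookeeper
# (the --zookeeper option of the topics tool was removed in Kafka 3.0; newer versions take --bootstrap-server localhost:9092 here and in --describe)
$ bin/ --list --zookeeper localhost:2181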

# get a topic detail
$ bin/ --describe --zookeeper localhost:2181 --topic <TOPIC_NAME>

Producer and Consumer

1 Kafka cluster can have multiple Producers and Consumers

# start sending messages to a topic
$ bin/ --broker-list localhost:9092 --topic <TOPIC_NAME>

# start receiving messages from a topic
$ bin/ --bootstrap-server localhost:9092 --topic <TOPIC_NAME>

# start receiving (and also show the earlier messages)
$ bin/ --bootstrap-server localhost:9092 --topic <TOPIC_NAME> --from-beginning
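
The difference between the two consumer modes above can be pictured with a plain file standing in for a partition, where line N+1 holds the message with offset N. This is only an analogy, not Kafka code; the helper name `read_from_offset` is made up for these notes.

```shell
#!/bin/sh
# Hypothetical helper: print every message at or after the given offset,
# treating line N+1 of the file as the message with offset N.
read_from_offset() {
  tail -n "+$(( $2 + 1 ))" "$1"
}

partition=$(mktemp)
printf 'm0\nm1\n' > "$partition"           # two earlier messages

end=$(wc -l < "$partition" | tr -d ' ')    # offset the next message will get
printf 'm2\n' >> "$partition"              # a new message arrives

read_from_offset "$partition" 0            # like --from-beginning: m0, m1, m2
read_from_offset "$partition" "$end"       # default (new messages only): m2
```

A consumer started without `--from-beginning` behaves like the second call: it starts at the end of the partition and only sees what arrives afterwards.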


