Kafka

Introduction

Apache Kafka is an open-source distributed streaming platform originally developed at LinkedIn. It exposes several APIs, including Producer, Consumer, Connect and Streams. Together, these provide a high-throughput, low-latency platform for handling real-time data, which is why Kafka is used at companies such as Uber, Zalando and Airbnb.

Quite often, we want to deploy a fully fledged Kafka cluster in Kubernetes because we run a collection of microservices and need a resilient message broker at the center. We also want to spread the Kafka instances across nodes to minimize the impact of a node failure.

In this tutorial, we walk through an example Kafka deployment on the Platform9 Free Tier Kubernetes platform, backed by DigitalOcean droplets. Let’s get started.

Prerequisites

A working Kubernetes cluster. You can create one quickly for free using Platform9 Managed Kubernetes (PMK): sign up for a free PMK account and create your Kubernetes cluster with it. You can also follow this guide on any other Kubernetes cluster you may have.
A kubectl installation, with the Kubernetes cluster from the step above configured as the current context.
The Helm 3 package manager client installed on your local machine. Follow the Helm CLI instructions to install the helm client on your machine.

Step 1 - Create Persistent Volumes

Before we install Helm and the Kafka chart, we need to create some persistent volumes for storing Kafka's replicated message files.

This step is what enables persistence in our cluster. Without it, topics and messages would be lost whenever a broker pod is shut down or rescheduled, because they would be kept only in the pod's ephemeral storage.

In our example, we use Persistent Volumes (PVs) backed by the local file system. We need one PV for each Kafka instance, so if we plan to deploy three instances, we need three PVs.

First, create and apply the Kafka namespace and the PV specs:

Kafka Namespace

namespace.yml

    ---
    apiVersion: v1
    kind: Namespace
    metadata:
      name: kafka

Then apply it:

    $ kubectl apply -f namespace.yml
    namespace/kafka created

PV Specs

pv.yml

    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: kafka-pv-volume
      labels:
        type: local
    spec:
      storageClassName: manual
      capacity:
        storage: 10Gi
      accessModes:
        - ReadWriteOnce
      hostPath:
        path: "/mnt/data"
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: kafka-pv-volume-2
      labels:
        type: local
    spec:
      storageClassName: manual
      capacity:
        storage: 10Gi
      accessModes:
        - ReadWriteOnce
      hostPath:
        path: "/mnt/data"
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: kafka-pv-volume-3
      labels:
        type: local
    spec:
      storageClassName: manual
      capacity:
        storage: 10Gi
      accessModes:
        - ReadWriteOnce
      hostPath:
        path: "/mnt/data"
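The three specs above differ only in their names, so they can also be generated instead of copied by hand. The following is a minimal sketch under that assumption. The helper name `gen_kafka_pvs` and the numbered naming scheme (`kafka-pv-volume-1`, `kafka-pv-volume-2`, and so on) are illustrative and differ slightly from the names used in pv.yml; the storage class, size, and path are taken from it unchanged.

```shell
#!/bin/sh
# Hypothetical helper: print one hostPath PersistentVolume spec per Kafka instance.
# Storage class, capacity, and path mirror pv.yml; the volume names are an assumption.
gen_kafka_pvs() {
  count="$1"
  i=1
  while [ "$i" -le "$count" ]; do
    cat <<EOF
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: kafka-pv-volume-$i
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"
EOF
    i=$((i + 1))
  done
}

# Example: preview three specs, then pipe them to kubectl once they look right.
#   gen_kafka_pvs 3
#   gen_kafka_pvs 3 | kubectl apply -f -
```

Generating the specs keeps the number of PVs in step with the number of Kafka instances if that number changes later.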

Then apply the specs with kubectl:

    $ kubectl apply -f pv.yml

If you are using the Kubernetes UI, you should now see the three PVs listed and waiting to be claimed.
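The same check can be done from the command line. As a rough sketch, `kubectl get pv` with a jsonpath template prints each volume's name and phase, and a small filter lists any volume that is not in the expected phase. The filter name `pvs_not_in_phase` is hypothetical. Unclaimed volumes normally report the phase Available, and claimed ones report Bound.

```shell
#!/bin/sh
# Hypothetical filter: read "name<TAB>phase" lines on stdin and print the
# names of all volumes whose phase differs from the expected one.
pvs_not_in_phase() {
  expected="$1"
  awk -F '\t' -v want="$expected" '$2 != want { print $1 }'
}

# Example (requires access to the cluster):
#   kubectl get pv \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}' \
#     | pvs_not_in_phase Available
```

An empty result means every PV is in the expected phase.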

Installing Helm

    $ curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3
    $ chmod 700 get_helm.sh
    $ ./get_helm.sh

Deploying the Helm Chart

In the past, deploying Kafka on Kubernetes was quite an exercise: you had to deploy a working ZooKeeper cluster, role bindings and persistent volume claims, and apply the correct configuration.

Fortunately, with the Kafka Incubator Chart, the whole process is mostly automated (with a few quirks here and there).

First, add the Helm chart repository:

    $ helm repo add incubator https://charts.helm.sh/incubator

Render the chart with a dry run and save the output, including the computed values, to a file:

	Note: Starting with Helm v3, the release name is a mandatory argument of the command, so the `--name` flag is no longer valid.

    $ helm install kafka-demo \
        --namespace kafka incubator/kafka \
        -f values.yml \
        --debug --dry-run > chart_values.yaml


	Carefully inspect the configuration values, particularly the persistence settings and the number of Kafka replicas the StatefulSet will deploy.
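The commands in this section pass `-f values.yml`, but the file itself is not shown. As a hedged sketch, a values file consistent with the three 10Gi PVs of storage class `manual` created earlier might look like the one written below. The key names (`replicas`, `persistence.enabled`, `persistence.size`, `persistence.storageClass`) are assumptions about the incubator/kafka chart and should be checked against the output of `helm show values incubator/kafka` before use.

```shell
#!/bin/sh
# Write an illustrative values.yml. The key names are assumptions about the
# incubator/kafka chart; verify them with: helm show values incubator/kafka
cat > values.yml <<'EOF'
# One Kafka broker per PersistentVolume created earlier.
replicas: 3
persistence:
  enabled: true
  # Must not exceed the capacity of the PVs in pv.yml.
  size: 10Gi
  # Must match storageClassName in pv.yml so the claims can bind to those PVs.
  storageClass: manual
EOF
```

If the storage class or size requested by the chart does not match the PVs, the Persistent Volume Claims stay pending and the Kafka pods are not scheduled.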

Then install the chart:

    $ helm install kafka-demo \
        --namespace kafka incubator/kafka \
        -f values.yml \
        --debug

Check the status of the deployment:

    $ helm status kafka-demo --namespace kafka
    LAST DEPLOYED: Sun Apr 19 14:05:15 2020
    NAMESPACE: kafka
    STATUS: DEPLOYED

    RESOURCES:
    ==> v1/ConfigMap
    NAME                  DATA  AGE
    kafka-demo-zookeeper  3     5m29s

    ==> v1/Pod(related)
    NAME                    READY  STATUS   RESTARTS  AGE
    kafka-demo-zookeeper-0  1/1    Running  0         5m28s
    kafka-demo-zookeeper-1  1/1    Running  0         4m50s
    kafka-demo-zookeeper-2  1/1    Running  0         4m12s
    kafka-demo-zookeeper-0  1/1    Running  0         5m28s
    kafka-demo-zookeeper-1  1/1    Running  0         4m50s
    kafka-demo-zookeeper-2  1/1    Running  0         4m12s

    ==> v1/Service
    NAME                           TYPE       CLUSTER-IP     EXTERNAL-IP  PORT(S)                     AGE
    kafka-demo                     ClusterIP  10.21.255.214  <none>       9092/TCP                    5m29s
    kafka-demo-headless            ClusterIP  None           <none>       9092/TCP                    5m29s
    kafka-demo-zookeeper           ClusterIP  10.21.13.232   <none>       2181/TCP                    5m29s
    kafka-demo-zookeeper-headless  ClusterIP  None           <none>       2181/TCP,3888/TCP,2888/TCP  5m29s

    ==> v1/StatefulSet
    NAME                  READY  AGE
    kafka-demo            3/3    5m28s
    kafka-demo-zookeeper  3/3    5m28s

    ==> v1beta1/PodDisruptionBudget
    NAME                  MIN AVAILABLE  MAX UNAVAILABLE  ALLOWED DISRUPTIONS  AGE
    kafka-demo-zookeeper  N/A            1                1                    5m29s

During this phase, you may want to open the Kubernetes UI and inspect the dashboard for any issues. Once everything is up, the pods should be running and the Persistent Volume Claims bound, with both shown in green.
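For a command-line alternative to watching the dashboard, the sketch below waits for the broker pods to become ready and then lists the claims. It assumes the pods follow the usual StatefulSet naming scheme (`kafka-demo-0`, `kafka-demo-1`, ...) implied by the `kafka-demo` StatefulSet in the status output; the helper name `wait_for_kafka` and the 300 second timeout are illustrative choices.

```shell
#!/bin/sh
# Hypothetical helper: wait until every Kafka broker pod is Ready, then list
# the PersistentVolumeClaims so their STATUS column can be checked for "Bound".
# Pod names are assumed to be kafka-demo-<ordinal>.
wait_for_kafka() {
  replicas="${1:-3}"
  pods=""
  i=0
  while [ "$i" -lt "$replicas" ]; do
    pods="$pods pod/kafka-demo-$i"
    i=$((i + 1))
  done
  # $pods is intentionally unquoted so each pod name becomes its own argument.
  kubectl -n kafka wait --for=condition=Ready $pods --timeout=300s &&
    kubectl -n kafka get pvc
}

# Example (requires access to the cluster):
#   wait_for_kafka 3
```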

Now we can test the Kafka cluster.

Testing the Kafka Cluster

We are going to deploy a test client that will execute scripts against the Kafka cluster.

Create and apply the following Pod spec:

testclient.yml

    apiVersion: v1
    kind: Pod
    metadata:
      name: testclient
      namespace: kafka
    spec:
      containers:
      - name: kafka
        image: solsson/kafka:0.11.0.0
        command:
        - sh
        - -c
        - "exec tail -f /dev/null"

Then, apply:

    $ kubectl apply -f testclient.yml

Then, using the testclient, we create the first topic, which we are going to use to post messages:

    $ kubectl -n kafka exec -ti testclient -- ./bin/kafka-topics.sh \
        --zookeeper kafka-demo-zookeeper:2181 \
        --topic messages \
        --create --partitions 1 --replication-factor 1

Here we need to use the correct hostname for the ZooKeeper cluster and the topic configuration.

Next, verify that the topic exists:

    $ kubectl -n kafka exec -ti testclient -- ./bin/kafka-topics.sh \
        --zookeeper kafka-demo-zookeeper:2181 \
        --list

Now, we can create one consumer and one producer instance so that we can send and consume messages.

First, create one or two listeners, each in its own shell:

    $ kubectl -n kafka exec -ti testclient -- ./bin/kafka-console-consumer.sh \
        --bootstrap-server kafka-demo:9092 \
        --topic messages \
        --from-beginning

Then create the producer session and type some messages. You will be able to see them propagate to the consumer sessions:

    $ kubectl -n kafka exec -ti testclient -- ./bin/kafka-console-producer.sh \
        --broker-list kafka-demo:9092 \
        --topic messages
    >Hi
    >How are you?
    >Hope you're well

Switching to each consumer shell, you will see:

    Hi
    How are you?
    Hope you're well
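The interactive sessions above can also be condensed into a scripted round trip, which is handy for repeating the check. The following is a sketch only: it assumes the `testclient` pod, the `kafka-demo:9092` service, and the `messages` topic from this tutorial, and the helper name `smoke_test` and the 10 second consumer timeout are illustrative.

```shell
#!/bin/sh
# Hypothetical helper: publish one message, then read the topic from the
# beginning and succeed only if that exact message comes back.
smoke_test() {
  msg="${1:-hello-from-smoke-test}"

  # The console producer reads messages from stdin, one per line.
  printf '%s\n' "$msg" |
    kubectl -n kafka exec -i testclient -- ./bin/kafka-console-producer.sh \
      --broker-list kafka-demo:9092 \
      --topic messages || return 1

  # --timeout-ms makes the consumer exit once no new message arrives in time.
  kubectl -n kafka exec testclient -- ./bin/kafka-console-consumer.sh \
    --bootstrap-server kafka-demo:9092 \
    --topic messages \
    --from-beginning --timeout-ms 10000 2>/dev/null |
    grep -q -x -F "$msg"
}

# Example (requires access to the cluster):
#   smoke_test "ping" && echo "round trip OK"
```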

Destroying the Helm Chart

To clean up our resources, we uninstall the Helm release and delete the PVs we created earlier. With Helm 3, the `--purge` flag is no longer valid and the release namespace must be given:

    $ helm delete kafka-demo --namespace kafka
    $ kubectl delete -f pv.yml -n kafka
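Uninstalling a release does not remove the Persistent Volume Claims that a StatefulSet created from its volume claim templates, and the test client pod and namespace were created outside the chart. A fuller cleanup might therefore look like the sketch below. The helper name `cleanup_kafka_demo` is hypothetical, and deleting all claims in the namespace assumes nothing else is using it.

```shell
#!/bin/sh
# Hypothetical helper: remove everything this tutorial created, in an order
# that releases the claims before deleting the volumes they are bound to.
cleanup_kafka_demo() {
  helm uninstall kafka-demo --namespace kafka
  kubectl -n kafka delete pod testclient --ignore-not-found
  # Claims from the StatefulSet's volumeClaimTemplates survive the uninstall.
  kubectl -n kafka delete pvc --all
  kubectl delete -f pv.yml
  kubectl delete -f namespace.yml
}

# Example (requires access to the cluster; this is destructive):
#   cleanup_kafka_demo
```

Files already written under the hostPath directory on the nodes are not removed by deleting the PV objects and have to be cleaned up separately if needed.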

Last updated on Jan 24, 2022
