Kafka on Kubernetes (Strimzi)

Single-broker Kafka cluster for dev/test, deployed via the Strimzi operator on a single-node k3s cluster.


Architecture

kafka namespace
├── strimzi-cluster-operator   Strimzi operator — reconciles Kafka/KafkaUser/KafkaTopic CRs
├── kafka-dual-role-0          Kafka 4.1.0 broker (KRaft, controller + broker in one pod)
└── kafka-entity-operator      Topic Operator + User Operator (manages topics and users)

Key design decisions

| Choice | Rationale |
| --- | --- |
| KRaft mode | No ZooKeeper dependency: simpler, fewer pods |
| Single node pool dual-role | One pod acts as both controller and broker, suitable for dev/test |
| SASL_SSL + SCRAM-SHA-512 | Authenticated and encrypted connections; TLS managed by Strimzi CA |
| Simple authorization | ACLs enforced per user; kafka-admin is declared super user |
| local-path storage (10 Gi) | Default k3s storage class; PVC deleted on cluster teardown |

Listeners

| Name | Port | Protocol | Auth |
| --- | --- | --- | --- |
| tls | 9093 | SASL_SSL | SCRAM-SHA-512 |

Bootstrap address (cluster-internal):

kafka-kafka-bootstrap.kafka.svc.cluster.local:9093

Users

| KafkaUser | Secret | Role |
| --- | --- | --- |
| kafka-admin | kafka-admin | Super user: unrestricted access |
| kafka-client | kafka-client | Application user: Read/Write on all topics and groups |

Strimzi stores credentials in Kubernetes secrets. Each secret contains two keys:

  • password — the SCRAM password
  • sasl.jaas.config — the ready-to-use JAAS config string

TLS

Strimzi generates and manages its own internal CA. The cluster CA certificate is stored in:

secret/kafka-cluster-ca-cert  (key: ca.crt)

This certificate must be trusted by any Kafka client connecting over TLS.


Prerequisites

  • kubectl configured against the target cluster
  • helm v3
  • cert-manager is not required — Strimzi manages its own CA

Installation

./deploy.sh

The script performs the following steps:

  1. Creates the kafka namespace
  2. Installs the Strimzi operator via Helm (scoped to the kafka namespace)
  3. Waits for the Strimzi CRDs to be fully established
  4. Applies kafka.yaml (KafkaNodePool + Kafka CRs) and waits for Ready
  5. Applies kafka-users.yaml (KafkaUser CRs) and waits for Ready

Expected duration: 4 to 6 minutes on a single-node k3s cluster.
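
The five steps above can be sketched roughly as follows. deploy.sh itself is not reproduced in this README, so the commands, the timeouts, the run helper, and the DRY_RUN switch are all assumptions about one plausible implementation, not the script's actual contents.

```shell
#!/bin/sh
# Sketch only: every command below is an assumption about how deploy.sh
# might implement the five documented steps.

# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

deploy_sketch() {
  # 1. Create the namespace (fails if it already exists; the real script may guard this)
  run kubectl create namespace kafka || return 1

  # 2. Install the Strimzi operator, watching only the kafka namespace
  run helm repo add strimzi https://strimzi.io/charts/ || return 1
  run helm upgrade --install strimzi-kafka-operator strimzi/strimzi-kafka-operator \
    --namespace kafka --set watchAnyNamespace=false --wait --timeout 5m || return 1

  # 3. Wait until the API server has established the CRDs used below
  run kubectl wait --for=condition=Established --timeout=60s \
    crd/kafkas.kafka.strimzi.io \
    crd/kafkanodepools.kafka.strimzi.io \
    crd/kafkausers.kafka.strimzi.io || return 1

  # 4. Create the cluster and wait for it to become Ready
  run kubectl apply -f kafka.yaml || return 1
  run kubectl wait kafka/kafka --for=condition=Ready --timeout=10m -n kafka || return 1

  # 5. Create the users and wait for them to become Ready
  run kubectl apply -f kafka-users.yaml || return 1
  run kubectl wait kafkauser --all --for=condition=Ready --timeout=2m -n kafka
}

# Usage: preview the commands without touching the cluster.
#   DRY_RUN=1; export DRY_RUN; deploy_sketch
```

Waiting on the CRDs before applying kafka.yaml avoids a race in which kubectl rejects the Kafka resource because its kind is not yet registered.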

Verify the installation

kubectl get kafka,kafkanodepool,kafkauser,pods -n kafka

Expected output:

NAME                           READY   KAFKA VERSION   METADATA VERSION
kafka.kafka.strimzi.io/kafka   True    4.1.0           4.1-IV0

NAME                                       DESIRED REPLICAS   ROLES                     NODEIDS
kafkanodepool.kafka.strimzi.io/dual-role   1                  ["controller","broker"]   [0]

NAME                                      CLUSTER   AUTHENTICATION   AUTHORIZATION   READY
kafkauser.kafka.strimzi.io/kafka-admin    kafka     scram-sha-512                    True
kafkauser.kafka.strimzi.io/kafka-client   kafka     scram-sha-512    simple          True
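
For scripted checks (for example in CI), reading the READY column by eye can be replaced with kubectl wait. This is a minimal sketch; the function name wait_for_kafka_ready and the default timeout are assumptions, while the resource names come from the output above.

```shell
# Hypothetical helper: block until the Kafka cluster and both users report Ready.
# $1 = timeout per kubectl wait call (default: 5m, an arbitrary choice)
wait_for_kafka_ready() {
  timeout="${1:-5m}"
  kubectl wait kafka/kafka \
    --for=condition=Ready --timeout="$timeout" -n kafka || return 1
  kubectl wait kafkauser/kafka-admin kafkauser/kafka-client \
    --for=condition=Ready --timeout="$timeout" -n kafka
}

# Usage:
#   wait_for_kafka_ready 2m && echo "cluster and users are Ready"
```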

Client configuration

Get the CA certificate

kubectl -n kafka get secret kafka-cluster-ca-cert \
  -o jsonpath='{.data.ca\.crt}' | base64 -d > kafka-ca.crt

Get user credentials

# Admin password
kubectl -n kafka get secret kafka-admin \
  -o jsonpath='{.data.password}' | base64 -d

# Client password
kubectl -n kafka get secret kafka-client \
  -o jsonpath='{.data.password}' | base64 -d

# Ready-to-use JAAS config (includes username and password)
kubectl -n kafka get secret kafka-client \
  -o jsonpath='{.data.sasl\.jaas\.config}' | base64 -d

Sample client.properties

bootstrap.servers=kafka-kafka-bootstrap.kafka.svc.cluster.local:9093
security.protocol=SASL_SSL
ssl.truststore.type=PEM
ssl.truststore.location=/path/to/kafka-ca.crt
sasl.mechanism=SCRAM-SHA-512
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
  username="kafka-client" password="<password>";
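
To avoid pasting the password by hand, the file can be generated from the secret. The sketch below assumes a helper named render_client_properties (not part of this repository) and writes the JAAS entry on a single line; the property values are the ones shown in the sample above.

```shell
# Hypothetical helper: print a client.properties for one user.
# $1 = user name, $2 = SCRAM password, $3 = path to the cluster CA certificate (PEM)
render_client_properties() {
  cat <<EOF
bootstrap.servers=kafka-kafka-bootstrap.kafka.svc.cluster.local:9093
security.protocol=SASL_SSL
ssl.truststore.type=PEM
ssl.truststore.location=$3
sasl.mechanism=SCRAM-SHA-512
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="$1" password="$2";
EOF
}

# Usage (needs access to the cluster):
#   password="$(kubectl -n kafka get secret kafka-client \
#     -o jsonpath='{.data.password}' | base64 -d)"
#   render_client_properties kafka-client "$password" "$PWD/kafka-ca.crt" > client.properties
```

The generated file contains the password in clear text, so it should get the same file permissions as any other credential.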

Testing

A smoke test job is provided that validates the full produce/consume cycle over SASL_SSL.

Run the smoke test

kubectl apply -f kafka-smoke-test.yaml
kubectl logs -n kafka -l job-name=kafka-smoke-test -f

Re-run

kubectl delete job kafka-smoke-test -n kafka
kubectl apply -f kafka-smoke-test.yaml
kubectl logs -n kafka -l job-name=kafka-smoke-test -f

What the test does

  1. Topic creation — creates topic smoke-test with --if-not-exists (idempotent)
  2. Produce — sends 5 messages (message-1 to message-5) via kafka-console-producer
  3. Consume — reads 5 messages from the beginning via kafka-console-consumer and asserts count

Expected output:

==> [1/3] Creating topic 'smoke-test' (idempotent)
    Topic ready.
==> [2/3] Producing 5 messages
    5 messages produced.
==> [3/3] Consuming (from-beginning, max 5)
message-1
message-2
message-3
message-4
message-5
==> SMOKE TEST PASSED (5/5 messages)

The job is cleaned up automatically after 10 minutes (ttlSecondsAfterFinished: 600).
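
The count assertion in step 3 could look like the following sketch. The job's actual script is not shown in this README, so the helper name assert_message_count and the matching pattern are assumptions; only the message format and the pass line follow the expected output above.

```shell
# Hypothetical helper: read consumer output on stdin and fail unless it contains
# exactly $1 lines of the form message-<number>.
assert_message_count() {
  expected="$1"
  # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true"
  actual="$(grep -c '^message-[0-9][0-9]*$' || true)"
  if [ "$actual" -eq "$expected" ]; then
    echo "==> SMOKE TEST PASSED ($actual/$expected messages)"
  else
    echo "==> SMOKE TEST FAILED ($actual/$expected messages)" >&2
    return 1
  fi
}

# Usage:
#   printf 'message-1\nmessage-2\n' | assert_message_count 2
```

Returning a non-zero status on a mismatch is what lets Kubernetes mark the job as failed instead of completed.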


Adding a new user

Create a new KafkaUser manifest following the kafka-client pattern in kafka-users.yaml, then apply it:

kubectl apply -f kafka-users.yaml

Strimzi will create the corresponding secret in the kafka namespace within seconds.
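
As an illustration, a user restricted to a single topic might look like the manifest printed by the sketch below. kafka-users.yaml is not reproduced here, so the apiVersion, the ACL layout, the helper name render_kafka_user, and the choice of a consumer group named after the user are assumptions; copy the exact structure from the existing kafka-client entry.

```shell
# Hypothetical helper: print a KafkaUser manifest for a single-topic application user.
# $1 = user name (Strimzi names the generated secret after it), $2 = topic name
render_kafka_user() {
  cat <<EOF
apiVersion: kafka.strimzi.io/v1beta2   # assumption: match the apiVersion in kafka-users.yaml
kind: KafkaUser
metadata:
  name: $1
  namespace: kafka
  labels:
    strimzi.io/cluster: kafka          # binds the user to the cluster named "kafka"
spec:
  authentication:
    type: scram-sha-512
  authorization:
    type: simple
    acls:
      - resource:
          type: topic
          name: $2
          patternType: literal
        operations: [Read, Write, Describe]
      - resource:
          type: group
          name: $1
          patternType: literal
        operations: [Read]
EOF
}

# Usage: review the output, then append it to kafka-users.yaml or apply it directly.
#   render_kafka_user orders-service orders | kubectl apply -f -
```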


Upgrade

Upgrade the Kafka version

The installed Strimzi operator determines which Kafka versions are supported. Check the Kafka version the cluster currently runs and the operator version packaged in the Helm chart:

kubectl get kafka kafka -n kafka \
  -o jsonpath='{.status.kafkaVersion}'

helm show chart strimzi/strimzi-kafka-operator | grep appVersion

To upgrade Kafka from e.g. 4.1.0 to 4.1.1:

  1. Update spec.kafka.version in kafka.yaml
  2. Update spec.kafka.metadataVersion if the new version introduces a new metadata version
  3. Apply the change:
kubectl apply -f kafka.yaml
kubectl wait kafka/kafka --for=condition=Ready --timeout=10m -n kafka

Strimzi performs a rolling restart of the broker pod automatically.
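
Once the restart has finished, the reported version can be compared with the target. This is a minimal sketch that reuses the status query shown earlier in this section; the helper name verify_kafka_version is an assumption.

```shell
# Hypothetical helper: succeed only if the cluster reports the expected Kafka version.
# $1 = expected version, e.g. 4.1.1
verify_kafka_version() {
  expected="$1"
  actual="$(kubectl get kafka kafka -n kafka -o jsonpath='{.status.kafkaVersion}')"
  if [ "$actual" = "$expected" ]; then
    echo "Kafka is running version $actual"
  else
    echo "Expected Kafka $expected but the cluster reports '$actual'" >&2
    return 1
  fi
}

# Usage:
#   verify_kafka_version 4.1.1
```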

Upgrade the Strimzi operator

helm repo update strimzi
helm upgrade strimzi-kafka-operator strimzi/strimzi-kafka-operator \
  --namespace kafka \
  --set watchAnyNamespace=false \
  --wait --timeout 5m

Note: always upgrade the operator before upgrading the Kafka version. Check the Strimzi upgrade guide for supported upgrade paths.


Uninstallation

# Delete Kafka cluster and users (PVC is deleted because deleteClaim: true)
kubectl delete -f kafka-users.yaml
kubectl delete -f kafka.yaml

# Wait for pods to terminate
kubectl wait --for=delete pod -l strimzi.io/cluster=kafka -n kafka --timeout=120s

# Uninstall Strimzi operator
helm uninstall strimzi-kafka-operator -n kafka

# Delete namespace (removes all remaining namespaced resources, e.g. secrets and role bindings)
kubectl delete namespace kafka

The deleteClaim: true flag in kafka.yaml ensures the 10 Gi PVC is deleted together with the KafkaNodePool, leaving no orphaned volumes.
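
To confirm that nothing was left behind, the cluster-scoped resources can be listed after the teardown. The sketch below is an assumption-laden check rather than part of the repository: the helper name check_leftovers is hypothetical, and it relies on the fact that PersistentVolumes and CRDs are cluster-scoped, so deleting the namespace does not remove them by itself.

```shell
# Hypothetical helper: list cluster-scoped resources that can outlive the teardown.
check_leftovers() {
  echo "PersistentVolumes still claimed from the kafka namespace:"
  kubectl get pv \
    -o jsonpath='{range .items[?(@.spec.claimRef.namespace=="kafka")]}{.metadata.name}{"\n"}{end}'
  echo "Strimzi CRDs still installed:"
  # grep exits non-zero when nothing matches; an empty list is not an error here
  kubectl get crd -o name | grep 'strimzi\.io' || true
}

# Usage:
#   check_leftovers
```

If Strimzi CRDs are listed and no other Strimzi installation uses them, they can be removed with kubectl delete crd.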