# Kafka on Kubernetes (Strimzi)
Single-broker Kafka cluster for dev/test, deployed via the [Strimzi](https://strimzi.io/) operator on a single-node k3s cluster.
---
## Architecture
```
kafka namespace
├── strimzi-cluster-operator   Strimzi operator — reconciles Kafka/KafkaUser/KafkaTopic CRs
├── kafka-dual-role-0          Kafka 4.1.0 broker (KRaft, controller + broker in one pod)
└── kafka-entity-operator      Topic Operator + User Operator (manages topics and users)
```
### Key design decisions
| Choice | Rationale |
|---|---|
| **KRaft mode** | No ZooKeeper dependency — simpler, fewer pods |
| **Single node pool** `dual-role` | One pod acts as both controller and broker, suitable for dev/test |
| **SASL_SSL + SCRAM-SHA-512** | Authenticated and encrypted connections; TLS managed by Strimzi CA |
| **Simple authorization** | ACLs enforced per user; `kafka-admin` is declared a super user |
| **`local-path` storage (10 Gi)** | Default k3s storage class; PVC deleted on cluster teardown |
### Listeners
| Name | Port | Protocol | Auth |
|---|---|---|---|
| `tls` | 9093 | SASL_SSL | SCRAM-SHA-512 |
Bootstrap address (cluster-internal):
```
kafka-kafka-bootstrap.kafka.svc.cluster.local:9093
```
### Users
| KafkaUser | Secret | Role |
|---|---|---|
| `kafka-admin` | `kafka-admin` | Super user — unrestricted access |
| `kafka-client` | `kafka-client` | Application user — Read/Write all topics and groups |
Strimzi stores credentials in Kubernetes secrets. Each secret contains two keys:
- `password` — the SCRAM password
- `sasl.jaas.config` — the ready-to-use JAAS config string
### TLS
Strimzi generates and manages its own internal CA. The cluster CA certificate is stored in:
```
secret/kafka-cluster-ca-cert (key: ca.crt)
```
This certificate must be trusted by any Kafka client connecting over TLS.
---
## Prerequisites
- `kubectl` configured against the target cluster
- `helm` v3
- `cert-manager` is **not** required — Strimzi manages its own CA
---
## Installation
```bash
./deploy.sh
```
The script performs the following steps:
1. Creates the `kafka` namespace
2. Installs the Strimzi operator via Helm (scoped to the `kafka` namespace)
3. Waits for the Strimzi CRDs to be fully established
4. Applies `kafka.yaml` (KafkaNodePool + Kafka CRs) and waits for `Ready`
5. Applies `kafka-users.yaml` (KafkaUser CRs) and waits for `Ready`
Expected duration: **4 to 6 minutes** on a single-node k3s cluster.
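
The contents of `deploy.sh` are not reproduced in this README. As a rough sketch, the five steps listed above could look like the following. The Helm repository URL, the CRD name passed to `kubectl wait`, the timeouts, and the `DRY_RUN` switch are assumptions for illustration, not a copy of the actual script.

```bash
#!/usr/bin/env bash
# Sketch of the deployment steps; NOT the actual deploy.sh.
# Assumed: Helm repo URL, CRD name, timeouts, and the DRY_RUN switch.

NAMESPACE="kafka"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

deploy_kafka() {
  # 1. Create the namespace (fails if it already exists; a real script
  #    may want to tolerate that)
  run kubectl create namespace "$NAMESPACE" || return 1

  # 2. Install the Strimzi operator, scoped to the kafka namespace
  run helm repo add strimzi https://strimzi.io/charts/ || return 1
  run helm upgrade --install strimzi-kafka-operator strimzi/strimzi-kafka-operator \
    --namespace "$NAMESPACE" --set watchAnyNamespace=false --wait --timeout 5m || return 1

  # 3. Wait for the Kafka CRD to be established
  run kubectl wait --for=condition=Established crd/kafkas.kafka.strimzi.io \
    --timeout=120s || return 1

  # 4. Apply the cluster definition and wait for Ready
  run kubectl apply -f kafka.yaml || return 1
  run kubectl wait kafka/kafka --for=condition=Ready --timeout=10m -n "$NAMESPACE" || return 1

  # 5. Apply the users and wait for Ready
  run kubectl apply -f kafka-users.yaml || return 1
  run kubectl wait kafkauser --all --for=condition=Ready --timeout=120s -n "$NAMESPACE"
}
```

Calling `DRY_RUN=1 deploy_kafka` prints the commands without touching the cluster; calling `deploy_kafka` executes them.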
### Verify the installation
```bash
kubectl get kafka,kafkanodepool,kafkauser,pods -n kafka
```
Expected output:
```
NAME                           READY   KAFKA VERSION   METADATA VERSION
kafka.kafka.strimzi.io/kafka   True    4.1.0           4.1-IV0

NAME                                       DESIRED REPLICAS   ROLES                     NODEIDS
kafkanodepool.kafka.strimzi.io/dual-role   1                  ["controller","broker"]   [0]

NAME                                      CLUSTER   AUTHENTICATION   AUTHORIZATION   READY
kafkauser.kafka.strimzi.io/kafka-admin    kafka     scram-sha-512                    True
kafkauser.kafka.strimzi.io/kafka-client   kafka     scram-sha-512    simple          True
```
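
To script this check rather than read the table by eye, a small helper can consume `<resource> <ready-status>` pairs and fail when any of them is not `True`. This is a minimal sketch; `check_ready` is a hypothetical helper name, not something shipped with this repository.

```bash
# Reads "<resource> <ready-status>" lines on stdin.
# Returns 0 if every status is "True", 1 otherwise.
check_ready() {
  local failed=0 name status
  while read -r name status; do
    [ -z "$name" ] && continue          # skip blank lines
    if [ "$status" != "True" ]; then
      echo "NOT READY: $name (status: ${status:-unknown})" >&2
      failed=1
    fi
  done
  return "$failed"
}

# Example usage against the cluster (commented out so the block has no side effects):
# kubectl get kafka,kafkauser -n kafka \
#   -o jsonpath='{range .items[*]}{.kind}/{.metadata.name} {.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}' \
#   | check_ready && echo "all resources ready"
```

A resource without a `Ready` condition yet produces an empty status and is reported as not ready.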
---
## Client configuration
### Get the CA certificate
```bash
kubectl -n kafka get secret kafka-cluster-ca-cert \
-o jsonpath='{.data.ca\.crt}' | base64 -d > kafka-ca.crt
```
### Get user credentials
```bash
# Admin password
kubectl -n kafka get secret kafka-admin \
-o jsonpath='{.data.password}' | base64 -d
# Client password
kubectl -n kafka get secret kafka-client \
-o jsonpath='{.data.password}' | base64 -d
# Ready-to-use JAAS config (includes username and password)
kubectl -n kafka get secret kafka-client \
-o jsonpath='{.data.sasl\.jaas\.config}' | base64 -d
```
### Sample `client.properties`
```properties
bootstrap.servers=kafka-kafka-bootstrap.kafka.svc.cluster.local:9093
security.protocol=SASL_SSL
ssl.truststore.type=PEM
ssl.truststore.location=/path/to/kafka-ca.crt
sasl.mechanism=SCRAM-SHA-512
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
username="kafka-client" password="<password>";
```
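
Because the user secret already holds a ready-to-use `sasl.jaas.config` value, the file above can also be generated instead of edited by hand. The following is a minimal sketch; `render_client_properties` is a hypothetical helper, and the output path in the usage comment is an assumption.

```bash
# Prints a client.properties file to stdout.
#   $1: path to the cluster CA certificate (PEM)
#   $2: JAAS config string, as stored in the secret key sasl.jaas.config
render_client_properties() {
  local ca_path="$1" jaas_config="$2"
  cat <<EOF
bootstrap.servers=kafka-kafka-bootstrap.kafka.svc.cluster.local:9093
security.protocol=SASL_SSL
ssl.truststore.type=PEM
ssl.truststore.location=${ca_path}
sasl.mechanism=SCRAM-SHA-512
sasl.jaas.config=${jaas_config}
EOF
}

# Example usage (commented out so the block has no side effects):
# jaas="$(kubectl -n kafka get secret kafka-client \
#   -o jsonpath='{.data.sasl\.jaas\.config}' | base64 -d)"
# render_client_properties "$PWD/kafka-ca.crt" "$jaas" > client.properties
```

The generated file contains the password in clear text, so it is worth restricting its permissions (for example with `chmod 600 client.properties`).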
---
## Testing
A smoke test job is provided that validates the full produce/consume cycle over SASL_SSL.
### Run the smoke test
```bash
kubectl apply -f kafka-smoke-test.yaml
kubectl logs -n kafka -l job-name=kafka-smoke-test -f
```
### Re-run
```bash
kubectl delete job kafka-smoke-test -n kafka
kubectl apply -f kafka-smoke-test.yaml
kubectl logs -n kafka -l job-name=kafka-smoke-test -f
```
### What the test does
1. **Topic creation** — creates topic `smoke-test` with `--if-not-exists` (idempotent)
2. **Produce** — sends 5 messages (`message-1` to `message-5`) via `kafka-console-producer`
3. **Consume** — reads 5 messages from the beginning via `kafka-console-consumer` and asserts count
Expected output:
```
==> [1/3] Creating topic 'smoke-test' (idempotent)
Topic ready.
==> [2/3] Producing 5 messages
5 messages produced.
==> [3/3] Consuming (from-beginning, max 5)
message-1
message-2
message-3
message-4
message-5
==> SMOKE TEST PASSED (5/5 messages)
```
The job is cleaned up automatically after 10 minutes (`ttlSecondsAfterFinished: 600`).
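
The job manifest is not shown in this README. As an illustration only, the three steps could be scripted roughly as below. The location of the Kafka CLI scripts (`/opt/kafka/bin`), the path of the client configuration file, the partition and replication settings, and the consumer timeout are all assumptions; the exact option names of the console tools also vary between Kafka releases, so check them against the version in use.

```bash
BOOTSTRAP="kafka-kafka-bootstrap.kafka.svc.cluster.local:9093"
KAFKA_BIN="${KAFKA_BIN:-/opt/kafka/bin}"                  # assumed install path
CLIENT_CONFIG="${CLIENT_CONFIG:-/tmp/client.properties}"  # assumed config path
TOPIC="smoke-test"

# Emit message-1 ... message-N, one per line.
gen_messages() {
  local i=1
  while [ "$i" -le "$1" ]; do
    echo "message-$i"
    i=$((i + 1))
  done
}

# Succeed only if stdin holds exactly $1 non-empty lines.
assert_count() {
  local actual
  actual="$(grep -c . || true)"
  [ "$actual" -eq "$1" ]
}

smoke_test() {
  echo "==> [1/3] Creating topic '$TOPIC' (idempotent)"
  "$KAFKA_BIN/kafka-topics.sh" --bootstrap-server "$BOOTSTRAP" \
    --command-config "$CLIENT_CONFIG" --create --if-not-exists \
    --topic "$TOPIC" --partitions 1 --replication-factor 1 || return 1

  echo "==> [2/3] Producing 5 messages"
  gen_messages 5 | "$KAFKA_BIN/kafka-console-producer.sh" \
    --bootstrap-server "$BOOTSTRAP" --producer.config "$CLIENT_CONFIG" \
    --topic "$TOPIC" || return 1

  echo "==> [3/3] Consuming (from-beginning, max 5)"
  local received
  received="$("$KAFKA_BIN/kafka-console-consumer.sh" \
    --bootstrap-server "$BOOTSTRAP" --consumer.config "$CLIENT_CONFIG" \
    --topic "$TOPIC" --from-beginning --max-messages 5 --timeout-ms 30000)"
  printf '%s\n' "$received"

  if printf '%s\n' "$received" | assert_count 5; then
    echo "==> SMOKE TEST PASSED (5/5 messages)"
  else
    echo "==> SMOKE TEST FAILED" >&2
    return 1
  fi
}
```

Only `smoke_test` talks to the broker; `gen_messages` and `assert_count` are pure helpers and can be tried locally, for example with `gen_messages 5 | assert_count 5`.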
---
## Adding a new user
Create a new `KafkaUser` manifest following the `kafka-client` pattern in `kafka-users.yaml`, then apply it:
```bash
kubectl apply -f kafka-users.yaml
```
Strimzi will create the corresponding secret in the `kafka` namespace within seconds.
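
Since `kafka-users.yaml` is not reproduced here, the following is a hedged sketch of what a new entry might look like, wrapped in a hypothetical `render_kafka_user` helper. The `apiVersion`, the prefix-scoped ACLs, and the example names are assumptions; match them to the existing `kafka-client` definition before applying.

```bash
# Prints a KafkaUser manifest to stdout.
#   $1: user name
#   $2: topic name prefix the user may read and write
render_kafka_user() {
  local user="$1" topic_prefix="$2"
  cat <<EOF
---
apiVersion: kafka.strimzi.io/v1beta2   # assumed; use the version from kafka-users.yaml
kind: KafkaUser
metadata:
  name: ${user}
  namespace: kafka
  labels:
    strimzi.io/cluster: kafka
spec:
  authentication:
    type: scram-sha-512
  authorization:
    type: simple
    acls:
      - resource:
          type: topic
          name: "${topic_prefix}"
          patternType: prefix
        operations: [Read, Write, Describe]
      - resource:
          type: group
          name: "${user}-"
          patternType: prefix
        operations: [Read]
EOF
}

# Example usage (commented out so the block has no side effects):
# render_kafka_user orders-service orders. >> kafka-users.yaml
# kubectl apply -f kafka-users.yaml
```

Unlike `kafka-client`, which can read and write all topics and groups, this sketch limits the user to topics starting with the given prefix and to consumer groups starting with its own name.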
---
## Upgrade
### Upgrade the Kafka version
The Kafka versions you can run are dictated by the installed Strimzi operator. Check the Kafka version currently running and the operator version offered by the Helm chart:
```bash
kubectl get kafka kafka -n kafka \
-o jsonpath='{.status.kafkaVersion}'
helm show chart strimzi/strimzi-kafka-operator | grep appVersion
```
To upgrade Kafka from e.g. `4.1.0` to `4.1.1`:
1. Update `spec.kafka.version` in `kafka.yaml`
2. Update `spec.kafka.metadataVersion` if the new version introduces a new metadata version
3. Apply the change:
```bash
kubectl apply -f kafka.yaml
kubectl wait kafka/kafka --for=condition=Ready --timeout=10m -n kafka
```
Strimzi performs a rolling restart of the broker pod automatically.
### Upgrade the Strimzi operator
```bash
helm repo update strimzi
helm upgrade strimzi-kafka-operator strimzi/strimzi-kafka-operator \
--namespace kafka \
--set watchAnyNamespace=false \
--wait --timeout 5m
```
> **Note:** Always upgrade the operator before upgrading the Kafka version. Check the [Strimzi upgrade guide](https://strimzi.io/docs/operators/latest/deploying.html#assembly-upgrade-str) for supported upgrade paths.
---
## Uninstallation
```bash
# Delete Kafka cluster and users (PVC is deleted because deleteClaim: true)
kubectl delete -f kafka-users.yaml
kubectl delete -f kafka.yaml
# Wait for pods to terminate
kubectl wait --for=delete pod -l strimzi.io/cluster=kafka -n kafka --timeout=120s
# Uninstall Strimzi operator
helm uninstall strimzi-kafka-operator -n kafka
# Delete namespace (removes remaining namespaced resources such as secrets and role bindings)
# Note: the cluster-scoped Strimzi CRDs are not removed by this command
kubectl delete namespace kafka
```
> The `deleteClaim: true` flag in `kafka.yaml` ensures the 10 Gi PVC is deleted together with the KafkaNodePool, leaving no orphaned volumes.