commit 569dff28a78ae23e34faabe4fa63230c789c90ad Author: sttlab Date: Fri May 15 07:41:27 2026 +0000 first commit diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..4c5d062 --- /dev/null +++ b/.gitignore @@ -0,0 +1,4 @@ +kafka-ca.crt +*.p12 +*.jks +*.pem diff --git a/README.md b/README.md new file mode 100644 index 0000000..50ceb4a --- /dev/null +++ b/README.md @@ -0,0 +1,257 @@ +# Kafka on Kubernetes (Strimzi) + +Single-broker Kafka cluster for dev/test, deployed via the [Strimzi](https://strimzi.io/) operator on a single-node k3s cluster. + +--- + +## Architecture + +``` +kafka namespace +├── strimzi-cluster-operator Strimzi operator — reconciles Kafka/KafkaUser/KafkaTopic CRs +├── kafka-dual-role-0 Kafka 4.1.0 broker (KRaft, controller + broker in one pod) +└── kafka-entity-operator Topic Operator + User Operator (manages topics and users) +``` + +### Key design decisions + +| Choice | Rationale | +|---|---| +| **KRaft mode** | No ZooKeeper dependency — simpler, fewer pods | +| **Single node pool** `dual-role` | One pod acts as both controller and broker, suitable for dev/test | +| **SASL_SSL + SCRAM-SHA-512** | Authenticated and encrypted connections; TLS managed by Strimzi CA | +| **Simple authorization** | ACLs enforced per user; `kafka-admin` is declared super user | +| **`local-path` storage (10 Gi)** | Default k3s storage class; PVC deleted on cluster teardown | + +### Listeners + +| Name | Port | Protocol | Auth | +|---|---|---|---| +| `tls` | 9093 | SASL_SSL | SCRAM-SHA-512 | + +Bootstrap address (cluster-internal): +``` +kafka-kafka-bootstrap.kafka.svc.cluster.local:9093 +``` + +### Users + +| KafkaUser | Secret | Role | +|---|---|---| +| `kafka-admin` | `kafka-admin` | Super user — unrestricted access | +| `kafka-client` | `kafka-client` | Application user — Read/Write all topics and groups | + +Strimzi stores credentials in Kubernetes secrets. Each secret contains two keys: +- `password` — the SCRAM password +- `sasl.jaas.config` — the ready-to-use JAAS config string + +### TLS + +Strimzi generates and manages its own internal CA. The cluster CA certificate is stored in: +``` +secret/kafka-cluster-ca-cert (key: ca.crt) +``` + +This certificate must be trusted by any Kafka client connecting over TLS. + +--- + +## Prerequisites + +- `kubectl` configured against the target cluster +- `helm` v3 +- `cert-manager` is **not** required — Strimzi manages its own CA + +--- + +## Installation + +```bash +./deploy.sh +``` + +The script performs the following steps: + +1. Creates the `kafka` namespace +2. Installs the Strimzi operator via Helm (scoped to the `kafka` namespace) +3. Waits for the Strimzi CRDs to be fully established +4. Applies `kafka.yaml` (KafkaNodePool + Kafka CRs) and waits for `Ready` +5. Applies `kafka-users.yaml` (KafkaUser CRs) and waits for `Ready` + +Expected duration: **4–6 minutes** on a single-node k3s cluster. + +### Verify the installation + +```bash +kubectl get kafka,kafkanodepool,kafkauser,pods -n kafka +``` + +Expected output: + +``` +NAME READY KAFKA VERSION METADATA VERSION +kafka.kafka.strimzi.io/kafka True 4.1.0 4.1-IV0 + +NAME DESIRED REPLICAS ROLES NODEIDS +kafkanodepool.kafka.strimzi.io/dual-role 1 ["controller","broker"] [0] + +NAME CLUSTER AUTHENTICATION AUTHORIZATION READY +kafkauser.kafka.strimzi.io/kafka-admin kafka scram-sha-512 True +kafkauser.kafka.strimzi.io/kafka-client kafka scram-sha-512 simple True +``` + +--- + +## Client configuration + +### Get the CA certificate + +```bash +kubectl -n kafka get secret kafka-cluster-ca-cert \ + -o jsonpath='{.data.ca\.crt}' | base64 -d > kafka-ca.crt +``` + +### Get user credentials + +```bash +# Admin password +kubectl -n kafka get secret kafka-admin \ + -o jsonpath='{.data.password}' | base64 -d + +# Client password +kubectl -n kafka get secret kafka-client \ + -o jsonpath='{.data.password}' | base64 -d + +# Ready-to-use JAAS config (includes username and password) +kubectl -n kafka get secret kafka-client \ + -o jsonpath='{.data.sasl\.jaas\.config}' | base64 -d +``` + +### Sample `client.properties` + +```properties +bootstrap.servers=kafka-kafka-bootstrap.kafka.svc.cluster.local:9093 +security.protocol=SASL_SSL +ssl.truststore.type=PEM +ssl.truststore.location=/path/to/kafka-ca.crt +sasl.mechanism=SCRAM-SHA-512 +sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \ + username="kafka-client" password=""; +``` + +--- + +## Testing + +A smoke test job is provided that validates the full produce/consume cycle over SASL_SSL. + +### Run the smoke test + +```bash +kubectl apply -f kafka-smoke-test.yaml +kubectl logs -n kafka -l job-name=kafka-smoke-test -f +``` + +### Re-run + +```bash +kubectl delete job kafka-smoke-test -n kafka +kubectl apply -f kafka-smoke-test.yaml +kubectl logs -n kafka -l job-name=kafka-smoke-test -f +``` + +### What the test does + +1. **Topic creation** — creates topic `smoke-test` with `--if-not-exists` (idempotent) +2. **Produce** — sends 5 messages (`message-1` … `message-5`) via `kafka-console-producer` +3. **Consume** — reads 5 messages from the beginning via `kafka-console-consumer` and asserts count + +Expected output: +``` +==> [1/3] Creating topic 'smoke-test' (idempotent) + Topic ready. +==> [2/3] Producing 5 messages + 5 messages produced. +==> [3/3] Consuming (from-beginning, max 5) +message-1 +message-2 +message-3 +message-4 +message-5 +==> SMOKE TEST PASSED (5/5 messages) +``` + +The job is cleaned up automatically after 10 minutes (`ttlSecondsAfterFinished: 600`). + +--- + +## Adding a new user + +Create a new `KafkaUser` manifest following the `kafka-client` pattern in `kafka-users.yaml`, then apply it: + +```bash +kubectl apply -f kafka-users.yaml +``` + +Strimzi will create the corresponding secret in the `kafka` namespace within seconds. + +--- + +## Upgrade + +### Upgrade the Kafka version + +Supported versions are dictated by the installed Strimzi operator. Check available versions: + +```bash +kubectl get kafka kafka -n kafka \ + -o jsonpath='{.status.kafkaVersion}' + +helm show chart strimzi/strimzi-kafka-operator | grep appVersion +``` + +To upgrade Kafka from e.g. `4.1.0` to `4.1.1`: + +1. Update `spec.kafka.version` in `kafka.yaml` +2. Update `spec.kafka.metadataVersion` if the new version introduces a new metadata version +3. Apply the change: + +```bash +kubectl apply -f kafka.yaml +kubectl wait kafka/kafka --for=condition=Ready --timeout=10m -n kafka +``` + +Strimzi performs a rolling restart of the broker pod automatically. + +### Upgrade the Strimzi operator + +```bash +helm repo update strimzi +helm upgrade strimzi-kafka-operator strimzi/strimzi-kafka-operator \ + --namespace kafka \ + --set watchAnyNamespace=false \ + --wait --timeout 5m +``` + +> **Note:** always upgrade the operator before upgrading the Kafka version. Check the [Strimzi upgrade guide](https://strimzi.io/docs/operators/latest/deploying.html#assembly-upgrade-str) for supported upgrade paths. + +--- + +## Uninstallation + +```bash +# Delete Kafka cluster and users (PVC is deleted because deleteClaim: true) +kubectl delete -f kafka-users.yaml +kubectl delete -f kafka.yaml + +# Wait for pods to terminate +kubectl wait --for=delete pod -l strimzi.io/cluster=kafka -n kafka --timeout=120s + +# Uninstall Strimzi operator +helm uninstall strimzi-kafka-operator -n kafka + +# Delete namespace (removes all remaining secrets and CRDs bindings) +kubectl delete namespace kafka +``` + +> The `deleteClaim: true` flag in `kafka.yaml` ensures the 10 Gi PVC is deleted together with the KafkaNodePool, leaving no orphaned volumes. diff --git a/deploy.sh b/deploy.sh new file mode 100755 index 0000000..f159f6d --- /dev/null +++ b/deploy.sh @@ -0,0 +1,85 @@ +#!/usr/bin/env bash +set -euo pipefail + +NAMESPACE="kafka" +KAFKA_NAME="kafka" +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" + +echo "==> Checking prerequisites" +command -v kubectl >/dev/null || { echo "kubectl not found"; exit 1; } +command -v helm >/dev/null || { echo "helm not found"; exit 1; } + +echo "==> Verifying cluster reachable" +kubectl cluster-info --request-timeout=5s >/dev/null + +echo "==> Step 1/5 : Create namespace" +kubectl create namespace "${NAMESPACE}" --dry-run=client -o yaml | kubectl apply -f - + +echo "==> Step 2/5 : Install Strimzi operator" +helm repo add strimzi https://strimzi.io/charts/ 2>/dev/null || true +helm repo update strimzi + +helm upgrade --install strimzi-kafka-operator strimzi/strimzi-kafka-operator \ + --namespace "${NAMESPACE}" \ + --set watchAnyNamespace=false \ + --wait --timeout 5m + +echo "==> Waiting for Strimzi Cluster Operator to be ready" +kubectl rollout status deployment/strimzi-cluster-operator -n "${NAMESPACE}" --timeout=120s + +echo "==> Waiting for Strimzi CRDs to be fully established" +for crd in kafkas.kafka.strimzi.io kafkanodepools.kafka.strimzi.io kafkausers.kafka.strimzi.io; do + until kubectl get crd "${crd}" -o jsonpath='{.status.conditions[?(@.type=="Established")].status}' 2>/dev/null | grep -q "True"; do + sleep 2 + done + echo " - ${crd} established" +done + +echo "==> Step 3/5 : Apply Kafka cluster (KRaft, TLS, SCRAM-SHA-512)" +kubectl apply -f "${SCRIPT_DIR}/kafka.yaml" + +echo "==> Waiting for Kafka cluster to be Ready (3-5 min)" +kubectl wait kafka/"${KAFKA_NAME}" \ + --for=condition=Ready \ + --timeout=10m \ + -n "${NAMESPACE}" + +echo "==> Step 4/5 : Apply KafkaUsers" +kubectl apply -f "${SCRIPT_DIR}/kafka-users.yaml" + +echo "==> Waiting for KafkaUsers to be Ready" +for user in kafka-admin kafka-client; do + echo " - waiting for ${user}" + kubectl wait kafkauser/"${user}" \ + --for=condition=Ready \ + --timeout=120s \ + -n "${NAMESPACE}" +done + +echo "" +echo "==> Step 5/5 : Deployment complete" +echo "" +kubectl get pods -n "${NAMESPACE}" +echo "" +echo "Bootstrap (TLS + SCRAM-SHA-512, cluster-internal):" +echo " kafka-kafka-bootstrap.${NAMESPACE}.svc.cluster.local:9093" +echo "" +echo "Get CA cert (import on client side):" +echo " kubectl -n ${NAMESPACE} get secret kafka-cluster-ca-cert \\" +echo " -o jsonpath='{.data.ca\\.crt}' | base64 -d > kafka-ca.crt" +echo "" +echo "Get SCRAM credentials:" +echo " # Admin" +echo " kubectl -n ${NAMESPACE} get secret kafka-admin -o jsonpath='{.data.password}' | base64 -d" +echo " # Client" +echo " kubectl -n ${NAMESPACE} get secret kafka-client -o jsonpath='{.data.password}' | base64 -d" +echo "" +echo "Sample client config (properties):" +echo " bootstrap.servers=kafka-kafka-bootstrap.${NAMESPACE}.svc.cluster.local:9093" +echo " security.protocol=SASL_SSL" +echo " ssl.truststore.type=PEM" +echo " ssl.truststore.certificates=" +echo " sasl.mechanism=SCRAM-SHA-512" +echo " sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \\" +echo " username=\"kafka-client\" password=\"\";" +echo "" diff --git a/kafka-smoke-test.yaml b/kafka-smoke-test.yaml new file mode 100644 index 0000000..67e69bd --- /dev/null +++ b/kafka-smoke-test.yaml @@ -0,0 +1,91 @@ +apiVersion: batch/v1 +kind: Job +metadata: + name: kafka-smoke-test + namespace: kafka +spec: + ttlSecondsAfterFinished: 600 + backoffLimit: 3 + template: + spec: + restartPolicy: Never + volumes: + - name: ca-cert + secret: + secretName: kafka-cluster-ca-cert + items: + - key: ca.crt + path: ca.crt + containers: + - name: smoke-test + image: apache/kafka:4.1.0 + command: ["/bin/bash", "-c"] + args: + - | + set -euo pipefail + BOOTSTRAP="kafka-kafka-bootstrap.kafka.svc.cluster.local:9093" + TOPIC="smoke-test" + + # Build client.properties injecting sasl.jaas.config from the Strimzi-managed secret + cat > /tmp/client.properties < [1/3] Creating topic '${TOPIC}' (idempotent)" + /opt/kafka/bin/kafka-topics.sh \ + --bootstrap-server "${BOOTSTRAP}" \ + --command-config /tmp/client.properties \ + --create \ + --topic "${TOPIC}" \ + --partitions 1 \ + --replication-factor 1 \ + --if-not-exists + echo " Topic ready." + + echo "==> [2/3] Producing 5 messages" + printf 'message-%s\n' 1 2 3 4 5 | \ + /opt/kafka/bin/kafka-console-producer.sh \ + --bootstrap-server "${BOOTSTRAP}" \ + --producer.config /tmp/client.properties \ + --topic "${TOPIC}" + echo " 5 messages produced." + + echo "==> [3/3] Consuming (from-beginning, max 5)" + MESSAGES=$(/opt/kafka/bin/kafka-console-consumer.sh \ + --bootstrap-server "${BOOTSTRAP}" \ + --consumer.config /tmp/client.properties \ + --topic "${TOPIC}" \ + --from-beginning \ + --max-messages 5 \ + --timeout-ms 30000 2>/dev/null || true) + + echo "${MESSAGES}" + COUNT=$(echo "${MESSAGES}" | grep -c '^message-' || true) + + if [ "${COUNT}" -eq 5 ]; then + echo "==> SMOKE TEST PASSED (${COUNT}/5 messages)" + else + echo "==> SMOKE TEST FAILED: expected 5, got ${COUNT}" + exit 1 + fi + env: + - name: SASL_JAAS_CONFIG + valueFrom: + secretKeyRef: + name: kafka-client + key: sasl.jaas.config + volumeMounts: + - name: ca-cert + mountPath: /ca + readOnly: true + resources: + requests: + memory: 256Mi + cpu: "100m" + limits: + memory: 512Mi + cpu: "500m" diff --git a/kafka-users.yaml b/kafka-users.yaml new file mode 100644 index 0000000..1cccee5 --- /dev/null +++ b/kafka-users.yaml @@ -0,0 +1,53 @@ +--- +# Admin account: super user (declared in Kafka CR → authorization.superUsers). +# Strimzi-generated secret: kubectl -n kafka get secret kafka-admin +apiVersion: kafka.strimzi.io/v1 +kind: KafkaUser +metadata: + name: kafka-admin + namespace: kafka + labels: + strimzi.io/cluster: kafka +spec: + authentication: + type: scram-sha-512 +--- +# Application account with explicit ACLs — duplicate per application. +# Strimzi-generated secret: kubectl -n kafka get secret kafka-client +apiVersion: kafka.strimzi.io/v1 +kind: KafkaUser +metadata: + name: kafka-client + namespace: kafka + labels: + strimzi.io/cluster: kafka +spec: + authentication: + type: scram-sha-512 + authorization: + type: simple + acls: + - resource: + type: topic + name: "*" + patternType: literal + operations: + - Read + - Write + - Create + - Describe + - DescribeConfigs + - resource: + type: group + name: "*" + patternType: literal + operations: + - Read + - Describe + - resource: + type: transactionalId + name: "*" + patternType: literal + operations: + - Describe + - Write diff --git a/kafka.yaml b/kafka.yaml new file mode 100644 index 0000000..20ed5e8 --- /dev/null +++ b/kafka.yaml @@ -0,0 +1,60 @@ +--- +# KRaft mode: single pod acting as both controller and broker. +# Storage is defined here (not in the Kafka CR) because node pools are enabled. +apiVersion: kafka.strimzi.io/v1 +kind: KafkaNodePool +metadata: + name: dual-role + namespace: kafka + labels: + strimzi.io/cluster: kafka +spec: + replicas: 1 + roles: + - controller + - broker + storage: + type: persistent-claim + size: 10Gi + deleteClaim: true + class: local-path + resources: + requests: + memory: 1Gi + cpu: "250m" + limits: + memory: 2Gi + cpu: "1000m" +--- +apiVersion: kafka.strimzi.io/v1 +kind: Kafka +metadata: + name: kafka + namespace: kafka + annotations: + strimzi.io/node-pools: enabled + strimzi.io/kraft: enabled +spec: + kafka: + version: 4.1.0 + metadataVersion: "4.1-IV0" + listeners: + - name: tls + port: 9093 + type: internal + tls: true + authentication: + type: scram-sha-512 + authorization: + type: simple + superUsers: + - kafka-admin + config: + offsets.topic.replication.factor: "1" + transaction.state.log.replication.factor: "1" + transaction.state.log.min.isr: "1" + default.replication.factor: "1" + min.insync.replicas: "1" + entityOperator: + topicOperator: {} + userOperator: {}