Kafka Migration Guide: ArgoCD + MirrorMaker2

Version: 1.0
Last Updated: February 2026
Scenario: Migrating a Kafka deployment from a source Kubernetes cluster to a target cluster using application-level replication, with ArgoCD managing both sides.


Overview

This guide walks through:

  1. Deploying a minimal Kafka cluster to a source cluster managed by ArgoCD
  2. Verifying Kafka is functional by producing and consuming messages
  3. Deploying Kafka to a target cluster via ArgoCD
  4. Configuring MirrorMaker2 to replicate data from source to target
  5. Promoting the target cluster to primary and decommissioning the source

Why This Migration Strategy

Traditional Kubernetes migrations use volume-level tools like Velero to snapshot and restore Persistent Volumes. For Kafka this is risky - Kafka stores data in log segments tightly coupled to broker IDs, offsets, and cluster metadata. A storage-level snapshot can capture an inconsistent state mid-write.

MirrorMaker2 solves this by replicating at the application level:

  • Topics and their data are replicated as Kafka records (the native format)
  • Consumer group offsets are translated and replicated to the target
  • Replication runs continuously, keeping target current until cutover
  • You can verify data integrity on the target before cutting over
  • Cutover is a configuration change, not a storage operation

What You Will Build

┌─────────────────────────────────┐     ┌─────────────────────────────────┐
│         SOURCE CLUSTER          │     │         TARGET CLUSTER          │
│                                 │     │                                 │
│  ┌─────────┐    ┌───────────┐   │     │  ┌─────────┐    ┌───────────┐   │
│  │ ArgoCD  │───▶│   Kafka   │   │     │  │ ArgoCD  │───▶│   Kafka   │   │
│  │         │    │  Cluster  │   │     │  │         │    │ (standby) │   │
│  └─────────┘    └─────┬─────┘   │     │  └─────────┘    └─────▲─────┘   │
│                       │         │     │                       │         │
│                       │         │     │                ┌──────┴──────┐  │
│                       └─────────┼─────┼───────────────▶│ MirrorMaker2│  │
│                     Replication │     │                └─────────────┘  │
└─────────────────────────────────┘     └─────────────────────────────────┘

Prerequisites

  • Two Kubernetes clusters (source and target)
  • kubectl configured to access both clusters
  • ArgoCD installed on both clusters
  • Strimzi Kafka Operator available (installed via ArgoCD)
  • Git repository to store manifests

Part 1: Deploy Kafka to the Source Cluster

Step 1.1 - Repository Structure

Create the following Git repository structure. This is everything ArgoCD will manage.

kafka-gitops/
├── base/
│   └── kafka/
│       ├── kustomization.yaml
│       ├── namespace.yaml
│       ├── strimzi-operatorgroup.yaml
│       ├── strimzi-subscription.yaml
│       ├── kafka-cluster.yaml
│       └── kafka-topic.yaml
└── overlays/
    ├── source/
    │   ├── kustomization.yaml
    │   └── argocd-application.yaml
    └── target/
        ├── kustomization.yaml
        ├── kafka-cluster-patch.yaml
        └── argocd-application.yaml

Step 1.2 - Base Manifests

base/kafka/namespace.yaml

apiVersion: v1
kind: Namespace
metadata:
  name: kafka

base/kafka/strimzi-operatorgroup.yaml

apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: strimzi-operatorgroup
  namespace: kafka
spec:
  targetNamespaces:
    - kafka

base/kafka/strimzi-subscription.yaml

Install the Strimzi operator, which manages Kafka clusters as Kubernetes custom resources.

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: strimzi-kafka-operator
  namespace: kafka
spec:
  channel: stable
  name: strimzi-kafka-operator
  source: operatorhubio-catalog
  sourceNamespace: olm
  installPlanApproval: Automatic

Note for OpenShift/ROSA: Replace sourceNamespace: olm with sourceNamespace: openshift-marketplace, and use source: community-operators for the community Strimzi operator (the subscription name stays strimzi-kafka-operator). If you prefer the supported distribution, subscribe to the AMQ Streams operator from the redhat-operators catalog instead.


base/kafka/kafka-cluster.yaml

This is the minimal Kafka cluster definition: 1 broker, 1 ZooKeeper node, and a small persistent-claim PVC in place of ephemeral storage so the data survives pod restarts.

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
  namespace: kafka
spec:
  kafka:
    version: 3.6.0
    replicas: 1                         # Minimal: single broker
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: external
        port: 9094
        type: nodeport
        tls: false
    config:
      offsets.topic.replication.factor: 1
      transaction.state.log.replication.factor: 1
      transaction.state.log.min.isr: 1
      default.replication.factor: 1
      min.insync.replicas: 1
      inter.broker.protocol.version: "3.6"
    storage:
      type: persistent-claim
      size: 5Gi                         # Small PVC - sufficient for the demo
      deleteClaim: false
  zookeeper:
    replicas: 1                         # Minimal: single ZooKeeper node
    storage:
      type: persistent-claim
      size: 1Gi
      deleteClaim: false
  entityOperator:
    topicOperator: {}                   # Enables KafkaTopic CRD management
    userOperator: {}

Why single replicas? This keeps the example easy to follow. In production you would use at least 3 brokers and 3 ZooKeeper nodes. The migration steps are identical regardless of replica count.


base/kafka/kafka-topic.yaml

Create a test topic we will produce messages to and verify on the target side.

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: test-topic
  namespace: kafka
  labels:
    strimzi.io/cluster: my-cluster
spec:
  partitions: 3
  replicas: 1
  config:
    retention.ms: 604800000             # 7 days
    segment.bytes: 1073741824

base/kafka/kustomization.yaml

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - namespace.yaml
  - strimzi-operatorgroup.yaml
  - strimzi-subscription.yaml
  - kafka-cluster.yaml
  - kafka-topic.yaml

Step 1.3 - Source Overlay

overlays/source/kustomization.yaml

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

bases:
  - ../../base/kafka

# No patches needed for source - base config is the source config

overlays/source/argocd-application.yaml

This is the ArgoCD Application resource. Apply this to the ArgoCD namespace on the source cluster to have ArgoCD take ownership of the Kafka deployment.

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: kafka-source
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/kafka-gitops.git
    targetRevision: main
    path: overlays/source
  destination:
    server: https://kubernetes.default.svc
    namespace: kafka
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
      - ServerSideApply=true          # Required for Strimzi CRDs

Step 1.4 - Apply the ArgoCD Application

# Set context to source cluster
kubectl config use-context source-cluster

# Apply the ArgoCD Application
kubectl apply -f overlays/source/argocd-application.yaml

# Watch ArgoCD sync
argocd app get kafka-source
argocd app sync kafka-source

# Watch resources come up
kubectl get pods -n kafka -w

Expected output after a few minutes:

NAME                                          READY   STATUS    RESTARTS
my-cluster-entity-operator-7d9f8c5b4-abc12   3/3     Running   0
my-cluster-kafka-0                            1/1     Running   0
my-cluster-zookeeper-0                        1/1     Running   0
strimzi-cluster-operator-6d8f9c5b4-xyz99     1/1     Running   0
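
If you prefer to block until the cluster is actually ready rather than watch pods, kubectl wait works against the Strimzi Kafka resource (an optional convenience, assuming the cluster name used above):

# Wait for the Kafka CR to report the Ready condition
kubectl wait kafka/my-cluster --for=condition=Ready --timeout=600s -n kafka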

Part 2: Verify Kafka is Functional

Before migrating anything, confirm the source cluster is producing and consuming messages correctly. This also creates data we will verify on the target after migration.

Step 2.1 - Produce Test Messages

Open a terminal and run a producer pod. This sends 10 messages to test-topic.

kubectl -n kafka run kafka-producer -ti \
  --image=quay.io/strimzi/kafka:0.38.0-kafka-3.6.0 \
  --rm=true \
  --restart=Never \
  -- bin/kafka-console-producer.sh \
     --bootstrap-server my-cluster-kafka-bootstrap:9092 \
     --topic test-topic

At the > prompt, type your test messages and press Enter after each:

> Hello from the source cluster - message 1
> Hello from the source cluster - message 2
> Hello from the source cluster - message 3
> Migration test message - timestamp 2026-02-17
> (Ctrl+C to exit)

Step 2.2 - Consume and Verify Messages

In a second terminal, run a consumer to read back all messages from the beginning:

kubectl -n kafka run kafka-consumer -ti \
  --image=quay.io/strimzi/kafka:0.38.0-kafka-3.6.0 \
  --rm=true \
  --restart=Never \
  -- bin/kafka-console-consumer.sh \
     --bootstrap-server my-cluster-kafka-bootstrap:9092 \
     --topic test-topic \
     --from-beginning

Expected output:

Hello from the source cluster - message 1
Hello from the source cluster - message 2
Hello from the source cluster - message 3
Migration test message - timestamp 2026-02-17

Step 2.3 - Verify Topic Metadata

# Check topic details
kubectl -n kafka run kafka-topics -ti \
  --image=quay.io/strimzi/kafka:0.38.0-kafka-3.6.0 \
  --rm=true \
  --restart=Never \
  -- bin/kafka-topics.sh \
     --bootstrap-server my-cluster-kafka-bootstrap:9092 \
     --describe \
     --topic test-topic

Expected output:

Topic: test-topic       TopicId: abc123def456  PartitionCount: 3
ReplicationFactor: 1    Configs: retention.ms=604800000,segment.bytes=1073741824

Topic: test-topic       Partition: 0    Leader: 0   Replicas: 0   Isr: 0
Topic: test-topic       Partition: 1    Leader: 0   Replicas: 0   Isr: 0
Topic: test-topic       Partition: 2    Leader: 0   Replicas: 0   Isr: 0

Step 2.4 - Check ArgoCD Status

argocd app get kafka-source

Expected output:

Name:               argocd/kafka-source
Project:            default
Server:             https://kubernetes.default.svc
Namespace:          kafka
URL:                https://argocd.source-cluster/applications/kafka-source
Repo:               https://github.com/your-org/kafka-gitops.git
Target:             main
Path:               overlays/source
SyncStatus:         Synced
HealthStatus:       Healthy

Source cluster Kafka is deployed, managed by ArgoCD, and verified functional. We have test data in test-topic.


Part 3: Deploy Kafka to the Target Cluster via ArgoCD

Before configuring replication, we deploy Kafka to the target cluster. At this point it is a fresh, empty cluster. MirrorMaker2 will fill it with data from the source.

Step 3.1 - Target Overlay

The target Kafka cluster configuration is identical to the source; the Kafka resource even keeps the same name (my-cluster). What distinguishes the two clusters are the aliases MirrorMaker2 uses (source and target): when MirrorMaker2 replicates topics from the source, it prefixes them with the source cluster alias (e.g., source.test-topic), so the provenance of replicated data is always unambiguous.

overlays/target/kafka-cluster-patch.yaml

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
  namespace: kafka
spec:
  kafka:
    # Everything else inherited from base
    # No changes needed - identical cluster config
    replicas: 1
    storage:
      type: persistent-claim
      size: 5Gi
      deleteClaim: false
  zookeeper:
    replicas: 1
    storage:
      type: persistent-claim
      size: 1Gi
      deleteClaim: false

At this stage the target cluster config is the same as source. We are not making it a "replica" at the Kafka level - MirrorMaker2 handles replication externally and independently of cluster config.


overlays/target/kustomization.yaml

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

bases:
  - ../../base/kafka

patches:
  - path: kafka-cluster-patch.yaml
    target:
      kind: Kafka
      name: my-cluster

overlays/target/argocd-application.yaml

Apply this to the ArgoCD instance on the target cluster.

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: kafka-target
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/kafka-gitops.git
    targetRevision: main
    path: overlays/target
  destination:
    server: https://kubernetes.default.svc
    namespace: kafka
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
      - ServerSideApply=true

Step 3.2 - Apply to Target Cluster

# Switch context to target cluster
kubectl config use-context target-cluster

# Apply ArgoCD Application
kubectl apply -f overlays/target/argocd-application.yaml

# Watch rollout
kubectl get pods -n kafka -w

Expected output - target cluster running but empty:

NAME                                          READY   STATUS    RESTARTS
my-cluster-entity-operator-7d9f8c5b4-def34   3/3     Running   0
my-cluster-kafka-0                            1/1     Running   0
my-cluster-zookeeper-0                        1/1     Running   0
strimzi-cluster-operator-6d8f9c5b4-uvw88     1/1     Running   0

Step 3.3 - Verify Target is Empty

Confirm the target has no data yet in test-topic (it shouldn't - the topic was created by the KafkaTopic CR but has zero messages):

kubectl config use-context target-cluster

kubectl -n kafka run kafka-consumer -ti \
  --image=quay.io/strimzi/kafka:0.38.0-kafka-3.6.0 \
  --rm=true \
  --restart=Never \
  -- bin/kafka-console-consumer.sh \
     --bootstrap-server my-cluster-kafka-bootstrap:9092 \
     --topic test-topic \
     --from-beginning \
     --timeout-ms 5000

Expected output: no messages; the consumer exits after the 5-second timeout having read nothing.

Processed a total of 0 messages

Both clusters are running. Source has data. Target is empty and awaiting replication.


Part 4: Configure MirrorMaker2 for Cross-Cluster Replication

MirrorMaker2 is Kafka's built-in cross-cluster replication tool, built on top of Kafka Connect. Strimzi manages it via the KafkaMirrorMaker2 custom resource.

Step 4.1 - Networking: Expose the Source Kafka

MirrorMaker2 runs on the target cluster and reaches back to the source. The source Kafka broker must be accessible from the target cluster's network.

Option A: NodePort (simplest for testing)

# On source cluster - check the NodePort assigned to external listener
kubectl config use-context source-cluster

kubectl get svc -n kafka my-cluster-kafka-external-bootstrap
# NAME                                    TYPE       CLUSTER-IP      PORT(S)
# my-cluster-kafka-external-bootstrap     NodePort   10.96.xxx.xxx   9094:3xxxx/TCP

# Get a node IP
kubectl get nodes -o wide
# NAME       STATUS   ROLES   AGE   VERSION   INTERNAL-IP      EXTERNAL-IP
# node-1     Ready    <none>  5d    v1.27.0   192.168.1.10     <none>

# Source bootstrap address for MirrorMaker2:
# 192.168.1.10:3xxxx  (node IP + nodeport)
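
If you prefer to capture the bootstrap address in one step, a small shell sketch can assemble it (a convenience, assuming the service name above and that the node's InternalIP is routable from the target cluster; substitute an ExternalIP if not):

# Assemble "node IP:nodePort" for use as SOURCE_BOOTSTRAP_ADDRESS later
NODE_IP=$(kubectl get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}')
NODE_PORT=$(kubectl get svc -n kafka my-cluster-kafka-external-bootstrap -o jsonpath='{.spec.ports[0].nodePort}')
echo "Source bootstrap for MirrorMaker2: ${NODE_IP}:${NODE_PORT}"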

Option B: LoadBalancer (if clusters are on same VPC/cloud)

# Update the external listener in base/kafka/kafka-cluster.yaml
listeners:
  - name: external
    port: 9094
    type: loadbalancer     # Change from nodeport to loadbalancer
    tls: false

# Commit the change and let ArgoCD sync, then check the provisioned address
kubectl config use-context source-cluster
kubectl get svc -n kafka my-cluster-kafka-external-bootstrap
# The EXTERNAL-IP column will show the LoadBalancer address after provisioning
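
Once the LoadBalancer is provisioned, the address can be captured with jsonpath (a sketch; AWS NLBs report a hostname, while other clouds may populate .ip instead):

# Capture the LoadBalancer address for use as SOURCE_BOOTSTRAP_ADDRESS later
LB_HOST=$(kubectl get svc -n kafka my-cluster-kafka-external-bootstrap \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
echo "Source bootstrap for MirrorMaker2: ${LB_HOST}:9094"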

For EKS → EKS or EKS → ROSA migrations on AWS: Use a LoadBalancer listener. Both clusters are on AWS and the NLB address is routable between VPCs (with VPC peering or Transit Gateway). This is the realistic scenario for your use case.


Step 4.2 - Add MirrorMaker2 to the Git Repository

Add these files to your repository:

kafka-gitops/
├── base/
│   └── kafka/
│       └── ...existing files...
└── overlays/
    └── target/
        ├── kustomization.yaml          (update this)
        ├── kafka-cluster-patch.yaml
        ├── mirrormaker2.yaml           (new)
        └── argocd-application.yaml

overlays/target/mirrormaker2.yaml

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaMirrorMaker2
metadata:
  name: mm2
  namespace: kafka
spec:
  version: 3.6.0
  replicas: 1                           # Single MirrorMaker2 worker (minimal)
  connectCluster: "target"              # MM2 internally uses Kafka Connect
                                        # targeting the local (target) cluster
  clusters:
    # ----- SOURCE CLUSTER -----
    - alias: "source"
      bootstrapServers: "SOURCE_BOOTSTRAP_ADDRESS"
      # Replace SOURCE_BOOTSTRAP_ADDRESS with the external address from Step 4.1
      # Example: "192.168.1.10:32094"  (NodePort)
      # Example: "a1b2c3d4.elb.us-east-1.amazonaws.com:9094"  (LoadBalancer)
      config:
        ssl.endpoint.identification.algorithm: ""

    # ----- TARGET CLUSTER -----
    - alias: "target"
      bootstrapServers: "my-cluster-kafka-bootstrap:9092"
      # Internal cluster address - MirrorMaker2 runs in the same namespace
      config:
        config.storage.replication.factor: 1
        offset.storage.replication.factor: 1
        status.storage.replication.factor: 1

  mirrors:
    - sourceCluster: "source"
      targetCluster: "target"
      sourceConnector:
        config:
          replication.factor: 1
          offset-syncs.topic.replication.factor: 1
          sync.topic.acls.enabled: "false"
          # Replicate all topics - adjust regex to limit scope
          topics: ".*"
      heartbeatConnector:
        config:
          heartbeats.topic.replication.factor: 1
      checkpointConnector:
        config:
          checkpoints.topic.replication.factor: 1
          # Replicate consumer group offsets
          # This ensures consumers can resume on the target
          # at the same position they were at on the source
          sync.group.offsets.enabled: "true"
          sync.group.offsets.interval.seconds: "60"

Update overlays/target/kustomization.yaml to include MirrorMaker2:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

bases:
  - ../../base/kafka

resources:
  - mirrormaker2.yaml                   # Add this line

patches:
  - path: kafka-cluster-patch.yaml
    target:
      kind: Kafka
      name: my-cluster

Step 4.3 - Commit and Let ArgoCD Deploy MirrorMaker2

git add overlays/target/mirrormaker2.yaml
git add overlays/target/kustomization.yaml
git commit -m "Add MirrorMaker2 for cross-cluster replication"
git push

# ArgoCD on target cluster will detect the change and sync automatically
# Or trigger manually:
kubectl config use-context target-cluster
argocd app sync kafka-target

# Watch MirrorMaker2 pod come up
kubectl get pods -n kafka -w

Expected output:

NAME                                          READY   STATUS    RESTARTS
mm2-mirrormaker2-7d9f8c5b4-xyz11              1/1     Running   0
my-cluster-entity-operator-7d9f8c5b4-def34   3/3     Running   0
my-cluster-kafka-0                            1/1     Running   0
my-cluster-zookeeper-0                        1/1     Running   0
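
To block until the connectors are deployed before checking their status, kubectl wait also works on the KafkaMirrorMaker2 resource (optional; the Ready condition is the same one shown in the describe output in the next step):

kubectl wait kafkamirrormaker2/mm2 --for=condition=Ready --timeout=300s -n kafka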

Step 4.4 - Verify MirrorMaker2 is Replicating

Check MirrorMaker2 status:

kubectl config use-context target-cluster

kubectl describe kafkamirrormaker2 mm2 -n kafka

Look for the Conditions section in the output:

Status:
  Conditions:
    Last Transition Time:  2026-02-17T10:05:00Z
    Status:                True
    Type:                  Ready
  Connector Plugins: ...
  Connectors:
    Name: source->target.MirrorSourceConnector
    Connector Status:
      State: RUNNING       # <-- This must be RUNNING
      Worker ID: mm2-...
    Name: source->target.MirrorCheckpointConnector
    Connector Status:
      State: RUNNING       # <-- This must be RUNNING
    Name: source->target.MirrorHeartbeatConnector
    Connector Status:
      State: RUNNING       # <-- This must be RUNNING
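
For a quicker check than reading the full describe output, a jsonpath sketch can print just the connector states (assuming the status layout shown above):

kubectl get kafkamirrormaker2 mm2 -n kafka \
  -o jsonpath='{range .status.connectors[*]}{.name}{": "}{.connector.state}{"\n"}{end}'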

Check replicated topics on target:

kubectl -n kafka run kafka-topics -ti \
  --image=quay.io/strimzi/kafka:0.38.0-kafka-3.6.0 \
  --rm=true \
  --restart=Never \
  -- bin/kafka-topics.sh \
     --bootstrap-server my-cluster-kafka-bootstrap:9092 \
     --list

Expected output:

__consumer_offsets
heartbeats
mm2-configs.source.internal
mm2-offsets.source.internal
mm2-status.source.internal
source.checkpoints.internal
source.test-topic                   # <-- Replicated topic (prefixed with "source.")

The source. prefix is MirrorMaker2's naming convention. When topics are replicated from the cluster aliased as source, they appear on the target as source.<original-topic-name>. This prevents naming collisions and makes the provenance of replicated data explicit.


Step 4.5 - Verify Data Replicated Successfully

Now consume from the replicated topic on the target to confirm the messages we produced in Part 2 have arrived:

kubectl config use-context target-cluster

kubectl -n kafka run kafka-consumer -ti \
  --image=quay.io/strimzi/kafka:0.38.0-kafka-3.6.0 \
  --rm=true \
  --restart=Never \
  -- bin/kafka-console-consumer.sh \
     --bootstrap-server my-cluster-kafka-bootstrap:9092 \
     --topic source.test-topic \
     --from-beginning

Expected output - your source messages appear on the target:

Hello from the source cluster - message 1
Hello from the source cluster - message 2
Hello from the source cluster - message 3
Migration test message - timestamp 2026-02-17

Data has replicated from source to target at the application level. No PV snapshots were taken. No Velero was needed.


Step 4.6 - Produce Live Messages to Confirm Continuous Replication

MirrorMaker2 replicates continuously, not just once. Verify this by producing new messages to the source and watching them appear on the target.

Terminal 1 - Start consumer on target (watch for new messages):

kubectl config use-context target-cluster

kubectl -n kafka run kafka-consumer-live -ti \
  --image=quay.io/strimzi/kafka:0.38.0-kafka-3.6.0 \
  --rm=true \
  --restart=Never \
  -- bin/kafka-console-consumer.sh \
     --bootstrap-server my-cluster-kafka-bootstrap:9092 \
     --topic source.test-topic

Terminal 2 - Produce new messages to source:

kubectl config use-context source-cluster

kubectl -n kafka run kafka-producer-live -ti \
  --image=quay.io/strimzi/kafka:0.38.0-kafka-3.6.0 \
  --rm=true \
  --restart=Never \
  -- bin/kafka-console-producer.sh \
     --bootstrap-server my-cluster-kafka-bootstrap:9092 \
     --topic test-topic

Type a few messages at the > prompt. Within seconds, those same messages should appear in Terminal 1 on the target cluster.

> Live replication test message 1
> Live replication test message 2

Continuous replication confirmed. The target is staying in sync with the source.


Part 5: Cutover - Promote Target to Primary

The target cluster is now fully caught up. The next steps stop writes to the source, let replication drain, and redirect applications to the target.

Step 5.1 - Understand the Cutover Sequence

BEFORE CUTOVER                        AFTER CUTOVER

Producers → Source Kafka              Producers → Target Kafka
                 ↓                                      (no replication)
           MirrorMaker2
                 ↓
           Target Kafka

The cutover has four phases:

  1. Pause producers - Stop applications writing to source
  2. Drain replication - Allow MirrorMaker2 to replicate the final in-flight messages
  3. Rename topics - Remove the source. prefix so topics match original names
  4. Redirect producers - Point applications at target, remove MirrorMaker2

Step 5.2 - Pause Producers on Source

Before stopping production writes, record the current offset on the source. This is your baseline to verify the target has caught up completely.

kubectl config use-context source-cluster

# Check current end offset on source
kubectl -n kafka run kafka-check-offsets -ti \
  --image=quay.io/strimzi/kafka:0.38.0-kafka-3.6.0 \
  --rm=true \
  --restart=Never \
  -- bin/kafka-run-class.sh kafka.tools.GetOffsetShell \
     --bootstrap-server my-cluster-kafka-bootstrap:9092 \
     --topic test-topic \
     --time -1

Example output:

test-topic:0:12
test-topic:1:9
test-topic:2:11

This means: partition 0 has 12 messages, partition 1 has 9, partition 2 has 11.

Now stop your producer applications (scale to 0, or disable in ArgoCD):

# If your producer is a Deployment
kubectl scale deployment my-producer-app --replicas=0 -n production

# Or in ArgoCD - suspend auto-sync for producer app
argocd app set my-producer-app --sync-policy none

Step 5.3 - Wait for Replication to Drain

Give MirrorMaker2 time to replicate the final messages. Then verify the target offset matches the source.

kubectl config use-context target-cluster

# Check current end offset on target (note: topic name is prefixed)
kubectl -n kafka run kafka-check-offsets-target -ti \
  --image=quay.io/strimzi/kafka:0.38.0-kafka-3.6.0 \
  --rm=true \
  --restart=Never \
  -- bin/kafka-run-class.sh kafka.tools.GetOffsetShell \
     --bootstrap-server my-cluster-kafka-bootstrap:9092 \
     --topic source.test-topic \
     --time -1

Expected output matching source:

source.test-topic:0:12    # Matches source partition 0
source.test-topic:1:9     # Matches source partition 1
source.test-topic:2:11    # Matches source partition 2

Repeat this check until counts match. With a quiet source it should drain within the MirrorMaker2 poll interval (typically 1-5 seconds).
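
If you would rather script this comparison than eyeball it, a rough sketch (hypothetical /tmp file names; contexts and image as used throughout this guide) is:

# Dump end offsets from both sides, normalize the topic name, and diff
kubectl --context source-cluster -n kafka run offsets-src -i --rm --restart=Never \
  --image=quay.io/strimzi/kafka:0.38.0-kafka-3.6.0 -- \
  bin/kafka-run-class.sh kafka.tools.GetOffsetShell \
    --bootstrap-server my-cluster-kafka-bootstrap:9092 --topic test-topic --time -1 \
  | sed 's/^test-topic/PART/' | grep '^PART:' | sort > /tmp/source-offsets.txt

kubectl --context target-cluster -n kafka run offsets-tgt -i --rm --restart=Never \
  --image=quay.io/strimzi/kafka:0.38.0-kafka-3.6.0 -- \
  bin/kafka-run-class.sh kafka.tools.GetOffsetShell \
    --bootstrap-server my-cluster-kafka-bootstrap:9092 --topic source.test-topic --time -1 \
  | sed 's/^source\.test-topic/PART/' | grep '^PART:' | sort > /tmp/target-offsets.txt

diff /tmp/source-offsets.txt /tmp/target-offsets.txt && echo "Offsets match - safe to proceed"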


Step 5.4 - Remove the source. Topic Prefix

Applications expect to read from test-topic, not source.test-topic. You have two options:

Option A: Update Application Config (Recommended)

Update your consumer and producer applications to use the new topic name source.test-topic. This is the cleanest approach - the source. prefix is now permanent, and topic name is just a configuration value.

Update your application ConfigMap or environment variable:

# Before
env:
- name: KAFKA_TOPIC
  value: "test-topic"

# After  
env:
- name: KAFKA_TOPIC
  value: "source.test-topic"

Commit this change to Git and let ArgoCD deploy the updated application to the target cluster.

Option B: Keep Original Topic Names via the Replication Policy

If you cannot change application code, configure MirrorMaker2 to keep the original topic names by overriding the replication policy in mirrormaker2.yaml:

mirrors:
  - sourceCluster: "source"
    targetCluster: "target"
    sourceConnector:
      config:
        replication.factor: 1
        topics: ".*"
        # Keep topic names identical on the target - no "source." prefix is added
        replication.policy.class: "org.apache.kafka.connect.mirror.IdentityReplicationPolicy"

Note: IdentityReplicationPolicy keeps topic names identical to source, meaning test-topic on source becomes test-topic on target (no prefix). This requires restarting MirrorMaker2 and re-syncing. Use this if you set it up before replication starts; changing policy mid-migration requires a full topic re-sync.

The cleanest path for a migration is Option A. You are replacing the source cluster, so updating the topic name in application config is a one-time change with no ongoing complexity.


Step 5.5 - Remove MirrorMaker2 from Git

Now that we are cutting over, MirrorMaker2's job is done. Remove it from the target overlay so ArgoCD stops managing it and it is decommissioned cleanly.

Update overlays/target/kustomization.yaml:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

bases:
  - ../../base/kafka

patches:
  - path: kafka-cluster-patch.yaml
    target:
      kind: Kafka
      name: my-cluster

# mirrormaker2.yaml REMOVED - replication is no longer needed

Delete the MirrorMaker2 manifest from the overlay directory:

git rm overlays/target/mirrormaker2.yaml
git commit -m "Remove MirrorMaker2 - cutover to target complete"
git push

ArgoCD detects the diff and removes the KafkaMirrorMaker2 resource (and its pod) from the target cluster automatically via pruning.

kubectl config use-context target-cluster

# Sync to apply the removal
argocd app sync kafka-target

# Verify MirrorMaker2 pod is gone
kubectl get pods -n kafka

Expected output:

NAME                                          READY   STATUS    RESTARTS
my-cluster-entity-operator-7d9f8c5b4-def34   3/3     Running   0
my-cluster-kafka-0                            1/1     Running   0
my-cluster-zookeeper-0                        1/1     Running   0

Step 5.6 - Redirect Producers and Consumers to Target

Update your applications to use the target cluster's bootstrap address. If applications are themselves managed by ArgoCD, update their ConfigMaps or Secrets in Git:

# ConfigMap for your producer/consumer applications
apiVersion: v1
kind: ConfigMap
metadata:
  name: kafka-config
  namespace: production
data:
  # Before (source cluster)
  # KAFKA_BOOTSTRAP_SERVERS: "my-cluster-kafka-bootstrap.kafka.svc.cluster.local:9092"

  # After (target cluster - if same namespace name, same address works)
  KAFKA_BOOTSTRAP_SERVERS: "my-cluster-kafka-bootstrap.kafka.svc.cluster.local:9092"
  KAFKA_TOPIC: "source.test-topic"    # Updated topic name if using Option A

If your applications run in the same cluster as Kafka (the common pattern), the internal service address my-cluster-kafka-bootstrap.kafka.svc.cluster.local:9092 is the same on both clusters. The only config change is the topic name.
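
Before scaling the applications back up, a quick connectivity check from their namespace can rule out DNS or network-policy surprises (a sketch; the production namespace is the example used above, and kafka-broker-api-versions.sh simply lists the brokers it can reach):

kubectl -n production run kafka-conn-check -ti --rm --restart=Never \
  --image=quay.io/strimzi/kafka:0.38.0-kafka-3.6.0 -- \
  bin/kafka-broker-api-versions.sh \
    --bootstrap-server my-cluster-kafka-bootstrap.kafka.svc.cluster.local:9092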

Scale producers back up on the target:

kubectl config use-context target-cluster
kubectl scale deployment my-producer-app --replicas=3 -n production

Step 5.7 - Final Verification on Target

Confirm the target cluster is now operating as a standalone primary with data intact and new messages flowing.

kubectl config use-context target-cluster

# Confirm all existing messages are present
kubectl -n kafka run kafka-final-verify -ti \
  --image=quay.io/strimzi/kafka:0.38.0-kafka-3.6.0 \
  --rm=true \
  --restart=Never \
  -- bin/kafka-console-consumer.sh \
     --bootstrap-server my-cluster-kafka-bootstrap:9092 \
     --topic source.test-topic \
     --from-beginning \
     --max-messages 100

Expected output - all messages accounted for:

Hello from the source cluster - message 1
Hello from the source cluster - message 2
Hello from the source cluster - message 3
Migration test message - timestamp 2026-02-17
Live replication test message 1
Live replication test message 2
Processed a total of 6 messages

# Confirm ArgoCD on target shows healthy
argocd app get kafka-target

Name:               argocd/kafka-target
SyncStatus:         Synced
HealthStatus:       Healthy

Target cluster is the standalone primary. All data migrated. ArgoCD managing the target Kafka deployment cleanly.


Step 5.8 - Decommission the Source

Once you have verified the target is stable (monitor for 24-48 hours in production), remove the source cluster.

Option A: Remove ArgoCD Application, let it prune

kubectl config use-context source-cluster

# Delete the ArgoCD Application
# The finalizer will clean up all managed resources
argocd app delete kafka-source --cascade

# This deletes:
# - Kafka cluster and pods
# - ZooKeeper cluster and pods
# - Strimzi operator
# - Namespace kafka (deleting the namespace also removes its PVCs; the
#   underlying PVs follow the StorageClass reclaim policy)
# NOTE: if you delete only the Kafka CR instead, the PVCs are retained
# (deleteClaim: false) and must be removed manually once you are confident
# the migration is complete:
kubectl delete pvc --all -n kafka

Option B: Keep source in ArgoCD, sync to empty overlay

If you want ArgoCD to manage the decommission declaratively:

# Create an empty overlay for source
mkdir -p overlays/source-decommissioned
cat > overlays/source-decommissioned/kustomization.yaml <<EOF
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
# Empty - all resources will be pruned by ArgoCD
EOF

# Update ArgoCD Application to point to empty overlay
# (or simply delete the Application as in Option A)

Part 6: Summary of What Just Happened

The Migration Flow Recap

1. SOURCE CLUSTER
   ├── ArgoCD deploys Kafka via KafkaTopic + Kafka CRs
   ├── Test data produced to test-topic
   └── Kafka verified functional

2. TARGET CLUSTER  
   ├── ArgoCD deploys identical Kafka configuration
   ├── MirrorMaker2 added to target overlay in Git
   ├── ArgoCD syncs → MirrorMaker2 pod starts
   ├── MirrorMaker2 connects to source, replicates topics
   └── source.test-topic appears on target with all messages

3. CUTOVER
   ├── Producers on source stopped
   ├── Replication drained (offset verification)
   ├── MirrorMaker2 removed from Git → ArgoCD prunes it
   ├── Applications updated to point to target
   └── Source cluster decommissioned

4. POST-MIGRATION
   └── Target is standalone primary, managed by ArgoCD
       No ongoing dependency on source cluster

Why This Is Better Than PV Migration

Approach              Risk                       Downtime              Data Integrity        Rollback
PV Snapshot (Velero)  High - inconsistent state  Hours                 Uncertain             Hard
MirrorMaker2          Low - application-native   Minutes (drain only)  Verified pre-cutover  Easy (keep source)

Troubleshooting

MirrorMaker2 Pod Not Starting

kubectl describe pod mm2-mirrormaker2-xxx -n kafka
kubectl logs mm2-mirrormaker2-xxx -n kafka

Common causes:

  • Cannot reach source bootstrap address - verify NodePort or LoadBalancer IP
  • DNS resolution failure - use IP address instead of hostname for cross-cluster
  • TLS mismatch - ensure ssl.endpoint.identification.algorithm: "" if not using TLS

Topics Not Appearing on Target

# Check connector status
kubectl describe kafkamirrormaker2 mm2 -n kafka | grep -A 10 "Connectors:"

# Look for connector errors
kubectl logs mm2-mirrormaker2-xxx -n kafka | grep -i error

Common causes:

  • Connector state is FAILED not RUNNING - check logs for root cause
  • Topic regex topics: ".*" may need to be more specific
  • Source Kafka not reachable from target cluster - see the connectivity probe below
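
To confirm reachability independently of MirrorMaker2, run a probe from inside the target cluster against the source bootstrap address (a sketch using the same Kafka image; keep the placeholder until you substitute your real address):

kubectl -n kafka run kafka-net-check -ti --rm --restart=Never \
  --image=quay.io/strimzi/kafka:0.38.0-kafka-3.6.0 -- \
  bin/kafka-broker-api-versions.sh --bootstrap-server SOURCE_BOOTSTRAP_ADDRESS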

Offsets Don't Match After Drain

# Force MirrorMaker2 to flush outstanding records
kubectl rollout restart deployment mm2-mirrormaker2 -n kafka

# Wait 60 seconds then re-check offsets
# If still lagging, check network latency between clusters

ArgoCD Shows OutOfSync After MirrorMaker2 Removal

# Force sync with pruning enabled
argocd app sync kafka-target --prune

# Verify MirrorMaker2 CRs are gone
kubectl get kafkamirrormaker2 -n kafka
# No resources found

Appendix: Production Considerations

This guide used a minimal single-broker configuration to keep the example clear. Before running this pattern in production, consider the following:

Increase Broker and ZooKeeper Replicas

spec:
  kafka:
    replicas: 3               # Minimum for production
    config:
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      default.replication.factor: 3
      min.insync.replicas: 2
  zookeeper:
    replicas: 3               # Always odd number

Enable TLS Between Clusters

spec:
  clusters:
    - alias: "source"
      bootstrapServers: "source-kafka-bootstrap:9093"  # TLS port
      tls:
        trustedCertificates:
          - secretName: source-cluster-ca-cert
            certificate: ca.crt

Limit Which Topics MirrorMaker2 Replicates

sourceConnector:
  config:
    topics: "orders.*|payments.*|inventory.*"    # Specific topic regex
    topics.exclude: ".*internal.*|.*__.*"         # Exclude internal topics

Monitor Replication Lag

# MirrorMaker2 exposes Prometheus metrics once metricsConfig (JMX Prometheus
# exporter) is set on the KafkaMirrorMaker2 resource
# Add a ServiceMonitor if using the Prometheus Operator
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: mm2-metrics
  namespace: kafka
spec:
  selector:
    matchLabels:
      strimzi.io/cluster: mm2
  endpoints:
  - port: tcp-prometheus
    interval: 30s

Key metrics to watch:

  • kafka_consumer_fetch_manager_records_lag - replication lag in records
  • kafka_connect_mirror_source_connector_replication_latency - latency in ms
  • Both should trend toward 0 before cutover

Consumer Group Offset Translation

MirrorMaker2 translates consumer group offsets automatically when sync.group.offsets.enabled: true. This means consumers that were reading from test-topic on the source at offset 42 will resume from the equivalent offset on source.test-topic on the target - they won't re-process old messages or miss new ones.

Verify offset translation after cutover:

kubectl -n kafka run kafka-consumer-groups -ti \
  --image=quay.io/strimzi/kafka:0.38.0-kafka-3.6.0 \
  --rm=true \
  --restart=Never \
  -- bin/kafka-consumer-groups.sh \
     --bootstrap-server my-cluster-kafka-bootstrap:9092 \
     --describe \
     --all-groups

Appendix: Relationship to Konveyor/MTA

This migration pattern is a useful test case for Konveyor AI's migration analysis capabilities. An automated tool should be able to:

  1. Detect that the application uses Kafka (scan Deployments for KAFKA_BOOTSTRAP_SERVERS env vars, or scan KafkaTopic CRDs; a rough detection sketch follows this list)
  2. Identify this as a stateful workload requiring application-level migration rather than PV-level migration
  3. Generate the MirrorMaker2 KafkaMirrorMaker2 CR as a migration artifact
  4. Suggest the source cluster alias and bootstrap address placeholder
  5. Warn about the topic rename (source. prefix) and flag any hardcoded topic names in application config that will need updating
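
A rough detection sketch along those lines (assuming the KAFKA_BOOTSTRAP_SERVERS env var name and the Strimzi KafkaTopic CRD used in this guide):

# Find Deployments that reference a Kafka bootstrap env var
kubectl get deployments -A \
  -o jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{": "}{.spec.template.spec.containers[*].env[*].name}{"\n"}{end}' \
  | grep KAFKA_BOOTSTRAP_SERVERS

# Find Strimzi-managed topics
kubectl get kafkatopics.kafka.strimzi.io -A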

This is the same pattern that applies to:

  • PostgreSQL (Crunchy PGO): Standby cluster promotion
  • Redis (Redis Operator): Sentinel failover
  • MongoDB (Percona Operator): Replica set member promotion

In each case, the migration tool's job is to recognize the workload type, generate the replication configuration, and surface the cutover steps rather than defaulting to a storage-level backup/restore.
