- Deploy an external Redis instance separate from the built-in ext-cache
- Configure the Enterprise AgentGateway rate limiter to use the external Redis via a deployment overlay
- Apply global rate limiting to LLM routes using `RateLimitConfig` and `EnterpriseAgentgatewayPolicy`
- Verify rate limit counters are stored in the external Redis
Enterprise AgentGateway ships with a built-in Redis instance (ext-cache-enterprise-agentgateway) that backs the rate limiter. In production environments, you may want to use a managed Redis service (e.g., Amazon ElastiCache, Google Memorystore, Azure Cache for Redis) for better durability, replication, and operational control.
The rate limiter reads its Redis connection from the REDIS_URL environment variable, which the controller hardcodes to the built-in ext-cache service. To point it at an external Redis, use a deployment overlay on the EnterpriseAgentgatewayParameters resource. The overlay uses Kubernetes Strategic Merge Patch, which merges the env list by name — allowing you to replace the default REDIS_URL value.
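To illustrate the merge semantics: a strategic merge patch matches list items by their merge key (`name` for both containers and env vars), so only the entry you name is replaced and the rest of the list survives. A minimal sketch with illustrative values (not the controller's actual full spec):

```yaml
# Original container spec (managed by the controller)
containers:
- name: rate-limiter
  env:
  - name: REDIS_URL
    value: "ext-cache-enterprise-agentgateway:6379"
  - name: REDIS_DB
    value: "0"

# Overlay patch — matched by container name, then by env var name
containers:
- name: rate-limiter
  env:
  - name: REDIS_URL
    value: "my-redis.redis.svc.cluster.local:6379"

# Merged result: REDIS_URL is replaced, REDIS_DB is preserved
```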
┌──────────┐ ┌──────────────────────────────────────────────┐
│ │ LLM Request │ AgentGateway │
│ Client │ ────────────────► │ │
│ │ │ 1. Envoy checks rate limit │
│ │ ◄──────────────── │ 2. Rate Limiter queries Redis counters │
└──────────┘ 200 / 429 │ 3. Allow or reject (429 Too Many Requests) │
└──────────┬───────────────────────────────────┘
│
│ gRPC
▼
┌─────────────────────┐
│ Rate Limiter │
│ (REDIS_URL override)│
└──────────┬──────────┘
│
│ tcp:6379
▼
┌─────────────────────┐
│ External Redis │
│ (my-redis namespace)│
└─────────────────────┘
| Requirement | Details |
|---|---|
| Kubernetes cluster | With Enterprise AgentGateway 2.1+ installed |
| `kubectl` | Configured to access the cluster |
| Existing Gateway | A working agentgateway-proxy Gateway with at least one HTTPRoute |
Create a dedicated namespace and deploy a Redis instance. In production, this would be a managed Redis service — here we use a simple Redis deployment for demonstration.
kubectl apply -f - <<EOF
apiVersion: v1
kind: Namespace
metadata:
name: redis
labels:
workshop: external-redis-ratelimit
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-redis
namespace: redis
labels:
workshop: external-redis-ratelimit
spec:
replicas: 1
selector:
matchLabels:
app: my-redis
template:
metadata:
labels:
app: my-redis
spec:
containers:
- name: redis
image: redis:7-alpine
ports:
- containerPort: 6379
resources:
requests:
cpu: 100m
memory: 128Mi
---
apiVersion: v1
kind: Service
metadata:
name: my-redis
namespace: redis
labels:
workshop: external-redis-ratelimit
spec:
selector:
app: my-redis
ports:
- port: 6379
targetPort: 6379
EOF

Wait for Redis to be ready:
kubectl wait --for=condition=ready pod -l app=my-redis -n redis --timeout=60s

Update the EnterpriseAgentgatewayParameters to override REDIS_URL on the rate limiter container using a deployment overlay. The overlay uses Strategic Merge Patch — it matches the rate-limiter container by name and replaces the REDIS_URL env var by name.
Important: Use `kubectl apply --server-side` because the `EnterpriseAgentgatewayParameters` resource uses overlays that require server-side apply. See the customization docs.
kubectl apply --server-side -f - <<EOF
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayParameters
metadata:
name: agentgateway-config
namespace: agentgateway-system
spec:
sharedExtensions:
extCache:
deployment:
spec:
replicas: 1
enabled: true
extauth:
deployment:
spec:
replicas: 1
enabled: true
ratelimiter:
enabled: true
deployment:
spec:
replicas: 1
template:
spec:
containers:
- name: rate-limiter
env:
- name: REDIS_URL
value: "my-redis.redis.svc.cluster.local:6379"
EOF

Note: This example preserves existing `sharedExtensions` settings. Adjust to match your current `EnterpriseAgentgatewayParameters` spec — the deployment overlay is additive and only the fields you specify are merged.
Wait for the rate limiter to roll:
kubectl rollout status deploy/rate-limiter-enterprise-agentgateway \
  -n agentgateway-system --timeout=60s

Verify REDIS_URL is pointing to the external Redis:
kubectl get pod -n agentgateway-system -l app=rate-limiter \
-o jsonpath='{.items[0].spec.containers[0].env}' | \
  python3 -c "import json,sys; [print(f\"{e['name']}={e.get('value','')}\") for e in json.load(sys.stdin) if 'REDIS' in e['name']]"

Expected output:
REDIS_DB=0
REDIS_URL=my-redis.redis.svc.cluster.local:6379
REDIS_SOCKET_TYPE=tcp
Confirm the rate limiter connected successfully:
kubectl logs -n agentgateway-system -l app=rate-limiter --tail=5 | grep redis

Expected:
{"level":"info","msg":"will connect to redis on tcp my-redis.redis.svc.cluster.local:6379 with pool size 10"}
Create a RateLimitConfig that limits requests to 3 per minute (intentionally low for easy testing), and an EnterpriseAgentgatewayPolicy that attaches it to the Gateway.
kubectl apply -f - <<EOF
apiVersion: ratelimit.solo.io/v1alpha1
kind: RateLimitConfig
metadata:
name: global-rate-limit
namespace: agentgateway-system
labels:
workshop: external-redis-ratelimit
spec:
raw:
descriptors:
- key: generic_key
value: counter
rateLimit:
requestsPerUnit: 3
unit: MINUTE
rateLimits:
- actions:
- genericKey:
descriptorValue: counter
type: REQUEST
---
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
name: global-rate-limit
namespace: agentgateway-system
labels:
workshop: external-redis-ratelimit
spec:
targetRefs:
- group: gateway.networking.k8s.io
kind: Gateway
name: agentgateway-proxy
traffic:
entRateLimit:
global:
rateLimitConfigRefs:
- name: global-rate-limit
EOF

Verify the policy is accepted and attached:
kubectl get enterpriseagentgatewaypolicy global-rate-limit \
-n agentgateway-system \
  -o jsonpath='{.status.ancestors[0].conditions[*].reason}'

Expected: Valid Attached
Get the gateway address:
export GATEWAY_IP=$(kubectl get svc -n agentgateway-system \
--selector=gateway.networking.k8s.io/gateway-name=agentgateway-proxy \
-o jsonpath='{.items[*].status.loadBalancer.ingress[0].ip}{.items[*].status.loadBalancer.ingress[0].hostname}')
echo "Gateway IP: $GATEWAY_IP"

Send 5 requests (limit is 3 per minute). Replace /openai with a valid route on your Gateway:
for i in $(seq 1 5); do
CODE=$(curl -s -o /dev/null -w "%{http_code}" \
"http://$GATEWAY_IP:8080/openai/v1/models" \
-H "host: www.example.com" --max-time 5)
echo "Request $i: HTTP $CODE"
done

Expected output — the first 3 requests pass through, requests 4 and 5 are rate limited:
Request 1: HTTP 200
Request 2: HTTP 200
Request 3: HTTP 200
Request 4: HTTP 429
Request 5: HTTP 429
Note: The first 3 requests may return a non-200 status (e.g., 401, 503) depending on your backend configuration. The key verification is that requests 4 and 5 return 429 Too Many Requests.
Inspect the rate limit response headers:
curl -sv "http://$GATEWAY_IP:8080/openai/v1/models" \
  -H "host: www.example.com" --max-time 5 2>&1 | grep "< "

Expected:
< HTTP/1.1 429 Too Many Requests
< retry-after: 60
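Well-behaved clients should honor the `retry-after` header instead of hammering a throttled gateway. A minimal standard-library sketch of that client-side behavior (the function name and `opener` parameter are illustrative, not part of any gateway SDK):

```python
import time
import urllib.request
import urllib.error

def get_with_retry(url: str, max_attempts: int = 3, opener=urllib.request.urlopen):
    """GET a URL, sleeping for `retry-after` seconds whenever HTTP 429 is returned."""
    for attempt in range(1, max_attempts + 1):
        try:
            with opener(url) as resp:
                return resp.status
        except urllib.error.HTTPError as err:
            if err.code != 429 or attempt == max_attempts:
                raise  # not rate limited, or out of retries
            delay = int(err.headers.get("retry-after", "1"))
            time.sleep(delay)
```

With the 3-per-minute limit configured above, a throttled request would sleep for up to 60 seconds before the retry.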
Confirm that rate limit counters are stored in the external Redis (not the built-in ext-cache):
kubectl exec -n redis deploy/my-redis -- redis-cli keys '*'

Expected output (a rate limit counter key):
tree|solo.io|generic_key^agentgateway-system.global-rate-limit|generic_key^counter|<timestamp>
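The key encodes the descriptor chain that matched the request. Assuming the format shown above (pipe-separated segments with `key^value` descriptor pairs and a trailing window timestamp), a small sketch that breaks such a key into its parts:

```python
def parse_counter_key(key: str) -> dict:
    """Split a counter key of the assumed form
    <prefix>|<domain>|<descriptor pairs...>|<timestamp>,
    where each descriptor pair is "key^value"."""
    parts = key.split("|")
    return {
        "prefix": parts[0],
        "domain": parts[1],
        "descriptors": [tuple(p.split("^", 1)) for p in parts[2:-1]],
        "timestamp": parts[-1],
    }

# Illustrative sample key; the timestamp value here is made up.
key = ("tree|solo.io|generic_key^agentgateway-system.global-rate-limit"
       "|generic_key^counter|1700000000")
print(parse_counter_key(key)["descriptors"])
# [('generic_key', 'agentgateway-system.global-rate-limit'), ('generic_key', 'counter')]
```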
The rate limiter supports additional Redis configuration via env vars. These can all be overridden using the same deployment overlay technique:
| Environment Variable | Description | Default |
|---|---|---|
| `REDIS_URL` | Redis host:port | `ext-cache-enterprise-agentgateway:6379` |
| `REDIS_SOCKET_TYPE` | Connection type: `tcp` or `tls` | `tcp` |
| `REDIS_DB` | Redis database number | `0` |
| `REDIS_AUTH` | Redis password | (none) |
| `REDIS_TLS` | Enable TLS | (not set) |
Example with authentication and TLS:
sharedExtensions:
ratelimiter:
enabled: true
deployment:
spec:
template:
spec:
containers:
- name: rate-limiter
env:
- name: REDIS_URL
value: "my-elasticache.abc123.use1.cache.amazonaws.com:6380"
- name: REDIS_SOCKET_TYPE
value: "tls"
- name: REDIS_AUTH
valueFrom:
secretKeyRef:
name: redis-credentials
          key: password

# Remove rate limit resources
kubectl delete ratelimitconfig,enterpriseagentgatewaypolicy \
-n agentgateway-system -l workshop=external-redis-ratelimit
# Remove external Redis
kubectl delete ns redis
# Revert the EnterpriseAgentgatewayParameters to remove the REDIS_URL overlay
# (update to match your original spec)
kubectl apply --server-side -f - <<EOF
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayParameters
metadata:
name: agentgateway-config
namespace: agentgateway-system
spec:
sharedExtensions:
extCache:
deployment:
spec:
replicas: 1
enabled: true
extauth:
deployment:
spec:
replicas: 1
enabled: true
ratelimiter:
deployment:
spec:
replicas: 1
enabled: true
EOF