- Deploy an external Redis instance separate from the built-in ext-cache
- Configure the Enterprise AgentGateway rate limiter to use the external Redis via a deployment overlay
- Apply global rate limiting to LLM routes using `RateLimitConfig` and `EnterpriseAgentgatewayPolicy`
- Verify rate limit counters are stored in the external Redis
Enterprise AgentGateway ships with a built-in Redis instance (ext-cache-enterprise-agentgateway) that backs the rate limiter. In production environments, you may want to use a managed Redis service (e.g., Amazon ElastiCache, Google Memorystore, Azure Cache for Redis) for better durability, replication, and operational control.
The rate limiter reads its Redis connection from the REDIS_URL environment variable, which the controller hardcodes to the built-in ext-cache service. To point it at an external Redis, use a deployment overlay on the EnterpriseAgentgatewayParameters resource. The overlay uses Kubernetes Strategic Merge Patch, which merges the env list by name — allowing you to replace the default REDIS_URL value.
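To illustrate the merge semantics: a strategic merge patch matches list items by their merge key (`name` for both containers and env vars), so only the entry you name is replaced and the rest of the list survives. A minimal sketch with illustrative values (not the controller's actual full spec):

```yaml
# Original container spec (managed by the controller)
containers:
- name: rate-limiter
  env:
  - name: REDIS_URL
    value: "ext-cache-enterprise-agentgateway:6379"
  - name: REDIS_DB
    value: "0"

# Overlay patch — matched by container name, then by env var name
containers:
- name: rate-limiter
  env:
  - name: REDIS_URL
    value: "my-redis.redis.svc.cluster.local:6379"

# Merged result: REDIS_URL is replaced, REDIS_DB is preserved
```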
┌──────────┐ ┌──────────────────────────────────────────────┐
│ │ LLM Request │ AgentGateway │
│ Client │ ────────────────► │ │
│ │ │ 1. Envoy checks rate limit │
│ │ ◄──────────────── │ 2. Rate Limiter queries Redis counters │
└──────────┘ 200 / 429 │ 3. Allow or reject (429 Too Many Requests) │
└──────────┬───────────────────────────────────┘
│
│ gRPC
▼
┌─────────────────────┐
│ Rate Limiter │
│ (REDIS_URL override)│
└──────────┬──────────┘
│
│ tcp:6379
▼
┌─────────────────────┐
│ External Redis │
│ (my-redis namespace)│
└─────────────────────┘
| Requirement | Details |
|---|---|
| Kubernetes cluster | With Enterprise AgentGateway 2.1+ installed |
| `kubectl` | Configured to access the cluster |
| Existing Gateway | A working agentgateway-proxy Gateway with at least one HTTPRoute |
Create a dedicated namespace and deploy a Redis instance. In production, this would be a managed Redis service — here we use a simple Redis deployment for demonstration.
kubectl apply -f - <<EOF
apiVersion: v1
kind: Namespace
metadata:
name: redis
labels:
workshop: external-redis-ratelimit
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-redis
namespace: redis
labels:
workshop: external-redis-ratelimit
spec:
replicas: 1
selector:
matchLabels:
app: my-redis
template:
metadata:
labels:
app: my-redis
spec:
containers:
- name: redis
image: redis:7-alpine
ports:
- containerPort: 6379
resources:
requests:
cpu: 100m
memory: 128Mi
---
apiVersion: v1
kind: Service
metadata:
name: my-redis
namespace: redis
labels:
workshop: external-redis-ratelimit
spec:
selector:
app: my-redis
ports:
- port: 6379
targetPort: 6379
EOF

Wait for Redis to be ready:
kubectl wait --for=condition=ready pod -l app=my-redis -n redis --timeout=60s

Update the EnterpriseAgentgatewayParameters to override REDIS_URL on the rate limiter container using a deployment overlay. The overlay uses Strategic Merge Patch — it matches the rate-limiter container by name and replaces the REDIS_URL env var by name.
Important: Use `kubectl apply --server-side` because the `EnterpriseAgentgatewayParameters` resource uses overlays that require server-side apply. See the customization docs.
kubectl apply --server-side -f - <<EOF
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayParameters
metadata:
name: agentgateway-config
namespace: agentgateway-system
spec:
sharedExtensions:
extCache:
deployment:
spec:
replicas: 1
enabled: true
extauth:
deployment:
spec:
replicas: 1
enabled: true
ratelimiter:
enabled: true
deployment:
spec:
replicas: 1
template:
spec:
containers:
- name: rate-limiter
env:
- name: REDIS_URL
value: "my-redis.redis.svc.cluster.local:6379"
EOF

Note: This example preserves existing `sharedExtensions` settings. Adjust to match your current `EnterpriseAgentgatewayParameters` spec — the deployment overlay is additive and only the fields you specify are merged.
Wait for the rate limiter to roll:
kubectl rollout status deploy/rate-limiter-enterprise-agentgateway \
  -n agentgateway-system --timeout=60s

Verify REDIS_URL is pointing to the external Redis:
kubectl get pod -n agentgateway-system -l app=rate-limiter \
-o jsonpath='{.items[0].spec.containers[0].env}' | \
  python3 -c "import json,sys; [print(f\"{e['name']}={e.get('value','')}\") for e in json.load(sys.stdin) if 'REDIS' in e['name']]"

Expected output:
REDIS_DB=0
REDIS_URL=my-redis.redis.svc.cluster.local:6379
REDIS_SOCKET_TYPE=tcp
Confirm the rate limiter connected successfully:
kubectl logs -n agentgateway-system -l app=rate-limiter --tail=5 | grep redis

Expected:
{"level":"info","msg":"will connect to redis on tcp my-redis.redis.svc.cluster.local:6379 with pool size 10"}
Create a RateLimitConfig that limits requests to 3 per minute (intentionally low for easy testing), and an EnterpriseAgentgatewayPolicy that attaches it to the Gateway.
kubectl apply -f - <<EOF
apiVersion: ratelimit.solo.io/v1alpha1
kind: RateLimitConfig
metadata:
name: global-rate-limit
namespace: agentgateway-system
labels:
workshop: external-redis-ratelimit
spec:
raw:
descriptors:
- key: generic_key
value: counter
rateLimit:
requestsPerUnit: 3
unit: MINUTE
rateLimits:
- actions:
- genericKey:
descriptorValue: counter
type: REQUEST
---
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayPolicy
metadata:
name: global-rate-limit
namespace: agentgateway-system
labels:
workshop: external-redis-ratelimit
spec:
targetRefs:
- group: gateway.networking.k8s.io
kind: Gateway
name: agentgateway-proxy
traffic:
entRateLimit:
global:
rateLimitConfigRefs:
- name: global-rate-limit
EOF

Verify the policy is accepted and attached:
kubectl get enterpriseagentgatewaypolicy global-rate-limit \
-n agentgateway-system \
  -o jsonpath='{.status.ancestors[0].conditions[*].reason}'

Expected: Valid Attached
Get the gateway address:
export GATEWAY_IP=$(kubectl get svc -n agentgateway-system \
--selector=gateway.networking.k8s.io/gateway-name=agentgateway-proxy \
-o jsonpath='{.items[*].status.loadBalancer.ingress[0].ip}{.items[*].status.loadBalancer.ingress[0].hostname}')
echo "Gateway IP: $GATEWAY_IP"

Send 5 requests (limit is 3 per minute). Replace /openai with a valid route on your Gateway:
for i in $(seq 1 5); do
CODE=$(curl -s -o /dev/null -w "%{http_code}" \
"http://$GATEWAY_IP:8080/openai/v1/models" \
-H "host: www.example.com" --max-time 5)
echo "Request $i: HTTP $CODE"
done

Expected output — the first 3 requests pass through, requests 4 and 5 are rate limited:
Request 1: HTTP 200
Request 2: HTTP 200
Request 3: HTTP 200
Request 4: HTTP 429
Request 5: HTTP 429
Note: The first 3 requests may return a non-200 status (e.g., 401, 503) depending on your backend configuration. The key verification is that requests 4 and 5 return 429 Too Many Requests.
Inspect the rate limit response headers:
curl -sv "http://$GATEWAY_IP:8080/openai/v1/models" \
  -H "host: www.example.com" --max-time 5 2>&1 | grep "< "

Expected:
< HTTP/1.1 429 Too Many Requests
< retry-after: 60
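Well-behaved clients should honor the `retry-after` header instead of hammering a throttled gateway. A minimal standard-library sketch of that client-side behavior (the function name and `opener` parameter are illustrative, not part of any gateway SDK):

```python
import time
import urllib.request
import urllib.error

def get_with_retry(url: str, max_attempts: int = 3, opener=urllib.request.urlopen):
    """GET a URL, sleeping for `retry-after` seconds whenever HTTP 429 is returned."""
    for attempt in range(1, max_attempts + 1):
        try:
            with opener(url) as resp:
                return resp.status
        except urllib.error.HTTPError as err:
            if err.code != 429 or attempt == max_attempts:
                raise  # not rate limited, or out of retries
            delay = int(err.headers.get("retry-after", "1"))
            time.sleep(delay)
```

With the 3-per-minute limit configured above, a throttled request would sleep for up to 60 seconds before the retry.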
Confirm that rate limit counters are stored in the external Redis (not the built-in ext-cache):
kubectl exec -n redis deploy/my-redis -- redis-cli keys '*'

Expected output (a rate limit counter key):
tree|solo.io|generic_key^agentgateway-system.global-rate-limit|generic_key^counter|<timestamp>
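The key encodes the descriptor chain that matched the request. Assuming the format shown above (pipe-separated segments with `key^value` descriptor pairs and a trailing window timestamp), a small sketch that breaks such a key into its parts:

```python
def parse_counter_key(key: str) -> dict:
    """Split a counter key of the assumed form
    <prefix>|<domain>|<descriptor pairs...>|<timestamp>,
    where each descriptor pair is "key^value"."""
    parts = key.split("|")
    return {
        "prefix": parts[0],
        "domain": parts[1],
        "descriptors": [tuple(p.split("^", 1)) for p in parts[2:-1]],
        "timestamp": parts[-1],
    }

# Illustrative sample key; the timestamp value here is made up.
key = ("tree|solo.io|generic_key^agentgateway-system.global-rate-limit"
       "|generic_key^counter|1700000000")
print(parse_counter_key(key)["descriptors"])
# [('generic_key', 'agentgateway-system.global-rate-limit'), ('generic_key', 'counter')]
```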
The rate limiter supports additional Redis configuration via env vars. These can all be overridden using the same deployment overlay technique:
| Environment Variable | Description | Default |
|---|---|---|
| `REDIS_URL` | Redis host:port | `ext-cache-enterprise-agentgateway:6379` |
| `REDIS_SOCKET_TYPE` | Connection type: `tcp` or `tls` | `tcp` |
| `REDIS_DB` | Redis database number | `0` |
| `REDIS_AUTH` | Redis password | (none) |
| `REDIS_TLS` | Enable TLS | (not set) |
Example with authentication and TLS:
sharedExtensions:
ratelimiter:
enabled: true
deployment:
spec:
template:
spec:
containers:
- name: rate-limiter
env:
- name: REDIS_URL
value: "my-elasticache.abc123.use1.cache.amazonaws.com:6380"
- name: REDIS_SOCKET_TYPE
value: "tls"
- name: REDIS_AUTH
valueFrom:
secretKeyRef:
name: redis-credentials
          key: password

# Remove rate limit resources
kubectl delete ratelimitconfig,enterpriseagentgatewaypolicy \
-n agentgateway-system -l workshop=external-redis-ratelimit
# Remove external Redis
kubectl delete ns redis
# Revert the EnterpriseAgentgatewayParameters to remove the REDIS_URL overlay
# (update to match your original spec)
kubectl apply --server-side -f - <<EOF
apiVersion: enterpriseagentgateway.solo.io/v1alpha1
kind: EnterpriseAgentgatewayParameters
metadata:
name: agentgateway-config
namespace: agentgateway-system
spec:
sharedExtensions:
extCache:
deployment:
spec:
replicas: 1
enabled: true
extauth:
deployment:
spec:
replicas: 1
enabled: true
ratelimiter:
deployment:
spec:
replicas: 1
enabled: true
EOF