@MattMencel
Last active September 17, 2025 17:29
Envoy AI Gateway - Non-Working

Steps

Follows the Getting Started guide at AI Gateway.

Prerequisite

  • Create an Azure OpenAI instance.
  • Add the gpt-4.1 deployment
  • Create a Service Principal
  • Assign the Service Principal the "Cognitive Services OpenAI User" role.
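
The role assignment in the last bullet can be scripted with the Azure CLI. This is a minimal sketch, assuming `az` is installed and logged in; the function name and its three arguments (resource group, Azure OpenAI resource name, Service Principal application ID) are hypothetical placeholders, not values from this setup.

```shell
# Sketch: grant a Service Principal the "Cognitive Services OpenAI User" role
# on a single Azure OpenAI resource. All three arguments are placeholders.
assign_openai_user_role() {
  resource_group="$1"   # resource group holding the Azure OpenAI instance
  account_name="$2"     # Azure OpenAI resource name
  sp_app_id="$3"        # Service Principal application (client) ID

  # Look up the full resource ID to use as the role assignment scope.
  scope="$(az cognitiveservices account show \
    --name "$account_name" \
    --resource-group "$resource_group" \
    --query id --output tsv)" || return 1

  az role assignment create \
    --assignee "$sp_app_id" \
    --role "Cognitive Services OpenAI User" \
    --scope "$scope"
}

# Usage (placeholder values):
#   assign_openai_user_role my-rg my-aoai 00000000-0000-0000-0000-000000000000
```

Scoping the assignment to the one resource, rather than the subscription, keeps the Service Principal's access as narrow as the gateway needs.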

1. Apply the module

This is how I apply the Gateway prerequisites and the AI Gateway. The same can be done with kubectl and manifests instead.

# apply crds first
terraform apply -target='module.aks.module.envoy_ai_gateway.data.helm_template.eg_crds'
terraform apply -target='module.aks.module.envoy_ai_gateway'

This Terraform module will...

  • create the namespace envoy-gateway-system
  • deploy the Envoy Gateway CRDs
  • deploy the Envoy Gateway Helm Chart using envoy-gateway-values.yaml.tpl
  • deploy the Envoy AI Gateway CRDs
  • deploy the Envoy AI Gateway Helm Chart
  • deploy the envoy_ai_gateway_client_secret K8S Secret - This is the Service Principal secret for the AI Gateway.
  • deploy the rbac/redis manifests in manifests.tf
  • The config manifest mentioned in the AI Gateway install docs isn't needed; its settings are included in the Envoy Gateway Helm Chart values.yaml
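
For readers without the Terraform module, the same steps can be approximated with `helm` and `kubectl`. This is a rough sketch, not the module itself: the chart URLs, versions, release names, and `--set` values are copied from the Terraform in this gist, while the function name is hypothetical and the values template is assumed to be usable as plain YAML (it takes no template variables).

```shell
EG_NAMESPACE="envoy-gateway-system"
CHART_VERSION="v0.0.0-latest"

# Hypothetical helper mirroring the Terraform module's install order.
install_gateway_stack() {
  # Gateway API and Envoy Gateway CRDs, rendered and applied server-side.
  helm template eg-crds oci://docker.io/envoyproxy/gateway-crds-helm \
    --version "$CHART_VERSION" \
    --set crds.gatewayAPI.enabled=true \
    --set crds.gatewayAPI.channel=standard \
    --set crds.envoyGateway.enabled=true \
    | kubectl apply --server-side -f - || return 1

  # Envoy Gateway, configured by the values file from this gist.
  helm upgrade -i envoy-gateway oci://docker.io/envoyproxy/gateway-helm \
    --version "$CHART_VERSION" \
    --namespace "$EG_NAMESPACE" --create-namespace \
    --skip-crds \
    -f envoy-gateway-values.yaml.tpl || return 1

  # Envoy AI Gateway CRDs, then the AI Gateway controller.
  helm upgrade -i aieg-crd oci://registry-1.docker.io/envoyproxy/ai-gateway-crds-helm \
    --version "$CHART_VERSION" --namespace "$EG_NAMESPACE" || return 1
  helm upgrade -i aieg oci://registry-1.docker.io/envoyproxy/ai-gateway-helm \
    --version "$CHART_VERSION" --namespace "$EG_NAMESPACE" \
    --skip-crds \
    --set controller.mutatingWebhook.certManager.enable=true || return 1

  # Redis and RBAC manifests that the module fetches from the ai-gateway repo.
  kubectl apply -f https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/manifests/envoy-gateway-config/redis.yaml || return 1
  kubectl apply -f https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/manifests/envoy-gateway-config/rbac.yaml
}
```

The sketch does not create the envoy-ai-gateway-client-secret Secret; that would need a separate `kubectl create secret generic` call with a `client-secret` key holding the Service Principal secret.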

2. Apply the AI Gateway manifests

To simplify the setup and make sure you're following the AI Gateway install docs, apply the AI Gateway manifests manually.

This manifest is hard-coded to use a public Azure OpenAI endpoint, which eliminates complications with private endpoints. It assumes the gpt-4.1 model deployment from the prerequisites already exists.

kubectl apply -f azure-openai.yaml

This will deploy all the AI Gateway resources. This is sourced from the examples and customized for our use case.

Basic resources

---
kind: GatewayClass
---
kind: Gateway
---
kind: ClientTrafficPolicy
---
kind: EnvoyProxy

AIEG resources

---
kind: AIGatewayRoute
---
kind: AIServiceBackend
---
kind: BackendSecurityPolicy
---
kind: Backend
---
kind: BackendTLSPolicy

3. Testing

Services should be running in envoy-gateway-system namespace.

kubectl get pods -n envoy-gateway-system
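
Rather than eyeballing the pod list, readiness can be checked explicitly. This is a small sketch under the assumption that the Gateway reports the standard Gateway API `Programmed` condition; the function names and the 180s timeout are arbitrary choices, not part of the original setup.

```shell
NS="envoy-gateway-system"

# Block until every Deployment in the namespace reports Available.
wait_for_deployments() {
  kubectl wait --for=condition=Available deployment --all \
    --namespace "$NS" --timeout=180s
}

# Print the status ("True" or "False") of the Gateway's Programmed condition.
gateway_programmed() {
  kubectl get gateways.gateway.networking.k8s.io envoy-ai-gateway-basic \
    --namespace "$NS" \
    -o jsonpath='{.status.conditions[?(@.type=="Programmed")].status}'
}

# Usage:
#   wait_for_deployments && gateway_programmed
```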

Port forward the envoy-envoy-gateway-system-envoy-ai-gateway-basic service.

kubectl get svc -n envoy-gateway-system --selector=gateway.envoyproxy.io/owning-gateway-namespace=envoy-gateway-system,gateway.envoyproxy.io/owning-gateway-name=envoy-ai-gateway-basic
kubectl port-forward -n envoy-gateway-system svc/envoy-envoy-gateway-system-envoy-ai-gateway-basic-HASH 8080:80
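
The HASH suffix does not need to be copied by hand. A possible sketch, reusing the selector from the command above; the function names are hypothetical, and it assumes exactly one Service matches the selector.

```shell
NS="envoy-gateway-system"
SELECTOR="gateway.envoyproxy.io/owning-gateway-namespace=envoy-gateway-system,gateway.envoyproxy.io/owning-gateway-name=envoy-ai-gateway-basic"

# Print the name of the first Service owned by the Gateway.
gateway_service_name() {
  kubectl get svc --namespace "$NS" --selector="$SELECTOR" \
    -o jsonpath='{.items[0].metadata.name}'
}

# Forward local port 8080 to port 80 of that Service.
port_forward_gateway() {
  svc="$(gateway_service_name)" || return 1
  [ -n "$svc" ] || { echo "no Service matches the selector" >&2; return 1; }
  kubectl port-forward --namespace "$NS" "svc/$svc" 8080:80
}

# Usage:
#   port_forward_gateway
```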

Test the endpoint

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-ai-eg-model: gpt-4.1" \
  -d '{
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

This probably results in an upstream request timeout after 60s.
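
When the request times out, the proxy and controller logs are the first place to look. The sketch below makes two assumptions that are not confirmed by this setup: that the Envoy proxy pods carry the same owning-gateway label as the Service, and that the AI Gateway controller runs as a Deployment named ai-gateway-controller (inferred from the extension service hostname in the Helm values).

```shell
NS="envoy-gateway-system"

# Recent logs from the Envoy proxy pods serving the Gateway (label is assumed).
proxy_logs() {
  kubectl logs --namespace "$NS" \
    --selector gateway.envoyproxy.io/owning-gateway-name=envoy-ai-gateway-basic \
    --all-containers --tail=100
}

# Recent logs from the AI Gateway controller (Deployment name is assumed).
controller_logs() {
  kubectl logs --namespace "$NS" deployment/ai-gateway-controller --tail=100
}

# Usage: run the curl test in one terminal, then in another:
#   proxy_logs
#   controller_logs
```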

Troubleshooting

Token Verification

Get the apiToken value from the envoy-envoy-gateway-system-envoy-ai-gateway-basic secret.
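
One way to read the token is sketched below. The Secret name and the `apiToken` key are taken from the sentence above and are the defaults; both can be overridden, since the actual name and key may differ between AI Gateway versions. The function name is hypothetical.

```shell
# Print a decoded value from a Secret in envoy-gateway-system.
# $1: Secret name (defaults to the name given in this README)
# $2: data key    (defaults to apiToken)
read_api_token() {
  secret_name="${1:-envoy-envoy-gateway-system-envoy-ai-gateway-basic}"
  key="${2:-apiToken}"
  kubectl get secret "$secret_name" \
    --namespace envoy-gateway-system \
    -o "jsonpath={.data.${key}}" | base64 -d
}

# Usage:
#   read_api_token
```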

Launch a temporary pod and test.

kubectl run tmp-shell --rm -i --tty --image nicolaka/netshoot
# from the tmp-shell terminal
export TOKEN={apiToken}
curl -X POST "https://DUMMY.openai.azure.com/openai/deployments/gpt-4.1/chat/completions?api-version=2025-01-01-preview" \
          -H "Content-Type: application/json" \
          -H "Authorization: Bearer $TOKEN" \
          -d '{
          "messages": [
              {
                  "role": "user",
                  "content": "I am going to Paris, what should I see?"
              }
          ],
          "max_completion_tokens": 13107,
          "temperature": 1,
          "top_p": 1,
          "frequency_penalty": 0,
          "presence_penalty": 0,
          "model": "gpt-4.1"
      }'

This probably works.
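
If the direct call fails instead, it can help to inspect the token's claims. Microsoft Entra access tokens are JWTs, so the payload can be decoded locally. This is a small sketch with a hypothetical function name; it only decodes the payload and does not verify the signature.

```shell
# Decode the payload (second dot-separated segment) of a JWT.
jwt_payload() {
  # base64url uses '-' and '_' where standard base64 uses '+' and '/'.
  seg="$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')"
  # Restore the padding that base64url encoding strips.
  case $(( ${#seg} % 4 )) in
    2) seg="${seg}==" ;;
    3) seg="${seg}=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}

# Usage, from the tmp-shell terminal:
#   jwt_payload "$TOKEN"
```

For Azure OpenAI, the `aud` claim is expected to be `https://cognitiveservices.azure.com`, and `exp` shows whether the token has already expired.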

Teardown

kubectl delete -f azure-openai.yaml
terraform destroy -target='module.aks.module.envoy_ai_gateway'

Verify the envoy-gateway-system namespace is deleted.

kubectl get namespaces
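
The same check can be made scriptable. A minimal sketch with a hypothetical function name: it succeeds only when the namespace no longer exists, and returns 2 if `kubectl` itself fails (for example, when the cluster is unreachable).

```shell
# Succeed (exit 0) only if the given namespace no longer exists.
namespace_gone() {
  found="$(kubectl get namespace "$1" --ignore-not-found -o name)" || return 2
  [ -z "$found" ]
}

# Usage:
#   namespace_gone envoy-gateway-system && echo "teardown complete"
```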
# Copyright Envoy AI Gateway Authors
# SPDX-License-Identifier: Apache-2.0
# The full text of the Apache license is available in the LICENSE file at
# the root of the repo.
---
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: envoy-ai-gateway-basic
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: envoy-ai-gateway-basic
  namespace: envoy-gateway-system
spec:
  gatewayClassName: envoy-ai-gateway-basic
  listeners:
    - name: http
      protocol: HTTP
      port: 80
  infrastructure:
    parametersRef:
      group: gateway.envoyproxy.io
      kind: EnvoyProxy
      name: envoy-ai-gateway-basic
---
# By default, Envoy Gateway sets the buffer limit to 32kiB which is not sufficient for AI workloads.
# This ClientTrafficPolicy sets the buffer limit to 150MiB as an example.
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: ClientTrafficPolicy
metadata:
  name: client-buffer-limit
  namespace: envoy-gateway-system
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: envoy-ai-gateway-basic
  connection:
    bufferLimit: 150Mi
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: envoy-ai-gateway-basic
  namespace: envoy-gateway-system
spec:
  # logging:
  #   level:
  #     default: debug
  provider:
    type: Kubernetes
    kubernetes:
      envoyDeployment:
        container:
          # Note: this is to clear the large default memory/cpu requirements for local tests.
          # In production, you should set these to values that make sense for your environment.
          resources: {}
      envoyService:
        type: ClusterIP
---
apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: AIGatewayRoute
metadata:
  name: envoy-ai-gateway-basic-azure
  namespace: envoy-gateway-system
spec:
  parentRefs:
    - name: envoy-ai-gateway-basic
      kind: Gateway
      group: gateway.networking.k8s.io
  rules:
    - matches:
        - headers:
            - type: Exact
              name: x-ai-eg-model
              value: gpt-4.1
      backendRefs:
        - name: envoy-ai-gateway-basic-azure
---
apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: AIServiceBackend
metadata:
  name: envoy-ai-gateway-basic-azure
  namespace: envoy-gateway-system
spec:
  schema:
    name: AzureOpenAI
    version: 2025-01-01-preview
  backendRef:
    name: envoy-ai-gateway-basic-azure
    kind: Backend
    group: gateway.envoyproxy.io
---
apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: BackendSecurityPolicy
metadata:
  name: envoy-ai-gateway-basic-azure-credentials
  namespace: envoy-gateway-system
spec:
  targetRefs:
    - group: aigateway.envoyproxy.io
      kind: AIServiceBackend
      name: envoy-ai-gateway-basic-azure
  type: AzureCredentials
  azureCredentials:
    clientID: CLIENTID # replace with SPN clientID
    tenantID: TENANTID # replace with Azure Tenant
    clientSecretRef:
      name: envoy-ai-gateway-client-secret
      namespace: envoy-gateway-system
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: Backend
metadata:
  name: envoy-ai-gateway-basic-azure
  namespace: envoy-gateway-system
spec:
  endpoints:
    - fqdn:
        hostname: dummy-azure-resource.openai.azure.com # Replace "dummy-azure-resource" with your Azure OpenAI resource e.g. <azure_resource_name>.openai.azure.com
        port: 443
---
apiVersion: gateway.networking.k8s.io/v1alpha3
kind: BackendTLSPolicy
metadata:
  name: envoy-ai-gateway-basic-azure-tls
  namespace: envoy-gateway-system
spec:
  targetRefs:
    - group: "gateway.envoyproxy.io"
      kind: Backend
      name: envoy-ai-gateway-basic-azure
  validation:
    wellKnownCACertificates: "System"
    hostname: dummy-azure-resource.openai.azure.com # Replace "dummy-azure-resource" with your Azure OpenAI resource e.g. <azure_resource_name>.openai.azure.com
# Applied via terraform helm_release.eg
config:
  envoyGateway:
    gateway:
      controllerName: gateway.envoyproxy.io/gatewayclass-controller
    logging:
      level:
        default: info
    provider:
      kubernetes:
        rateLimitDeployment:
          patch:
            type: StrategicMerge
            value:
              spec:
                template:
                  spec:
                    containers:
                      - imagePullPolicy: IfNotPresent
                        name: envoy-ratelimit
                        image: docker.io/envoyproxy/ratelimit:60d8e81b
      type: Kubernetes
    extensionApis:
      enableEnvoyPatchPolicy: true
      enableBackend: true
    extensionManager:
      hooks:
        xdsTranslator:
          translation:
            listener:
              includeAll: true
            route:
              includeAll: true
            cluster:
              includeAll: true
            secret:
              includeAll: true
          post:
            - Translation
            - Cluster
            - Route
      service:
        fqdn:
          hostname: ai-gateway-controller.envoy-gateway-system.svc.cluster.local
          port: 1063
    rateLimit:
      backend:
        type: Redis
        redis:
          url: redis.redis-system.svc.cluster.local:6379
locals {
  eg_namespace = "envoy-gateway-system"
}

# Create namespace for AI Gateway
resource "kubernetes_namespace_v1" "eg" {
  metadata {
    name = local.eg_namespace
  }
}

# Render the Envoy Gateway and Gateway API CRDs without installing a release
data "helm_template" "eg_crds" {
  name      = "eg-crds"
  chart     = "oci://docker.io/envoyproxy/gateway-crds-helm"
  version   = "v0.0.0-latest"
  namespace = kubernetes_namespace_v1.eg.metadata[0].name

  set {
    name  = "crds.gatewayAPI.enabled"
    value = "true"
  }
  set {
    name  = "crds.gatewayAPI.channel"
    value = "standard"
  }
  set {
    name  = "crds.envoyGateway.enabled"
    value = "true"
  }
}

# Apply each rendered CRD document as its own manifest
resource "kubernetes_manifest" "eg_crds" {
  for_each = {
    for k, v in data.helm_template.eg_crds.manifests : k => v
    if can(yamldecode(v)) && v != "" && v != "null"
  }
  manifest = yamldecode(each.value)

  field_manager {
    force_conflicts = true
  }
}

# Envoy Gateway Helm Chart
resource "helm_release" "eg" {
  depends_on = [
    kubernetes_manifest.eg_crds
  ]
  name             = "envoy-gateway"
  chart            = "oci://docker.io/envoyproxy/gateway-helm"
  version          = "v0.0.0-latest"
  namespace        = kubernetes_namespace_v1.eg.metadata[0].name
  create_namespace = false
  skip_crds        = true
  values = [
    templatefile("${path.module}/envoy-gateway-values.yaml.tpl", {
    })
  ]
}

# Envoy AI Gateway CRDs
resource "helm_release" "aieg_crds" {
  name             = "aieg-crd"
  chart            = "oci://registry-1.docker.io/envoyproxy/ai-gateway-crds-helm"
  version          = "v0.0.0-latest"
  namespace        = kubernetes_namespace_v1.eg.metadata[0].name
  create_namespace = false
}

# Envoy AI Gateway Helm Chart
resource "helm_release" "aieg" {
  depends_on = [
    helm_release.eg,
    helm_release.aieg_crds
  ]
  name             = "aieg"
  chart            = "oci://registry-1.docker.io/envoyproxy/ai-gateway-helm"
  version          = "v0.0.0-latest"
  namespace        = kubernetes_namespace_v1.eg.metadata[0].name
  create_namespace = false
  skip_crds        = true

  set {
    name  = "controller.mutatingWebhook.certManager.enable"
    value = "true"
  }
}

# Service Principal secret referenced by the BackendSecurityPolicy
resource "kubernetes_secret_v1" "envoy_ai_gateway_client_secret" {
  metadata {
    name      = "envoy-ai-gateway-client-secret"
    namespace = kubernetes_namespace_v1.eg.metadata[0].name
  }
  data = {
    client-secret = # SPN clientSecret data resource
  }
}

# Fetch the Redis manifest and apply each non-empty YAML document
data "http" "aieg_redis" {
  url = "https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/manifests/envoy-gateway-config/redis.yaml"
}

locals {
  aieg_redis_docs = split("---", data.http.aieg_redis.response_body)
  aieg_redis_valid_docs = [
    for i, doc in local.aieg_redis_docs : {
      index   = i
      content = doc
    }
    if trimspace(doc) != "" && !startswith(trimspace(doc), "#")
  ]
}

resource "kubectl_manifest" "aieg_redis" {
  for_each  = { for doc in local.aieg_redis_valid_docs : tostring(doc.index) => doc.content }
  yaml_body = each.value
}

# Fetch the RBAC manifest and apply each non-empty YAML document
data "http" "aieg_rbac" {
  url = "https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/manifests/envoy-gateway-config/rbac.yaml"
}

locals {
  aieg_rbac_docs = split("---", data.http.aieg_rbac.response_body)
  aieg_rbac_valid_docs = [
    for i, doc in local.aieg_rbac_docs : {
      index   = i
      content = doc
    }
    if trimspace(doc) != "" && !startswith(trimspace(doc), "#")
  ]
}

resource "kubectl_manifest" "aieg_rbac" {
  for_each  = { for doc in local.aieg_rbac_valid_docs : tostring(doc.index) => doc.content }
  yaml_body = each.value
}