Follow the Getting Started guide at AI Gateway.

- Create an Azure OpenAI instance.
- Add the `gpt-4.1` deployment.
- Create a Service Principal.
- Assign the Service Principal the "Cognitive Services OpenAI User" role.
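The Azure prerequisites can also be scripted. A rough sketch with the Azure CLI; the resource group, account, and Service Principal names below are placeholders, not from the original setup:

```
# hypothetical names; substitute your own resource group and account
RG=my-rg
AOAI=my-azure-openai

# create the Service Principal; save the appId and password from the JSON output
az ad sp create-for-rbac --name envoy-ai-gateway-sp

# assign the role, scoped to the Azure OpenAI account
AOAI_ID=$(az cognitiveservices account show --name "$AOAI" --resource-group "$RG" --query id -o tsv)
az role assignment create \
  --assignee "<appId from the output above>" \
  --role "Cognitive Services OpenAI User" \
  --scope "$AOAI_ID"
```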
This is how I apply the Gateway prerequisites and the AI Gateway. It can also be done with kubectl and manifests instead.
```
# apply crds first
terraform apply -target='module.aks.module.envoy_ai_gateway.data.helm_template.eg_crds'
terraform apply -target='module.aks.module.envoy_ai_gateway'
```

This Terraform module will:
- create the namespace `envoy-gateway-system`
- deploy the Envoy Gateway CRDs
- deploy the Envoy Gateway Helm Chart using `envoy-gateway-values.yaml.tpl`
- deploy the Envoy AI Gateway CRDs
- deploy the Envoy AI Gateway Helm Chart
- deploy the `envoy_ai_gateway_client_secret` K8S Secret - this is the Service Principal secret for the AI Gateway
- deploy the rbac/redis manifests in `manifests.tf`
- the `config` manifest mentioned in the AI Gateway install docs isn't needed; it's added to the Envoy Gateway Helm Chart `values.yaml`
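To sanity-check what the module installed, something like the following should work; the exact release and secret names are defined by the module, so the listings are what confirm them:

```
helm list -n envoy-gateway-system
kubectl get crds | grep envoyproxy.io
kubectl get secrets -n envoy-gateway-system
```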
To simplify the setup and make sure you're following the AI Gateway install docs, apply the AI Gateway manifests manually. This is hard-coded to use a public Azure OpenAI endpoint, to avoid complications with a private endpoint. The `gpt-4.1` model deployment has already been created.
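Before applying the manifests, it may be worth confirming the public endpoint and the `gpt-4.1` deployment from the prerequisites; a sketch with hypothetical resource names:

```
# hypothetical names; substitute your own
az cognitiveservices account show --name my-azure-openai --resource-group my-rg \
  --query properties.endpoint -o tsv
az cognitiveservices account deployment list --name my-azure-openai --resource-group my-rg -o table
```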
```
kubectl apply -f azure-openai.yaml
```

This will deploy all the AI Gateway resources. This is sourced from the examples and customized for our use case.
```
---
kind: GatewayClass
---
kind: Gateway
---
kind: ClientTrafficPolicy
---
kind: EnvoyProxy
---
kind: AIGatewayRoute
---
kind: AIServiceBackend
---
kind: BackendSecurityPolicy
---
kind: Backend
---
kind: BackendTLSPolicy
```

Services should be running in the `envoy-gateway-system` namespace.
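It can also help to confirm that the Gateway and AI Gateway resources were accepted; a sketch, assuming the manifests target the same `envoy-gateway-system` namespace and the CRDs use their default plural resource names:

```
kubectl get gatewayclass,gateway,aigatewayroutes,aiservicebackends -n envoy-gateway-system
kubectl describe gateway envoy-ai-gateway-basic -n envoy-gateway-system
```

If the Gateway shows Accepted/Programmed conditions, check the pods: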
```
kubectl get pods -n envoy-gateway-system
```

Port-forward the `envoy-envoy-gateway-system-envoy-ai-gateway-basic` service.

```
kubectl get svc -n envoy-gateway-system --selector=gateway.envoyproxy.io/owning-gateway-namespace=envoy-gateway-system,gateway.envoyproxy.io/owning-gateway-name=envoy-ai-gateway-basic

kubectl port-forward -n envoy-gateway-system svc/envoy-envoy-gateway-system-envoy-ai-gateway-basic-HASH 8080:80
```

Test the endpoint:
```
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-ai-eg-model: gpt-4.1" \
  -d '{
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

This probably results in an upstream request timeout after 60s.
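To dig into the timeout, a first step is to look at the AI Gateway controller and Envoy proxy logs; a generic sketch with placeholder pod names:

```
kubectl get pods -n envoy-gateway-system
# substitute the real pod names from the listing above
kubectl logs -n envoy-gateway-system <ai-gateway-controller-pod> --tail=100
kubectl logs -n envoy-gateway-system <envoy-proxy-pod> --tail=100
```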
Get the `apiToken` value from the `envoy-envoy-gateway-system-envoy-ai-gateway-basic` secret.
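Something like this should extract it; the key name `apiToken` and the secret name come from the step above, the namespace is assumed, and the secret may carry a hash suffix like the service does:

```
kubectl get secret -n envoy-gateway-system envoy-envoy-gateway-system-envoy-ai-gateway-basic \
  -o jsonpath='{.data.apiToken}' | base64 -d
```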
Launch a temporary pod and test.
```
kubectl run tmp-shell --rm -i --tty --image nicolaka/netshoot
```
```
# from the tmp-shell terminal
export TOKEN={apiToken}

curl -X POST "https://DUMMY.openai.azure.com/openai/deployments/gpt-4.1/chat/completions?api-version=2025-01-01-preview" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "I am going to Paris, what should I see?"
      }
    ],
    "max_completion_tokens": 13107,
    "temperature": 1,
    "top_p": 1,
    "frequency_penalty": 0,
    "presence_penalty": 0,
    "model": "gpt-4.1"
  }'
```

This probably works.
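Since the token works when used directly, the gateway-side backend configuration is a likely suspect. A hedged sketch to compare the configured hostname against the real Azure OpenAI endpoint (field names can vary between versions):

```
kubectl get backends.gateway.envoyproxy.io,backendtlspolicies -n envoy-gateway-system -o yaml \
  | grep -iE 'fqdn|hostname'
```

When done, clean up: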
```
kubectl delete -f azure-openai.yaml
terraform destroy -target='module.aks.module.envoy_ai_gateway'
```

Verify the `envoy-gateway-system` namespace is deleted.
```
kubectl get namespaces
```
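The CRDs were applied in a separate targeted step, so they may survive the destroy; a quick check (an assumption, not part of the original steps):

```
kubectl get crds | grep envoyproxy.io
```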