Complete configuration for running GPU workloads on AWS EKS with Karpenter v1.8 and G6f fractional GPU instances using multiple NodePools for dynamic instance selection.
- EKS cluster (v1.34+) already created
- kubectl configured to access the cluster
- helm installed
- AWS CLI configured
- Cluster has OIDC provider enabled
This setup uses multiple NodePools to dynamically select G6f instance types based on workload GPU memory requirements:
- Small GPU (3GB): g6f.large, g6f.xlarge → 1/8 fractional GPU
- Medium GPU (6GB): g6f.2xlarge → 1/4 fractional GPU
- Large GPU (12GB): g6f.4xlarge, gr6f.4xlarge → 1/2 fractional GPU
Workloads use node selectors to target specific GPU sizes, and Karpenter provisions the appropriate instance type.
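Before the full manifests below, the pattern can be sketched in a few lines. This fragment assumes the gpu-memory-size label and nvidia.com/gpu taint that the NodePools in this guide define:

```yaml
# Hypothetical pod spec fragment targeting the small (3GB) tier
spec:
  nodeSelector:
    gpu-memory-size: "3gb"    # matches the g6f-gpu-small NodePool label
  tolerations:
    - key: nvidia.com/gpu     # GPU NodePools are tainted
      operator: Exists
      effect: NoSchedule
  containers:
    - name: app
      image: nvidia/cuda:12.3.0-base-ubuntu22.04
      resources:
        requests:
          nvidia.com/gpu: 1   # device plugin advertises the fractional GPU as 1
```

Karpenter sees the pending pod, finds the NodePool whose labels satisfy the selector, and launches one of that pool's allowed instance types.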
Important: NodeOverlay is required because AWS reports GPU count as 0 for fractional GPU instances in the EC2 API.
export KARPENTER_NAMESPACE="kube-system"
export KARPENTER_VERSION="1.8.6"
export CLUSTER_NAME="<your-cluster-name>"
export AWS_DEFAULT_REGION="<your-region>"
export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"
export AWS_PARTITION="aws" # or aws-cn, aws-us-gov

Download and deploy the CloudFormation template:
curl -fsSL https://raw.githubusercontent.com/aws/karpenter-provider-aws/v${KARPENTER_VERSION}/website/content/en/preview/getting-started/getting-started-with-karpenter/cloudformation.yaml > karpenter-cloudformation.yaml
aws cloudformation deploy \
--stack-name "Karpenter-${CLUSTER_NAME}" \
--template-file karpenter-cloudformation.yaml \
--capabilities CAPABILITY_NAMED_IAM \
--parameter-overrides "ClusterName=${CLUSTER_NAME}"

aws iam create-service-linked-role --aws-service-name spot.amazonaws.com || true

# Logout of helm registry
helm registry logout public.ecr.aws
# Install Karpenter with NodeOverlay feature enabled
helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
--version "${KARPENTER_VERSION}" \
--namespace "${KARPENTER_NAMESPACE}" \
--create-namespace \
--set "settings.clusterName=${CLUSTER_NAME}" \
--set "settings.interruptionQueue=${CLUSTER_NAME}" \
--set "settings.featureGates.nodeOverlay=true" \
--set controller.resources.requests.cpu=1 \
--set controller.resources.requests.memory=1Gi \
--set controller.resources.limits.cpu=1 \
--set controller.resources.limits.memory=1Gi \
--wait

Note: nodeOverlay=true is required for fractional GPU support.
kubectl get pods -n kube-system -l app.kubernetes.io/name=karpenter
kubectl logs -n kube-system -l app.kubernetes.io/name=karpenter -c controller --tail=20

helm repo add nvdp https://nvidia.github.io/k8s-device-plugin
helm repo update

helm install nvdp nvdp/nvidia-device-plugin \
--namespace nvidia \
--create-namespace \
--version 0.18.2 \
--set gfd.enabled=true \
--set-json 'nfd.worker.tolerations=[{"key":"nvidia.com/gpu","operator":"Exists","effect":"NoSchedule"},{"key":"node-role.kubernetes.io/master","operator":"Equal","effect":"NoSchedule"}]'

This installs the device plugin with GPU Feature Discovery and configures the NFD worker to tolerate GPU node taints.
kubectl get pods -n nvidia
kubectl get daemonset -n nvidia

Why NodeOverlay is Required: The AWS EC2 API reports the GPU count as 0 for fractional GPU instances. NodeOverlay tells Karpenter these instances actually have GPU capacity during scheduling simulation.
File: g6f-nodeoverlay.yaml
# NodeOverlay for G6f fractional GPU instances
# REQUIRED: AWS reports GPU count as 0 for fractional GPUs
# This tells Karpenter these instances have GPU capacity
apiVersion: karpenter.sh/v1alpha1
kind: NodeOverlay
metadata:
  name: g6f-fractional-gpu
spec:
  weight: 100
  requirements:
    - key: node.kubernetes.io/instance-type
      operator: In
      values:
        - "g6f.large"
        - "g6f.xlarge"
        - "g6f.2xlarge"
        - "g6f.4xlarge"
        - "gr6f.4xlarge"
  capacity:
    # Critical: Override AWS's "0" GPU count
    nvidia.com/gpu: "1"

Apply the NodeOverlay configuration:
kubectl apply -f g6f-nodeoverlay.yaml
kubectl get nodeoverlay

Verification:
# Check NodeOverlay status
kubectl describe nodeoverlay g6f-fractional-gpu
# Should show Ready=True

File: g6f-ec2nodeclass.yaml
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: g6f-gpu
spec:
  # Replace with your Karpenter node role
  role: "KarpenterNodeRole-${CLUSTER_NAME}"
  # Use GPU-optimized AMI (NVIDIA drivers pre-installed)
  amiSelectorTerms:
    - alias: "al2023@latest"
  amiFamily: AL2023
  # Note: The AL2023 GPU AMI includes:
  # - NVIDIA drivers pre-installed and configured
  # - NVIDIA container toolkit configured
  # - Containerd configured for GPU support
  # No custom userData needed for basic GPU functionality
  # Subnet and security group discovery
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}"
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}"
  # Block device configuration
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        volumeSize: 100Gi
        volumeType: gp3
        encrypted: true
        deleteOnTermination: true
  # Metadata options
  metadataOptions:
    httpEndpoint: enabled
    httpProtocolIPv6: disabled
    httpPutResponseHopLimit: 2
    httpTokens: required

Apply with environment variable substitution:
envsubst < g6f-ec2nodeclass.yaml | kubectl apply -f -

This is the key to dynamic G6f instance selection. Each NodePool targets specific instance types and labels nodes accordingly.
File: g6f-nodepools.yaml
---
# NodePool for Small GPU workloads (1/8 GPU, 3GB)
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: g6f-gpu-small
spec:
  template:
    metadata:
      labels:
        gpu-memory-size: "3gb"
        gpu-fraction: "0.125"
    spec:
      requirements:
        - key: node.kubernetes.io/instance-type
          operator: In
          values:
            - "g6f.large"
            - "g6f.xlarge"
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand", "spot"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: g6f-gpu
      taints:
        - key: nvidia.com/gpu
          effect: NoSchedule
      expireAfter: 720h
  limits:
    cpu: 50
  disruption:
    consolidationPolicy: WhenEmpty
    consolidateAfter: 5m
---
# NodePool for Medium GPU workloads (1/4 GPU, 6GB)
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: g6f-gpu-medium
spec:
  template:
    metadata:
      labels:
        gpu-memory-size: "6gb"
        gpu-fraction: "0.25"
    spec:
      requirements:
        - key: node.kubernetes.io/instance-type
          operator: In
          values:
            - "g6f.2xlarge"
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand", "spot"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: g6f-gpu
      taints:
        - key: nvidia.com/gpu
          effect: NoSchedule
      expireAfter: 720h
  limits:
    cpu: 50
  disruption:
    consolidationPolicy: WhenEmpty
    consolidateAfter: 5m
---
# NodePool for Large GPU workloads (1/2 GPU, 12GB)
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: g6f-gpu-large
spec:
  template:
    metadata:
      labels:
        gpu-memory-size: "12gb"
        gpu-fraction: "0.5"
    spec:
      requirements:
        - key: node.kubernetes.io/instance-type
          operator: In
          values:
            - "g6f.4xlarge"
            - "gr6f.4xlarge"
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand", "spot"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: g6f-gpu
      taints:
        - key: nvidia.com/gpu
          effect: NoSchedule
      expireAfter: 720h
  limits:
    cpu: 50
  disruption:
    consolidationPolicy: WhenEmpty
    consolidateAfter: 5m

Apply the NodePools:
kubectl apply -f g6f-nodepools.yaml
kubectl get nodepool

File: sample-gpu-workloads.yaml
---
# Small GPU workload (3GB) - Will provision g6f.large or g6f.xlarge
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-small-inference
spec:
  replicas: 0
  selector:
    matchLabels:
      app: gpu-small-inference
  template:
    metadata:
      labels:
        app: gpu-small-inference
    spec:
      tolerations:
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule
      # Target small GPU NodePool
      nodeSelector:
        gpu-memory-size: "3gb"
      containers:
        - name: inference
          image: nvidia/cuda:12.3.0-base-ubuntu22.04
          command: ["sleep", "infinity"]
          resources:
            requests:
              cpu: 2
              memory: 8Gi
              nvidia.com/gpu: 1
            limits:
              cpu: 2
              memory: 8Gi
              nvidia.com/gpu: 1
---
# Medium GPU workload (6GB) - Will provision g6f.2xlarge
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-medium-inference
spec:
  replicas: 0
  selector:
    matchLabels:
      app: gpu-medium-inference
  template:
    metadata:
      labels:
        app: gpu-medium-inference
    spec:
      tolerations:
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule
      # Target medium GPU NodePool
      nodeSelector:
        gpu-memory-size: "6gb"
      containers:
        - name: inference
          image: nvidia/cuda:12.3.0-base-ubuntu22.04
          command: ["sleep", "infinity"]
          resources:
            requests:
              cpu: 4
              memory: 16Gi
              nvidia.com/gpu: 1
            limits:
              cpu: 4
              memory: 16Gi
              nvidia.com/gpu: 1
---
# Large GPU workload (12GB) - Will provision g6f.4xlarge or gr6f.4xlarge
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-large-inference
spec:
  replicas: 0
  selector:
    matchLabels:
      app: gpu-large-inference
  template:
    metadata:
      labels:
        app: gpu-large-inference
    spec:
      tolerations:
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule
      # Target large GPU NodePool
      nodeSelector:
        gpu-memory-size: "12gb"
      containers:
        - name: inference
          image: nvidia/cuda:12.3.0-base-ubuntu22.04
          command: ["sleep", "infinity"]
          resources:
            requests:
              cpu: 8
              memory: 32Gi
              nvidia.com/gpu: 1
            limits:
              cpu: 8
              memory: 32Gi
              nvidia.com/gpu: 1

Deploy and test:
# Deploy workloads
kubectl apply -f sample-gpu-workloads.yaml
# Test small GPU (3GB)
kubectl scale deployment gpu-small-inference --replicas 1
# Monitor provisioning
kubectl logs -n kube-system -l app.kubernetes.io/name=karpenter -c controller -f
# Wait for pod to be ready
kubectl wait --for=condition=ready pod -l app=gpu-small-inference --timeout=300s
# Verify GPU
kubectl exec deployment/gpu-small-inference -- nvidia-smi

Expected output:
NVIDIA L4-3Q, 3072 MiB # 1/8 fractional GPU
kubectl get pods -n kube-system -l app.kubernetes.io/name=karpenter
kubectl logs -n kube-system -l app.kubernetes.io/name=karpenter -c controller --tail=50

kubectl get nodeoverlay
kubectl describe nodeoverlay g6f-fractional-gpu

kubectl get nodepool
kubectl describe nodepool g6f-gpu-small

kubectl get pods -n nvidia
kubectl get daemonset -n nvidia

kubectl get nodes -L node.kubernetes.io/instance-type,gpu-memory-size,gpu-fraction

kubectl get nodes -o json | jq '.items[] | select(.status.capacity["nvidia.com/gpu"] != null) | {name: .metadata.name, instance: .metadata.labels["node.kubernetes.io/instance-type"], gpu: .status.capacity["nvidia.com/gpu"]}'
1. Pod requests GPU with a node selector:
   nodeSelector:
     gpu-memory-size: "3gb"
   resources:
     requests:
       nvidia.com/gpu: 1
2. Karpenter matches the pod to a NodePool:
   - Sees the gpu-memory-size: "3gb" requirement
   - Matches it to the g6f-gpu-small NodePool
   - That NodePool only allows g6f.large or g6f.xlarge
3. NodeOverlay enables GPU detection:
   - The AWS API reports GPU count = 0 for fractional GPUs
   - NodeOverlay overrides this to GPU count = 1
   - Karpenter knows the instance can satisfy the GPU request
4. Instance provisioned:
   - Karpenter provisions the most cost-effective matching option
   - The node is labeled with gpu-memory-size: "3gb"
   - The pod schedules onto the new node
5. GPU registered:
   - The NVIDIA device plugin starts on the node
   - It detects the actual GPU hardware
   - It registers nvidia.com/gpu: 1 on the node
   - The pod can access the fractional GPU
The AWS EC2 API reports the GPU count as 0 for fractional GPU instances:

aws ec2 describe-instance-types --instance-types g6f.xlarge

"GpuInfo": {
  "Gpus": [{
    "Name": "L4",
    "Count": 0,  // ← AWS reports 0!
    "MemoryInfo": {"SizeInMiB": 2861}
  }]
}

Without NodeOverlay:

Pod requests: nvidia.com/gpu: 1
↓
Karpenter checks AWS API
↓
AWS says: GPU Count = 0
↓
Karpenter: "Instance has 0 GPUs, can't satisfy pod"
↓
❌ Won't provision g6f instance
With NodeOverlay:

Pod requests: nvidia.com/gpu: 1
↓
Karpenter checks NodeOverlay
↓
NodeOverlay says: nvidia.com/gpu = "1"
↓
Karpenter: "Instance has 1 GPU, can satisfy pod"
↓
✅ Provisions g6f instance
| Instance | GPU Fraction | GPU Memory | On-Demand | Spot (avg) |
|---|---|---|---|---|
| g6f.large | 1/8 | 3 GB | $0.08/hr | $0.03/hr |
| g6f.xlarge | 1/8 | 3 GB | $0.16/hr | $0.05/hr |
| g6f.2xlarge | 1/4 | 6 GB | $0.32/hr | $0.10/hr |
| g6f.4xlarge | 1/2 | 12 GB | $0.64/hr | $0.20/hr |
Without dynamic selection:
- 10 small workloads → 10x g6f.2xlarge = $3.20/hr
With dynamic selection:
- 10 small workloads → 10x g6f.xlarge = $1.60/hr
- Savings: 50%
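The arithmetic behind this comparison can be reproduced in a few lines of shell, using the on-demand prices from the table above:

```shell
# Hourly cost of 10 small workloads, wrong-sized vs right-sized
wrong=$(awk 'BEGIN { printf "%.2f", 10 * 0.32 }')    # 10x g6f.2xlarge
right=$(awk 'BEGIN { printf "%.2f", 10 * 0.16 }')    # 10x g6f.xlarge
savings=$(awk -v w="$wrong" -v r="$right" 'BEGIN { printf "%.0f", (1 - r / w) * 100 }')
echo "wrong-sized: \$${wrong}/hr, right-sized: \$${right}/hr, savings: ${savings}%"
# → wrong-sized: $3.20/hr, right-sized: $1.60/hr, savings: 50%
```

The same ratio holds at Spot prices, since the small tier is roughly half the price of the medium tier in both columns.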
# Check pod events
kubectl describe pod <pod-name>
# Common issues:
# 1. Missing node selector
# 2. Node selector doesn't match NodePool labels
# 3. NodePool limits reached
# 4. NodeOverlay not applied

# Check device plugin on node
kubectl get pods -n nvidia -o wide
# Check GPU capacity
kubectl describe node <node-name> | grep nvidia
# Check device plugin logs
kubectl logs -n nvidia -l app.kubernetes.io/name=nvidia-device-plugin

# Check NodeOverlay status
kubectl get nodeoverlay
kubectl describe nodeoverlay g6f-fractional-gpu
# Verify feature gate is enabled
kubectl get deployment -n kube-system karpenter -o yaml | grep nodeOverlay
# Check Karpenter logs
kubectl logs -n kube-system -l app.kubernetes.io/name=karpenter -c controller | grep overlay

# Required for dynamic selection
nodeSelector:
  gpu-memory-size: "3gb" # or "6gb", "12gb"

- g6f.large/xlarge: 2-4 vCPUs, 8-16GB RAM
- g6f.2xlarge: 4-8 vCPUs, 16-32GB RAM
- g6f.4xlarge: 8-16 vCPUs, 32-64GB RAM
NodePools are configured for both Spot and On-Demand. Karpenter prefers Spot.
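If a workload cannot tolerate Spot interruptions, the same NodePool pattern can be narrowed to On-Demand only. A sketch of the one requirement that changes (everything else in the NodePool stays as shown earlier):

```yaml
# Drop "spot" from the capacity-type requirement to force On-Demand
- key: karpenter.sh/capacity-type
  operator: In
  values: ["on-demand"]
```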
Install DCGM exporter for GPU metrics:
kubectl apply -f https://raw.githubusercontent.com/NVIDIA/dcgm-exporter/main/dcgm-exporter.yaml

# Delete workloads
kubectl delete deployment gpu-small-inference gpu-medium-inference gpu-large-inference
# Delete NodePools (will drain nodes)
kubectl delete nodepool g6f-gpu-small g6f-gpu-medium g6f-gpu-large
# Delete NodeOverlay
kubectl delete nodeoverlay g6f-fractional-gpu
# Delete EC2NodeClass
kubectl delete ec2nodeclass g6f-gpu
# Uninstall NVIDIA device plugin
helm uninstall nvdp --namespace nvidia
kubectl delete namespace nvidia
# Uninstall Karpenter
helm uninstall karpenter --namespace kube-system
# Delete CloudFormation stack
aws cloudformation delete-stack --stack-name "Karpenter-${CLUSTER_NAME}"

This setup provides:
✅ Dynamic G6f instance selection based on workload requirements
✅ Cost optimization through right-sizing (50%+ savings)
✅ NodeOverlay support for fractional GPU detection
✅ Simple node selector approach
✅ Karpenter v1.8 with NodeOverlay feature enabled
✅ AWS-recommended NVIDIA device plugin via Helm
✅ Production-ready configuration
Key Files:
- g6f-nodeoverlay.yaml - Required for fractional GPU support
- g6f-ec2nodeclass.yaml - EC2NodeClass for GPU nodes
- g6f-nodepools.yaml - Multiple NodePools for dynamic selection
- sample-gpu-workloads.yaml - Example workloads
Critical Insight: NodeOverlay is required because AWS reports GPU count as 0 for fractional GPU instances. Without it, Karpenter won't provision G6f instances for GPU workloads.
Result: Workloads automatically get the right-sized G6f instance based on their GPU memory requirements, optimizing both cost and GPU utilization.