Lucas Duarte lusoal


SOCI vs Standard Time to Inference Performance Test Results

Test Environment

  • Cluster: <cluster-name>
  • Instance Type: g6e.2xlarge
  • Source Image: public.ecr.aws/aws-containers/aiml/ray-2.43.0-py311-vllm0.7.3:latest
  • Test Image: <account-id>.dkr.ecr.<region>.amazonaws.com/ray-vllm-soci:latest (source image with SOCI index added and pushed to private ECR)
  • Image Size: 7,976,980,513 bytes (~7.4 GiB)
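
The image size can be read in binary or decimal units, which differ by about 7% at this scale. The following is a minimal sketch of that arithmetic; the helper name `describe_size` is hypothetical and not part of the test setup.

```python
def describe_size(num_bytes: int) -> str:
    """Format a byte count in both GiB (2**30 bytes) and GB (10**9 bytes)."""
    gib = num_bytes / 2**30
    gb = num_bytes / 10**9
    return f"{gib:.2f} GiB / {gb:.2f} GB"


if __name__ == "__main__":
    # Image size reported in the test environment above.
    print(describe_size(7_976_980_513))  # prints "7.43 GiB / 7.98 GB"
```

The figure of roughly 7.4 therefore corresponds to gibibytes; in decimal gigabytes the same image is close to 8 GB.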

Ray Configuration Context

# Deployment Guide

## Initial Setup
1. **Deploy 3 Tenants in the Platform**:
   - Deploy one tenant in each tier. Note that the `pool-1` environment is already deployed.
   - Create new versions of the Helm charts and images to demonstrate updates to tenants that are already deployed.

2. **Get the SQS Queue URLs to Provision the Tenants**:
lusoal / configuration-auth-policy-istio.yaml
Created January 16, 2024 13:13
How to control ingress traffic using Istio Authorization policy and AWS ALB with routing.http.xff_header_processing.mode=append
---
# Istio ingress Helm Chart
# Name allows overriding the release name. Generally this should not be set
name: ""
# revision declares which revision this gateway is a part of
revision: ""
# Controls the spec.replicas setting for the Gateway deployment if set.
# Otherwise defaults to Kubernetes Deployment default (1).
#!/bin/bash
# Creating cluster for tenant-1
echo "Creating Kafka cluster"
kubectl apply -f ./kafka_cluster.yaml
echo "Verifying that the cluster was created"
kubectl wait kafka/<YOUR_CLUSTER_NAME> -n strimzi-kafka-operator --for=condition=Ready --timeout=100s
# Use an if statement to fail based on the result of the command above
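
The script preview ends before the failure check it describes. As a hedged sketch of that idea (not the author's script), the wait-then-fail logic could look like this in Python. The function names are hypothetical, and the sketch relies on `kubectl wait` exiting with a non-zero status when the condition is not met before the timeout.

```python
import subprocess
from typing import Callable, List


def build_wait_command(cluster_name: str,
                       namespace: str = "strimzi-kafka-operator",
                       timeout_seconds: int = 100) -> List[str]:
    """Build the same `kubectl wait` invocation used in the script above."""
    return [
        "kubectl", "wait", f"kafka/{cluster_name}",
        "-n", namespace,
        "--for=condition=Ready",
        f"--timeout={timeout_seconds}s",
    ]


def kafka_is_ready(cluster_name: str,
                   run: Callable[..., subprocess.CompletedProcess] = subprocess.run) -> bool:
    """Return True if the wait command exits with status 0, False otherwise.

    `run` is injectable so the logic can be exercised without a cluster.
    """
    result = run(build_wait_command(cluster_name), check=False)
    return result.returncode == 0
```

A caller could then stop the provisioning flow with something like `sys.exit("Kafka cluster not Ready")` whenever `kafka_is_ready(...)` returns `False`.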
lusoal / README.md
Last active March 4, 2023 00:52
EKS Blueprints Workshop


This workshop helps you build a multi-team platform on top of EKS.

It enables multiple development teams in your organization to deploy workloads freely without the platform team becoming a bottleneck. We walk through the baseline setup of an EKS cluster, then gradually install add-ons that extend its capabilities, such as ArgoCD, Rollouts, GitOps, and other common open-source add-ons.

Event Engine Access:

Event Hash:

---
AWSTemplateFormatVersion: '2010-09-09'
Description: AWS CloudFormation template for dynamic Cloud 9 setups. Creates a Cloud9
  bootstraps the instance. It also creates a t2.micro instance to deploy the terraform scripts to spin-up an Amazon EKS cluster
Parameters:
  ExampleC9InstanceType:
    Description: Example Cloud9 instance type
    Type: String
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:

Runbook to create New Nodes, Cordon and Drain

Creating NodeGroups

First, create three NodeGroups, one in each of three Availability Zones (AZs). Because each NodeGroup's Auto Scaling group then covers a single AZ, it will not try to rebalance Nodes to keep the same number of Nodes in each AZ.

Follow this documentation to create the NodeGroups:

https://docs.aws.amazon.com/eks/latest/userguide/create-managed-node-group.html
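
To illustrate the one-NodeGroup-per-AZ layout, here is a minimal sketch that only builds the request payloads, one per AZ, each pinned to a single subnet. The function name, the naming scheme, and all values are hypothetical. The dictionary keys are meant to mirror the parameters of the EKS `CreateNodegroup` API, so verify them against the linked documentation before use.

```python
from typing import Dict, List


def nodegroup_requests(cluster_name: str,
                       node_role_arn: str,
                       subnet_by_az: Dict[str, str],
                       size: int = 1) -> List[dict]:
    """Build one node group request per AZ, each restricted to one subnet."""
    requests = []
    for az, subnet_id in sorted(subnet_by_az.items()):
        requests.append({
            "clusterName": cluster_name,
            # Hypothetical naming scheme: <cluster>-<az>
            "nodegroupName": f"{cluster_name}-{az}",
            # A single subnet keeps the underlying Auto Scaling group in one AZ.
            "subnets": [subnet_id],
            "nodeRole": node_role_arn,
            "scalingConfig": {"minSize": size, "maxSize": size, "desiredSize": size},
        })
    return requests
```

Each resulting dictionary could be passed to an EKS client call, or used as a checklist when creating the NodeGroups through the console.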

lusoal / CLUSTER-AUTOSCALER-REBALANCE.md
Created October 25, 2022 16:21
Tests for CA rebalance

Scenario 1

Application with pod anti-affinity defined, so that only one replica is scheduled per Node.

ASG multi-az: us-east-2a, us-east-2b, us-east-2c

Capacity rebalance enabled

apiVersion: apps/v1
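
The manifest above is cut off after its first line. As a stand-in, not the author's manifest, the following sketch builds a Deployment whose required pod anti-affinity on `kubernetes.io/hostname` keeps replicas with the same label off the same Node. The function name, the `nginx` image, and the replica count are assumptions for illustration.

```python
import json


def one_replica_per_node_deployment(name: str, image: str, replicas: int) -> dict:
    """Return a Deployment manifest that allows at most one replica per Node."""
    labels = {"app": name}
    anti_affinity = {
        "podAntiAffinity": {
            # Hard requirement: do not co-locate pods carrying the same label.
            "requiredDuringSchedulingIgnoredDuringExecution": [{
                "labelSelector": {"matchLabels": labels},
                "topologyKey": "kubernetes.io/hostname",
            }]
        }
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "affinity": anti_affinity,
                    "containers": [{"name": name, "image": image}],
                },
            },
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests, so the output can be piped to `kubectl apply -f -`.
    print(json.dumps(one_replica_per_node_deployment("nginx", "nginx", 3), indent=2))
```

With a rule like this, every additional replica needs its own Node, which is what makes the scenario useful for observing how new Nodes are spread across AZs.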
lusoal / IRSA-DEMONSTRATION.md
Created October 20, 2022 20:35
IRSA Demonstration Script

Setup RBAC

Creating a new user:

export ACCOUNT_ID=0000000
export AWS_DEFAULT_REGION=us-east-2
aws iam create-user --user-name rbac-user
aws iam create-access-key --user-name rbac-user | tee /tmp/create_output.json
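
The last command saves the new access key as JSON in `/tmp/create_output.json`. As a minimal sketch of how those credentials could be read back for later steps, the helper below (a hypothetical name, not part of the demonstration script) assumes the standard `create-access-key` output shape with a top-level `AccessKey` object.

```python
import json


def read_access_key(create_output: str) -> dict:
    """Extract the credential pair from `aws iam create-access-key` JSON output."""
    access_key = json.loads(create_output)["AccessKey"]
    return {
        "AWS_ACCESS_KEY_ID": access_key["AccessKeyId"],
        "AWS_SECRET_ACCESS_KEY": access_key["SecretAccessKey"],
    }
```

For example, the contents of `/tmp/create_output.json` could be read with `open(...).read()` and passed to `read_access_key`, and the resulting values exported as environment variables for the `rbac-user` session.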