
@louis030195
Created December 6, 2025 23:03
HCS-Kube: Dead Simple Windows Sandbox Orchestrator - Run multiple concurrent sandboxes via HCS API with HvSocket communication

HCS-Kube: Dead Simple Windows Sandbox Orchestrator

A minimal Kubernetes-like orchestrator for running multiple concurrent Windows Sandboxes via HCS API.

Why This Exists

| Approach          | Concurrency | Boot Time          | Complexity |
|-------------------|-------------|--------------------|------------|
| Windows Sandbox   | 1 only ❌   | 2s                 | Low        |
| Hyper-V VMs       | N ✅        | 30-60s             | High       |
| HCS Direct (this) | N ✅        | ~30s (or <1s warm) | Low        |

Key Discovery: Windows Sandbox's single-instance limit is enforced by WindowsSandbox.exe, not by HCS itself. Calling HCS directly allows multiple concurrent sandboxes.

Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                              hcs-kube                                        │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   ┌─────────────┐     ┌─────────────┐     ┌─────────────────────────────┐   │
│   │ CLI         │     │ HTTP API    │     │ Rust lib                    │   │
│   │ hcs vm list │     │ POST /run   │     │ hcs_kube::run_workflow()    │   │
│   └──────┬──────┘     └──────┬──────┘     └─────────────┬───────────────┘   │
│          └───────────────────┴─────────────────────────┬┘                    │
│                                                        ▼                     │
│   ┌─────────────────────────────────────────────────────────────────────┐   │
│   │                        Orchestrator                                  │   │
│   │                                                                      │   │
│   │   Pool:  [sandbox-1: Ready] [sandbox-2: Busy] [sandbox-3: Ready]    │   │
│   │                                                                      │   │
│   │   acquire() → boot sandbox, return HvSocket connection              │   │
│   │   execute() → send command via HvSocket, get result                 │   │
│   │   release() → terminate sandbox, recycle slot                       │   │
│   │                                                                      │   │
│   └───────────────────────────────┬─────────────────────────────────────┘   │
│                                   │                                          │
│   ┌───────────────────────────────┴─────────────────────────────────────┐   │
│   │                         HCS Layer                                    │   │
│   │                                                                      │   │
│   │   HcsCreateComputeSystem + HcsStartComputeSystem                    │   │
│   │   Config: Scsi + HvSocket (minimal, proven working)                 │   │
│   │   VHDX: Pre-converted standalone copies with agent baked in        │   │
│   │                                                                      │   │
│   └───────────────────────────────┬─────────────────────────────────────┘   │
│                                   │ HvSocket (AF_HYPERV)                     │
│   ┌───────────────────────────────┴─────────────────────────────────────┐   │
│   │                         Sandboxes (N concurrent)                     │   │
│   │                                                                      │   │
│   │   ┌─────────────┐   ┌─────────────┐   ┌─────────────┐               │   │
│   │   │ sandbox-1   │   │ sandbox-2   │   │ sandbox-3   │               │   │
│   │   │             │   │             │   │             │               │   │
│   │   │ MCP Agent   │   │ MCP Agent   │   │ MCP Agent   │               │   │
│   │   │ (terminator)│   │ (terminator)│   │ (terminator)│               │   │
│   │   │      ↕      │   │      ↕      │   │      ↕      │               │   │
│   │   │  HvSocket   │   │  HvSocket   │   │  HvSocket   │               │   │
│   │   └─────────────┘   └─────────────┘   └─────────────┘               │   │
│   │                                                                      │   │
│   └──────────────────────────────────────────────────────────────────────┘   │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘

Proven: Multi-Instance Works

PS> cargo run -- list
ID                                       OWNER           STATE
-----------------------------------------------------------------
91ff262a-ca78-41b9-98e5-06defbb1ccee     CmService       SavedAsTemplate
sandbox1                                 hcs-sandbox     Running
sandbox2                                 hcs-sandbox     Running

Two concurrent HCS sandboxes, bypassing Windows Sandbox's singleton limit.

Minimal Working HCS Config

{
  "SchemaVersion": { "Major": 2, "Minor": 1 },
  "Owner": "hcs-kube",
  "ShouldTerminateOnLastHandleClosed": true,
  "VirtualMachine": {
    "StopOnReset": true,
    "Chipset": {
      "Uefi": {
        "BootThis": { "DeviceType": "ScsiDrive", "DevicePath": "Scsi(0,0)" }
      }
    },
    "ComputeTopology": {
      "Memory": { "SizeInMB": 2048 },
      "Processor": { "Count": 2 }
    },
    "Devices": {
      "Scsi": {
        "0": {
          "Attachments": {
            "0": { "Path": "C:\\Sandboxes\\sandbox-1\\disk.vhdx", "Type": "VirtualDisk" }
          }
        }
      },
      "HvSocket": {}   // ← CRITICAL: Required for VM to start + enables host communication
    }
  }
}

Key Discovery: the empty "HvSocket": {} device entry is required. Without it, the VM fails to start.
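The only per-slot variable in that document is the VHDX path, so generating it per sandbox is straightforward. A minimal sketch with plain string templating (a real implementation would use a JSON library; the function name here is an assumption):

```rust
/// Build the minimal HCS document for one sandbox slot.
/// Sketch only: assumes the path needs no escaping beyond backslashes.
fn build_hcs_config(owner: &str, vhdx_path: &str) -> String {
    format!(
        r#"{{
  "SchemaVersion": {{ "Major": 2, "Minor": 1 }},
  "Owner": "{owner}",
  "ShouldTerminateOnLastHandleClosed": true,
  "VirtualMachine": {{
    "StopOnReset": true,
    "Chipset": {{
      "Uefi": {{ "BootThis": {{ "DeviceType": "ScsiDrive", "DevicePath": "Scsi(0,0)" }} }}
    }},
    "ComputeTopology": {{
      "Memory": {{ "SizeInMB": 2048 }},
      "Processor": {{ "Count": 2 }}
    }},
    "Devices": {{
      "Scsi": {{ "0": {{ "Attachments": {{
        "0": {{ "Path": "{path}", "Type": "VirtualDisk" }}
      }} }} }},
      "HvSocket": {{}}
    }}
  }}
}}"#,
        owner = owner,
        // JSON requires backslashes in Windows paths to be doubled.
        path = vhdx_path.replace('\\', "\\\\")
    )
}
```

Note the empty `HvSocket` device entry is always emitted, since the VM will not start without it.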

Data Model

struct Sandbox {
    id: String,
    name: String,                     // "sandbox-1"
    vhdx_path: PathBuf,              // C:\Sandboxes\sandbox-1\disk.vhdx
    state: SandboxState,             // Ready | Booting | Running | Stopping
    hcs_id: Option<String>,          // HCS compute system ID when running
}

enum SandboxState {
    Ready,      // VHDX exists, not running, can be started
    Booting,    // HCS created, waiting for agent connection
    Running,    // Agent connected, executing workflow
    Stopping,   // Terminating
}
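The lifecycle is a one-way ring (Ready → Booting → Running → Stopping → Ready). A small guard keeps the orchestrator from, say, releasing a slot that never booted — `can_transition` is a hypothetical helper, not part of the source:

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum SandboxState { Ready, Booting, Running, Stopping }

/// Return true when `from -> to` is a legal lifecycle transition.
fn can_transition(from: SandboxState, to: SandboxState) -> bool {
    use SandboxState::*;
    matches!(
        (from, to),
        (Ready, Booting)      // acquire(): HCS compute system created
        | (Booting, Running)  // agent connected over HvSocket
        | (Booting, Stopping) // agent never connected: abort the boot
        | (Running, Stopping) // release(): begin teardown
        | (Stopping, Ready)   // slot recycled, VHDX untouched
    )
}
```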

Core Implementation (~500 lines total)

Orchestrator

pub struct Orchestrator {
    db: Database,
    config: Config,
    agent_connections: HashMap<String, HvSocketConnection>,
}

impl Orchestrator {
    /// Pre-create N sandbox slots (one-time setup)
    pub fn provision(&self, count: usize) -> Result<()> {
        let source_vhdx = find_windows_sandbox_vhdx()?;

        for i in 0..count {
            let name = format!("sandbox-{}", i);
            let vhdx_path = self.config.storage_path.join(&name).join("disk.vhdx");

            // Convert-VHD merges parent chain into standalone copy
            powershell(&format!(
                "Convert-VHD -Path '{}' -DestinationPath '{}' -VHDType Dynamic",
                source_vhdx, vhdx_path.display()
            ))?;

            // Bake agent into VHDX
            self.inject_agent(&vhdx_path)?;

            // Register in DB
            self.db.insert_sandbox(Sandbox {
                id: uuid(),
                name,
                vhdx_path,
                state: SandboxState::Ready,
                hcs_id: None,
            })?;
        }
        Ok(())
    }

    /// Boot a sandbox, wait for agent, return handle
    pub fn acquire(&mut self) -> Result<SandboxHandle> {
        let sandbox = self.db.find_ready_sandbox()?
            .ok_or(Error::NoSandboxAvailable)?;

        self.db.update_state(&sandbox.id, SandboxState::Booting)?;

        // Create HCS compute system
        let hcs_config = build_hcs_config(&sandbox);
        let cs = hcs::ComputeSystem::create(&sandbox.name, &hcs_config)?;
        cs.start()?;

        self.db.update_hcs_id(&sandbox.id, cs.id())?;

        // Wait for agent to connect via HvSocket
        let vm_guid = get_vm_guid(&cs)?;
        let conn = self.wait_for_agent(vm_guid, Duration::from_secs(60))?;

        self.db.update_state(&sandbox.id, SandboxState::Running)?;
        self.agent_connections.insert(sandbox.id.clone(), conn);

        Ok(SandboxHandle {
            id: sandbox.id,
            name: sandbox.name,
        })
    }

    /// Execute workflow in sandbox
    pub fn execute(&self, handle: &SandboxHandle, workflow: &str, input: Value) -> Result<Value> {
        let conn = self.agent_connections.get(&handle.id)
            .ok_or(Error::NotConnected)?;

        conn.send_json(&json!({
            "type": "execute",
            "workflow": workflow,
            "input": input
        }))?;

        conn.recv_json()
    }

    /// Terminate sandbox, recycle slot
    pub fn release(&mut self, handle: SandboxHandle) -> Result<()> {
        self.agent_connections.remove(&handle.id);

        let sandbox = self.db.get_sandbox(&handle.id)?;
        if let Some(hcs_id) = &sandbox.hcs_id {
            if let Ok(cs) = hcs::ComputeSystem::open(hcs_id) {
                let _ = cs.terminate();
            }
        }

        self.db.update_state(&handle.id, SandboxState::Ready)?;
        self.db.update_hcs_id(&handle.id, None)?;

        Ok(())
    }
}
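Abstracted away from HCS and SQLite, the slot discipline is easy to mock. This hypothetical in-memory pool mirrors the acquire/release contract above — claim any Ready slot, mark it Running, recycle it on release:

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Debug)]
enum SlotState { Ready, Running }

/// In-memory stand-in for the SQLite-backed slot table (mock only,
/// not the real db.rs) illustrating the acquire/release discipline.
struct SlotPool {
    states: HashMap<String, SlotState>,
}

impl SlotPool {
    fn new(count: usize) -> Self {
        let states = (0..count)
            .map(|i| (format!("sandbox-{i}"), SlotState::Ready))
            .collect();
        SlotPool { states }
    }

    /// Claim any Ready slot, marking it Running; None when the pool is exhausted.
    fn acquire(&mut self) -> Option<String> {
        let name = self.states.iter()
            .find(|(_, s)| **s == SlotState::Ready)
            .map(|(n, _)| n.clone())?;
        self.states.insert(name.clone(), SlotState::Running);
        Some(name)
    }

    /// Recycle a slot: the VHDX is untouched, so it is immediately Ready again.
    fn release(&mut self, name: &str) {
        self.states.insert(name.to_string(), SlotState::Ready);
    }
}
```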

HvSocket Communication

// Host side - connect to sandbox
pub fn connect_to_sandbox(vm_guid: Guid, service_guid: Guid) -> Result<HvSocketConnection> {
    let socket = Socket::new(AF_HYPERV, SOCK_STREAM)?;

    let addr = SOCKADDR_HV {
        Family: AF_HYPERV as u16,
        Reserved: 0,
        VmId: vm_guid,
        ServiceId: service_guid,
    };

    socket.connect(&addr)?;
    Ok(HvSocketConnection { socket })
}

// Guest side - listen for host (in agent)
pub fn listen_for_host(service_guid: Guid) -> Result<HvSocketListener> {
    let socket = Socket::new(AF_HYPERV, SOCK_STREAM)?;

    let addr = SOCKADDR_HV {
        Family: AF_HYPERV as u16,
        Reserved: 0,
        VmId: HV_GUID_PARENT,  // Special: means "my host"
        ServiceId: service_guid,
    };

    socket.bind(&addr)?;
    socket.listen(1)?;

    Ok(HvSocketListener { socket })
}

// Protocol
impl HvSocketConnection {
    pub fn send_json(&self, value: &Value) -> Result<()> {
        let bytes = serde_json::to_vec(value)?;
        let len = (bytes.len() as u32).to_le_bytes();
        self.socket.write_all(&len)?;
        self.socket.write_all(&bytes)?;
        Ok(())
    }

    pub fn recv_json(&self) -> Result<Value> {
        let mut len_buf = [0u8; 4];
        self.socket.read_exact(&mut len_buf)?;
        let len = u32::from_le_bytes(len_buf) as usize;

        let mut buf = vec![0u8; len];
        self.socket.read_exact(&mut buf)?;

        Ok(serde_json::from_slice(&buf)?)
    }
}
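The length-prefixed framing above works over any byte stream, so it can be exercised off-Windows with an in-memory buffer (plain `Vec<u8>`/`Cursor` here instead of a real HvSocket; the function names are illustrative):

```rust
use std::io::{Cursor, Read, Result, Write};

/// Write one frame: 4-byte little-endian length, then the payload.
fn write_frame(w: &mut impl Write, payload: &[u8]) -> Result<()> {
    w.write_all(&(payload.len() as u32).to_le_bytes())?;
    w.write_all(payload)
}

/// Read one frame written by `write_frame`.
fn read_frame(r: &mut impl Read) -> Result<Vec<u8>> {
    let mut len = [0u8; 4];
    r.read_exact(&mut len)?;
    let mut buf = vec![0u8; u32::from_le_bytes(len) as usize];
    r.read_exact(&mut buf)?;
    Ok(buf)
}
```

Because frames carry their own length, consecutive messages never bleed into each other even when the transport delivers partial reads.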

Agent (add to terminator)

// Add to terminator's startup
const MCP_SERVICE_GUID: Guid = guid!("12345678-1234-5678-1234-567812345678");

fn start_hvsocket_listener() {
    std::thread::spawn(|| {
        let listener = hvsocket::listen_for_host(MCP_SERVICE_GUID)
            .expect("Failed to create HvSocket listener");

        loop {
            let conn = listener.accept().expect("Accept failed");

            std::thread::spawn(move || {
                handle_connection(conn);
            });
        }
    });
}

fn handle_connection(conn: HvSocketConnection) {
    loop {
        let cmd: Value = match conn.recv_json() {
            Ok(v) => v,
            Err(_) => break,
        };

        let result = match cmd["type"].as_str() {
            Some("execute") => {
                let workflow = cmd["workflow"].as_str().unwrap();
                let input = &cmd["input"];
                execute_workflow(workflow, input)
            }
            Some("screenshot") => {
                let png = capture_screen();
                json!({ "type": "screenshot", "data": base64::encode(&png) })
            }
            Some("click") => {
                let x = cmd["x"].as_i64().unwrap() as i32;
                let y = cmd["y"].as_i64().unwrap() as i32;
                mouse_click(x, y);
                json!({ "ok": true })
            }
            _ => json!({ "error": "unknown command" })
        };

        if conn.send_json(&result).is_err() {
            break;
        }
    }
}

Setup Flow

# 1. Run Windows Sandbox once (creates base VHDX with shared kernel)
WindowsSandbox.exe
# Close it after it boots

# 2. Provision sandbox slots
hcs-kube provision --count 5
# Creates 5 standalone VHDX copies with agent baked in

# 3. Start the orchestrator
hcs-kube serve --port 8080

Runtime Flow

POST /workflow/run
{
  "workflow": "book-flight",
  "input": { "destination": "NYC", "date": "2024-01-15" }
}

           ┌─────────────────────────────────────────────────────────┐
           │                     hcs-kube                            │
           └──────────────────────────┬──────────────────────────────┘
                                      │
    1. acquire()                      │
       ┌──────────────────────────────┴──────────────────────────────┐
       │  - Find ready slot (sandbox-1)                              │
       │  - HcsCreateComputeSystem + HcsStartComputeSystem           │
       │  - Wait for HvSocket connection from agent                   │
       └──────────────────────────────┬──────────────────────────────┘
                                      │
                                      ▼
                           ┌─────────────────────┐
                           │     sandbox-1       │
                           │                     │
                           │  [Windows boots]    │
                           │  [Agent starts]     │
                           │  [HvSocket connect] │
                           │                     │
                           └──────────┬──────────┘
                                      │
    2. execute()                      │
       ┌──────────────────────────────┴──────────────────────────────┐
       │  Host: send {"type":"execute","workflow":"book-flight",...} │
       │  Guest: terminator runs UI automation                       │
       │  Guest: send result back                                    │
       └──────────────────────────────┬──────────────────────────────┘
                                      │
    3. release()                      │
       ┌──────────────────────────────┴──────────────────────────────┐
       │  - Terminate HCS compute system                             │
       │  - Mark slot as Ready                                       │
       │  - VHDX unchanged (ephemeral execution)                     │
       └─────────────────────────────────────────────────────────────┘

Response:
{
  "success": true,
  "confirmation_number": "ABC123",
  "screenshots": ["base64...", "base64..."]
}

Warm Pool Optimization

Keep 2-3 sandboxes pre-booted for instant response:

impl Orchestrator {
    /// Background task: maintain warm pool
    async fn maintain_warm_pool(&self) {
        loop {
            let warm_count = self.count_running_idle();

            if warm_count < self.config.warm_pool_size {
                // Boot another sandbox in background
                if let Ok(Some(sandbox)) = self.db.find_ready_sandbox() {
                    let _ = self.boot_sandbox(&sandbox.id).await;
                }
            }

            tokio::time::sleep(Duration::from_secs(5)).await;
        }
    }

    /// Acquire from warm pool (instant) or cold boot (30s)
    pub async fn acquire(&self) -> Result<SandboxHandle> {
        // Try warm pool first
        if let Some(handle) = self.acquire_warm().await {
            // Trigger background replacement
            self.spawn_warm_replacement();
            return Ok(handle);
        }

        // Fall back to cold boot
        self.acquire_cold().await
    }
}
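The maintenance loop reduces to one piece of arithmetic per tick: how many background boots to start. Pulling it into a pure helper (a hypothetical name, not in the source) makes it testable without HCS:

```rust
/// Number of background boots to launch this tick: top the warm pool
/// up to `target`, but never request more than the `ready` slots that exist.
fn warm_boots_needed(running_idle: usize, target: usize, ready: usize) -> usize {
    target.saturating_sub(running_idle).min(ready)
}
```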

File Structure

hcs-kube/
├── Cargo.toml
├── src/
│   ├── lib.rs              # Exports
│   ├── orchestrator.rs     # Core logic (~200 lines)
│   ├── hcs.rs              # HCS API wrapper (from hcs-sandbox)
│   ├── hvsocket.rs         # AF_HYPERV socket wrapper (~100 lines)
│   ├── db.rs               # SQLite state (~50 lines)
│   └── bin/
│       └── hcs-kube.rs     # CLI
├── agent/
│   └── hvsocket.rs         # Add to terminator (~50 lines)
└── scripts/
    └── inject-agent.ps1    # Bake agent into VHDX

Key Advantages

  1. Simple - ~500 lines of new code total
  2. Proven - HCS multi-instance already works
  3. Fast communication - HvSocket is direct, no networking
  4. Ephemeral - Each run starts from clean VHDX
  5. Scalable - Limited only by host resources

Comparison to Alternatives

| Feature             | Windows Sandbox | Hyper-V + hyperv-kube | HCS + hcs-kube    |
|---------------------|-----------------|-----------------------|-------------------|
| Concurrency         | 1               | N                     | N                 |
| Boot time           | 2s              | 2-5s (resume)         | 30s (or <1s warm) |
| Complexity          | None            | High                  | Low               |
| Shared kernel       | Yes             | No                    | Yes               |
| Memory per instance | ~500MB          | 2-4GB                 | ~500MB            |
| Disk per instance   | ~100MB (COW)    | 20-40GB               | ~5GB (standalone) |

Dependencies

[dependencies]
serde = { version = "1", features = ["derive"] }
serde_json = "1"
rusqlite = "0.29"
uuid = { version = "1", features = ["v4"] }
tracing = "0.1"
windows = { version = "0.48", features = [
    "Win32_Networking_WinSock",
    "Win32_System_Hypervisor",
]}

Registry Setup for HvSocket

Guest must register the service GUID:

# Run inside the VHDX before sealing (inject-agent.ps1)
$guid = "12345678-1234-5678-1234-567812345678"
$path = "HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Virtualization\GuestCommunicationServices\{$guid}"
New-Item -Path $path -Force
Set-ItemProperty -Path $path -Name "ElementName" -Value "MCP Agent"

Next Steps

  1. Extract HCS bindings from hcs-sandbox into shared crate
  2. Implement HvSocket wrapper (AF_HYPERV)
  3. Add HvSocket listener to terminator
  4. Build orchestrator with acquire/execute/release
  5. Add warm pool optimization
  6. HTTP API for external integration
  7. Integration with mediar-web-app

This design leverages the proven HCS multi-instance capability while keeping implementation minimal.
