| name | description |
|---|---|
| poc-research | Write proof-of-concept scripts to research external systems, test assumptions, and validate integration patterns. Use when exploring new APIs, SDK capabilities, infrastructure providers, or any external dependency. PoCs live in poc/ and their assertions graduate into the real test suite. |
Write small, runnable proof-of-concept scripts that test assumptions about external systems. PoCs are throwaway code that validates integration patterns before building the real thing.

    poc/
      {topic}/
        package.json   # standalone Bun project (no workspace deps)
        main.ts        # primary test script
        *.ts           # additional focused scripts
        README.md      # only if user asks

Each PoC is a standalone Bun project with its own package.json and bun.lock. No imports from the main monorepo.
Every PoC script follows this pattern:

    // 1. Config from env vars (fail fast if missing)
    const API_KEY = process.env.THING_API_KEY?.trim();
    if (!API_KEY) { console.error("Set THING_API_KEY"); process.exit(1); }

    // 2. Timeout protection on EVERY async call
    function withTimeout<T>(promise: Promise<T>, ms: number, label: string): Promise<T> {
      return Promise.race([
        promise,
        new Promise<never>((_, reject) =>
          setTimeout(() => reject(new Error(`Timeout: ${label} exceeded ${ms}ms`)), ms),
        ),
      ]);
    }

    // 3. Global hard timeout (prevents runaway scripts)
    const GLOBAL_TIMEOUT_MS = 5 * 60 * 1000;
    const globalTimer = setTimeout(() => {
      console.error("GLOBAL TIMEOUT exceeded");
      printResults();
      process.exit(2);
    }, GLOBAL_TIMEOUT_MS);

    // 4. Results tracking
    const results: Array<{ test: string; status: "PASS" | "FAIL" | "SKIP"; detail: string }> = [];
    function record(test: string, status: "PASS" | "FAIL" | "SKIP", detail: string) {
      results.push({ test, status, detail });
    }

    // 5. Cleanup queue (LIFO)
    const cleanupQueue: Array<() => Promise<void>> = [];

    // 6. Tests as async functions, each self-contained
    // 7. main() runs tests sequentially, prints results table, runs cleanup

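Steps 6 and 7 of the pattern are only described in comments, so here is a minimal sketch of what they might look like. The names `runSequentially`, `formatResults`, `TestFn`, and `exampleTests` are illustrative and not part of the original pattern, and the `ms` timing field is an addition to the results shape shown above.

```typescript
type Status = "PASS" | "FAIL" | "SKIP";
type Result = { test: string; status: Status; detail: string; ms: number };
// Hypothetical test shape: resolve with a detail string, throw to fail.
type TestFn = () => Promise<string>;

// Step 7a: run tests one at a time so a failure never interleaves with another test.
async function runSequentially(tests: Array<[string, TestFn]>): Promise<Result[]> {
  const results: Result[] = [];
  for (const [test, fn] of tests) {
    const started = Date.now();
    try {
      const detail = await fn();
      results.push({ test, status: "PASS", detail, ms: Date.now() - started });
    } catch (err) {
      // Keep the raw error message so the failure can be debugged without re-running.
      const detail = err instanceof Error ? err.message : String(err);
      results.push({ test, status: "FAIL", detail, ms: Date.now() - started });
    }
  }
  return results;
}

// Step 7b: one line per test plus a summary line.
function formatResults(results: Result[]): string {
  const count = (s: Status) => results.filter((r) => r.status === s).length;
  const rows = results.map((r) => `${r.status} | ${r.test} | ${r.ms}ms | ${r.detail}`);
  rows.push(`${count("PASS")} passed, ${count("FAIL")} failed, ${count("SKIP")} skipped`);
  return rows.join("\n");
}

// Step 6: self-contained tests (stand-ins for real calls to an external system).
const exampleTests: Array<[string, TestFn]> = [
  ["echo returns input", async () => "got 'hello'"],
  ["missing resource is rejected", async () => { throw new Error("expected 404, got 200"); }],
];
```

In a real script, `main()` would presumably wrap each test body in `withTimeout()`, print `formatResults(...)`, clear the global timer, and exit non-zero if any test failed.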
- Timeouts on everything. Every async call gets `withTimeout()`. Every script gets a global hard timeout with `process.exit(2)`. Never let a script hang: the user once had to manually kill a 57-minute hang. Don't repeat that.
- Cleanup queue. Register cleanup functions as you create resources. Run them in reverse order in a `finally` block. Catch and log cleanup errors; don't let them mask test results.
- Results table. Track pass/fail/skip for every test. Print a summary table at the end. Include timing data.
- Standalone. Each PoC has its own `package.json` with only the dependencies it needs. No imports from the monorepo. Run with `bun run ./script.ts`.
- Env vars for secrets. Never hardcode API keys. Use `process.env` and fail fast if missing.
- Test one thing per function. Each test function should be independent and named descriptively.
- Log diagnostics. When something fails, log the raw response/error so you can debug without re-running.
- Destructive tests last. Tests that mutate state (suspend, delete, stop) go at the end so failures don't block other tests.
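The cleanup-queue rule can be sketched as follows. This is one possible shape, not the project's actual helper: `runCleanup`, `CleanupFn`, and `demo` are hypothetical names, and the "vm", "snapshot", and "volume" resources are placeholders.

```typescript
type CleanupFn = () => Promise<void>;

// Drains the queue in reverse registration order (LIFO): resources created last
// are released first. Errors are collected rather than thrown so that a failed
// cleanup cannot replace the real test outcome.
async function runCleanup(queue: CleanupFn[]): Promise<string[]> {
  const errors: string[] = [];
  while (queue.length > 0) {
    const cleanup = queue.pop() as CleanupFn;
    try {
      await cleanup();
    } catch (err) {
      errors.push(err instanceof Error ? err.message : String(err));
    }
  }
  return errors;
}

// Usage sketch: register cleanup right after each resource is created,
// then drain the queue in a finally block.
async function demo(): Promise<{ released: string[]; cleanupErrors: string[] }> {
  const cleanupQueue: CleanupFn[] = [];
  const released: string[] = [];
  let cleanupErrors: string[] = [];
  try {
    cleanupQueue.push(async () => { released.push("vm"); });       // placeholder resource
    cleanupQueue.push(async () => { released.push("snapshot"); }); // placeholder resource
    cleanupQueue.push(async () => { throw new Error("delete volume failed"); });
  } finally {
    cleanupErrors = await runCleanup(cleanupQueue);
  }
  return { released, cleanupErrors };
}
```

Returning the error messages leaves it to the caller to log them next to the results table, so a cleanup failure stays visible without changing any test's status.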
Anti-patterns to avoid:
- Running without timeouts
- Retrying the same failing operation in a loop
- Swallowing errors silently
- Creating resources without cleanup
- Depending on other PoC directories
- Using the monorepo's node_modules
PoC scripts validate assumptions. Once the real implementation is built, the key assertions from the PoC should become integration tests:
- Extract the assertion — what did the PoC prove? (e.g., "Freestyle Git repos mount at the specified path within 5s")
- Write an integration test in the real codebase that verifies the same thing against the real service layer
- Keep the PoC around as reference code — it documents how the external API actually behaves, including edge cases and workarounds
PoCs are never deleted unless the external system they test is no longer used.
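As a rough illustration of graduating an assertion, the "mounts within 5s" example above could become a timing check like the one below. `assertWithin`, `mountRepo`, and the `/workspace/repo` path are hypothetical stand-ins. A real integration test would call the actual service layer and use the codebase's own test runner.

```typescript
// Fails if `op` takes longer than `budgetMs`; returns the operation's value otherwise.
async function assertWithin<T>(label: string, budgetMs: number, op: () => Promise<T>): Promise<T> {
  const started = Date.now();
  const value = await op();
  const elapsed = Date.now() - started;
  if (elapsed > budgetMs) {
    throw new Error(`${label}: took ${elapsed}ms, budget was ${budgetMs}ms`);
  }
  return value;
}

// Hypothetical stand-in for the real service layer call.
async function mountRepo(path: string): Promise<{ mountedAt: string }> {
  return { mountedAt: path };
}

// The graduated assertion: the repo mounts at the requested path within 5s.
async function repoMountsWithinFiveSeconds(): Promise<void> {
  const path = "/workspace/repo"; // illustrative path
  const result = await assertWithin("git repo mount", 5000, () => mountRepo(path));
  if (result.mountedAt !== path) {
    throw new Error(`expected mount at ${path}, got ${result.mountedAt}`);
  }
}
```

This check only measures elapsed time after the operation finishes. It does not abort a hanging call, so it still needs to run under the test runner's own timeout.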
See poc/freestyle-vm/ for a comprehensive example:
- `workspace.ts`: 13 tests covering VM create, filesystem, runCode, systemd, fork, identity, git, terminals, snapshots, suspend, secrets
- `workspace-full.ts`: full workspace product model (git repo lifecycle, workspace template, domain mapping, fork for CI, snapshot restore, secret injection, SSH multi-user, service management)
- `main.ts`: minimal raw API client (no SDK)
Key findings documented by the PoC:
- VmBun installs to `/opt/bun/bin/bun` (not `/root/.bun/bin/bun`)
- `vm.fork()` SDK bug: returns `vmId: undefined` (use raw fetch, read `forks[0].id`)
- `vm.start()` hangs indefinitely on persistent suspended VMs (Freestyle backend issue)
- Snapshot restore takes about one second (1.09s measured)
- Git repos need ~5s for the `freestyle-git-sync` service to initialize
- Secrets via `runCode({ env })` don't leak to disk
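The `vm.fork()` workaround amounts to reading the ID from the raw response body instead of the SDK's return value. Below is a minimal sketch of that parsing step. Only the `forks[0].id` path comes from the findings above. The helper name `extractForkId` and the assumption that the ID is a non-empty string are illustrative, and the HTTP request itself is omitted because the endpoint is not documented here.

```typescript
// Reads forks[0].id from an already-parsed fork response body.
// Assumption: the id is a non-empty string; the rest of the payload is ignored.
function extractForkId(body: unknown): string {
  if (typeof body !== "object" || body === null) {
    throw new Error("fork response is not an object");
  }
  const forks = (body as { forks?: unknown }).forks;
  if (!Array.isArray(forks) || forks.length === 0) {
    throw new Error("fork response has no forks");
  }
  const id = (forks[0] as { id?: unknown } | null)?.id;
  if (typeof id !== "string" || id.length === 0) {
    throw new Error("forks[0].id is missing");
  }
  return id;
}
```

Failing loudly when the field is absent keeps an `undefined` ID from reaching later calls, which is how the SDK bug showed up in the first place.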