@galligan
Last active March 13, 2026 22:34

Porting the Proof SDK to Cloudflare Workers

I'm a subscriber to Every, and I listen to every podcast they publish. A few weeks back, Dan Shipper did a Making Of episode for Proof, the collaborative markdown editor they built for working alongside AI agents. I was immediately hooked.

If you haven't seen Proof yet: it's an open-source editor, collaboration server, and agent HTTP bridge, all built around a simple idea. Humans and AI agents should be able to edit the same markdown document together, in real time, with comments, suggestions, and full provenance tracking. No MCP authentication dance. No overwhelming IDE. Just a clean surface for the work.

That simplicity is what got me. There's a real gap right now between the tools we have. IDEs are overwhelming for non-technical collaborators (and plenty of us are living in terminals with Claude Code and Codex these days anyway). Obsidian isn't built for collaboration. Notion is nice but doesn't use markdown natively, and agents need to authenticate via MCP or similar. Google Docs works, but it's hauling around decades of features you don't need when the job is "edit this markdown together." Proof nails the simplicity angle, and that's killer.

The Outage That Got Me Curious

The day after the team open-sourced Proof SDK, they had a small outage. People wanted to use it. That's a good problem to have.

It also made me curious about the infrastructure. The existing deployment is an Express server on Railway: a single Node.js process with one SQLite file and all Yjs documents held in memory. It works great for getting started. But it's inherently single-process, single-region, and when demand spikes, there's nowhere to go without rearchitecting.

Here's the thing: Proof's workload is document-scoped. Each document is its own island. Its own collab session, its own marks, its own set of connected viewers. That's not a limitation. That's an architecture.

Why Durable Objects Fit Like a Glove

Cloudflare's Durable Objects map 1:1 to Proof's document model. Each document gets its own DO instance with:

  • Its own SQLite database for Yjs snapshots, events, idempotency keys, and access tokens
  • Its own WebSocket handler, single-threaded, so no two requests mutate a document's state in parallel
  • Automatic hibernation with zero cost when nobody's editing
  • Global placement where the DO spins up near the first connecting client
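
To make the 1:1 mapping concrete, here's a minimal sketch of how a Worker might pick the Durable Object for a request. `idFromName` and `get` are real methods on a Durable Object namespace, and the same name always yields the same object. Everything else is an assumption for illustration: the `/d/<slug>` URL scheme, the function names, and the simplified stand-in types (a real stub's `fetch` takes a `Request` and returns a `Promise<Response>`).

```typescript
// Simplified stand-ins for the Workers runtime types so the sketch runs anywhere.
// In a real Worker these would be DurableObjectNamespace / DurableObjectStub.
interface DocStub {
  fetch(path: string): string; // simplified: real stubs take a Request, return Promise<Response>
}
interface DocNamespace {
  idFromName(name: string): string; // real API returns a DurableObjectId
  get(id: string): DocStub;
}

// Hypothetical URL scheme: /d/<slug>/... (Proof's actual routes may differ).
function documentSlugFromPath(pathname: string): string | null {
  const match = /^\/d\/([a-z0-9-]+)(?:\/|$)/.exec(pathname);
  return match?.[1] ?? null;
}

// Forward a request path to the one object that owns the document.
function routeToDocument(ns: DocNamespace, pathname: string): string {
  const slug = documentSlugFromPath(pathname);
  if (slug === null) return "404";
  const id = ns.idFromName(slug); // deterministic: same slug, same object
  return ns.get(id).fetch(pathname);
}
```

Because the ID is derived from the slug, the Worker needs no lookup table to find a document's object; every request for the same document lands on the same instance.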

The failure isolation is meaningful too. If a DO crashes, one document is affected. Not everything. The Express server doesn't have that luxury.

Cold starts tell the story quickly. The Express container takes 2-5 seconds to boot. A Worker starts in ~25ms. A hibernated DO wakes in under 50ms. For a collaborative editor where someone clicks a link and expects to start typing, you feel that difference.

|                     | Express on Railway          | Workers + Durable Objects                 |
|---------------------|-----------------------------|-------------------------------------------|
| Scaling             | Vertical (bigger container) | Per-document, auto-distributed            |
| Cold start          | ~2-5s container startup     | ~25ms Worker, <50ms DO wake               |
| Idle cost           | Always-on container         | Hibernates when unused                    |
| Blast radius        | Process crash = all docs    | DO crash = one doc                        |
| State               | One SQLite file, all docs   | DO SQLite per document (up to 10 GB each) |
| Global distribution | Single region               | Edge-placed automatically                 |
| Horizontal scaling  | Architecture rewrite        | Already built in                          |

The cost difference is stark and it scales in the right direction:

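| Scale                       | Cloudflare Workers | Railway                             |
|-----------------------------|--------------------|-------------------------------------|
| 100 docs, 10 editors        | ~$5/mo             | ~$25/mo                             |
| 1,000 docs, 50 editors      | ~$5-7/mo           | ~$35-50/mo                          |
| 10,000 docs, 500 editors    | ~$10-20/mo         | ~$100-200/mo (multiple replicas)    |
| 100,000 docs, 5,000 editors | ~$50-100/mo        | Architecture rewrite required       |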

The Workers paid plan ($5/mo) includes 10M requests, 400K GB-s of DO compute, 25B SQLite row reads, and 50M row writes. Most moderate Proof deployments won't exceed the included tiers. Railway charges for always-on containers regardless of load, plus $0.05/GB egress. With Cloudflare, static assets (the editor bundle, HTML, images) are served free from their CDN.
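
If you want to sanity-check your own deployment against those included amounts, a rough sketch like this works. The allowances are the ones quoted above; the `Usage` shape, the function names, and the example month are made up for illustration, and overage pricing is deliberately left out, so check Cloudflare's pricing page for current rates.

```typescript
// Monthly allowances quoted above for the $5/mo Workers paid plan.
const INCLUDED = {
  requests: 10_000_000,
  doComputeGbSeconds: 400_000,
  sqliteRowReads: 25_000_000_000,
  sqliteRowWrites: 50_000_000,
} as const;

type Usage = { [K in keyof typeof INCLUDED]: number };

// Fraction of each allowance consumed; a value above 1 means overage billing applies.
function allowanceUsed(usage: Usage): Usage {
  return {
    requests: usage.requests / INCLUDED.requests,
    doComputeGbSeconds: usage.doComputeGbSeconds / INCLUDED.doComputeGbSeconds,
    sqliteRowReads: usage.sqliteRowReads / INCLUDED.sqliteRowReads,
    sqliteRowWrites: usage.sqliteRowWrites / INCLUDED.sqliteRowWrites,
  };
}

// True when every metered dimension stays inside the included tier.
function fitsIncludedTier(usage: Usage): boolean {
  const used = allowanceUsed(usage);
  return (
    used.requests <= 1 &&
    used.doComputeGbSeconds <= 1 &&
    used.sqliteRowReads <= 1 &&
    used.sqliteRowWrites <= 1
  );
}

// Hypothetical month for a small deployment (illustrative numbers, not measurements).
const exampleMonth: Usage = {
  requests: 2_000_000,
  doComputeGbSeconds: 100_000,
  sqliteRowReads: 500_000_000,
  sqliteRowWrites: 5_000_000,
};
```

Row writes are usually the dimension to watch for an editor, since the allowance is the smallest relative to how often a busy document persists updates.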

The key insight: inactive documents cost nothing in compute. DOs hibernate when all clients disconnect. You only pay compute for documents people are actually using.
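
For a sense of what a hibernation-friendly document room looks like, here's a minimal sketch assuming the port uses Cloudflare's WebSocket Hibernation API (I'm not showing the actual port code here). `acceptWebSocket`, `getWebSockets`, and the `webSocketMessage` handler are real names from that API. The `DocumentRoom` class, the `connect` method, and the stand-in interfaces are hypothetical so the sketch runs outside the Workers runtime.

```typescript
// Stand-ins for the runtime's WebSocket and DurableObjectState (illustrative only).
interface SocketLike {
  send(message: string | ArrayBuffer): void;
}
interface HibernationCtx {
  acceptWebSocket(ws: SocketLike): void; // hands the socket to the runtime
  getWebSockets(): SocketLike[];         // sockets the runtime is holding for this object
}

// Hypothetical per-document room. Sockets are never stored on the instance,
// so nothing is lost when the object is evicted from memory between messages.
class DocumentRoom {
  private readonly ctx: HibernationCtx;

  constructor(ctx: HibernationCtx) {
    this.ctx = ctx;
  }

  // Register a new client connection with the runtime.
  connect(ws: SocketLike): void {
    this.ctx.acceptWebSocket(ws);
  }

  // The runtime invokes this handler when any accepted socket receives a message.
  // Relay the update to every other connected client.
  webSocketMessage(sender: SocketLike, message: string | ArrayBuffer): void {
    for (const peer of this.ctx.getWebSockets()) {
      if (peer !== sender) peer.send(message);
    }
  }
}
```

The design choice that matters is asking the runtime for the socket list on every message instead of caching it, which is what lets the object sleep without dropping anyone.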

How I Built It (Agents All the Way Down)

I used Claude Code for the entire build. But I didn't just point an agent at the codebase and say "go." The Proof SDK repo is substantial: a Milkdown/ProseMirror editor, Express server, Yjs CRDT collaboration, 25+ editor plugins, a full agent bridge with 27 HTTP endpoints. An agent diving in cold would burn context reading the same files repeatedly and miss conventions.

So I started by having agents do a deep exploration of the entire codebase. From that, I built two things:

  1. A proof-development skill, a structured reference covering the monorepo architecture, editor internals, server routes, agent bridge protocol, and data model. Agents load this once and have the full picture.
  2. A proof-dev subagent, a dedicated agent that understands Proof's conventions, can answer questions about the codebase, and can do development work that conforms to the project's patterns.

This meant that when it came time to actually write the Cloudflare Worker, agents weren't starting from scratch every time. They had a map. The skill and subagent reduced context churn, kept code consistent with upstream conventions, and made the whole process dramatically faster.

I also put together a CLAUDE.md and .claude/ configuration for the project, things the upstream repo doesn't have yet. Those aren't in this PR, but I'm happy to contribute them separately if the team's interested.

Cloudflare's own agent documentation helped too. They publish skills that make it straightforward for agents to understand Workers, Durable Objects, D1, and the rest of their platform. When your agent can read the docs natively, the code it writes is better.

The Result: Full Parity

The Cloudflare Worker implements the complete Proof SDK surface:

  • Real-time collaborative editing via Hocuspocus-compatible Yjs WebSocket sync
  • All 27 agent bridge endpoints including state, snapshot, edits (v1 and v2), marks CRUD, presence, and events
  • Document creation with slug generation and a D1 catalog
  • SPA hosting with static asset serving via Cloudflare Assets
  • Auth, idempotency, and role-based access matching the Express server's behavior
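
Idempotency is the piece agents lean on most, since a retried request must not apply an edit twice. Here's a generic sketch of the pattern, not Proof's actual implementation: the `IdempotencyStore` class and `StoredResult` shape are hypothetical, and an in-memory `Map` stands in for the per-document SQLite table that holds the keys.

```typescript
// Hypothetical shape of a recorded response.
interface StoredResult {
  status: number;
  body: string;
}

// Runs a handler at most once per idempotency key and replays the stored
// result for any repeat of the same key.
class IdempotencyStore {
  // Stand-in for a SQLite table keyed by idempotency key.
  private readonly results: Map<string, StoredResult> = new Map();

  run(key: string, handler: () => StoredResult): StoredResult {
    const previous = this.results.get(key);
    if (previous !== undefined) return previous; // retry: replay, do not re-apply
    const result = handler();
    this.results.set(key, result);
    return result;
  }
}
```

Because each document's object is single-threaded, the check and the write happen without another request interleaving between them, which is what makes this simple version safe in that setting.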

For all intents and purposes, the experience is identical to the Express deployment. Same API surface. Same editor. Same agent contract. Different infrastructure underneath.

Try It

The worker is live right now:

proof-sdk-cloudflare.galligan.workers.dev

Create a document, open it in a couple tabs, watch the real-time sync. Hit the agent endpoints. It's the same Proof experience, running on Cloudflare's edge.

The code is on my fork: galligan/proof-sdk PR #1.

What's Next

I'm doing a bit more testing, and then I plan to submit this upstream to EveryInc/proof-sdk. The goal isn't to replace the Express server. It's to give people a second deployment option that scales differently and costs less. An alternative on-ramp.

More broadly, I think what Dan and the Every team have built here matters. The tools for humans and agents to collaborate on text are still surprisingly primitive. Proof is one of the first projects that takes that workflow seriously, keeps it simple, and opens it up for anyone to build on.

You can just do things. Open source is awesome.

If you have questions or feedback, find me on X (@mg). And if you're building with Proof or thinking about it, I'd love to hear what you're working on.
