Three variations targeting different reader motivations. All technically grounded in Workflow DevKit documentation and following Vercel blog editorial conventions.
Format: Deep-dive
Primary keyword: durable AI streaming
AI streaming breaks when users do ordinary things. A tab reloads. A laptop sleeps. A mobile network switches towers. The model keeps generating, but the user sees a stalled response and assumes the product failed.
We built Vercel Workflow to close this gap. Workflow DevKit makes AI streams durable — the work keeps running when the client disconnects, and the client reconnects without starting over. One team migrated an existing AI chat app from ephemeral to durable streaming and picked up automatic retries, observability, and local debugging in the process.
The default AI SDK pattern ties the response to a single HTTP connection. Open a connection, stream tokens, render them. Three lines of code and you have a working chat experience.
The failure modes surface once the feature has real users:
- Page refresh kills the in-progress response. The server may still be generating, but the client has no way back in.
- Network interruption on mobile — switching from Wi-Fi to cellular, entering a tunnel — drops the connection permanently.
- Serverless function timeout at 30 seconds cuts off long agent responses mid-stream.
- Retries require custom code and risk duplicate work on the server.
Each of these forces teams to build custom recovery infrastructure. Before long, the reliability layer is bigger than the feature it protects.
{TODO: image — side-by-side diagram showing ephemeral streaming (connection breaks, response lost) vs durable streaming (connection breaks, run continues, client reconnects)}
Workflow DevKit uses event sourcing to make execution durable. Every step produces events (step_created, step_started, step_completed) persisted to an append-only log. When a workflow is interrupted and replayed, the framework reads the log, skips completed steps, and resumes from the exact point of interruption.
Streams are part of this model. They are backed by persistent storage — Redis on Vercel, filesystem locally — and identified by a run ID. The stream exists independently of the HTTP connection. If the browser disappears, the workflow continues. When the client reconnects, it picks up from the last chunk it received.
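The replay mechanics are worth pinning down with a sketch. The following is a self-contained illustration of the event-sourcing idea above, not the DevKit's internals; the types and function names are invented for the example:

```typescript
// Illustrative replay over an append-only log (not Workflow DevKit code).
// Steps already recorded as completed are skipped; only the remainder runs.
type LogEvent =
  | { type: "step_started"; step: string }
  | { type: "step_completed"; step: string; result: unknown };

function replay(
  log: LogEvent[],
  steps: Record<string, () => unknown>
): { executed: string[]; results: Record<string, unknown> } {
  // Recover results of steps that finished before the interruption.
  const results: Record<string, unknown> = {};
  for (const event of log) {
    if (event.type === "step_completed") results[event.step] = event.result;
  }

  // Resume: run only the steps with no completion event in the log.
  const executed: string[] = [];
  for (const name of Object.keys(steps)) {
    if (name in results) continue; // already done; never re-executed
    results[name] = steps[name]!();
    executed.push(name);
  }
  return { executed, results };
}
```

A run interrupted after its first step replays with that step's recorded result and executes only the remaining work.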
The migration is three changes to an existing AI SDK app.
DurableAgent replaces the AI SDK's standard Agent. It executes each LLM call as a durable step with automatic retries — 3 by default, configurable per function. Completed calls are never repeated on replay.
The "use workflow" directive tells the build system to transform the function for durable execution. getWritable() returns a persistent stream attached to the workflow run, not to the HTTP response.
```ts
import { DurableAgent } from "@workflow/ai/agent";
import { getWritable } from "workflow";
import type { ModelMessage, UIMessageChunk } from "ai";

export async function chatWorkflow(messages: ModelMessage[]) {
  "use workflow";

  const agent = new DurableAgent({
    model: "anthropic/claude-haiku-4.5",
    system: "You are a helpful assistant.",
  });

  await agent.stream({
    messages,
    writable: getWritable<UIMessageChunk>(),
  });
}
```

The API route starts the workflow and passes the run ID back in a response header. A second endpoint handles reconnection by looking up the existing run and returning its stream from a specific position using `startIndex`:
```ts
import { convertToModelMessages, createUIMessageStreamResponse } from "ai";
import { start } from "workflow/api";
import { chatWorkflow } from "@/workflows/chat/workflow";

export async function POST(req: Request) {
  const { messages } = await req.json();
  const run = await start(chatWorkflow, [convertToModelMessages(messages)]);

  return createUIMessageStreamResponse({
    stream: run.readable,
    headers: { "x-workflow-run-id": run.runId },
  });
}
```

WorkflowChatTransport is a drop-in replacement for the AI SDK's default transport. It stores the run ID, detects when a "finish" chunk is missing (indicating an interrupted stream), and automatically reconnects using the startIndex of the last chunk the client received.
```ts
import { useChat } from "@ai-sdk/react";
import { WorkflowChatTransport } from "@workflow/ai";

// activeRunId: the run ID of an in-flight response, restored from
// localStorage on mount, or undefined when nothing is in progress.
const { messages, sendMessage } = useChat({
  resume: Boolean(activeRunId),
  transport: new WorkflowChatTransport({
    api: "/api/chat",
    onChatSendMessage: (response) => {
      const runId = response.headers.get("x-workflow-run-id");
      if (runId) localStorage.setItem("active-run-id", runId);
    },
    onChatEnd: () => localStorage.removeItem("active-run-id"),
    prepareReconnectToStreamRequest: ({ api, ...rest }) => {
      const runId = localStorage.getItem("active-run-id");
      if (!runId) throw new Error("No active run ID");
      return { ...rest, api: `/api/chat/${encodeURIComponent(runId)}/stream` };
    },
  }),
});
```

After these three changes, users can refresh the page mid-stream and the response picks up where it left off.
The migration adds more than reconnection. Because every step produces events, observability is built in, with no additional logging infrastructure to stand up.
Retries are automatic. Steps retry up to 3 times by default. For external APIs that need backoff, throw a RetryableError with a retryAfter duration. For errors that should not be retried, FatalError fails the step immediately. getStepMetadata() exposes the current attempt number for custom exponential backoff.
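As a mental model for those semantics, here is a self-contained sketch of a retry loop. The class names mirror the documented RetryableError and FatalError; the loop itself, the signatures, and the backoff handling are assumptions for illustration, not the DevKit implementation:

```typescript
// Sketch of step retry semantics (illustrative; not the DevKit's code).
class RetryableError extends Error {
  constructor(message: string, public retryAfterMs = 0) {
    super(message);
  }
}
class FatalError extends Error {}

async function runStep<T>(
  fn: (attempt: number) => Promise<T>,
  maxAttempts = 3 // the documented default is 3; the exact counting here is an assumption
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      // FatalError fails immediately; anything else retries until exhausted.
      if (err instanceof FatalError || attempt >= maxAttempts) throw err;
      const delay = err instanceof RetryableError ? err.retryAfterMs : 0;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

A step that throws RetryableError is re-attempted after its hint; a step that throws FatalError surfaces the error on the first attempt.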
The Web UI shows the full step trace. Run `npx workflow inspect runs --web` to see every run with its step status, duration, retry count, and stream output. The CLI exposes the same data for scripting.
{TODO: image — screenshot of Workflow Web UI showing a run with step trace, including a retried step}
Human-in-the-loop is built in. For agent workflows that need confirmations or approvals, hooks pause execution and wait for external input. Use defineHook() with a Zod schema for type-safe payloads.
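Conceptually, a hook is a typed promise that outside input resolves. The sketch below models that concept in plain TypeScript; it is not the defineHook() API, and the validator stands in for the Zod schema:

```typescript
// Concept sketch: a hook pauses a workflow on a promise that an external
// caller (an HTTP handler, a human approval UI) later resolves.
// Not the defineHook() API; the validator plays the role of a Zod schema.
type Hook<T> = {
  wait: Promise<T>;                     // the workflow awaits this
  receive: (payload: unknown) => void;  // external input arrives here
};

function createHook<T>(validate: (p: unknown) => p is T): Hook<T> {
  let resolve!: (value: T) => void;
  let reject!: (error: Error) => void;
  const wait = new Promise<T>((res, rej) => {
    resolve = res;
    reject = rej;
  });
  return {
    wait,
    receive(payload) {
      if (validate(payload)) resolve(payload); // type-safe resume
      else reject(new Error("invalid hook payload"));
    },
  };
}
```

The workflow awaits the hook; an approval request arriving later resumes it with a validated payload.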
We built Workflow DevKit to run locally with zero configuration. The Local World stores events as JSON files, runs a queue in memory, and serves the full Web UI against local runs. No cloud account required.
This matters because long-running systems are hard to reason about when the only feedback loop is production. The step debugger shows state that console logs miss: steps that completed but returned unexpected data, streams that were written to but never closed, retries that masked a flaky external API.
The same workflow code runs on the Local World during development and the Vercel World in production. The infrastructure differs — Redis vs filesystem, distributed queue vs in-memory — but the workflow code does not change.
- Building Durable AI Agents guide — full walkthrough from AI SDK to DurableAgent
- Resumable Streams guide — step-by-step WorkflowChatTransport setup
- Flight Booking example — production-ready reference app
Format: Thought leadership
Primary keyword: local workflow debugging
There is a pattern in workflow tooling that keeps repeating. A team adopts a system for durable execution. The happy path works in a hosted dashboard. Then a failure happens, and the only debugging option is: deploy, trigger the error, read the logs, guess.
We think the debugging experience for long-running AI workflows should match what you already expect from frontend development: inspect state on your machine, reproduce the problem locally, fix it, and verify — all before it reaches production.
Most orchestration tools separate development from execution. You write workflows locally. You deploy them to a managed service. You inspect runs through a remote dashboard.
The feedback loop is minutes, not seconds. Teams stop writing small, focused workflows because the iteration cost is too high. Failures get debugged by reading logs rather than inspecting state. Edge cases in long-running flows go untested because reproducing them requires a deployed environment.
For AI agent workloads, this is worse. A DurableAgent workflow might make 5-10 LLM calls in sequence, each with tool invocations that hit external APIs. A silent failure in step 7 of 10 is nearly impossible to catch without step-level visibility. And step-level visibility that exists only in production is visibility you will not use during development.
Workflow DevKit runs the same execution model locally and in production. We use a "world" abstraction to separate workflow logic from infrastructure. The Local World stores events as JSON files in .workflow-data/, runs a queue in memory, and serves the full Web UI on your machine. The Vercel World uses Redis-backed streams, distributed queuing, and OIDC authentication. Your workflow code is identical in both.
What local development gives you:
Step-level debugging. The Web UI shows every run with its full step trace. Each step displays its status, duration, retry count, and the data it returned. If a step completed but returned unexpected data, you see it immediately.
{TODO: image — Web UI showing a step trace with a failed step highlighted, retry count visible}
Stream chunk inspection. For AI streaming workflows, the Web UI shows stream chunks as they are written. You can verify output without adding logging.
Retry simulation. Steps retry up to 3 times by default. Locally, you can watch a step fail, retry, and either succeed or exhaust its attempts — testing error-handling paths without deploying.
Event log visibility. Every workflow produces an append-only log following the event sourcing model. Events use a consistent format (run_created, step_started, step_completed, hook_received) with ULID-based entity IDs (wrun_, step_, hook_) that are lexicographically sortable by creation time.
The architecture enforces a clean boundary. Workflow functions carry the "use workflow" directive and run in a sandboxed environment that enforces determinism — required for reliable replay. Step functions carry "use step" and run with full Node.js access. Parameters pass by value between the two contexts.
This separation means orchestration logic is predictable and replayable, while step execution has full access to the runtime — npm packages, fetch, databases, external APIs. The framework handles the boundary automatically through a code transform at build time.
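A minimal sketch of that split. The directives are the real ones; the two functions are hypothetical, and without the build-time transform they execute here as ordinary async functions:

```typescript
// Hypothetical example of the workflow/step boundary. In a real project the
// build transform sandboxes the workflow body and records each step call.
export async function wordCountWorkflow(text: string) {
  "use workflow"; // deterministic orchestration, replayed from the event log
  const count = await countWords(text); // crosses the boundary by value
  return count > 3 ? "long" : "short";
}

async function countWords(text: string) {
  "use step"; // full Node.js access: npm packages, fetch, databases
  return text.trim().split(/\s+/).length;
}
```

The workflow function stays deterministic and cheap to replay, while the step function is free to do real I/O.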
When the framework replays a workflow after interruption, it reads the event log, skips completed steps, and resumes from the point of interruption. This is why local and production behavior match: the event log format is the same, the replay logic is the same. Only the storage backend differs.
AI agent workflows are inherently multi-step. A DurableAgent might make an LLM call, invoke a tool, wait for user approval via a hook, make another LLM call, and stream the result. Each operation is a step with its own retry behavior and observable state.
When you can inspect all of this on your machine — catch the step that silently returns an empty result, see the retry masking a flaky API, verify stream reconnection by refreshing the browser during a local run — you ship with more confidence and fewer surprises.
We are expanding the observability surface to include distributed tracing across workflows and cost attribution per step for LLM calls. The goal: the same visibility you expect from application monitoring, applied to every step in a durable workflow.
Format: Tutorial
Primary keyword: resumable AI streams
Your AI chat feature streams responses. Users like it. Then the bug reports arrive: refreshing the page loses the in-progress response, switching networks on mobile kills it mid-sentence, and long agent responses get cut off by serverless function timeouts.
These are not bugs in your app. They are the default behavior of ephemeral streaming — the response dies with the HTTP connection. With Vercel Workflow, you can make those streams resumable in three steps, without rewriting your app.
```
Client ──HTTP connection──▶ Server (streaming tokens)
             │
     connection breaks
             │
Client ──new request──▶ Server (starts from scratch)
```
The response is tied to the HTTP connection. When it breaks, the response is gone. The model may still be generating, but the client has no way back in.
```
Client ──HTTP connection──▶ Server (starts durable workflow run)
             │                        │
     connection breaks         run continues (backed by persistent stream)
             │                        │
Client ──reconnect via runId──▶ Server (resumes from last chunk)
```
The response is tied to a workflow run with a persistent stream. The client reconnects using the run ID and a startIndex that skips chunks it already received. No duplicate data, no restart.
{TODO: image — browser refreshing mid-stream, response continuing after reload}
Move your AI generation into a workflow function using DurableAgent. It replaces the AI SDK's standard Agent and runs each LLM call as a durable step with automatic retries (3 by default).
getWritable() returns a persistent stream attached to the workflow run. On Vercel, this stream is backed by Redis. Locally, it is stored in the filesystem. Either way, it survives client disconnects.
```ts
import { DurableAgent } from "@workflow/ai/agent";
import { getWritable } from "workflow";
import type { ModelMessage, UIMessageChunk } from "ai";

export async function chatWorkflow(messages: ModelMessage[]) {
  "use workflow";

  const writable = getWritable<UIMessageChunk>();
  const agent = new DurableAgent({
    model: "anthropic/claude-haiku-4.5",
    system: "You are a helpful assistant.",
  });

  await agent.stream({ messages, writable });
}
```

Update your API route to start the workflow and return the run ID in a response header. Then add a second endpoint that returns the stream for an existing run, starting from a specific chunk index:
```ts
import type { UIMessage } from "ai";
import { convertToModelMessages, createUIMessageStreamResponse } from "ai";
import { start } from "workflow/api";
import { chatWorkflow } from "@/workflows/chat/workflow";

export async function POST(req: Request) {
  const { messages }: { messages: UIMessage[] } = await req.json();
  const modelMessages = convertToModelMessages(messages);
  const run = await start(chatWorkflow, [modelMessages]);

  return createUIMessageStreamResponse({
    stream: run.readable,
    headers: { "x-workflow-run-id": run.runId },
  });
}
```

```ts
import { createUIMessageStreamResponse } from "ai";
import { getRun } from "workflow/api";

export async function GET(
  request: Request,
  { params }: { params: Promise<{ id: string }> }
) {
  const { id } = await params;
  const { searchParams } = new URL(request.url);
  const startIndexParam = searchParams.get("startIndex");
  const startIndex = startIndexParam
    ? parseInt(startIndexParam, 10)
    : undefined;

  const run = getRun(id);
  const stream = run.getReadable({ startIndex });

  return createUIMessageStreamResponse({ stream });
}
```

The `startIndex` parameter tells the server to skip chunks the client already received.
WorkflowChatTransport is a drop-in replacement for the default AI SDK transport. It stores the run ID from the initial response, detects when a stream is interrupted (no "finish" chunk received), and automatically reconnects through the reconnection endpoint.
```tsx
"use client";

import { useChat } from "@ai-sdk/react";
import { WorkflowChatTransport } from "@workflow/ai";
import { useMemo } from "react";

export default function ChatPage() {
  const activeRunId = useMemo(() => {
    if (typeof window === "undefined") return undefined;
    return localStorage.getItem("active-run-id") ?? undefined;
  }, []);

  const { messages, sendMessage } = useChat({
    resume: Boolean(activeRunId),
    transport: new WorkflowChatTransport({
      api: "/api/chat",
      onChatSendMessage: (response) => {
        const runId = response.headers.get("x-workflow-run-id");
        if (runId) localStorage.setItem("active-run-id", runId);
      },
      onChatEnd: () => localStorage.removeItem("active-run-id"),
      prepareReconnectToStreamRequest: ({ api, ...rest }) => {
        const runId = localStorage.getItem("active-run-id");
        if (!runId) throw new Error("No active run ID");
        return {
          ...rest,
          api: `/api/chat/${encodeURIComponent(runId)}/stream`,
        };
      },
    }),
  });

  return (
    <div>
      {messages.map((m) => (
        <div key={m.id}>
          <strong>{m.role}:</strong>{" "}
          {m.parts.map((part, i) =>
            part.type === "text" ? <span key={i}>{part.text}</span> : null
          )}
        </div>
      ))}
    </div>
  );
}
```

Run the app locally, start a chat, and refresh the page mid-stream. The response continues from where it left off. Open the Workflow Web UI to see the run trace:
```bash
npx workflow inspect runs --web
```

Each step shows its status, duration, retry attempts, and stream output.
{TODO: image — Workflow Web UI showing a completed run with step trace and stream output}
This is not a reconnection-only change. Because every step in a workflow produces durable events, the migration adds:
- Automatic retries. Each LLM call inside `DurableAgent` retries up to 3 times. For rate-limited external APIs, throw `RetryableError` with a `retryAfter` duration. For permanent failures, `FatalError` skips retries.
- Observability without extra infrastructure. The Web UI and CLI show full step traces for every run. Use `--backend vercel` to inspect production runs remotely.
- Human-in-the-loop. Use `defineHook()` with a Zod schema to pause a workflow for user approval or content review, then resume with a typed payload.
- Local debugging. The Local World runs with zero configuration — same execution model, same Web UI, same step debugger. No cloud account required.
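The reconnection arithmetic behind startIndex fits in a few lines. WorkflowChatTransport does this for you; the helper below is a hypothetical illustration, not part of the API:

```typescript
// Hypothetical helper: compute the reconnect URL from the chunks the client
// has already rendered. The next chunk needed is simply the count received.
function reconnectUrl(runId: string, receivedChunks: unknown[]): string {
  const startIndex = receivedChunks.length;
  return `/api/chat/${encodeURIComponent(runId)}/stream?startIndex=${startIndex}`;
}
```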
- Resumable Streams guide — full WorkflowChatTransport walkthrough
- Building Durable AI Agents — from AI SDK to DurableAgent
- Flight Booking example — production-ready reference app with resumable streams