@colindotfun
Created September 23, 2025 00:06
How Traces Work

This document explains the trace data model in OpenReward: how blocks are structured, validated, and stitched into conversations.


📦 Trace Basics

  • A trace is a log of an agent run.
  • Each trace contains a sequence of blocks.
  • Blocks are atomic units: messages, tool calls, results, or reasoning steps.
  • Blocks are immutable once written.

🧱 Block Model

Each block is stored in the trace_blocks table:

Field            Type                                              Notes
id               string                                            Unique id (tb_...)
trace_id         string                                            Foreign key to trace
block_type       enum: MESSAGE | ACT | OBSERVE                     Coarse bucket for fast filters
sub_type         enum: MESSAGE | TOOL_CALL | TOOL_RESULT | THINK   Semantic meaning
payload          JSON                                              Client-owned data, validated by server
parent_block_id  string?                                           Links child → parent
metadata         JSON                                              Server annotations
raw              JSON                                              Raw provider payloads
extra            JSON                                              Freeform client space (e.g. reward scores)
created_at       timestamp                                         Insertion time
updated_at       timestamp                                         Auto-updated
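
The table maps fairly directly onto types. The following is a minimal TypeScript sketch, not the server's actual schema. It assumes the JSON columns hold arbitrary objects and that timestamps are serialized as ISO 8601 strings, neither of which is stated above. The names `TraceBlock`, `Json`, and the `tr_example` id are hypothetical.

```typescript
// Hypothetical type names; field names and enum values come from the table above.
type BlockType = "MESSAGE" | "ACT" | "OBSERVE";
type SubType = "MESSAGE" | "TOOL_CALL" | "TOOL_RESULT" | "THINK";
type Json = Record<string, unknown>;

interface TraceBlock {
  id: string;                      // unique id, prefixed "tb_"
  trace_id: string;                // foreign key to the owning trace
  block_type: BlockType;           // coarse bucket for fast filters
  sub_type: SubType;               // semantic meaning
  payload: Json;                   // client-owned, validated by the server
  parent_block_id?: string | null; // links child -> parent; absent on root messages
  metadata: Json;                  // server annotations
  raw: Json;                       // raw provider payloads
  extra: Json;                     // freeform client space, e.g. reward scores
  created_at: string;              // assumed ISO 8601 string
  updated_at: string;              // assumed ISO 8601 string
}

// A root message block; "tr_example" is a made-up trace id.
const example: TraceBlock = {
  id: "tb_msg123",
  trace_id: "tr_example",
  block_type: "MESSAGE",
  sub_type: "MESSAGE",
  payload: { role: "user", content: "what's the weather?" },
  metadata: {},
  raw: {},
  extra: {},
  created_at: "2025-09-23T00:06:00Z",
  updated_at: "2025-09-23T00:06:00Z",
};
```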

BlockType vs SubType

  • block_type = coarse category for UI lanes:

    • MESSAGE → chat messages
    • ACT → tool calls + think steps
    • OBSERVE → tool results
  • sub_type = finer meaning:

    • MESSAGE: payload.role ∈ {system, user, assistant}
    • TOOL_CALL: invocation of a tool
    • TOOL_RESULT: result from a tool
    • THINK: reasoning text (optional, under a message)
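
Because every sub_type falls into exactly one block_type, the coarse bucket can be derived from the finer one. A small sketch of that lookup follows; the function name `blockTypeFor` is hypothetical, and whether the server derives block_type or expects the client to send it is not stated here.

```typescript
type BlockType = "MESSAGE" | "ACT" | "OBSERVE";
type SubType = "MESSAGE" | "TOOL_CALL" | "TOOL_RESULT" | "THINK";

// Lookup table taken from the two lists above.
const BLOCK_TYPE_FOR: Record<SubType, BlockType> = {
  MESSAGE: "MESSAGE",     // chat messages
  TOOL_CALL: "ACT",       // tool calls
  THINK: "ACT",           // think steps
  TOOL_RESULT: "OBSERVE", // tool results
};

// Hypothetical helper: returns the UI lane for a given sub_type.
function blockTypeFor(subType: SubType): BlockType {
  return BLOCK_TYPE_FOR[subType];
}
```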

🔗 Parent Rules

Server enforces strict invariants:

  • MESSAGE: no parent_block_id
  • TOOL_CALL: parent must be a MESSAGE
  • THINK: parent must be a MESSAGE
  • TOOL_RESULT: parent must be a TOOL_CALL

No orphans; no cross-trace parents.
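
The invariants above can be expressed as a single lookup plus a few checks. This is an illustrative sketch rather than the server's implementation: `BlockRef`, `checkParent`, and the returned message strings are hypothetical, and the real API reports violations through its own error codes.

```typescript
type SubType = "MESSAGE" | "TOOL_CALL" | "TOOL_RESULT" | "THINK";

// Only the fields the parent rules need.
interface BlockRef {
  id: string;
  trace_id: string;
  sub_type: SubType;
  parent_block_id?: string | null;
}

// Required parent sub_type for each child; null means "must not have a parent".
const REQUIRED_PARENT: Record<SubType, SubType | null> = {
  MESSAGE: null,
  TOOL_CALL: "MESSAGE",
  THINK: "MESSAGE",
  TOOL_RESULT: "TOOL_CALL",
};

// Returns null when the block satisfies the rules, otherwise a description
// of the first violation found.
function checkParent(block: BlockRef, byId: Map<string, BlockRef>): string | null {
  const required = REQUIRED_PARENT[block.sub_type];
  if (required === null) {
    return block.parent_block_id ? "MESSAGE must not have a parent" : null;
  }
  if (!block.parent_block_id) return block.sub_type + " requires a parent";
  const parent = byId.get(block.parent_block_id);
  if (!parent) return "parent not found (orphan)";
  if (parent.trace_id !== block.trace_id) return "parent belongs to another trace";
  if (parent.sub_type !== required) return "parent must be " + required;
  return null;
}
```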


📏 Payload Validation

The server validates payloads and applies hard byte limits (tunable via environment variables):

SubType      Requirements                                                    Default Limit
MESSAGE      payload.role ∈ {system, user, assistant}, non-empty content     64 KB
TOOL_CALL    Valid name; arguments parsed to JSON; must fit within limit     256 KB
THINK        Non-empty text                                                  32 KB
TOOL_RESULT  Exactly one of output | delta; optional seq (non-negative int)  2 MB
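
The Requirements column can be sketched as a per-sub_type check. This is an illustration under assumptions, not the server's validator: it treats "valid name" as "non-empty string", assumes content and text are strings, and assumes arguments may arrive either as a JSON string or as an already-parsed value. `validatePayload` and its messages are hypothetical. Byte limits are left out here.

```typescript
type SubType = "MESSAGE" | "TOOL_CALL" | "TOOL_RESULT" | "THINK";
type Payload = Record<string, unknown>;

const ROLES = ["system", "user", "assistant"];

function isNonEmptyString(v: unknown): v is string {
  return typeof v === "string" && v.length > 0;
}

// Returns a list of problems; an empty list means the payload meets the
// requirements for its sub_type.
function validatePayload(subType: SubType, payload: Payload): string[] {
  const problems: string[] = [];
  switch (subType) {
    case "MESSAGE":
      if (typeof payload.role !== "string" || ROLES.indexOf(payload.role) === -1) {
        problems.push("role must be system, user, or assistant");
      }
      if (!isNonEmptyString(payload.content)) problems.push("content must be non-empty");
      break;
    case "TOOL_CALL":
      // Assumption: a "valid" name is any non-empty string.
      if (!isNonEmptyString(payload.name)) problems.push("name must be a non-empty string");
      if (typeof payload.arguments === "string") {
        try {
          JSON.parse(payload.arguments);
        } catch {
          problems.push("arguments must parse as JSON");
        }
      }
      break;
    case "THINK":
      if (!isNonEmptyString(payload.text)) problems.push("text must be non-empty");
      break;
    case "TOOL_RESULT": {
      const hasOutput = payload.output !== undefined;
      const hasDelta = payload.delta !== undefined;
      if (hasOutput === hasDelta) problems.push("exactly one of output or delta is required");
      const seq = payload.seq;
      if (seq !== undefined && (typeof seq !== "number" || !Number.isInteger(seq) || seq < 0)) {
        problems.push("seq must be a non-negative integer");
      }
      break;
    }
  }
  return problems;
}
```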

🚫 Over-limit Behavior

  • If a field exceeds its byte limit, the API returns:
{
  "error": {
    "code": "PAYLOAD_TOO_LARGE",
    "http_status": 413,
    "message": "tool_result output exceeds 2MB limit",
    "details": {
      "sub_type": "TOOL_RESULT",
      "field": "output",
      "limit_bytes": 2097152,
      "actual_bytes": 3987654,
      "trace_id": "tr_...",
      "parent_block_id": "tb_..."
    }
  }
}
  • No truncation. Either the payload fits, or the request fails.
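
A sketch of the fit-or-fail check that produces an error of the shape shown above. How the server actually measures size is not specified; this example assumes the UTF-8 byte length of the string (or of the JSON serialization for non-strings). `checkLimit`, `utf8ByteLength`, and the generated message wording are hypothetical.

```typescript
type SubType = "MESSAGE" | "TOOL_CALL" | "TOOL_RESULT" | "THINK";

interface PayloadTooLarge {
  error: {
    code: "PAYLOAD_TOO_LARGE";
    http_status: 413;
    message: string;
    details: {
      sub_type: SubType;
      field: string;
      limit_bytes: number;
      actual_bytes: number;
      trace_id: string;
      parent_block_id?: string;
    };
  };
}

// UTF-8 byte length without relying on platform APIs such as TextEncoder.
function utf8ByteLength(s: string): number {
  let bytes = 0;
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    const next = s.charCodeAt(i + 1);
    if (c < 0x80) bytes += 1;
    else if (c < 0x800) bytes += 2;
    else if (c >= 0xd800 && c <= 0xdbff && next >= 0xdc00 && next <= 0xdfff) {
      bytes += 4; // surrogate pair encodes one 4-byte code point
      i++;
    } else bytes += 3;
  }
  return bytes;
}

// Returns null when the value fits, otherwise the error body. Nothing is truncated.
function checkLimit(args: {
  subType: SubType;
  field: string;
  value: unknown;
  limitBytes: number;
  traceId: string;
  parentBlockId?: string;
}): PayloadTooLarge | null {
  const text = typeof args.value === "string" ? args.value : JSON.stringify(args.value) ?? "";
  const actual = utf8ByteLength(text);
  if (actual <= args.limitBytes) return null;
  return {
    error: {
      code: "PAYLOAD_TOO_LARGE",
      http_status: 413,
      message: `${args.subType.toLowerCase()} ${args.field} exceeds ${args.limitBytes} byte limit`,
      details: {
        sub_type: args.subType,
        field: args.field,
        limit_bytes: args.limitBytes,
        actual_bytes: actual,
        trace_id: args.traceId,
        parent_block_id: args.parentBlockId,
      },
    },
  };
}
```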

🌳 Stitching

Endpoint: GET /v1/organizations/:org/traces/:traceId/blocks.stitched

  • Returns a tree view of the trace
  • Structure:
message
 ├─ think
 ├─ tool_call
 │    └─ tool_result(s)
 ├─ tool_call
 │    └─ tool_result(s)
  • Tool results under each tool call are sorted by:

    1. payload.seq (if present, nulls last)
    2. created_at
    3. id
  • orphans.tool_calls and orphans.tool_results are included for debugging (should be empty if invariants hold)
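
The stitching and ordering rules can be sketched client-side as follows. This is not the endpoint's implementation, and the response shape is an assumption: only `orphans.tool_calls` and `orphans.tool_results` are named above, while `messages`, `thinks`, `tool_calls`, and `tool_results` as node field names are hypothetical. It also assumes created_at values are ISO 8601 strings in one format, so string comparison matches time order.

```typescript
type SubType = "MESSAGE" | "TOOL_CALL" | "TOOL_RESULT" | "THINK";
interface Block {
  id: string;
  sub_type: SubType;
  parent_block_id?: string | null;
  payload: Record<string, unknown>;
  created_at: string;
}
interface ToolCallNode { block: Block; tool_results: Block[] }
interface MessageNode { block: Block; thinks: Block[]; tool_calls: ToolCallNode[] }
interface Stitched {
  messages: MessageNode[];
  orphans: { tool_calls: Block[]; tool_results: Block[] };
}

function seqOf(b: Block): number | null {
  const s = b.payload.seq;
  return typeof s === "number" ? s : null;
}

// Order: payload.seq ascending (nulls last), then created_at, then id.
function compareResults(a: Block, b: Block): number {
  const sa = seqOf(a);
  const sb = seqOf(b);
  if (sa !== sb) {
    if (sa === null) return 1;
    if (sb === null) return -1;
    return sa - sb;
  }
  if (a.created_at !== b.created_at) return a.created_at < b.created_at ? -1 : 1;
  return a.id < b.id ? -1 : a.id > b.id ? 1 : 0;
}

function stitch(blocks: Block[]): Stitched {
  const messages = new Map<string, MessageNode>();
  const calls = new Map<string, ToolCallNode>();
  const out: Stitched = { messages: [], orphans: { tool_calls: [], tool_results: [] } };
  for (const b of blocks) {
    if (b.sub_type !== "MESSAGE") continue;
    const node: MessageNode = { block: b, thinks: [], tool_calls: [] };
    messages.set(b.id, node);
    out.messages.push(node);
  }
  for (const b of blocks) {
    if (b.sub_type !== "TOOL_CALL" && b.sub_type !== "THINK") continue;
    const parent = messages.get(b.parent_block_id ?? "");
    if (b.sub_type === "THINK") {
      // No orphan bucket for THINK is described, so unparented ones are skipped here.
      if (parent) parent.thinks.push(b);
      continue;
    }
    const node: ToolCallNode = { block: b, tool_results: [] };
    calls.set(b.id, node);
    if (parent) parent.tool_calls.push(node);
    else out.orphans.tool_calls.push(b);
  }
  for (const b of blocks) {
    if (b.sub_type !== "TOOL_RESULT") continue;
    const call = calls.get(b.parent_block_id ?? "");
    if (call) call.tool_results.push(b);
    else out.orphans.tool_results.push(b);
  }
  calls.forEach((c) => c.tool_results.sort(compareResults));
  return out;
}
```

Using id as the final tie-breaker is what makes the order deterministic even when two results share a seq and a timestamp.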


🚨 Errors

Common error codes:

  • 422 VALIDATION → payload missing required fields, wrong parent type, invalid role
  • 413 PAYLOAD_TOO_LARGE → size limit exceeded
  • 409 PARENT_SUBTYPE_MISMATCH → parent is not of the allowed sub_type
  • 409 DUPLICATE_CALL_ID → duplicate TOOL_CALL with same (trace, call_id)
  • 409 DUPLICATE_RESULT_SEQ → duplicate TOOL_RESULT with same (trace, call_id, seq)
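
The two duplicate codes amount to uniqueness keys over (trace, call_id) and (trace, call_id, seq). A small in-memory sketch of that idea follows; the `DuplicateIndex` class is hypothetical, a real server would more likely rely on database constraints, and skipping the check for results without a seq is an assumption.

```typescript
type DuplicateCode = "DUPLICATE_CALL_ID" | "DUPLICATE_RESULT_SEQ";

interface Incoming {
  trace_id: string;
  sub_type: "TOOL_CALL" | "TOOL_RESULT";
  payload: { call_id: string; seq?: number };
}

// Hypothetical in-memory index of keys already written.
class DuplicateIndex {
  private seen = new Set<string>();

  // Returns the error code for a duplicate, or null after recording a new key.
  check(block: Incoming): DuplicateCode | null {
    let key: string;
    let code: DuplicateCode;
    if (block.sub_type === "TOOL_CALL") {
      key = JSON.stringify(["call", block.trace_id, block.payload.call_id]);
      code = "DUPLICATE_CALL_ID";
    } else {
      // Assumption: results without a seq are not subject to the uniqueness check.
      if (block.payload.seq === undefined) return null;
      key = JSON.stringify(["result", block.trace_id, block.payload.call_id, block.payload.seq]);
      code = "DUPLICATE_RESULT_SEQ";
    }
    if (this.seen.has(key)) return code;
    this.seen.add(key);
    return null;
  }
}
```

Serializing the key parts as a JSON array avoids collisions that a plain delimiter could cause if an id happened to contain that delimiter.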

🛠 SDK Usage

Helpers map directly to block types:

// Assumes each helper resolves to the created block, so its id can be
// used as parent_block_id for children.

// root message
const message = await trace.logMessage({
  role: "user", // one of "system" | "user" | "assistant"
  content: "some text",
  raw: { providerPayload },
});

// tool call (ACT)
const toolCall = await trace.logAct({
  parent_block_id: message.id,
  payload: { call_id, name, arguments: args }, // args: the tool's JSON arguments
});

// tool result (OBSERVE)
await trace.logObserve({
  parent_block_id: toolCall.id,
  payload: { call_id, output },
  extra: { reward: 1 },
});

// think (ACT)
await trace.logThink({
  parent_block_id: message.id,
  payload: { text: "reasoning ..." },
});

βš™οΈ Env Defaults

LIMIT_MSG_BYTES=65536
LIMIT_THINK_BYTES=32768
LIMIT_TOOL_ARGS_BYTES=262144
LIMIT_TOOL_RESULT_BYTES=2097152
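
A sketch of how these variables might be read with the defaults above as fallbacks. The variable names and default values come from this section; `loadLimits`, `readLimit`, the `Limits` shape, and the choice to throw on a malformed value are assumptions, since the server's behavior for invalid settings is not described.

```typescript
interface Limits {
  message: number;
  think: number;
  toolArgs: number;
  toolResult: number;
}

type Env = Record<string, string | undefined>;

// Reads one limit; falls back to the default when the variable is unset or blank.
function readLimit(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const n = Number(raw);
  if (!Number.isInteger(n) || n <= 0) {
    // Assumption: a malformed limit is a configuration error worth failing on.
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return n;
}

function loadLimits(env: Env): Limits {
  return {
    message: readLimit(env, "LIMIT_MSG_BYTES", 65536),               // 64 KB
    think: readLimit(env, "LIMIT_THINK_BYTES", 32768),               // 32 KB
    toolArgs: readLimit(env, "LIMIT_TOOL_ARGS_BYTES", 262144),       // 256 KB
    toolResult: readLimit(env, "LIMIT_TOOL_RESULT_BYTES", 2097152),  // 2 MB
  };
}
```

In Node.js this would typically be called as `loadLimits(process.env)`; taking the environment as a parameter keeps the function easy to test.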

📋 Example Trace

[
  {
    "id": "tb_msg123",
    "block_type": "MESSAGE",
    "sub_type": "MESSAGE",
    "payload": { "role": "user", "content": "what's the weather?" }
  },
  {
    "id": "tb_call123",
    "block_type": "ACT",
    "sub_type": "TOOL_CALL",
    "parent_block_id": "tb_msg123",
    "payload": {
      "call_id": "call_1",
      "name": "get_weather",
      "arguments": { "city": "bogotá" }
    }
  },
  {
    "block_type": "OBSERVE",
    "sub_type": "TOOL_RESULT",
    "parent_block_id": "tb_call123",
    "payload": {
      "call_id": "call_1",
      "output": { "forecast": "22°C cloudy" }
    }
  },
  {
    "block_type": "ACT",
    "sub_type": "THINK",
    "parent_block_id": "tb_msg123",
    "payload": { "text": "decide to show forecast in celsius" }
  }
]

✅ Summary

  • Blocks are containers + links
  • block_type = UI bucket, sub_type = semantic meaning
  • payload holds client-owned data; the server validates it and enforces hard size limits
  • parent_block_id defines the trace tree
  • Stitched endpoint assembles tree with deterministic ordering
  • SDK provides ergonomic helpers