
LiteLLM Anthropic Integration - Extended Thinking Fix

Overview

This document explains the dual-provider configuration for LiteLLM when using Anthropic Claude models with extended thinking enabled.

Problem Statement

When using Anthropic Claude models (Sonnet 4.5, Opus 4.5) with extended thinking enabled through LiteLLM proxy, multi-turn conversations with tool use fail with cryptographic signature validation errors.

Error Messages Observed

Invalid `signature` in `thinking` block
Expected `thinking` or `redacted_thinking`, but found `text`. When `thinking` is enabled, a final `assistant` message must start with a thinking block

Root Cause Analysis

How Extended Thinking Works

  1. When extended thinking is enabled, Anthropic's API returns thinking blocks in the assistant response
  2. Each thinking block contains a cryptographic signature generated by Anthropic's servers
  3. In multi-turn conversations, the previous assistant message (with thinking blocks) must be sent back to the API
  4. Anthropic verifies the signature to ensure thinking blocks weren't tampered with (see the sketch after this list)
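
For reference, a hedged sketch of the assistant turn described in steps 1-3, as it must be sent back on the next request. The field names follow Anthropic's Messages API; the values are placeholders, not real output:

// Illustrative only - the signature is opaque and generated by Anthropic's servers.
const previousAssistantTurn = {
  role: "assistant",
  content: [
    {
      type: "thinking",
      thinking: "Let me work out which tool to call first...",
      signature: "EqQBCgIYAhIM...", // opaque, server-generated; must be echoed back unmodified
    },
    {
      type: "tool_use",
      id: "toolu_01A...", // placeholder ID
      name: "get_weather",
      input: { city: "Cairo" },
    },
  ],
};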

Why LiteLLM's OpenAI-Compatible Endpoint Fails

The original configuration used:

  • @ai-sdk/openai-compatible SDK
  • LiteLLM's /v1/chat/completions endpoint (OpenAI format)

This causes problems because:

  1. Format Translation: LiteLLM translates Anthropic's native format to OpenAI format and back
  2. Signature Loss: The translation process loses or corrupts the cryptographic signatures on thinking blocks (illustrated after this list)
  3. Validation Failure: When the next request is sent, Anthropic rejects it because:
    • Either the signature is missing/invalid
    • Or the thinking block structure doesn't match expectations
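
A hedged illustration of the mismatch (the exact OpenAI-side fields vary by LiteLLM version; this only shows why the signature cannot survive the round trip):

// Anthropic-native assistant content block (what the API returns and expects back):
const nativeBlock = {
  type: "thinking",
  thinking: "Reasoning text...",
  signature: "opaque-server-generated-value",
};

// After translation to OpenAI chat format, the reasoning may surface as plain text
// or a non-standard field, and there is no standard place for the signature,
// so the client cannot echo the original block back intact:
const openAiStyleMessage = {
  role: "assistant",
  content: "Reasoning text... plus the final answer",
  // no `signature`, no `thinking` block structure
};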

Attempted Solutions That Failed

  1. Reasoning Adapter Callback: We tried creating a LiteLLM callback (reasoning_adapter.py) to:

    • Cache thinking blocks from responses
    • Re-inject them into subsequent requests
    • Failed because: You cannot fabricate valid signatures - they are cryptographically verified
  2. Placeholder Thinking Blocks: Attempted to inject placeholder thinking blocks with dummy signatures

    • Failed because: Invalid signature in thinking block - Anthropic validates signatures server-side

Solution: Dual Provider Configuration

Architecture

LiteLLM exposes two different endpoints for Anthropic:

| Endpoint             | Format           | SDK                        | Use Case                                 |
|----------------------|------------------|----------------------------|------------------------------------------|
| /v1/chat/completions | OpenAI           | @ai-sdk/openai-compatible  | Non-Anthropic models (GPT, Gemini, etc.) |
| /v1/messages         | Anthropic Native | @ai-sdk/anthropic          | Anthropic models with extended thinking  |
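
Both routes can be exercised directly with the same LiteLLM virtual key. A minimal sketch, assuming the key is accepted as a Bearer token on both routes and the model names match the proxy config:

// Hypothetical smoke test against both LiteLLM routes (URLs and key handling are assumptions).
const base = "http://litellm:4000";
const headers = {
  Authorization: `Bearer ${process.env.LITELLM_KEY}`,
  "Content-Type": "application/json",
};

// OpenAI-format route (what @ai-sdk/openai-compatible talks to):
await fetch(`${base}/v1/chat/completions`, {
  method: "POST",
  headers,
  body: JSON.stringify({
    model: "openai/gpt-4o",
    messages: [{ role: "user", content: "ping" }],
  }),
});

// Anthropic-native route (what @ai-sdk/anthropic talks to), which keeps thinking blocks intact:
await fetch(`${base}/v1/messages`, {
  method: "POST",
  headers,
  body: JSON.stringify({
    model: "anthropic/claude-sonnet-4-5-20250929",
    max_tokens: 1024,
    messages: [{ role: "user", content: "ping" }],
  }),
});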

Both endpoints still provide full LiteLLM features:

  • Cost tracking
  • Usage logging
  • Virtual key management
  • Rate limiting

Implementation

We now configure two providers when LiteLLM proxy is detected:

// Provider 1: litellm (OpenAI-compatible)
providerConfig.litellm = {
  npm: "@ai-sdk/openai-compatible",
  options: {
    apiKey,
    baseURL: "http://litellm:4000/v1",
  },
};

// Provider 2: litellm-anthropic (Anthropic-native)
providerConfig["litellm-anthropic"] = {
  npm: "@ai-sdk/anthropic",
  options: {
    apiKey,
    baseURL: "http://litellm:4000",  // Uses /v1/messages endpoint
  },
};

Model Routing

| Model Pattern        | Provider to Use    | Endpoint             |
|----------------------|--------------------|----------------------|
| anthropic/claude-*   | litellm-anthropic  | /v1/messages         |
| openai/gpt-*         | litellm            | /v1/chat/completions |
| gemini/*             | litellm            | /v1/chat/completions |
| xai/grok-*           | litellm            | /v1/chat/completions |

Integration TODO

The following changes need to be made to complete the integration:

1. Model Spec Selection (opencode-backend.ts)

Update the modelSpec construction in execute() to use the correct provider:

// Current (needs update):
if (isLiteLLMProxy) {
  return {
    providerID: "litellm",
    modelID: modelStr,
  };
}

// Should become:
if (isLiteLLMProxy && isAnthropicModel) {
  return {
    providerID: "litellm-anthropic",
    modelID: anthropicModelId,  // Without "anthropic/" prefix
  };
}
if (isLiteLLMProxy) {
  return {
    providerID: "litellm",
    modelID: modelStr,
  };
}
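
isAnthropicModel and anthropicModelId are not defined yet in this snippet; a minimal sketch of how they could be derived from the incoming model string (helper names are placeholders, not existing code):

// Hypothetical helpers - derive routing info from a LiteLLM model string
// such as "anthropic/claude-sonnet-4-5-20250929".
const ANTHROPIC_PREFIX = "anthropic/";

function isAnthropicModelStr(modelStr: string): boolean {
  return modelStr.startsWith(ANTHROPIC_PREFIX);
}

function toAnthropicModelId(modelStr: string): string {
  // Strip the "anthropic/" prefix so @ai-sdk/anthropic receives the bare model ID.
  return modelStr.slice(ANTHROPIC_PREFIX.length);
}

// e.g. toAnthropicModelId("anthropic/claude-sonnet-4-5-20250929")
//   => "claude-sonnet-4-5-20250929"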

2. Server Initialization Model

Update the model string format in initialize() for the server config:

// For Anthropic via LiteLLM:
model: `litellm-anthropic/${anthropicModelId}`

// For other models via LiteLLM:
model: `litellm/${modelStr}`
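
For example, with the model IDs from the LiteLLM config referenced below, the resolved strings would look like this (illustrative only):

// Anthropic model routed through the native provider:
model: "litellm-anthropic/claude-sonnet-4-5-20250929"

// Non-Anthropic model routed through the OpenAI-compatible provider:
model: "litellm/openai/gpt-4o"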

3. Testing Required

  • Test multi-turn conversation with Claude Sonnet 4.5 + extended thinking
  • Test tool use in multi-turn with extended thinking
  • Verify cost tracking still works via LiteLLM dashboard
  • Test non-Anthropic models still work via litellm provider

LiteLLM Configuration Reference

The LiteLLM proxy config (infrastructure/litellm-proxy/config.yaml) already has extended thinking enabled:

- model_name: anthropic/claude-sonnet-4-5-20250929
  litellm_params:
    model: anthropic/claude-sonnet-4-5-20250929
    api_key: os.environ/ANTHROPIC_API_KEY
    thinking:
      type: enabled
      budget_tokens: 8000

The reasoning_adapter callback has been disabled since it cannot solve the signature validation problem.

Postmortem

Timeline

  1. Initial Issue: Multi-turn conversations with Claude + extended thinking failed via LiteLLM
  2. Investigation: Identified that thinking block signatures were being lost in format translation
  3. Attempt 1: Created reasoning_adapter.py callback to cache/restore thinking blocks - failed due to signature validation
  4. Attempt 2: Tried placeholder thinking blocks with fake signatures - failed, signatures are cryptographically verified
  5. Root Cause Identified: The OpenAI-compatible translation path cannot preserve Anthropic's native thinking block format
  6. Solution: Use LiteLLM's /v1/messages endpoint which speaks native Anthropic format

Key Learnings

  1. Anthropic's thinking block signatures are cryptographic and cannot be forged
  2. Any format translation (Anthropic <-> OpenAI) will break extended thinking
  3. LiteLLM provides both OpenAI-compatible AND Anthropic-native endpoints
  4. The Anthropic-native endpoint preserves all features (cost tracking, logging) while avoiding format translation

Files Modified

  • apps/sidecar/src/backends/opencode-backend.ts - Added dual provider configuration
  • infrastructure/litellm-proxy/config.yaml - Disabled reasoning_adapter callback
  • infrastructure/litellm-proxy/reasoning_adapter.py - Attempted fix (now disabled)

Next Steps

  1. Complete the model routing logic in execute() method
  2. Test the integration end-to-end
  3. Consider adding automatic provider selection based on model name
  4. Update any upstream code that specifies the provider to use litellm-anthropic for Claude models