This document explains the dual-provider configuration for LiteLLM when using Anthropic Claude models with extended thinking enabled.
When using Anthropic Claude models (Sonnet 4.5, Opus 4.5) with extended thinking enabled through the LiteLLM proxy, multi-turn conversations with tool use fail with cryptographic signature validation errors such as:

- Invalid `signature` in `thinking` block
- Expected `thinking` or `redacted_thinking`, but found `text`. When `thinking` is enabled, a final `assistant` message must start with a thinking block
How extended thinking works:

- When extended thinking is enabled, Anthropic's API returns `thinking` blocks in the assistant response
- Each thinking block contains a cryptographic signature generated by Anthropic's servers
- In multi-turn conversations, the previous assistant message (with thinking blocks) must be sent back to the API (see the sketch below)
- Anthropic verifies the signature to ensure thinking blocks weren't tampered with
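For context, here is a rough sketch of what a multi-turn `/v1/messages` request body looks like when the assistant's previous turn, including its signed thinking block, is replayed. The field values are illustrative; the shape follows Anthropic's documented Messages API.

```typescript
// Illustrative multi-turn request body with extended thinking and tool use.
// The assistant turn is replayed verbatim, including the thinking block and
// its server-generated signature -- this is the value that must survive the
// round trip through any proxy. (Tool definitions omitted for brevity.)
const followUpRequest = {
  model: "claude-sonnet-4-5-20250929",
  max_tokens: 16000,
  thinking: { type: "enabled", budget_tokens: 8000 },
  messages: [
    { role: "user", content: "What's the weather in Paris?" },
    {
      role: "assistant",
      content: [
        {
          type: "thinking",
          thinking: "I should call the weather tool for Paris...",
          signature: "EuYBCkQYAi...", // opaque, signed by Anthropic; cannot be fabricated
        },
        { type: "tool_use", id: "toolu_01", name: "get_weather", input: { city: "Paris" } },
      ],
    },
    {
      role: "user",
      content: [{ type: "tool_result", tool_use_id: "toolu_01", content: "18°C, cloudy" }],
    },
  ],
};
```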
The original configuration used:
- The `@ai-sdk/openai-compatible` SDK
- LiteLLM's `/v1/chat/completions` endpoint (OpenAI format)
This causes problems because:
- Format Translation: LiteLLM translates Anthropic's native format to OpenAI format and back
- Signature Loss: The translation process loses or corrupts the cryptographic signatures on thinking blocks
- Validation Failure: When the next request is sent, Anthropic rejects it because either:
  - the signature is missing or invalid, or
  - the thinking block structure doesn't match expectations (see the schematic below)
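For contrast with the native request sketched earlier, this is roughly what the same assistant turn looks like once translated into the OpenAI chat-completions shape (schematic only; the exact fields LiteLLM emits may differ). The key point is that the format has no standard, server-signed slot for a thinking block, so the signature cannot round-trip:

```typescript
// Schematic OpenAI-format assistant message after translation. Any reasoning
// text becomes ordinary, unsigned content (or a non-standard side field), so
// the signature Anthropic expects on the next turn is lost.
const translatedAssistantTurn = {
  role: "assistant",
  content: "I should call the weather tool for Paris...", // unsigned plain text
  tool_calls: [
    {
      id: "toolu_01",
      type: "function",
      function: { name: "get_weather", arguments: '{"city":"Paris"}' },
    },
  ],
};
```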
Two workarounds were attempted before changing the architecture:

- Reasoning Adapter Callback: We tried creating a LiteLLM callback (`reasoning_adapter.py`) to:
  - Cache thinking blocks from responses
  - Re-inject them into subsequent requests
  - Failed because: you cannot fabricate valid signatures; they are cryptographically verified
- Placeholder Thinking Blocks: Attempted to inject placeholder thinking blocks with dummy signatures
  - Failed because: Anthropic validates signatures server-side and rejects them with `Invalid signature in thinking block`
LiteLLM exposes two different endpoints:
| Endpoint | Format | SDK | Use Case |
|---|---|---|---|
| `/v1/chat/completions` | OpenAI | `@ai-sdk/openai-compatible` | Non-Anthropic models (GPT, Gemini, etc.) |
| `/v1/messages` | Anthropic Native | `@ai-sdk/anthropic` | Anthropic models with extended thinking |
Both endpoints still provide full LiteLLM features (see the example call after this list):
- Cost tracking
- Usage logging
- Virtual key management
- Rate limiting
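As a quick way to confirm this, the Anthropic-native endpoint can be called directly through the proxy. A minimal sketch, assuming the LiteLLM virtual key is accepted in the `x-api-key` header (check your proxy's auth configuration); the request still flows through LiteLLM, so cost tracking, logging, and rate limiting apply as usual:

```typescript
// Minimal sketch: hit LiteLLM's /v1/messages endpoint directly with a virtual key.
// Header name and env var are assumptions for illustration.
const response = await fetch("http://litellm:4000/v1/messages", {
  method: "POST",
  headers: {
    "content-type": "application/json",
    "x-api-key": process.env.LITELLM_VIRTUAL_KEY ?? "",
  },
  body: JSON.stringify({
    model: "anthropic/claude-sonnet-4-5-20250929",
    max_tokens: 1024,
    messages: [{ role: "user", content: "Hello" }],
  }),
});
console.log(await response.json());
```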
We now configure two providers when the LiteLLM proxy is detected:
```typescript
// Provider 1: litellm (OpenAI-compatible)
providerConfig.litellm = {
  npm: "@ai-sdk/openai-compatible",
  options: {
    apiKey,
    baseURL: "http://litellm:4000/v1",
  },
};

// Provider 2: litellm-anthropic (Anthropic-native)
providerConfig["litellm-anthropic"] = {
  npm: "@ai-sdk/anthropic",
  options: {
    apiKey,
    baseURL: "http://litellm:4000", // Uses /v1/messages endpoint
  },
};
```

Model routing:

| Model Pattern | Provider to Use | Endpoint |
|---|---|---|
| `anthropic/claude-*` | `litellm-anthropic` | `/v1/messages` |
| `openai/gpt-*` | `litellm` | `/v1/chat/completions` |
| `gemini/*` | `litellm` | `/v1/chat/completions` |
| `xai/grok-*` | `litellm` | `/v1/chat/completions` |
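A minimal sketch of how the routing table above could be expressed in code; the helper name `resolveProvider` is illustrative, not existing code, but the return shape mirrors the `modelSpec` used later in `execute()`:

```typescript
// Illustrative helper mapping a LiteLLM model string to the provider/model pair
// used by the dual-provider setup.
function resolveProvider(modelStr: string): { providerID: string; modelID: string } {
  if (modelStr.startsWith("anthropic/claude-")) {
    return {
      providerID: "litellm-anthropic",
      // The Anthropic-native provider expects the bare model id, without the
      // "anthropic/" routing prefix.
      modelID: modelStr.slice("anthropic/".length),
    };
  }
  // Everything else (openai/gpt-*, gemini/*, xai/grok-*) goes through the
  // OpenAI-compatible provider unchanged.
  return { providerID: "litellm", modelID: modelStr };
}

// Example: resolveProvider("anthropic/claude-sonnet-4-5-20250929")
// -> { providerID: "litellm-anthropic", modelID: "claude-sonnet-4-5-20250929" }
```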
The following changes need to be made to complete the integration:
Update the `modelSpec` construction in `execute()` to use the correct provider:
```typescript
// Current (needs update):
if (isLiteLLMProxy) {
  return {
    providerID: "litellm",
    modelID: modelStr,
  };
}

// Should become:
if (isLiteLLMProxy && isAnthropicModel) {
  return {
    providerID: "litellm-anthropic",
    modelID: anthropicModelId, // Without "anthropic/" prefix
  };
}
if (isLiteLLMProxy) {
  return {
    providerID: "litellm",
    modelID: modelStr,
  };
}
```

Update the model string format in `initialize()` for the server config:
```typescript
// For Anthropic via LiteLLM:
model: `litellm-anthropic/${anthropicModelId}`

// For other models via LiteLLM:
model: `litellm/${modelStr}`
```

For example, `anthropic/claude-sonnet-4-5-20250929` becomes `litellm-anthropic/claude-sonnet-4-5-20250929`, while `openai/gpt-4o` stays `litellm/openai/gpt-4o`.

Testing checklist:

- Test multi-turn conversation with Claude Sonnet 4.5 + extended thinking
- Test tool use in multi-turn with extended thinking
- Verify cost tracking still works via LiteLLM dashboard
- Test non-Anthropic models still work via the `litellm` provider
The LiteLLM proxy config (`infrastructure/litellm-proxy/config.yaml`) already has extended thinking enabled:
```yaml
- model_name: anthropic/claude-sonnet-4-5-20250929
  litellm_params:
    model: anthropic/claude-sonnet-4-5-20250929
    api_key: os.environ/ANTHROPIC_API_KEY
    thinking:
      type: enabled
      budget_tokens: 8000
```

The `reasoning_adapter` callback has been disabled since it cannot solve the signature validation problem.
- LiteLLM `/v1/messages` Documentation
- LiteLLM Anthropic Passthrough
- Anthropic Extended Thinking Docs
- AWS Bedrock Thinking Encryption
- Initial Issue: Multi-turn conversations with Claude + extended thinking failed via LiteLLM
- Investigation: Identified that thinking block signatures were being lost in format translation
- Attempt 1: Created `reasoning_adapter.py` callback to cache/restore thinking blocks - failed due to signature validation
- Attempt 2: Tried placeholder thinking blocks with fake signatures - failed, signatures are cryptographically verified
- Root Cause Identified: The OpenAI-compatible translation path cannot preserve Anthropic's native thinking block format
- Solution: Use LiteLLM's `/v1/messages` endpoint, which speaks native Anthropic format
- Anthropic's thinking block signatures are cryptographic and cannot be forged
- Any format translation (Anthropic <-> OpenAI) will break extended thinking
- LiteLLM provides both OpenAI-compatible AND Anthropic-native endpoints
- The Anthropic-native endpoint preserves all features (cost tracking, logging) while avoiding format translation
Files changed:

- `apps/sidecar/src/backends/opencode-backend.ts` - Added dual provider configuration
- `infrastructure/litellm-proxy/config.yaml` - Disabled the reasoning_adapter callback
- `infrastructure/litellm-proxy/reasoning_adapter.py` - Attempted fix (now disabled)
Next steps:

- Complete the model routing logic in the `execute()` method
- Test the integration end-to-end
- Consider adding automatic provider selection based on model name
- Update any upstream code that specifies the provider to use `litellm-anthropic` for Claude models