@jfarcand
Created March 3, 2026 22:56
Atmosphere AI Framework Abstraction Analysis

What can be unified across Spring AI, LangChain4j, Embabel, and Google ADK

Overview

Atmosphere currently integrates with 4 external AI frameworks plus a built-in OpenAI-compatible client:

Framework        Module               Adapter                      Request Type
Built-in OpenAI  modules/ai           BuiltInAiSupport             AiRequest
Spring AI        modules/spring-ai    SpringAiStreamingAdapter     ChatRequest
LangChain4j      modules/langchain4j  LangChain4jStreamingAdapter  LangChain4jRequest
Embabel          modules/embabel      EmbabelStreamingAdapter      AgentRequest
Google ADK       modules/adk          AdkStreamingAdapter          AdkRequest

What's Already Abstracted (Well Done ✅)

The modules/ai core already provides strong framework-agnostic abstractions:

Abstraction             What it does
AiStreamingAdapter<T>   Uniform streaming SPI — all 4 frameworks implement it
StreamingSession        Push tokens/progress/errors regardless of transport (WS, SSE, gRPC)
AiSupport               ServiceLoader auto-detection — drop a JAR, it just works
@AiEndpoint + @Prompt   Zero-boilerplate endpoint declaration
AiRequest               Framework-agnostic request (message, systemPrompt, model, history)
AiConversationMemory    Sliding-window chat history
FanOutStreamingSession  Multi-model fan-out with 3 strategies
AiInterceptor           Cross-cutting pre/post processing chain

How Each Framework Streams

All 4 follow the same pattern: bridge their native streaming model to StreamingSession:

  • Spring AI: Flux<ChatResponse> → SpringAiStreamingAdapter → session.send(token)
  • LangChain4j: Callbacks (onNext/onComplete/onError) → LangChain4jStreamingAdapter → session.send(token)
  • Embabel: Agent events via AtmosphereOutputChannel → EmbabelStreamingAdapter → session.send(token)
  • Google ADK: Flowable<Event> → AdkEventAdapter → session.send(token)
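The shared pattern can be sketched with a callback-style backend (the LangChain4j case). The StreamingSession interface below is a simplified stand-in for Atmosphere's, and FakeLlmClient is a made-up placeholder for any framework's native streaming API; the adapter's only job is forwarding native events into the uniform session.

```java
import java.util.function.Consumer;

interface StreamingSession {
    void send(String token);    // push one token toward the client
    void complete();            // signal end of stream
    void error(Throwable t);    // propagate a failure
}

/** Stand-in for a framework's native callback-style streaming API. */
class FakeLlmClient {
    void generate(String prompt, Consumer<String> onNext, Runnable onComplete) {
        for (String token : prompt.split(" ")) onNext.accept(token);
        onComplete.run();
    }
}

/** The adapter: native callbacks in, uniform session calls out. */
class CallbackStreamingAdapter {
    private final FakeLlmClient client = new FakeLlmClient();

    void stream(String prompt, StreamingSession session) {
        client.generate(prompt, session::send, session::complete);
    }
}
```

The Flux- and Flowable-based adapters follow the same shape, subscribing to the reactive stream and delegating each emission to session.send(...).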

Wire Protocol (Common to All)

{"type":"token","data":"Hello","sessionId":"abc-123","seq":1}
{"type":"progress","data":"Thinking...","sessionId":"abc-123","seq":2}
{"type":"metadata","key":"model","value":"gpt-4","sessionId":"abc-123","seq":3}
{"type":"complete","sessionId":"abc-123","seq":4}
{"type":"error","data":"Failed","sessionId":"abc-123","seq":5}
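A minimal sketch of emitting these frames. The frame types and field names come from the protocol above; the class name and structure are illustrative, and real code would need proper JSON string escaping. The per-session monotonically increasing seq is the detail worth getting right.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Builds wire-protocol frames with a per-session sequence counter. */
class WireFrames {
    private final String sessionId;
    private final AtomicLong seq = new AtomicLong();

    WireFrames(String sessionId) { this.sessionId = sessionId; }

    String token(String data)    { return frame("token", data); }
    String progress(String data) { return frame("progress", data); }
    String error(String data)    { return frame("error", data); }

    String complete() {
        return String.format("{\"type\":\"complete\",\"sessionId\":\"%s\",\"seq\":%d}",
                sessionId, seq.incrementAndGet());
    }

    private String frame(String type, String data) {
        // Note: data is not JSON-escaped here; a real implementation must escape it.
        return String.format("{\"type\":\"%s\",\"data\":\"%s\",\"sessionId\":\"%s\",\"seq\":%d}",
                type, data, sessionId, seq.incrementAndGet());
    }
}
```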

What Could Still Be Abstracted to Benefit Developers

1. 🔧 Unified Tool/Function Calling

Problem: Each framework handles tool calling differently:

  • LangChain4j: @Tool annotation + ToolSpecification
  • ADK: AdkBroadcastTool + Google's tool API
  • Embabel: @Action annotations on agents
  • Spring AI: Function callbacks / @Bean functions

Proposed Abstraction: A common @AiTool annotation or ToolDefinition SPI that lets developers define tools once and have them work across all frameworks.

@AiTool(name = "weather", description = "Get current weather")
public WeatherResult getWeather(@Param("city") String city) {
    return weatherService.lookup(city);
}

Impact: Developers could switch AI backends without rewriting tool integrations. This is the single highest-value abstraction.
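One way the proposed SPI could work, sketched end to end (the @AiTool/@Param annotations and ToolDefinition record are hypothetical, defined here for illustration): a reflection scan turns each annotated method into a framework-neutral ToolDefinition that every adapter could translate into its native tool API.

```java
import java.lang.annotation.*;
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;
import java.util.ArrayList;
import java.util.List;

@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD)
@interface AiTool { String name(); String description(); }

@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.PARAMETER)
@interface Param { String value(); }

/** Framework-neutral description of one callable tool. */
record ToolDefinition(String name, String description, List<String> paramNames) {}

class ToolScanner {
    /** Collects a ToolDefinition for every @AiTool method on the given bean. */
    static List<ToolDefinition> scan(Object bean) {
        List<ToolDefinition> tools = new ArrayList<>();
        for (Method m : bean.getClass().getDeclaredMethods()) {
            AiTool tool = m.getAnnotation(AiTool.class);
            if (tool == null) continue;
            List<String> params = new ArrayList<>();
            for (Parameter p : m.getParameters()) {
                Param param = p.getAnnotation(Param.class);
                params.add(param != null ? param.value() : p.getName());
            }
            tools.add(new ToolDefinition(tool.name(), tool.description(), params));
        }
        return tools;
    }
}

class WeatherTools {
    @AiTool(name = "weather", description = "Get current weather")
    public String getWeather(@Param("city") String city) { return "sunny in " + city; }
}
```

Each adapter would then map ToolDefinition onto its native representation (ToolSpecification, @Action metadata, function callbacks, and so on).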

2. 💾 Persistent Conversation Memory

Problem: Only InMemoryConversationMemory exists. Server restarts lose all history.

Proposed Abstraction: A pluggable persistence SPI with implementations for:

  • Redis (via existing modules/redis)
  • Kafka (via existing modules/kafka)
  • JDBC
  • File-based

@AiEndpoint(path = "/chat", conversationMemory = true,
            memoryStore = "redis")

Impact: Production-ready conversation persistence without framework-specific code.
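One possible shape for the SPI (names hypothetical): a small store interface that Redis, Kafka, JDBC, or file-backed implementations could plug into, shown here with an in-memory sliding-window implementation matching the existing behavior.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

interface ConversationMemoryStore {
    void append(String sessionId, String role, String text);
    List<String> history(String sessionId);
}

/** Keeps only the last maxMessages entries per session. */
class SlidingWindowMemoryStore implements ConversationMemoryStore {
    private final int maxMessages;
    private final Map<String, Deque<String>> sessions = new HashMap<>();

    SlidingWindowMemoryStore(int maxMessages) { this.maxMessages = maxMessages; }

    @Override public void append(String sessionId, String role, String text) {
        Deque<String> window = sessions.computeIfAbsent(sessionId, k -> new ArrayDeque<>());
        window.addLast(role + ": " + text);
        while (window.size() > maxMessages) window.removeFirst(); // evict oldest
    }

    @Override public List<String> history(String sessionId) {
        return new ArrayList<>(sessions.getOrDefault(sessionId, new ArrayDeque<>()));
    }
}
```

A Redis implementation would keep the same interface and back it with a capped list per session key.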

3. 📐 Structured Output / Response Schema

Problem: All 4 frameworks stream raw text tokens. No common way to enforce structured responses.

Proposed Abstraction: A response schema definition that works across frameworks:

@Prompt
public void onPrompt(String message, StreamingSession session) {
    session.stream(message, ResponseSchema.of(WeatherResponse.class));
}

Impact: Developers get typed, validated responses without framework-specific deserialization.
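A sketch of what ResponseSchema.of(...) might derive from a record (class and method names are assumptions): a JSON Schema string that adapters could hand to each backend's structured-output feature. The type mapping is deliberately minimal.

```java
import java.lang.reflect.RecordComponent;
import java.util.StringJoiner;

record WeatherResponse(String city, double temperatureC, String conditions) {}

class ResponseSchema {
    /** Derives a minimal JSON Schema from the record's components. */
    static String of(Class<? extends Record> type) {
        StringJoiner props = new StringJoiner(",");
        for (RecordComponent c : type.getRecordComponents()) {
            props.add("\"" + c.getName() + "\":{\"type\":\"" + jsonType(c.getType()) + "\"}");
        }
        return "{\"type\":\"object\",\"properties\":{" + props + "}}";
    }

    private static String jsonType(Class<?> t) {
        if (t == int.class || t == long.class) return "integer";
        if (t == double.class || t == float.class) return "number";
        if (t == boolean.class) return "boolean";
        return "string";
    }
}
```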

4. 🚦 Rate Limiting / Token Budget Enforcement

Problem: TokenBudgetManager exists but isn't wired into the interceptor chain.

Proposed Abstraction: Annotation-driven rate limiting:

@AiEndpoint(path = "/chat")
@RateLimit(tokensPerMinute = 10_000, perUser = true)
public class ChatEndpoint { ... }

Impact: Instant cost control across all AI backends.
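What such an annotation could compile down to, sketched under assumptions (the class name and fixed-window strategy are choices made here, not Atmosphere's): a per-user budget over a one-minute window, refusing requests once the budget is spent.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class TokenBudgetLimiter {
    private record Window(long startMillis, long used) {}

    private final long tokensPerMinute;
    private final Map<String, Window> windows = new ConcurrentHashMap<>();

    TokenBudgetLimiter(long tokensPerMinute) { this.tokensPerMinute = tokensPerMinute; }

    /** Returns true if the user may spend `tokens` now, recording the spend if so. */
    synchronized boolean tryAcquire(String user, long tokens, long nowMillis) {
        Window w = windows.get(user);
        if (w == null || nowMillis - w.startMillis() >= 60_000) {
            w = new Window(nowMillis, 0); // start a fresh one-minute window
        }
        if (w.used() + tokens > tokensPerMinute) return false;
        windows.put(user, new Window(w.startMillis(), w.used() + tokens));
        return true;
    }
}
```

An interceptor in the existing AiInterceptor chain could call tryAcquire before dispatching to any adapter.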

5. 🖼️ Multi-Modal Content

Problem: All adapters currently stream text only. Modern LLMs support images, audio, files.

Proposed Abstraction: A Content type in StreamingSession:

session.send(Content.text("Here's the chart:"));
session.send(Content.image(chartBytes, "image/png"));
session.send(Content.file(csvBytes, "results.csv"));

Impact: Future-proofs the API as LLMs go multi-modal — all four underlying frameworks already support multi-modal content natively.
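One possible shape for the proposed Content type (all names here are hypothetical): a small sealed hierarchy so transports can pattern-match on what they are sending.

```java
sealed interface Content permits Content.Text, Content.Binary {
    record Text(String value) implements Content {}
    record Binary(byte[] data, String mediaType, String filename) implements Content {}

    static Content text(String value) { return new Text(value); }
    static Content image(byte[] data, String mediaType) {
        return new Binary(data, mediaType, null);
    }
    static Content file(byte[] data, String filename) {
        return new Binary(data, "application/octet-stream", filename);
    }
}
```

A transport could then switch over the variants: text frames go out as today's token messages, binary content as base64 payloads or side-channel uploads.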

6. 📊 Observability / Cost Metering

Problem: CostMeteringFilter exists but there's no standard way to emit metrics.

Proposed Abstraction: An AiMetrics SPI that plugs into OpenTelemetry:

public interface AiMetrics {
    void recordTokenUsage(String model, int promptTokens, int completionTokens);
    void recordLatency(String model, Duration ttft, Duration total);
    void recordCost(String model, BigDecimal cost);
}

Impact: Instant observability regardless of AI backend. Natural fit with existing spring-boot-otel-chat sample.
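A toy in-memory implementation of the sketched SPI, useful for tests before wiring a real OpenTelemetry exporter behind the same interface (the aggregation shown is an illustration, not Atmosphere code):

```java
import java.math.BigDecimal;
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

interface AiMetrics {
    void recordTokenUsage(String model, int promptTokens, int completionTokens);
    void recordLatency(String model, Duration ttft, Duration total);
    void recordCost(String model, BigDecimal cost);
}

/** Aggregates total token usage per model; latency and cost sinks left as no-ops. */
class InMemoryAiMetrics implements AiMetrics {
    final Map<String, AtomicLong> totalTokens = new ConcurrentHashMap<>();

    @Override public void recordTokenUsage(String model, int prompt, int completion) {
        totalTokens.computeIfAbsent(model, m -> new AtomicLong())
                   .addAndGet(prompt + completion);
    }
    @Override public void recordLatency(String model, Duration ttft, Duration total) { }
    @Override public void recordCost(String model, BigDecimal cost) { }
}
```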

7. 📚 RAG / Context Augmentation SPI

Problem: Spring AI has advisors for RAG, LangChain4j has content retrievers, but there's no Atmosphere-level abstraction.

Proposed Abstraction: A ContextProvider SPI that feeds into AiInterceptor.preProcess():

public interface ContextProvider {
    List<Document> retrieve(String query, int maxResults);
}

Impact: Plug in vector stores once, use them with any framework.
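How a ContextProvider could feed the pre-processing step, sketched under assumptions (the Document record and RagPreProcessor wiring are illustrative): retrieved passages are prepended to the prompt before any adapter sees it.

```java
import java.util.List;

record Document(String text) {}

interface ContextProvider {
    List<Document> retrieve(String query, int maxResults);
}

class RagPreProcessor {
    private final ContextProvider provider;

    RagPreProcessor(ContextProvider provider) { this.provider = provider; }

    /** Builds the augmented prompt a pre-processing interceptor could return. */
    String augment(String userMessage) {
        StringBuilder prompt = new StringBuilder("Context:\n");
        for (Document d : provider.retrieve(userMessage, 3)) {
            prompt.append("- ").append(d.text()).append('\n');
        }
        return prompt.append("\nQuestion: ").append(userMessage).toString();
    }
}
```

Because ContextProvider is a single-method interface, a vector-store lookup can be supplied as a lambda.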


Priority Ranking

Priority  Abstraction                     Effort  Value
🥇        Unified Tool/Function Calling   High    Very High
🥈        Persistent Conversation Memory  Medium  High
🥉        Observability / Cost Metering   Medium  High
4         RAG / Context Augmentation      Medium  Medium-High
5         Rate Limiting / Token Budget    Low     Medium
6         Structured Output               Medium  Medium
7         Multi-Modal Content             High    Medium (future)

Conclusion

Atmosphere's existing abstraction layer (AiStreamingAdapter, StreamingSession, AiSupport) is well-designed and already handles the hardest problem: unifying streaming across fundamentally different APIs (reactive Flux, callbacks, event streams, agent events).

The biggest opportunity is unified tool/function calling — it's what turns a chatbot into a useful agent, and it's where framework differences cause the most developer pain. Combined with persistent conversation memory and observability, these three abstractions would make Atmosphere the most developer-friendly AI gateway available.
