gemini.model.ts is 960 lines long and mixes HTTP transport, SSE parsing, cumulative/fragmented dedup, S3 upload orchestration, backpressure handling, and business logic in one class. The raw fetch approach (bypassing the @google/genai SDK) is necessary for memory efficiency, because the SDK buffers entire request bodies via JSON.stringify and parses full response objects per chunk. The implementation itself, however, can be decomposed into composable layers.
All transforms are functions: (input: AsyncIterable<A>) → AsyncGenerator<B>.
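
As a rough illustration of that shape, the sketch below defines the transform type and two layers that compose through it. The names Transform, pipe, parseSseData, and cumulativeToDelta are hypothetical and are not taken from gemini.model.ts. The SSE parser assumes events are separated by "\n\n" (no "\r\n" handling), and the dedup step assumes a payload is cumulative exactly when it starts with the text seen so far; the real dedup rules may differ.

```typescript
// Hypothetical type for the transform shape described above.
type Transform<A, B> = (input: AsyncIterable<A>) => AsyncGenerator<B>;

// Layer 1 (sketch): turn arbitrary text chunks into SSE "data:" payloads.
// Assumes events end with a blank line encoded as "\n\n".
async function* parseSseData(input: AsyncIterable<string>): AsyncGenerator<string> {
  let buffer = "";
  for await (const chunk of input) {
    buffer += chunk;
    let idx: number;
    // Emit every complete event currently in the buffer.
    while ((idx = buffer.indexOf("\n\n")) !== -1) {
      const event = buffer.slice(0, idx);
      buffer = buffer.slice(idx + 2);
      const dataLines: string[] = [];
      for (const line of event.split("\n")) {
        if (!line.startsWith("data:")) continue;
        const value = line.slice(5);
        // Drop the single optional space after the colon.
        dataLines.push(value.startsWith(" ") ? value.slice(1) : value);
      }
      if (dataLines.length > 0) yield dataLines.join("\n");
    }
  }
}

// Layer 2 (sketch): normalize cumulative or fragmented text into deltas.
// Assumption: a payload that starts with everything seen so far is cumulative;
// anything else is treated as a new fragment.
async function* cumulativeToDelta(input: AsyncIterable<string>): AsyncGenerator<string> {
  let seen = "";
  for await (const text of input) {
    if (text.startsWith(seen)) {
      const delta = text.slice(seen.length);
      seen = text;
      if (delta.length > 0) yield delta;
    } else {
      seen += text;
      yield text;
    }
  }
}

// Compose two transforms over a source; each layer pulls from the previous one,
// so a slow consumer naturally slows the whole chain.
function pipe<A, B, C>(
  input: AsyncIterable<A>,
  first: Transform<A, B>,
  second: Transform<B, C>,
): AsyncGenerator<C> {
  return second(first(input));
}
```

Because each layer is a plain function over an AsyncIterable, it can be tested with a small in-memory generator instead of a live HTTP response, and further layers (for example an upload step) could be appended with the same signature.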