CASE STUDY A1 · AI

DocScorer

Streaming document scoring over WebSocket.

A Go service that scores documents against configurable rubrics with Gemini, streaming per-criterion results over WebSocket as they complete.

streamed

per-criterion results over WS

THE PROBLEM

Scoring a batch of documents against a rubric with an LLM is embarrassingly parallel but painfully slow to watch synchronously — and a spinner that resolves after two minutes is indistinguishable from a hang.

WHAT I BUILT

A compact Go service: submit documents and a rubric, and per-criterion scores stream back over WebSocket the moment each evaluation lands, followed by a final aggregate at the end. Rubrics are configuration, not code.
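To make "rubrics are configuration, not code" concrete, here is a minimal sketch of what a rubric file and its loader could look like. The JSON shape, the field names (`id`, `description`, `max_score`, `weight`) and the `ParseRubric` helper are assumptions for illustration; DocScorer's actual schema is not shown in this write-up.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Criterion is one scored dimension of a rubric.
// Field names are hypothetical, chosen for this example only.
type Criterion struct {
	ID          string  `json:"id"`
	Description string  `json:"description"`
	MaxScore    int     `json:"max_score"`
	Weight      float64 `json:"weight"`
}

// Rubric is a named list of criteria loaded from configuration.
type Rubric struct {
	Name     string      `json:"name"`
	Criteria []Criterion `json:"criteria"`
}

// ParseRubric decodes a JSON rubric and rejects ones that cannot be
// scored: no criteria, a missing criterion id, or a duplicate id.
func ParseRubric(data []byte) (Rubric, error) {
	var r Rubric
	if err := json.Unmarshal(data, &r); err != nil {
		return Rubric{}, fmt.Errorf("decode rubric: %w", err)
	}
	if len(r.Criteria) == 0 {
		return Rubric{}, errors.New("rubric has no criteria")
	}
	seen := make(map[string]bool)
	for _, c := range r.Criteria {
		if c.ID == "" {
			return Rubric{}, errors.New("criterion is missing an id")
		}
		if seen[c.ID] {
			return Rubric{}, fmt.Errorf("duplicate criterion id %q", c.ID)
		}
		seen[c.ID] = true
	}
	return r, nil
}

// exampleRubric is sample data, not a rubric from the real service.
const exampleRubric = `{
  "name": "essay-v1",
  "criteria": [
    {"id": "clarity", "description": "Is the argument easy to follow?", "max_score": 5, "weight": 0.6},
    {"id": "evidence", "description": "Are claims supported?", "max_score": 5, "weight": 0.4}
  ]
}`

func main() {
	r, err := ParseRubric([]byte(exampleRubric))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s: %d criteria\n", r.Name, len(r.Criteria))
}
```

With a shape like this, adding or reweighting a criterion is an edit to a data file rather than a redeploy.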

ARCHITECTURE

client ──WS──► Go service
                 ├─ worker pool ──► Gemini (per criterion)
                 └─ stream results as they complete

A bounded worker pool fans criterion evaluations out to Gemini and streams results back in completion order. Because the pool size caps the number of in-flight requests, utilization stays high without tripping rate limits.

STACK — AND WHY

Go

Goroutines and channels make the fan-out/stream-back pattern almost declarative.

Gemini

Per-criterion evaluation calls.

WebSocket

Results arrive as they complete, not when everything finishes.

THE HARD PARTS

Order-independent streaming

Criteria complete out of order; the protocol tags every result so the client assembles a coherent picture from an arbitrary arrival sequence.
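One way to picture the tagging: each frame carries the identifiers it belongs to, and the receiver files it by those identifiers instead of by arrival position. The sketch below assumes a JSON frame with `doc_id`, `criterion_id` and `score` fields and a hypothetical `Board` type on the receiving side; the real wire format is not documented here.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Message is one result frame on the wire (hypothetical shape).
type Message struct {
	DocID       string `json:"doc_id"`
	CriterionID string `json:"criterion_id"`
	Score       int    `json:"score"`
}

// Board assembles results keyed by their tags, so the order in which
// frames arrive does not affect the final picture.
type Board struct {
	expected int                       // total results the client is waiting for
	seen     int                       // distinct (doc, criterion) pairs received
	scores   map[string]map[string]int // doc id -> criterion id -> score
}

// NewBoard creates a Board that expects the given number of results.
func NewBoard(expected int) *Board {
	return &Board{expected: expected, scores: make(map[string]map[string]int)}
}

// Apply decodes one frame and stores it under its tags.
// A repeated frame overwrites the earlier value without being counted twice.
func (b *Board) Apply(raw []byte) error {
	var m Message
	if err := json.Unmarshal(raw, &m); err != nil {
		return fmt.Errorf("decode frame: %w", err)
	}
	if m.DocID == "" || m.CriterionID == "" {
		return errors.New("frame is missing its doc_id or criterion_id tag")
	}
	doc := b.scores[m.DocID]
	if doc == nil {
		doc = make(map[string]int)
		b.scores[m.DocID] = doc
	}
	if _, dup := doc[m.CriterionID]; !dup {
		b.seen++
	}
	doc[m.CriterionID] = m.Score
	return nil
}

// Complete reports whether every expected result has arrived.
func (b *Board) Complete() bool { return b.seen >= b.expected }

// Score looks up a single result by its tags.
func (b *Board) Score(docID, criterionID string) (int, bool) {
	s, ok := b.scores[docID][criterionID]
	return s, ok
}

func main() {
	// Frames deliberately listed out of order.
	frames := []string{
		`{"doc_id":"doc-2","criterion_id":"clarity","score":4}`,
		`{"doc_id":"doc-1","criterion_id":"evidence","score":3}`,
		`{"doc_id":"doc-1","criterion_id":"clarity","score":5}`,
	}
	board := NewBoard(len(frames))
	for _, f := range frames {
		if err := board.Apply([]byte(f)); err != nil {
			fmt.Println("error:", err)
			return
		}
	}
	fmt.Println("complete:", board.Complete())
}
```

Because every write is addressed by its tags, a partially filled board is always a valid view of progress, which is what lets a UI render results as they land.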

WHAT IT TAUGHT ME

For LLM batch work, streaming partial results is the difference between a tool people trust and one they kill after 30 seconds.
