CASE STUDY A1 · AI
DocScorer
Streaming document scoring over WebSocket.
A Go service that scores documents against configurable rubrics with Gemini, streaming per-criterion results over WebSocket as they complete.
streamed
per-criterion results over WS
THE PROBLEM
Scoring a batch of documents against a rubric with an LLM is embarrassingly parallel but painfully slow to watch synchronously — and a spinner that resolves after two minutes is indistinguishable from a hang.
WHAT I BUILT
A compact Go service: submit documents and a rubric, and per-criterion scores stream back over WebSocket the moment each evaluation lands, with a final aggregate. Rubrics are configuration, not code.
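To make "rubrics are configuration" concrete, here is a minimal sketch of a rubric loaded from JSON and used to compute the final aggregate. The field names (`id`, `prompt`, `weight`), the `ParseRubric` and `Aggregate` helpers, and the choice of a weighted mean are all assumptions made for illustration; the actual rubric format and aggregation rule may differ.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Criterion and Rubric are hypothetical shapes for a rubric file.
type Criterion struct {
	ID     string  `json:"id"`
	Prompt string  `json:"prompt"`
	Weight float64 `json:"weight"`
}

type Rubric struct {
	Name     string      `json:"name"`
	Criteria []Criterion `json:"criteria"`
}

// ParseRubric decodes a rubric and rejects one with no criteria.
func ParseRubric(data []byte) (Rubric, error) {
	var r Rubric
	if err := json.Unmarshal(data, &r); err != nil {
		return Rubric{}, err
	}
	if len(r.Criteria) == 0 {
		return Rubric{}, errors.New("rubric has no criteria")
	}
	return r, nil
}

// Aggregate returns the weighted mean of per-criterion scores.
// A weighted mean is an assumption; any reduction could be swapped in.
func Aggregate(r Rubric, scores map[string]float64) (float64, error) {
	var sum, total float64
	for _, c := range r.Criteria {
		s, ok := scores[c.ID]
		if !ok {
			return 0, fmt.Errorf("missing score for criterion %q", c.ID)
		}
		sum += c.Weight * s
		total += c.Weight
	}
	if total == 0 {
		return 0, errors.New("criterion weights sum to zero")
	}
	return sum / total, nil
}

func main() {
	raw := []byte(`{
	  "name": "essay-v1",
	  "criteria": [
	    {"id": "clarity",  "prompt": "Is the argument clear?",      "weight": 1},
	    {"id": "accuracy", "prompt": "Are the claims supported?",  "weight": 3}
	  ]
	}`)
	rubric, err := ParseRubric(raw)
	if err != nil {
		panic(err)
	}
	agg, err := Aggregate(rubric, map[string]float64{"clarity": 4, "accuracy": 8})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s aggregate: %.2f\n", rubric.Name, agg)
}
```

Under this layout, adding or reweighting a criterion is an edit to the rubric file; the service code does not change.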
ARCHITECTURE
client ──WS──► Go service
               ├─ worker pool ──► Gemini (per criterion)
               └─ stream results as they complete
- A bounded worker pool fans criterion-evaluations out to Gemini and streams results back in completion order, so utilization stays high without rate-limit blowups.
STACK — AND WHY
Go
Goroutines and channels make the fan-out/stream-back pattern almost declarative.
Gemini
Per-criterion evaluation calls.
WebSocket
Results arrive as they complete, not when everything finishes.
THE HARD PARTS
Order-independent streaming
Criteria complete out of order; the protocol tags every result so the client assembles a coherent picture from an arbitrary arrival sequence.
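A minimal sketch of the idea: each frame carries enough identity to be placed on its own, and the receiver folds frames into a keyed structure, so any arrival order produces the same state. The `Message` fields (`job_id`, `doc_id`, `criterion`, `score`) and the `Board` type are assumptions for illustration, not the actual wire protocol.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Message is a hypothetical wire format. Every result is tagged with the
// document and criterion it belongs to.
type Message struct {
	JobID     string `json:"job_id"`
	DocID     string `json:"doc_id"`
	Criterion string `json:"criterion"`
	Score     int    `json:"score"`
}

// Board accumulates results keyed by document and criterion.
type Board struct {
	expected int
	seen     int
	scores   map[string]map[string]int // docID -> criterion -> score
}

func NewBoard(expected int) *Board {
	return &Board{expected: expected, scores: make(map[string]map[string]int)}
}

// Apply decodes one frame and records it. A repeated frame overwrites the
// stored score without being counted twice.
func (b *Board) Apply(frame []byte) error {
	var m Message
	if err := json.Unmarshal(frame, &m); err != nil {
		return err
	}
	doc, ok := b.scores[m.DocID]
	if !ok {
		doc = make(map[string]int)
		b.scores[m.DocID] = doc
	}
	if _, dup := doc[m.Criterion]; !dup {
		b.seen++
	}
	doc[m.Criterion] = m.Score
	return nil
}

// Score reports the stored score for a document and criterion, if any.
func (b *Board) Score(docID, criterion string) (int, bool) {
	s, ok := b.scores[docID][criterion]
	return s, ok
}

// Complete reports whether every expected result has arrived.
func (b *Board) Complete() bool { return b.seen == b.expected }

func main() {
	// Frames in an arbitrary arrival order.
	frames := [][]byte{
		[]byte(`{"job_id":"j1","doc_id":"doc-2","criterion":"clarity","score":3}`),
		[]byte(`{"job_id":"j1","doc_id":"doc-1","criterion":"accuracy","score":5}`),
		[]byte(`{"job_id":"j1","doc_id":"doc-1","criterion":"clarity","score":4}`),
	}
	board := NewBoard(len(frames))
	for _, f := range frames {
		if err := board.Apply(f); err != nil {
			panic(err)
		}
		fmt.Println("complete:", board.Complete())
	}
	s, _ := board.Score("doc-1", "clarity")
	fmt.Println("doc-1/clarity:", s)
}
```

Because placement depends only on the tags, the client needs no sequence numbers or reordering buffer; it can render each cell as its frame arrives.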
WHAT IT TAUGHT ME
- For LLM batch work, streaming partial results is the difference between a tool people trust and one they kill after 30 seconds.