How to build advanced features for AI chatbots on SSE
The article explores implementing advanced AI chatbot features—resumable streams, cancellations, and multi-device sync—using Server-Sent Events (SSE), highlighting the technical complexity and inefficiencies involved in making SSE support durable, real-time interactions. While possible, these features require extensive database writes, coordination across server replicas, and workarounds for connection drops and client synchronization. The author argues that SSE is ill-suited for such use cases due to its reliance on HTTP and lack of native support for persistent, bidirectional streaming. A dedicated pub/sub-based transport is proposed as a more efficient alternative.
Full article excerpt:
Advanced features for AI chatbots on SSE
Apr 23, 2026 · 9 min read

Agents used to be a thing you talked to synchronously. Now they’re a thing that runs in the background while you work. When you make that change, the transport breaks.

But a lot of folks are saying: “No, you can just use Server-Sent Events (SSE) with Last-Event-ID to get a durable stream, it’s easy”. And yes, all of this is do-able. But I contest that it’s easy. So let’s walk through how to do it, and you can decide for yourself.

Catch up on the previous article and discussion: https://news.ycombinator.com/item?id=47832720

The advanced chatbot features I want to walk through are:

- Resumable streams — refresh the page mid-response and get the in-progress tokens back, instead of waiting for the full response to land in the database.
- Cancellations — stopping the LLM mid-response when the user changes their mind, even though the connection is now allowed to drop and reconnect.
- Multi-device — open the same conversation on a second device or browser, and have it pick up the in-flight response and any new prompts in realtime.

Each of these is do-able on SSE. Whether they’re easy is what we’re going to find out.

Tokens vs. the API responses

Tokens are the individual pieces of text that LLMs generate, but the actual responses you get back from LLM providers have a bunch more stuff in them.
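As an aside on the transport: over SSE, each provider event is delivered as its own frame, and attaching an `id:` line to each frame is what the “just use Last-Event-ID” argument relies on, since a reconnecting `EventSource` echoes the last id it saw. A minimal sketch of framing one such event; `sseFrame` and the id numbering are illustrative, not from the article:

```typescript
// Format one event as an SSE frame, per the SSE wire format:
// an optional "id:" line, a "data:" line, and a blank-line terminator.
// A client that reconnects sends the last id back in the
// Last-Event-ID request header, so the server knows where to resume.
function sseFrame(id: number, event: object): string {
  return `id: ${id}\ndata: ${JSON.stringify(event)}\n\n`;
}

console.log(sseFrame(7, { type: "text-delta", value: "Hello" }));
// id: 7
// data: {"type":"text-delta","value":"Hello"}
```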
The responses have slightly different structure and format, but pretty much all follow a similar pattern: some kind of ‘start’ event, some ‘delta’ events that contain text or tool call requests, and then some kind of ‘end’ event. To get the full response text, you either concatenate the text deltas together, or some of the APIs will give you the ‘full’ response as its own event type at the end.

Vercel AI SDK:

```json
{"type":"text-delta","value":"Let me"}
{"type":"text-delta","value":" look that up."}
{"type":"tool-call-streaming-start","value":{"toolCallId":"call_001","toolName":"search"}}
{"type":"tool-call-delta","value":{"toolCallId":"call_001","argsTextDelta":"{\"query\":\"weather Belfast\"}"}}
{"type":"tool-call","value":{"toolCallId":"call_001","toolName":"search","args":{"query":"weather Belfast"}}}
{"type":"tool-result","value":{"toolCallId":"call_001","result":{"temp":"14°C","condition":"cloudy"}}}
{"type":"text-delta","value":"It's currently"}
{"type":"text-delta","value":" 14°C and cloudy"}
{"type":"text-delta","value":" in Belfast."}
{"type":"finish-message","value":{"finishReason":"stop","usage":{"promptTokens":30,"completionTokens":28}}}
```

OpenAI Responses API:

```json
{"event":"response.created","data":{"id":"resp_abc123","object":"response","status":"in_progress","model":"gpt-4o"}}
{"event":"response.output_item.added","data":{"output_index":0,"item":{"id":"item_001","type":"message","role":"assistant"}}}
{"event":"response.content_part.added","data":{"output_index":0,"content_index":0,"part":{"type":"output_text","text":""}}}
{"event":"response.output_text.delta","data":{"output_index":0,"content_index":0,"delta":"Hello"}}
{"event":"response.output_text.delta","data":{"output_index":0,"content_index":0,"delta":"! How can I"}}
{"event":"response.output_text.delta","data":{"output_index":0,"content_index":0,"delta":" help you today?"}}
{"event":"response.output_text.done","data":{"output_index":0,"content_index":0,"text":"Hello! How can I help you today?"}}
```

…
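The “concatenate the text deltas” option from the excerpt can be sketched in a few lines. This assumes the Vercel-AI-SDK-style event shapes quoted above; the `StreamEvent` type and `accumulateText` helper are illustrative names, not part of either API:

```typescript
// Loosely-typed event, matching the {"type": ..., "value": ...} lines above.
type StreamEvent = { type: string; value?: unknown };

// Rebuild the full response text by keeping only the text deltas,
// in arrival order, and joining their values.
function accumulateText(events: StreamEvent[]): string {
  return events
    .filter((e) => e.type === "text-delta")
    .map((e) => e.value as string)
    .join("");
}

// The text deltas from the Vercel AI SDK sample, with a non-text event mixed in:
const events: StreamEvent[] = [
  { type: "text-delta", value: "It's currently" },
  { type: "text-delta", value: " 14°C and cloudy" },
  { type: "text-delta", value: " in Belfast." },
  { type: "finish-message", value: { finishReason: "stop" } },
];

console.log(accumulateText(events)); // "It's currently 14°C and cloudy in Belfast."
```

The alternative the author mentions, where the API emits the full text as its own terminal event, is what the `response.output_text.done` line in the OpenAI sample shows.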
This excerpt is published under fair use for community discussion. Read the full article at /dev/knill.