ADR-007: Real-time Communication with Socket.IO

Status: Accepted
Date: February 2026
Decision makers: SALLY Engineering Team

Context

SALLY requires two distinct real-time communication patterns:

Server-to-client push — The backend needs to push alerts, route status updates, integration sync progress, and ETA changes to the dispatcher dashboard without the client polling.
Bidirectional messaging — Dispatchers and drivers need to exchange messages in real-time, with both sides sending and receiving.

The team evaluated three approaches:

Polling: Simple to implement, but every update waits for the next poll (up to one full poll interval of added latency), and polls that return nothing still load the server.
WebSocket only: A single protocol for both patterns. The architecture is simpler, but every user holds a WebSocket connection, even those who only need server push.
SSE + WebSocket: SSE for server push (lightweight, reconnects automatically, and runs over plain HTTP so it passes through most proxies) and WebSocket for bidirectional messaging (opened only when needed).

Decision

Use Server-Sent Events (SSE) for server push and Socket.IO (over WebSocket) for bidirectional messaging.

SSE implementation:

Clients connect to GET /sse/events with JWT cookie authentication
The SSE service maintains a map of connected clients per tenant
Backend services publish events to Redis pub/sub channels (tenant:{tenantId}:alerts, etc.)
The SSE service subscribes to these channels and fans out events to connected clients
Event types: new_alert, alert_resolved, route_status, sync_progress, eta_update
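
The fan-out step above can be sketched as a per-tenant client registry plus a serializer for the text/event-stream wire format. This is a minimal illustration, not the SALLY implementation: TenantSseRegistry, SseClient, and formatSseEvent are hypothetical names, and the payload shape is assumed.

```typescript
// Hypothetical sketch: names are illustrative, not taken from the SALLY codebase.

// Event types listed in this ADR.
type SseEventType =
  | "new_alert"
  | "alert_resolved"
  | "route_status"
  | "sync_progress"
  | "eta_update";

// Minimal view of a writable response stream (Node's http.ServerResponse has write()).
interface SseClient {
  write(chunk: string): unknown;
}

// Serialize one event in the text/event-stream format:
// an "event:" line, a "data:" line, then a blank line that terminates the event.
function formatSseEvent(type: SseEventType, payload: unknown): string {
  return `event: ${type}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Tracks connected clients per tenant and fans events out to them.
class TenantSseRegistry {
  private readonly clients = new Map<string, Set<SseClient>>();

  // Register a client; the returned function unregisters it (call it on disconnect).
  add(tenantId: string, client: SseClient): () => void {
    let set = this.clients.get(tenantId);
    if (!set) {
      set = new Set<SseClient>();
      this.clients.set(tenantId, set);
    }
    set.add(client);
    return () => {
      const current = this.clients.get(tenantId);
      if (!current) return;
      current.delete(client);
      if (current.size === 0) this.clients.delete(tenantId);
    };
  }

  // Write one event to every client of the tenant; returns how many clients were reached.
  broadcast(tenantId: string, type: SseEventType, payload: unknown): number {
    const set = this.clients.get(tenantId);
    if (!set) return 0;
    const frame = formatSseEvent(type, payload);
    set.forEach((client) => client.write(frame));
    return set.size;
  }
}
```

In a real handler for GET /sse/events, the response would also need the Content-Type: text/event-stream header, and the unregister function would be called when the connection closes.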

Socket.IO implementation:

Used exclusively for the messaging gateway between dispatchers and drivers
Socket.IO was chosen over raw WebSocket for its automatic reconnection, room-based broadcasting, and fallback to HTTP long-polling when a WebSocket connection cannot be established
The messaging gateway (infrastructure/websocket/) authenticates connections via JWT
Clients send messages by emitting send_message events; the gateway delivers them to recipients as new_message events
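
The send_message / new_message flow might look roughly like the sketch below. To keep it self-contained, it is written against GatewaySocket, a hand-written interface that mirrors a small subset of the Socket.IO server-side socket API (join, on, to(...).emit) instead of importing the library. The JWT claim fields, the message payload shape, and the tenant room name are all assumptions.

```typescript
// Hypothetical sketch: the interfaces and payload shapes below are assumptions.

// Mirrors the subset of the Socket.IO server-side socket API used here.
interface GatewaySocket {
  join(room: string): unknown;
  on(event: string, listener: (payload: any) => void): unknown;
  to(room: string): { emit(event: string, payload: unknown): unknown };
}

// Assumed claims extracted from the verified JWT.
interface AuthenticatedUser {
  userId: string;
  tenantId: string;
}

// Assumed shape of a send_message payload.
interface SendMessagePayload {
  conversationId: string;
  body: string;
}

// Assumed room naming scheme: one room per tenant.
function tenantRoom(tenantId: string): string {
  return `tenant:${tenantId}`;
}

// Call once per connection, after the JWT has been verified.
function registerMessagingHandlers(
  socket: GatewaySocket,
  user: AuthenticatedUser,
  now: () => Date = () => new Date()
): void {
  const room = tenantRoom(user.tenantId);
  socket.join(room);

  socket.on("send_message", (payload: SendMessagePayload) => {
    // Drop malformed or empty messages instead of relaying them.
    if (typeof payload?.body !== "string" || payload.body.trim() === "") return;

    // In Socket.IO, socket.to(room) targets everyone in the room except the sender.
    socket.to(room).emit("new_message", {
      conversationId: payload.conversationId,
      body: payload.body,
      senderId: user.userId, // taken from the token, never from the client payload
      sentAt: now().toISOString(),
    });
  });
}
```

Taking senderId from the verified token means a client cannot impersonate another user by editing its payload. A production gateway would likely also persist the message and narrow delivery to the conversation's participants.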

Consequences

What became easier:

SSE connections are lightweight and auto-reconnect on network interruptions. The dispatcher dashboard maintains a persistent event stream with minimal overhead.
Socket.IO’s room abstraction simplifies tenant-scoped messaging: each tenant maps to a room, and a broadcast to that room reaches only the sockets that have joined it.
The SSE + WebSocket split means most users only need an SSE connection (low overhead). WebSocket connections are opened only when entering the messaging view.
Redis pub/sub decouples event producers from consumers. Any backend service can publish events without knowing which clients are connected.
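
The decoupling through Redis pub/sub can be illustrated with a pair of helpers built around the tenant:{tenantId}:alerts channel shape from this ADR. This is a sketch under assumptions: every function and type name is hypothetical, the event envelope ({ type, payload }) is assumed, and the publish function is injected so that no particular Redis client API is implied.

```typescript
// Hypothetical helpers; only the "tenant:{tenantId}:alerts" channel shape comes from this ADR.

interface TenantChannel {
  tenantId: string;
  topic: string;
}

// Assumed event envelope carried as JSON on the channel.
interface TenantEvent {
  type: string;
  payload: unknown;
}

// Injected publish function, e.g. a thin wrapper around a Redis client's publish call.
type Publish = (channel: string, message: string) => unknown;

// Build a channel name such as "tenant:acme:alerts".
function tenantChannel(tenantId: string, topic: string): string {
  if (tenantId === "" || topic === "") {
    throw new Error("tenantId and topic must be non-empty");
  }
  if (tenantId.indexOf(":") !== -1 || topic.indexOf(":") !== -1) {
    throw new Error("tenantId and topic must not contain ':'");
  }
  return `tenant:${tenantId}:${topic}`;
}

// Recover tenant and topic from a channel name; returns null for foreign channels.
function parseTenantChannel(channel: string): TenantChannel | null {
  const match = /^tenant:([^:]+):([^:]+)$/.exec(channel);
  return match ? { tenantId: match[1], topic: match[2] } : null;
}

// Producer side: a backend service publishes without knowing who is connected.
function publishTenantEvent(
  publish: Publish,
  tenantId: string,
  topic: string,
  event: TenantEvent
): void {
  publish(tenantChannel(tenantId, topic), JSON.stringify(event));
}

// Consumer side: turn a raw pub/sub message into a per-tenant delivery.
// Returns false when the channel or message cannot be interpreted.
function routeTenantMessage(
  channel: string,
  message: string,
  deliver: (tenantId: string, event: TenantEvent) => void
): boolean {
  const parsed = parseTenantChannel(channel);
  if (!parsed) return false;
  let event: TenantEvent;
  try {
    event = JSON.parse(message) as TenantEvent;
  } catch {
    return false; // ignore malformed JSON instead of crashing the subscriber
  }
  deliver(parsed.tenantId, event);
  return true;
}
```

With channels named this way, the subscribing side could use a Redis pattern subscription (PSUBSCRIBE with a pattern such as tenant:*:alerts) and rely on the parser to recover the tenant for each message.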

What became harder:

Two real-time protocols mean two sets of connection management, error handling, and monitoring.
Frontend code must handle both SSE (EventSource API) and WebSocket (socket.io-client) event streams.
Testing real-time flows requires more infrastructure setup (Redis for pub/sub, SSE/WebSocket connections in tests).
Socket.IO adds a dependency (~50KB client-side) that raw WebSocket would not require.
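
One way to contain the cost of handling two streams on the frontend is to normalize both into a single event handler. The sketch below is an assumption about how that could be done, not a description of the existing frontend: EventSourceLike and ClientSocketLike are hand-written stand-ins that mirror the addEventListener method of the browser EventSource API and the on method of a socket.io-client socket, and RealtimeEvent is a hypothetical envelope.

```typescript
// Hypothetical sketch: interfaces are structural stand-ins, not imported library types.

// SSE event types listed in this ADR.
const SSE_EVENT_TYPES = [
  "new_alert",
  "alert_resolved",
  "route_status",
  "sync_progress",
  "eta_update",
] as const;

// Assumed common envelope for events from either transport.
interface RealtimeEvent {
  source: "sse" | "socket";
  type: string;
  data: unknown;
}

// Mirrors EventSource.addEventListener for named events (event.data is a string).
interface EventSourceLike {
  addEventListener(type: string, listener: (event: { data: string }) => void): void;
}

// Mirrors the on(event, listener) method of a socket.io-client socket.
interface ClientSocketLike {
  on(event: string, listener: (payload: unknown) => void): unknown;
}

// Subscribe to both transports and forward everything to one handler.
function bindRealtimeStreams(
  sse: EventSourceLike,
  socket: ClientSocketLike,
  handle: (event: RealtimeEvent) => void
): void {
  SSE_EVENT_TYPES.forEach((type) => {
    sse.addEventListener(type, (event) => {
      let data: unknown;
      try {
        data = JSON.parse(event.data); // SSE payloads arrive as text
      } catch {
        return; // skip frames that are not valid JSON
      }
      handle({ source: "sse", type, data });
    });
  });

  // Socket.IO delivers already-deserialized payloads.
  socket.on("new_message", (payload) => {
    handle({ source: "socket", type: "new_message", data: payload });
  });
}
```

In the browser, the first argument would typically be an EventSource opened on /sse/events with withCredentials enabled so the JWT cookie is sent, and the second a socket.io-client socket created when the user enters the messaging view.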