Sender reputation first: why the Inbox Hub has no messages table

The first table name I wrote down was inbox_messages. Then I deleted it.

Not because messages aren’t the point of an email client. They are. But the Inbox Hub isn’t an email client — it’s a triage surface. The question it needs to answer fast isn’t “what did I receive?” It’s “who sent something I haven’t classified yet?” Those two questions need different primary tables.

inbox_sender_rules is the anchor. One row per sender address, one action column. Everything else sits downstream of that.

Sprint E1 produced five tables. inbox_accounts holds the Gmail accounts under monitoring, and inbox_sender_rules is the reputation layer. inbox_new_senders_queue parks first-contact senders pending a decision. inbox_message_actions_audit is the append-only trail. inbox_sync_state keeps the Gmail historyId cursor per account.

Five indexes. The one that matters is idx_new_senders_pending — a partial index on inbox_new_senders_queue filtering to unresolved rows only. It stays cheap regardless of how many historical decisions have accumulated.

The classification

inbox_sync.run_once() sorts each sender it sees into one of four branches on every tick. Unknown senders go into inbox_new_senders_queue. Blocked senders generate an audit entry and nothing else. Muted and unsubscribed addresses are skipped silently, and allowed senders bypass the queue entirely.

Three of the four branches exist to keep messages out of the decision queue. One branch puts them in. The product is shaped around that asymmetry.
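A minimal sketch of that four-way split, assuming the rules are available as an in-memory mapping from sender address to action. The Action enum, the TickResult container and the classify() function are hypothetical names for illustration, not the real internals of inbox_sync.run_once().

```python
from dataclasses import dataclass, field
from enum import Enum


class Action(Enum):
    # Hypothetical values for the action column of inbox_sender_rules.
    ALLOW = "allow"
    BLOCK = "block"
    MUTE = "mute"
    UNSUBSCRIBE = "unsubscribe"


@dataclass
class TickResult:
    queued: list = field(default_factory=list)   # unknown senders, need a decision
    audited: list = field(default_factory=list)  # blocked senders, audit entry only
    skipped: list = field(default_factory=list)  # muted or unsubscribed, ignored
    passed: list = field(default_factory=list)   # allowed, bypass the queue


def classify(senders, rules):
    """Sort each sender into exactly one of the four branches."""
    result = TickResult()
    for sender in senders:
        action = rules.get(sender)
        if action is None:
            # No rule yet: the only branch that feeds the decision queue.
            if sender not in result.queued:
                result.queued.append(sender)
        elif action is Action.BLOCK:
            result.audited.append(sender)
        elif action in (Action.MUTE, Action.UNSUBSCRIBE):
            result.skipped.append(sender)
        else:
            result.passed.append(sender)
    return result
```

Only the `action is None` branch grows the queue; every other branch exists to keep a sender out of it.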

The sync function also persists the Gmail historyId cursor so the next tick starts from where the current one finished. No replay, no gaps. If a sync crashes mid-run, the cursor stays at the last committed position.
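One way to get that crash behaviour is to write the queue rows and the cursor update in the same database transaction. The sketch below assumes SQLite through Python's sqlite3 module; the column names and the crash_before_commit flag are hypothetical and exist only to make the rollback visible.

```python
import sqlite3


def make_db():
    # In-memory database with simplified, hypothetical columns.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE inbox_sync_state (
            account_id INTEGER PRIMARY KEY,
            history_id INTEGER NOT NULL
        );
        CREATE TABLE inbox_new_senders_queue (
            account_id INTEGER NOT NULL,
            sender TEXT NOT NULL
        );
        INSERT INTO inbox_sync_state VALUES (1, 100);
    """)
    return conn


def run_tick(conn, account_id, new_history_id, senders, crash_before_commit=False):
    """Queue senders and advance the cursor atomically."""
    with conn:  # commits if the body finishes, rolls back if it raises
        for sender in senders:
            conn.execute(
                "INSERT INTO inbox_new_senders_queue (account_id, sender) VALUES (?, ?)",
                (account_id, sender),
            )
        if crash_before_commit:
            raise RuntimeError("simulated crash mid-tick")
        conn.execute(
            "UPDATE inbox_sync_state SET history_id = ? WHERE account_id = ?",
            (new_history_id, account_id),
        )


def cursor_position(conn, account_id):
    row = conn.execute(
        "SELECT history_id FROM inbox_sync_state WHERE account_id = ?", (account_id,)
    ).fetchone()
    return row[0]


def queue_size(conn):
    return conn.execute("SELECT COUNT(*) FROM inbox_new_senders_queue").fetchone()[0]
```

If the tick raises before the UPDATE, the queued rows roll back with it, so the next tick replays from the old cursor without leaving duplicates behind.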

No messages table

There is no inbox_messages table. Messages are events, not records. Gmail already stores them; replicating them locally would add sync overhead and stale-data risk without adding capability. The historyId cursor is the only Gmail state the editorial pipeline owns outright.

That makes certain future features harder by construction. Displaying message content in the Inbox Hub UI, cross-account deduplication, search across message history — all of those would require a local message store the schema deliberately excludes.

Out of scope for Phase E: all of the above. The design makes them hard on purpose for now. Adding a messages table later is a distinct schema migration, not a gap in E1.

The HTTP layer

Sprint E4 mounted twelve route handlers on brain_server.py under /api/v1/inbox/*, all behind bearer auth. Three new modules back the routes, among them sse.py for the broadcaster and routes.py for the pure-function handlers.

sse.py has a 64-subscriber cap and a 30-second keepalive interval. Over-cap connections are refused; slow subscribers get dropped rather than backed up. This is a single-user local service — the cap exists to prevent a runaway client loop from leaking file descriptors, not to handle concurrent load.
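A rough sketch of a broadcaster with those two properties, assuming a threaded server and one bounded queue per subscriber. The Broadcaster class, the QUEUE_DEPTH value and next_frame() are illustrative assumptions, not the contents of sse.py; only the 64 and 30 come from the description above.

```python
import queue
import threading

MAX_SUBSCRIBERS = 64     # cap described above
KEEPALIVE_SECONDS = 30   # keepalive interval described above
QUEUE_DEPTH = 16         # hypothetical per-subscriber buffer size


class Broadcaster:
    def __init__(self, max_subscribers=MAX_SUBSCRIBERS, depth=QUEUE_DEPTH):
        self._lock = threading.Lock()
        self._subs = set()
        self._max = max_subscribers
        self._depth = depth

    def subscribe(self):
        """Return a new subscriber queue, or None when the cap is reached."""
        with self._lock:
            if len(self._subs) >= self._max:
                return None
            q = queue.Queue(maxsize=self._depth)
            self._subs.add(q)
            return q

    def unsubscribe(self, q):
        with self._lock:
            self._subs.discard(q)

    def count(self):
        with self._lock:
            return len(self._subs)

    def publish(self, event):
        """Deliver to every subscriber; drop any whose buffer is already full."""
        delivered = 0
        with self._lock:
            for q in list(self._subs):
                try:
                    q.put_nowait(event)
                    delivered += 1
                except queue.Full:
                    # Slow subscriber: remove it instead of blocking the publisher.
                    self._subs.discard(q)
        return delivered


def next_frame(q, timeout=KEEPALIVE_SECONDS):
    """Wait for the next event; on timeout emit an SSE comment line as keepalive."""
    try:
        return f"data: {q.get(timeout=timeout)}\n\n"
    except queue.Empty:
        return ": keepalive\n\n"
```

A real handler would also need a way to tell a dropped subscriber that its stream is over, for example by closing the connection when it notices it is no longer registered.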

The route handlers are pure functions returning (status, body) tuples, the same pattern as the pipeline routes. No side effects in the handler; the handler calls into the db and sync modules and returns. That structure keeps the routes testable without spinning up an HTTP server.
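To illustrate why that shape is easy to test, here is a hedged sketch of a handler for the new-senders count. The function name, the injected count_pending callable and the {"count": n} body are assumptions for illustration; the real handlers in routes.py may wire their dependencies differently.

```python
def get_new_senders_count(count_pending):
    """Return a (status, body) tuple for the count endpoint.

    `count_pending` is a hypothetical callable standing in for the db-module
    query, passed in so the handler can be exercised without a database.
    """
    try:
        pending = count_pending()
    except Exception as exc:
        # Report the failure as data rather than letting it escape the handler.
        return 500, {"error": str(exc)}
    return 200, {"count": pending}


def _broken_db():
    raise RuntimeError("db unavailable")


# Usage: no HTTP server and no socket, just a function call.
ok_status, ok_body = get_new_senders_count(lambda: 3)
err_status, err_body = get_new_senders_count(_broken_db)
```

The server layer is then the only place that turns the tuple into an actual HTTP response.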

The badge

Sprint E5 added one change to index.html: a New Senders badge on the existing email triage card. It polls /api/v1/inbox/new-senders/count every 60 seconds alongside loadToday(). The badge disappears at zero.

Sixty seconds is the right interval. Fast enough to surface a new sender within a working session, slow enough not to flood the endpoint during a working evening. Real-time SSE for the badge count would be over-engineering — a new sender arriving 30 seconds earlier doesn’t change the decision.

The badge sits next to the urgent count, not in place of it. Urgent counts message priority; the new-senders count tracks classification debt. Both are useful; neither subsumes the other.

The UI

Sprint E6 shipped inbox.html with a 3-column grid and two breakpoints — 2-column on medium viewports, 1-column on mobile. Four Shadow DOM components live under docs/brain-guide/components/: <bb-inbox-sidebar> and three others. Same pattern as <bb-pipeline-status>: encapsulated styles, encapsulated fetch, slots for progressive enhancement.

The pipeline page and the inbox page have the same structural requirements: bearer auth, SSE for live updates, REST fallback for initial load. Sharing the component architecture keeps polling logic and the auth handshake in one place. The pattern is earning its keep.

The partial index

idx_new_senders_pending filters to unresolved rows only. Over months of decisions, resolved rows accumulate far beyond pending ones. A full index scan answering “how many pending unknowns?” slows as that tail grows. The partial index keeps query cost tied to the pending count rather than the total.
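A small sketch of what such an index could look like, assuming SQLite and a nullable resolved_at column that marks a row as decided. Both are assumptions: the column names are hypothetical and the real schema may express "unresolved" differently.

```python
import sqlite3

SCHEMA = """
CREATE TABLE inbox_new_senders_queue (
    id INTEGER PRIMARY KEY,
    sender TEXT NOT NULL,
    resolved_at TEXT              -- NULL while the sender is still pending
);
-- Only unresolved rows are stored in the index.
CREATE INDEX idx_new_senders_pending
    ON inbox_new_senders_queue (sender)
    WHERE resolved_at IS NULL;
"""

PENDING_COUNT = "SELECT COUNT(*) FROM inbox_new_senders_queue WHERE resolved_at IS NULL"


def build_demo(resolved=1000, pending=3):
    """Create an in-memory table with many resolved rows and a few pending ones."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    rows = [(f"old{i}@example.com", "2024-01-01") for i in range(resolved)]
    rows += [(f"new{i}@example.com", None) for i in range(pending)]
    conn.executemany(
        "INSERT INTO inbox_new_senders_queue (sender, resolved_at) VALUES (?, ?)", rows
    )
    conn.commit()
    return conn


def pending_count(conn):
    return conn.execute(PENDING_COUNT).fetchone()[0]


def query_plan(conn):
    """Return the planner's description of how it would run the count query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + PENDING_COUNT).fetchall()
    return [row[-1] for row in rows]  # the detail text is the last column
```

Calling query_plan() shows whichever plan the local SQLite build chooses. The point of the sketch is that the index holds one entry per pending row, however many resolved rows sit in the table.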

This is cheap insurance. The performance difference is invisible today. In a year of daily syncs, it’s the difference between a sub-millisecond badge query and a noticeably slow one. Writing the partial index is a statement about which query is on the hot path — and a commitment to not revisiting it later.

Where Phase E sits

Five sprints done. E7 is a smoke runbook and 7-day soak verification — the same gate the pipeline module went through before it was treated as stable.

The soak matters because the historyId cursor is the only synchronisation mechanism between Gmail and the editorial pipeline. A sync crash mid-tick, an API timeout, or a duplicate delivery can cause the cursor to drift — the queue picks up phantom senders or misses real ones. Seven days of live syncs is the minimum to trust the cursor logic under actual conditions.

The Inbox Hub is invisible to everyone except me. The partial index and the SSE broadcaster are invisible during normal use. The badge is the only surface that matters.

The badge is also the only surface that has to work correctly for Phase E to justify five sprints. Everything else is infrastructure supporting that one count. When the badge appears, a sender is queued; when it disappears, the queue is clear. That’s the product.
