I built six sprints of article scheduling. For four of them, nothing fired.

Sprint 4 landed and the ⏰ at… button worked. I could tap it, reply with “tomorrow 09:00” or “+3h” or a full ISO timestamp, and the article would get a scheduled time back. article_queue had the row. The state was scheduled.

Nothing fired.

Sprint 5 landed shortly after. The devlog note reads: “Scheduled articles now actually fire.” That gap between setting a schedule and executing one was four sprints wide. That was not an accident.

Six sprints, two jobs

The article pipeline’s scheduling layer shipped across six sprints. The devlog entries compress a design decision that’s easy to miss: sprints 1–4 built the mechanism to record a schedule, sprint 5 built the mechanism to execute one, and sprint 6 hardened the edges.

Those are different problems. Treating them as one is how you end up with a sprint trying to parse free-text time while managing queue state, rendering Telegram buttons, running a sweep daemon, and handling slot exhaustion. Separating them at the sprint level meant each sprint’s code had one concern to get right.

Sprints 1–2: the invisible layer

Sprint 2 shipped dvlaw/schedule_grammar.py. The devlog is explicit: “No external behaviour change yet — sprints 3–4 wire the parser in.” Nothing visible from Telegram. The bot didn’t change.

That’s not a bug in the plan; it’s the plan. A free-text time parser that turns “tomorrow 09:00” and “+3h” and “2026-06-15 14:00 UTC” into deterministic timestamps is self-contained logic. It has no business touching Telegram callbacks or queue state. Writing it in isolation means it can be tested in isolation. The coupling arrives later, when sprint 4’s reply listener hands it a raw string and expects a timestamp back.
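A minimal sketch of what a parser of this shape could look like. This is not the contents of dvlaw/schedule_grammar.py: the function name parse_schedule, the regexes, and the choice to treat every input as UTC are assumptions made for illustration.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical grammar: three input shapes, all resolved in UTC (an assumption).
_OFFSET = re.compile(r"^\+(\d+)([mhd])$")
_TOMORROW = re.compile(r"^tomorrow (\d{1,2}):(\d{2})$", re.IGNORECASE)
_ABSOLUTE = re.compile(r"^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2})(?: UTC)?$", re.IGNORECASE)
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_schedule(text: str, now: datetime) -> datetime:
    """Resolve a free-text time to an aware UTC datetime, relative to `now`.

    `now` is passed in rather than read from the clock, so the same input
    always produces the same output in a test.
    """
    s = text.strip()

    m = _OFFSET.match(s)
    if m:
        return now + timedelta(**{_UNITS[m.group(2)]: int(m.group(1))})

    m = _TOMORROW.match(s)
    if m:
        day = (now + timedelta(days=1)).date()
        # datetime() raises ValueError itself for hours or minutes out of range.
        return datetime(day.year, day.month, day.day,
                        int(m.group(1)), int(m.group(2)), tzinfo=timezone.utc)

    m = _ABSOLUTE.match(s)
    if m:
        parsed = datetime.strptime(f"{m.group(1)} {m.group(2)}", "%Y-%m-%d %H:%M")
        return parsed.replace(tzinfo=timezone.utc)

    raise ValueError(f"unrecognised time: {text!r}")
```

Because the function takes no Telegram or queue arguments, it can be exercised with nothing but a fixed `now` and a list of strings.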

Two sprints of invisible work before anything changes in the UI is a real cost. The sprint counter feels like slow progress. It isn’t; it’s verifiable progress on a component that needs to be right before anything else leans on it.

Sprint 3: first visible change

Sprint 3’s devlog note calls it the “first sprint with user-visible behaviour change.” From that sprint forward, every voice-fixed article in Telegram has a [📅 schedule] button. Tapping it opens a six-button sub-menu: [🚀 now], [⏰ at…], [🪟 next slot], [📥 hold], [🛑 cancel schedule], [« back].

Each of those buttons records intent. Tap “🚀 now” and the queue row gets an immediate publish timestamp. Tap “⏰ at…” and the bot enters a pending-reply state, holding for the free-text time. Tap “📥 hold” and the article stays queued indefinitely. None of these actions publish anything. They update state.
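As a rough sketch, recording intent can be modelled as a pure function from a queue row and a button action to a new row. The QueueRow type, the action strings, and every state name other than "scheduled" are hypothetical; only article_queue, the scheduled state, and planned_publish_at come from the devlog.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class QueueRow:
    """Hypothetical in-memory stand-in for one article_queue row."""
    article_id: str
    state: str
    planned_publish_at: Optional[datetime] = None


def record_intent(row: QueueRow, action: str, now: datetime,
                  next_slot: Optional[datetime] = None) -> QueueRow:
    """Return the row as it should look after a button tap. Publishes nothing."""
    if action == "now":
        return replace(row, state="scheduled", planned_publish_at=now)
    if action == "at":
        # Time is not known yet; the bot waits for a free-text reply.
        return replace(row, state="awaiting_time")
    if action == "next_slot":
        if next_slot is None:
            raise LookupError("no free slot available")
        return replace(row, state="scheduled", planned_publish_at=next_slot)
    if action == "hold":
        return replace(row, state="held", planned_publish_at=None)
    if action == "cancel":
        return replace(row, state="unscheduled", planned_publish_at=None)
    raise ValueError(f"unknown action: {action!r}")
```

Even the "now" branch only writes a timestamp. Whether anything is published is left to whatever reads the row later.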

Sprint 3 also wired M2 preflight — queue-aware checks that run before the menu is shown, so the offered options match what’s actually valid for that article’s current state. That’s the record layer being honest about what it knows, not execution logic.

Sprint 4: closing the input loop

Sprint 4’s job was narrow: close the ⏰ at… loop that sprint 3 left open. I tap ⏰ at…, the bot enters the pending-reply state, I type a free-text time, and the grammar parser from sprint 2 resolves it to a timestamp. Per-reason error replies handle the cases where parsing fails: ambiguous input, times in the past, formats the grammar doesn’t recognise.
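One way to keep per-reason replies honest is to make the reasons an enum and the replies a table keyed by it. The sketch below assumes that shape; the Rejection names, the reply wording, and the check_resolved helper are all hypothetical, and the parser is assumed to hand back None for input it cannot read.

```python
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class Rejection(Enum):
    """The three failure cases named above (enum names are hypothetical)."""
    AMBIGUOUS = "ambiguous"
    IN_PAST = "in_past"
    UNRECOGNISED = "unrecognised"


# Hypothetical reply texts, one per reason.
REPLIES = {
    Rejection.AMBIGUOUS: "That could mean more than one time. Try 'tomorrow 09:00'.",
    Rejection.IN_PAST: "That time has already passed. Pick a time in the future.",
    Rejection.UNRECOGNISED: "I couldn't read that. Try '+3h' or '2026-06-15 14:00 UTC'.",
}


def check_resolved(ts: Optional[datetime], now: datetime) -> Optional[Rejection]:
    """Return why a parsed time is rejected, or None if it can be recorded.

    AMBIGUOUS is not produced here; in this sketch the parser itself would
    report it, since only the parser sees the raw text.
    """
    if ts is None:
        return Rejection.UNRECOGNISED
    if ts <= now:
        return Rejection.IN_PAST
    return None
```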

After sprint 4, the record layer was complete. Every scheduling intent I might form had a path into article_queue. That covered publish now, publish at a specific time, publish in the next available slot, hold indefinitely, and cancel. Nothing had ever been dispatched.

Sprint 5: the first dispatch

Sprint 5’s devlog note is terse: “Scheduled articles now actually fire.” Three things landed: the schedule sweep daemon, publisher_lock, and the dispatch wiring.

The sweep daemon starts alongside brain_server’s lifespan and sweeps article_queue on a regular cadence. When it finds a row with a planned_publish_at in the past, it dispatches. publisher_lock prevents two sweep cycles from dispatching the same article concurrently — the standard problem with sweep-based dispatch when the sweep interval and the job duration are close.
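A single sweep cycle might look roughly like this. It is a sketch under stated assumptions: the queue is a plain list of dicts rather than the real article_queue storage, publisher_lock is stood in for by a threading.Lock (the devlog does not say how the real lock is implemented), and the "published" state name is hypothetical.

```python
import threading
from datetime import datetime, timezone
from typing import Callable, Dict, List

# Stand-in for publisher_lock; the real mechanism is not described in the devlog.
publisher_lock = threading.Lock()


def sweep_once(queue: List[Dict], now: datetime,
               dispatch: Callable[[Dict], None]) -> int:
    """Dispatch every scheduled row whose time has come; return how many fired."""
    # Non-blocking acquire: if a previous cycle is still publishing,
    # skip this tick instead of dispatching the same rows twice.
    if not publisher_lock.acquire(blocking=False):
        return 0
    try:
        fired = 0
        for row in queue:
            if row["state"] == "scheduled" and row["planned_publish_at"] <= now:
                dispatch(row)
                row["state"] = "published"  # hypothetical state name
                fired += 1
        return fired
    finally:
        publisher_lock.release()
```

Marking the row inside the locked section is what makes a second sweep a no-op: by the time the lock is free again, the row no longer matches the "scheduled" filter.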

This is where the architectural split pays off. The sweep daemon only answers one question: are there scheduled articles whose time has come? It doesn’t know about Telegram buttons or free-text parsing. It reads article_queue. It dispatches. The record layer’s job was to make article_queue correct; the execution layer’s job is to act on it.

Sprint 6: the edge cases

Sprint 6 layered UX polish onto sprint 5’s sweep. It closed the three remaining items from the original plan (past-due grace policy, 48-hour hard-expire, and slot-exhaustion recovery UX), plus a security-LOW publisher-exception.

Past-due grace handles the gap between when an article was scheduled and when the sweep actually picks it up. If the gap is small, dispatch proceeds. If the article is more than 48 hours past due, the hard-expire applies and it doesn’t. Stale scheduled rows that never fired should not silently dispatch days later. The hard-expire makes that explicit.
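The decision the sweep makes per row can be sketched as a three-way classification. Only the 48-hour figure comes from the devlog. The function name, the return labels, and the choice to treat exactly 48 hours late as still dispatchable are assumptions, and any grace threshold shorter than 48 hours is collapsed into the single cutoff here because the devlog does not describe it.

```python
from datetime import datetime, timedelta, timezone

HARD_EXPIRE = timedelta(hours=48)  # from the devlog; the rest is illustrative


def sweep_decision(planned_publish_at: datetime, now: datetime) -> str:
    """Classify a scheduled row as 'wait', 'dispatch', or 'expire'."""
    if planned_publish_at > now:
        return "wait"  # not due yet
    lateness = now - planned_publish_at
    if lateness > HARD_EXPIRE:
        return "expire"  # too stale to publish silently
    return "dispatch"  # due, and still inside the allowed window
```

Returning a label instead of dispatching directly keeps the policy testable without a running daemon: the sweep decides what to do with each label.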

Slot-exhaustion recovery UX closes a separate gap: what happens when I tap “🪟 next slot” and there are no available slots? Sprint 6 gave it a recovery path.

The security-LOW publisher-exception is listed in the devlog as a sprint 6 closure, without detail. Out of scope here.

The pattern

Recording intent and executing on it have different failure modes.

Recording fails when input handling is wrong — when the grammar parser misreads “+3h”, when the queue state machine transitions to an invalid state, when M2 preflight shows the wrong options. These failures are testable in isolation, without a running sweep daemon, without any live dispatch.

Executing fails when timing logic is wrong — when the sweep interval is misconfigured, when publisher_lock races, when past-due grace expires too aggressively. These failures require the daemon to be running and a populated queue to sweep.

Build both layers at once and you can’t tell which failed. A dispatch that doesn’t happen could be a parsing problem, a state machine problem, a sweep problem, or a lock problem. Keeping them in separate sprints narrowed the diagnosis space: when sprint 5’s sweep wasn’t firing correctly, the problem lived in the sweep, not the entire stack.

This is the producer–consumer split applied to a scheduled-job system: the part that writes to the queue and the part that drains it stay independent. The novel part is doing it deliberately at the sprint level, so each sprint ships a complete, testable slice of one job rather than an incomplete slice of both.

What didn’t split cleanly

Sprint 3 bundled M2 preflight with the Telegram button wiring. Preflight is partially execution-adjacent: it decides what is valid rather than only recording intent. It could have been its own sprint.

It wasn’t. Preflight is tightly coupled to the button-rendering logic; splitting them would have required sprint 3 to render a menu without valid state checks, which is a worse intermediate state. Sometimes the right split isn’t a clean cut.

Where this sits

B-schedule is complete. The scheduling layer now sits between two stages. Upstream is the draft stage, where Sonnet produces an MDX file and I review it in Telegram. Downstream is the ship stage, where the article hits the site and the pipeline raises a PR. I tap a button, set a time or take the next available slot, and the sweep fires when the time arrives.

Six sprints, from button design through to live dispatch. The record–execute boundary held across all six. That’s the part worth keeping for the next scheduler.