One osascript harness, two consumers: how skipping Archon kept the article pipeline stack composable

The pivot happened mid-session. The variant-picker had just shipped and the next logical move was wiring thesis generation to a real model. Archon was on the table — it would have handled orchestration, routing, the multi-agent scaffolding. The note in the devlog is unambiguous: “We’re not building an Archon team. We are building this particular harness.” That sentence closed the architectural question before it became a debate.

The harness in question is an osascript-based terminal runner, project-owned, wired directly to Opus 4.7. No SDK dependency for the hot path. No shared orchestration layer with opinions about how agents should compose. The thesis generator was its first consumer. The voice-fix rewrite layer became its second. That second consumer is what validated the design.

Why Archon was the wrong shape

Archon is a multi-agent orchestration framework. At this point the article pipeline project needed one agent doing one thing: take a draft, apply a voice profile, return prose. An orchestration framework optimised for agent teams introduces routing abstractions, message-passing conventions, and a deployment surface that the article pipeline doesn’t need and can’t easily inspect.

The cost of Archon would have been indirection. Every call to Opus would pass through a layer that the article pipeline doesn’t own, can’t modify mid-sprint, and can’t cheaply instrument. For a solo project with a build log as its primary diagnostic tool, that’s the wrong tradeoff. The osascript harness is twenty-odd lines the project owns entirely. When something behaves oddly at 2am, the failure is in a file I can read.

Archon would have been fine if the architecture were a team. It wasn’t.

The harness itself

The implementation routes through osascript to spawn a Terminal session and drive Opus 4.7 directly. The mechanism is deliberately dumb: osascript opens Terminal, sends a command string, and the project script handles the rest. No persistent daemon. No message broker. The harness is stateless between invocations.
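
A minimal sketch of that mechanism, for orientation only. The function names are hypothetical and the project's actual script is not reproduced here. The one real piece is the AppleScript one-liner `tell application "Terminal" to do script "..."`, which opens a Terminal window and runs the given command string. How the command string reaches Opus is left to the project script and is not shown.

```python
import subprocess
from typing import List


def applescript_quote(text: str) -> str:
    """Escape backslashes and double quotes for an AppleScript string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_osascript_argv(command: str) -> List[str]:
    """Build the argv that asks Terminal to run `command` in a new window."""
    script = f'tell application "Terminal" to do script "{applescript_quote(command)}"'
    return ["osascript", "-e", script]


def run_in_terminal(command: str) -> None:
    """Fire one stateless invocation (macOS only; nothing persists between calls)."""
    subprocess.run(build_osascript_argv(command), check=True)
```

Keeping the argv builder separate from the call that runs it means the string escaping can be checked on any machine, without macOS or a Terminal session.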

That statelessness is a tradeoff worth naming. A more sophisticated harness would maintain session context across calls, which would let Opus accumulate knowledge of the document across multiple rewrite passes. The current design doesn’t do that. Each invocation is independent. For thesis generation and single-pass voice rewriting, independent is sufficient. If the architecture later needs multi-pass refinement, the harness will need revisiting.

The thesis generator was the first proof. It worked. That was a necessary condition, not a sufficient one. The sufficient condition was a second consumer that arrived independently and used the harness without modification.

The de-AI-ish stack and why it has two layers

The voice-fix sprint is where the architectural bet paid off. The stack that emerged has two distinct layers, and understanding why both exist matters.

The lower layer is dvlaw/ai_tells.py, a pure regex dictionary that catches mechanical AI tells. No model, no network call, no latency. The B4-dei-list-extend sprint broadened its coverage, because the dictionary is the foundation the rest of the stack sits on. Regex is fast, deterministic, and cheap to audit. It catches the structural patterns: hedging constructions, performative warmth markers, padding phrases. What it cannot do is rewrite prose so that it keeps the meaning while shedding the tell. That requires judgement.
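
To make the shape of a regex dictionary concrete, here is a small sketch. It is not the contents of dvlaw/ai_tells.py. The category names follow the three pattern families named above, but the specific phrases, the TELLS name, and the return shape of scrub are assumptions for illustration.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical entries: one compiled pattern per category of tell.
TELLS: Dict[str, re.Pattern] = {
    "hedging": re.compile(r"\b(?:it'?s worth noting that|arguably|perhaps)\b", re.IGNORECASE),
    "warmth": re.compile(r"\b(?:great question|i hope this helps)\b", re.IGNORECASE),
    "padding": re.compile(r"\b(?:in order to|at the end of the day)\b", re.IGNORECASE),
}


def scrub(text: str) -> List[Tuple[int, str, str]]:
    """Return (line number, category, matched phrase) for every tell found."""
    hits: List[Tuple[int, str, str]] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for category, pattern in TELLS.items():
            for match in pattern.finditer(line):
                hits.append((lineno, category, match.group(0)))
    return hits
```

Everything here is a pure function over a string, which is what makes the layer deterministic and cheap to audit: the same input always yields the same list of hits.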

The upper layer is the Opus rewrite agent, the B4-voice-fix-agent sprint. Its job is to take what the regex layer flagged and rewrite it through the voice profile as a positive constraint — not just “remove the hedging,” but “rewrite this with the voice profile as the target state.” That distinction matters. A scrubber without a positive target produces prose that has had things removed. An agent with a voice profile produces prose that sounds like something. The voice profile is what gives the upper layer direction rather than just subtraction.
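
One way to picture "voice profile as a positive constraint" is in how the rewrite prompt might be assembled. This is a sketch under assumptions: build_rewrite_prompt, its parameters, and the prompt wording are hypothetical, not the agent's actual prompt.

```python
from typing import List


def build_rewrite_prompt(draft: str, voice_rules: List[str], flagged: List[str]) -> str:
    """Assemble a prompt that names the voice rules as the target state."""
    rules = "\n".join(f"{i}. {rule}" for i, rule in enumerate(voice_rules, start=1))
    # The flagged list is optional input: the prompt still works with none.
    tells = "\n".join(f"- {item}" for item in flagged) or "- (none flagged)"
    return (
        "Rewrite the draft so it reads in the voice described by these rules.\n"
        "Treat the rules as the target state, not as a list of things to delete.\n\n"
        f"Voice rules:\n{rules}\n\n"
        f"Phrases flagged by the regex layer:\n{tells}\n\n"
        f"Draft:\n{draft}\n"
    )
```

The flagged phrases are context for the rewrite, while the rules carry the direction. An empty flagged list still yields a usable prompt.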

Those two layers are independent by design. The regex scrubber runs without the agent. The agent consumes the scrubber’s output but doesn’t require it. They compose; they don’t depend on each other at runtime. That composability came for free from the harness architecture. Archon would have enforced a pipeline shape; the harness just provides a call surface.

The voice profile as a prerequisite

The B-voice-seed sprint ran before either layer was wired up. The devlog describes it as CJ writing a 10-rule codification of his voice directly, bypassing the multi-day enrolment interview that the eventual B-voice agent would otherwise run to learn the voice empirically.

That shortcut is worth calling out. The enrolment interview exists because voice is hard to articulate from the inside — you know your voice better by seeing it reflected back than by describing it cold. Bypassing the interview means the v1 profile is a hypothesis about the voice, not an empirically derived one. It’s a faster path to a working system, with the known risk that the profile codifies what CJ thinks his voice is rather than what it demonstrably is. The devlog notes this as a v1 baseline, not a settled state. The B-voice agent will eventually close that gap.

For the voice-fix agent to work at all, though, a profile had to exist first. Running the enrolment interview to get there would have taken days. The hand-written rules were sufficient to make the sprint land. That call was correct given the sprint goal.

The CLI surface

The B4-voice-fix-cli sprint closed the user-facing gap. The command the article pipeline_voice_fix.py <file.mdx> runs ai_tells.scrub, writes the .tells.txt sidecar, and prints a preview of up to five flagged items before handing off to the rewrite layer.

The sidecar is the part worth noting. Every run produces a .tells.txt file alongside the source MDX. That means every pass is auditable without re-running the tool. The reviewer can see what the regex layer caught, what the agent rewrote, and where the two diverged, in a file that persists independently of the terminal session.
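
A sketch of the sidecar and preview steps, with assumptions stated up front. The exact sidecar naming is not given here, so replacing the .mdx suffix with .tells.txt is a guess. The function names are hypothetical, and this version records only the flagged items, not the agent's rewrites.

```python
from pathlib import Path
from typing import List

PREVIEW_LIMIT = 5  # the CLI previews up to five flagged items


def write_sidecar(source: Path, flagged: List[str]) -> Path:
    """Write flagged items next to the source file and return the sidecar path."""
    # Assumed naming: post.mdx -> post.tells.txt
    sidecar = source.with_suffix(".tells.txt")
    sidecar.write_text("\n".join(flagged) + ("\n" if flagged else ""), encoding="utf-8")
    return sidecar


def preview(flagged: List[str], limit: int = PREVIEW_LIMIT) -> List[str]:
    """Return at most `limit` flagged items for terminal display."""
    return flagged[:limit]
```

Because the sidecar is a plain text file beside the MDX, it can be diffed between runs or read long after the terminal session is gone.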

Out of scope: the CLI does not currently support batch processing across a directory. One file per invocation, which is sufficient for the current workflow.

Two consumers, one primitive

The thesis generator and the voice-fix agent are architecturally unrelated to each other. They share one thing: both route through the osascript harness to reach Opus 4.7. Neither required the harness to change when the second consumer arrived. That’s the claim the devlog entry made — “validating the reusability claim” — and it held.

The practical implication is that any future the article pipeline feature that needs Opus has a clear path. Write a script, call the harness, handle the output. The orchestration question doesn’t reopen.
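
The "write a script, call the harness, handle the output" pattern can be sketched as two consumers sharing one call surface. This assumes the harness can be wrapped as a function that takes a prompt and returns text. The Harness alias, both function names, and the prompt strings are hypothetical.

```python
from typing import Callable

# Assumed call surface: prompt in, model text out.
Harness = Callable[[str], str]


def generate_thesis(notes: str, harness: Harness) -> str:
    """First consumer: ask for a thesis and tidy the output."""
    return harness(f"Propose a one-sentence thesis for these notes:\n{notes}").strip()


def voice_fix(draft: str, voice_profile: str, harness: Harness) -> str:
    """Second consumer: same harness, different prompt and post-processing."""
    return harness(f"Rewrite in this voice:\n{voice_profile}\n\nDraft:\n{draft}").strip()
```

Neither consumer knows about the other, and a stand-in function can replace the harness in tests. A third consumer would be one more function with the same parameter.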

That’s the return on skipping Archon. Not a more powerful architecture. A simpler one that stayed simple when it grew.

What’s unresolved

The v1 voice profile is a known approximation. The Opus rewrite layer is only as good as the rules it’s given, and rules written from introspection have gaps. The B-voice enrolment agent will run eventually and the profile will be revised. When that happens, every previous voice-fix output becomes a candidate for a re-pass. The sidecar files will make that tractable.

The harness is also untested under load. It works for single invocations against one file. Whether it degrades gracefully when called from a batch job or a longer agent chain is unknown. That question doesn’t need answering this sprint. It needs answering before any batch feature ships.

The devlog entries for 2026-05-28 are the canonical record. This article is the consolidated form.