Engineering process quality arc — AC gates, smoke rules, UX audit automation

Vidar the Patient — Architecture

By the end of this chapter you’ll understand:

What the five root causes from the iOS TestFlight post-mortem were and which child issue addresses each one
How the skill, CI, agent, and docs layers work together to make quality checks structural rather than optional
What the combined pipeline looks like from task creation to PR merge

Status: Delivered
CAS: CAS-2731
Delivered: 2026-05-14
PRs: #724, #725, #726, #727, #728, #729, #730, #731

What’s new

A full arc of process-quality improvements shipped in a single coordinated batch. Tasks now require acceptance criteria before implementation begins. UI PRs require a UX walk and visual evidence before merge. Every stateful surface requires a two-action smoke test. Four test layers — Storybook Loki, axe a11y, Playwright visual sanity, and Eivind’s device walk — run automatically on every relevant PR. The result: the quality checks that used to depend on someone remembering to do them are now structural requirements.
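The gates described above can be read as two checkpoints: one before implementation starts and one before merge. The following is a minimal sketch of that reading, not the shipped tooling. Every type, field, and function name here is hypothetical, and it assumes "two-action smoke test" means at least two user actions exercised against the stateful surface.

```typescript
// Hypothetical shapes for illustration; none of these names come from CAS-2731.
interface Task {
  id: string;
  acceptanceCriteria: string[];
}

interface PullRequest {
  task: Task;
  touchesUi: boolean;
  touchesStatefulSurface: boolean;
  uxWalkDone: boolean;
  visualEvidence: string[]; // e.g. screenshot paths attached to the PR
  smokeActionsRun: number;  // user actions exercised by the smoke test
}

// Checkpoint 1: a task may not start without acceptance criteria.
function canStartImplementation(task: Task): boolean {
  return task.acceptanceCriteria.length > 0;
}

// Checkpoint 2: list every gate still open before merge.
function missingMergeGates(pr: PullRequest): string[] {
  const missing: string[] = [];
  if (!canStartImplementation(pr.task)) {
    missing.push("acceptance criteria");
  }
  if (pr.touchesUi) {
    if (!pr.uxWalkDone) missing.push("UX walk");
    if (pr.visualEvidence.length === 0) missing.push("visual evidence");
  }
  if (pr.touchesStatefulSurface && pr.smokeActionsRun < 2) {
    missing.push("two-action smoke test");
  }
  return missing;
}
```

A PR is mergeable under this sketch only when `missingMergeGates` returns an empty list. Returning the list of open gates, rather than a boolean, lets a reviewer or bot say exactly what is outstanding.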

Components delivered

CAS	What shipped	Doc
CAS-2732	Acceptance criteria required on every task	doc
CAS-2733	Two-action smoke gate for stateful surfaces	doc
CAS-2734	UX walk auto-fires on every UI PR	doc
CAS-2737	Eivind’s UX walk skill + autonomous Paperclip routine	doc
CAS-2740	Storybook Loki + test-runner + axe CI infrastructure	doc
CAS-2741	Storybook page-level stories backfilled	doc
CAS-2742	Playwright visual sanity + axe per route	doc
CAS-2743	Astrid UX advisor on every UI PR	doc

Why we built it

A post-mortem on the iOS TestFlight arcs (CAS-2460–2587) showed a consistent pattern: bugs that should have been caught at review time were instead caught by the regent on a physical device after a build shipped. The root causes were:

Tasks without acceptance criteria → no agreed success condition → “it works” was self-assessed.
UI PRs without UX review → layout bugs invisible in code review.
No visual regression baseline → pixel-level drift undetected.
No a11y gate → contrast and semantic markup regressions undetected.
No automated stateful-surface smoke → race conditions and broken flows shipped.

CAS-2731 addressed all five root causes in a single coordinated effort. Each child issue is independently useful, but their combined effect is a quality pipeline that catches at review time what used to be caught on a physical device after a build had shipped.
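The pairing of root causes to child issues can be written down as a lookup table. The source does not spell this mapping out. The version below is inferred from the child issue titles in the components table, so treat it as an assumption to verify against each child doc. The `RootCause` labels and the `addressedBy` name are hypothetical.

```typescript
// Child issues as listed in the "Components delivered" table.
const delivered = [
  "CAS-2732", "CAS-2733", "CAS-2734", "CAS-2737",
  "CAS-2740", "CAS-2741", "CAS-2742", "CAS-2743",
] as const;
type ChildIssue = (typeof delivered)[number];

// Hypothetical labels for the five root causes from the post-mortem.
type RootCause =
  | "no-acceptance-criteria"
  | "no-ux-review"
  | "no-visual-baseline"
  | "no-a11y-gate"
  | "no-stateful-smoke";

// ASSUMPTION: mapping inferred from issue titles, not stated in the source.
const addressedBy: Record<RootCause, ChildIssue[]> = {
  "no-acceptance-criteria": ["CAS-2732"],
  "no-ux-review": ["CAS-2734", "CAS-2737", "CAS-2743"],
  "no-visual-baseline": ["CAS-2740", "CAS-2741", "CAS-2742"],
  "no-a11y-gate": ["CAS-2740", "CAS-2742"],
  "no-stateful-smoke": ["CAS-2733"],
};

// Child issues that no root cause points at (empty if coverage is complete).
function unmappedChildren(): ChildIssue[] {
  const used = new Set<ChildIssue>(Object.values(addressedBy).flat());
  return delivered.filter((id) => !used.has(id));
}
```

Typing the table as `Record<RootCause, ChildIssue[]>` makes the compiler reject a root cause with no entry or a child ID that was not delivered, which mirrors the claim that every root cause is closed by at least one child issue.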

What changed under the hood

See each child epic’s feature doc for the technical details. At the process level:

Skill layer: casaconomy-task-worktree, casaconomy-review-protocol, and casaconomy-planning were all updated to enforce the new gates.
CI layer: Two new GitHub Actions jobs (playwright-visual-sanity, storybook-loki); axe-playwright and @storybook/test-runner added to devDependencies.
Agent layer: Eivind’s Paperclip routine fires every 30 minutes; Astrid’s UX advisory fires on every UI PR via the astrid-ux-advisor skill.
Docs layer: docs/ux/ux-ui-guidelines-v0.md, docs/ux-baseline/, docs/testing-strategy.md updated.
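To make "fires on every UI PR" concrete, here is a rough sketch of how a PR's changed files might select the PR-triggered checks. The job and skill names come from the list above. The path patterns, and the assumption that all three fire together on any UI change, are illustrative guesses rather than the real trigger rules. Eivind's routine is left out because it runs on a 30-minute schedule rather than per PR.

```typescript
// PR-triggered checks named in this document.
const UI_CHECKS = [
  "storybook-loki",
  "playwright-visual-sanity",
  "astrid-ux-advisor",
] as const;

// HYPOTHETICAL patterns for what counts as a UI change.
const UI_PATTERNS: RegExp[] = [/\.tsx$/, /\.css$/, /\.stories\.ts$/];

// A PR is a "UI PR" if any changed file matches a UI pattern.
function isUiPr(changedFiles: string[]): boolean {
  return changedFiles.some((file) =>
    UI_PATTERNS.some((pattern) => pattern.test(file)),
  );
}

// Names of the checks that would fire for this PR under the assumptions above.
function checksFor(changedFiles: string[]): string[] {
  return isUiPr(changedFiles) ? [...UI_CHECKS] : [];
}
```

For example, `checksFor(["docs/testing-strategy.md"])` yields no checks, while a change set that includes a `.tsx` file yields all three. Deciding from the file list keeps the trigger structural: nobody has to remember to request the checks.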

Known limitations / follow-on work

Eivind’s walk is skill-guided but not yet fully scripted. Full Simulator automation is deferred.
The playwright-visual-sanity screenshot baseline requires a first clean run with --update-snapshots before drift detection kicks in.
Storybook page-level stories cover 7 surfaces; remaining pages are follow-on work (CAS-2741 tracks the backlog).

Recap

Five root causes from the iOS arc (no AC, no UX review, no visual baseline, no a11y gate, no stateful smoke) each map to a specific child CAS that closes it structurally.
The arc touches four layers: skill prompts (task-worktree, review-protocol, planning), CI jobs (playwright-visual-sanity, storybook-loki), agent routines (Eivind’s walk, Astrid’s UX advisory), and docs (ux-guidelines, testing-strategy).
Every check that previously depended on someone remembering is now wired to fire automatically at the right point in the pipeline.

What changed

This feature shipped in CAS-2731. See: CHANGELOG → 2026-05-18