Harness design
Trust is engineered, not assumed.
A harness is everything wrapped around a powerful, fallible actor — an LLM, a deploy script, a migration — that makes its power safe to use. I design harnesses the way other people design features. It's the most transferable thing I do.
The shape every harness reduces to. The actor can be brilliant or buggy — the loop doesn't care.
The catalog
Six harnesses I've built and operate.
The LLM never acts alone
My ops agent proposes; typed capability boundaries, pre-flight checks, and human approval gates dispose. Every proposal carries a blast-radius estimate computed from a deterministic dependency graph — not from the model's confidence.
Autonomy is earned, not granted
New automation runs in shadow mode: it says what it would do, and gets graded against what a human actually did. Only measured accuracy above threshold unlocks the execute path — revocable by the same metric.
Stage, then commit
Compile gates, atomic transfer, remote build under a supervisor that survives disconnects, then an end-to-end smoke test through the real front door. A deploy without its smoke test didn't happen.
Failures become rules
Every incident gets a written root-cause analysis with the full cascade, trigger to symptom. Recurrence converts the RCA into a permanent fix — an alert, a governor, an operating rule.
Read the file, not the "OK"
Mutating scripts must verify by reading back what they changed — a regex that didn't match still prints success. Learned at full price; now baked into every tool I ship.
Any problem → ticket → plan → window → reviewed change → validated outcome → repeat
The end-state the whole system converges toward. Security findings were just the first input source; the loop doesn't care where problems come from — only that every one leaves a record and a fix behind.
Operating rules
Written in incident blood.
Blast radius first
Before any change: what depends on this, what breaks, what's the rollback? One host at a time. Backups before, not after.
Find the cause under the cause
The disk that filled is the trigger; the pipeline that leaks files for weeks is the cause. Fixes aimed at triggers buy days. Causes buy years.
Pin what holds state
Databases and stateful apps ride pinned versions, upgraded deliberately with backups — never dragged forward by a :latest tag at 2 a.m.
Everything becomes a record
Code through merge requests, incidents through tickets, lessons through written rules. If it isn't recorded, it will be re-learned at full price.