You've built agent factories for AINA for months while your own controls sat on defaults. This is the afternoon that fixes it — grounded in how you actually work, phased so it can't sprawl.
Stop re-explaining the same guardrails every session. Convert your most-repeated prose rules — which erode under compaction and long-session drift — into mechanical config that makes the right thing automatic and ambient, so an ADHD founder with no internal reset mechanism doesn't have to hold it in his head.
The research corrected a false assumption: your cockpit is not blank. Hooks are mature, 227 skills and 20 plugins are installed, and bypassPermissions is deliberately on. What's genuinely never-touched is narrower and sharper — and that's exactly where the leverage is.
~/.claude/agents/ — zero personal subagents/goal, routines, output-styles, statusLine~/Downloads/agentops for ali repo is sitting on master with 80 dirty files and zero signal (your reconcile hook only watches the aina-platform family). And your Mac disk is at 86% used / 2.0Gi free — a real near-term failure surface.
agentops for ali folder (1,097 files) has real skills to adapt, not author: codex-prompt-builder (→ the U11 scaffold), codex-goals (→ /goal), codex-structured-outputs-and-gates (→ verifier), a session-closeout hook design (→ U4), and a full cost-discipline spec (→ U13). Genuinely fresh work shrinks to the retry stop-rule, the statusline, and the founder-brief style.
~/.codex/config.toml has goals, memories, hooks, multi_agent all on, ~20 live automations (including U2's exact targets + a disk-pressure report), and a founder-brief skill that already exists. So U2 and U3 are ports, not authors, and the cockpit's job is bringing Claude Code to the parity Codex already runs. The AINA AgentOps plugin even has the merge/reconcile machinery (U9) in tested Python.
Every decision below traces to an evidenced, recurring failure — not generic best practice:
| Pain | Why it earns a fix |
|---|---|
| Repo/branch amnesia | Highest-frequency, highest-cost. 33 project dirs in 7 days; a 627M-token "which repo is this" session. Live-verified broken now. |
| "Done" ≠ "Landed" | The #1 named failure (13-agent forensic). Partly covered by mature Stop hooks — one escape hatch remains (direct commits to main). |
| Scope-pivot without closure | 7 pivots in one session, 4 projects abandoned-not-parked. ADHD-linked. No hook catches this — Stop fires too late. |
| Prose over mechanism | Reconcile / verify / land-it live as CLAUDE.md prose an agent silently drops under pressure. Only subagent-contracts are mechanized. |
| Never-configured leverage | No personal subagents; /goal, routines, output-styles, statusLine all unused; 3 automations written but idle. |
AgentsView — the new VDS session-observability app you're wiring in the other chat — is the sensing half of this cockpit's acting half. Together they close a loop you've never had.
Phase 1 is three cheap, high-leverage wins landable this week — each its own commit. Additive-complexity items are sequenced behind them.
| Phase | Area | The move |
|---|---|---|
| 1 · now | settings / statusLine / hooks | Ambient repo/branch/dirty statusline + generalize reconcile + daily branch-parking sweep + direct-to-main nudge |
| 1 · now | scheduled routines | Wire the 3 pre-written automations + disk-guard |
| 1 · now | output-styles + skills | founder-brief style + landed-check + which-repo skills |
| 1.5 | hooks | Scope-closure pivot hook (UserPromptSubmit) |
| 2 | hooks | Wiki-recall nudge (separate UserPromptSubmit) |
| 2 | subagents | verifier · repo-reconciler · builder-lane + post-Task verify nudge |
| 2 | workflows | Reusable named-workflow library (your own realization) |
| 2 | /goal | One trial on a known-good task shape |
| 3 | delegation | Decision-rule artifact + direct-Codex default |
| 3 | agent-teams / Paperclip | Explicitly gated, not built |
| 1+ | skills | U11 Prompt-scaffold (adapt codex-prompt-builder) — highest AgentsView leverage |
| 1.5 | hooks/skills | U12 Retry / tool stop-rule (genuinely fresh) |
| 2 | skills | U13 Cost-guardrail checkpoints (adapt llm_billing) |
| 2 | routines | U14 AgentsView weekly digest → PKM report |
| 1 | hooks | U15 PreCompact hard-backstop handoff hook (you asked for this) |
| 1.5 | subagents | U16 Decision Contract + adversarial QA-verifier (decide-once) |
| 1.5 | safety | U17 Kill-switch + freeze runbook + config-security-audit |
| 2 | skills | U18 Agent-session-resume skill (port hacktivist123) |
| 3 | skills-meta | U19 Self-evolving-skills / instinct loop (ECC-style) |
An ambient statusline (repo:branch (Ndirty ↑A↓B)) on every turn; generalize reconcile-prime.sh off the hardcoded aina path so it fires in any repo; a daily sweep for branches parked >24h; and a one-line nudge when you commit straight to main. Fixes the #1 pain and the live-verified gap.
The prompts already exist in ~/.claude/automations/, labelled "not yet wired." Register aina-workspace-closer-audit first (it covers orphan work), verify one full run, then wire the other two. Add a cheap disk-guard — 2.0Gi free is a real risk.
founder-brief output style + landed-check + which-repo skillsPhase 1Make plain-English/decision-ready surfacing and the exact Landed/Not-landed closeout structural (loaded every session), and give the #1 failure a one-word check that runs the real git/GitHub verification instead of trusting an agent's self-report.
A UserPromptSubmit heuristic that notices a mid-session repo pivot with dirty prior work and injects a "close the prior thread first" nudge — advisory, never blocking. Catches the pivot when it happens, not at Stop (too late).
A second, separate UserPromptSubmit heuristic (regex, no LLM) that reminds you to run pkm-agent think on new-domain work. Ships as its own commit — never built in the same session as U4.
Formalize the parallel pattern you already run ad-hoc into named, model-pinned agents: verifier (adversarial, falsifies self-reports — the structural fix for the RunFusion repair-cascade), repo-reconciler (Haiku, isolated), builder-lane (Sonnet). Plus a nudge to verify a subagent's claims before they count.
Your own realization — you never set up and ran workflows. A small library you fire yourself (research-a-topic, review-my-branch, mine-my-sessions), seeded from this plan's own cockpit-research script.
/goal trial on a known-good shapePhase 2One trial on a uniform+verifiable fan-out you've already proven (the aina-platform Fable pattern) — removes "keep going" re-prompting before you trust it for anything that matters.
Write the Mac↔VDS↔Hermes↔Paperclip routing rule down as a durable, committed artifact; default heavy work to the proven direct-Codex lane; gate Paperclip on the 626 upgrade.
Agent-teams and Paperclip-as-default stay explicitly gated with named unblock conditions, so neither silently starts and becomes an orphaned parallel pipeline.
codex-prompt-builderPhase 1+Highest AgentsView leverage (prompt maturity = D). Make a minimum scaffold — Goal / Context / Constraints / Success-criteria / Verification — the default shape for prompts an agent runs, with a "too fuzzy → ask one question" escape hatch. Pair with a dr-gate so "done" is checkable, not asserted. Lift-ready from your own folder.
Genuinely fresh — nothing in the folder solves it. After N equivalent tool failures, force a stop-and-diagnose (failing command + error + next alternative + environment preflight) instead of looping. Directly targets the 8,305-retry bleed. Advisory, fail-open.
llm_billingPhase 2Checkpoint long/expensive runs before they compound (the $1,813 session). A soft/hard turn-or-token threshold that forces an interim artifact + summary before continuing; a running cost hint on the statusline; /goal bound clauses to terminate runaways.
Closes the sense→act→confirm loop. Weekly, read-only, no auth: agentsview stats --since 7d --format json → Sonnet narrates against the 6 review-kind headers → lands as a dated PKM report you can search. Generate the never-run instruction_opportunity_review manually once first.
The session-handoff hook you asked about. Set CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=65 + a PreCompact hook that writes a ~30-line pointer handoff (resume command + paths, not 300 lines of pasted content) to ~/.claude/handoffs/ before compaction wipes context. Paired with a soft statusline nudge at 50%. A guarantee, not discipline — directly fixes the "compaction-sessions do worse" finding.
Your "decide once, no PR dance" made mechanical. Fill a contract at spec-time (CAN / must-NEVER / files-may / files-must-not / acceptance / risk), write the never-list first, then a QA-verifier subagent behaves like an adversarial user, attacks each must-NEVER, and is forbidden from saying "looks good." No human in the gate — you're pulled in only when reality breaks the contract.
The "2am, an agent loops and opens 6 PRs" answer: one always-known command to freeze all autonomous activity in <60s, merges-default-OFF, minimum-scope git creds. Plus a one-time AgentShield-style audit of the cockpit's own settings/hooks/CLAUDE.md/MCP.
hacktivist123Phase 2Makes "continue where we left off" evidence-graded: DONE / PARTIALLY / NOT-DONE with file:line citations, read-to-end-of-transcript, git-status-first, and it preserves your "park this" deferrals instead of silently reopening them. Doubles as founder-readable status.
Skills that improve from your corrections (ECC's continuous-learning-v2). Cheap start: log every correction as a candidate edit, review weekly. Only for objectively-checkable skills — never taste/voice. Proposals never auto-merge.
Because you review behavior, not shell, the actual authoring is delegated. Two representative prompts:
Write ~/.claude/hooks/statusline.sh: in the current cwd, print
"repo:branch (Ndirty ↑ahead↓behind)" using git rev-parse --show-toplevel
(basename), git branch --show-current, git status --porcelain | wc -l,
and git rev-list --left-right --count HEAD...@{u}. Fail SILENT outside a
git repo (no output, exit 0). Then add a "statusLine" key to
~/.claude/settings.json pointing at it. Do not modify closeout-stop.sh
or durability-autopush.sh in this pass.
Create ~/.claude/skills/landed-check/SKILL.md with frontmatter
disable-model-invocation: true. On invocation it runs git status
--porcelain, git log @{u}..HEAD, and gh pr view --json state,mergeable,
then reports the exact ✅ Landed / ❌ Not-landed plain-English line from
global CLAUDE.md. No raw /srv paths. It reports actual git/GitHub state
— it never infers "done" from an agent's prior claim.
Default heavy AINA build to the already-proven aina-build-lane kernel (Claude architects and never writes implementation code; Codex-over-SSH is the workhorse; the VDS is the engine; report-file + watcher; commit-per-lane; independent verifier). This works now, with zero dependency on Paperclip.
Reserve full Paperclip-factory dispatch for genuinely multi-agent, cross-lane work, and treat it as blocked-preferred until 609→626 ships native hermes_local, native codex-login, and native watchdog. Until then, anything on Paperclip inherits 609's manual-glue fragility — the same shape of gap that produced both RunFusion repair-cascade failures (April's cross-workspace-isolation break; June's self-repair loop that discarded landed work via uncontrolled concurrency + autoMerge/autoUnpause).
The guardrails are the anti-sprawl contract — dogfooding your own scope-closure discipline on the plan itself:
| Guardrail | Why |
|---|---|
| Fail-open hooks only | Inject context / print a digest — never block on a human answer. Preserves your bypassPermissions posture. |
| Exactly 3 Phase-1 commits | U1 · U2 · U3 land separately — a cold session verifies each. "Landed not done" applied to the plan. |
| Never two UserPromptSubmit hooks in one session | U4 and U5 ship apart so heuristics don't fight over the event. |
| "Never configured X" ≠ configure everything | Subagents / agent-teams are Phase 2/3, after the cheap wins. |
| Don't rebuild what works | Only a narrow addendum touches the mature close-out hooks. |
| Delegate authoring, review behavior | Codex/Sonnet drafts; you validate the observable result. |
Live risks to respect: disk at 2.0Gi free (do the disk-guard early); the pivot hook must stay advisory or it becomes the fatigue it prevents; verify one automation end-to-end before wiring all three; extend reconcile-prime.sh carefully (much-iterated); output-styles/subagents only load after /clear — early "not working" is usually an unreloaded session.
founder-brief default for ALL sessions (Ali, 2026-07-01); U3 sets the global outputStyle with keep-coding-instructions.instruction_opportunity_review manually first? — to see its shape before wiring the weekly loop (recommended: yes).memories to match Codex? — Codex has it on; the cockpit could too.Three commits this week — an ambient statusline, three wired routines, and a founder-brief that ends every session in one plain-English line. You're one good afternoon from a cockpit that flies.