Core Concepts

This page covers the building blocks of current Sidekick delegation: Kay and Codex sidekick modes, per-session hooks, the shared active-sidekick selector, and host-owned verification. For canonical terms, see Glossary.

Working across runtimes? Keep Glossary and Compatibility handy while you read the deeper sections.

Codex Delegation Mode

When you invoke /sidekick:codex-delegate, the host activates Codex sidekick mode for the current session. The host AI does not write implementation edits directly while the mode is active. It delegates bounded work to the local OpenAI Codex CLI through codex exec, then verifies the result before reporting success.

The Codex sidekick is pinned to gpt-5.4-mini with extra-high (xhigh) reasoning effort. Sidekick injects the model, reasoning, sandbox, and approval flags through the hook layer, so the host can keep prompts focused on the task instead of runtime plumbing.

Delegation is session scoped. Once /sidekick:codex-delegate is active, implementation tasks are routed to Codex until /sidekick:codex-stop removes the current-session marker. Codex and Kay are mutually exclusive in the same host session through the shared active-sidekick selector.

Kay Delegation Model

Kay is Sidekick's Kay/OpenCode Go execution sidekick. The user-facing activation surface is /sidekick:kay-delegate, with xiaomi and ocg selectors for provider routing. Once Kay mode is active, Sidekick routes implementation through Kay's native kay exec runtime path.

Kay keeps native agents, skills, subagents, and AGENTS.md support. Sidekick contributes package wiring, a session-scoped Kay marker, the shared active-sidekick selector, a .kay/conversations.idx lookup ledger, and progress summaries from the Kay hook surface.

Codex and Kay are intentionally different. Codex uses the OpenAI Codex CLI with fixed model/reasoning flags. Kay uses its native runtime and provider routing with Sidekick packaging, summaries, and audit metadata around it.

SKILL.md — The Instruction Set

Canonical workflows live under skills/: kay-delegate, kay-stop, codex-delegate, and codex-stop. Generated host bundles under agents/claude/ and agents/codex/ are rendered from those canonical skills.

The canonical skills cover:

  • Runtime readiness — per-session health check during explicit delegation startup
  • Activation — session markers, active-sidekick, and hook enforcement
  • Delegation Protocol — child runtime command and managed flags
  • Deactivation — clean shutdown
  • Host Verification — evidence, taxonomy, and relaunch loop
  • Runtime Notes — Kay provider selectors and Codex CLI flags

Failure Detection

The host verifies every sidekick result. It treats sidekick success output as a claim to audit, not proof. Failures are classified with the current taxonomy:

  • MISSED_REQUIREMENT, MISUNDERSTOOD_TASK, or TRIAL_INCOMPLETE when the sidekick did not complete the request.
  • INTEGRATION_ERROR, REGRESSION, WRONG_LOGIC, WRONG_FILE, or SYNTAX_ERROR when the repository state is wrong.
  • UNVERIFIED_ASSUMPTION or KNOWLEDGE_GAP when the sidekick guessed.
  • API_FAILURE or EXECUTION_ERROR_EXTERNAL when provider or environment failures block completion.

If any failure is found, the host relaunches or guides the active sidekick with focused correction context, then verifies again.
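As a rough illustration, the taxonomy above can be held in a lookup table so a host can group a failure code before choosing correction context. The failure codes come from the list above; the group labels (`incomplete`, `wrong_state`, `guessed`, `external`) and the function name `failure_group` are hypothetical names invented for this sketch and are not part of Sidekick.

```python
# Hypothetical grouping of the failure codes listed above.
# Only the codes come from the taxonomy; the group labels are illustrative.
FAILURE_GROUPS = {
    "incomplete": {"MISSED_REQUIREMENT", "MISUNDERSTOOD_TASK", "TRIAL_INCOMPLETE"},
    "wrong_state": {
        "INTEGRATION_ERROR",
        "REGRESSION",
        "WRONG_LOGIC",
        "WRONG_FILE",
        "SYNTAX_ERROR",
    },
    "guessed": {"UNVERIFIED_ASSUMPTION", "KNOWLEDGE_GAP"},
    "external": {"API_FAILURE", "EXECUTION_ERROR_EXTERNAL"},
}


def failure_group(code: str) -> str:
    """Return the illustrative group label for a taxonomy failure code."""
    for group, codes in FAILURE_GROUPS.items():
        if code in codes:
            return group
    raise ValueError(f"unknown failure code: {code}")
```

Rejecting unknown codes with an error, rather than mapping them to a default group, keeps a typo from silently passing as a classified failure.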

The Verification Loop

The recovery loop is evidence-based and repeats until no taxonomy failure remains.

  • Level 1, Inspect: The host compares the diff and final state against the original prompt, success criteria, and surrounding repository behavior.
  • Level 2, Verify: The host runs the smallest meaningful commands: tests, lint, build, type checks, or targeted runtime checks.
  • Level 3, Relaunch: If verification fails, the host relaunches the active sidekick with the failure code, observed evidence, relevant files, and exact success criteria.

Completion is host-owned. The loop ends only when host evidence supports the result. A sidekick STATUS: SUCCESS is not enough by itself.
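The three levels can be sketched as a small control loop. This is a minimal sketch, not Sidekick's implementation: `Finding`, `run_verification_loop`, and the three callables are hypothetical names, and the `max_rounds` cap is an assumption added so the example always terminates (the text above only says the loop repeats until no taxonomy failure remains).

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Finding:
    code: str      # taxonomy failure code, e.g. "REGRESSION"
    evidence: str  # what the host actually observed


def run_verification_loop(
    inspect_state: Callable[[], Optional[Finding]],  # Level 1: Inspect
    verify: Callable[[], Optional[Finding]],         # Level 2: Verify
    relaunch: Callable[[Finding], None],             # Level 3: Relaunch
    max_rounds: int = 5,                             # assumption: safety cap
) -> int:
    """Repeat inspect/verify/relaunch; return the number of relaunches used."""
    relaunches = 0
    while True:
        finding = inspect_state()
        if finding is None:
            finding = verify()
        if finding is None:
            # Host evidence supports the result, so the loop may end.
            return relaunches
        if relaunches >= max_rounds:
            raise RuntimeError(f"still failing with {finding.code}")
        relaunch(finding)
        relaunches += 1
```

Note that nothing in the loop reads the sidekick's own status output. Only the host-side `inspect_state` and `verify` callables can end it, which mirrors the host-owned completion rule.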

Hook Boundary

Sidekick's hooks stay dormant until a sidekick is active for the current session. While active, the hook layer denies direct implementation edits and routes supported runtime commands through bounded, redacted progress surfaces.

Hook behavior by mode:

  • Kay: Routes implementation through kay exec, records .kay/conversations.idx, and surfaces Kay summaries.
  • Codex: Routes implementation through codex exec, injects gpt-5.4-mini and xhigh, records .codex/conversations.idx, and surfaces Codex summaries.
  • Direct host mode: No sidekick marker is active, so normal host behavior resumes.

The shared selector lives at ~/.sidekick/sessions/<session>/active-sidekick and contains either kay or codex. Only one sidekick should enforce within a host session.
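To make the selector concrete, here is a minimal sketch of reading and writing that file. The path layout and the two allowed values come from the text above; the function names, the `root` override (added so the sketch can run against a scratch directory), and the choice to refuse a conflicting activation rather than switch are assumptions, not documented Sidekick behavior.

```python
from pathlib import Path
from typing import Optional

VALID_SIDEKICKS = ("kay", "codex")


def selector_path(session: str, root: Optional[Path] = None) -> Path:
    """Build <root>/sessions/<session>/active-sidekick (root defaults to ~/.sidekick)."""
    base = root if root is not None else Path.home() / ".sidekick"
    return base / "sessions" / session / "active-sidekick"


def read_active_sidekick(session: str, root: Optional[Path] = None) -> Optional[str]:
    """Return "kay" or "codex", or None when no valid selector exists."""
    try:
        value = selector_path(session, root).read_text().strip()
    except FileNotFoundError:
        return None  # no marker: direct host mode
    return value if value in VALID_SIDEKICKS else None


def activate(session: str, sidekick: str, root: Optional[Path] = None) -> None:
    """Write the selector; refusing a conflicting sidekick is an assumption."""
    if sidekick not in VALID_SIDEKICKS:
        raise ValueError(f"unknown sidekick: {sidekick}")
    current = read_active_sidekick(session, root)
    if current is not None and current != sidekick:
        raise RuntimeError(f"{current} is already active for this session")
    path = selector_path(session, root)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(sidekick + "\n")
```

Treating a missing or unrecognized selector as `None` matches the direct host mode case, where no sidekick marker is active.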

AGENTS.md

AGENTS.md remains the project instruction surface for hosts and sidekicks. It should describe supported sidekicks, project conventions, tests, integrity workflow, and release discipline.

Generated surfaces

Canonical workflows live under skills/. Generated host bundles under agents/claude/ and agents/codex/ are rendered by bash scripts/sync-host-surfaces.sh. Keep those surfaces aligned when canonical skills change.

Runtime Flags

Codex mode uses the local OpenAI Codex CLI with these Sidekick-managed flags:

Codex child runtime:

    codex exec -m gpt-5.4-mini -c model_reasoning_effort=xhigh --sandbox workspace-write --ask-for-approval never
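As an illustration of how a wrapper might assemble that command, the sketch below builds the argument list without running anything. The flags are copied from the command above; the helper name `build_codex_command` and passing the task prompt as the final positional argument are assumptions for this example, not a description of Sidekick's hook code.

```python
import shlex
from typing import List

# Sidekick-managed flags, copied from the command shown above.
MANAGED_FLAGS = [
    "-m", "gpt-5.4-mini",
    "-c", "model_reasoning_effort=xhigh",
    "--sandbox", "workspace-write",
    "--ask-for-approval", "never",
]


def build_codex_command(prompt: str) -> List[str]:
    """Return an argv list; appending the prompt last is an assumption."""
    return ["codex", "exec", *MANAGED_FLAGS, prompt]


if __name__ == "__main__":
    # Print a shell-quoted form for inspection; nothing is executed.
    print(shlex.join(build_codex_command("Fix the failing unit test")))
```

Building an argv list instead of a single shell string means the prompt needs no manual quoting if the list is later passed to `subprocess.run`.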

Provider Configuration

Sidekick keeps provider ownership with the active runtime. Kay uses its Kay/OpenCode Go path by default and supports xiaomi, ocg, and SIDEKICK_KAY_PROVIDER selectors. Codex sidekick mode uses the local OpenAI Codex CLI and does not use Kay provider aliases.

Kay OpenCode Go config

For Kay, use Kay's native login flow when credentials are missing:

Kay OpenCode Go login:

    kay login --provider opencode-go --with-api-key

Provider selectors are sidekick-specific. Use Kay selectors only with Kay mode. Use Codex mode when you specifically want the OpenAI Codex CLI child runtime.