Core Concepts

The building blocks of Forge delegation — how SKILL.md works, what the fallback ladder does, and how AGENTS.md keeps getting smarter over time.

Forge Delegation Mode

When you invoke /forge, Claude activates Forge delegation mode. In this mode, Claude does not write code directly. Instead, it acts as a task orchestrator — composing structured prompts, submitting them to Forge, monitoring output, and handling failures.

The key insight is that Forge (ForgeCode) is a specialized coding agent that operates autonomously. Claude's role is to communicate the task clearly, not to do the implementation itself.

Delegation persists until deactivated. Once /forge is active, all coding tasks are delegated to Forge. To return to Claude doing implementation directly, deactivate with /forge:deactivate.

SKILL.md — The Instruction Set

All Forge delegation behavior is defined in skills/forge/SKILL.md. This file is loaded when you invoke /forge and contains every rule Claude follows: how to write task prompts, when to escalate failures, how to update AGENTS.md, and what token limits to observe.

SKILL.md has these sections, added across four implementation phases:

  • Activation — health check, bootstrap config, session state
  • Delegation Protocol — 5-field task prompt structure
  • Deactivation — clean shutdown
  • Failure Detection — three signal types
  • Fallback Ladder — L1/L2/L3 escalation
  • Skill Injection — mapping table and selector rules
  • AGENTS.md Mentoring Loop — extraction and three-tier write
  • Token Optimization — budget constraints and .forge.toml compaction

Failure Detection

Claude continuously monitors Forge's output for three failure signals:

  • Error signal — Output contains Error:, Failed:, fatal:, or a non-zero exit code
  • Wrong output — Forge's output does not satisfy the SUCCESS CRITERIA on retry
  • Stall — Forge asks a clarifying question without making any progress

Any of these signals triggers the fallback ladder.
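
The three signals can be sketched as a single classifier. This is a minimal illustration, not the logic in SKILL.md: the function name, the `criteria_met` and `made_progress` flags, and the "last line ends with a question mark" stall heuristic are all assumptions made for the example. Only the marker strings and the non-zero exit code come from the list above.

```python
from typing import Optional

# Marker strings named in the list above; plain substring matching is an assumption.
ERROR_MARKERS = ("Error:", "Failed:", "fatal:")


def detect_failure(output: str, exit_code: int, criteria_met: bool,
                   made_progress: bool) -> Optional[str]:
    """Return 'error', 'stall', 'wrong_output', or None when no signal fires."""
    # Error signal: a known marker in the output or a non-zero exit code.
    if exit_code != 0 or any(marker in output for marker in ERROR_MARKERS):
        return "error"
    # Stall (hypothetical heuristic): the last non-empty line is a question
    # and Forge changed nothing.
    lines = [ln.strip() for ln in output.splitlines() if ln.strip()]
    if lines and lines[-1].endswith("?") and not made_progress:
        return "stall"
    # Wrong output: Forge finished but the SUCCESS CRITERIA are not satisfied.
    if not criteria_met:
        return "wrong_output"
    return None
```

Any non-None result would hand control to the fallback ladder.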

The Fallback Ladder

The fallback ladder is a three-level automatic recovery system. When Forge fails, Claude escalates through levels rather than immediately taking over or giving up.

Level 1: Guide (Automatic Reframe)

Claude rewrites the task prompt with a diagnosis of what went wrong, a tighter DESIRED STATE, and specific code references. No user input needed. One retry only: if L1 fails, escalate to L2.

Level 2: Handhold (Subtask Decomposition)

Claude decomposes the original task into atomic subtasks (≤200 tokens each) and submits them to Forge sequentially with full 5-field prompts. Maximum 3 attempts before escalating to L3.

Level 3: Take Over (Direct Action + DEBRIEF)

The delegation restriction is lifted. Claude implements the task directly, then produces a structured DEBRIEF (TASK / FORGE_FAILURE / LEARNED / AGENTS_UPDATE) to capture what went wrong and update AGENTS.md.

L3 takeovers are learning events. The DEBRIEF from every L3 takeover feeds the AGENTS.md update, so Forge gets better instructions for similar tasks next time.
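
The escalation order can be sketched as a small driver loop. This is an illustration under stated assumptions: `attempt` is a hypothetical callback standing in for "submit to Forge and check the result", the three L2 attempts are counted for the task as a whole, and the DEBRIEF layout (one `FIELD: value` line per field) is invented for the example. Only the level order, the retry limits, and the four DEBRIEF field names come from the text.

```python
from typing import Callable, List

MAX_L2_ATTEMPTS = 3  # from the text: at most 3 attempts at Level 2


def run_with_fallback(attempt: Callable[[str, int], bool]) -> List[str]:
    """Drive one task through the ladder and return the steps that were tried.

    `attempt(level, n)` is a hypothetical callback that submits the task at
    the given level and returns True on success.
    """
    trail = ["initial"]
    if attempt("initial", 1):
        return trail
    trail.append("L1")
    if attempt("L1", 1):  # one reframed retry only
        return trail
    for n in range(1, MAX_L2_ATTEMPTS + 1):
        trail.append(f"L2#{n}")
        if attempt("L2", n):
            return trail
    trail.append("L3")  # Claude implements directly, then writes the DEBRIEF
    return trail


def format_debrief(task: str, forge_failure: str, learned: str,
                   agents_update: str) -> str:
    """Render the four DEBRIEF fields; the line layout is an assumption."""
    return "\n".join([
        f"TASK: {task}",
        f"FORGE_FAILURE: {forge_failure}",
        f"LEARNED: {learned}",
        f"AGENTS_UPDATE: {agents_update}",
    ])
```

Returning the trail rather than a bare success flag makes it easy to log how far a task had to escalate.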

Skill Injection

Forge has access to four bootstrap skills that Claude can inject into task prompts via the INJECTED SKILLS field. Claude selects skills based on the task type:

Skill             Task Type
testing-strategy  Writing or fixing tests, TDD tasks
code-review       Code quality, refactoring, review-driven changes
security          Auth, input validation, credential handling
quality-gates     Multi-phase delivery, release preparation

For general code changes and refactoring, inject code-review. Add quality-gates for multi-phase delivery tasks.

Claude enforces an injection budget: inject ≤2 skills per task unless the task is clearly multi-domain. Over-injection bloats the prompt and degrades Forge's focus.
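
One way to picture the selector and the injection budget is the sketch below. The keyword hints are hypothetical and far cruder than whatever rules SKILL.md actually contains; only the four skill names, the code-review default for general changes, and the limit of two skills come from this section.

```python
from typing import List

# Keyword hints are hypothetical; the skill names come from the table above.
SKILL_HINTS = {
    "testing-strategy": ("test", "tdd"),
    "security": ("auth", "validation", "credential"),
    "quality-gates": ("release", "multi-phase"),
    "code-review": ("refactor", "review", "quality"),
}
DEFAULT_SKILL = "code-review"  # general code changes fall back to code-review
MAX_SKILLS = 2                 # injection budget for single-domain tasks


def select_skills(task: str, multi_domain: bool = False) -> List[str]:
    """Pick skills whose hints appear in the task text, capped by the budget."""
    text = task.lower()
    matched = [skill for skill, hints in SKILL_HINTS.items()
               if any(hint in text for hint in hints)]
    if not matched:
        matched = [DEFAULT_SKILL]
    # The cap is lifted only when the task is clearly multi-domain.
    return matched if multi_domain else matched[:MAX_SKILLS]
```

The result would be written into the INJECTED SKILLS field of the task prompt.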

AGENTS.md Mentoring Loop

After every completed Forge task, Claude extracts learnings and writes them to AGENTS.md. This is the mechanism by which delegation gets smarter over time.

What Claude extracts

  • Corrections — mistakes Forge made that Claude had to fix
  • User preferences — expressed during the session
  • Project patterns — conventions Forge discovered in this codebase
  • Forge behavior observations — what Forge does well or poorly here

Three-tier write

Claude writes to three locations after each task:

  • ~/forge/AGENTS.md — global cross-project knowledge
  • ./AGENTS.md — project-specific instructions
  • docs/sessions/YYYY-MM-DD-session.md — per-session evolution log
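
The three destinations can be resolved with a few lines of path arithmetic. This is a sketch: the function name and parameters are hypothetical, and the paths simply mirror the list above.

```python
from datetime import date
from pathlib import Path
from typing import List


def write_targets(project_root: Path, home: Path, today: date) -> List[Path]:
    """Return the global, project, and session-log paths, in that order."""
    return [
        home / "forge" / "AGENTS.md",        # global cross-project knowledge
        project_root / "AGENTS.md",          # project-specific instructions
        # isoformat() yields YYYY-MM-DD, matching the session log naming
        project_root / "docs" / "sessions" / f"{today.isoformat()}-session.md",
    ]
```

For example, `write_targets(Path.cwd(), Path.home(), date.today())` gives the three files for the current project and day.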

Deduplication

Before every AGENTS.md write, Claude runs a two-phase check: first an exact substring match, then a semantic similarity check. If either matches an existing entry, the write is skipped entirely — no partial appends.
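
The two-phase check can be approximated as follows. Note the substitution: the second phase here uses `difflib.SequenceMatcher`, which measures character-level similarity, as a stand-in for a real semantic comparison. The 0.85 threshold, the whitespace and case normalization, and both function names are assumptions for the example.

```python
from difflib import SequenceMatcher
from typing import List

SIMILARITY_THRESHOLD = 0.85  # hypothetical cutoff, not taken from SKILL.md


def _normalize(text: str) -> str:
    # Collapse whitespace and case so trivial differences do not defeat the match.
    return " ".join(text.lower().split())


def is_duplicate(entry: str, existing: List[str]) -> bool:
    """Phase 1: exact substring match. Phase 2: similarity above the threshold."""
    candidate = _normalize(entry)
    normalized = [_normalize(line) for line in existing]
    if any(candidate in line for line in normalized):
        return True
    return any(
        SequenceMatcher(None, candidate, line).ratio() >= SIMILARITY_THRESHOLD
        for line in normalized
    )


def append_if_new(entry: str, existing: List[str]) -> bool:
    """Append the entry only when neither phase matches; never append partially."""
    if is_duplicate(entry, existing):
        return False
    existing.append(entry)
    return True
```

Skipping the write whenever either phase matches keeps AGENTS.md from accumulating near-identical entries across sessions.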

Token Optimization

Task prompts to Forge are capped at 2,000 tokens. Claude enforces this by:

  • Including only the 5 mandatory fields (OBJECTIVE, CONTEXT, DESIRED STATE, SUCCESS CRITERIA, INJECTED SKILLS)
  • Omitting conversation history and unrelated file contents
  • Including only files directly relevant to the task in CONTEXT
  • Keeping INJECTED SKILLS to ≤2 skills unless multi-domain
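
A prompt builder that enforces these rules might look like the sketch below. The field names and the 2,000-token cap come from this section; the `FIELD:` layout, the function names, and the four-characters-per-token estimate are assumptions, and a real tokenizer would give different counts.

```python
from typing import Dict

FIELDS = ["OBJECTIVE", "CONTEXT", "DESIRED STATE", "SUCCESS CRITERIA", "INJECTED SKILLS"]
MAX_PROMPT_TOKENS = 2000


def estimate_tokens(text: str) -> int:
    # Rough heuristic (about 4 characters per token); not a real tokenizer.
    return (len(text) + 3) // 4


def build_prompt(values: Dict[str, str]) -> str:
    """Assemble the 5 mandatory fields in order and enforce the token cap."""
    missing = [field for field in FIELDS if not values.get(field)]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    prompt = "\n\n".join(f"{field}:\n{values[field].strip()}" for field in FIELDS)
    if estimate_tokens(prompt) > MAX_PROMPT_TOKENS:
        raise ValueError("prompt exceeds the 2,000-token budget; trim CONTEXT")
    return prompt
```

Raising on an oversized prompt, rather than truncating silently, forces the caller to decide which context to drop.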

Context compaction in .forge.toml uses validated defaults:

.forge.toml
token_threshold = 80000   # trigger compaction at 80k tokens
eviction_window = 0.20    # evict oldest 20% of context
retention_window = 6      # keep last 6 exchanges always
max_tokens = 16384        # max output tokens per Forge call

Provider Configuration

Sidekick supports two providers for Forge. The provider and model are set in ~/forge/.forge.toml (global, not per-project). API credentials are stored separately in ~/forge/.credentials.json.

OpenRouter (recommended)

OpenRouter routes requests to the forge/forge-code model. This is the default and recommended provider — it has the broadest availability and is the model Sidekick is optimized for.

~/forge/.forge.toml (OpenRouter)
"$schema" = "https://forgecode.dev/schema.json"
max_tokens = 16384

[session]
provider_id = "open_router"
model_id = "qwen/qwen3-coder-plus"

~/forge/.credentials.json (OpenRouter API key)
{ "api_key": "sk-or-your-key-here" }

MiniMax Coding

MiniMax is an alternative provider with different rate limits and pricing. Use it if you have a MiniMax account or prefer its model characteristics. Switch by updating ~/forge/.forge.toml:

~/forge/.forge.toml (MiniMax Coding)
"$schema" = "https://forgecode.dev/schema.json"
max_tokens = 16384

[session]
provider_id = "minimax"
model_id = "MiniMax-M2.7"

Credentials are global, never per-project. ~/forge/.credentials.json and ~/forge/.forge.toml live in your home directory and apply to all projects. The project-root .forge.toml contains only compaction settings and is gitignored automatically.
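
A quick sanity check of the two global files could be sketched as below. The helper is hypothetical and not part of Sidekick or Forge: it takes the already-parsed [session] table as a dict and the credentials file as text, and it only knows the two provider_id values shown in this section.

```python
import json
from typing import Dict, List

KNOWN_PROVIDERS = {"open_router", "minimax"}  # the two provider_id values shown above


def check_setup(session: Dict[str, str], credentials_text: str) -> List[str]:
    """Return human-readable problems; an empty list means the setup looks usable."""
    problems = []
    provider = session.get("provider_id")
    if provider not in KNOWN_PROVIDERS:
        problems.append(f"unknown provider_id: {provider!r}")
    if not session.get("model_id"):
        problems.append("model_id is missing")
    try:
        credentials = json.loads(credentials_text)
    except json.JSONDecodeError:
        return problems + ["credentials file is not valid JSON"]
    if not isinstance(credentials, dict) or not credentials.get("api_key"):
        problems.append("api_key is missing")
    return problems
```

Collecting every problem in one pass, instead of stopping at the first, gives a complete picture when a fresh setup has several gaps.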