Offload work from Claude by giving it top-ranked coding-agent Sidekicks like ForgeCode (#2, #3, and #6 on Terminal-Bench 2.0) and highly cost-effective models like Qwen3 Coder Plus. Claude delegates all implementation work while you interact only with Claude.
You talk to Claude. Claude plans the work and delegates execution to the best agent for the job. Results come back to Claude, reviewed and summarized for you.
Claude delegates all file writes, edits, test runs, and commits. You get faster results without Claude burning context on execution.
Each Sidekick installs itself on the first Claude session after the plugin is enabled. No manual setup required.
AGENTS.md keeps each Sidekick fully aware of project goals, stack, and conventions on every invocation.
A recovery playbook covering 12 failure types is built into Claude's orchestration skill. 402s, 429s, and PATH issues are handled automatically.
Each Sidekick is a specialized AI coding agent. More are coming.
A Rust-powered terminal AI coding agent. Handles all file writes, test runs, and git commits with a 1M-context window and vision support.
The #1 ranked agent on Terminal-Bench 2.0 (82.9%). A ticket-to-PR pipeline agent — takes a GitHub issue and produces a full pull request autonomously.
An open-source, extensible terminal coding assistant. Provider-agnostic — works with any LLM. Lightweight, fast, and ideal for long coding sessions.
Terminal-Bench 2.0 is ICLR 2026's benchmark for terminal AI agents — 89 Docker-containerized real-world tasks. ForgeCode holds 3 spots in the top 6.
| Rank | Agent | Model | Score | Uncertainty |
|---|---|---|---|---|
| #1 | Pilot (Sidekick coming soon) | Claude Opus 4.6 | 82.9% | ± 1.4 |
| #2 | ForgeCode ← Sidekick | GPT-5.4 | 81.8% | ± 2.0 |
| #3 | ForgeCode ← Sidekick | Claude Opus 4.6 | 81.8% | ± 1.7 |
| #4 | TongAgents | Gemini 3.1 Pro | 80.2% | ± 2.6 |
| #5 | SageAgent | GPT-5.3-Codex | 78.4% | ± 2.2 |
| #6 | ForgeCode ← Sidekick | Gemini 3.1 Pro | 78.4% | ± 1.8 |
| · · · | ranks 7 – 27 | | | |
| #28 | Codex CLI | GPT-5.2 | 62.9% | ± 3.0 |
| #40 | Claude Code | Claude Opus 4.6 | 58.0% | ± 2.9 |
| #52 | OpenCode (Sidekick coming soon) | Claude Opus 4.5 | 51.7% | — |
Source: tbench.ai · Terminal-Bench 2.0 (ICLR 2026)
Sidekick's dual value proposition: top-ranked coding agents powered by models that rival frontier AI — at up to 36× lower cost than Claude Sonnet 4.6.
| Model | Coding Benchmark | Input / MTok | Output / MTok | Context | vs Sonnet 4.6 |
|---|---|---|---|---|---|
| Claude Sonnet 4.6 (Anthropic · API baseline) | 79.6% SWE-bench; 59.1% Terminal-Bench 2.0 | $3.00 | $15.00 | 1M | baseline |
| Qwen3 Coder Plus (Forge default; Alibaba · via OpenRouter) | 78.8% SWE-bench; 61.6% Terminal-Bench 2.0 (#1!) | $0.33 | $1.95 | 1M | ~8× cheaper |
| Gemma 4 31B (coming soon; Google DeepMind · via OpenRouter) | 80.0% LiveCodeBench v6; 86.4% τ2-bench (agentic tools) | $0.14 | $0.40 | 262K | ~36× cheaper |
Blended effective cost at 1:3 input:output ratio · Pricing via OpenRouter
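To see where the "~8×" and "~36×" figures come from, here is a minimal sketch of the blended-cost arithmetic. It assumes that "1:3 input:output" means one part input tokens to three parts output tokens, weighted by the per-MTok prices in the table; the function names `blended_cost` and `savings_vs_baseline` are illustrative and not part of Sidekick or OpenRouter.

```python
# Prices in USD per million tokens (input, output), copied from the table above.
PRICES = {
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Qwen3 Coder Plus": (0.33, 1.95),
    "Gemma 4 31B": (0.14, 0.40),
}


def blended_cost(input_price, output_price, input_parts=1, output_parts=3):
    """Weighted average price per MTok for a given input:output token mix.

    Assumption: the page's "1:3 input:output ratio" is a 1-to-3 token weighting.
    """
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


def savings_vs_baseline(prices, baseline="Claude Sonnet 4.6"):
    """Return how many times cheaper each model is than the baseline."""
    base = blended_cost(*prices[baseline])
    return {name: base / blended_cost(*pair) for name, pair in prices.items()}


if __name__ == "__main__":
    for name, ratio in savings_vs_baseline(PRICES).items():
        cost = blended_cost(*PRICES[name])
        print(f"{name}: ${cost:.3f}/MTok blended, {ratio:.1f}x vs baseline")
```

Under that assumption the baseline blends to $12.00 per MTok, Qwen3 Coder Plus to about $1.55, and Gemma 4 31B to about $0.34, which rounds to the 8× and 36× ratios quoted in the table. A workload with a different input:output mix will see different ratios, so pass your own `input_parts` and `output_parts` to check.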
Open Claude Code, go to Settings → Plugins, search for Sidekick and click Install. Or paste this into your settings file — Claude Code auto-updates on every session.
Restart Claude Code — Sidekick installs automatically on the next session start.
After installation, Claude walks you through everything.
On first session start after enabling the plugin, ForgeCode is downloaded and added to your PATH automatically.
Claude guides you to sign up at openrouter.ai and add a small credit. $5 goes a long way with Qwen3 Coder Plus.
From this point on, Claude acts as orchestrator. All file writes, tests, and commits go to Forge. You stay in Claude.
Add Sidekick to your Claude settings and start shipping faster today. Free to install — pay only for the OpenRouter API calls Forge makes.