Market Intelligence · May 2026

Coding AI Harnesses

Comprehensive analysis of the terminal coding agent landscape — market share, architecture, tradeoffs, and what actually matters for engineering teams.

Developer Adoption (Jan 2026 · JetBrains AI Pulse Survey · n=10,000+)

Work Adoption (% using at work)

GitHub Copilot 29%
ChatGPT (chatbot) 28%
Cursor 18%
Claude Code 18%
JetBrains AI 11%
Gemini (chatbot) 8%
Google Antigravity 6%
Codex CLI 3%*
* Pre-desktop app launch. Codex app expected 10%+ post-launch.

Satisfaction vs. Adoption

The key tension: high satisfaction ≠ market share
Claude Code 84% satisfaction · 22% share
Cursor 78% satisfaction · 20% share
Codex CLI 74% satisfaction · est. 8% share
GitHub Copilot 52% satisfaction · 24% share
OpenCode Community tool · 120k+ stars
Key insight: Copilot's distribution (GitHub integration, enterprise deals) lets it dominate despite low satisfaction. Claude Code wins on quality but costs $100–200/mo for heavy use.

GitHub Stars (OSS Traction Proxy · Apr 2026)

OpenCode · 147k · ▲ 4.5× growth rate
Codex CLI (Rust) · 60k+ · fast growing
Pi (v0.70) · 41k · niche but loyal
Claude Code · closed source · proprietary
The Four You Asked About — Plus Context
Claude Code
Anthropic · Closed source · Launched May 2025
Terminal-native
Terminal-first agentic coding tool powered by Claude Opus/Sonnet. The architecture is a sophisticated while-loop surrounded by a 5-layer compaction pipeline, a 7-mode permission system, MCP integration, subagent delegation with worktree isolation, and CLAUDE.md as persistent memory. The source code leaked in March 2026 via left-in source maps, and the community found it's more complex than expected.
80.8% SWE-bench · 84% Satisfaction · 18% Work Adoption

Strengths

  • Highest SWE-bench score
  • 5-layer context compaction
  • Full computer use (browser)
  • Subagent + worktree isolation
  • CLAUDE.md persistent memory
  • MCP + hooks + skills
  • 4% of all public GitHub commits

Weaknesses

  • Closed source (no inspection)
  • $100–200/mo heavy use
  • April 2026 harness bugs
  • Behavior changed with updates
  • Vendor lock-in to Anthropic
  • No offline/local model support
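The layered compaction described above can be sketched in miniature: a chain of increasingly aggressive layers, applied only until the history fits a token budget. This is an illustrative sketch, not Claude Code's actual (closed-source) pipeline; only two toy layers are shown, and the message shape and 4-characters-per-token estimate are assumptions.

```typescript
type Message = { role: string; text: string };

// Crude token estimate: ~4 characters per token (an assumption).
const approxTokens = (msgs: Message[]): number =>
  msgs.reduce((n, m) => n + Math.ceil(m.text.length / 4), 0);

// Each layer shrinks history; later layers are more aggressive.
const layers: ((msgs: Message[]) => Message[])[] = [
  // Layer 1 ("snip"): truncate old raw tool output.
  (msgs) =>
    msgs.map((m) => (m.role === "tool" ? { ...m, text: m.text.slice(0, 200) } : m)),
  // Layer 2 ("summarize"): collapse the oldest half into one summary message.
  (msgs) => {
    const half = Math.floor(msgs.length / 2);
    const summary = { role: "system", text: `summary of ${half} messages` };
    return [summary, ...msgs.slice(half)];
  },
];

function compact(msgs: Message[], budget: number): Message[] {
  for (const layer of layers) {
    if (approxTokens(msgs) <= budget) break; // cheapest sufficient layer wins
    msgs = layer(msgs);
  }
  return msgs;
}
```

The point of the layering is cost: cheap lossless-ish trims run first, and expensive lossy summarization only kicks in when the budget demands it.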
Codex CLI
OpenAI · Open source (Rust) · Launched 2025
Cloud-native
OpenAI's coding agent platform: a Rust-based open-source CLI, cloud delegation app, IDE extensions, and ChatGPT-connected workflows. Unlike Claude Code's local-first design, Codex operates in sandboxed cloud environments and submits PRs asynchronously. Built on the Responses API (model-agnostic). The App Server exposes the same harness across all surfaces via WebSocket.
77.3% Terminal-Bench · #1 Apr 2026 rank · 240+ tok/s speed

Strengths

  • Open source Rust CLI
  • Cloud sandbox isolation
  • Async background PR generation
  • GPT-5.5 model quality (Apr)
  • 6 concurrent subagents
  • ZDR compliance support
  • App Server multi-surface

Weaknesses

  • Cloud sandbox = limited local access
  • Weaker computer use than CC
  • OpenAI model dependency
  • Sandboxed isolation is a tradeoff
  • Cost unpredictable at scale
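The throughput case for Codex's default of 6 concurrent subagents can be shown with a toy greedy scheduler. This is not Codex's scheduler; the worker model and the "minutes per background task" durations are made up for illustration.

```typescript
// Greedy list scheduling: each task goes to the worker that frees up
// first; return total wall-clock time (makespan).
function makespan(durations: number[], workers: number): number {
  const finish: number[] = new Array(workers).fill(0);
  for (const d of durations) {
    const next = finish.indexOf(Math.min(...finish)); // earliest-free worker
    finish[next] += d;
  }
  return Math.max(...finish);
}

const tasks = [4, 4, 4, 4, 4, 4]; // six ~4-minute PR-generation tasks
const serial = makespan(tasks, 1);   // one worker: 24 minutes
const parallel = makespan(tasks, 6); // 6 concurrent threads: 4 minutes
```

For independent background tasks like async PR generation, wall-clock time shrinks roughly linearly with worker count until tasks outnumber workers.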
OpenCode
SST (Anomaly) · MIT · 147k+ stars
Provider-agnostic
TypeScript/Bun monorepo with client-server architecture. A persistent backend server (HTTP/SSE via Hono) talks to TUI, desktop, web, and mobile clients. Uses Vercel AI SDK for unified provider access across 75+ LLMs. Six built-in agents (build, plan, explore, general, compaction, title) with LSP integration and SQLite session storage. GitHub Copilot partnership (Jan 2026) lets paid Copilot subscribers auth directly.
75+ LLM providers · 147k GitHub stars · $0 subscription

Strengths

  • True vendor independence
  • Persistent server across terminals
  • LSP integration for diagnostics
  • 6 specialized built-in agents
  • Full offline / local model support
  • Copilot auth integration
  • Fastest OSS star growth rate

Weaknesses

  • Heavy system prompt (10k+ tokens)
  • Self-managed, no enterprise SLAs
  • Context mgmt less graceful
  • Docs lagging behind features
  • No managed compute
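Vendor independence boils down to routing a model spec to a provider at runtime. A sketch in the spirit of OpenCode's unified provider layer; the registry contents, factory shape, and "provider/model" spec format are assumptions for illustration, not OpenCode's actual API.

```typescript
type ModelRef = { provider: string; model: string };
type ModelFactory = (model: string) => ModelRef;

// Registry of providers; a real harness would map these to SDK clients.
const registry = new Map<string, ModelFactory>([
  ["anthropic", (m) => ({ provider: "anthropic", model: m })],
  ["openai", (m) => ({ provider: "openai", model: m })],
  ["ollama", (m) => ({ provider: "ollama", model: m })], // local models
]);

function resolveModel(spec: string): ModelRef {
  const slash = spec.indexOf("/");
  if (slash < 0) throw new Error(`expected "provider/model", got "${spec}"`);
  const factory = registry.get(spec.slice(0, slash));
  if (!factory) throw new Error(`unknown provider in "${spec}"`);
  // Model ids may themselves contain "/", so split only on the first one.
  return factory(spec.slice(slash + 1));
}
```

Because every call site goes through `resolveModel`, swapping a cloud provider for a local one is a one-line config change rather than a code change.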
Pi
Mario Zechner · MIT · 41k stars · v0.70
Minimalist harness
Deliberately minimal: four core tools (read, write, edit, bash), system prompt under 1,000 tokens. Built in reaction to Claude Code's increasing unpredictability. Pi is a TypeScript monorepo (pi-mono) with a strict DAG dependency graph — every package independently usable. Extensibility via TypeScript extensions, skills (AGENTS.md), and installable Pi packages. "Primitives, not features" is the design philosophy.
4 core tools · <1k system-prompt tokens · 15+ providers

Strengths

  • Minimal token overhead
  • Predictable, stable behavior
  • Full context control
  • Local-first (MLX, GGUF)
  • Self-documenting / inspectable
  • TypeScript extension system
  • Same model → better results vs. heavy harness

Weaknesses

  • No built-in subagents/plan mode
  • No permission gates by default
  • DIY everything beyond core 4
  • Smaller community than OpenCode
  • No managed free tier
  • Niche — requires skilled user
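Since Pi ships no permission gates, the developer writes their own as an extension. A hypothetical sketch of such a gate; the rule shape, deny-first policy, and example patterns are illustrative, not Pi's real extension API.

```typescript
type Verdict = "allow" | "deny" | "ask";
type Rule = { effect: "allow" | "deny"; pattern: RegExp };

function gate(command: string, rules: Rule[]): Verdict {
  // Deny-first: any matching deny rule wins outright.
  if (rules.some((r) => r.effect === "deny" && r.pattern.test(command))) return "deny";
  if (rules.some((r) => r.effect === "allow" && r.pattern.test(command))) return "allow";
  return "ask"; // unmatched commands escalate to the user
}

const rules: Rule[] = [
  { effect: "deny", pattern: /rm\s+-rf|git\s+push\s+--force/ },
  { effect: "allow", pattern: /^git\s+(status|diff|log)\b/ },
];
```

Twenty lines buys the safety property the Weaknesses list flags as missing, which is exactly the "primitives, not features" bet: the harness stays minimal, and policy lives in user code.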
Core Architectural Decisions Compared
Agent loop
  • Claude Code: While-loop; the model decides the next action. Intelligence lives in the model; the loop is "dumb".
  • Codex CLI: Same pattern, but Responses API-based with explicit thread lifecycle management.
  • OpenCode: SessionPrompt orchestrates the loop; 6 specialized agents swap in by task type.
  • Pi: Minimal agentic loop, 4 tools only. The model drives everything.

Context strategy
  • Claude Code: 5-layer compaction (snip → summarize → sub-agent offload). CLAUDE.md as persistent anchor. Deferred tool schemas via ToolSearch.
  • Codex CLI: Prompt caching (linear, not quadratic). Encrypted compaction. AGENTS.md hierarchy. Cache-miss-aware design.
  • OpenCode: HTTP/SSE server persists state. SQLite session storage. Compaction via a dedicated compaction agent. LSP feeds diagnostics.
  • Pi: No compaction built in. <1k-token system prompt. Custom compaction via extension; the developer owns context engineering.

Execution model
  • Claude Code: Local; files on your machine. Computer use (browser control). Subagents with worktree isolation.
  • Codex CLI: Sandboxed cloud container (clone of the repo): safe but separate from the live environment. Also a local CLI mode.
  • OpenCode: Client-server. The server runs headless; clients connect over HTTP+SSE, and sessions survive disconnects.
  • Pi: Local only; the process runs in your terminal. Extensions add sandboxing if desired.

Permission system
  • Claude Code: 7 modes plus an ML-based classifier. Tool-level, per-pattern, per-directory deny-first evaluation.
  • Codex CLI: Permission profiles (workspace-write), sandbox modes, approval policies, network rules.
  • OpenCode: Permissions per agent type: build agent (full access) vs. plan agent (read-only), switchable.
  • Pi: None by default. DIY via extension (a permission-gate.ts example is provided).

Memory / config
  • Claude Code: CLAUDE.md hierarchy (lazy-loaded). Auto memory. 6 memory layers at session start.
  • Codex CLI: AGENTS.md + AGENTS.override.md, config.toml, developer_instructions, thread persistence.
  • OpenCode: opencode.json + .opencode/ dir. Config merged from 8 sources in precedence order. Skills system.
  • Pi: AGENTS.md, ~/.pi/ user configs, Pi packages. Skills follow the Agent Skills standard.

Extensibility
  • Claude Code: MCP (3 transport modes), plugins, skills, hooks. Declarative (.md/.json); no code needed for basic extensions.
  • Codex CLI: MCP servers, plugin marketplace, skills, hooks. Multi-agent via the Agents SDK. App Server for custom surfaces.
  • OpenCode: MCP built in. Plugin system. 20+ built-in tools. Custom agents via opencode.json. oh-my-opencode (48k stars).
  • Pi: TypeScript extensions (fully typed). Pi packages (npm/GitHub). Skills. 4 execution modes: interactive, print, RPC, SDK.

Source / language
  • Claude Code: TypeScript, closed source. Source maps leaked Mar 2026.
  • Codex CLI: Rust, open source. LLM-agnostic via the Responses API.
  • OpenCode: TypeScript + Bun monorepo. MIT license. Vercel AI SDK.
  • Pi: TypeScript monorepo (pi-mono) with a DAG dependency graph. MIT license.

Multi-agent
  • Claude Code: Subagent delegation with worktree isolation. Agent Teams (Feb 2026). Sub-agents return summaries only.
  • Codex CLI: 6 concurrent threads by default. spawn_agents_on_csv for parallelism. Role-based agent config.
  • OpenCode: 6 built-in agents plus custom agents. Team/enterprise: shared server with cost controls.
  • Pi: Extension-based. oh-my-pi (a fork) adds Sisyphus orchestration, Prometheus planning, and Oracle debugging.
The "Around the Loop" Insight

All four agents share the same core: a while-loop that calls the model, runs tools, and repeats. The real architecture is what lives around that loop. Claude Code wraps it with 5-layer compaction, 7-mode permissions, and MCP. Pi strips all of that away deliberately. OpenCode externalizes the loop into a persistent server. Codex adds cloud sandboxing and async execution.
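That shared core is small enough to sketch end to end. Everything here is illustrative: the fake model, the tool names, and the strings stand in for a real LLM API and real tools.

```typescript
type ToolCall = { tool: "read" | "bash"; input: string };
type ModelReply = { done: boolean; call?: ToolCall; text?: string };

// Two toy tools; real harnesses ship read/write/edit/bash and more.
const tools: Record<ToolCall["tool"], (input: string) => string> = {
  read: (path) => `contents of ${path}`,
  bash: (cmd) => `ran: ${cmd}`,
};

// Stubbed model: requests two tool calls, then finishes.
function fakeModel(history: string[]): ModelReply {
  if (history.length === 0) return { done: false, call: { tool: "read", input: "main.ts" } };
  if (history.length === 1) return { done: false, call: { tool: "bash", input: "npm test" } };
  return { done: true, text: "Loop finished after 2 tool calls." };
}

function runAgent(): { answer: string; steps: string[] } {
  const history: string[] = [];
  while (true) {
    const reply = fakeModel(history);
    if (reply.done) return { answer: reply.text ?? "", steps: history };
    // The loop itself is "dumb": execute whatever tool the model asked
    // for and append the result to the transcript.
    const { tool, input } = reply.call!;
    history.push(tools[tool](input));
  }
}
```

Everything the four tools differ on (compaction, permissions, sandboxing, persistence) wraps around this loop rather than replacing it.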

Context Window is the Bottleneck

Cognition measured that agents spend 60% of their time on search — building context before writing code. Claude Code's deferred tool schemas, sub-agent summary-only returns, and CLAUDE.md lazy-loading are all direct responses to this. Pi's <1k token system prompt is the opposite bet: waste less context on harness overhead, give more to the model.

Benchmark Data (Q1–Q2 2026)

SWE-bench Verified (% resolved)

Claude Code · 80.8%
Antigravity · 76.2%
Codex CLI · ~74%
Cursor · ~72%
OpenCode · model-dependent
Note: Scaffolding matters. Same model in different harnesses scored 17 problems apart on 731 total issues.

Terminal-Bench 2.0 (autonomous tasks)

Codex CLI · 77.3%
Claude Code · ~72%
OpenCode · model-dependent
Pi · model-dependent
Codex leads on background autonomous execution; CC leads on reasoning-heavy tasks.

Token Speed (tokens/sec)

Codex CLI · 240+
Pi (local) · model-dependent
Claude Code · Opus slower
OpenCode · provider-dependent

System Prompt Overhead (tokens)

Smaller = more context for your actual code
Pi · <1k ✓
Codex CLI · ~3–5k
Claude Code · ~7–9k
OpenCode · 10k+
OpenCode's rich built-ins cost context. Pi's philosophy inverts this tradeoff entirely.
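The tradeoff is simple arithmetic. A worked example assuming a 200k-token context window with 8k tokens reserved for model output (both figures are illustrative assumptions); the per-harness overheads are the approximate numbers listed above.

```typescript
// Tokens left for the user's actual code and conversation.
function usableContext(windowTokens: number, harnessOverhead: number, reservedOutput: number): number {
  return windowTokens - harnessOverhead - reservedOutput;
}

const WINDOW = 200_000;  // assumed context window
const RESERVED = 8_000;  // assumed output reservation
const piBudget = usableContext(WINDOW, 1_000, RESERVED);        // Pi: 191k for code
const opencodeBudget = usableContext(WINDOW, 10_000, RESERVED); // OpenCode: 182k for code
```

Per request the gap is only ~5%, but the overhead is paid on every model call, so it compounds across a long agentic session.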
The Benchmark Caveat

The gap between top models has narrowed to a few percentage points. Raw benchmark differences matter less than architecture and workflow fit in 2026. The same model running in different harnesses can score 17 problems apart on 731 total issues — scaffolding quality is now a primary performance variable.

Full Market Landscape (May 2026)
Tool | Category | Model | Pricing | Share/Traction | Best For
GitHub Copilot | IDE + Agent | GPT-5 + Claude | $10–19/mo | 29% work | Enterprise, IDE-native, distribution
Claude Code | Terminal agent | Claude Opus 4.7 | $20/mo + API | 18% work · 84% sat. | Deep reasoning, architecture, hard bugs
Cursor | IDE | Multi-model | $20/mo | 18% work · 360k paying | IDE-native daily coding
Codex CLI | Terminal + Cloud | GPT-5.5 | ChatGPT sub | ~8% · fast growing | Async background tasks, PR generation
OpenCode | Terminal OSS | 75+ providers | $0 sub + API | 147k stars · 6.5M devs/mo | Privacy, compliance, provider flexibility
Pi | Minimal harness | 15+ providers | $0 + API | 41k stars · niche | Context engineering, local models, control
Windsurf | IDE | Multi-model | $15/mo | Free tier leader | Best value, unlimited autocomplete
Aider | Terminal OSS | Multi-model | $0 + API | Git-native | Git-integrated workflows, BYOM
JetBrains Junie | IDE agent | Multi-model | Bundled | 5% work | IntelliJ/PyCharm/GoLand native
Gemini CLI | Terminal | Gemini 3.1 Pro | Free tier | Fastest free | 1M token context, free frontier access
Devin | Full agent | Proprietary | $20+/mo | 67% PR merge rate | Fully autonomous defined-scope tasks
Augment | Code review | GPT-5.2 | Enterprise | Best AI code review | 100k+ file codebases, code review

How to Choose in 2026

• You want the best raw quality and complex refactors → Claude Code with Opus. Budget $150–200/mo for heavy use; accept vendor lock-in.
• You want async, parallel, cloud PR generation → Codex CLI + App. GPT-5.5 is now competitive, and sandbox isolation is a feature here.
• You need privacy, compliance, or offline use → OpenCode. Run local models via Ollama; 75+ providers means you never get locked in.
• You want to own your context engineering → Pi. Sub-1k prompt; the same model produces better output with less harness noise. Not for beginners.
• You do daily IDE coding on a mainstream team → Cursor or Copilot. Proven distribution, good-enough quality, and IDE ergonomics matter.
• The power-user stack (2026 consensus) → Claude Code for hard problems + Cursor for daily coding + OpenCode or Pi for model flexibility.
MCP: The Emerging Standard

Every major tool shipped multi-agent capabilities in the same 2-week window in February 2026 — it's now table stakes. The real next battleground is MCP interoperability: Augment exposes its context engine as an MCP server usable from Claude Code, Codex, or any MCP-compatible agent. The tools are converging toward a layer-cake: inference (model) → context (MCP tools) → harness (agent loop) → surface (IDE/terminal/cloud).