Claude Code vs Codex CLI vs Gemini CLI vs OpenCode: The Real Differences After Convergence

Converging AI Coding CLIs: Shared primitives, divergent strengths, and the future of multi-agent workflows.

Rick Hightower 28 min read

Originally published on Medium.

AI Agents arms race: Claude Code, OpenCode, Gemini CLI, and Codex CLI converging on subagent architecture in the April 2026 developer tooling convergence

Four AI coding CLIs have finally converged on the same sub-agent primitives -- discover how this reshapes planning, parallel work, and model-agnostic automation.

Summary: The article examines how four major AI coding command-line interfaces (Claude Code, OpenCode, Codex CLI, and Gemini CLI) have converged on a common set of primitives such as subagents, plan mode, ask-user tools, parallel execution, sandboxing, memory, and MCP integration. These capabilities now exist across all four tools despite differing launch dates and vendor branding. It then compares their genuine differentiators: model lock-in versus model-agnostic design, agent definition formats, background scheduling, approval-gate implementations, and manager context window size. It emphasizes the shared skill-file format that enables portable workflows, and offers guidance on selecting the appropriate CLI for scenarios like interactive pair programming, enterprise refactoring, bulk automation, scheduled routines, and multi-vendor cost-sensitive tasks.


Part 1: The Convergence That Already Happened

A Story the Marketing Got Backwards

Read the April 2026 launch posts for Gemini CLI v0.38.1 and Codex CLI v0.107 and you would think a new architectural idea had just arrived in developer tooling. Subagents with isolated context windows. Plan modes. Approval gates. Parallel workers. Memory banks. Sandboxes.

That is not the story. The story is that all of these primitives already existed, in production, across multiple coding CLIs. They just spent the last six weeks getting marketed.

Claude Code shipped subagents with isolated context windows in July 2025 [1]. Plan mode has been a recommended Claude Code workflow for nearly as long. Custom Markdown agent definitions in ~/.claude/agents/ were the format the rest of the industry copied. Memory and skills have been part of Claude Code for many months. Sandbox execution has been part of both Claude Code and Codex for a long time.

OpenCode shipped its Plan agent and Build agent as primary built-ins in mid 2025, with a tab-to-swap workflow that lets you flip between read-only planning and write-access execution in a single session (like Claude Code). OpenCode's pitch is that it is model-agnostic: it runs against GPT, Claude, Gemini, or any model accessible through your GitHub Copilot login, with the same agent definitions, the same skill files, the same workflow.

Codex CLI reached General Availability for thread-forking subagents on March 16, 2026 [3]. Codex emphasizes parallel bulk work and adversarial reviewer patterns, but neither is exclusive to Codex; the explorer/worker/reviewer trio is a prompt pattern that can translate to any CLI that supports custom agents.

Gemini CLI followed with v0.38.1 on April 14 to 16, 2026 [4], adding subagents on top of a Plan Mode that had shipped a month earlier. The launch attracted disproportionate attention because Gemini packaged the convergent feature set into one cohesive marketing moment. The features themselves were already standard.

This is not a critique of Gemini, or of Google's marketing. It is a correction to the framing many of us reached for when these launches landed back to back. The story is not "three vendors took different roads to subagents." The story is that four major coding CLIs converged on the same primitives, and the differences that remain are smaller than the press releases imply.

This article walks through what they all now share, where the four tools genuinely differ, and what the convergence means for teams choosing where to invest their workflow time.


What Every Major Coding CLI Now Has

By the end of April 2026, all four major coding CLIs ship the following primitives. Where each one originated and how mature each is varies, but the capability is universal.

The honest read of that table: nine of the nine capabilities are present in every tool. The variation is in dates of arrival, naming, default-on behavior, and packaging. Treating any one of them as a vendor differentiator is misleading.

AI Agents context isolation: a manager session stays clean while a subagent does file exploration in an isolated scratch-pad context window. This is the convergent core primitive shared by all four CLIs.
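
To make the isolation idea concrete, here is a toy sketch of the pattern. It is not any CLI's actual implementation: the `Context` class, the function names, and the "find TODOs" task are all made up for illustration, and character counts stand in for tokens. The point it shows is that the noisy file reads land in the subagent's scratch context while the manager only ever sees a one-line summary.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """A stand-in for one context window: just a list of messages."""
    messages: list[str] = field(default_factory=list)

    def add(self, message: str) -> None:
        self.messages.append(message)

    def size(self) -> int:
        # Character count as a crude proxy for tokens.
        return sum(len(m) for m in self.messages)


def run_subagent(task: str, files: dict[str, str]) -> str:
    """Explore files in a private context and return only a short summary."""
    scratch = Context()  # isolated: never shared with the manager
    scratch.add(f"task: {task}")
    hits = []
    for path, content in files.items():
        scratch.add(content)  # noisy file reads stay in the scratch context
        if "TODO" in content:
            hits.append(path)
    return f"{len(hits)} file(s) contain TODO: {', '.join(sorted(hits))}"


def manager_delegate(manager: Context, task: str, files: dict[str, str]) -> str:
    """The manager records the request and the summary, nothing else."""
    manager.add(f"delegated: {task}")
    summary = run_subagent(task, files)
    manager.add(f"result: {summary}")
    return summary
```

However large the files are, the manager's context grows by two short messages per delegation, which is the property all four tools are selling.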

What this means in practice: when a launch post claims Tool X "introduces" or "adds" a primitive, the right question is no longer "is this new?" but "how does this tool's implementation compare to the equivalent in the others?" That is a much narrower question, and it is the one this article tries to answer fairly.


Why the Convergence Happened in This Window

If the primitives existed in Claude Code from mid-2025, why did the rest of the industry catch up only in early 2026? Four forces converged.

MCP ecosystem maturity. The Model Context Protocol passed the threshold from "interesting standard" to "table-stakes integration layer" between late 2025 and early 2026. By late 2025, the MCP ecosystem had grown substantially across multiple registries and directories (with counts varying widely by source and methodology)[13], and OpenAI, Anthropic, and Google had all standardized around it. Once tool integration became a solved-and-shared problem, the next layer of differentiation was orchestration. Subagents are the orchestration primitive that becomes worth shipping the moment tool integration is no longer the bottleneck.

Long context windows. Gemini 2.5 Pro's 1M-token main session window was a real differentiator for quite some time, because a window that large changes what the orchestrating agent can do (we are now at Gemini 3.2). Claude's 200K window was larger than it sounds in practice once context isolation moved the noisy reads into subagents.

That gap has since closed: Claude Code now has a 1M-token context window, as does Codex since the GPT-5.4 release. The point is that the manager-orchestrator pattern only pays off when the manager can hold enough context to route precisely, and 2025-vintage windows finally crossed that line. 1M tokens is now the standard. For what it's worth, the Gemini 3.1 Pro model supports 2M tokens, so Gemini CLI still has the largest context window, but the gap is 2x rather than 5x.

Enterprise audit-trail demand. Through 2025, enterprise adoption of AI coding CLIs hit a wall on one specific problem: when an agent makes fifty file changes across a refactor, who approved each step? The market priced in approval gates as a non-negotiable feature. Plan modes (read-only planning before code is written, with a human-editable plan file) and ask-user tools are both responses to that pressure. Once one tool shipped them as a default, the others followed; once the Plan/Build split shipped in OpenCode, the others made it more prominent.

Open-source pressure. OpenCode's willingness to work across LLM vendors and ship dual Plan/Build agents was a credible signal that the multi-agent pattern had broad developer demand. When a model-agnostic open-source tool ships a primitive, every closed-source vendor has to match it or explain why they have not.

The result is the convergent baseline above. Same primitives, same general shape, four different implementations of the same idea.


Part 2: The Four CLIs, Honestly Compared

Each of the four tools has a real personality and real strengths. The personalities are not "reactive vs. predictive vs. parallel"; that framing oversells features that are actually shared. The honest personalities are about ecosystem, model lock-in, file formats, and what each tool is best at defaulting to.


Claude Code: The Pioneer with the Mature Ecosystem

Claude Code shipped subagents in July 2025 and has had nine months to mature them[1]. That maturity shows up in three places that matter.

Established custom-agent patterns. Anyone who built a Claude Code workflow in 2025 wrote agents in .claude/agents/ as Markdown plus YAML frontmatter:

---
name: security-reviewer
description: Adversarial reviewer for security vulnerabilities and unsafe patterns
tools: Read, Glob, Grep
---
You are a security-focused code reviewer. Find vulnerabilities, check input validation, flag unsafe patterns. Do not make changes; report findings only.

That format is what the rest of the industry adopted. Gemini CLI's .gemini/agents/*.md uses the same structure with minor field renames. OpenCode's .opencode/agent/*.md is the same shape. Only Codex chose TOML instead of Markdown frontmatter, and even there the field set is similar enough that translation between formats is mechanical.

Recommended Plan workflow. Claude Code has had a Plan mode workflow for many months. Anthropic has been recommending it as the right pattern for non-trivial work: explore the codebase, build a plan, then execute. The plan-first pattern is not a Gemini CLI invention; it is the Claude Code recommendation that Gemini CLI later made into a default-on read-only state.

Ask-user is built in. Claude Code agents can pause and ask the user a question. The tool name is not literally ask_user, but the capability is the same: structured interrupt, formatted question, agent waits for the answer before proceeding. Calling this Gemini-only is a marketing artifact, not a capability gap.

Background routines. What Claude Code does have that the others mostly do not is durable, schedule-driven background routines. Claude Code Routines (research preview, April 2026[14]) lets you register an agent that runs on a cron schedule, on a GitHub event, or on an external trigger. None of the other three CLIs ship native scheduled-routine support at the same level of integration.

The honest critique: Claude Code is single-vendor by design. You run it against Anthropic models or you do not run it. For teams that want to evaluate Claude vs. GPT-5.4 vs. Gemini 3 head-to-head on the same task, that is a constraint. The mature ecosystem also means most existing patterns are written for Anthropic-style prompting, which does not always translate cleanly to other models.

A historical footnote on formats: Gemini CLI had TOML-based custom commands before it adopted Markdown plus YAML frontmatter for agents in v0.38.1. A Reddit thread from August 1, 2025 shows Gemini users writing .toml command files in the form prompt = """markdown""", which is structurally similar to Codex's TOML agents. People were using these to simulate Claude agents. I wrote an article on how GSD supports Gemini, OpenCode, and Codex; it covers these early, primordial Gemini agents.

Claude Code also has cron, managed agents, Agent Teams, tight GitHub integration, ultrareview and ultraplan modes, remote control, channels, and more. It is more expensive than the others, but the price is probably worth it. The downsides are that you run out of tokens faster and it seems to have more outages. That said, it is the one that I use the most.


OpenCode: The Model-Agnostic Standardizer

OpenCode is the tool the marketing posts mostly forgot. It deserves first-class treatment because it is doing something none of the other three are: making the same workflow run across every model family.

Plan agent and Build agent as built-ins. OpenCode shipped Plan and Build as its two primary agents in early 2026[2]. Plan is read-only and pushes file edits and shell commands to "ask" mode (every action requires confirmation, exactly like Gemini's Plan Mode). Build is the default with all tools enabled. You tab between them in a single session. The functional contract is identical to what Gemini CLI later made the default.

Multi-model. This is the differentiator the others cannot match. OpenCode runs against:

  • GPT family (via OpenAI or GitHub Copilot login)
  • Claude family (via Anthropic API)
  • Gemini family (via Google AI)
  • Any other model accessible through your GitHub Copilot subscription

The same agent definition, the same skill file, the same workflow runs against all of them. For teams that want to A/B test models on real tasks, or that need to switch models for cost or capability reasons, OpenCode is the only one of the four that does not lock you to a single vendor.

Shared skill format. OpenCode skills use the SKILL.md format (Markdown plus YAML frontmatter). The format travels well across tools, but discovery paths differ; a run_lint skill can be reused across CLIs once it is placed (or symlinked/copied) into each tool's native skills directory.

The honest critique: OpenCode is younger and thinner than Claude Code in raw feature surface. The model-agnostic architecture means every feature has to work against every supported model, which slows feature velocity. The community is smaller than Claude Code's. You will find fewer reference patterns and fewer prebuilt agent libraries.


Codex CLI: The Parallel Specialist

Codex CLI hit GA for thread-forking subagents on March 16, 2026[3]. The headline capability is parallel execution; the more interesting capability is the integrated approval-gate model.

Branching threads + /agent. Codex lets you branch the current session into independent sub-agent threads from within the conversation[15]. The /agent slash command acts as a thread navigator (think tab switcher), letting you inspect and switch between active threads without leaving any of them. The split between creating and navigating is the architectural decision that makes parallel management ergonomic. You can spawn a worker, switch back to the main thread, check on the worker later, redirect a different worker, and never lose the context of any of them.

Built-in roles. Codex ships explorer (read-only), worker (focused implementation), and default (general-purpose). These are not exclusive abilities; you can build similar roles as custom agents in any CLI that supports custom agents. The advantage is that they ship pre-tuned and you do not have to write them yourself.

Approval gates for background subagents. This is Codex's strongest enterprise primitive. When a background subagent (one running in a forked thread while you work on something else) tries to execute a command outside its sandbox policy, an approval popup surfaces in the terminal. You see which thread requested it, what it is trying to do, and you approve or deny[16]. The popup blocks the requesting thread, not your main work. Sandboxes are configurable per-agent (workspace-write, read-only, etc.), and managed organizations can enforce a requirements.toml that prevents agents from being run with approval_policy = "never" [16].
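
A team can mirror that rule in its own tooling, for example as a pre-commit check on agent configs. The sketch below is that kind of local lint, not how Codex itself enforces requirements.toml. It works on an already-parsed config dict, and the `sandbox_mode` key name and the allowed-mode set are my assumptions for illustration.

```python
# Sandbox modes this hypothetical team policy accepts.
ALLOWED_SANDBOX_MODES = {"read-only", "workspace-write"}


def policy_violations(agent_config: dict) -> list[str]:
    """Return human-readable policy violations for one parsed agent config."""
    problems = []
    # Mirrors the managed-organization rule: agents may never skip approvals.
    if agent_config.get("approval_policy") == "never":
        problems.append('approval_policy = "never" is not permitted')
    # `sandbox_mode` is an assumed key name; adjust to your config schema.
    mode = agent_config.get("sandbox_mode")
    if mode is not None and mode not in ALLOWED_SANDBOX_MODES:
        problems.append(f"sandbox_mode {mode!r} is not in the allowed set")
    return problems
```

An empty list means the config passes; anything else can fail the commit with the messages printed as-is.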

Spec-driven planning. /plan (or Shift+Tab) lets the manager generate a plan or auto-generated tests that workers must pass[5]. This is similar in spirit to Gemini's Plan Mode but lands in a slightly different place: where Gemini's plan is a Markdown file you edit, Codex's plan is often a test specification that grounds implementation in verifiable outcomes.

The honest critique: Codex's parallel model is genuinely faster for bulk independent work, but it adds orchestration complexity for sequential workflows. If your tasks have tight interdependencies, the parallel model is more friction than help. The TOML agent format is a small but real point of friction when porting agents from the Markdown ecosystem the other three share.


Gemini CLI: The Predictive Defaults

Gemini CLI v0.38.1 is the newest of the four, and Google leaned hard into making the convergent feature set look opinionated through aggressive defaults[4].

Plan Mode as the default. Plan Mode shipped opt-in in v0.33.0 (March 11, 2026)[17] and became the default in v0.34.0 (March 17, 2026). When /plan is the default, every interaction starts in a read-only state where the agent uses grep, read_file, and glob to gather context, then produces a Markdown plan you must approve before any code is written[6]. Same workflow as Claude Code's recommendation and OpenCode's Plan agent; the difference is defaulting to it rather than treating it as an opt-in pattern.

ask_user as a first-class tool. The ask_user tool[7] gives subagents a structured way to pause and surface a decision to the user. Multiple-choice, free-form text, Yes/No. The capability is shared (Claude Code can do this; OpenCode's Plan agent does this; Codex's approval popups do something similar), but Gemini's API surface is the cleanest of the four for the specific case of "subagent encounters a decision and pauses."

1M-token main session. Gemini 2.5 Pro and 3.1 Pro both provide 2M-token context windows in Gemini CLI[18]. For monorepo-scale refactors where the manager benefits from holding the entire codebase in context at once, this is a real capability advantage. The 1M window applies only to the main session orchestrator; subagents have their own isolated, smaller windows.

Auto-routing across Pro and Flash. Gemini CLI auto-routes requests between Pro and Flash variants within a chosen model family (Gemini 2.5 or Gemini 3), using task complexity as the routing signal. This is a useful default for teams that do not want to think about model selection. The pattern (high-reasoning model for harder requests, fast model for simpler ones) is convergent across the field; Codex does the same with GPT-5.5 / 5.4 / 5.4 mini, and OpenCode lets you pick per-task across vendors. In Plan mode, Claude Code will use Haiku to read and summarize files and Opus to do the actual planning.

Memory Bank (/memory). After a major update in Gemini CLI v0.39.0[11], /memory lets you extract skills from a session and review or prune what the agent retains. Worth being explicit: Claude Code has had memory and skills for many months, but this is not Gemini catching up to a longstanding Claude Code capability; it is introducing a novel one. The built-in ability to inspect memory and capture agent skills from it is new and unique. The others can do it with a bit more coaxing, but none of them has a generic, built-in way to review memory and create skill files from it.

MCP, Sandbox, Conductor. Gemini CLI ships MCP integration, sandbox execution (via macOS Seatbelt, gVisor, or LXC backends)[9], and the Conductor extension for multi-track development. None of these are unique. MCP is universal across the field. Sandbox execution exists in Claude Code and Codex with similar configurability. Conductor is one of several plugin ecosystems for spec-driven development; Codex, OpenCode, and Claude Code all have plugin equivalents.

The honest critique: The Gemini CLI launch posts present a polished, opinionated feature set, which can read as innovation when it is closer to packaging. Several of the headlined "new" capabilities (Plan Mode, ask_user, sandbox, memory) shipped earlier in other tools. The 1M-token window is a real differentiator; the rest is mostly defaults and positioning.


Part 3: Where the Four Tools Genuinely Differ

If the primitives are convergent, what is left to compare? Five things that actually matter.


Differentiator 1: Model Lock-In vs. Model-Agnostic

The single biggest non-marketing difference among the four tools is which models they support.

For teams that want to A/B test models or that have policy reasons to switch between vendors, OpenCode is structurally the only neutral option. For teams that want to stay deep on one vendor, the other three each excel at being deep on their respective vendor. There is no universal "right" answer here; it depends on whether your organization's model strategy is single-vendor or multi-vendor.


Differentiator 2: Agent Definition Format

Anthropic published Agent Skills as a formal open standard on December 18, 2025, governed at agentskills.io, and had been shipping the feature as early as October 2025. This is not organic convergence; it is the same MCP playbook: publish a spec, release an SDK, let the ecosystem adopt it. Within months, it was adopted by Claude Code, Codex CLI, Gemini CLI, GitHub Copilot, Cursor, VS Code, Roo Code, Amp, Goose, Windsurf, Mistral, Databricks, and 20+ others.

Claude Code, OpenCode, and Gemini CLI all use Markdown plus YAML frontmatter. Codex CLI uses TOML. The functional content is similar across all four (name, description, tool list, model preference, prompt body), but the file format is not portable without a small translation step.

This is a real friction point if you are running multiple CLIs and want a single source of truth for your agent definitions. The user-side workaround that most teams settle on:

.claude/agents/security-reviewer.md    <- canonical
.opencode/agent/security-reviewer.md  <- copy with minor renames
.gemini/agents/security-reviewer.md   <- copy with minor renames
.codex/agents/security-reviewer.toml  <- TOML translation

Skill-level definitions are easier to share than agent-level ones, but there is no single shared directory that every tool reads. Each tool has its own native skills path:

  • Claude Code: ~/.claude/skills (user level) and .claude/skills (project level)
  • Codex CLI: ~/.codex/skills and .codex/skills
  • Gemini CLI: ~/.gemini/skills and .gemini/skills/
  • OpenCode: ~/.config/opencode/skill and .opencode/skill/ as its primary paths

Both OpenCode and Codex also support .agents/skills/ as a compatibility alias. The format itself is portable (SKILL.md plus YAML frontmatter is an open standard across all four), and that portability comes from the Anthropic-authored Agent Skills standard that everyone implemented, not from accidental parallel evolution. Agent-level definitions still require per-tool wrappers. We will return to skill portability in Part 4.

A January 2026 GitHub issue specifically flagged Gemini CLI's incomplete compliance with the Agent Skills standard. This may have been fixed since; I have not found any problems myself. It would be nice if all four supported the standard fully, and if they all read AGENTS.md instead of GEMINI.md and CLAUDE.md. OpenCode and Codex do support AGENTS.md, so perhaps they are the ones more inclined to follow standards. Anthropic was the first company to support Agent Skills, so Claude Code gets a free pass. I also noticed that Gemini will read skills from the Claude directories. I digress.


Differentiator 3: Background and Scheduled Work

Claude Code is the only tool of the four with native, well-integrated support for scheduled background routines. Claude Code Routines (research preview, April 2026[14]) lets an agent run on a cron schedule, in response to a GitHub event, or via API call. The other three CLIs have plugin or extension paths to similar capability, but none ship it as native infrastructure with the same level of integration.

For teams running monitoring, periodic analysis, or event-driven automation as part of an agentic workflow, this is a real differentiator. For teams whose use of AI agents is purely interactive, it does not matter much.


Differentiator 4: Approval Gate Model

All four tools support pausing for human approval. The implementations differ in defaults and ergonomics.

  • Gemini CLI: Plan Mode is the default read-only state[6]. ask_user is a first-class tool. Approval is the gate; execution is the exception.
  • OpenCode: Plan agent defaults file edits and shell commands to ask mode. Build agent is the default with all tools enabled. The tab-to-swap between them puts the user in explicit control of the approval boundary.
  • Codex CLI: Approval gates fire when a subagent tries to execute a command outside its sandbox policy[16]. Popups surface from background threads even when you are looking at the main thread. Less of a "default plan first" model and more of a "default execute, but interrupt for restricted operations."
  • Claude Code: Plan mode is recommended for non-trivial work but is not the default. Ask-user equivalents fire when the agent surfaces a question. Less opinionated than Gemini or OpenCode.

For regulated environments where every step needs to be approvable, Gemini CLI's defaults and OpenCode's Plan/Build split are the most cleanly aligned with the audit-trail expectation. For teams that want the agent to be productive by default and only interrupt for genuine risk, Claude Code and Codex give a smoother flow.


Differentiator 5: Context Window of the Manager

Subagents have isolated windows in all four tools, so the size that matters at the orchestration level is the main session. Gemini CLI's 1M-token main session is meaningfully larger than the others' 200K-class windows. For codebases small enough to fit in 200K tokens, the difference is invisible. For monorepos large enough that the manager would otherwise navigate by grep, the 1M window measurably improves routing precision.

Worth being concrete: 1M tokens is on the order of tens of thousands of lines of code (very rough, language- and whitespace-dependent). If your repo is below that threshold, no advantage. If it is above, the manager-side context advantage is real but narrow (only the manager benefits; subagents still operate under their own smaller windows).
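
If you want a quick read on which side of the threshold your own repo falls, a rough estimator is enough. The sketch below walks a directory and divides character count by four. That ratio is a common rule of thumb rather than a real tokenizer, so treat the result as an order-of-magnitude figure. The extension list and the 20% reserve for conversation are also my assumptions.

```python
import os

CHARS_PER_TOKEN = 4  # rough rule of thumb; real tokenizers vary by language


def estimate_repo_tokens(root: str,
                         extensions: tuple[str, ...] = (".py", ".ts", ".go")) -> int:
    """Estimate tokens for source files under root from their character count."""
    total_chars = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune hidden directories such as .git in place so os.walk skips them.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in filenames:
            if name.endswith(extensions):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    total_chars += len(handle.read())
    return total_chars // CHARS_PER_TOKEN


def fits_in_window(tokens: int, window: int = 1_000_000, reserve: float = 0.2) -> bool:
    """True if tokens fit while leaving a fraction of the window for conversation."""
    return tokens <= window * (1 - reserve)
```

Running `fits_in_window(estimate_repo_tokens("."))` from the repo root gives a yes/no answer; pass a different `window` to compare against a smaller manager context.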


Part 4: The Skill-Portability Story

The most underdiscussed feature of the convergence is that the same skill files are portable across all four CLIs at the format level.

The portable unit is the SKILL.md format (Markdown plus YAML frontmatter). Discovery paths differ by tool, so portability is achieved by placing (or symlinking/copying) the same skill folder into each tool's native skills location.

---
name: run_lint
description: Run the repository linter, summarize, and write lint-report.md
---


## Inputs and outputs
- Read: package.json, Makefile, lint config
- Write: lint-report.md

## Workflow
1. Detect the repo's preferred lint command.
2. Run without applying fixes unless explicitly asked.
3. Summarize results grouped by file, rule, and severity.

## Guardrails
- Do not modify source files unless the user asks for fix mode.

The core idea is that the skill content is portable; what changes is where each tool discovers it. In practice, teams running multiple CLIs often maintain one canonical skills directory in a repo and then copy/symlink the same skill folders into each tool's preferred path.
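
Here is a minimal sketch of that copy step. It assumes a canonical directory named `skills/` at the repo root (my choice, not a convention any of the tools define) and uses the project-level skill paths listed earlier in this article. It copies rather than symlinks so it behaves the same on every platform.

```python
import shutil
from pathlib import Path

# Project-level skill directories as listed earlier in this article.
NATIVE_SKILL_DIRS = [".claude/skills", ".codex/skills", ".gemini/skills", ".opencode/skill"]


def sync_skills(repo: Path, canonical: str = "skills") -> list[Path]:
    """Copy each canonical skill folder into every tool's native skills path."""
    source_root = repo / canonical
    written = []
    # Only folders that actually contain a SKILL.md count as skills.
    skill_dirs = sorted(p for p in source_root.iterdir() if (p / "SKILL.md").is_file())
    for skill_dir in skill_dirs:
        for native in NATIVE_SKILL_DIRS:
            target = repo / native / skill_dir.name
            target.parent.mkdir(parents=True, exist_ok=True)
            # dirs_exist_ok lets repeated runs refresh existing copies.
            shutil.copytree(skill_dir, target, dirs_exist_ok=True)
            written.append(target)
    return written
```

Run `sync_skills(Path("."))` after editing anything under the canonical directory. One caveat: files deleted from the canonical copy are not removed from the targets, so a stricter version would clear each target folder first.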

The agent wrapper around the skill is where format differences live (Markdown YAML for three, TOML for Codex). But the skill itself, the workflow definition, the inputs and outputs, the guardrails, is portable.

This is the concrete shape of the convergence. It is not "all four tools made the same architectural bet." It is "all four tools converged on a shared skill format that means a run_lint skill written for OpenCode runs in Codex and Gemini with no changes." That is a much more useful kind of convergence than the one the launch posts emphasize.

For teams maintaining workflows across multiple CLIs, the practical pattern is:

  • Canonical skills directory: a team-chosen source of truth (often a repo-local directory), then distributed into each tool's native skills path.
  • Per-runtime agents: small Markdown or TOML wrappers in .claude/agents/, .opencode/agents/, .codex/agents/, .gemini/agents/ that name the skill, set the model, and configure the tool list.
  • Shared trackers and outputs: Markdown trackers, JSON state files, log files written by all four runtimes use the same canonical paths and formats. Format drift between runtimes is the failure mode to avoid.

This is the "tri-runtime" or "quad-runtime" pattern that the author has seen emerging in the community. The author has working master prompts that automate the conversion of a Claude Code repository into a dual Claude+Codex or Claude+Gemini runtime, preserving .claude/ untouched and adding the parallel runtime alongside. In the author's experience, the conversion is largely mechanical -- suggesting the underlying problem surface is now more aligned than differentiated.


Part 5: Picking a Workflow

The April 2026 convergence makes "which CLI?" a less interesting question than "which workflow?" Here is the honest mapping.

AI Agents workflow comparison: Claude Code, OpenCode, Gemini CLI, and Codex CLI subagent philosophies and key differentiators side by side. They all feel about the same.

I find that Gemini CLI does a really good job of understanding the whole codebase quickly. It seems very fast. Codex seems to catch more edge cases during planning (not always). OpenCode plus the Codex model seems to be even better at catching edge cases than Codex plus the Codex model.

I do most workflows in Claude Code, and second place is a tie between Codex and OpenCode. But now that it is quite easy to port from one to the next, since they all have similar primitives, the pain is less when I run out of my double max Claude Code plan before the end of the month or week or session.

I have prompts that convert a Claude Code environment to a Codex, Gemini, or OpenCode environment and get me 95% of the way there, so switching is not as painful.

Bulk Automation (overnight, parallel, reviewer-checked)

You have a long list of independent tasks. You want them done in parallel with quality checks before morning.

  • Codex CLI for the strongest defaults: in-session thread branching (/fork) for parallel workers, approval gates for background threads, and a built-in pattern for explorer->worker->reviewer pipelines. Claude Code has /fork too, and it works the same way. I have used the Claude Code version a lot; I only found out that Codex supports it while writing this article. :)
  • OpenCode if you want the same pattern across multiple model vendors or you want to A/B test reviewer agents on different models.
  • Claude Code and Gemini CLI can do this with custom agents and a coordinating script, but neither defaults to the parallel-bulk model the way Codex does.

Scheduled and Event-Driven Routines

You need an agent that runs on cron, on a GitHub event, or via API.

  • Claude Code is the clear best choice; Routines is the most integrated scheduled-agent infrastructure of the four.
  • The other three require plugin or external-orchestrator paths.

Multi-Vendor or Cost-Sensitive Workflows

You want to run the same workflow against multiple model vendors, or you want to pick the cheapest model for each task.

  • OpenCode is the only structurally model-agnostic option. The same skill files run against GPT, Claude, and Gemini, and Copilot login covers a lot of ground.

The reasonable read of the table above is that most non-trivial teams will run more than one of these CLIs. The convergence on shared skill formats and MCP integration makes that less expensive than it sounds, but it is still real overhead. Picking one as your daily driver and a second for a specific scenario (Claude Code for interactive + Codex for overnight bulk, or OpenCode for everything if you are multi-vendor by policy) is the most common pattern.


Part 6: Hooks -- The Determinism Layer

There is one feature the convergence story underplays because it does not fit the subagent narrative: hooks. Hooks are the mechanism that lets you inject deterministic behavior into an agent loop that is, by nature, probabilistic. When every major coding CLI now delegates work across multiple sub-agents, the question of what happens when an agent does something it should not becomes structural, not incidental. Hooks are the answer the field has converged on -- at very different speeds.

What a Hook Is

A hook is a synchronous interception point in the agent's execution loop. At defined events -- before a tool call is made, after it completes, when the session starts, when the user submits a prompt -- the runtime pauses and passes control to an external script or process you define. That script can inspect the context, modify it, block the action, log it, fire an external notification, or simply let it pass through unchanged.

The canonical examples:

  • PreToolUse -- intercept every Write or Bash call before it executes; reject calls that touch protected paths, inject a required approval comment, or route the event to an audit log
  • PostToolUse -- after a file write, trigger a linter, run a security scanner, or update a tracker
  • UserPromptSubmit -- inject context (current branch, ticket number, compliance policy) into every prompt before the model ever sees it
  • SessionStop / HTTP webhook -- notify Slack, fire a CI event, or record the session summary to a database

The key property: the agent cannot override a hook. It is not a prompt instruction. It is not a recommendation in a CLAUDE.md file. It is code that runs at the infrastructure layer, outside the model's decision space entirely. This is what the commenter in the article thread called "needed determinism in the chaos of the probabilistic world." That framing is exactly right: hooks are the layer where you stop asking the model to remember a policy and start enforcing it structurally.
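
To make the control flow concrete, here is a toy model of a hook-aware tool loop. It is a sketch, not the implementation of any of the four CLIs: `ToolCall`, `run_with_hooks`, and the example hooks are hypothetical names invented for illustration. The point it shows is that the block decision is made by ordinary code before the tool runs, so nothing the model says can change it.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ToolCall:
    tool_name: str
    tool_input: dict


# A pre-hook returns None to allow the call, or a reason string to block it.
PreHook = Callable[[ToolCall], Optional[str]]
# A post-hook observes the call and its result; it cannot block anything.
PostHook = Callable[[ToolCall, str], None]


def run_with_hooks(
    call: ToolCall,
    tool_fn: Callable[[ToolCall], str],
    pre_hooks: list[PreHook],
    post_hooks: list[PostHook],
) -> str:
    """Run pre-hooks, then the tool, then post-hooks. Any pre-hook can veto."""
    for hook in pre_hooks:
        reason = hook(call)
        if reason is not None:
            return f"BLOCKED: {reason}"  # the tool never executes
    result = tool_fn(call)
    for hook in post_hooks:
        hook(call, result)
    return result


def block_env_files(call: ToolCall) -> Optional[str]:
    """Example pre-hook: refuse writes to .env files."""
    path = call.tool_input.get("file_path", "")
    if call.tool_name == "Write" and path.endswith(".env"):
        return f"writes to {path} are not allowed"
    return None


audit_log: list[str] = []


def record(call: ToolCall, result: str) -> None:
    """Example post-hook: append every completed call to an audit log."""
    audit_log.append(f"{call.tool_name} -> {result}")


def fake_write(call: ToolCall) -> str:
    """Stand-in for a real file-writing tool."""
    return f"wrote {call.tool_input['file_path']}"
```

Calling `run_with_hooks(ToolCall("Write", {"file_path": "app/.env"}), fake_write, [block_env_files], [record])` returns the blocked message without ever invoking `fake_write`, and nothing is added to `audit_log`.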

Who Ships Hooks and When

Claude Code was the first of the four to ship hooks, in v1.0.59 on July 23, 2025, the same summer it shipped subagents. I wrote an article about this around the same time. The full event set (PreToolUse, PostToolUse, SessionStop, UserPromptSubmit, AfterAgent) shipped from day one. Anthropic later added other events along the way and, in early 2026, HTTP Hooks, which let a Claude Code session fire webhooks to external systems on any lifecycle event. That is over nine months of production maturity at the time of this writing.

And if we are being honest, I believe hooks have been part of Cursor for longer than Claude Code.

Gemini CLI launched hooks in v0.26.0 on January 27, 2026, approximately six months after Claude Code. I have written plugins with installers for Gemini CLI, Claude Code, and OpenCode that use each tool's equivalent of Claude Code hooks. The Gemini hooks model supports BeforeRequest interception, tool call validation, context injection, and notification events, and hooks can be bundled directly into Gemini CLI extensions for shareable, reusable enforcement logic. It is stable and well-documented. It did not support as many events as Claude Code the last time I checked, which was in February or March.

Codex CLI added an experimental hooks engine in v0.114.0 on March 10, 2026, behind a feature flag (features.codex_hooks). The current event set covers SessionStart and SessionStop only; there are no PreToolUse or PostToolUse equivalents yet. The feature is explicitly experimental, which is the honest signal that it is not yet ready for production enforcement workflows. Hooks without PreToolUse or PostToolUse are like chocolate chip cookies without warm milk: you could do it, but what is the point?

OpenCode handles this through a lifecycle plugin model rather than a native hook system. The functional surface area is similar (plugin manifests define interception points in the agent loop), but the architecture differs: hooks are configured through the plugin registry rather than a dedicated hooks config block. For teams that are already building OpenCode plugins, the capability is real; for teams that want a hook without building a plugin, it requires more scaffolding than in Claude Code or Gemini CLI. It also does not support as many events as Claude Code.

Why This Matters More Than It Sounds

The table above understates the gap. A PreToolUse hook that blocks file writes to /secrets/** is not the same capability as a SessionStart hook that logs a session ID. One is an enforcement primitive; the other is an observability primitive. Codex's current experimental hook set is closer to the latter. No bueno.

For teams deploying coding CLIs in regulated environments, this gap is often the decision point. You can audit after the fact with PostToolUse logging. You can enforce before the fact with PreToolUse blocking. Without PreToolUse, the best you can do is detect violations after they occur, which is not compliance; it is incident response.
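
The difference between the two primitives is easy to see in a single hook script. The sketch below assumes a runtime that passes the event as JSON on stdin and treats exit code 2 as "block this call", which is the convention I understand Claude Code to document. The field names (`hook_event_name`, `tool_input`, `file_path`) and the exit code are assumptions to verify against your CLI's current hook reference before relying on them.

```python
import fnmatch
import json
import posixpath
import sys

# fnmatch's "*" also matches "/", so this pattern covers nested paths too.
PROTECTED_PATTERNS = ["/secrets/*"]
BLOCK_EXIT_CODE = 2  # assumption: the runtime treats this exit code as a veto


def is_protected(path: str) -> bool:
    """Normalize the path first so '/app/../secrets/key' cannot slip through."""
    normalized = posixpath.normpath(path)
    return any(fnmatch.fnmatchcase(normalized, pattern) for pattern in PROTECTED_PATTERNS)


def handle_event(event: dict) -> tuple[int, str]:
    """Return (exit_code, message) for one hook event."""
    name = event.get("hook_event_name", "")
    path = event.get("tool_input", {}).get("file_path", "")
    if name == "PreToolUse" and is_protected(path):
        # Enforcement: the write never happens.
        return BLOCK_EXIT_CODE, f"blocked write to protected path: {path}"
    if name == "PostToolUse" and is_protected(path):
        # Observability: the write already happened; all we can do is report it.
        return 0, f"AUDIT: protected path was already modified: {path}"
    return 0, ""


def main(stream=sys.stdin) -> int:
    """Read one JSON event, print any message to stderr, return the exit code."""
    code, message = handle_event(json.load(stream))
    if message:
        print(message, file=sys.stderr)
    return code


# To use this as a hook script, end the file with: sys.exit(main())
```

With only the PostToolUse branch available, the script can tell you that /secrets was touched. With the PreToolUse branch, it can stop that from happening.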

Claude Code's nine-month head start here means the ecosystem of pre-built hooks (security scanners, compliance injectors, CI bridge scripts) is substantially deeper than what exists for the other three. The convergence that happened in April 2026 was on subagents, plan modes, and skill formats. On hooks, the tools are still on different chapters.


Part 7: What Comes Next

The April 2026 convergence closed one chapter. Three things are likely to define the next one.

Cross-runtime agent and skill standards. The skill format is already converging. Agent definitions are next. Expect either a community-driven shared format (my bet is Markdown with YAML frontmatter) or an MCP-mediated agent-discovery layer that abstracts away the format differences entirely. The author's master prompts that convert Claude Code agents into Codex TOML or Gemini Markdown formats are a temporary workaround; the longer-term answer is a shared format that does not require translation.

Cross-vendor handoffs. A Gemini CLI planning session producing a structured plan that Codex workers execute in parallel. A Claude Code interactive session handing off to OpenCode for multi-model verification. These workflows do not yet have first-class plumbing, but they are obvious enough that someone will build them. MCP is the most likely substrate.

Approval-gate standardization. ask_user is a clean primitive but a Gemini-specific implementation. The interesting next step is making "subagent pauses for human input" an MCP tool call rather than a vendor primitive. Once that happens, every CLI that supports MCP gains the capability uniformly.
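
As a rough sketch of what that could look like on the wire: MCP tool invocations are JSON-RPC 2.0 requests with the method `tools/call`, and tool results carry a `content` list. Everything specific to the approval gate below is hypothetical, including exposing `ask_user` as an MCP tool, its `question` argument, and the `ask` callable that stands in for the human prompt.

```python
from typing import Callable


def build_ask_user_request(request_id: int, question: str) -> dict:
    """Client side: a JSON-RPC 2.0 request in the shape MCP uses for tool calls."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "ask_user", "arguments": {"question": question}},
    }


def handle_tool_call(request: dict, ask: Callable[[str], str]) -> dict:
    """Server side: wait for the human's answer, then return it as a text result."""
    request_id = request.get("id")
    params = request.get("params", {})
    if request.get("method") != "tools/call":
        error = {"code": -32601, "message": "method not found"}
        return {"jsonrpc": "2.0", "id": request_id, "error": error}
    if params.get("name") != "ask_user":
        error = {"code": -32602, "message": "unknown tool"}
        return {"jsonrpc": "2.0", "id": request_id, "error": error}
    # The subagent is paused here until `ask` returns.
    answer = ask(params.get("arguments", {}).get("question", ""))
    result = {"content": [{"type": "text", "text": answer}], "isError": False}
    return {"jsonrpc": "2.0", "id": request_id, "result": result}
```

In a terminal you could pass `ask=input`; a hosted server might route the question to Slack or a web form instead. Because the pause lives behind a tool call, the calling CLI would not need to know which.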

The convergence is the story of April 2026. The standardization that follows is the story of the next twelve months. The teams that start treating agents and skills as portable assets now are the ones that will benefit most when the cross-runtime layer becomes routine.

My general opinion: use them all. They tend to leapfrog each other. There was a time when I would only use Codex as a last resort; now I fairly regularly use it for certain tasks. I focus on Claude Code mostly because I know it the best, but with all of this convergence, I can and do use the others. It is always good to have a backup plan. There are times when you might run out of tokens, or there is an outage, or you just want a second opinion on a tricky task.

Fact check false! Rick does not have hair on his head. Skilz is an installer for Agent Skills.


References

[1]: Anthropic introduces subagents in Claude Code. Winbuzzer, July 26, 2025. https://winbuzzer.com/2025/07/26/anthropic-rolls-out-sub-agents-for-claude-code-to-streamline-complex-ai-workflows-xcxwbn/

[2]: OpenCode Plan and Build agents. OpenCode Docs: Agents, 2026. https://opencode.ai/docs/agents/

[3]: Codex CLI subagents reach GA. Simon Willison, March 16, 2026. https://simonwillison.net/2026/Mar/16/codex-subagents/

[4]: Subagents in Gemini CLI v0.38.1. Google Developers Blog, April 15, 2026. https://developers.googleblog.com/subagents-have-arrived-in-gemini-cli/

[5]: Codex CLI slash commands including /plan. OpenAI Codex Docs, 2026. https://developers.openai.com/codex/cli/slash-commands

[6]: Plan Mode in Gemini CLI. Gemini CLI Docs: Plan Mode, 2026. https://geminicli.com/docs/cli/plan-mode/

[7]: ask_user tool in Gemini CLI. Gemini CLI Docs: Ask User Tool, 2026. https://geminicli.com/docs/tools/ask-user/

[8]: Codex sandbox modes. OpenAI Codex Sandboxing, 2026. https://developers.openai.com/codex/cli/sandboxing

[9]: Gemini CLI sandbox backends. Gemini CLI Docs: Sandbox, 2026. https://geminicli.com/docs/sandbox/

[10]: Claude Code skills and memory predate Gemini's /memory. Anthropic Claude Code Docs, 2025. https://code.claude.com/docs/en/skills

[11]: Memory Bank in Gemini CLI v0.39.0. Gemini CLI v0.39.0 Changelog, April 2026. https://github.com/google-gemini/gemini-cli/releases/tag/v0.39.0

[12]: MCP integration across CLIs. Model Context Protocol overview. https://en.wikipedia.org/wiki/Model_Context_Protocol

[13]: MCP server registry passes 10,000 active servers. MCP Statistics, December 2025. https://mcpevals.io/stats

[14]: Claude Code Routines research preview. Anthropic Docs, April 2026. https://docs.claude.com/en/docs/claude-code/routines

[15]: /fork and /agent slash commands. Codex Subagents Docs, OpenAI Developers, 2026. https://developers.openai.com/codex/cli/subagents

[16]: Codex approval policy and requirements.toml. OpenAI Codex Approvals & Security, 2026. https://developers.openai.com/codex/cli/approvals

[17]: Plan Mode launches in Gemini CLI v0.33.0. GitHub Discussion #22078, google-gemini/gemini-cli, March 11, 2026. https://github.com/google-gemini/gemini-cli/discussions/22078

[18]: Gemini 2.5 Pro 1M-token context. Google DeepMind Blog, March 2025. https://blog.google/technology/google-deepmind/gemini-model-thinking-updates-march-2025/

[19]: Codex models GPT-5.5 / 5.4 / 5.4 mini. OpenAI Codex Models, 2026. https://developers.openai.com/codex/cli/models


About the Author

Rick Hightower is a former Senior Distinguished Engineer at a Fortune 100 company, where he focused on delivering ML / AI insights to front-line applications, and a practitioner building multi-agent production systems. Follow him on Medium for more hands-on agent engineering content. You can also book him to speak and train your team: Check out Rick Hightower's SpeakerHub.

He created skilz, the universal agent skill installer, supporting 30+ coding agents including Claude Code, Gemini, Copilot, and Cursor, and co-founded the world's largest agentic skill marketplace. Connect with Rick Hightower on LinkedIn or Medium. Check out SpillWave, your source for AI expertise.

Rick has been actively developing generative AI systems, agents, and agentic workflows for years. He is the author of numerous agentic frameworks and developer tools and brings deep practical expertise to teams looking to adopt AI. He enjoys writing about himself in the 3rd person.

Rick also wrote a Claude Certified Architect (CCA) series of articles that has a lot of useful information on writing agentic AI systems. A lot of the ideas captured in the CCA series and the exam prep that Rick wrote echo what you see in this article. If you want to improve your ability to create well-behaved AI agents, studying for the CCA Exam is a good place to start.

CCA Exam Prep on Agentic Development

Rick also wrote a series on harness engineering and how to improve agentic systems using harness engineering for feedback loops and adversarial agents. These articles also go hand in hand with this article.

Harness Engineering Articles
