# AI Engine Comparison
Last updated: 2026-03-12
GitHub Agentic Workflows supports four AI engines for running agentic workflows. Choosing the right engine affects research quality, cost, and reliability. Here's how they compare for use in AgentPages.
## The Four Engines
| Engine | `engine.id` | Provider | Required Secret |
|---|---|---|---|
| GitHub Copilot CLI (default) | `copilot` | GitHub/Microsoft | `COPILOT_GITHUB_TOKEN` |
| Claude Code | `claude` | Anthropic | `ANTHROPIC_API_KEY` |
| OpenAI Codex | `codex` | OpenAI | `OPENAI_API_KEY` |
| Google Gemini CLI | `gemini` | Google | `GEMINI_API_KEY` |
Copilot is the default. Omit the `engine:` block entirely to use it.
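As a minimal sketch of what that looks like, a workflow whose frontmatter has no `engine:` block at all runs on Copilot (the trigger and permissions shown here are illustrative assumptions, not a complete workflow):

```yaml
# Illustrative frontmatter sketch — no engine: block, so the
# Copilot default applies. Trigger and permissions are assumptions.
on:
  schedule:
    - cron: "0 6 * * 1" # weekly run, Mondays 06:00 UTC
permissions:
  contents: read
```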
## GitHub Copilot CLI
The easiest to set up if you already have a GitHub Copilot subscription — no extra API key needed.
- Uses your existing GitHub token (`COPILOT_GITHUB_TOKEN`)
- Tight integration with GitHub's ecosystem
- Supports custom agent files in `.github/agents/` for specialized system prompts
- Billing through GitHub rather than a separate API provider
```yaml
engine:
  id: copilot
  model: gpt-5 # or gpt-5-mini, gpt-4.1, gpt-4.1-mini
  agent: research-agent # references .github/agents/research-agent.agent.md
```

## Claude (Claude Code)
The default engine in the official AgentPages template. Anthropic's Claude is a popular choice for research agents due to its strong long-context reasoning and writing quality.
- Excellent synthesis of complex, multi-source research
- Wide model range: Opus (frontier) → Sonnet (balanced) → Haiku (budget)
- Strong instruction-following for nuanced research prompts
```yaml
engine:
  id: claude
  model: claude-sonnet-4-6 # recommended for research
  # model: claude-opus-4-6 # highest quality, highest cost
  # model: claude-haiku-4-5 # fastest, lowest cost
```

### Choosing a Claude Model
| Model | Best for |
|---|---|
| `claude-opus-4-6` | Complex multi-step research synthesis |
| `claude-sonnet-4-6` | Balanced quality/cost — AgentPages default |
| `claude-haiku-4-5` | Triage, labeling, quick summaries |
## OpenAI Codex
OpenAI's Codex engine gives access to the GPT-5 series and o-series reasoning models.
- Access to GPT-5, GPT-4.1, and o-series reasoning models
- Strong at code-heavy tasks and structured data extraction
- o-series models excel at multi-step planning
```yaml
engine:
  id: codex
  model: gpt-5 # frontier
  # model: gpt-4.1 # balanced
  # model: gpt-4.1-mini # budget
```

## Google Gemini CLI
Google's Gemini engine — the newest addition to gh-aw.
- Very large context windows — excellent for large knowledge bases
- Competitive pricing, especially for the Flash tier
- Newest engine, less battle-tested in production gh-aw use
```yaml
engine:
  id: gemini
  model: gemini-3-pro-preview # or gemini-2.5-flash for budget
```

## Which Engine Should You Use?
| Use Case | Recommended | Why |
|---|---|---|
| Getting started quickly | Copilot | No extra API key needed |
| Best research quality | Claude Sonnet/Opus | Strong synthesis and writing |
| Budget-conscious operation | Claude Haiku or GPT-4.1 mini | Much lower cost per run |
| Code-heavy research topics | Codex (GPT-5) | Strong at technical content |
| Large knowledge bases | Gemini Pro/Flash | Very large context window |
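Acting on the table above is a one-line model change. For example, switching to the budget-conscious row might look like this (a sketch using the model IDs listed earlier in this page):

```yaml
engine:
  id: claude
  model: claude-haiku-4-5 # budget tier from the table above
```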
## Version Pinning
For reproducible builds, pin the engine CLI version:
```yaml
engine:
  id: claude
  version: "2.1.70" # pin to a specific Claude Code CLI release
  model: claude-sonnet-4-6
```

This prevents unexpected behavior from new CLI releases. Remember to update periodically to get bug fixes.
## Switching Engines
After changing `engine:` in your workflow `.md` file, you must recompile:

```shell
gh aw compile .github/workflows/research.md
git add .github/workflows/research.md .github/workflows/research.md.lock.yml
git commit -m "Switch to Claude engine"
git push
```

The lock file embeds engine-specific configuration and must be regenerated on every engine change.
## Cost Monitoring
Use these commands to track usage and cost:
```shell
# View logs for recent runs
gh aw logs

# Deep-dive into a specific run's token usage
gh aw audit <run-id>
```