AI Model Comparison

Claude vs ChatGPT vs Gemini

A concise comparison of the latest Claude, ChatGPT/GPT, and Gemini models on public benchmarks, with what the scores mean for coding, automation, reasoning, and cost per task.

Updated 2026-06-01

Current comparison

Latest public Claude, ChatGPT, and Gemini

The table uses public benchmark rows. Unreleased, preview, alpha, beta, internal, research, and prototype models are excluded before the latest model is selected for each provider.
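The selection rule above can be sketched in a few lines of Python. This is a minimal illustration, not the page's actual pipeline: the `Row` fields (`stage`, `released`), the stage labels, and the model names are assumptions made for the example.

```python
from dataclasses import dataclass

# Stage labels mirror the exclusion list in the text; the exact strings
# used by any real benchmark feed are an assumption.
EXCLUDED_STAGES = {
    "unreleased", "preview", "alpha", "beta",
    "internal", "research", "prototype",
}


@dataclass(frozen=True)
class Row:
    provider: str   # e.g. "Claude", "ChatGPT", "Gemini"
    model: str      # placeholder model name (hypothetical)
    stage: str      # release stage label (assumed field)
    released: str   # ISO 8601 date string (assumed field)


def latest_public_rows(rows):
    """Drop excluded stages first, then keep the newest row per provider."""
    latest = {}
    for row in rows:
        if row.stage.lower() in EXCLUDED_STAGES:
            continue  # filtered out before any provider selection happens
        current = latest.get(row.provider)
        # ISO dates compare correctly as plain strings.
        if current is None or row.released > current.released:
            latest[row.provider] = row
    return latest


# Illustrative rows only; names and dates are made up.
example_rows = [
    Row("Claude", "stable-model", "public", "2025-01-01"),
    Row("Claude", "next-model", "preview", "2025-06-01"),
    Row("Gemini", "stable-model", "public", "2025-03-01"),
]
selected = latest_public_rows(example_rows)
```

Filtering before selection matters: in the example, the newer preview row never competes with the older public row, so the public one is kept.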


How to read it

What the scores mean

Overall score is a shortlist signal. Coding, agentic workflow, reasoning, context, and token price determine which model is the right one for a given task, and the answer can change from task to task.
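One way to see why the right model can change by task is to weight the same category scores differently per task. The sketch below is illustrative only: the model names, scores, and weights are invented for the example and do not come from any benchmark feed.

```python
def rank_models(scores, weights):
    """Rank model names by weighted average of their category scores."""
    total_weight = sum(weights.values())

    def weighted(name):
        return sum(scores[name][cat] * w for cat, w in weights.items()) / total_weight

    return sorted(scores, key=weighted, reverse=True)


# Hypothetical category scores on a 0-100 scale.
scores = {
    "model_a": {"coding": 90, "agentic": 60, "reasoning": 80},
    "model_b": {"coding": 75, "agentic": 85, "reasoning": 78},
}

# Hypothetical task profiles: the same scores, weighted differently.
coding_task = {"coding": 0.7, "agentic": 0.1, "reasoning": 0.2}
automation_task = {"coding": 0.2, "agentic": 0.7, "reasoning": 0.1}

coding_ranking = rank_models(scores, coding_task)          # model_a ranks first
automation_ranking = rank_models(scores, automation_task)  # model_b ranks first
```

With these made-up numbers, the coding-heavy profile prefers `model_a` and the automation-heavy profile prefers `model_b`, even though no individual score changed.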

Coding is not one number

A high coding score is the first filter, not the final decision. Repo repair, terminal-agent behavior, long-context handling, and results on your own local tests decide whether the model can actually make working changes in a real codebase.

Agentic work changes the winner

Workflow automation rewards tool selection, persistence, browser or terminal actions, and recovery from bad intermediate results. That can favor a different model than raw coding or reasoning.

Cost means completed task

Token price matters only after accounting for retries, long context, output length, cached input, tool calls, and human correction. Cheap tokens can be expensive if the task fails.
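The arithmetic behind that claim can be sketched as follows. This is a simplified model under stated assumptions: attempts are independent with a fixed success rate (so the expected number of attempts is 1 / success rate), prices are in USD per million tokens, and every number in the example is invented for illustration.

```python
def cost_per_completed_task(
    input_tokens,
    cached_input_tokens,
    output_tokens,
    price_in,            # USD per 1M uncached input tokens (assumed unit)
    price_cached,        # USD per 1M cached input tokens (assumed unit)
    price_out,           # USD per 1M output tokens (assumed unit)
    success_rate,        # fraction of attempts that complete the task
    tool_cost_per_attempt=0.0,
    human_cost_per_task=0.0,
):
    """Expected USD cost of one completed task, retrying until success."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    token_cost = (
        input_tokens * price_in
        + cached_input_tokens * price_cached
        + output_tokens * price_out
    ) / 1_000_000
    attempt_cost = token_cost + tool_cost_per_attempt
    # Expected attempts until first success is 1 / success_rate.
    return attempt_cost / success_rate + human_cost_per_task


# Hypothetical comparison: same workload, different prices and success rates.
cheap = cost_per_completed_task(
    100_000, 0, 10_000, price_in=1.0, price_cached=0.0, price_out=4.0,
    success_rate=0.25,
)
pricey = cost_per_completed_task(
    100_000, 0, 10_000, price_in=3.0, price_cached=0.0, price_out=12.0,
    success_rate=0.9,
)
```

With these invented inputs, the model with three times the token price comes out cheaper per completed task (about 0.47 USD versus 0.56 USD) because it needs far fewer retries.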

Method

Use this as a model shortlist

For production, choose the model by task (repo repair, automation, retrieval, regulated-domain work) and by constraint (latency and cost per completed task).

This page is intentionally narrow. It compares the latest public Claude, ChatGPT/GPT, and Gemini rows, then points back to the full benchmark guide for benchmark-family selection, cost interpretation, and local evaluation design.


Inspect first

Sources

Third-party data note: live rows come from public benchmark and pricing feeds, not internal Dreamers testing. Preview, alpha, beta, internal, research, and prototype rows are excluded before leaders are shown.