ModelBeat

Anthropic: Claude Opus 4.6

anthropic/claude-opus-4.6

Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks.

Context: 1M tokens
Input: $5 / 1M tokens
Output: $25 / 1M tokens
Released: Feb 5, 2026
Modalities: text, image, file → text
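
The listed rates translate into a per-request cost with simple arithmetic. The sketch below is illustrative only: `estimate_cost_usd` is a hypothetical helper, and it assumes the flat listed rates apply to every token, ignoring any caching discounts, context-length tiers, or provider fees that may apply in practice.

```python
# Listed prices from this page, in USD per 1M tokens.
INPUT_USD_PER_MTOK = 5.0
OUTPUT_USD_PER_MTOK = 25.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate one request's cost at the flat listed rates (an assumption)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 200,000 input tokens and 4,000 output tokens.
    print(f"${estimate_cost_usd(200_000, 4_000):.2f}")
```

At these rates an output token costs five times as much as an input token, so long generations tend to dominate the bill even when the prompt is large.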

Benchmarks

Scores on standardized evaluations. Higher is better, and the percentile rank shows where Claude Opus 4.6 lands among all models tracked on ModelBeat.

Intelligence Index (Epoch AI): 84.0, 84th percentile of tracked models
Coding Index (Epoch AI): 92.0, 92nd percentile of tracked models
Agentic Index (Epoch AI): 88.0, 88th percentile of tracked models

Reasoning

7 evals
ARC-AGI: 94.0%
Abstract visual reasoning

ARC-AGI-2: 69.2%
Harder abstract reasoning

GPQA Diamond: 90.5%
Graduate-level scientific reasoning

Humanity's Last Exam: 34.4%
Frontier of human expert knowledge

SimpleBench: 67.6%
Common-sense trick questions

SimpleQA Verified: 46.5%
Factual accuracy & hallucination

WeirdML: 78.0%
Novel ML problem-solving

Coding

3 evals
GSO (code optimization): 41.2%
Code performance optimization

SWE-bench Verified: 78.7%
Real GitHub issue resolution

WebDev Arena: 1556
Human-rated web development

Math

3 evals
AIME 2024/2025: 94.4%
Olympiad-qualifier math

FrontierMath: 40.7%
Research-level math problems

FrontierMath Tier 4: 22.9%
Hardest research math

Agentic & Tools

3 evals
APEX: 31.7%
Multi-step agentic tasks

METR task horizon: 12.0 h
Autonomous task length

Terminal-Bench: 79.8%
Command-line agentic tasks

Model & benchmark data from Epoch AI (CC BY); pricing, specs & descriptions from OpenRouter.