anthropic/claude-opus-4.6
Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks.
Scores on standardized evaluations. Higher is better, and the rank shows where Claude Opus 4.6 places among all models tracked on Model Beat.
Abstract visual reasoning
Harder abstract reasoning
Graduate-level scientific reasoning
Frontier of human expert knowledge
Common-sense trick questions
Factual accuracy & hallucination
Novel ML problem-solving
Code performance optimization
Real GitHub issue resolution
Human-rated web development
Olympiad-qualifier math
Research-level math problems
Hardest research math
Multi-step agentic tasks
Autonomous task length
Command-line agentic tasks
Model & benchmark data from Epoch AI (CC BY); pricing, specs & descriptions from OpenRouter.