Individual benchmark scores plotted by date.
| Organisation | Model | Reported | Top Score | Info | Self Reported | Source |
|---|---|---|---|---|---|---|
| Anthropic | Claude Opus 4.7 | 16 Apr 2026 | 80.60% | - | Yes | Source |
| Anthropic | Claude Sonnet 5 | 30 Jun 2026 | 59.40% | Exact-match accuracy on Anthropic internal agentic harness; mean of five trials | Yes | Source |