Individual benchmark scores plotted by date.

| Model | Reported | Top Score | Info | Self Reported | Source |
|---|---|---|---|---|---|
| Claude Mythos Preview | 07 Apr 2026 | 59% | Multimodal (internal implementation) | Yes | Source |
| Claude Opus 4.7 | 16 Apr 2026 | 34.50% | - | Yes | Source |
| Claude Sonnet 5 | 30 Jun 2026 | 28.10% | SWE-bench Multimodal; internal harness; average over five trials | Yes | Source |