Individual benchmark scores plotted by date.
| Model | Reported | Top Score | Info | Self Reported | Source |
|---|---|---|---|---|---|
| Gemini 3.1 Pro Preview | 19 Feb 2026 | 59% | - | Yes | - |
| Seed 2.0 Lite | 14 Feb 2026 | 52.40% | Seed2 official benchmark table, Scicode | Yes | Source |
| Seed 2.0 Pro | 14 Feb 2026 | 52.10% | Seed2 official benchmark table, Scicode | Yes | Source |
| Kimi K2.5 | 27 Jan 2026 | 48.70% | - | Yes | Source |
| Grok 4.20 | 17 Feb 2026 | 45.60% | Artificial Analysis structured model metrics | No | Source |
| Kimi K2 Thinking | 06 Nov 2025 | 44.80% | inferred alias from kimi-k2-thinking-0905 | Yes | Source |
| GLM 5 Turbo | 15 Mar 2026 | 43.60% | Artificial Analysis structured model metrics | No | Source |
| Nemotron 3 Super | 11 Mar 2026 | 42.05% | - | Yes | Source |
| GLM 4.5 | 28 Jul 2025 | 41.70% | - | Yes | Source |
| MiniMax M2.1 | 23 Dec 2025 | 39% | - | Yes | Source |
| Mercury 2 | 24 Feb 2026 | 38% | - | Yes | Source |
| GLM 4.5 Air | 28 Jul 2025 | 37.30% | - | Yes | Source |
| MiniMax M2 | 27 Oct 2025 | 36% | - | Yes | - |
| MiniMax M2 Her | 24 Jan 2026 | 36% | inferred modality/version alias from minimax-m2 | Yes | - |
| Nemotron Nano 3 30B A3B | 15 Dec 2025 | 33.30% | - | Yes | Source |
| Solar Pro 3 (2026-01-26) | 26 Jan 2026 | 24.70% | Artificial Analysis structured model metrics | No | Source |