AI2 Reasoning Challenge (ARC) - Benchmark Leaderboard & Model Performance | AI Stats
AI2 Reasoning Challenge (ARC)
Overview
Type: numerical
General
Recorded Results: 1
Average Score: 0.96
Score Range: 0.96 - 0.96
Leading Model: GPT 4 32K 0613 (0.96)
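The overview figures above can be derived directly from the recorded results. The following is a minimal sketch, assuming each result is a (model, score) pair; the `summarise` helper and the record layout are hypothetical illustrations, not the site's actual implementation.

```python
from statistics import mean


def summarise(results):
    """Reduce (model, score) pairs to the overview fields (hypothetical helper)."""
    if not results:
        raise ValueError("no recorded results")
    scores = [score for _, score in results]
    # The leading model is the entry with the highest score.
    leader, top = max(results, key=lambda pair: pair[1])
    return {
        "recorded_results": len(results),
        "average_score": round(mean(scores), 2),
        "score_range": (min(scores), max(scores)),
        "leading_model": leader,
        "leading_score": top,
    }


# The single result listed on this page.
summary = summarise([("GPT 4 32K 0613", 0.96)])
print(summary)
```

With only one recorded result, the average, both ends of the range, and the leading score all coincide.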
Scores Over Time
Individual benchmark scores plotted by date.
Models Using This Benchmark
Organisation: OpenAI
Model: GPT 4 32K 0613
Reported: 13 Jun 2023
Top Score: 0.96
Info: inferred high-confidence family alias from gpt-4-0613 (score=0.4899; benches=12)
Self Reported: Yes