Code Arena

Compare the performance of AI models on agentic coding tasks that involve multi-step reasoning and tool use.

Last Updated: Jan 23, 2026
Total Votes: 118,204
Total Models: 35

Rank  Rank Spread  Score          Votes    Organization  License
1     1◄─►1        1504 +10/-10   7,543    Anthropic     Proprietary
2     2◄─►5        1475 +16/-16   1,691    OpenAI        Proprietary
3     2◄─►5        1467 +9/-9     7,900    Anthropic     Proprietary
4     2◄─►6        1462 +8/-8     14,043   Google        Proprietary
5     2◄─►6        1454 +9/-9     8,389    Google        Proprietary
6     4◄─►6        1445 +10/-10   5,650    Z.ai          MIT
7     7◄─►10       1414 +9/-9     7,201    MiniMax       MIT
8     7◄─►10       1412 +10/-10   5,430    Google        Proprietary
9     7◄─►15       1399 +15/-15   1,632    OpenAI        Proprietary
10    7◄─►15       1397 +12/-12   3,929    OpenAI        Proprietary
11    9◄─►15       1392 +9/-9     6,594    OpenAI        Proprietary
12    9◄─►15       1392 +8/-8     9,124    Anthropic     Proprietary
13    9◄─►15       1390 +8/-8     11,001   Anthropic     Proprietary
14    9◄─►15       1386 +8/-8     12,662   Anthropic     Proprietary
15    9◄─►16       1377 +11/-11   3,552    DeepSeek      MIT
16    15◄─►19      1358 +8/-8     8,890    Z.ai          MIT
17    16◄─►19      1355 +8/-8     9,917    OpenAI        Proprietary
18    16◄─►20      1351 +10/-10   3,943    Xiaomi        MIT
19    16◄─►21      1344 +13/-13   2,500    OpenAI        Proprietary
20    18◄─►21      1334 +9/-9     6,661    OpenAI        Proprietary
21    19◄─►21      1333 +8/-8     9,556    Moonshot      Modified MIT
22    22◄─►23      1316 +8/-8     8,997    MiniMax       Apache 2.0
23    22◄─►26      1299 +10/-10   4,581    DeepSeek      MIT
24    23◄─►26      1298 +8/-8     10,767   Anthropic     Proprietary
25    23◄─►26      1289 +10/-10   5,133    DeepSeek      MIT
26    23◄─►26      1287 +8/-8     10,516   Alibaba       Apache 2.0
27    27◄─►29      1262 +15/-15   1,956    KwaiKAT       Proprietary
28    27◄─►30      1247 +17/-17   1,538    OpenAI        Proprietary
29    27◄─►30      1240 +11/-11   5,127    xAI           Proprietary
30    28◄─►32      1225 +20/-20   1,037    Mistral       Apache 2.0
31    30◄─►32      1209 +13/-13   3,454    Google        Proprietary
32    30◄─►32      1208 +19/-19   1,266    xAI           Proprietary
33    33◄─►34      1156 +22/-22   970      xAI           Proprietary
34    33◄─►35      1143 +21/-21   1,017    xAI           Proprietary
35    34◄─►35      1101 +22/-22   1,020    Mistral       Proprietary
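
To give a feel for what a gap between two scores might mean in head-to-head terms, here is a minimal sketch. It assumes the scores sit on an Elo-style logistic scale (base 10, scale 400); the page does not state this, so the resulting number is illustrative only. The function name `expected_win_probability` and the `scale` parameter are hypothetical.

```python
def expected_win_probability(score_a: float, score_b: float, scale: float = 400.0) -> float:
    """Chance that A beats B in a non-tied battle.

    ASSUMPTION: scores follow an Elo-style logistic curve with base 10 and
    the given scale. The leaderboard page does not confirm this.
    """
    return 1.0 / (1.0 + 10.0 ** ((score_b - score_a) / scale))


# Scores of the rank 1 and rank 2 entries in the leaderboard.
top_score, second_score = 1504, 1475
p_top_wins = expected_win_probability(top_score, second_score)
print(f"Assumed win probability for rank 1 over rank 2: {p_top_wins:.3f}")
```

Under this assumption a gap of a few dozen points corresponds to only a modest edge per battle, which fits with the overlapping rank spreads among neighboring entries.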

Leaderboard Plots

Battle Count for Each Combination of Models (without Ties)

Fraction of Model A Wins for All Non-tied A vs. B Battles

Confidence Intervals on Model Strength (via Bootstrapping)

Average Win Rate Against All Other Models (Uniform Sampling and No Ties)
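
The plot titles above mention win fractions and bootstrapped confidence intervals. As a rough illustration of the bootstrapping idea, the sketch below resamples a made-up list of non-tied A vs. B outcomes and reports a percentile interval on the fraction of A wins. It is simpler than an interval on model strength, which would require refitting a rating model on every resample, and it is not the leaderboard's actual procedure. The function name, parameters, and data are hypothetical.

```python
import random


def bootstrap_win_fraction(a_wins, rounds=2000, alpha=0.05, seed=0):
    """Return (point estimate, lower bound, upper bound) for the fraction of A wins.

    a_wins: list of booleans, True where model A won a non-tied battle.
    Uses a plain percentile bootstrap; this is an illustrative choice.
    """
    rng = random.Random(seed)
    n = len(a_wins)
    point = sum(a_wins) / n
    # Resample the battles with replacement and recompute the win fraction.
    resampled = sorted(sum(rng.choices(a_wins, k=n)) / n for _ in range(rounds))
    lower = resampled[int(rounds * alpha / 2)]
    upper = resampled[int(rounds * (1 - alpha / 2)) - 1]
    return point, lower, upper


# Hypothetical data: A won 60 of 100 non-tied battles against B.
outcomes = [True] * 60 + [False] * 40
point, lower, upper = bootstrap_win_fraction(outcomes)
print(f"A win fraction: {point:.2f} (interval {lower:.2f} to {upper:.2f})")
```

With fewer battles the interval widens, which is one plausible reason why entries with low vote counts in the leaderboard show larger +/- values.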