Powered by Code Arena

WebDev Leaderboard

Compare the performance of AI models on web development tasks built in the Code Arena.

Last Updated: Dec 29, 2025

Total Votes: 82,171

Total Models: 33

| Rank | Rank Spread | Score | CI | Votes | Organization | License |
|---|---|---|---|---|---|---|
| 1 | 1–1 | 1512 | +11/-11 | 4,564 | Anthropic | Proprietary |
| 2 | 2–5 | 1480 | +17/-17 | 1,647 | OpenAI | Proprietary |
| 3 | 2–4 | 1479 | +11/-11 | 4,468 | Anthropic | Proprietary |
| 4 | 2–5 | 1471 | +9/-9 | 9,824 | Google | Proprietary |
| 5 | 3–6 | 1454 | +12/-12 | 3,053 | Google | Proprietary |
| 6 | 5–6 | 1441 | +13/-13 | 2,270 | Z.ai | MIT |
| 7 | 7–13 | 1395 | +12/-12 | 3,949 | OpenAI | Proprietary |
| 8 | 7–13 | 1394 | +15/-15 | 1,641 | OpenAI | Proprietary |
| 9 | 7–13 | 1391 | +9/-9 | 8,616 | Anthropic | Proprietary |
| 10 | 7–13 | 1387 | +10/-10 | 5,626 | OpenAI | Proprietary |
| 11 | 7–13 | 1387 | +9/-9 | 9,698 | Anthropic | Proprietary |
| 12 | 7–13 | 1386 | +9/-9 | 8,210 | Anthropic | Proprietary |
| 13 | 7–15 | 1377 | +14/-14 | 1,885 | Google | Proprietary |
| 14 | 13–16 | 1366 | +9/-9 | 7,921 | Z.ai | MIT |
| 15 | 13–17 | 1361 | +15/-15 | 1,753 | DeepSeek AI | MIT |
| 16 | 14–18 | 1353 | +9/-9 | 7,544 | OpenAI | Proprietary |
| 17 | 15–19 | 1342 | +15/-15 | 1,555 | Xiaomi | MIT |
| 18 | 16–19 | 1337 | +9/-9 | 7,336 | Moonshot | Modified MIT |
| 19 | 17–20 | 1331 | +10/-10 | 5,719 | OpenAI | Proprietary |
| 20 | 19–20 | 1313 | +9/-9 | 8,023 | MiniMax | Apache 2.0 |
| 21 | 21–24 | 1290 | +10/-10 | 5,162 | DeepSeek AI | MIT |
| 22 | 21–24 | 1286 | +9/-9 | 8,276 | Anthropic | Proprietary |
| 23 | 21–25 | 1286 | +13/-13 | 2,155 | DeepSeek AI | MIT |
| 24 | 21–24 | 1285 | +9/-9 | 8,199 | Alibaba | Apache 2.0 |
| 25 | 24–26 | 1260 | +15/-15 | 1,946 | KwaiKAT | Proprietary |
| 26 | 25–28 | 1247 | +17/-17 | 1,566 | OpenAI | Proprietary |
| 27 | 26–30 | 1223 | +13/-13 | 3,721 | xAI | Proprietary |
| 28 | 26–30 | 1222 | +20/-20 | 1,027 | Mistral | Apache 2.0 |
| 29 | 27–30 | 1209 | +13/-13 | 3,505 | Google | Proprietary |
| 30 | 27–30 | 1202 | +19/-19 | 1,262 | xAI | Proprietary |
| 31 | 31–32 | 1149 | +23/-23 | 945 | xAI | Proprietary |
| 32 | 31–33 | 1139 | +21/-21 | 1,014 | xAI | Proprietary |
| 33 | 32–33 | 1099 | +22/-22 | 1,033 | Mistral | Proprietary |

Leaderboard Plots

Average Win Rate Against All Other Models (Uniform Sampling and No Ties)

Fraction of Model A Wins for All Non-tied A vs. B Battles

Battle Count for Each Combination of Models (without Ties)

Confidence Intervals on Model Strength (via Bootstrapping)
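
The plot titles above mention two computations: the fraction of Model A wins over non-tied battles, and confidence intervals obtained by bootstrapping. The sketch below illustrates both on a small, made-up battle log. The record layout, the model names, and the helper names `win_fraction` and `bootstrap_ci` are assumptions for illustration, not the Code Arena's actual data format or code. It also bootstraps the pairwise win fraction as a simpler stand-in for the model strength score shown in the plot.

```python
import random

# Hypothetical battle log: (model_a, model_b, winner), winner in {"a", "b", "tie"}.
battles = (
    [("model-x", "model-y", "a")] * 3
    + [("model-y", "model-x", "b")] * 3
    + [("model-x", "model-y", "b")] * 2
    + [("model-x", "model-y", "tie")] * 2
)


def win_fraction(records, model_a, model_b):
    """Fraction of non-tied model_a vs. model_b battles won by model_a."""
    wins = total = 0
    for a, b, winner in records:
        if winner == "tie":
            continue  # ties are excluded, as in the plot title
        if (a, b) == (model_a, model_b):
            wins += winner == "a"
            total += 1
        elif (a, b) == (model_b, model_a):
            wins += winner == "b"
            total += 1
    return wins / total if total else float("nan")


def bootstrap_ci(records, model_a, model_b, rounds=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the win fraction (assumed method)."""
    rng = random.Random(seed)
    stats = []
    for _ in range(rounds):
        # Resample the battle log with replacement, keeping its size.
        sample = rng.choices(records, k=len(records))
        value = win_fraction(sample, model_a, model_b)
        if value == value:  # skip NaN from samples with no decisive battles
            stats.append(value)
    if not stats:
        return float("nan"), float("nan")
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lower, upper


point = win_fraction(battles, "model-x", "model-y")
lower, upper = bootstrap_ci(battles, "model-x", "model-y")
print(f"win fraction {point:.2f}, interval [{lower:.2f}, {upper:.2f}]")
```

With only ten battles the interval is wide. This matches the pattern in the table, where models with fewer votes show larger +/- ranges.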