LLM Arena

Which AI models produce the best solutions?

Most Voted: Confidence-adjusted win rate. Models need 10+ comparisons to qualify for ranking.

#    Model                        Win%
1    claude-opus-4-6              80.0%
2    claude-opus-4-7              90.0%
3    gpt-5.1-codex                61.5%
4    grok-4-fast-non-reasoning    76.9%
5    claude-sonnet-4-6            58.3%
6    gemma4:31b                   45.1%
7    grok-4                       35.7%
8    ollama/qwen3.5:9b            37.0%
9    qwen3.5:35b                  29.6%
10   gpt-5.4-mini                 26.1%
11   qwen3.6:35b-a3b              25.0%
12   gemini-3-flash-preview       11.3%
13   qwen3.5                      18.2%
14   gemini-3-flash               9.1%
15   qwen3.6                      100.0%
16   claude-haiku-4-5             0.0%
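
The ranking is described as a confidence-adjusted win rate, which is why the order does not follow the raw Win% column. The page does not say which adjustment it uses. As a rough sketch only, one common choice is the lower bound of the Wilson score interval, which pulls down win rates that rest on few comparisons. The function names, the z value of 1.96, and the win counts below are assumptions for illustration, not the arena's actual method or data; only the 10-comparison threshold comes from the page.

```python
import math

MIN_COMPARISONS = 10  # qualification threshold stated on the page


def wilson_lower_bound(wins, comparisons, z=1.96):
    """Lower bound of the Wilson score interval for a win rate.

    Assumption: z=1.96 (about 95% confidence). The arena may use a
    different adjustment entirely.
    """
    if comparisons == 0:
        return 0.0
    p = wins / comparisons
    z2 = z * z
    denom = 1 + z2 / comparisons
    centre = p + z2 / (2 * comparisons)
    margin = z * math.sqrt(p * (1 - p) / comparisons + z2 / (4 * comparisons ** 2))
    return (centre - margin) / denom


def rank_models(records):
    """Rank models by adjusted win rate, best first.

    records maps a model name to a (wins, comparisons) pair.
    Models with fewer than MIN_COMPARISONS comparisons are left out.
    """
    scored = [
        (name, wilson_lower_bound(wins, n))
        for name, (wins, n) in records.items()
        if n >= MIN_COMPARISONS
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical counts, not taken from the leaderboard.
    example = {"model-a": (18, 20), "model-b": (9, 9), "model-c": (12, 20)}
    for name, score in rank_models(example):
        print(f"{name}: {score:.3f}")
```

Under this sketch, a model with 18 wins in 20 comparisons scores higher than one with 3 wins in 3, even though the second has a 100% raw win rate, because the smaller sample carries more uncertainty.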