World / Knowledge
Overall ranking

A summary of our daily evaluations, showing the aggregated performance of each model we test.

Provider    Model                   Correct answers   Average response time (ms)
OpenAI      GPT-4.1                 53%               816
Google      Gemini 2.5 Pro          49%               8,428
OpenAI      GPT-5                   48%               22,007
OpenAI      GPT-4o-mini             47%               703
Google      Gemini 2.0 Flash        47%               634
Google      Gemini 2.0 Flash-Lite   46%               411
DeepSeek    DeepSeek-R1             46%               1,813
Anthropic   Claude 4.0 Sonnet       42%               2,827
Meta        Llama 3.3 70b           41%               342
Anthropic   Claude 3.7 Sonnet       38%               1,691
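
The page does not say how the daily results are combined into the overall figures. As a rough sketch of one plausible approach, the Python below pools all questions across days, then computes the share of correct answers and the mean response time per model. The record layout, the model names, the numbers, and the `aggregate` function are all hypothetical and are not taken from the evaluation itself.

```python
from collections import defaultdict

# Hypothetical daily records, made up for illustration only:
# (model, questions_asked, correct_answers, total_response_time_ms)
daily_runs = [
    ("model-a", 10, 6, 8000),
    ("model-a", 10, 4, 9000),
    ("model-b", 10, 5, 4000),
    ("model-b", 10, 3, 5000),
]


def aggregate(runs):
    """Pool daily runs per model (an assumed method, not the site's own).

    Returns a list of (model, percent_correct, mean_response_ms) tuples,
    sorted by percent correct (descending), then by response time (ascending).
    """
    # Running totals per model: [questions asked, correct answers, total ms]
    totals = defaultdict(lambda: [0, 0, 0])
    for model, asked, correct, total_ms in runs:
        entry = totals[model]
        entry[0] += asked
        entry[1] += correct
        entry[2] += total_ms

    rows = []
    for model, (asked, correct, total_ms) in totals.items():
        percent_correct = round(100 * correct / asked)
        mean_ms = round(total_ms / asked)
        rows.append((model, percent_correct, mean_ms))

    rows.sort(key=lambda row: (-row[1], row[2]))
    return rows


for model, percent_correct, mean_ms in aggregate(daily_runs):
    print(f"{model}: {percent_correct}% correct, {mean_ms} ms average")
```

Pooling by question count weights every question equally. Averaging the daily percentages instead would weight every day equally, which gives a different result whenever the number of questions varies from day to day.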
Evaluation over time