Benchmark profile

Artificial Analysis Omniscience Accuracy (AA-Omniscience Accuracy)

A display-only Artificial Analysis knowledge metric for the proportion of correctly answered questions.

Data verified July 23, 2026

Benchmark score on AA-Omniscience Accuracy — July 23, 2026

BenchLM mirrors the published score view for AA-Omniscience Accuracy. Claude Fable 5 leads the public snapshot at 61.4%, followed by GPT-5.6 Sol (58.5%) and GPT-5.5 (56.9%). BenchLM does not use these results to rank models overall.

1. Claude Fable 5 · Anthropic · Closed · claude-fable-5 · 61.4% · Overall 83.68 · Context 1M+

2. GPT-5.6 Sol · OpenAI · Closed · gpt-5-6-sol · 58.5% · Overall 81.96 · Context 1M

3. GPT-5.5 · OpenAI · Closed · gpt-5-5 · 56.9% · Overall 73.51 · Context 1M

153 models · Knowledge · Current · Display only · Updated July 23, 2026

Benchmark score table (153 models)

| Rank | Model | Developer | Access | Score |
|---|---|---|---|---|
| 1 | Claude Fable 5 | Anthropic | Closed | 61.4% |
| 2 | GPT-5.6 Sol | OpenAI | Closed | 58.5% |
| 3 | GPT-5.5 | OpenAI | Closed | 56.9% |
| 4 | Gemini 3 Pro | Google | Closed | 55.9% |
| 5 | Gemini 3.1 Pro | Google | Closed | 55.3% |
| 6 | Grok 4.5 | xAI | Closed | 52.1% |
| 7 | Gemini 3.5 Flash | Google | Closed | 51.9% |
| 8 | GPT-5.3 Codex | OpenAI | Closed | 51.8% |
| 9 | GPT-5.3-Codex-Spark | OpenAI | Closed | 51.8% |
| 10 | Gemini 3.6 Flash | Google | Closed | 50.2% |
| 11 | GPT-5.4 | OpenAI | Closed | 50.0% |
| 12 | Claude Opus 4.8 | Anthropic | Closed | 46.6% |
| 13 | Claude Opus 4.6 (Adaptive) | Anthropic | Closed | 46.4% |
| 14 | Kimi K3 | Moonshot AI | Closed | 46.0% |
| 15 | GPT-5.6 Terra | OpenAI | Closed | 45.9% |
| 16 | Claude Opus 4.7 (Adaptive) | Anthropic | Closed | 45.8% |
| 17 | Claude Opus 4.5 Thinking | Anthropic | Closed | 45.7% |
| 18 | Gemini 3 Flash | Google | Closed | 45.5% |
| 19 | Claude Opus 4.6 | Anthropic | Closed | 45.2% |
| 20 | Muse Spark | Meta | Closed | 44.6% |
| 21 | GPT-5.2 | OpenAI | Closed | 43.8% |
| 22 | Claude Opus 4.7 | Anthropic | Closed | 43.5% |
| 23 | DeepSeek V4 Pro (Max) | DeepSeek | Open weight | 43.3% |
| 24 | DeepSeek V4 Pro (High) | DeepSeek | Open weight | 41.8% |
| 25 | GPT-5.6 Luna | OpenAI | Closed | 41.5% |
| 26 | Grok 4 | xAI | Closed | 41.4% |
| 27 | Claude Opus 4.5 | Anthropic | Closed | 40.7% |
| 28 | GPT-5 (high) | OpenAI | Closed | 40.7% |
| 29 | GPT-5.2-Codex | OpenAI | Closed | 40.7% |
| 30 | Muse Spark 1.1 | Meta | Closed | 40.6% |
| 31 | Inkling | Thinking Machines Lab | Open weight | 40.0% |
| 32 | GPT-5.1-Codex-Max | OpenAI | Closed | 39.2% |
| 33 | GPT-5.1-Codex | OpenAI | Closed | 39.2% |
| 34 | Gemini 2.5 Pro | Google | Closed | 39.0% |
| 35 | GPT-5 (medium) | OpenAI | Closed | 38.9% |
| 36 | Kimi K2.7 Code | Moonshot AI | Open weight | 38.6% |
| 37 | o3 | OpenAI | Closed | 38.4% |
| 38 | Claude Sonnet 5 | Anthropic | Closed | 38.3% |
| 39 | Claude Sonnet 4.6 | Anthropic | Closed | 38.0% |
| 40 | Qwen 3.6 Max (preview) | Alibaba | Closed | 37.7% |
| 41 | GPT-5.1 | OpenAI | Closed | 37.6% |
| 42 | GPT-5.4 mini | OpenAI | Closed | 37.5% |
| 43 | DeepSeek V4 Flash (Max) | DeepSeek | Open weight | 37.2% |
| 44 | Gemini 3.1 Flash-Lite | Google | Closed | 36.4% |
| 45 | DeepSeek V4 Flash (High) | DeepSeek | Open weight | 35.5% |
| 46 | o1 | OpenAI | Closed | 34.7% |
| 47 | Grok 4.3 | xAI | Closed | 34.6% |
| 48 | Kimi K2.5 | Moonshot AI | Open weight | 34.3% |
| 49 | Kimi K2.5 (Reasoning) | Moonshot AI | Closed | 34.3% |
| 50 | Kimi K2.6 | Moonshot AI | Open weight | 32.8% |
| 51 | Hy3 Preview | Tencent | Open weight | 31.5% |
| 52 | Hy3 | Tencent | Open weight | 31.5% |
| 53 | Qwen3.5 397B | Alibaba | Open weight | 31.4% |
| 54 | Qwen3.5 397B (Reasoning) | Alibaba | Open weight | 31.4% |
| 55 | DeepSeek-R1 | DeepSeek | Open weight | 31.0% |
| 56 | Gemini 3.5 Flash-Lite | Google | Closed | 30.3% |
| 57 | Qwen3.7 Max | Alibaba | Closed | 30.1% |
| 58 | GLM-4.7 | Z.AI | Open weight | 29.3% |
| 59 | GLM-5V-Turbo | Z.AI | Closed | 29.1% |
| 60 | GLM-5-Turbo | Z.AI | Closed | 29.0% |
| 61 | DeepSeek V3.1 (Reasoning) | DeepSeek | Open weight | 28.8% |
| 62 | GLM-5 | Z.AI | Open weight | 26.9% |
| 63 | Kimi K2 | Moonshot AI | Closed | 26.8% |
| 64 | MiMo-V2-Pro | Xiaomi | Closed | 26.8% |
| 65 | Gemini 2.5 Flash | Google | Closed | 26.5% |
| 66 | Qwen3.6 Plus | Alibaba | Closed | 26.2% |
| 67 | MiniMax M2.5 | MiniMax | Closed | 26.2% |
| 68 | MiniMax M2.7 | MiniMax | Open weight | 26.1% |
| 69 | GPT-5.4 nano | OpenAI | Closed | 25.4% |
| 70 | Step 3.7 Flash | StepFun | Open weight | 25.4% |
| 71 | DeepSeek V3 | DeepSeek | Open weight | 25.4% |
| 72 | Grok 4.1 Fast (Reasoning) | xAI | Closed | 25.3% |
| 73 | GLM-5.2 | Z.AI | Open weight | 25.1% |
| 74 | Mistral Medium 3.5 128B | Mistral | Open weight | 25.1% |
| 75 | Step 3.5 Flash | StepFun | Open weight | 25.0% |
| 76 | Qwen3.5-122B-A10B | Alibaba | Open weight | 24.7% |
| 77 | Qwen3 Max | Alibaba | Closed | 24.4% |
| 78 | Llama 4 Maverick | Meta | Open weight | 24.3% |
| 79 | GLM-5.1 | Z.AI | Open weight | 24.2% |
| 80 | DeepSeek V3.2 | DeepSeek | Open weight | 24.2% |
| 81 | GPT-4.1 | OpenAI | Closed | 24.2% |
| 82 | Mistral Large 3 | Mistral | Closed | 24.1% |
| 83 | GPT-5 mini | OpenAI | Closed | 24.0% |
| 84 | Nemotron 3 Super 120B A12B | NVIDIA | Open weight | 24.0% |
| 85 | Grok Code Fast 1 | xAI | Closed | 23.8% |
| 86 | DeepSeek V3.1 | DeepSeek | Open weight | 23.1% |
| 87 | Trinity-Large-Preview | Arcee AI | Open weight | 22.8% |
| 88 | Trinity-Large-Thinking | Arcee AI | Open weight | 22.8% |
| 89 | MiMo-V2.5-Pro | Xiaomi | Closed | 22.6% |
| 90 | Grok 4 Fast (Reasoning) | xAI | Closed | 22.6% |
| 91 | Claude 4 Sonnet | Anthropic | Closed | 22.4% |
| 92 | Llama 3.1 405B | Meta | Open weight | 22.3% |
| 93 | Qwen3.7 Plus | Alibaba | Closed | 22.2% |
| 94 | Mistral Small 4 | Mistral | Open weight | 22.1% |
| 95 | Mistral Small 4 (Reasoning) | Mistral | Open weight | 22.1% |
| 96 | Nemotron 3 Ultra | NVIDIA | Open weight | 21.6% |
| 97 | GPT-OSS 120B | OpenAI | Open weight | 21.5% |
| 98 | MiniMax M1 80k | MiniMax | Closed | 21.1% |
| 99 | Qwen3.5-27B | Alibaba | Open weight | 21.0% |
| 100 | GLM-4.6 | Z.AI | Open weight | 20.8% |
| 101 | Qwen3.5-35B-A3B | Alibaba | Open weight | 20.5% |
| 102 | Mercury 2 | Inception | Closed | 20.5% |
| 103 | Mistral Large 2 | Mistral | Closed | 20.1% |
| 104 | Gemma 4 31B | Google | Open weight | 19.9% |
| 105 | Nemotron Ultra 253B | NVIDIA | Open weight | 19.9% |
| 106 | GPT-4o | OpenAI | Closed | 19.7% |
| 107 | Qwen3.6-27B | Alibaba | Open weight | 19.2% |
| 108 | Qwen3.6-35B-A3B | Alibaba | Open weight | 18.9% |
| 109 | MiMo-V2-Omni | Xiaomi | Closed | 18.7% |
| 110 | Mistral Medium 3 | Mistral | Closed | 18.3% |
| 111 | GPT-5 nano | OpenAI | Closed | 18.3% |
| 112 | Gemma 4 26B A4B | Google | Open weight | 18.2% |
| 113 | Sarvam 105B | Sarvam | Open weight | 17.6% |
| 114 | GPT-4.1 mini | OpenAI | Closed | 17.5% |
| 115 | Claude 3 Haiku | Anthropic | Closed | 17.2% |
| 116 | Nemotron 3 Nano 30B | NVIDIA | Open weight | 17.1% |
| 117 | Grok 4.1 Fast | xAI | Closed | 17.0% |
| 118 | Nova Pro | Amazon | Closed | 17.0% |
| 119 | K-Exaone | LG AI Research | Closed | 16.5% |
| 120 | Gemma 4 12B | Google | Open weight | 16.0% |
| 121 | GLM-4.7-Flash | Z.AI | Open weight | 15.9% |
| 122 | Solar Pro 2 | Upstage | Closed | 15.6% |
| 123 | GLM-4.5-Air | Z.AI | Closed | 15.5% |
| 124 | GPT-OSS 20B | OpenAI | Open weight | 15.5% |
| 125 | Ling 2.6 Flash | InclusionAI | Open weight | 15.4% |
| 126 | MiMo-V2-Flash | Xiaomi | Open weight | 15.2% |
| 127 | MiniMax M3 | MiniMax | Open weight | 15.0% |
| 128 | Nemotron 3 Nano Omni 30B A3B | NVIDIA | Open weight | 14.8% |
| 129 | Llama 4 Scout | Meta | Open weight | 14.6% |
| 130 | GPT-4.1 nano | OpenAI | Closed | 13.3% |
| 131 | Phi-4 | Microsoft | Open weight | 13.2% |
| 132 | Sarvam 30B | Sarvam | Open weight | 12.7% |
| 133 | Gemma 3 27B | Google | Open weight | 12.5% |
| 134 | Ministral 3 14B (Reasoning) | Mistral | Open weight | 12.3% |
| 135 | Ministral 3 14B | Mistral | Open weight | 12.3% |
| 136 | Ministral 3 8B (Reasoning) | Mistral | Open weight | 11.2% |
| 137 | Ministral 3 8B | Mistral | Open weight | 11.2% |
| 138 | Exaone 4.0 32B | LG AI Research | Open weight | 10.4% |
| 139 | LFM2.5-8B-A1B | LiquidAI | Open weight | 9.4% |
| 140 | Command A+ | Cohere | Open weight | 8.9% |
| 141 | Gemma 4 E4B | Google | Open weight | 8.6% |
| 142 | Ministral 3 3B (Reasoning) | Mistral | Open weight | 7.6% |
| 143 | Ministral 3 3B | Mistral | Open weight | 7.6% |
| 144 | Gemma 4 E2B | Google | Open weight | 6.7% |
| 145 | LFM2.5-1.2B-Thinking | LiquidAI | Closed | 6.6% |
| 146 | LFM2-24B-A2B | LiquidAI | Closed | 6.4% |
| 147 | Granite-4.0-1B | IBM | Open weight | 6.1% |
| 148 | LFM2.5-1.2B-Instruct | LiquidAI | Closed | 6.0% |
| 149 | Granite-4.0-H-1B | IBM | Open weight | 5.3% |
| 150 | LFM2.5-VL-1.6B-Extract | LiquidAI | Open weight | 5.2% |
| 151 | Exaone 4.0 1.2B | LG AI Research | Open weight | 4.7% |
| 152 | Granite-4.0-H-350M | IBM | Open weight | 3.7% |
| 153 | Granite-4.0-350M | IBM | Open weight | 3.2% |

The published AA-Omniscience Accuracy snapshot places Claude Fable 5 first at 61.4%. Third-place GPT-5.5 is 4.5 points behind at 56.9%. The top 10 span 11.2 points (61.4% to 50.2%), so the benchmark still separates the leading published systems.
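The gap and range figures quoted above can be reproduced from the table with a few lines of Python. This is only a sketch: the list holds the ten highest scores copied by hand from the score table, and the variable names are illustrative rather than anything BenchLM publishes.

```python
# Ten highest AA-Omniscience Accuracy scores (percent), copied from the
# score table in this page, in rank order.
top10 = [61.4, 58.5, 56.9, 55.9, 55.3, 52.1, 51.9, 51.8, 51.8, 50.2]

# Lead of the first-ranked model over the third-ranked model, in points.
lead_over_third = round(top10[0] - top10[2], 1)

# Spread between the best and worst score inside the top 10, in points.
top10_range = round(max(top10) - min(top10), 1)

print(f"Lead over third place: {lead_over_third} points")
print(f"Top-10 range: {top10_range} points")
```

Rounding to one decimal keeps the results aligned with the one-decimal precision of the published scores.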

153 models have been evaluated on AA-Omniscience Accuracy. The benchmark falls in the Knowledge category, which carries a 12% weight in BenchLM.ai's overall scoring system. AA-Omniscience Accuracy is currently displayed for reference but excluded from the scoring formula, so it does not directly affect overall rankings.
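To illustrate what "display only" means in practice, here is a minimal sketch of a weighted overall score that skips display-only rows. Only the 12% Knowledge weight comes from this page. The per-category averaging, the single "Other" category with its 88% weight, the benchmark names, and the example scores are all hypothetical assumptions, not BenchLM's actual formula.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkResult:
    name: str
    category: str
    score: float        # percentage, 0-100
    display_only: bool  # True = shown for reference, not scored


# Only the Knowledge weight (12%) is stated on the page; "Other" is a
# placeholder that lumps every remaining category together.
CATEGORY_WEIGHTS = {"Knowledge": 0.12, "Other": 0.88}


def overall_score(results: list[BenchmarkResult]) -> float:
    """Weighted mean of per-category averages, ignoring display-only rows.

    Assumed aggregation for illustration only.
    """
    by_category: dict[str, list[float]] = {}
    for r in results:
        if r.display_only:
            continue  # display-only rows never enter the formula
        by_category.setdefault(r.category, []).append(r.score)

    total_weight = sum(CATEGORY_WEIGHTS[c] for c in by_category)
    if total_weight == 0:
        return 0.0
    weighted = sum(
        CATEGORY_WEIGHTS[c] * sum(scores) / len(scores)
        for c, scores in by_category.items()
    )
    return weighted / total_weight


# Hypothetical scored benchmarks for one model.
scored = [
    BenchmarkResult("hypothetical-knowledge-bench", "Knowledge", 80.0, False),
    BenchmarkResult("hypothetical-other-bench", "Other", 70.0, False),
]
# The same model with its AA-Omniscience Accuracy row attached as display only.
with_display = scored + [
    BenchmarkResult("AA-Omniscience Accuracy", "Knowledge", 61.4, True)
]

print(overall_score(scored), overall_score(with_display))
```

Under these assumptions both calls return the same value, which is the sense in which a display-only benchmark does not directly move the overall ranking.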

About AA-Omniscience Accuracy

Year: 2026

Tasks: Knowledge questions

Format: Accuracy

Difficulty: Broad knowledge

BenchLM stores AA-Omniscience Accuracy as a display-only row when a model page publishes the exact Artificial Analysis benchmark card value.

Artificial Analysis model benchmarks

BenchLM freshness & provenance

Version: AA-Omniscience Accuracy 2026

Refresh cadence: Quarterly

Staleness state: Current

Question availability: Public benchmark set

Status: Current · Display only

BenchLM uses freshness metadata to decide whether a benchmark should still be treated as a strong differentiator, a benchmark to watch, or a display-only reference. For the full scoring policy, see the BenchLM methodology page.

FAQ

What does AA-Omniscience Accuracy measure?

A display-only Artificial Analysis knowledge metric for the proportion of correctly answered questions.

Which model scores highest on AA-Omniscience Accuracy?

Claude Fable 5 by Anthropic currently leads with a score of 61.4% on AA-Omniscience Accuracy.

How many models are evaluated on AA-Omniscience Accuracy?

BenchLM lists 153 AI models that have been evaluated on AA-Omniscience Accuracy.

Compare Top Models on AA-Omniscience Accuracy

Claude Fable 5 vs GPT-5.6 Sol · GPT-5.6 Sol vs GPT-5.5 · GPT-5.5 vs Gemini 3 Pro · Gemini 3 Pro vs Gemini 3.1 Pro

Last updated: July 23, 2026 · BenchLM version AA-Omniscience Accuracy 2026
