
Artificial Analysis Omniscience Hallucination Rate (AA-Omniscience Hallucination Rate)

A display-only Artificial Analysis factuality metric for the rate of incorrect answers among non-correct responses.

Benchmark score on AA-Omniscience Hallucination Rate — June 2, 2026

BenchLM mirrors the published score view for AA-Omniscience Hallucination Rate. Command A+ leads the public snapshot at 14.1%, followed by Qwen3.7 Max (22.9%) and MiMo-V2.5-Pro (24.5%). BenchLM does not use these results to rank models overall.

115 models · Knowledge · Current · Display only · Updated June 2, 2026

The published AA-Omniscience Hallucination Rate snapshot has a clear leader: Command A+ sits at 14.1%, 8.8 points ahead of the second row and 10.4 points ahead of the third. The top-10 spread is 19.9 points, so the benchmark separates even the strongest models.
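The gaps quoted above can be reproduced from the top ten rows of the score table. The following is a minimal sketch; the helper name `spread` is illustrative and not part of any BenchLM tooling.

```python
# Top-10 hallucination rates (percent), copied from the published score table.
top10 = [14.1, 22.9, 24.5, 25.0, 29.4, 29.9, 31.3, 32.0, 32.9, 34.0]


def spread(scores, first, last):
    """Gap in percentage points between two 1-indexed rows, rounded to one decimal."""
    return round(scores[last - 1] - scores[first - 1], 1)


gap_to_third = spread(top10, 1, 3)    # leader vs. third row
top10_spread = spread(top10, 1, 10)   # leader vs. tenth row
print(gap_to_third, top10_spread)
```

Rounding to one decimal matches the precision of the published percentages and avoids floating-point noise in the differences.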

115 models have been evaluated on AA-Omniscience Hallucination Rate. The benchmark falls in the Knowledge category, which carries a 12% weight in BenchLM.ai's overall scoring system. AA-Omniscience Hallucination Rate itself is displayed for reference only and excluded from the scoring formula, so it does not directly affect overall rankings.

About AA-Omniscience Hallucination Rate

Year

2026

Tasks

Knowledge questions

Format

Hallucination rate

Difficulty

Factuality

BenchLM marks this row lower-is-better because a lower hallucination rate is preferable, even though the OpenRouter card displays the raw percentage.
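As a rough illustration of what lower-is-better means for ordering, the sketch below ranks the three leaders named in the snapshot by ascending rate. The function name `rank_lower_is_better` is hypothetical, and this is not BenchLM's actual ranking code.

```python
# Published rates (percent) for the three leaders named in the snapshot,
# deliberately listed out of order.
scores = {"Qwen3.7 Max": 22.9, "MiMo-V2.5-Pro": 24.5, "Command A+": 14.1}


def rank_lower_is_better(scores):
    """Return (rank, model, rate) tuples with the lowest rate ranked first."""
    ordered = sorted(scores.items(), key=lambda item: item[1])
    return [(rank, name, rate) for rank, (name, rate) in enumerate(ordered, start=1)]


ranking = rank_lower_is_better(scores)
for rank, name, rate in ranking:
    print(rank, name, f"{rate}%")
```

For a higher-is-better benchmark the same sort would take `reverse=True`; the displayed percentage is identical either way, only the ordering flips.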

BenchLM freshness & provenance

Version

AA-Omniscience Hallucination Rate 2026

Refresh cadence

Quarterly

Staleness state

Current

Question availability

Public benchmark set

Current · Display only

BenchLM uses freshness metadata to decide whether a benchmark should still be treated as a strong differentiator, a benchmark to watch, or a display-only reference. For the full scoring policy, see the BenchLM methodology page.

Benchmark score table (115 models)

Rank  Hallucination rate
1     14.1%
2     22.9%
3     24.5%
4     25.0%
5     29.4%
6     29.9%
7     31.3%
8     32.0%
9     32.9%
10    34.0%
11    34.4%
12    35.9%
13    36.2%
14    37.9%
15    39.3%
16    40.8%
17    44.2%
18    44.4%
19    48.3%
20    49.7%
21    49.9%
22    51.0%
23    51.3%
24    51.9%
25    59.8%
26    60.7%
27    60.9%
28    61.3%
29    62.2%
30    64.2%
31    64.6%
32    64.6%
33    65.9%
34    66.0%
35    66.1%
36    66.8%
37    66.8%
38    67.8%
39    67.9%
40    69.3%
42    72.8%
43    73.2%
44    73.6%
45    74.2%
46    74.4%
47    74.4%
48    75.1%
49    75.4%
50    76.0%
51    77.8%
52    77.9%
53    78.2%
54    78.3%
55    78.5%
56    79.6%
57    79.7%
58    79.7%
59    79.8%
60    80.1%
61    80.3%
62    80.4%
63    80.5%
64    80.9%
65    81.0%
66    81.6%
67    81.6%
68    81.7%
69    81.8%
70    82.0%
71    82.0%
72    82.1%
74    83.4%
75    83.5%
76    83.7%
77    84.0%
78    84.0%
79    84.4%
80    85.5%
81    85.5%
82    86.6%
83    86.6%
84    86.9%
85    86.9%
86    87.1%
87    87.3%
88    87.4%
89    88.6%
90    88.6%
91    89.1%
92    89.1%
93    89.4%
94    89.4%
95    89.5%
96    89.7%
97    89.8%
98    90.2%
99    90.3%
100   90.9%
101   90.9%
102   91.2%
103   91.5%
104   91.5%
105   92.3%
106   93.3%
107   93.5%
108   93.5%
109   93.5%
110   94.0%
111   94.1%
112   94.4%
113   95.8%
114   95.8%
115   97.0%

FAQ

What does AA-Omniscience Hallucination Rate measure?

It is a display-only Artificial Analysis factuality metric: the rate of incorrect answers among a model's non-correct responses.
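Read literally, that definition is a simple ratio. The sketch below assumes "non-correct responses" means every response that was not graded correct (for example, wrong answers plus abstentions); the exact grading categories are not stated on this page, and the counts used are hypothetical.

```python
def hallucination_rate(incorrect, non_correct):
    """Percentage of non-correct responses that were incorrect answers.

    Assumption: non_correct counts every response not graded correct,
    so incorrect can never exceed it.
    """
    if non_correct <= 0:
        raise ValueError("non_correct must be positive")
    if not 0 <= incorrect <= non_correct:
        raise ValueError("incorrect must be between 0 and non_correct")
    return 100.0 * incorrect / non_correct


# Hypothetical counts: 40 responses were not correct, 10 of them were wrong answers.
rate = hallucination_rate(incorrect=10, non_correct=40)
print(f"{rate:.1f}%")
```

Under this reading, a model that declines to answer when unsure lowers its rate, which is why a lower value is preferable.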

Which model leads on AA-Omniscience Hallucination Rate?

Command A+ by Cohere currently leads with the lowest hallucination rate on AA-Omniscience Hallucination Rate, 14.1% (lower is better).

How many models are evaluated on AA-Omniscience Hallucination Rate?

115 AI models have been evaluated on AA-Omniscience Hallucination Rate on BenchLM.

Last updated: June 2, 2026 · BenchLM version AA-Omniscience Hallucination Rate 2026
