Benchmark profile

OmniDocBench 1.5

A document understanding benchmark used in frontier-model comparison tables to measure extraction and grounded reasoning quality on complex documents.

Data verified July 23, 2026

Benchmark score on OmniDocBench 1.5 — July 23, 2026

BenchLM mirrors the published score view for OmniDocBench 1.5. MiniMax M3 leads the public snapshot at 91.6%, followed by Qwen3.7 Plus (91.4%) and Qwen3.6-35B-A3B (89.9%). BenchLM does not use these results to rank models overall.

1. MiniMax M3 · MiniMax · minimax-m3 · Open · 91.6% · Overall 69.75 · Context 1M

2. Qwen3.7 Plus · Alibaba · qwen3-7-plus · Closed · 91.4% · Overall 67.22 · Context 1M

3. Qwen3.6-35B-A3B · Alibaba · qwen3-6-35b-a3b · Open · 89.9% · Overall 51.47 · Context 262K

3 models · Multimodal & Grounded · Current · Display only · Updated July 23, 2026

Benchmark score table (3 models)

Model · Developer · Weights · Score

MiniMax M3 · MiniMax · Open weight · 91.6%

Qwen3.7 Plus · Alibaba · Closed · 91.4%

Qwen3.6-35B-A3B · Alibaba · Open weight · 89.9%

The published OmniDocBench 1.5 snapshot places MiniMax M3 first at 91.6%. Qwen3.7 Plus is 0.2 points behind, and the third row, Qwen3.6-35B-A3B, is 1.7 points behind. Only three results are published, so the full range is 1.7 points and the results sit in a relatively narrow band.
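The gaps in this snapshot follow directly from the three published scores. A minimal Python sketch of that arithmetic is below; the variable names are illustrative, and only the scores come from the table on this page.

```python
# Published OmniDocBench 1.5 scores from this page (percent, higher is better).
scores = {
    "MiniMax M3": 91.6,
    "Qwen3.7 Plus": 91.4,
    "Qwen3.6-35B-A3B": 89.9,
}

# Rank from highest to lowest score.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
leader_name, leader_score = ranked[0]

# Gap to the leader for every row, rounded to one decimal to hide float noise.
gaps = {name: round(leader_score - score, 1) for name, score in ranked}

# Spread between the best and worst published result.
spread = round(ranked[0][1] - ranked[-1][1], 1)

for name, score in ranked:
    print(f"{name}: {score:.1f}% (gap to leader: {gaps[name]:.1f})")
print(f"Spread across {len(ranked)} models: {spread:.1f} points")
```

With three rows, the spread and the gap of the last row are the same number, which is why both appear as 1.7 points.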

3 models have been evaluated on OmniDocBench 1.5. The benchmark falls in the Multimodal & Grounded category. This category carries a 12% weight in BenchLM.ai's overall scoring system. OmniDocBench 1.5 is currently displayed for reference but excluded from the scoring formula, so it does not directly affect overall rankings.
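To illustrate what "displayed but excluded from the scoring formula" can mean in practice, here is a small Python sketch. It is not BenchLM's formula, which is not given on this page. It assumes, purely for illustration, that a category score is the mean of its scored benchmarks and that the category contributes its weight times that mean. The `BenchmarkResult` type, the `category_score` helper, and the second benchmark entry with its score of 80.0 are hypothetical; only the 12% weight and the 91.6% score come from this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BenchmarkResult:
    name: str
    score: float        # percent, higher is better
    display_only: bool  # True: shown on the page, left out of scoring


def category_score(results: list[BenchmarkResult]) -> Optional[float]:
    """Mean of the scored (non display-only) results, or None if there are none.

    Assumption: a plain mean. The real aggregation may differ.
    """
    scored = [r.score for r in results if not r.display_only]
    return sum(scored) / len(scored) if scored else None


# Category weight stated on this page for Multimodal & Grounded.
MULTIMODAL_GROUNDED_WEIGHT = 0.12

results = [
    BenchmarkResult("OmniDocBench 1.5", 91.6, display_only=True),
    # Hypothetical placeholder entry, not a real published result.
    BenchmarkResult("hypothetical-scored-benchmark", 80.0, display_only=False),
]

cat = category_score(results)
contribution = MULTIMODAL_GROUNDED_WEIGHT * cat if cat is not None else 0.0
print(f"Category score: {cat}, weighted contribution: {contribution:.2f}")
```

Under these assumptions, changing the OmniDocBench 1.5 score leaves the category score untouched, which matches the statement that the benchmark does not directly affect overall rankings.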

About OmniDocBench 1.5

Year: 2026
Tasks: Document understanding tasks
Format: Document understanding benchmark
Difficulty: Grounded document reasoning

BenchLM stores OmniDocBench 1.5 in the higher-is-better score format used in current first-party comparison tables. Earlier lower-is-better, error-style rows are intentionally kept out of this benchmark key.

Introducing GPT-5.4 mini and nano

BenchLM freshness & provenance

Version: OmniDocBench 1.5 2026
Refresh cadence: Quarterly
Staleness state: Current
Question availability: Public benchmark set

Current · Display only

BenchLM uses freshness metadata to decide whether a benchmark should still be treated as a strong differentiator, a benchmark to watch, or a display-only reference. For the full scoring policy, see the BenchLM methodology page.

FAQ

What does OmniDocBench 1.5 measure?

A document understanding benchmark used in frontier-model comparison tables to measure extraction and grounded reasoning quality on complex documents.

Which model scores highest on OmniDocBench 1.5?

MiniMax M3 by MiniMax currently leads with a score of 91.6% on OmniDocBench 1.5.

How many models are evaluated on OmniDocBench 1.5?

3 AI models have been evaluated on OmniDocBench 1.5 on BenchLM.

Compare Top Models on OmniDocBench 1.5

MiniMax M3 vs Qwen3.7 Plus · Qwen3.7 Plus vs Qwen3.6-35B-A3B

Last updated: July 23, 2026 · BenchLM version OmniDocBench 1.5 2026
