BenchLM recommendation

Best Document AI Models in 2026

Data verified July 23, 2026

As of July 23, 2026, the top document AI model on the BenchLM leaderboard is Qwen3.7 Plus, with a score of 90.3.

This reporting page focuses on document workflows: OCR, office artifacts, long documents, and enterprise-style grounded reasoning over PDFs, slides, tables, and screenshots.

This page ranks models using only sourced document AI and OCR benchmarks in the reporting family.

Bottom line: Document AI benchmarks (OfficeQA Pro, OmniDocBench, CC-OCR) are newer and coverage is still building. Check the multimodal leaderboard for broader document understanding.

Qwen3.7 Plus leads this ranking with a score of 90.3, followed by Qwen3.6-35B-A3B (86.5) and Qwen3.6-27B (82.2). The three Qwen models sit within 8.1 points of each other, while fourth-place MiniMax M3 (68.4) trails third place by 13.8 points.

The best open-weight option is Qwen3.6-35B-A3B (ranked #2 with a score of 86.5), 3.8 points behind the proprietary leader. Open-weight models are highly competitive in this category, so self-hosting is a viable alternative to proprietary APIs.

This ranking is based on provisional scores from BenchLM.ai's scoring formula. For detailed model profiles, click any model name below. To compare two specific models head-to-head, use the "vs #" links.

1 · Closed · Qwen3.7 Plus · Alibaba · 1M · 90.3 sourced avg

2 · Open · Qwen3.6-35B-A3B · Alibaba · 262K · 86.5 sourced avg

3 · Open · Qwen3.6-27B · Alibaba · 262K · 82.2 sourced avg

How to choose

PDF and document processing?

Check multimodal scores — MMMU-Pro includes document reasoning

OCR-heavy workflows?

CC-OCR coverage is growing — check back soon

Enterprise document pipelines?

Claude and GPT-5 lead on multimodal document tasks

Full Rankings (4 models)

1 · Qwen3.7 Plus · Alibaba · Proprietary · 1M · 90.3 sourced avg · vs #2

2 · Qwen3.6-35B-A3B · Alibaba · Open Weight · 262K · 86.5 sourced avg · vs #3

3 · Qwen3.6-27B · Alibaba · Open Weight · 262K · 82.2 sourced avg · vs #4

4 · MiniMax M3 · MiniMax · Open Weight · 1M · 68.4 sourced avg
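
The spacing between adjacent ranks can be checked directly from the four scores listed above. The short sketch below does that arithmetic; the function name `adjacent_gaps` is a hypothetical helper introduced for illustration and is not part of any BenchLM tooling.

```python
# Sourced averages copied from the Full Rankings list above.
scores = [
    ("Qwen3.7 Plus", 90.3),
    ("Qwen3.6-35B-A3B", 86.5),
    ("Qwen3.6-27B", 82.2),
    ("MiniMax M3", 68.4),
]


def adjacent_gaps(ranked):
    """Return (higher, lower, gap) for each pair of neighbouring ranks.

    `ranked` must already be sorted from best to worst score.
    """
    return [
        (hi_name, lo_name, round(hi_score - lo_score, 1))
        for (hi_name, hi_score), (lo_name, lo_score) in zip(ranked, ranked[1:])
    ]


gaps = adjacent_gaps(scores)
for higher, lower, gap in gaps:
    print(f"{higher} -> {lower}: {gap} points")
```

The largest step is between ranks 3 and 4, which is where the top group separates from the rest of the list.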

Key Takeaways

The top model on this sourced reporting-family slice is Qwen3.7 Plus by Alibaba with an average of 90.3.

The best open-weight model is Qwen3.6-35B-A3B at position #2.

4 models are listed with sourced benchmark coverage in this reporting family.

Score in Context

What these scores mean

This is a reporting family ranking, not a weighted category. It averages sourced document AI and OCR benchmarks to give a focused view of this capability.
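
As a rough illustration of what a "sourced avg" could look like, the sketch below averages only the benchmarks that have a sourced result and skips the rest. It assumes a plain unweighted mean rounded to one decimal, which the page does not confirm. The function name `sourced_average` and the example scores are hypothetical and are not BenchLM data; only the benchmark names come from this page.

```python
from statistics import mean
from typing import Optional


def sourced_average(results: dict[str, Optional[float]]) -> Optional[float]:
    """Average the benchmarks that have a sourced score.

    A value of None marks a benchmark with no sourced result; it is
    skipped rather than counted as zero. Returns None when nothing
    is sourced. Assumption: unweighted mean, one decimal place.
    """
    sourced = [score for score in results.values() if score is not None]
    return round(mean(sourced), 1) if sourced else None


# Hypothetical scores for illustration only; NOT real BenchLM results.
example = {"OfficeQA Pro": 80.0, "OmniDocBench": 90.0, "CC-OCR": None}
example_avg = sourced_average(example)
print(example_avg)  # 85.0
```

Skipping missing benchmarks instead of scoring them as zero keeps a model from being penalized for results that have not been published, at the cost of comparing averages built from different benchmark sets.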

Known limitations

Models must have sourced results on at least a quarter of the benchmarks in this family to be included. Coverage varies: an average built from 2 benchmark scores is less reliable than one built from 5.
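
The inclusion rule above can be sketched as a simple coverage check. The one-quarter threshold is taken from the text; the function names and the family size of 8 used in the example are assumptions for illustration, since the page does not say how many benchmarks the family contains.

```python
import math

MIN_COVERAGE = 0.25  # "at least a quarter", as stated in the text above


def is_eligible(sourced_count: int, family_size: int) -> bool:
    """True when sourced results cover at least a quarter of the family."""
    if family_size <= 0:
        raise ValueError("family_size must be positive")
    return sourced_count / family_size >= MIN_COVERAGE


def min_benchmarks_needed(family_size: int) -> int:
    """Smallest number of sourced benchmarks that meets the threshold."""
    if family_size <= 0:
        raise ValueError("family_size must be positive")
    return math.ceil(family_size * MIN_COVERAGE)


# Hypothetical family of 8 benchmarks (the real size is not given here).
print(is_eligible(2, 8))         # True: 2 of 8 is exactly one quarter
print(is_eligible(1, 8))         # False: 1 of 8 is below one quarter
print(min_benchmarks_needed(8))  # 2
```

A model that barely clears the threshold is ranked on very little evidence, which is why the number of sourced benchmarks matters when reading the scores.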

Last updated: July 23, 2026