Tool-free variant of CharXiv that isolates raw visual reasoning ability without code execution or tool augmentation.
As of June 2, 2026, Claude Mythos Preview leads the CharXiv w/o tools leaderboard with 86.1%, followed by Claude Opus 4.7 (Adaptive) at 82.1% and Claude Opus 4.8 at 80.5%.
1. Claude Mythos Preview, Anthropic: 86.1%
2. Claude Opus 4.7 (Adaptive), Anthropic: 82.1%
3. Claude Opus 4.8, Anthropic: 80.5%
According to BenchLM.ai, the scores show a moderate spread: 4.0 percentage points separate first place from second, and 1.6 points separate second from third.
Three models have been evaluated on CharXiv w/o tools. The benchmark falls in the Multimodal & Grounded category, which carries a 12% weight in BenchLM.ai's overall scoring system. Within that category, CharXiv w/o tools contributes 5% of the category score, so performance here feeds directly into a model's overall ranking.
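As a rough illustration of how the two weights combine, the sketch below assumes the category weight and the within-category share simply multiply. BenchLM.ai's actual aggregation formula is not given here, and the function name is hypothetical.

```python
def effective_weight(category_weight: float, share_in_category: float) -> float:
    """Share of the overall score attributable to one benchmark.

    Assumption (not confirmed by the source): the two weights multiply.
    """
    return category_weight * share_in_category


# Figures quoted above: 12% category weight, 5% share within the category.
weight = effective_weight(0.12, 0.05)
print(f"{weight:.1%} of the overall score")
```

Under that assumption, the benchmark accounts for well under one percent of the overall score, so a gap of a few points here moves the overall ranking only slightly.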
Year: 2024
Tasks: Scientific chart reasoning (tool-free)
Format: Chart understanding without tools
Difficulty: Scientific visualization reasoning
The tool-free CharXiv variant measures pure multimodal reasoning. Claude Mythos Preview scores 86.1% without tools versus 93.2% with tools, a gap of 7.1 percentage points, which indicates strong baseline chart reasoning.
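The comparison between tool-free and tool-augmented scores can be expressed as a small calculation. This is a minimal sketch using only the two figures quoted above; the function name is hypothetical and not part of any BenchLM tooling.

```python
def tool_uplift(without_tools: float, with_tools: float) -> tuple[float, float]:
    """Return (absolute gap in percentage points, relative gain over the tool-free score)."""
    gap = with_tools - without_tools
    return gap, gap / without_tools


# Scores quoted above for Claude Mythos Preview, in percent.
gap, relative = tool_uplift(86.1, 93.2)
print(f"{gap:.1f} points absolute, {relative:.1%} relative")
```

The absolute gap is the more natural number to compare across models, because the relative gain depends on each model's tool-free baseline.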
Version: CharXiv w/o tools 2024
Refresh cadence: Annual
Staleness state: Refreshing
Question availability: Public benchmark set
BenchLM uses freshness metadata to decide whether a benchmark should still be treated as a strong differentiator, a benchmark to watch, or a display-only reference. For the full scoring policy, see the BenchLM methodology page.