BenchLM recommendation

Best Value LLM Overall in 2026 — Cost-Adjusted Rankings

Data verified July 23, 2026

As of July 23, 2026, the top model in the Best Value LLM Overall ranking on the BenchLM leaderboard is Ministral 3 3B (Reasoning), with a Score/$ of 395.2.


This ranking answers the simplest question: which model gives you the most benchmark performance per dollar? It divides each model's overall weighted score (across all 8 categories) by its output token price in dollars per 1M tokens. The leaders here are generalist value picks: strong across coding, reasoning, agentic, knowledge, and more, at prices that won't break your API budget. Start here if you need a single cost-effective model for mixed workloads.
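As a rough illustration of that division, here is a minimal sketch in Python. The function name `value_score` is hypothetical and is not part of any BenchLM API. The inputs are the rounded figures displayed on this page, so the result can differ slightly from the listed Score/$ (395.0 here versus the listed 395.2, presumably because the listed ratio is computed from an unrounded score).

```python
def value_score(overall_score: float, output_price_per_1m: float) -> float:
    """Benchmark points per dollar of output price (USD per 1M tokens)."""
    if output_price_per_1m <= 0:
        raise ValueError("output price must be positive")
    return overall_score / output_price_per_1m


# Displayed figures for Ministral 3 3B (Reasoning): score 39.5, $0.1 per 1M tokens.
ratio = value_score(39.5, 0.10)
print(round(ratio, 1))  # 395.0
```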

Unless noted otherwise, ranking surfaces on this page use BenchLM's provisional leaderboard lane rather than the stricter sourced-only verified leaderboard.

Ministral 3 3B (Reasoning) leads this ranking with a Score/$ of 395.2, followed by Ministral 3 8B (Reasoning) (269.2) and Ministral 3 14B (Reasoning) (246.7). The leader sits about 47% above second place, and the fourth-ranked model, DeepSeek V4 Flash, scores 210.29.

The best open-weight option is Ministral 3 3B (Reasoning), ranked #1 with a Score/$ of 395.2. Open-weight models are highly competitive in this category: all of the top ten are open-weight, so self-hosting is a viable alternative to proprietary APIs.

This ranking is based on provisional overall weighted scores computed with BenchLM.ai's scoring formula.

1. Ministral 3 3B (Reasoning) · Mistral · Open · 128K · 395.2 Score/$ (Score: 39.5 · $0.1/1M)
2. Ministral 3 8B (Reasoning) · Mistral · Open · 128K · 269.2 Score/$ (Score: 40.4 · $0.15/1M)
3. Ministral 3 14B (Reasoning) · Mistral · Open · 128K · 246.7 Score/$ (Score: 49.3 · $0.2/1M)

How to choose

Category-specific value?

Check category-specific value pages (coding, agentic, etc.)

Best raw performance regardless of cost?

See the overall leaderboard

Full Rankings (107 models)

Rank. Model · Provider · License · Context · Score/$ (Score · output price per 1M tokens)

1. Ministral 3 3B (Reasoning) · Mistral · Open Weight · 128K · 395.2 Score/$ (Score: 39.5 · $0.1/1M)
2. Ministral 3 8B (Reasoning) · Mistral · Open Weight · 128K · 269.2 Score/$ (Score: 40.4 · $0.15/1M)
3. Ministral 3 14B (Reasoning) · Mistral · Open Weight · 128K · 246.7 Score/$ (Score: 49.3 · $0.2/1M)
4. DeepSeek V4 Flash · DeepSeek · Open Weight · 1M · 210.29 Score/$ (Score: 58.9 · $0.28/1M)
5. DeepSeek V4 Flash (High) · DeepSeek · Open Weight · 1M · 192.68 Score/$ (Score: 54 · $0.28/1M)
6. Step 3.5 Flash · StepFun · Open Weight · 256K · 183.67 Score/$ (Score: 55.1 · $0.3/1M)
7. Ministral 3 3B · Mistral · Open Weight · 128K · 182.7 Score/$ (Score: 18.3 · $0.1/1M)
8. Ministral 3 14B · Mistral · Open Weight · 128K · 172.35 Score/$ (Score: 34.5 · $0.2/1M)
9. Ministral 3 8B · Mistral · Open Weight · 128K · 139.73 Score/$ (Score: 21 · $0.15/1M)
10. DeepSeek V3.2 · DeepSeek · Open Weight · 128K · 131.9 Score/$ (Score: 55.4 · $0.42/1M)
11. Qwen3.5 Flash · Alibaba · Proprietary · 1M · 119.12 Score/$ (Score: 47.7 · $0.4/1M)
12. GPT-5 nano · OpenAI · Proprietary · 400K · 115.9 Score/$ (Score: 46.4 · $0.4/1M)
13. GPT-4.1 nano · OpenAI · Proprietary · 1M · 105.15 Score/$ (Score: 42.1 · $0.4/1M)
14. Grok 4.1 Fast · xAI · Proprietary · 1M · 102.5 Score/$ (Score: 51.3 · $0.5/1M)
15. Mistral Small 4 · Mistral · Open Weight · 256K · 77.05 Score/$ (Score: 46.2 · $0.6/1M)
16. DeepSeek V4 Pro · DeepSeek · Open Weight · 1M · 69.72 Score/$ (Score: 60.7 · $0.87/1M)
17. Mercury 2 · Inception · Proprietary · 128K · 68.37 Score/$ (Score: 51.3 · $0.75/1M)
18. DeepSeek V4 Pro (High) · DeepSeek · Open Weight · 1M · 63.76 Score/$ (Score: 55.5 · $0.87/1M)
19. GPT-4o mini · OpenAI · Proprietary · 128K · 63.12 Score/$ (Score: 37.9 · $0.6/1M)
20. MiniMax M3 · MiniMax · Open Weight · 1M · 58.13 Score/$ (Score: 69.8 · $1.2/1M)
21. Trinity-Large-Thinking · Arcee AI · Open Weight · 512K · 58.13 Score/$ (Score: 52.3 · $0.9/1M)
22. Trinity-Large-Preview · Arcee AI · Open Weight · 512K · 55.94 Score/$ (Score: 55.9 · $1/1M)
23. GPT-5.4 nano · OpenAI · Proprietary · 400K · 53.43 Score/$ (Score: 66.8 · $1.25/1M)
24. MiniMax M2.7 · MiniMax · Open Weight · 200K · 53.43 Score/$ (Score: 64.1 · $1.2/1M)
25. MiniMax M2.5 · MiniMax · Proprietary · 128K · 49.6 Score/$ (Score: 59.5 · $1.2/1M)
26. Step 3.7 Flash · StepFun · Open Weight · 256K · 44.23 Score/$ (Score: 50.9 · $1.15/1M)
27. GLM-4.5-Air · Z.AI · Proprietary · 128K · 43.36 Score/$ (Score: 47.7 · $1.1/1M)
28. DeepSeek V3 · DeepSeek · Open Weight · 128K · 40.88 Score/$ (Score: 45 · $1.1/1M)
29. Gemini 3.1 Flash-Lite · Google · Proprietary · 1M · 33.89 Score/$ (Score: 50.8 · $1.5/1M)
30. Mistral Large 3 · Mistral · Proprietary · 128K · 33.6 Score/$ (Score: 50.4 · $1.5/1M)
31. Aion-2.0 · Aion Labs · Proprietary · 128K · 30.41 Score/$ (Score: 48.7 · $1.6/1M)
32. GPT-4.1 mini · OpenAI · Proprietary · 1M · 27.62 Score/$ (Score: 44.2 · $1.6/1M)
33. DeepSeek V3.2 (Thinking) · DeepSeek · Open Weight · 128K · 26.55 Score/$ (Score: 58.2 · $2.19/1M)
34. GLM-4.5 · Z.AI · Proprietary · 128K · 26.16 Score/$ (Score: 57.6 · $2.2/1M)
35. Grok 4.3 · xAI · Proprietary · 1M · 26.04 Score/$ (Score: 65.1 · $2.5/1M)
36. Grok Code Fast 1 · xAI · Proprietary · 256K · 25.76 Score/$ (Score: 38.6 · $1.5/1M)
37. DeepSeek-R1 · DeepSeek · Open Weight · 128K · 23.59 Score/$ (Score: 51.7 · $2.19/1M)
38. GPT-5 mini · OpenAI · Proprietary · 128K · 21.97 Score/$ (Score: 43.9 · $2/1M)
39. Mistral Medium 3 · Mistral · Proprietary · 128K · 21.6 Score/$ (Score: 43.2 · $2/1M)
40. GLM-5 · Z.AI · Open Weight · 200K · 20.64 Score/$ (Score: 66.1 · $3.2/1M)
41. Gemini 3 Flash · Google · Proprietary · 1M · 20.16 Score/$ (Score: 60.5 · $3/1M)
42. Kimi K2.5 · Moonshot AI · Open Weight · 256K · 19.89 Score/$ (Score: 59.7 · $3/1M)
43. Kimi K2.5 (Reasoning) · Moonshot AI · Proprietary · 128K · 19.78 Score/$ (Score: 59.4 · $3/1M)
44. Qwen3.5 Plus · Alibaba · Proprietary · 1M · 19.67 Score/$ (Score: 47.2 · $2.4/1M)
45. Gemini 2.5 Flash · Google · Proprietary · 1M · 19.24 Score/$ (Score: 48.1 · $2.5/1M)
46. GLM-5 (Reasoning) · Z.AI · Open Weight · 200K · 18.68 Score/$ (Score: 59.8 · $3.2/1M)
47. Claude 3 Haiku · Anthropic · Proprietary · 200K · 17.11 Score/$ (Score: 21.4 · $1.25/1M)
48. GLM-5-Turbo · Z.AI · Proprietary · 200K · 16.72 Score/$ (Score: 66.9 · $4/1M)
49. Qwen3.5 397B (Reasoning) · Alibaba · Open Weight · 128K · 16.53 Score/$ (Score: 59.5 · $3.6/1M)
50. GLM-5V-Turbo · Z.AI · Proprietary · 200K · 15.88 Score/$ (Score: 63.5 · $4/1M)
51. Qwen3.5 397B · Alibaba · Open Weight · 128K · 15.84 Score/$ (Score: 57 · $3.6/1M)
52. GLM-5.1 · Z.AI · Open Weight · 203K · 15.4 Score/$ (Score: 67.7 · $4.4/1M)
53. GLM-5.2 · Z.AI · Open Weight · 1M · 14.54 Score/$ (Score: 64 · $4.4/1M)
54. Inkling · Thinking Machines Lab · Open Weight · 1M · 14.43 Score/$ (Score: 67.5 · $4.68/1M)
55. Kimi K2.6 · Moonshot AI · Open Weight · 256K · 14.2 Score/$ (Score: 56.8 · $4/1M)
56. Kimi K2.7 Code · Moonshot AI · Open Weight · 256K · 13.75 Score/$ (Score: 55 · $4/1M)
57. Grok 4.5 · xAI · Proprietary · 500K · 12.79 Score/$ (Score: 76.7 · $6/1M)
58. GPT-5.4 mini · OpenAI · Proprietary · 400K · 12.62 Score/$ (Score: 56.8 · $4.5/1M)
59. Claude Haiku 4.5 · Anthropic · Proprietary · 200K · 11.32 Score/$ (Score: 56.6 · $5/1M)
60. GPT-5.6 Luna · OpenAI · Proprietary · 1M · 11.2 Score/$ (Score: 67.2 · $6/1M)
61. Kimi K2 · Moonshot AI · Proprietary · 128K · 10.88 Score/$ (Score: 27.2 · $2.5/1M)
62. o3-mini · OpenAI · Proprietary · 200K · 10.77 Score/$ (Score: 47.4 · $4.4/1M)
63. GPT-5.2 Instant · OpenAI · Proprietary · 128K · 9.83 Score/$ (Score: 59 · $6/1M)
64. Grok 4.20 · xAI · Proprietary · 2M · 9.11 Score/$ (Score: 54.7 · $6/1M)
65. Gemini 3.5 Flash · Google · Proprietary · 1M · 7.19 Score/$ (Score: 64.8 · $9/1M)
66. Gemini 1.5 Pro · Google · Proprietary · 2M · 7.14 Score/$ (Score: 35.7 · $5/1M)
67. Claude Sonnet 5 · Anthropic · Proprietary · 1M · 6.53 Score/$ (Score: 65.3 · $10/1M)
68. GPT-4.1 · OpenAI · Proprietary · 1M · 6.39 Score/$ (Score: 51.1 · $8/1M)
69. (name missing in source) · OpenAI · Proprietary · 200K · 5.99 Score/$ (Score: 47.9 · $8/1M)
70. GPT-5 (high) · OpenAI · Proprietary · 128K · 5.86 Score/$ (Score: 58.6 · $10/1M)
71. Gemini 2.5 Pro · Google · Proprietary · 1M · 5.73 Score/$ (Score: 57.3 · $10/1M)
72. Gemini 3 Pro · Google · Proprietary · 2M · 5.64 Score/$ (Score: 67.7 · $12/1M)
73. GPT-5.1-Codex-Max · OpenAI · Proprietary · 400K · 5.45 Score/$ (Score: 54.5 · $10/1M)
74. Kimi K3 · Moonshot AI · Pending · 1.05M · 5.4 Score/$ (Score: 81 · $15/1M)
75. GPT-5.1 · OpenAI · Proprietary · 200K · 5.37 Score/$ (Score: 53.7 · $10/1M)
76. GPT-5.4 · OpenAI · Proprietary · 1.05M · 4.95 Score/$ (Score: 74.2 · $15/1M)
77. GPT-5.6 Terra · OpenAI · Proprietary · 1M · 4.84 Score/$ (Score: 72.6 · $15/1M)
78. GPT-5.3 Codex · OpenAI · Proprietary · 400K · 4.76 Score/$ (Score: 66.7 · $14/1M)
79. Command A+ · Cohere · Open Weight · 128K · 4.75 Score/$ (Score: 47.5 · $10/1M)
80. Gemini 3.1 Pro · Google · Proprietary · 1M · 4.61 Score/$ (Score: 55.3 · $12/1M)
81. Claude Sonnet 4.6 · Anthropic · Proprietary · 200K · 4.34 Score/$ (Score: 65.1 · $15/1M)
82. GPT-5.2-Codex · OpenAI · Proprietary · 400K · 4.22 Score/$ (Score: 59.1 · $14/1M)
83. GPT-5.3 Instant · OpenAI · Proprietary · 400K · 4.21 Score/$ (Score: 58.9 · $14/1M)
84. GPT-5.2 · OpenAI · Proprietary · 400K · 4.17 Score/$ (Score: 58.4 · $14/1M)
85. GPT-4o · OpenAI · Proprietary · 128K · 4.15 Score/$ (Score: 41.5 · $10/1M)
86. Claude Sonnet 4.5 · Anthropic · Proprietary · 200K · 3.57 Score/$ (Score: 53.6 · $15/1M)
87. Claude 3.5 Sonnet · Anthropic · Proprietary · 200K · 3.18 Score/$ (Score: 47.7 · $15/1M)
88. Claude Opus 4.8 · Anthropic · Proprietary · 1M · 3.13 Score/$ (Score: 78.3 · $25/1M)
89. Claude Opus 4.7 · Anthropic · Proprietary · 1M · 2.88 Score/$ (Score: 71.9 · $25/1M)
90. Claude 4 Sonnet · Anthropic · Proprietary · 200K · 2.85 Score/$ (Score: 42.8 · $15/1M)
91. Claude Opus 4.6 · Anthropic · Proprietary · 1M · 2.74 Score/$ (Score: 68.6 · $25/1M)
92. GPT-5.6 Sol · OpenAI · Proprietary · 1M · 2.73 Score/$ (Score: 82 · $30/1M)
93. Claude Opus 4.7 (Adaptive) · Anthropic · Proprietary · 1M · 2.65 Score/$ (Score: 66.3 · $25/1M)
94. Claude Opus 4.5 · Anthropic · Proprietary · 200K · 2.57 Score/$ (Score: 64.2 · $25/1M)
95. GPT-5.5 · OpenAI · Proprietary · 1M · 2.45 Score/$ (Score: 73.5 · $30/1M)
96. Claude Mythos 5 · Anthropic · Proprietary · 1M+ · 1.68 Score/$ (Score: 83.9 · $50/1M)
97. Claude Fable 5 · Anthropic · Proprietary · 1M+ · 1.67 Score/$ (Score: 83.7 · $50/1M)
98. GPT-4 Turbo · OpenAI · Proprietary · 128K · 0.91 Score/$ (Score: 27.4 · $30/1M)
99. o1-preview · OpenAI · Proprietary · 200K · 0.82 Score/$ (Score: 49.1 · $60/1M)
100. (name missing in source) · OpenAI · Proprietary · 200K · 0.8 Score/$ (Score: 48.1 · $60/1M)
101. Claude 4.1 Opus · Anthropic · Proprietary · 200K · 0.61 Score/$ (Score: 45.9 · $75/1M)
102. o3-pro · OpenAI · Proprietary · 200K · 0.6 Score/$ (Score: 48.3 · $80/1M)
103. Claude 3 Opus · Anthropic · Proprietary · 200K · 0.55 Score/$ (Score: 41.1 · $75/1M)
104. GPT-5.2 Pro · OpenAI · Proprietary · 400K · 0.45 Score/$ (Score: 67 · $150/1M)
105. GPT-5.5 Pro · OpenAI · Proprietary · 1M · 0.35 Score/$ (Score: 63.7 · $180/1M)
106. GPT-5.4 Pro · OpenAI · Proprietary · 1.05M · 0.34 Score/$ (Score: 60.9 · $180/1M)
107. o1-pro · OpenAI · Proprietary · 200K · 0.08 Score/$ (Score: 45.9 · $600/1M)

Key Takeaways

The best value model is Ministral 3 3B (Reasoning) by Mistral with a provisional Score/$ ratio of 395.2 (score: 39.5, output: $0.1/1M tokens).

The best open-weight model is Ministral 3 3B (Reasoning) at position #1.

107 models are included in this ranking.

Score in Context

What these scores mean

Value scores divide the weighted overall score by output token price (per 1M tokens). Higher means more capability per dollar. Models with no listed price are excluded.
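The rule above, including the exclusion of unpriced models, might be sketched as follows. This is an illustrative Python sketch and not BenchLM's implementation: the `Model` class, the `rank_by_value` helper, and the unpriced entry are assumptions made up for the example, while the priced rows use figures displayed on this page.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Model:
    name: str
    score: float             # weighted overall score
    price: Optional[float]   # USD per 1M output tokens; None means no listed price


def rank_by_value(models: List[Model]) -> List[Model]:
    """Drop models without a usable price, then sort by score per dollar, best first."""
    priced = [m for m in models if m.price is not None and m.price > 0]
    return sorted(priced, key=lambda m: m.score / m.price, reverse=True)


models = [
    Model("GPT-5 nano", 46.4, 0.40),
    Model("Ministral 3 3B (Reasoning)", 39.5, 0.10),
    Model("Hypothetical unpriced model", 70.0, None),  # made-up entry, excluded below
    Model("DeepSeek V4 Flash", 58.9, 0.28),
]

ranked = rank_by_value(models)
for position, m in enumerate(ranked, start=1):
    print(position, m.name, round(m.score / m.price, 2))
```

The unpriced entry never appears in `ranked`, no matter how high its raw score is, which mirrors the exclusion described above.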

Known limitations

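Value rankings favor cheap models even if absolute performance is modest. A model scoring half as well at one-tenth the price has five times the Score/$, but it may not meet your quality bar. For example, Claude Mythos 5 has the highest raw score on this page (83.9) yet ranks #96 on value, while the #1 model has a raw score of 39.5. Always check raw scores alongside value rankings.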
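One way to act on this caveat is to set a minimum raw score first and only then compare value. The Python sketch below assumes a hypothetical helper, `best_value_above`, and uses a few rows of displayed figures from this page; the threshold of 60 is an arbitrary example, not a BenchLM recommendation.

```python
from typing import List, Optional, Tuple

# (name, overall score, USD per 1M output tokens), taken from the displayed figures.
Row = Tuple[str, float, float]

ROWS: List[Row] = [
    ("Ministral 3 3B (Reasoning)", 39.5, 0.10),
    ("DeepSeek V4 Flash", 58.9, 0.28),
    ("DeepSeek V4 Pro", 60.7, 0.87),
    ("MiniMax M3", 69.8, 1.20),
]


def best_value_above(rows: List[Row], min_score: float) -> Optional[Row]:
    """Return the best score-per-dollar row among those meeting the quality bar."""
    eligible = [row for row in rows if row[1] >= min_score]
    if not eligible:
        return None
    return max(eligible, key=lambda row: row[1] / row[2])


# With no bar, the cheapest model wins; with a bar of 60, a stronger model does.
print(best_value_above(ROWS, 0.0)[0])
print(best_value_above(ROWS, 60.0)[0])
```

Raising the bar changes the answer from Ministral 3 3B (Reasoning) to DeepSeek V4 Pro in this small sample, which shows why the raw score and the value ratio are worth reading together.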


Last updated: July 23, 2026
