Open Model Leaderboard
M5 Max · 128GB MLX LIVE DATA

The definitive benchmark for open-source AI models on Apple Silicon. Speed, quality, and memory at every quantization level — compared against cloud APIs.

3 quality benchmarks

Quality vs Speed
Best models are in the top-right (high quality + high speed). Click a point for details.
Quality vs Memory Usage
Find the sweet spot: best quality for your available RAM.
# Model Quant Speed Quality ARC GSM8K IFEval Agentic Memory Params
Quantization Impact: Speed vs Quality Tradeoff
Connected dots show the same model at Q4 (blue) and Q8 (green). Q8 is slower but higher quality.
Local vs Cloud: Quality Comparison
How do local quantized models compare to cloud APIs on quality benchmarks?
Cloud API Quality Breakdown
Per-benchmark scores for cloud models (from published technical reports).
Best Models by RAM Tier
Ranked by quality index for models that fit in your available memory.
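As a rough illustration of the ranking described above, the sketch below filters models by memory footprint and sorts the survivors by a quality index. The field names, the sample entries, and the definition of the quality index (here, the plain mean of the ARC, GSM8K, and IFEval scores) are assumptions for illustration only, not the leaderboard's actual schema or formula.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Entry:
    # Hypothetical record; the leaderboard's real schema is not shown on this page.
    model: str
    quant: str
    memory_gb: float
    arc: float
    gsm8k: float
    ifeval: float

    @property
    def quality_index(self) -> float:
        # Assumption: quality index = unweighted mean of the three benchmarks.
        return mean([self.arc, self.gsm8k, self.ifeval])


def best_for_ram(entries, ram_gb, top_n=3):
    """Return up to top_n entries that fit in ram_gb, highest quality first."""
    fitting = [e for e in entries if e.memory_gb <= ram_gb]
    return sorted(fitting, key=lambda e: e.quality_index, reverse=True)[:top_n]


# Made-up placeholder data, not measured results.
SAMPLE = [
    Entry("model-a", "Q4", 5.0, 60.0, 50.0, 70.0),
    Entry("model-a", "Q8", 9.0, 62.0, 54.0, 70.0),
    Entry("model-b", "Q4", 20.0, 80.0, 75.0, 85.0),
]

if __name__ == "__main__":
    for e in best_for_ram(SAMPLE, ram_gb=16):
        print(f"{e.model} {e.quant}: {e.quality_index:.1f}")
```

With a 16 GB budget, the placeholder `model-b` is excluded because its assumed footprint exceeds the limit, and the remaining entries are ordered by the assumed index. A real tiering would likely also reserve headroom for the OS and context cache.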

NVIDIA RTX 4090 Benchmarks

We're adding NVIDIA RTX 4090 (24GB VRAM) benchmarks for direct comparison with Apple Silicon unified memory.

COMING SOON