Open Model Leaderboard

The definitive benchmark for open-source AI models on Apple Silicon. Speed, quality, and memory at every quantization level — compared against cloud APIs.

3 quality benchmarks

Quality vs Speed

Best models are in the top-right (high quality + high speed).
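One way to make "top-right" precise is a Pareto check: a model is worth a closer look if no other model is at least as fast and at least as good, and strictly better on one of the two. The sketch below is an illustration only. The `Result` record, its field names, and the sample values are assumptions, not the leaderboard's actual schema or data.

```python
from typing import NamedTuple


class Result(NamedTuple):
    # Hypothetical record layout; field names are assumptions for illustration.
    name: str
    speed: float    # higher is better, e.g. tokens per second
    quality: float  # higher is better, e.g. a quality index


def pareto_front(results: list[Result]) -> list[Result]:
    """Return the results that no other result beats on both speed and quality."""
    front = []
    for r in results:
        dominated = any(
            o.speed >= r.speed
            and o.quality >= r.quality
            and (o.speed > r.speed or o.quality > r.quality)
            for o in results
        )
        if not dominated:
            front.append(r)
    return front


# Made-up sample points, not real benchmark numbers.
sample = [
    Result("model-a", speed=90.0, quality=55.0),
    Result("model-b", speed=40.0, quality=72.0),
    Result("model-c", speed=35.0, quality=60.0),  # slower and lower quality than model-b
]
front_names = [r.name for r in pareto_front(sample)]
print(front_names)
```

Here `model-c` drops out because `model-b` is both faster and higher quality, while `model-a` and `model-b` each win on one axis and both stay.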

Quality vs Memory Usage

Find the sweet spot: best quality for your available RAM.

#	Model	Quant	Speed	Quality	ARC	GSM8K	IFEval	Agentic	Memory	Params

Quantization Impact: Speed vs Quality Tradeoff

Connected dots show the same model at Q4 (blue) and Q8 (green). Q8 is slower but higher quality.
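The pairing behind those connected dots can be sketched as a small grouping step: collect each model's Q4 and Q8 rows, then compare them. The tuple layout `(name, quant, speed, quality)`, the function name `quant_deltas`, and the sample numbers are all hypothetical and chosen only for illustration.

```python
def quant_deltas(rows):
    """Pair Q4 and Q8 rows per model and report how Q8 compares to Q4.

    Each row is assumed to be (name, quant, speed, quality).
    Models missing either quantization level are skipped.
    """
    by_model = {}
    for name, quant, speed, quality in rows:
        by_model.setdefault(name, {})[quant] = (speed, quality)

    deltas = {}
    for name, quants in by_model.items():
        if "Q4" in quants and "Q8" in quants:
            (s4, q4), (s8, q8) = quants["Q4"], quants["Q8"]
            deltas[name] = {
                "speed_ratio": s8 / s4,    # below 1.0 means Q8 is slower
                "quality_gain": q8 - q4,   # above 0 means Q8 scores higher
            }
    return deltas


# Made-up rows, not real benchmark numbers.
rows = [
    ("model-a", "Q4", 80.0, 58.0),
    ("model-a", "Q8", 50.0, 61.0),
    ("model-b", "Q4", 30.0, 74.0),  # no Q8 row, so no pair
]
deltas = quant_deltas(rows)
print(deltas)
```

A speed ratio below 1.0 together with a positive quality gain corresponds to the pattern described above: the Q8 point sits to the left of and above its Q4 counterpart.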

Local vs Cloud: Quality Comparison

How do local quantized models compare to cloud APIs on quality benchmarks?

Cloud API Quality Breakdown

Per-benchmark scores for cloud models (from published technical reports).

Best Models by RAM Tier

Ranked by quality index for models that fit in your available memory.
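The tier ranking amounts to a filter followed by a sort. The following is a minimal sketch under stated assumptions: the `Entry` fields, the `best_for_ram` helper, and the sample values are hypothetical, and the memory check is a plain comparison that ignores headroom for the operating system or long contexts.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    # Hypothetical fields; the real data schema is not shown on the page.
    name: str
    quant: str
    memory_gb: float
    quality: float


def best_for_ram(entries, available_gb, top_n=3):
    """Keep entries whose memory footprint fits, then rank by quality, highest first."""
    fitting = [e for e in entries if e.memory_gb <= available_gb]
    return sorted(fitting, key=lambda e: e.quality, reverse=True)[:top_n]


# Made-up entries, not real benchmark numbers.
entries = [
    Entry("model-a", "Q4", 4.5, 58.0),
    Entry("model-a", "Q8", 8.2, 61.0),
    Entry("model-b", "Q4", 18.0, 74.0),
    Entry("model-b", "Q8", 34.0, 76.0),
]
tiers = {
    ram: [(e.name, e.quant) for e in best_for_ram(entries, ram)]
    for ram in (8, 16, 32)
}
print(tiers)
```

With these sample values, the 8 GB tier contains only the Q4 build of `model-a`, since its Q8 build needs 8.2 GB, while the 32 GB tier is led by the Q4 build of `model-b`.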

NVIDIA RTX 4090 Benchmarks

We're adding NVIDIA RTX 4090 (24GB VRAM) benchmarks for direct comparison with Apple Silicon unified memory.

COMING SOON