Public R2 mirror · pre-quantised MLX weights

The rapid-mlx model mirror. One command away.

A public, anonymous Cloudflare R2 mirror of MLX-quantised LLM weights: the same files that the rapid-mlx pull command downloads under the hood. Browse the catalog, copy the install command, and have a model serving on your Mac in minutes.

$ rapid-mlx pull qwen3.5-4b-4bit
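
If you script the pull, a small guard can fail early when the CLI is not on your PATH. The sketch below is only an illustration: pull_model is a hypothetical wrapper name, and the only rapid-mlx invocation it assumes is the pull command shown above.

```shell
# Hypothetical wrapper (not part of rapid-mlx): check the CLI exists, then pull.
pull_model() {
  if ! command -v rapid-mlx >/dev/null 2>&1; then
    echo "rapid-mlx not found on PATH" >&2
    return 127
  fi
  # The alias is passed through unchanged, e.g. qwen3.5-4b-4bit.
  rapid-mlx pull "$1"
}
```

Usage: pull_model qwen3.5-4b-4bit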

Pick by Mac RAM

Not sure which model? Pick your Mac RAM tier; we'll surface the alias that gives the best single-user throughput at that footprint. The full catalog is below.
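
If you do not know your Mac's RAM tier offhand, you can read it from the terminal. The sketch below assumes macOS, where sysctl -n hw.memsize prints installed memory in bytes; bytes_to_gib is a hypothetical helper name used only for this example.

```shell
# Hypothetical helper: convert a byte count to whole GiB (1 GiB = 1073741824 bytes).
bytes_to_gib() {
  echo $(( $1 / 1073741824 ))
}

# On macOS, hw.memsize reports installed physical memory in bytes.
if [ "$(uname)" = "Darwin" ]; then
  echo "Installed RAM: $(bytes_to_gib "$(sysctl -n hw.memsize)") GB"
fi
```

Match the printed figure against the RAM tiers in the picker.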

1 · Choose your Mac RAM
16 GB (selected)

2 · Recommended model
Qwen3.5-4B 4bit
chat, coding, tools
On disk: 2.4 GB
Throughput: 147 tok/s

3 · Compare nearby tiers

Model catalog

Every alias rapid-mlx ships with, matched against what is currently mirrored on R2. Files are immutable for a given Hugging Face revision and cached aggressively at the edge.
