--- license: other base_model: MiniMaxAI/MiniMax-M3 tags: [mlx, jang, reap, awq, moe, code, multimodal, minimax-m3, osaurus, apple-silicon] pipeline_tag: text-generation ---

Osaurus

MiniMax-M3-Coder-Small

🦖 Osaurus Exclusive — a compact JANG-quantized MiniMax-M3 coder (coding · agentic · multimodal) for Apple Silicon.

> ⚠️ **JANG-format model — runs on Osaurus.** > This uses the **JANG** quantization format (mixed-precision affine + **AWQ** + **REAP** expert pruning) and loads through **Osaurus's native Swift runtime**. It will **NOT** load with `transformers`, `vLLM`, or generic MLX loaders. ## What is a JANG model? **JANG** is a mixed-precision quantization + packing format — per-projection affine bit widths + **AWQ** activation-aware scaling + **REAP** expert pruning — described by a `jang_config.json`. Weights stay quantized in GPU memory. **Osaurus loads it through its native Swift JANG runtime** on Apple Silicon. ## Highlights - **Smallest M3 coder — ~84 GB** (the compact Osaurus build). - **REAP45:** keep **70/128** routed experts (45% pruned). - **All-2-bit routed experts + AWQ** (gate/up 2-bit AWQ-scaled, down 2-bit); attention 8-bit, shared experts 6-bit, embeddings 6-bit, lm_head 8-bit, Lightning Indexer FP16. - **Multimodal (vision) kept.** - Calibration: Vera (agentic-coder) + GSM8K; "floor" recipe keeps the most-salient coding experts. ## Benchmarks - **HumanEval: pass@1 = 100%** (82/82, scrambled-half adaptive eval, seed 42; 0 failures, 0 escalations). - Despite 45% expert pruning + all-2-bit routed experts, coding accuracy holds at **100%** — the REAP45 keep-set is a subset of the larger M3-Coder builds' proven coding experts, so coding capability is preserved while the model shrinks to ~84 GB. ## Run it Load it in **Osaurus** (Apple Silicon) — it runs on Osaurus's native Swift JANG runtime. ## Attribution - Base model: **MiniMaxAI/MiniMax-M3** · Pruning: **REAP** (Cerebras, arXiv:2510.13999) - **Vera calibration + testing: [@hornsman1](https://huggingface.co/hornsman1) (hornsan1 on GitHub)** · math calibration: GSM8K - Quantization: **JANG** · Runtime & distribution: **Osaurus**