Gemma 4 QAT Collection Gemma 4 QAT (Quantization-Aware Training) for 3x lower memory use and near-original accuracy. • 16 items • Updated 1 day ago • 89
DFlash Collection Block Diffusion for Flash Speculative Decoding • 22 items • Updated 2 days ago • 130
Granite Embedding Collection Embedding models (bi‑encoders and rerankers) for RAG, semantic search, and retrieval tasks. • 9 items • Updated Apr 30 • 45
Embedding Models Collection Run or fine-tune embedding models with Unsloth. • 14 items • Updated 1 day ago • 6