shisa-ai/Qwen3.6-35B-A3B-PARO-full4096-e5-packed Text Generation • 6B • Updated about 4 hours ago • 14
Distilling 100B+ Models 40x Faster with TRL 📝 Running • Featured • 84 • TRL distillation for 100B+ teachers, 40x faster