Tiny models used for testing
Inference Optimization
community
Qwen3.6-35B-A3B mixed-precision HIGGS model variants, plus base FP16/FP8/NVFP4 references.
inference-optimization/Qwen3.6-35B-A3B-5.0-bits-mode-heuristic
Image-Text-to-Text • 24B • Updated • 98
inference-optimization/Qwen3.6-35B-A3B-5.0-bits-mode-hybrid
Image-Text-to-Text • 24B • Updated • 99
inference-optimization/Qwen3.6-35B-A3B-5.0-bits-mode-noise
Image-Text-to-Text • 24B • Updated • 62
inference-optimization/Qwen3.6-35B-A3B-5.5-bits-mode-heuristic
Image-Text-to-Text • 26B • Updated • 45
models 376
inference-optimization/Laguna-XS.2-speculator.dflash-Qwen235B-500k-ckpt1-20260609-0052
0.6B • Updated
inference-optimization/Qwen3-8B-speculator.dflash.swa.non-qwen3-ep0p11
2B • Updated
inference-optimization/Laguna-XS.2-speculator.dflash-Qwen235B-500k-ckpt1
0.6B • Updated • 107
inference-optimization/Laguna-XS.2-speculator.dflash-Qwen235B-500k-ckpt0.5
0.6B • Updated
inference-optimization/Qwen3-8B-speculator.dflash.swa.unified-ep0p28
2B • Updated
inference-optimization/Qwen3-8B-speculator.dflash.swa.unified-ep0p19
2B • Updated
inference-optimization/DFlash-SWA-Causal-Qwen3-8B-Magpie-Ultrachat
2B • Updated • 181
inference-optimization/DFlash-SWA-Causal-Qwen3-8B-PerfectBlend
2B • Updated • 51
inference-optimization/Laguna-XS.2-speculator.dflash-Qwen235B-500k-ckpt0
0.6B • Updated • 106
inference-optimization/gpt-oss-2.5B-A1.3B
3B • Updated • 21
datasets 22
inference-optimization/Qwen3.5-0.8B-responses
Viewer • Updated • 7.47k • 29
inference-optimization/Qwen3.5-9B-responses
Viewer • Updated • 7.67k • 19
inference-optimization/Qwen3-8B-Regenerated-Collection
Preview • Updated • 130
inference-optimization/Qwen3-30B-A3B-responses
Preview • Updated • 29
inference-optimization/Qwen3-32B-responses
Preview • Updated • 38
inference-optimization/ctest-Qwen3.6-27B-speculator-dataset
Viewer • Updated • 5.61k • 28
inference-optimization/Gemma4-Responses-Nemotron
Viewer • Updated • 762k • 58 • 1
inference-optimization/Longbench_Samples_Specdec
Viewer • Updated • 160 • 64
inference-optimization/ctest-subset-Qwen3.5-397B-A17B-FP8-dynamic-speculator-dataset
Viewer • Updated • 10k • 73
inference-optimization/final-ctest-Qwen3-8B-speculator-dataset
Viewer • Updated • 10k • 61