GSQ Collection GSQ: Highly-Accurate Low-Precision Scalar Quantization for LLMs via Gumbel-Softmax Sampling, https://huggingface.co/papers/2604.18556 • 9 items • Updated 25 days ago • 9
GSQ: Highly-Accurate Low-Precision Scalar Quantization for LLMs via Gumbel-Softmax Sampling Paper • 2604.18556 • Published Apr 20 • 8
MatryoshkaLoRA: Learning Accurate Hierarchical Low-Rank Representations for LLM Fine-Tuning Paper • 2605.07850 • Published May 8 • 18
MatGPTQ: Accurate and Efficient Post-Training Matryoshka Quantization Paper • 2602.03537 • Published Feb 3 • 5
QuEST: Stable Training of LLMs with 1-Bit Weights and Activations Paper • 2502.05003 • Published Feb 7, 2025 • 44
Bridging the Gap Between Promise and Performance for Microscaling FP4 Quantization Paper • 2509.23202 • Published Sep 27, 2025 • 30
CAGE: Curvature-Aware Gradient Estimation For Accurate Quantization-Aware Training Paper • 2510.18784 • Published Oct 21, 2025 • 2
WUSH: Near-Optimal Adaptive Transforms for LLM Quantization Paper • 2512.00956 • Published Nov 30, 2025 • 23
MatGPTQ: Accurate and Efficient Post-Training Matryoshka Quantization Paper • 2602.03537 • Published Feb 3 • 5 • 1
MatGPTQ: Accurate and Efficient Post-Training Matryoshka Quantization Paper • 2602.03537 • Published Feb 3 • 5