AlexWortega/ml-intern-v4-100m-tinystories-20260512-1721 Text Generation • 0.1B • Updated 25 days ago • 3.43k • 3
MatryoshkaLoRA: Learning Accurate Hierarchical Low-Rank Representations for LLM Fine-Tuning Paper • 2605.07850 • Published 30 days ago • 18
Reasoning Shift: How Context Silently Shortens LLM Reasoning Paper • 2604.01161 • Published Apr 1 • 32
Article Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective LinkedIn • Jan 27 • 76
LK Losses: Direct Acceptance Rate Optimization for Speculative Decoding Paper • 2602.23881 • Published Feb 27 • 18
MatGPTQ: Accurate and Efficient Post-Training Matryoshka Quantization Paper • 2602.03537 • Published Feb 3 • 5
DASH: Faster Shampoo via Batched Block Preconditioning and Efficient Inverse-Root Solvers Paper • 2602.02016 • Published Feb 2 • 13
Quartet II: Accurate LLM Pre-Training in NVFP4 by Improved Unbiased Gradient Estimation Paper • 2601.22813 • Published Jan 30 • 62
WUSH: Near-Optimal Adaptive Transforms for LLM Quantization Paper • 2512.00956 • Published Nov 30, 2025 • 23