One-Step Gradient Delay is Not a Barrier for Large-Scale Asynchronous Pipeline Parallel LLM Pretraining Paper • 2606.30634 • Published 3 days ago • 20
Reasoning Shift: How Context Silently Shortens LLM Reasoning Paper • 2604.01161 • Published Apr 1 • 32