view article Article Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective LinkedIn • Jan 27 • 77
view article Article Nanochat-Ascend: Training Karpathy's Nanochat on Ascend NPU (Part 1) leideng • 4 days ago • 1
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification Paper • 2508.05629 • Published Aug 7, 2025 • 190
view article Article TRL v1.0: Post-Training Library Built to Move with the Field +2 qgallouedec, stevhliu, pcuenq, sergiopaniego • Mar 31 • 54
view article Article Fine-Tuning Your First Large Language Model (LLM) with PyTorch and Hugging Face dvgodoy • Feb 11, 2025 • 123
Transformers.js V4 demos Collection A collection of demos built with Transformers.js V4 • 24 items • Updated Apr 16 • 59
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published Jun 26, 2025 • 78
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25, 2024 • 105
NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 23 items • Updated 2 days ago • 319
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated May 5, 2025 • 309