view article Article Aligning to What? Rethinking Agent Generalization in MiniMax M2 MiniMax-AI • Oct 30, 2025 • 43
view article Article Why Did MiniMax M2 End Up as a Full Attention Model? MiniMax-AI • Oct 30, 2025 • 80
RWKV World v3 Corpus Collection RWKV World v3.0 Dataset for training RWKV-7 Goose World v3 models • 64 items • Updated Mar 9, 2025 • 3
3D Gaussian Splatting for Real-Time Radiance Field Rendering Paper • 2308.04079 • Published Aug 8, 2023 • 202
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs Paper • 2501.18585 • Published Jan 30, 2025 • 61
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling Paper • 2412.05271 • Published Dec 6, 2024 • 161
view article Article Fine-tuning Llama 2 70B using PyTorch FSDP +2 smangrul, sgugger, lewtun, philschmid • Sep 13, 2023 • 32