Article Why Coding Agents Can’t Replace ML Systems Engineers (Yet) AmberLJC • Jan 28 • 2
Decoding ML Decision: An Agentic Reasoning Framework for Large-Scale Ranking System Paper • 2602.18640 • Published Feb 20 • 9
Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models Paper • 2601.22060 • Published Jan 29 • 155
BitDance: Scaling Autoregressive Generative Models with Binary Tokens Paper • 2602.14041 • Published Feb 15 • 53
Efficient Autoregressive Video Diffusion with Dummy Head Paper • 2601.20499 • Published Jan 28 • 8
Quantifying the Gap between Understanding and Generation within Unified Multimodal Models Paper • 2602.02140 • Published Feb 2 • 12
SpatiaLab: Can Vision-Language Models Perform Spatial Reasoning in the Wild? Paper • 2602.03916 • Published Feb 3 • 11
Horizon-LM: A RAM-Centric Architecture for LLM Training Paper • 2602.04816 • Published Feb 4 • 20
Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives Paper • 2601.20833 • Published Jan 28 • 183
Executable Code Actions Elicit Better LLM Agents Paper • 2402.01030 • Published Feb 1, 2024 • 194
QuantiPhy: A Quantitative Benchmark Evaluating Physical Reasoning Abilities of Vision-Language Models Paper • 2512.19526 • Published Dec 22, 2025 • 12
Running 3.85k The Ultra-Scale Playbook The ultimate guide to training LLM on large GPU Clusters
Cosmos-Tokenizer Collection A suite of image and video tokenizers • 12 items • Updated 9 days ago • 44
VGGHeads: A Large-Scale Synthetic Dataset for 3D Human Heads Paper • 2407.18245 • Published Jul 25, 2024 • 12
dima806/facial_emotions_image_detection Image Classification • 85.8M • Updated Oct 19, 2024 • 55.7k • 123