Beyond Language Modeling: An Exploration of Multimodal Pretraining • Paper • 2603.03276 • Published 18 days ago • 99
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders • Paper • 2603.06569 • Published 15 days ago • 114
Grounding World Simulation Models in a Real-World Metropolis • Paper • 2603.15583 • Published 5 days ago • 140
Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections • Paper • 2603.12180 • Published 9 days ago • 63
SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale • Paper • 2602.23866 • Published 23 days ago • 88
UniG2U-Bench: Do Unified Models Advance Multimodal Understanding? • Paper • 2603.03241 • Published 18 days ago • 86
EnterpriseOps-Gym: Environments and Evaluations for Stateful Agentic Planning and Tool Use in Enterprise Settings • Paper • 2603.13594 • Published 8 days ago • 141
Thinking in Uncertainty: Mitigating Hallucinations in MLRMs with Latent Entropy-Aware Decoding • Paper • 2603.13366 • Published 13 days ago • 91