OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents Paper • 2606.02031 • Published 3 days ago • 12
K-BrowseComp: A Web Browsing Agent Benchmark Grounded in Korean Contexts Paper • 2606.02404 • Published 3 days ago • 50
NVIDIA OmniDreams: Real-Time Generative World Model for Closed-Loop Autonomous Vehicle Simulation Paper • 2606.03159 • Published 1 day ago • 7
PlatonicNav: Unveiling Semantic Correspondence in Navigation with Platonic Topological Maps Paper • 2606.01788 • Published 3 days ago • 7
VisualThink-VLA: Visual Intermediate Reasoning for Effective and Low-Latency Vision-Language-Action Policies Paper • 2605.30011 • Published 7 days ago • 8
WorldMemArena: Evaluating Multimodal Agent Memory Through Action-World Interaction Paper • 2605.29341 • Published 7 days ago • 14
DynaFLIP: Rethinking Robotics Perception via Tri-Modal-Dynamics Guided Representation Paper • 2605.30350 • Published 7 days ago • 10
PANDO: Efficient Multimodal AI Agents via Online Skill Distillation Paper • 2605.24785 • Published 9 days ago • 9
When Cloud Agents Meet Device Agents: Lessons from Hybrid Multi-Agent Systems Paper • 2605.30102 • Published 7 days ago • 13
AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security Paper • 2605.29801 • Published 7 days ago • 139
Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments Paper • 2605.30280 • Published 7 days ago • 132
minWM: A Full-Stack Open-Source Framework for Real-Time Interactive Video World Models Paper • 2605.30263 • Published 7 days ago • 54
AutoScientists: Self-Organizing Agent Teams for Long-Running Scientific Experimentation Paper • 2605.28655 • Published 8 days ago • 11
Your Agents Are Aging Too: Agent Lifespan Engineering for Deployed Systems Paper • 2605.26302 • Published 10 days ago • 31
From Pixels to Words -- Towards Native One-Vision Models at Scale Paper • 2605.28820 • Published 8 days ago • 70