GateMem: Benchmarking Memory Governance in Multi-Principal Shared-Memory Agents Paper • 2606.18829 • Published 9 days ago • 17
ENPIRE: Agentic Robot Policy Self-Improvement in the Real World Paper • 2606.19980 • Published 8 days ago • 14
Moebius: 0.2B Lightweight Image Inpainting Framework with 10B-Level Performance Paper • 2606.19195 • Published 9 days ago • 135
Guava: An Effective and Universal Harness for Embodied Manipulation Paper • 2606.18363 • Published 10 days ago • 28
STARE: Surprisal-Guided Token-Level Advantage Reweighting for Policy Entropy Stability Paper • 2606.19236 • Published 9 days ago • 12
AlloSpatial: Agentic Harness Framework for Spatial Reasoning in Foundation Models Paper • 2606.08952 • Published 18 days ago • 4
LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling Paper • 2606.18023 • Published 10 days ago • 204
Claw-SWE-Bench: A Benchmark for Evaluating OpenClaw-style Agent Harnesses on Coding Tasks Paper • 2606.12344 • Published 16 days ago • 68
InternVideo3: Agentify Foundation Models with Multimodal Contextual Reasoning Paper • 2606.12195 • Published 16 days ago • 23
From Chatbot to Digital Colleague: The Paradigm Shift Toward Persistent Autonomous AI Paper • 2606.14502 • Published 14 days ago • 109
Beyond Monolingual Deep Research: Evaluating Agents and Retrievers with Cross-Lingual BrowseComp-Plus Paper • 2606.15345 • Published 13 days ago • 16
GameCraft-Bench: Can Agents Build Playable Games End-to-End in a Real Game Engine? Paper • 2606.17861 • Published 10 days ago • 55
Ling and Ring 2.6 Technical Report: Efficient and Instant Agentic Intelligence at Trillion-Parameter Scale Paper • 2606.15079 • Published 13 days ago • 84
Evoflux: Inference-Time Evolution of Executable Tool Workflows for Compact Agents Paper • 2606.12674 • Published 16 days ago • 5