VideoDetective: Clue Hunting via both Extrinsic Query and Intrinsic Relevance for Long Video Understanding
Paper • 2603.22285 • Published 2 days ago • 45

villa-X: Enhancing Latent Action Modeling in Vision-Language-Action Models
Paper • 2507.23682 • Published Jul 31, 2025 • 24