BOLT: Boost Large Vision-Language Model Without Training for Long-form Video Understanding Paper • 2503.21483 • Published Mar 27, 2025
GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents Paper • 2604.07429 • Published 12 days ago