Geometry-Guided Reinforcement Learning for Multi-view Consistent 3D Scene Editing Paper • 2603.03143 • Published 10 days ago • 133
From Scale to Speed: Adaptive Test-Time Scaling for Image Editing Paper • 2603.00141 • Published 17 days ago • 134
MobilityBench: A Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios Paper • 2602.22638 • Published 15 days ago • 106
Code2World: A GUI World Model via Renderable Code Generation Paper • 2602.09856 • Published about 1 month ago • 199
Everything in Its Place: Benchmarking Spatial Intelligence of Text-to-Image Models Paper • 2601.20354 • Published Jan 28 • 111
Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation Paper • 2601.20614 • Published Jan 28 • 120
Urban Socio-Semantic Segmentation with Vision-Language Reasoning Paper • 2601.10477 • Published Jan 15 • 155
Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization Paper • 2601.05432 • Published Jan 8 • 169
Unleashing the Potential of Multimodal LLMs for Zero-Shot Spatio-Temporal Video Grounding Paper • 2509.15178 • Published Sep 18, 2025 • 6
EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM Paper • 2412.09618 • Published Dec 12, 2024 • 21