Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning Paper • 2505.03318 • Published May 6, 2025 • 93
Unified Reward Model for Multimodal Understanding and Generation Paper • 2503.05236 • Published Mar 7, 2025 • 124
Think-Then-Generate: Reasoning-Aware Text-to-Image Diffusion with LLM Encoders Paper • 2601.10332 • Published Jan 15 • 30
Enhancing Spatial Understanding in Image Generation via Reward Modeling Paper • 2602.24233 • Published 10 days ago • 50
UniG2U-Bench: Do Unified Models Advance Multimodal Understanding? Paper • 2603.03241 • Published 6 days ago • 80
Beyond Language Modeling: An Exploration of Multimodal Pretraining Paper • 2603.03276 • Published 6 days ago • 77
UniT: Unified Multimodal Chain-of-Thought Test-time Scaling Paper • 2602.12279 • Published 25 days ago • 20
CoF-T2I: Video Models as Pure Visual Reasoners for Text-to-Image Generation Paper • 2601.10061 • Published Jan 15 • 31
Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense Paper • 2510.07242 • Published Oct 8, 2025 • 30
LUMINA: Detecting Hallucinations in RAG System with Context-Knowledge Signals Paper • 2509.21875 • Published Sep 26, 2025 • 10
Understanding Language Prior of LVLMs by Contrasting Chain-of-Embedding Paper • 2509.23050 • Published Sep 27, 2025 • 15