See What I Mean: Aligning Vision and Language Representations for Video Fine-grained Object Understanding Paper • 2605.18018 • Published 2 days ago • 18
Mutual Forcing: Dual-Mode Self-Evolution for Fast Autoregressive Audio-Video Character Generation Paper • 2604.25819 • Published 22 days ago • 17
Mixture of Style Experts for Diverse Image Stylization Paper • 2603.16649 • Published Mar 17 • 3
The Consistency Critic: Correcting Inconsistencies in Generated Images via Reference-Guided Attentive Alignment Paper • 2511.20614 • Published Nov 25, 2025 • 38
AgeBooth: Controllable Facial Aging and Rejuvenation via Diffusion Models Paper • 2510.05715 • Published Oct 7, 2025 • 2
LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs Paper • 2506.21862 • Published Jun 27, 2025 • 36