Article NEO-unify: Building Native Multimodal Unified Models End to End • 15 days ago • 102
SpA2V: Harnessing Spatial Auditory Cues for Audio-driven Spatially-aware Video Generation Paper • 2508.00782 • Published Aug 1, 2025 • 7
Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference Paper • 2508.02193 • Published Aug 4, 2025 • 138
Skywork UniPic: Unified Autoregressive Modeling for Visual Understanding and Generation Paper • 2508.03320 • Published Aug 5, 2025 • 63
LongVie: Multimodal-Guided Controllable Ultra-Long Video Generation Paper • 2508.03694 • Published Aug 5, 2025 • 52
TALE: Training-free Cross-domain Image Composition via Adaptive Latent Manipulation and Energy-guided Optimization Paper • 2408.03637 • Published Aug 7, 2024