- UniDDT: Unifying Multimodal Understanding and Generation with Decoupled Diffusion Transformer
  Paper • 2606.16255 • Published • 14
- Qwen-RobotWorld Technical Report: Unifying Embodied World Modeling through Language-Conditioned Video Generation
  Paper • 2606.17030 • Published • 28
- ImageWAM: Do World Action Models Really Need Video Generation, or Just Image Editing?
  Paper • 2606.19531 • Published • 15
Muhammadinam
INAM2004
AI & ML interests
None yet
Recent Activity
updated a collection 2 days ago
Research for mutimodle modle
Organizations
None yet