Zhongwei Zhang

zzwustc

zzw-ustc

AI & ML interests

AIGC

Recent Activity

liked a Space 9 days ago

facebook/vggt-omega

upvoted a paper 11 days ago

SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer

upvoted a paper 11 days ago

AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation

upvoted 4 papers 11 days ago

SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer

Paper • 2605.15178 • Published 25 days ago • 86

AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation

Paper • 2605.13724 • Published 26 days ago • 101

LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation

Paper • 2605.18739 • Published 21 days ago • 112

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture

Paper • 2605.12500 • Published 27 days ago • 191

upvoted 3 papers 3 months ago

upvoted an article 3 months ago

Article

Designing Positional Encoding (设计位置编码)

FL33TW00D-HF • Nov 25, 2024 • 27

upvoted a paper 4 months ago

Code2World: A GUI World Model via Renderable Code Generation

Paper • 2602.09856 • Published Feb 10 • 201

upvoted 3 papers 5 months ago

Urban Socio-Semantic Segmentation with Vision-Language Reasoning

Paper • 2601.10477 • Published Jan 15 • 155

Region-Constraint In-Context Generation for Instructional Video Editing

Paper • 2512.17650 • Published Dec 19, 2025 • 53

SAM Audio: Segment Anything in Audio

Paper • 2512.18099 • Published Dec 19, 2025 • 25

upvoted a paper 7 months ago

FARMER: Flow AutoRegressive Transformer over Pixels

Paper • 2510.23588 • Published Oct 27, 2025 • 59

upvoted a paper 8 months ago

Ovi: Twin Backbone Cross-Modal Fusion for Audio-Video Generation

Paper • 2510.01284 • Published Sep 30, 2025 • 37

upvoted an article 10 months ago

Article

You could have designed state of the art positional encoding

FL33TW00D-HF • Nov 25, 2024 • 482

upvoted 2 papers about 1 year ago

JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization

Paper • 2503.23377 • Published Mar 30, 2025 • 57

MotionPro: A Precise Motion Controller for Image-to-Video Generation

Paper • 2505.20287 • Published May 26, 2025 • 20

upvoted 3 papers over 1 year ago

Wonderland: Navigating 3D Scenes from a Single Image

Paper • 2412.12091 • Published Dec 16, 2024 • 16

Stable Flow: Vital Layers for Training-Free Image Editing

Paper • 2411.14430 • Published Nov 21, 2024 • 22

GenXD: Generating Any 3D and 4D Scenes

Paper • 2411.02319 • Published Nov 4, 2024 • 20