NanoFlow: Scalable Normalizing Flows with Sublinear Parameter Complexity Paper • 2006.06280 • Published Jun 11, 2020
Low Frame-rate Speech Codec: a Codec Designed for Fast High-quality Speech LLM Training and Inference Paper • 2409.12117 • Published Sep 18, 2024
Edit-A-Video: Single Video Editing with Object-Aware Consistency Paper • 2303.07945 • Published Mar 14, 2023
VoiceTailor: Lightweight Plug-In Adapter for Diffusion-Based Personalized Text-to-Speech Paper • 2408.14739 • Published Aug 27, 2024
Audio Flamingo 3: Advancing Audio Intelligence with Fully Open Large Audio Language Models Paper • 2507.08128 • Published Jul 10, 2025 • 13
Music Flamingo: Scaling Music Understanding in Audio Language Models Paper • 2511.10289 • Published Nov 13, 2025 • 18
UALM: Unified Audio Language Model for Understanding, Generation and Reasoning Paper • 2510.12000 • Published Oct 13, 2025 • 1
ETTA: Elucidating the Design Space of Text-to-Audio Models Paper • 2412.19351 • Published Dec 26, 2024
Negative-Guided Subject Fidelity Optimization for Zero-Shot Subject-Driven Generation Paper • 2506.03621 • Published Jun 4, 2025 • 22
Cosmos Collection ⚠️ This collection is archived. https://huggingface.co/collections/nvidia/nvidia-cosmos-2 • 14 items • Updated 3 days ago • 300
PHI-S: Distribution Balancing for Label-Free Multi-Teacher Distillation Paper • 2410.01680 • Published Oct 2, 2024 • 34