MOSS-Audio Collection • An open-source audio understanding model supporting speech recognition, environmental sound analysis, music understanding, time-aware QA, and complex • 7 items • Updated 24 days ago • 55
Omnilingual ASR Media Transcription 🌍 • Paused • 240 • Transcribe audio/video files into text instantly
microsoft/Phi-4-multimodal-instruct • Automatic Speech Recognition • 6B • Updated Dec 10, 2025 • 516k • 1.6k