MOSS-TTSD: Text to Spoken Dialogue Generation
Transcribe or translate audio and YouTube videos to text
Generate personalized photos with your face