How to use Supertone/supertonic-2 with Supertonic:
from supertonic import TTS

tts = TTS(auto_download=True)
style = tts.get_voice_style(voice_name="M1")
text = "The train delay was announced at 4:45 PM on Wed, Apr 3, 2024 due to track maintenance."
wav, duration = tts.synthesize(text, voice_style=style)
tts.save_audio(wav, "output.wav")
Frequent dropouts during voice generation
This issue becomes more apparent when the sentences get longer. In a text of about 100-200 characters, the audio skips at least one or two characters. It also frequently jumps over or omits numbers, especially in the thousands or ten-thousands range.
Although the overall voice quality is clean and consistent, this skipping problem is quite severe. It seems highly probable, almost guaranteed, that a dropout will occur at least once per generation.
This is especially noticeable in my use case, which is Korean text.
Hello! Reading numbers can be challenging for the current model, as it does not use a dedicated text normalizer and the training data volume is not yet sufficient to handle these cases robustly. As a workaround, you can apply your own text normalization for numbers if needed. Also, the model can occasionally exhibit skip/repeat issues. We're aware of these limitations and are working on improvements. We hope to release an improved model as soon as possible.
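For anyone who wants to try the number-normalization workaround on Korean text, here is a minimal sketch. It is not part of Supertonic: `sino_korean` and `normalize_numbers` are hypothetical helper names, and the code assumes every number should be read with Sino-Korean numerals (일, 이, 삼, ...). Native Korean readings used with counters (개, 살, 시), decimals, dates, and phone numbers are not handled.

```python
import re

DIGITS = ["", "일", "이", "삼", "사", "오", "육", "칠", "팔", "구"]
SMALL_UNITS = ["", "십", "백", "천"]   # units inside one 4-digit group
BIG_UNITS = ["", "만", "억", "조"]     # one unit per 4-digit group


def _group_to_words(group: int) -> str:
    """Spell out a value in the range 1..9999."""
    out = ""
    for pos in range(3, -1, -1):
        digit = (group // 10 ** pos) % 10
        if digit == 0:
            continue
        # A leading 1 is dropped before 십/백/천 (e.g. 10 is 십, not 일십).
        if digit == 1 and pos > 0:
            out += SMALL_UNITS[pos]
        else:
            out += DIGITS[digit] + SMALL_UNITS[pos]
    return out


def sino_korean(n: int) -> str:
    """Hypothetical helper: spell a non-negative integer in Sino-Korean."""
    if n < 0 or n >= 10 ** 16:
        raise ValueError("only 0 <= n < 10**16 is supported in this sketch")
    if n == 0:
        return "영"
    parts = []
    group_index = 0
    while n > 0:
        n, group = divmod(n, 10000)
        if group:
            words = _group_to_words(group)
            # 10000 is usually read 만 rather than 일만.
            if group == 1 and group_index == 1:
                words = ""
            parts.append(words + BIG_UNITS[group_index])
        group_index += 1
    return "".join(reversed(parts))


# Matches comma-grouped numbers such as 12,500 as well as plain digit runs.
_NUMBER = re.compile(r"\d{1,3}(?:,\d{3})+|\d+")


def normalize_numbers(text: str) -> str:
    """Replace every integer in the text with its Sino-Korean spelling."""
    return _NUMBER.sub(
        lambda m: sino_korean(int(m.group().replace(",", ""))), text
    )


print(normalize_numbers("요금은 12,500원입니다."))
```

The normalized string can then be passed to `tts.synthesize(...)` in place of the raw text. Whether this reduces skipped numbers with this model has not been measured here, so treat it as something to try rather than a confirmed fix.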