Qwen3-TTS DLL + ONNX (Minimal, Single-File ONNX)
This Hugging Face repository provides a minimal runtime bundle for Qwen3-TTS:
- Rust DLL for audio preprocessing + tokenizer (BPE)
- ONNX models (single .onnx files with embedded weights)
- Minimal tokenizer files (config.json, vocab.json, merges.txt, tokenizer_config.json)
- Python sample that runs the full pipeline using ONNX Runtime
Important: ONNX Runtime is not bundled. Install onnxruntime (CPU) or onnxruntime-gpu.
Directory Layout
dist/dll_release/
  qwen3_tts_rust.dll
  qwen3_tts.h
  README_dll_release.txt
  README.md
  onnx_kv/        # 1.7B ONNX, embedded weights
  onnx_kv_06b/    # 0.6B ONNX, embedded weights (optional)
  models/
    Qwen3-TTS-12Hz-1.7B-Base/
      config.json
      vocab.json
      merges.txt
      tokenizer_config.json
    Qwen3-TTS-12Hz-0.6B-Base/
      config.json
      vocab.json
      merges.txt
      tokenizer_config.json
  examples/python_dll_call/
    run_pipeline.py
Quick Start (Python)
1. Install dependencies
python -m pip install numpy onnxruntime
For GPU:
python -m pip install numpy onnxruntime-gpu
2. Set DLL path
set QWEN3_TTS_DLL=.\qwen3_tts_rust.dll
3. Run (1.7B)
python examples\python_dll_call\run_pipeline.py ^
--onnx-dir .\onnx_kv ^
--model-dir .\models\Qwen3-TTS-12Hz-1.7B-Base ^
--ref-audio C:\path\to\ref.wav ^
--ref-text C:\path\to\ref.txt ^
--text "Hello world."
4. Run (0.6B)
python examples\python_dll_call\run_pipeline.py ^
--onnx-dir .\onnx_kv_06b ^
--model-dir .\models\Qwen3-TTS-12Hz-0.6B-Base ^
--ref-audio C:\path\to\ref.wav ^
--ref-text C:\path\to\ref.txt ^
--text "Hello world."
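The DLL lookup described in step 2 (and in Troubleshooting: set QWEN3_TTS_DLL or run from the bundle folder) can be sketched as follows. This is a minimal illustration, not the code in run_pipeline.py; the function name `resolve_dll_path` and its parameters are hypothetical.

```python
import os
from pathlib import Path

DLL_NAME = "qwen3_tts_rust.dll"  # file name taken from the directory layout


def resolve_dll_path(env=None, cwd=None):
    """Return the DLL path from QWEN3_TTS_DLL, else from the working directory.

    Hypothetical helper: the real script may resolve the path differently.
    """
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else Path(cwd)

    override = env.get("QWEN3_TTS_DLL")
    candidate = Path(override) if override else Path(DLL_NAME)
    if not candidate.is_absolute():
        # Relative paths (including the default name) are taken relative to cwd.
        candidate = cwd / candidate

    if not candidate.is_file():
        raise FileNotFoundError(
            f"{candidate} not found: set QWEN3_TTS_DLL or run from the bundle folder"
        )
    return candidate.resolve()
```

On Windows the resolved path could then be passed to `ctypes.CDLL(str(resolve_dll_path()))`. The exported function names are declared in qwen3_tts.h and are not reproduced here.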
CPU / GPU switching
- Default: CUDA if available, otherwise CPU.
- Force CPU:
python examples\python_dll_call\run_pipeline.py --device cpu ...
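The switching rule above (CUDA if available, otherwise CPU, with `--device cpu` forcing CPU) might look roughly like this in Python. It is a sketch under assumptions: `select_providers` is a hypothetical name, and how run_pipeline.py actually chooses providers is not shown in this README. The provider strings are the standard ONNX Runtime names.

```python
CUDA = "CUDAExecutionProvider"
CPU = "CPUExecutionProvider"


def select_providers(device, available):
    """Pick an ONNX Runtime provider list.

    device: None for the default behaviour, or "cpu" to force CPU.
    available: provider names reported by the installed ONNX Runtime.
    """
    if device == "cpu":
        return [CPU]
    if CUDA in available:
        # Keep CPU as a fallback for operators CUDA cannot run.
        return [CUDA, CPU]
    return [CPU]


# Usage with ONNX Runtime (requires onnxruntime or onnxruntime-gpu):
#   import onnxruntime as ort
#   providers = select_providers(None, ort.get_available_providers())
#   session = ort.InferenceSession("path/to/model.onnx", providers=providers)
# "path/to/model.onnx" is a placeholder, not a file name from this bundle.
```

Taking the available providers as an argument keeps the selection logic testable on machines without a GPU.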
Required Files
Required:
- qwen3_tts_rust.dll
- onnx_kv/*.onnx (or onnx_kv_06b/*.onnx)
- models/<model>/{config.json,vocab.json,merges.txt,tokenizer_config.json}
- examples/python_dll_call/run_pipeline.py
Optional:
- qwen3_tts.h (C/C++ bindings)
- onnx_kv_06b/ (only for 0.6B)
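A quick preflight check for the required files could be written as below. This is an illustrative sketch: `missing_required_files` is a hypothetical helper that is not part of the bundle, and it only checks that the files exist, not that they are valid.

```python
from pathlib import Path

TOKENIZER_FILES = ("config.json", "vocab.json", "merges.txt", "tokenizer_config.json")


def missing_required_files(root, onnx_dir="onnx_kv", model="Qwen3-TTS-12Hz-1.7B-Base"):
    """Return the required paths (relative to root) that are absent."""
    root = Path(root)
    missing = []

    if not (root / "qwen3_tts_rust.dll").is_file():
        missing.append("qwen3_tts_rust.dll")

    # At least one .onnx file must be present in the chosen ONNX directory.
    onnx_path = root / onnx_dir
    if not (onnx_path.is_dir() and any(onnx_path.glob("*.onnx"))):
        missing.append(f"{onnx_dir}/*.onnx")

    for name in TOKENIZER_FILES:
        if not (root / "models" / model / name).is_file():
            missing.append(f"models/{model}/{name}")

    if not (root / "examples" / "python_dll_call" / "run_pipeline.py").is_file():
        missing.append("examples/python_dll_call/run_pipeline.py")

    return missing
```

For the 0.6B model, call it with `onnx_dir="onnx_kv_06b"` and `model="Qwen3-TTS-12Hz-0.6B-Base"`.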
Notes
- ONNX files are single-file (no .onnx.data, no onnx__MatMul_* shards).
- Samples are not included. Provide your own reference audio/text.
- First load can be slow due to large model size.
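To confirm that an ONNX directory really is single-file, one could look for the external-data patterns named above. The helper below is a hypothetical sketch that matches only those two file-name patterns; it does not inspect the ONNX files themselves.

```python
from pathlib import Path


def find_external_data_files(onnx_dir):
    """Return names of external-weight files found next to the .onnx models."""
    onnx_dir = Path(onnx_dir)
    hits = list(onnx_dir.glob("*.onnx.data")) + list(onnx_dir.glob("onnx__MatMul_*"))
    return sorted(p.name for p in hits)
```

An empty result for onnx_kv/ or onnx_kv_06b/ is what this bundle is expected to produce.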
Troubleshooting
- DLL not found: set QWEN3_TTS_DLL or run from this folder.
- CUDAExecutionProvider not available: install onnxruntime-gpu or use --device cpu.
- InvalidArgument / input shape: ensure reference audio is mono. The script will resample.
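For the mono requirement in the last item, a reference file can be downmixed before it is passed to `--ref-audio`. The sketch below uses only the Python standard library and assumes 16-bit PCM WAV input; `write_mono_copy` is a hypothetical helper, and other sample formats would need a different approach. It does not resample, since the script is stated to handle that.

```python
import sys
import wave
from array import array


def write_mono_copy(src_path, dst_path):
    """Write a mono copy of a 16-bit PCM WAV and return the source channel count."""
    with wave.open(str(src_path), "rb") as src:
        channels = src.getnchannels()
        if src.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        rate = src.getframerate()
        raw = src.readframes(src.getnframes())

    samples = array("h")  # signed 16-bit integers
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian

    if channels > 1:
        # Average the interleaved channels of each frame into one sample.
        samples = array(
            "h",
            (sum(samples[i:i + channels]) // channels
             for i in range(0, len(samples), channels)),
        )

    if sys.byteorder == "big":
        samples.byteswap()
    with wave.open(str(dst_path), "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(rate)
        dst.writeframes(samples.tobytes())
    return channels
```

If the returned channel count is already 1, the copy is identical in content to the source and the original file can be used directly.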
License
Apache-2.0. This bundle is derived from Qwen3-TTS: https://github.com/QwenLM/Qwen3-TTS