Any plan to open the streaming model to run without vLLM?
#6, opened by nenad1002
Currently, Qwen3-ASR "streaming" is only available through vLLM, while the HF / Torch interface is offline-only.
Is the streaming variant a different trained model, or the same weights with a different runtime graph?
And is there any plan to release the streaming model so that it can be used with non-vLLM runtimes (e.g. directly in torch through a standalone PyTorch API)?