Any plans to make the streaming model usable without vLLM?

#6
by nenad1002 - opened

Currently, Qwen3-ASR "streaming" is available only through vLLM, while the HF / Torch interface is offline-only.
Is the streaming variant a separately trained model, or the same weights with a different runtime graph?

Also, is there any plan to release the streaming model so that it can be used with non-vLLM runtimes (e.g. through a standalone PyTorch API)?
