lightvl is a lightweight quantization toolkit for Vision-Language Models (VLMs). It supports FP8, INT8, and FP8-Block quantization, integrates with vLLM for high-throughput inference, and works with Qwen3-VL, Qwen3.5, InternVL-Chat, and Gemma-4 models.
Quantize your model in two steps:
1. pip3 install lightvl
2. lightvl YOUR_HF_MODEL_PATH
Replace YOUR_HF_MODEL_PATH with the path to your Hugging Face model.
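
For intuition about the INT8 and block-wise modes listed above, here is a minimal pure-Python sketch of symmetric INT8 quantization with one absmax scale per block. It illustrates the general technique only and is not lightvl's implementation; the names `quantize_blockwise`, `dequantize_blockwise`, and the `block_size` parameter are hypothetical and chosen for this example. FP8-Block presumably applies a similar per-block scaling idea with an FP8 target type, but that is an assumption, not something stated by the toolkit.

```python
from typing import List, Tuple

INT8_MAX = 127  # symmetric INT8 range used here: [-127, 127]


def quantize_blockwise(weights: List[float], block_size: int) -> Tuple[List[int], List[float]]:
    """Quantize a flat list of weights to INT8, one absmax scale per block.

    Illustrative only: names and layout are assumptions, not lightvl's API.
    """
    q_values: List[int] = []
    scales: List[float] = []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        absmax = max(abs(w) for w in block)
        # An all-zero block gets scale 1.0 to avoid dividing by zero.
        scale = absmax / INT8_MAX if absmax > 0 else 1.0
        scales.append(scale)
        for w in block:
            q = round(w / scale)
            # Clamp defensively in case of floating-point overshoot.
            q_values.append(max(-INT8_MAX, min(INT8_MAX, q)))
    return q_values, scales


def dequantize_blockwise(q_values: List[int], scales: List[float], block_size: int) -> List[float]:
    """Map INT8 values back to floats using the scale of the block each value belongs to."""
    return [q * scales[i // block_size] for i, q in enumerate(q_values)]


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0, 10.0, -5.0, 2.5, 0.0]
    q, s = quantize_blockwise(weights, block_size=4)
    restored = dequantize_blockwise(q, s, block_size=4)
    max_err = max(abs(a - b) for a, b in zip(weights, restored))
    print("quantized:", q)
    print("scales:", s)
    print("max reconstruction error:", max_err)
```

Using a separate scale per block keeps a single large weight (such as 10.0 above) from forcing a coarse step size onto small weights in other blocks, which is the usual motivation for block-wise schemes.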