Considering releasing the FP8 quantization model?
#1 by a463724055 - opened
The weights are too large. (Cry)
You can quantize it yourself
OK, enabling CPU offload lets it run on 16 GB of VRAM.
a463724055 changed discussion status to closed