Would you consider releasing an FP8-quantized model?

#1
by a463724055 - opened

The weights are too large. (Cry)

You can quantize it yourself
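For reference, the core idea behind per-tensor FP8 (E4M3) quantization is just rescaling weights into the format's dynamic range (largest finite E4M3 value is 448). A minimal pure-Python sketch of the scaling step only — it deliberately skips the actual FP8 rounding and packing, which a real workflow would get from a library such as optimum-quanto or torchao:

```python
E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3


def quantize_per_tensor(weights):
    """Scale a flat list of weights into the FP8 E4M3 range.

    Returns the scaled values plus the per-tensor scale needed
    to recover the originals. Rounding to the FP8 grid is omitted;
    this only illustrates the dynamic-range mapping.
    """
    amax = max(abs(w) for w in weights)
    scale = amax / E4M3_MAX if amax > 0 else 1.0
    q = [max(-E4M3_MAX, min(E4M3_MAX, w / scale)) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map scaled values back to the original range."""
    return [v * scale for v in q]
```

With this scheme the largest-magnitude weight lands exactly at ±448, and dequantization recovers the inputs up to floating-point rounding (real FP8 adds quantization error on top from the coarse mantissa).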


OK, enabling CPU offload allows it to run with 16 GB of VRAM.
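For anyone wondering why offload helps: the model's weights stay in CPU RAM, and only one block at a time is copied to the GPU, run, and evicted, so peak VRAM is roughly one block plus activations instead of the whole model. A toy sketch of that idea, with plain Python lists standing in for tensors (in diffusers, `pipe.enable_model_cpu_offload()` automates this per component):

```python
def run_with_offload(blocks, x):
    """Run a sequence of 'blocks' while keeping only one resident.

    blocks: list of weight lists, all held in CPU RAM.
    Returns the final output and the peak number of weights that
    were ever resident on the 'device' at once.
    """
    peak = 0
    for weights in blocks:
        resident = list(weights)   # stand-in for the H2D copy of this block
        peak = max(peak, len(resident))
        x = x + sum(resident)      # stand-in for the block's compute
        resident = []              # freed before the next block loads
    return x, peak
```

The trade-off is extra host-to-device transfer time per step, which is why offloaded runs are slower but fit in much less VRAM.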

a463724055 changed discussion status to closed
