What impact has quantization had on model performance / ability?

#4
by spanspek - opened

The README for this quantized model still shows benchmark results of the full un-quantized model

Generally speaking, models quantized to FP8 lose almost no accuracy, but beyond that we know performance starts to degrade
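A toy sketch of why that happens: the fewer bits a quantization scheme uses, the coarser its grid of representable values, so round-off error grows. This is only an illustration with symmetric uniform integer quantization on random weights, not the FP8 scheme any real model uses:

```python
import numpy as np

def quantize(weights, num_bits):
    # Toy symmetric uniform quantization to num_bits.
    # Real FP8/INT4 schemes use per-channel scales, calibration, etc.
    levels = 2 ** (num_bits - 1) - 1
    scale = np.max(np.abs(weights)) / levels
    return np.round(weights / scale) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=10_000)

for bits in (8, 4, 2):
    mse = np.mean((w - quantize(w, bits)) ** 2)
    print(f"{bits}-bit quantization MSE: {mse:.6f}")
```

The mean-squared error roughly quadruples each time a bit is removed, which is why 8-bit formats are usually near-lossless while 4-bit and below start to cost measurable accuracy.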

Are benchmarks available for this model at this specific level of quantization?

Thank you
