"Not all quantized models perform well." Serving frameworks: Ollama on an NVIDIA GPU, llama.cpp on CPU with AVX & AMX.
v1k
xbruce22
AI & ML interests
None yet
Recent Activity
liked a model 1 day ago: antirez/deepseek-v4-gguf
liked a model 5 days ago: SulphurAI/Sulphur-2-base
liked a model 14 days ago: z-lab/Qwen3-4B-DFlash-b16