# Qwen 2.5 0.5B Instruct (GGUF Quantized)
This repository contains a GGUF-quantized version of the Qwen 2.5 0.5B Instruct model, an ultra-lightweight small language model (SLM) designed for edge devices, mobile phones, and IoT applications.
- **Model Creator:** Qwen Team (Alibaba Cloud)
- **Quantized By:** Md Habibur Rahman (Aasif)
- **Quantization Format:** GGUF (Q4_0)
- **Target Devices:** Android, Raspberry Pi, low-end laptops
## ⚡ Performance
The model runs quickly on modest hardware and has a very small memory footprint.
| Metric | Value |
|---|---|
| Model Size | ~350 MB |
| RAM Required | < 1 GB |
| Parameters | 0.5 Billion |
| Speed (GPU) | 100+ tokens/sec (estimated) |
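The ~350 MB figure in the table is plausible from first principles. A Q4_0 block stores 32 weights as 16 bytes of 4-bit quants plus a 2-byte fp16 scale, i.e. 18 bytes per 32 weights (4.5 bits/weight on average). A rough sketch of the arithmetic (the parameter count is taken as a round 0.5B; the real file is somewhat larger because embeddings and norm tensors are kept at higher precision):

```python
# Back-of-the-envelope check of the Q4_0 model-size claim.
# Q4_0: 32 weights per block = 16 bytes of 4-bit quants + 2-byte fp16 scale
# = 18 bytes / 32 weights = 4.5 bits per weight on average.
params = 0.5e9            # assumed round parameter count
bits_per_weight = 4.5
size_mb = params * bits_per_weight / 8 / 1e6
print(round(size_mb))     # ~281 MB for the quantized weights alone
```

The gap to the ~350 MB file size comes from tensors stored at higher precision and file metadata.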
## 🚀 Usage Code
```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download the quantized GGUF file from the Hub
model_path = hf_hub_download(
    repo_id="Habibur2/Qwen2.5-0.5B-GGUF",
    filename="qwen-2.5-0.5b-q4_0.gguf",
)

# n_gpu_layers=-1 offloads all layers to the GPU; set it to 0 for CPU-only
llm = Llama(model_path=model_path, n_ctx=1024, n_gpu_layers=-1)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a hello world code in Python."}]
)
print(response["choices"][0]["message"]["content"])
```
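`create_chat_completion` applies the chat template for you. If you instead call the lower-level completion API, the prompt must follow Qwen 2.5's ChatML format. A minimal sketch of that formatting (the helper function name is illustrative, not part of any library):

```python
def format_chatml(messages):
    # Qwen 2.5 uses the ChatML template: each turn is wrapped in
    # <|im_start|>role ... <|im_end|> markers, and the prompt ends by
    # opening the assistant turn for the model to complete.
    prompt = ""
    for m in messages:
        prompt += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    prompt += "<|im_start|>assistant\n"
    return prompt

prompt = format_chatml([{"role": "user", "content": "Hello"}])
print(prompt)
```

You would then pass this string to `llm(prompt, ...)` instead of a messages list.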