yzy-python-0.5b π
Lightweight Python-focused language model (0.5B parameters) fine-tuned for code generation and instruction-following.
Optimized for:
- Python code generation
- scripting help
- small coding copilots
- local inference
- experimentation
- hackathons
Base model: Qwen2-0.5B-Instruct
Fine-tuning method: QLoRA (4-bit)
Dataset style: Alpaca-format Python instructions
Demo
Transformers usage
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "SamirXR/yzy-python-0.5b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto"
)

prompt = "Write a Python function to reverse a string"
# device_map="auto" may place the model on a GPU, so move the inputs to the same device
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=200
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
4-bit inference (recommended)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "SamirXR/yzy-python-0.5b"

# 4-bit NF4 quantization with double quantization and float16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

prompt = "Write a Python function for fibonacci numbers"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,  # temperature and top_p only take effect when sampling is enabled
    temperature=0.7,
    top_p=0.9
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Gradio Chatbot Demo
import torch
import gradio as gr
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_NAME = "SamirXR/yzy-python-0.5b"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token

def generate_code(instruction, history):
    prompt = f"### Instruction:\n{instruction}\n\n### Response:\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=256,
            do_sample=True,
            temperature=0.7,
            top_p=0.9,
            repetition_penalty=1.1,
            pad_token_id=tokenizer.eos_token_id
        )

    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    response = response.split("### Response:\n")[-1].strip()
    return response

demo = gr.ChatInterface(
    fn=generate_code,
    title="yzy-python-0.5b Chatbot",
    description="Python coding assistant (QLoRA fine-tuned Qwen2-0.5B)",
    examples=[
        "Write a function to calculate fibonacci numbers",
        "Create a Python class for a linked list",
        "Reverse a string in Python"
    ],
)

demo.launch(share=True)
Training Details
Base model: Qwen/Qwen2-0.5B-Instruct
Dataset: iamtarun/python_code_instructions_18k_alpaca
Format used during training:
### Instruction:
<task>
### Response:
<answer>
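The template above can also be applied at inference time, which is what the Gradio demo does when it builds its prompt. Below is a minimal sketch. It assumes that a single blank line separates the two sections, matching the prompt string in the Gradio demo. The helper names `build_prompt` and `extract_response` are hypothetical and not part of the model repository.

```python
# Hypothetical helpers (not part of the model repo) for the Alpaca-style template.
INSTRUCTION_HEADER = "### Instruction:\n"
RESPONSE_HEADER = "### Response:\n"


def build_prompt(instruction: str) -> str:
    """Wrap a task description in the instruction/response template."""
    # Assumption: one blank line separates the sections, as in the Gradio demo.
    return f"{INSTRUCTION_HEADER}{instruction.strip()}\n\n{RESPONSE_HEADER}"


def extract_response(generated_text: str) -> str:
    """Return only the text after the last response header."""
    return generated_text.split(RESPONSE_HEADER)[-1].strip()


prompt = build_prompt("Write a Python function to reverse a string")
print(prompt)
```

The string returned by `build_prompt` can be passed to the tokenizer in place of a raw prompt, and `extract_response` can be applied to the decoded output to drop the echoed instruction.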
Training method: QLoRA (4-bit NF4 quantization)
Key parameters:
- LoRA rank: 8
- alpha: 16
- dropout: 0.05
- epochs: 2
- learning rate: 2e-4
- context length: 512
- optimizer: paged_adamw_8bit
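As a rough illustration, the listed values could map onto `peft.LoraConfig` and `transformers.TrainingArguments` keyword arguments as sketched below. This is not the author's training script. The `task_type` value, the `output_dir` name, and the `build_configs` helper are assumptions, and the target modules, batch size, and scheduler are not given in this card, so they are left at library defaults.

```python
# Hyperparameters copied from the list above; everything else is an assumption.
LORA_KWARGS = {
    "r": 8,
    "lora_alpha": 16,
    "lora_dropout": 0.05,
    "task_type": "CAUSAL_LM",  # assumption: not stated in the card
}
TRAINING_KWARGS = {
    "num_train_epochs": 2,
    "learning_rate": 2e-4,
    "optim": "paged_adamw_8bit",
}
# Context length; applied when tokenizing, not by TrainingArguments.
MAX_SEQ_LENGTH = 512


def build_configs(output_dir: str = "yzy-python-0.5b-lora"):
    """Create the config objects (requires peft and transformers)."""
    from peft import LoraConfig
    from transformers import TrainingArguments

    lora_config = LoraConfig(**LORA_KWARGS)
    training_args = TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
    return lora_config, training_args


# Standard LoRA scales its low-rank update by alpha / r.
lora_scaling = LORA_KWARGS["lora_alpha"] / LORA_KWARGS["r"]
print(f"LoRA scaling factor: {lora_scaling}")
```

With rank 8 and alpha 16, the adapter update is scaled by a factor of 2.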
Citation
If you use this model, please cite:
Base model: Qwen2 Technical Report (Qwen Team, 2024)
Dataset: python_code_instructions_18k_alpaca (iamtarun)
Model: yzy-python-0.5b (SamirXR)
Notes
This is a small model intended for experimentation and lightweight coding assistance. It will not match the performance of larger models, but it allows fast local inference with minimal resources.
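To put "minimal resources" in rough numbers, the sketch below estimates the memory needed for the weights alone. It assumes the nominal 0.5B parameter count from the model name and ignores activations, the KV cache, quantization metadata, and any layers kept in higher precision, so real usage will be higher. The function name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Nominal size from the model name; the exact parameter count may differ.
NUM_PARAMS = 0.5e9

for label, bits in [("float16", 16), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(NUM_PARAMS, bits):.2f} GB")
```

Under these assumptions the weights take about 1.00 GB in float16 and about 0.25 GB in 4-bit.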