qwen2.5-1.5b-sql-qlora

Fine-tuned version of Qwen/Qwen2.5-1.5B-Instruct for natural language to SQL generation using QLoRA (4-bit quantization + LoRA).

Model Description

This model takes a natural language question and a SQL table schema (one or more CREATE TABLE statements) and returns the corresponding SQL query.

  • Base model: Qwen/Qwen2.5-1.5B-Instruct
  • Fine-tuning method: QLoRA (4-bit NF4 + LoRA rank 16)
  • Task: Text-to-SQL generation
  • Training data: b-mc2/sql-create-context

Intended Use

  • SQL query generation from natural language in applications and chatbots
  • Database querying assistants
  • Prototyping text-to-SQL systems on a budget (1.5B parameters)

Out-of-scope: Production database systems without human review; complex multi-table joins not represented in the training data; dialects other than standard SQL / SQLite.

Training Data

Dataset: b-mc2/sql-create-context (~82,000 rows)

  • question: natural language query
  • context: one or more CREATE TABLE statements
  • answer: target SQL query

  • Split: 95% train / 5% validation (seed 42)
  • Format: Qwen2.5-Instruct chat template
  • Max sequence length: 512 tokens
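For illustration, a dataset row can be rendered into the chat format along these lines. This is a minimal sketch: the system prompt and the exact wrapping of the question/context fields are assumptions for illustration, not the actual training script.

```python
# Sketch: turn one b-mc2/sql-create-context row into chat messages.
# The prompt wording here is an assumption, not the training script's.
def format_example(row: dict) -> list[dict]:
    return [
        {"role": "system", "content": "You are an expert SQL assistant."},
        {"role": "user", "content": (
            f"Given the following SQL tables:\n\n{row['context']}\n\n"
            f"Write a SQL query to answer: {row['question']}"
        )},
        {"role": "assistant", "content": row["answer"]},
    ]

row = {
    "question": "How many singers are there?",
    "context": "CREATE TABLE singer (id INT, name TEXT);",
    "answer": "SELECT COUNT(*) FROM singer;",
}
messages = format_example(row)
print(messages[-1]["content"])  # SELECT COUNT(*) FROM singer;
```

During training, `tokenizer.apply_chat_template(messages, tokenize=False)` would then serialize these messages into the Qwen2.5-Instruct template before tokenization.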

Training Procedure

Hyperparameter          Value
----------------------  -------------------------
LoRA rank (r)           16
LoRA alpha              32
LoRA dropout            0.05
Target modules          q/k/v/o/gate/up/down_proj
Quantization            4-bit NF4
Learning rate           2e-4
LR schedule             Cosine
Warmup ratio            0.05
Effective batch size    16
Epochs                  3
Max seq length          512
Framework               HuggingFace PEFT + TRL
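As a rough sanity check on adapter size: LoRA adds two low-rank matrices per adapted linear layer, so the trainable parameter count per layer is r · (d_in + d_out). The dimensions below are illustrative (1536 is the Qwen2.5-1.5B hidden size; the gate/up/down projections have different shapes):

```python
# LoRA adds A (r x d_in) and B (d_out x r) per adapted linear layer,
# so trainable params per layer = r * (d_in + d_out).
def lora_params(r: int, d_in: int, d_out: int) -> int:
    return r * (d_in + d_out)

# Example: a square 1536x1536 attention projection with rank 16.
print(lora_params(16, 1536, 1536))  # 49152
```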

Training ran on a single GPU (NVIDIA T4 or A100) with gradient checkpointing enabled. Experiments were tracked with Weights & Biases.

Evaluation Results

SQL Generation (500-sample validation subset)

Metric        Baseline   Fine-tuned   Delta
------------  ---------  -----------  --------
ROUGE-L       0.8784     0.9856       +0.1072
Exact Match   0.0000     0.7540       +0.7540
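Exact match can be computed by normalizing whitespace, case, and trailing semicolons before comparing strings. The sketch below is one plausible implementation; the actual evaluation script may normalize differently:

```python
def normalize_sql(sql: str) -> str:
    # Collapse whitespace, lowercase, and drop a trailing semicolon
    # so superficial formatting differences don't count as mismatches.
    return " ".join(sql.strip().rstrip(";").lower().split())

def exact_match(pred: str, gold: str) -> bool:
    return normalize_sql(pred) == normalize_sql(gold)

print(exact_match(
    "SELECT COUNT(*)  FROM employees WHERE department = 'sales';",
    "select count(*) from employees where department = 'sales'",
))  # True
```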

Catastrophic Forgetting (MMLU subset)

Subject                   Accuracy
------------------------  --------
High School Mathematics   0.36
Computer Security         0.76
Moral Scenarios           0.32
Overall                   0.48

These MMLU scores suggest that general capabilities are broadly retained after fine-tuning rather than lost to catastrophic forgetting.

Limitations

  • Trained on a single domain (single-table SQL); performance degrades on complex multi-table queries
  • Standard SQL only — dialect-specific syntax (e.g., T-SQL window functions) may be unreliable
  • Always review generated SQL before executing against production databases
  • English-only questions
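In line with the review-before-execution caveat, a simple guard can reject anything other than a single read-only statement before a generated query ever reaches a database. This is a heuristic sketch only, not a substitute for human review or database-level permissions:

```python
import re

# Keyword blocklist for write/DDL statements (heuristic, not exhaustive).
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|truncate|grant|attach)\b",
    re.IGNORECASE,
)

def is_safe_select(sql: str) -> bool:
    # Allow a single statement that starts with SELECT and contains
    # no write/DDL keywords. Heuristic only.
    stmt = sql.strip().rstrip(";")
    if ";" in stmt:  # multiple statements chained together
        return False
    if not stmt.lower().startswith("select"):
        return False
    return not FORBIDDEN.search(stmt)

print(is_safe_select("SELECT COUNT(*) FROM employees;"))  # True
print(is_safe_select("DROP TABLE employees;"))            # False
```

A stricter deployment would instead run generated queries through a read-only database connection or a proper SQL parser.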

How to Use

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "samratkar77/qwen2.5-1.5b-sql-qlora"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

system_prompt = (
    "You are an expert SQL assistant. "
    "Given a natural language question and the relevant database schema, "
    "write a single correct SQL query that answers the question. "
    "Return only the SQL query with no explanation."
)

question = "How many employees are in the sales department?"
context = "CREATE TABLE employees (id INT, name TEXT, department TEXT, salary REAL);"

messages = [
    {"role": "system",    "content": system_prompt},
    {"role": "user",      "content": f"Given the following SQL tables:\n\n{context}\n\nWrite a SQL query to answer: {question}"},
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=100, do_sample=False)

new_tokens = output[0, inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
# Expected: SELECT COUNT(*) FROM employees WHERE department = 'sales';

Using LoRA adapters only (memory-efficient)

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

base_model_id = "Qwen/Qwen2.5-1.5B-Instruct"
adapter_id    = "samratkar77/qwen2.5-1.5b-sql-qlora"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
base = AutoModelForCausalLM.from_pretrained(
    base_model_id, quantization_config=bnb_config, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

Citation

If you use this model, please cite the base model and dataset:

@misc{qwen2.5-1.5b-sql-qlora,
  author = {samratkar77},
  title  = {Qwen2.5-1.5B fine-tuned for Text-to-SQL with QLoRA},
  year   = {2025},
  url    = {https://huggingface.co/samratkar77/qwen2.5-1.5b-sql-qlora}
}