qwen2.5-1.5b-sql-qlora

Fine-tuned version of Qwen/Qwen2.5-1.5B-Instruct for natural language to SQL generation using QLoRA (4-bit quantization + LoRA).

Model Description

This model takes a natural language question and a SQL table schema (one or more CREATE TABLE statements) and returns the corresponding SQL query.

  • Base model: Qwen/Qwen2.5-1.5B-Instruct
  • Fine-tuning method: QLoRA (4-bit NF4 + LoRA rank 16)
  • Task: Text-to-SQL generation
  • Training data: b-mc2/sql-create-context

Intended Use

  • SQL query generation from natural language in applications and chatbots
  • Database querying assistants
  • Prototyping text-to-SQL systems on a budget (1.5B parameters)

Out-of-scope: Production database systems without human review; complex multi-table joins not represented in the training data; dialects other than standard SQL / SQLite.

Training Data

Dataset: b-mc2/sql-create-context (~82,000 rows)

  • question: natural language query
  • context: one or more CREATE TABLE statements
  • answer: target SQL query

  • Split: 95% train / 5% validation (seed 42)
  • Format: Qwen2.5-Instruct chat template
  • Max sequence length: 512 tokens
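For illustration, a dataset row can be rendered into the chat format along these lines. This is a minimal sketch: the system prompt and the exact wrapping of the question/context fields are assumptions for illustration, not the actual training script.

```python
# Sketch: turn one b-mc2/sql-create-context row into chat messages.
# The prompt wording here is an assumption, not the training script's.
def format_example(row: dict) -> list[dict]:
    return [
        {"role": "system", "content": "You are an expert SQL assistant."},
        {"role": "user", "content": (
            f"Given the following SQL tables:\n\n{row['context']}\n\n"
            f"Write a SQL query to answer: {row['question']}"
        )},
        {"role": "assistant", "content": row["answer"]},
    ]

row = {
    "question": "How many singers are there?",
    "context": "CREATE TABLE singer (id INT, name TEXT);",
    "answer": "SELECT COUNT(*) FROM singer;",
}
messages = format_example(row)
print(messages[-1]["content"])  # SELECT COUNT(*) FROM singer;
```

During training, `tokenizer.apply_chat_template(messages, tokenize=False)` would then serialize these messages into the Qwen2.5-Instruct template before tokenization.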

Training Procedure

Hyperparameter          Value
----------------------  -------------------------
LoRA rank (r)           16
LoRA alpha              32
LoRA dropout            0.05
Target modules          q/k/v/o/gate/up/down_proj
Quantization            4-bit NF4
Learning rate           2e-4
LR schedule             Cosine
Warmup ratio            0.05
Effective batch size    16
Epochs                  3
Max seq length          512
Framework               HuggingFace PEFT + TRL
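As a rough sanity check on adapter size: LoRA adds two low-rank matrices per adapted linear layer, so the trainable parameter count per layer is r · (d_in + d_out). The dimensions below are illustrative (1536 is the Qwen2.5-1.5B hidden size; the gate/up/down projections have different shapes):

```python
# LoRA adds A (r x d_in) and B (d_out x r) per adapted linear layer,
# so trainable params per layer = r * (d_in + d_out).
def lora_params(r: int, d_in: int, d_out: int) -> int:
    return r * (d_in + d_out)

# Example: a square 1536x1536 attention projection with rank 16.
print(lora_params(16, 1536, 1536))  # 49152
```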

Training ran on a single GPU (NVIDIA T4 or A100) with gradient checkpointing enabled. Experiments were tracked with Weights & Biases.

Evaluation Results

SQL Generation (500-sample validation subset)

Metric        Baseline   Fine-tuned   Delta
------------  ---------  -----------  --------
ROUGE-L       0.8784     0.9856       +0.1072
Exact Match   0.0000     0.7540       +0.7540
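Exact match can be computed by normalizing whitespace, case, and trailing semicolons before comparing strings. The sketch below is one plausible implementation; the actual evaluation script may normalize differently:

```python
def normalize_sql(sql: str) -> str:
    # Collapse whitespace, lowercase, and drop a trailing semicolon
    # so superficial formatting differences don't count as mismatches.
    return " ".join(sql.strip().rstrip(";").lower().split())

def exact_match(pred: str, gold: str) -> bool:
    return normalize_sql(pred) == normalize_sql(gold)

print(exact_match(
    "SELECT COUNT(*)  FROM employees WHERE department = 'sales';",
    "select count(*) from employees where department = 'sales'",
))  # True
```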

Catastrophic Forgetting (MMLU subset)

Subject                   Accuracy
------------------------  --------
High School Mathematics   0.36
Computer Security         0.76
Moral Scenarios           0.32
Overall                   0.48

These MMLU scores suggest that general capabilities are broadly retained after fine-tuning rather than lost to catastrophic forgetting.

Limitations

  • Trained on a single domain (single-table SQL); performance degrades on complex multi-table queries
  • Standard SQL only — dialect-specific syntax (e.g., T-SQL window functions) may be unreliable
  • Always review generated SQL before executing against production databases
  • English-only questions
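In line with the review-before-execution caveat, a simple guard can reject anything other than a single read-only statement before a generated query ever reaches a database. This is a heuristic sketch only, not a substitute for human review or database-level permissions:

```python
import re

# Keyword blocklist for write/DDL statements (heuristic, not exhaustive).
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|truncate|grant|attach)\b",
    re.IGNORECASE,
)

def is_safe_select(sql: str) -> bool:
    # Allow a single statement that starts with SELECT and contains
    # no write/DDL keywords. Heuristic only.
    stmt = sql.strip().rstrip(";")
    if ";" in stmt:  # multiple statements chained together
        return False
    if not stmt.lower().startswith("select"):
        return False
    return not FORBIDDEN.search(stmt)

print(is_safe_select("SELECT COUNT(*) FROM employees;"))  # True
print(is_safe_select("DROP TABLE employees;"))            # False
```

A stricter deployment would instead run generated queries through a read-only database connection or a proper SQL parser.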

How to Use

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "samratkar77/qwen2.5-1.5b-sql-qlora"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

system_prompt = (
    "You are an expert SQL assistant. "
    "Given a natural language question and the relevant database schema, "
    "write a single correct SQL query that answers the question. "
    "Return only the SQL query with no explanation."
)

question = "How many employees are in the sales department?"
context = "CREATE TABLE employees (id INT, name TEXT, department TEXT, salary REAL);"

messages = [
    {"role": "system",    "content": system_prompt},
    {"role": "user",      "content": f"Given the following SQL tables:\n\n{context}\n\nWrite a SQL query to answer: {question}"},
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=100, do_sample=False)

new_tokens = output[0, inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
# Expected: SELECT COUNT(*) FROM employees WHERE department = 'sales';

Using LoRA adapters only (memory-efficient)

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

base_model_id = "Qwen/Qwen2.5-1.5B-Instruct"
adapter_id    = "samratkar77/qwen2.5-1.5b-sql-qlora"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
base = AutoModelForCausalLM.from_pretrained(
    base_model_id, quantization_config=bnb_config, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

Citation

If you use this model, please cite the base model and dataset:

@misc{qwen2.5-1.5b-sql-qlora,
  author = {samratkar77},
  title  = {Qwen2.5-1.5B fine-tuned for Text-to-SQL with QLoRA},
  year   = {2025},
  url    = {https://huggingface.co/samratkar77/qwen2.5-1.5b-sql-qlora}
}