xlangai/spider
Viewer β’ Updated β’ 8.03k β’ 8.81k β’ 174
A private, on-device SQL generation model fine-tuned on the Spider dataset.
This is a LoRA fine-tuned version of Llama-3-8B-Instruct specialized for Text-to-SQL generation. The model was trained on 6,300 examples from the Spider dataset and achieves 100% valid SQL generation.
Evaluated on 100 examples from Spider validation set:
| Metric | Value |
|---|---|
| Valid SQL Generation | 100.0% |
| Exact String Match | 2.0% |
| Successful Queries | 100/100 |
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
"meta-llama/Meta-Llama-3-8B-Instruct",
torch_dtype=torch.float16,
device_map="auto"
)
# Load LoRA adapters
model = PeftModel.from_pretrained(
base_model,
"tanvicas/nano-analyst-sql"
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
question = "How many users are from California?"
schema = "CREATE TABLE users (id INT, name TEXT, state TEXT);"
# Generate SQL using the model
sql = generate_sql(question, schema)
print(sql) # SELECT COUNT(*) FROM users WHERE state = 'California';
Apache 2.0 License. Base model subject to Meta's Llama 3 Community License.
Built for learning and research π
Base model
meta-llama/Meta-Llama-3-8B-Instruct