# GPT-SW3 1.3B — Icelandic Grammar-Aligned (SAGA)
This is a LoRA adapter for GPT-SW3 1.3B trained to generate grammatically correct Icelandic text.
The adapter was trained with SAGA (Syntax-Aware Generation Alignment), applying Delta-DPO directly to the 1.3B base model: the pipeline samples 8 candidate continuations per prompt, filters chosen/rejected pairs by their quality gap, and optimizes the resulting preference pairs with DPO. No SFT warm-up was used, since Delta-DPO from the 1.3B base already achieves strong results.
This model achieves the highest absolute parse score of all models we evaluated.
## Results
Evaluated on 200 Icelandic Wikipedia sentences:
| Metric | Base 1.3B | This model |
|---|---|---|
| Greynir parse success | 85.5% | 96.0% |
| Parse score | 0.593 | 0.763 |
| PPL-Wiki | 19.4 | 24.4 |
Parse score = parse success rate × mean parse quality.
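For concreteness, the composite metric can be unpacked with a little arithmetic. The mean parse quality is not reported directly, so the sketch below derives the implied value from the numbers in the table:

```python
# Parse score = parse success rate * mean parse quality,
# so the implied mean parse quality is their quotient.
def implied_mean_quality(parse_score: float, success_rate: float) -> float:
    return parse_score / success_rate

base_quality = implied_mean_quality(0.593, 0.855)   # base 1.3B
saga_quality = implied_mean_quality(0.763, 0.960)   # this model

print(round(base_quality, 3))  # → 0.694
print(round(saga_quality, 3))  # → 0.795
```

Both the parse success rate and the implied mean quality improve, so the gain is not driven by one factor alone.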
## Usage
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then attach the LoRA adapter on top of it.
base = AutoModelForCausalLM.from_pretrained("AI-Sweden-Models/gpt-sw3-1.3b", torch_dtype="auto")
model = PeftModel.from_pretrained(base, "Hodfa71/gpt-sw3-1b3-icelandic-delta-dpo")
tokenizer = AutoTokenizer.from_pretrained("AI-Sweden-Models/gpt-sw3-1.3b")

prompt = "Íslenska er"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=50, temperature=0.8, do_sample=True)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
## Training details
The training data is Icelandic Wikipedia (10,000 sentences, filtered for quality).
Delta-DPO samples 8 candidate continuations per prompt (4 at temperature 0.7, 4 at temperature 1.1). Pairs are kept for training only when the quality gap exceeds 0.25 and the chosen candidate's score exceeds 0.20. DPO then runs for 2 epochs with beta = 0.1 and LoRA rank r = 16.
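The pair-selection step can be sketched as follows. This is illustrative only, not the actual training code; the function name, data layout, and toy scores are assumptions:

```python
from itertools import combinations

def select_pairs(scored_candidates, min_gap=0.25, min_chosen=0.20):
    """Build DPO preference pairs from (text, score) candidates for one prompt.

    Keeps (chosen, rejected) pairs where the score gap exceeds min_gap
    and the chosen candidate's score exceeds min_chosen.
    """
    pairs = []
    for a, b in combinations(scored_candidates, 2):
        chosen, rejected = (a, b) if a[1] >= b[1] else (b, a)
        if chosen[1] - rejected[1] > min_gap and chosen[1] > min_chosen:
            pairs.append((chosen[0], rejected[0]))
    return pairs

# In training there are 8 candidates per prompt; 4 toy ones shown here,
# with made-up grammar-reward scores.
candidates = [("s1", 0.90), ("s2", 0.10), ("s3", 0.50), ("s4", 0.15)]
print(select_pairs(candidates))
```

Filtering by gap and by the chosen score together ensures that every preference pair contrasts a clearly grammatical continuation against a clearly worse one, rather than training on noisy near-ties.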
The grammar reward uses Greynir, an Icelandic NLP library that checks whether a sentence has a valid constituency parse with a verbal root and a nominal subject.
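A minimal sketch of how such a reward could be scored is below. A stub tree representation stands in for Greynir's parser output; the `(tag, children)` tuples and the reward logic are illustrative assumptions, not Greynir's actual API:

```python
def grammar_reward(tree) -> float:
    """Return 1.0 for a parse with a verbal root and a nominal subject, else 0.0.

    tree is None when the sentence failed to parse; otherwise it is a
    (tag, children) tuple standing in for a constituency tree.
    """
    if tree is None:
        return 0.0
    tag, children = tree
    has_verb_phrase = any(child_tag == "VP" for child_tag, _ in children)
    has_subject = any(child_tag == "NP-SUBJ" for child_tag, _ in children)
    return 1.0 if tag == "S" and has_verb_phrase and has_subject else 0.0

# Toy trees: a valid sentence (subject + verb phrase) and a failed parse.
good = ("S", [("NP-SUBJ", []), ("VP", [])])
print(grammar_reward(good))   # → 1.0
print(grammar_reward(None))   # → 0.0
```

A binary check like this rewards full-sentence well-formedness rather than surface fluency, which is why parse success can improve even as perplexity rises slightly.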
## Citation
The paper is under review; this section will be updated with a citation when one is available.
## License
The base model (GPT-SW3) is released by AI Sweden under their LLM license. This fine-tuned version inherits the same license. Attribution: AI Sweden, RISE, and WASP.