# GPT-SW3 1.3B — Icelandic Grammar-Aligned (SAGA)
This is a LoRA adapter for GPT-SW3 1.3B trained to generate grammatically correct Icelandic text.
The adapter was trained with SAGA (Syntax-Aware Generation Alignment), applying Delta-DPO directly to the 1.3B base model: the pipeline samples 8 candidate continuations per prompt, filters chosen/rejected pairs by their quality gap, and optimizes the resulting preference pairs with DPO. No SFT warm-up was used, since Delta-DPO from the 1.3B base already achieves strong results.
This model achieves the highest absolute parse score of all models we evaluated.
## Results
Evaluated on 200 Icelandic Wikipedia sentences:
| Metric | Base 1.3B | This model |
|---|---|---|
| Greynir parse success | 85.5% | 96.0% |
| Parse score | 0.593 | 0.763 |
| PPL-Wiki | 19.4 | 24.4 |
Parse score = parse success rate × mean parse quality.
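For concreteness, the composite metric can be unpacked with a little arithmetic. The mean parse quality is not reported directly, so the sketch below derives the implied value from the numbers in the table:

```python
# Parse score = parse success rate * mean parse quality,
# so the implied mean parse quality is their quotient.
def implied_mean_quality(parse_score: float, success_rate: float) -> float:
    return parse_score / success_rate

base_quality = implied_mean_quality(0.593, 0.855)   # base 1.3B
saga_quality = implied_mean_quality(0.763, 0.960)   # this model

print(round(base_quality, 3))  # → 0.694
print(round(saga_quality, 3))  # → 0.795
```

Both the parse success rate and the implied mean quality improve, so the gain is not driven by one factor alone.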
## Usage
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then attach the LoRA adapter on top of it.
base = AutoModelForCausalLM.from_pretrained("AI-Sweden-Models/gpt-sw3-1.3b", torch_dtype="auto")
model = PeftModel.from_pretrained(base, "Hodfa71/gpt-sw3-1b3-icelandic-delta-dpo")
tokenizer = AutoTokenizer.from_pretrained("AI-Sweden-Models/gpt-sw3-1.3b")

prompt = "Íslenska er"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=50, temperature=0.8, do_sample=True)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
## Training details
The training data is Icelandic Wikipedia (10,000 sentences, filtered for quality).
Delta-DPO samples 8 candidate continuations per prompt (4 at temperature 0.7, 4 at temperature 1.1). Pairs are kept for training only when the quality gap exceeds 0.25 and the chosen candidate's score exceeds 0.20. DPO then runs for 2 epochs with beta = 0.1 and LoRA rank r = 16.
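The pair-selection step can be sketched as follows. This is illustrative only, not the actual training code; the function name, data layout, and toy scores are assumptions:

```python
from itertools import combinations

def select_pairs(scored_candidates, min_gap=0.25, min_chosen=0.20):
    """Build DPO preference pairs from (text, score) candidates for one prompt.

    Keeps (chosen, rejected) pairs where the score gap exceeds min_gap
    and the chosen candidate's score exceeds min_chosen.
    """
    pairs = []
    for a, b in combinations(scored_candidates, 2):
        chosen, rejected = (a, b) if a[1] >= b[1] else (b, a)
        if chosen[1] - rejected[1] > min_gap and chosen[1] > min_chosen:
            pairs.append((chosen[0], rejected[0]))
    return pairs

# In training there are 8 candidates per prompt; 4 toy ones shown here,
# with made-up grammar-reward scores.
candidates = [("s1", 0.90), ("s2", 0.10), ("s3", 0.50), ("s4", 0.15)]
print(select_pairs(candidates))
```

Filtering by gap and by the chosen score together ensures that every preference pair contrasts a clearly grammatical continuation against a clearly worse one, rather than training on noisy near-ties.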
The grammar reward uses Greynir, an Icelandic NLP library that checks whether a sentence has a valid constituency parse with a verbal root and a nominal subject.
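A minimal sketch of how such a reward could be scored is below. A stub tree representation stands in for Greynir's parser output; the `(tag, children)` tuples and the reward logic are illustrative assumptions, not Greynir's actual API:

```python
def grammar_reward(tree) -> float:
    """Return 1.0 for a parse with a verbal root and a nominal subject, else 0.0.

    tree is None when the sentence failed to parse; otherwise it is a
    (tag, children) tuple standing in for a constituency tree.
    """
    if tree is None:
        return 0.0
    tag, children = tree
    has_verb_phrase = any(child_tag == "VP" for child_tag, _ in children)
    has_subject = any(child_tag == "NP-SUBJ" for child_tag, _ in children)
    return 1.0 if tag == "S" and has_verb_phrase and has_subject else 0.0

# Toy trees: a valid sentence (subject + verb phrase) and a failed parse.
good = ("S", [("NP-SUBJ", []), ("VP", [])])
print(grammar_reward(good))   # → 1.0
print(grammar_reward(None))   # → 0.0
```

A binary check like this rewards full-sentence well-formedness rather than surface fluency, which is why parse success can improve even as perplexity rises slightly.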
## Citation
The paper is under review; this section will be updated with a citation when one is available.
## License
The base model (GPT-SW3) is released by AI Sweden under their LLM license. This fine-tuned version inherits the same license. Attribution: AI Sweden, RISE, and WASP.