Tags: PEFT · Safetensors · lora · qwen3 · math · gsm8k · supervised-fine-tuning

LoRA and Friends

This repository contains the six retained PEFT LoRA adapter exports from the lora-and-friends target-module comparison on Qwen/Qwen3-8B.

The study compared two adapter scopes on the same rendered math SFT dataset, using three seeds per condition. All adapters here are the selected step-3169 checkpoints, chosen by the frozen validation-NLL rule before GSM8K evaluation.

Files

checkpoints/best-checkpoints/
  attention_only/seed-0/step-3169/
  attention_only/seed-1/step-3169/
  attention_only/seed-2/step-3169/
  all_layer/seed-0/step-3169/
  all_layer/seed-1/step-3169/
  all_layer/seed-2/step-3169/

Each checkpoint directory contains:

  • adapter_config.json
  • adapter_model.safetensors
  • checkpoint_complete
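
Each checkpoint is a standard PEFT adapter export and loads with the usual transformers + peft APIs. The snippet below is a minimal sketch, assuming a local clone of this repository and that accelerate is installed for `device_map="auto"`; any of the six checkpoint directories can be substituted.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Base model the adapters were trained against.
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-8B", torch_dtype="auto", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")

# Any of the six step-3169 directories works here; attention_only/seed-0 is just an example.
adapter_dir = "checkpoints/best-checkpoints/attention_only/seed-0/step-3169"
model = PeftModel.from_pretrained(base, adapter_dir)
model.eval()
```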

Conditions

| Condition | Intended adapter scope | Seeds | Selected step |
| --- | --- | --- | --- |
| attention_only | attention projections only | 0, 1, 2 | 3169 |
| all_layer | attention and MLP projections | 0, 1, 2 | 3169 |

The exported PEFT adapter configs record r=8, lora_alpha=32, and lora_dropout=0.
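
For reference, roughly equivalent configs can be written as below. This is a sketch only: the `target_modules` lists are illustrative Qwen3 projection names for the two scopes, not copied from the exports; the authoritative values are in each checkpoint's adapter_config.json.

```python
from peft import LoraConfig

# Hyperparameters recorded in the exported adapter configs: r=8, lora_alpha=32, lora_dropout=0.
# target_modules below are assumed Qwen3 projection names, not read from the exports.
attention_only = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.0,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections only
)

all_layer = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.0,
    task_type="CAUSAL_LM",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",          # attention projections
        "gate_proj", "up_proj", "down_proj",             # MLP projections
    ],
)
```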

GSM8K Results

| Condition | Seed 0 | Seed 1 | Seed 2 | Mean |
| --- | --- | --- | --- | --- |
| attention_only | 0.904473 | 0.906748 | 0.905231 | 0.905484 |
| all_layer | 0.899166 | 0.902199 | 0.901440 | 0.900935 |

The unadapted Qwen/Qwen3-8B baseline scored 0.845337 in the retained evaluation on the same 1,319-example GSM8K test setup.
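
The per-condition means follow directly from the per-seed scores in the table; a small check against the baseline:

```python
# Recompute the condition means and their gain over the baseline from the table above.
results = {
    "attention_only": [0.904473, 0.906748, 0.905231],
    "all_layer": [0.899166, 0.902199, 0.901440],
}
baseline = 0.845337

for name, accs in results.items():
    mean = sum(accs) / len(accs)
    print(f"{name}: mean={mean:.6f}, delta vs baseline={mean - baseline:+.6f}")
```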

Dataset

The frozen raw and rendered training files are published at:

Use the rendered dataset split for reproduction:

  • rendered/openmath_original_clean_qwen3_disable_thinking/train.jsonl
  • rendered/openmath_original_clean_qwen3_disable_thinking/val.jsonl
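
One way to load the rendered splits, sketched here under the assumption that the JSONL files have been downloaded locally from the dataset repository (record field names are not assumed):

```python
from datasets import load_dataset

# Paths relative to a local copy of the published rendered dataset.
data_files = {
    "train": "rendered/openmath_original_clean_qwen3_disable_thinking/train.jsonl",
    "validation": "rendered/openmath_original_clean_qwen3_disable_thinking/val.jsonl",
}
dataset = load_dataset("json", data_files=data_files)
print(dataset)
```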

Project Article

The technical write-up is published on the project site:
