# Unhinged Horoscopes – LoRA adapter
A ~22MB LoRA adapter on top of Llama 3.2 1B Instruct that overrides the base model's tone and turns it into a generator for absurd, specific, chaotic-neutral horoscopes from a 30-token prompt. The adapter is narrowly specialised to the input format and output length; it does not significantly rewrite the base model's general knowledge or safety behaviour.
If you only want to run the model, grab the merged and quantised GGUF at edbuildingstuff/unhinged-horoscopes (~770MB, drops into llama.cpp / ollama / mobile FFI as a single file).
This adapter repo is for developers who want to:
- inspect what was changed
- merge it into a different base build, dtype, or runtime
- continue training on top of it
- reproduce the result from scratch
## Adapter config

| Field | Value |
|---|---|
| Base model | meta-llama/Llama-3.2-1B-Instruct |
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj (all 7 projection layers) |
| Adapter size | ~22MB |
| Format | Safetensors (PEFT) |
## Prompt format

The adapter was trained on a single user message with no system prompt. Match this format exactly; the fine-tune is narrowly specialised to it.

```
Sign: Aries
Category: Daily Chaos
Date: 2026-05-02
Generate an unhinged horoscope.
```

Required values:

- `Sign` is one of: Aries, Taurus, Gemini, Cancer, Leo, Virgo, Libra, Scorpio, Sagittarius, Capricorn, Aquarius, Pisces
- `Category` is one of: Daily Chaos, Love Life, Career, Vibe Check
- `Date` is `YYYY-MM-DD`
Apply the standard Llama 3.2 chat template around the user message.
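The allowed values can be enforced before the prompt ever reaches the model. A minimal sketch (the helper name `build_prompt` is ours, not something shipped in this repo):

```python
import datetime

SIGNS = {"Aries", "Taurus", "Gemini", "Cancer", "Leo", "Virgo", "Libra",
         "Scorpio", "Sagittarius", "Capricorn", "Aquarius", "Pisces"}
CATEGORIES = {"Daily Chaos", "Love Life", "Career", "Vibe Check"}

def build_prompt(sign: str, category: str, date: str) -> str:
    """Assemble the exact 4-line user message the adapter was trained on."""
    if sign not in SIGNS:
        raise ValueError(f"unknown sign: {sign!r}")
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    datetime.date.fromisoformat(date)  # raises ValueError if not YYYY-MM-DD
    return (
        f"Sign: {sign}\n"
        f"Category: {category}\n"
        f"Date: {date}\n"
        "Generate an unhinged horoscope."
    )
```

The returned string is what goes into the single user message below.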
## Quick start

### Load with PEFT
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Llama-3.2-1B-Instruct"
adapter_id = "edbuildingstuff/unhinged-horoscopes-lora"

base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = PeftModel.from_pretrained(base, adapter_id)

# The exact 4-line template the adapter was trained on
prompt = (
    "Sign: Leo\n"
    "Category: Career\n"
    "Date: 2026-05-02\n"
    "Generate an unhinged horoscope."
)

input_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    return_tensors="pt",
    add_generation_prompt=True,
).to(model.device)

out = model.generate(
    input_ids,
    max_new_tokens=120,
    temperature=0.9,
    top_p=0.9,
    do_sample=True,
)
print(tokenizer.decode(out[0][input_ids.shape[1]:], skip_special_tokens=True))
```
### Merge into FP16 base
```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B-Instruct",
    torch_dtype=torch.float16,  # load in FP16 so the merged checkpoint is FP16
    device_map="cpu",
)
model = PeftModel.from_pretrained(base, "edbuildingstuff/unhinged-horoscopes-lora")
merged = model.merge_and_unload()
merged.save_pretrained("./merged_hf")
AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B-Instruct").save_pretrained("./merged_hf")
```
Output: `./merged_hf/` – FP16 merged base + adapter, ~2.4GB safetensors.
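Numerically, `merge_and_unload` folds each LoRA pair into its frozen base weight as `W + (alpha / r) * B @ A`. A toy NumPy sketch (the 64-wide matrices are illustrative; the real Llama 3.2 1B projections are much larger, but rank and alpha mirror the adapter config above):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 64, 16, 32  # toy width; rank/alpha match this adapter's config
W = rng.standard_normal((d, d)).astype(np.float32)  # frozen base weight
A = rng.standard_normal((r, d)).astype(np.float32)  # lora_A  (r x d_in)
B = rng.standard_normal((d, r)).astype(np.float32)  # lora_B  (d_out x r)

delta = (alpha / r) * (B @ A)  # low-rank update, scaled by alpha / r
merged = W + delta             # what merge_and_unload bakes into the base
```

The update has rank at most `r`, which is why the adapter file stays ~22MB while the merged checkpoint is full-size.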
### Convert to GGUF and quantise to Q4_K_M
Clone and build llama.cpp (one-time):

```bash
git clone https://github.com/ggerganov/llama.cpp.git
cmake -B llama.cpp/build llama.cpp
cmake --build llama.cpp/build --config Release
```
Convert merged FP16 to GGUF, then quantise:

```bash
python llama.cpp/convert_hf_to_gguf.py ./merged_hf \
    --outtype f16 \
    --outfile ./unhinged-horoscopes-f16.gguf

llama.cpp/build/bin/llama-quantize \
    ./unhinged-horoscopes-f16.gguf \
    ./unhinged-horoscopes-q4_k_m.gguf \
    Q4_K_M
```
Outputs:

- `unhinged-horoscopes-f16.gguf` – FP16 GGUF (~2.48GB)
- `unhinged-horoscopes-q4_k_m.gguf` – Q4_K_M GGUF (~770MB), ready to drop into `llama.cpp`, `ollama`, or `llamadart`
For a different precision (Q5_K_M, Q8_0, IQ-quants, etc.) substitute the last argument to llama-quantize.
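If you sweep several precisions, the `llama-quantize` invocations can be generated rather than typed. A sketch; the helper name and the output naming convention (source stem with `-f16` swapped for the lowercased quant type) are our assumptions:

```python
from pathlib import Path

QUANTIZE_BIN = "llama.cpp/build/bin/llama-quantize"

def quantize_cmd(src: str, qtype: str) -> list[str]:
    """argv for one llama-quantize run; output name is derived from the source."""
    stem = Path(src).stem.replace("-f16", "")
    out = Path(src).with_name(f"{stem}-{qtype.lower()}.gguf")
    return [QUANTIZE_BIN, src, str(out), qtype]

for q in ("Q4_K_M", "Q5_K_M", "Q8_0"):
    print(" ".join(quantize_cmd("./unhinged-horoscopes-f16.gguf", q)))
```

Feed each argv list to `subprocess.run` (or join it for a shell script).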
### Shortcut: pre-merged + Q4_K_M GGUF
If you don't need to inspect the intermediates, the merged Q4_K_M GGUF is published at edbuildingstuff/unhinged-horoscopes. Drop-in usable in llama.cpp / ollama / llamadart.
## What the adapter changes
- Tone register. Confident, absurd, specific, chaotic neutral. The trained register dominates on prompts that match the 4-line template.
- Output length. 1 to 3 sentences, ~30 to 80 tokens. The model does not pad, does not preface with "Sure, here is your horoscope", does not list bullets.
- Format adherence. Responds directly to the 4-line prompt template without preamble.
- Per-sign personality threads. Subtle (Aries impulsive, Capricorn workaholic, Pisces dreamer, Aquarius alien, etc.) β present but not heavy-handed.
## What the adapter does not change
- Base safety behaviour is largely intact. The training set is benign and short, so the adapter does not significantly rewrite the base model's refusal patterns.
- General knowledge is preserved. Off-template prompts (free-form questions, advice-seeking, factual queries) still resolve through the base model. The adapter is narrow on the prompt template and does not crowd out base capability.
- Off-template behaviour is uncalibrated. If you stray from the 4-line template, expect base-Llama-with-some-tone-bleed, not horoscope output.
If you stack this adapter with another LoRA, expect tone interference; the chaotic-neutral register tends to dominate.
## Training

| Field | Value |
|---|---|
| Base model | meta-llama/Llama-3.2-1B-Instruct |
| Method | LoRA, all 7 projection modules |
| Rank (r) | 16 |
| Alpha | 32 |
| Epochs | 3 |
| Batch size | 4 |
| Learning rate | 2e-4 |
| Max sequence length | 256 tokens |
| Training platform | Ertas.AI (managed fine-tuning, GPUs pre-configured) |
## Dataset
| Field | Value |
|---|---|
| Size | 480 examples |
| Coverage | 12 signs × 4 categories × 10 each (no missing combos) |
| Format | ShareGPT JSONL (one user + one assistant message per line, no system prompt) |
| Date conditioning | ~70% date-agnostic, ~30% date-conditioned (season, day-of-week, month vibes) |
| Output length distribution | 1 to 3 sentences, 30 to 80 tokens target |
| Hard rules | No real people, brands, or locations. No mean-spirited content. No harmful advice (even as jokes). No repetitive patterns. |
| Validation | dataset/validate_dataset.py checks JSON schema, sign and category coverage, response length, and flags generic phrases |
A small Alpaca-format mirror of the dataset ({instruction, output}) was also generated for trainers that prefer Alpaca over ShareGPT.
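The actual checks live in `dataset/validate_dataset.py`; the per-line part can be sketched roughly as below. The regex, allowed values, and word-count window here are our assumptions inferred from the tables above, not the script's real thresholds:

```python
import json
import re

SIGNS = {"Aries", "Taurus", "Gemini", "Cancer", "Leo", "Virgo", "Libra",
         "Scorpio", "Sagittarius", "Capricorn", "Aquarius", "Pisces"}
CATEGORIES = {"Daily Chaos", "Love Life", "Career", "Vibe Check"}
TEMPLATE = re.compile(
    r"Sign: (.+)\nCategory: (.+)\nDate: \d{4}-\d{2}-\d{2}\n"
    r"Generate an unhinged horoscope\."
)

def check_line(raw: str) -> list[str]:
    """Return a list of problems for one ShareGPT JSONL line (empty = OK)."""
    problems = []
    msgs = json.loads(raw)["messages"]
    if [m["role"] for m in msgs] != ["user", "assistant"]:
        problems.append("expected exactly one user + one assistant message")
        return problems
    m = TEMPLATE.fullmatch(msgs[0]["content"])
    if not m or m.group(1) not in SIGNS or m.group(2) not in CATEGORIES:
        problems.append("user message does not match the 4-line template")
    n_words = len(msgs[1]["content"].split())
    if not 10 <= n_words <= 100:  # loose word-count proxy for the 30-80 token target
        problems.append(f"response length out of range ({n_words} words)")
    return problems
```

Run over the JSONL file, any line returning a non-empty list fails validation.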
### Sample training pairs
```json
{"messages": [{"role": "user", "content": "Sign: Aries\nCategory: Daily Chaos\nDate: 2026-04-15\nGenerate an unhinged horoscope."}, {"role": "assistant", "content": "You will argue with a GPS today. You will lose. It knows things about you that you told no one. Lucky object: a fork you've been suspicious of since February."}]}
{"messages": [{"role": "user", "content": "Sign: Capricorn\nCategory: Career\nDate: 2026-04-15\nGenerate an unhinged horoscope."}, {"role": "assistant", "content": "Your boss will email you at 11:47pm. It will just say 'hmm'. Do not respond. Do not sleep. Just know."}]}
```
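The Alpaca-format mirror mentioned above is a mechanical mapping of these pairs. A minimal sketch of the conversion (the function name is ours, not part of the repo's tooling):

```python
import json

def sharegpt_to_alpaca(line: str) -> dict:
    """Map one ShareGPT JSONL line to the {instruction, output} Alpaca form."""
    msgs = json.loads(line)["messages"]
    by_role = {m["role"]: m["content"] for m in msgs}
    return {"instruction": by_role["user"], "output": by_role["assistant"]}
```

Apply it line-by-line over the ShareGPT file and write one JSON object per line to the mirror.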
## Files in this repo

| File | Purpose |
|---|---|
| adapter_config.json | PEFT adapter configuration (rank, alpha, target modules) |
| adapter_model.safetensors | LoRA delta weights (~22MB) |
| Tokenizer files (if shipped) | Inherit from base – re-load from meta-llama/Llama-3.2-1B-Instruct if absent |
## Related
- Merged + Q4_K_M GGUF (run-ready): edbuildingstuff/unhinged-horoscopes
- Reference Android app (Flutter + llamadart): Unhinged Horoscopes – Google Play / horoscope.ertas.ai (bundle id `ai.ertas.horoscope`)
- Fine-tuning platform: Ertas.AI
## License and credits
- Adapter weights: Apache-2.0 (downstream use must also comply with Meta's Llama 3.2 community licence)
- Training dataset: MIT
- Fine-tuned with Ertas.AI, the managed fine-tuning platform that ran this LoRA on pre-configured GPUs end-to-end
- Built by Edward Yang (edbuildingstuff) as a reference POC for Ertas Product A: build your own on-device AI model and ship it inside your app. App live at horoscope.ertas.ai / Google Play.