# Unhinged Horoscopes – LoRA adapter
A ~22MB LoRA adapter on top of Llama 3.2 1B Instruct that overrides the base model's tone and turns it into a generator for absurd, specific, chaotic-neutral horoscopes from a 30-token prompt. The adapter is narrowly specialised to the input format and output length; it does not significantly rewrite the base model's general knowledge or safety behaviour.
If you only want to run the model, grab the merged and quantised GGUF at edbuildingstuff/unhinged-horoscopes (~770MB, drops into llama.cpp / ollama / mobile FFI as a single file).
This adapter repo is for developers who want to:
- inspect what was changed
- merge it into a different base build, dtype, or runtime
- continue training on top of it
- reproduce the result from scratch
## Adapter config

| Field | Value |
|---|---|
| Base model | meta-llama/Llama-3.2-1B-Instruct |
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj (all 7 projection layers) |
| Adapter size | ~22MB |
| Format | Safetensors (PEFT) |
## Prompt format

The adapter was trained on a single user message with no system prompt. Match this format exactly; the fine-tune is narrowly specialised to it.

```
Sign: Aries
Category: Daily Chaos
Date: 2026-05-02
Generate an unhinged horoscope.
```

Required values:

- `Sign` is one of: Aries, Taurus, Gemini, Cancer, Leo, Virgo, Libra, Scorpio, Sagittarius, Capricorn, Aquarius, Pisces
- `Category` is one of: Daily Chaos, Love Life, Career, Vibe Check
- `Date` is `YYYY-MM-DD`
Apply the standard Llama 3.2 chat template around the user message.
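The allowed values can be enforced before the prompt ever reaches the model. A minimal sketch (the helper name `build_prompt` is ours, not something shipped in this repo):

```python
import datetime

SIGNS = {"Aries", "Taurus", "Gemini", "Cancer", "Leo", "Virgo", "Libra",
         "Scorpio", "Sagittarius", "Capricorn", "Aquarius", "Pisces"}
CATEGORIES = {"Daily Chaos", "Love Life", "Career", "Vibe Check"}

def build_prompt(sign: str, category: str, date: str) -> str:
    """Assemble the exact 4-line user message the adapter was trained on."""
    if sign not in SIGNS:
        raise ValueError(f"unknown sign: {sign!r}")
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    datetime.date.fromisoformat(date)  # raises ValueError if not YYYY-MM-DD
    return (
        f"Sign: {sign}\n"
        f"Category: {category}\n"
        f"Date: {date}\n"
        "Generate an unhinged horoscope."
    )
```

The returned string is what goes into the single user message below.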
## Quick start

### Load with PEFT
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Llama-3.2-1B-Instruct"
adapter_id = "edbuildingstuff/unhinged-horoscopes-lora"

base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = PeftModel.from_pretrained(base, adapter_id)

# The exact 4-line template the adapter was trained on
prompt = (
    "Sign: Leo\n"
    "Category: Career\n"
    "Date: 2026-05-02\n"
    "Generate an unhinged horoscope."
)

input_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    return_tensors="pt",
    add_generation_prompt=True,
).to(model.device)

out = model.generate(
    input_ids,
    max_new_tokens=120,
    temperature=0.9,
    top_p=0.9,
    do_sample=True,
)
print(tokenizer.decode(out[0][input_ids.shape[1]:], skip_special_tokens=True))
```
### Merge into FP16 base
```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B-Instruct",
    torch_dtype=torch.float16,  # load in FP16 so the merged checkpoint is FP16
    device_map="cpu",
)
model = PeftModel.from_pretrained(base, "edbuildingstuff/unhinged-horoscopes-lora")
merged = model.merge_and_unload()
merged.save_pretrained("./merged_hf")
AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B-Instruct").save_pretrained("./merged_hf")
```
Output: `./merged_hf/` – FP16 merged base + adapter, ~2.4GB safetensors.
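Numerically, `merge_and_unload` folds each LoRA pair into its frozen base weight as `W + (alpha / r) * B @ A`. A toy NumPy sketch (the 64-wide matrices are illustrative; the real Llama 3.2 1B projections are much larger, but rank and alpha mirror the adapter config above):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 64, 16, 32  # toy width; rank/alpha match this adapter's config
W = rng.standard_normal((d, d)).astype(np.float32)  # frozen base weight
A = rng.standard_normal((r, d)).astype(np.float32)  # lora_A  (r x d_in)
B = rng.standard_normal((d, r)).astype(np.float32)  # lora_B  (d_out x r)

delta = (alpha / r) * (B @ A)  # low-rank update, scaled by alpha / r
merged = W + delta             # what merge_and_unload bakes into the base
```

The update has rank at most `r`, which is why the adapter file stays ~22MB while the merged checkpoint is full-size.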
### Convert to GGUF and quantise to Q4_K_M
Clone and build llama.cpp (one-time):

```bash
git clone https://github.com/ggerganov/llama.cpp.git
cmake -B llama.cpp/build llama.cpp
cmake --build llama.cpp/build --config Release
```
Convert merged FP16 to GGUF, then quantise:

```bash
python llama.cpp/convert_hf_to_gguf.py ./merged_hf \
    --outtype f16 \
    --outfile ./unhinged-horoscopes-f16.gguf

llama.cpp/build/bin/llama-quantize \
    ./unhinged-horoscopes-f16.gguf \
    ./unhinged-horoscopes-q4_k_m.gguf \
    Q4_K_M
```
Outputs:

- `unhinged-horoscopes-f16.gguf` – FP16 GGUF (~2.48GB)
- `unhinged-horoscopes-q4_k_m.gguf` – Q4_K_M GGUF (~770MB), ready to drop into `llama.cpp`, `ollama`, or `llamadart`
For a different precision (Q5_K_M, Q8_0, IQ-quants, etc.) substitute the last argument to llama-quantize.
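If you sweep several precisions, the `llama-quantize` invocations can be generated rather than typed. A sketch; the helper name and the output naming convention (source stem with `-f16` swapped for the lowercased quant type) are our assumptions:

```python
from pathlib import Path

QUANTIZE_BIN = "llama.cpp/build/bin/llama-quantize"

def quantize_cmd(src: str, qtype: str) -> list[str]:
    """argv for one llama-quantize run; output name is derived from the source."""
    stem = Path(src).stem.replace("-f16", "")
    out = Path(src).with_name(f"{stem}-{qtype.lower()}.gguf")
    return [QUANTIZE_BIN, src, str(out), qtype]

for q in ("Q4_K_M", "Q5_K_M", "Q8_0"):
    print(" ".join(quantize_cmd("./unhinged-horoscopes-f16.gguf", q)))
```

Feed each argv list to `subprocess.run` (or join it for a shell script).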
### Shortcut: pre-merged + Q4_K_M GGUF
If you don't need to inspect the intermediates, the merged Q4_K_M GGUF is published at edbuildingstuff/unhinged-horoscopes. Drop-in usable in llama.cpp / ollama / llamadart.
## What the adapter changes
- Tone register. Confident, absurd, specific, chaotic neutral. The trained register dominates on prompts that match the 4-line template.
- Output length. 1 to 3 sentences, ~30 to 80 tokens. The model does not pad, does not preface with "Sure, here is your horoscope", does not list bullets.
- Format adherence. Responds directly to the 4-line prompt template without preamble.
- Per-sign personality threads. Subtle (Aries impulsive, Capricorn workaholic, Pisces dreamer, Aquarius alien, etc.) β present but not heavy-handed.
## What the adapter does not change
- Base safety behaviour is largely intact. The training set is benign and short, so the adapter does not significantly rewrite the base model's refusal patterns.
- General knowledge is preserved. Off-template prompts (free-form questions, advice-seeking, factual queries) still resolve through the base model. The adapter is narrow on the prompt template and does not crowd out base capability.
- Off-template behaviour is uncalibrated. If you stray from the 4-line template, expect base-Llama-with-some-tone-bleed, not horoscope output.
If you stack this adapter with another LoRA, expect tone interference; the chaotic-neutral register tends to dominate.
## Training

| Field | Value |
|---|---|
| Base model | meta-llama/Llama-3.2-1B-Instruct |
| Method | LoRA, all 7 projection modules |
| Rank (r) | 16 |
| Alpha | 32 |
| Epochs | 3 |
| Batch size | 4 |
| Learning rate | 2e-4 |
| Max sequence length | 256 tokens |
| Training platform | Ertas.AI (managed fine-tuning, GPUs pre-configured) |
## Dataset
| Field | Value |
|---|---|
| Size | 480 examples |
| Coverage | 12 signs × 4 categories × 10 each (no missing combos) |
| Format | ShareGPT JSONL (one user + one assistant message per line, no system prompt) |
| Date conditioning | ~70% date-agnostic, ~30% date-conditioned (season, day-of-week, month vibes) |
| Output length distribution | 1 to 3 sentences, 30 to 80 tokens target |
| Hard rules | No real people, brands, or locations. No mean-spirited content. No harmful advice (even as jokes). No repetitive patterns. |
| Validation | dataset/validate_dataset.py checks JSON schema, sign and category coverage, response length, and flags generic phrases |
A small Alpaca-format mirror of the dataset ({instruction, output}) was also generated for trainers that prefer Alpaca over ShareGPT.
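The actual checks live in `dataset/validate_dataset.py`; the per-line part can be sketched roughly as below. The regex, allowed values, and word-count window here are our assumptions inferred from the tables above, not the script's real thresholds:

```python
import json
import re

SIGNS = {"Aries", "Taurus", "Gemini", "Cancer", "Leo", "Virgo", "Libra",
         "Scorpio", "Sagittarius", "Capricorn", "Aquarius", "Pisces"}
CATEGORIES = {"Daily Chaos", "Love Life", "Career", "Vibe Check"}
TEMPLATE = re.compile(
    r"Sign: (.+)\nCategory: (.+)\nDate: \d{4}-\d{2}-\d{2}\n"
    r"Generate an unhinged horoscope\."
)

def check_line(raw: str) -> list[str]:
    """Return a list of problems for one ShareGPT JSONL line (empty = OK)."""
    problems = []
    msgs = json.loads(raw)["messages"]
    if [m["role"] for m in msgs] != ["user", "assistant"]:
        problems.append("expected exactly one user + one assistant message")
        return problems
    m = TEMPLATE.fullmatch(msgs[0]["content"])
    if not m or m.group(1) not in SIGNS or m.group(2) not in CATEGORIES:
        problems.append("user message does not match the 4-line template")
    n_words = len(msgs[1]["content"].split())
    if not 10 <= n_words <= 100:  # loose word-count proxy for the 30-80 token target
        problems.append(f"response length out of range ({n_words} words)")
    return problems
```

Run over the JSONL file, any line returning a non-empty list fails validation.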
### Sample training pairs
```json
{"messages": [{"role": "user", "content": "Sign: Aries\nCategory: Daily Chaos\nDate: 2026-04-15\nGenerate an unhinged horoscope."}, {"role": "assistant", "content": "You will argue with a GPS today. You will lose. It knows things about you that you told no one. Lucky object: a fork you've been suspicious of since February."}]}
{"messages": [{"role": "user", "content": "Sign: Capricorn\nCategory: Career\nDate: 2026-04-15\nGenerate an unhinged horoscope."}, {"role": "assistant", "content": "Your boss will email you at 11:47pm. It will just say 'hmm'. Do not respond. Do not sleep. Just know."}]}
```
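The Alpaca-format mirror mentioned above is a mechanical mapping of these pairs. A minimal sketch of the conversion (the function name is ours, not part of the repo's tooling):

```python
import json

def sharegpt_to_alpaca(line: str) -> dict:
    """Map one ShareGPT JSONL line to the {instruction, output} Alpaca form."""
    msgs = json.loads(line)["messages"]
    by_role = {m["role"]: m["content"] for m in msgs}
    return {"instruction": by_role["user"], "output": by_role["assistant"]}
```

Apply it line-by-line over the ShareGPT file and write one JSON object per line to the mirror.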
## Files in this repo

| File | Purpose |
|---|---|
| adapter_config.json | PEFT adapter configuration (rank, alpha, target modules) |
| adapter_model.safetensors | LoRA delta weights (~22MB) |
| Tokenizer files (if shipped) | Inherit from base – re-load from meta-llama/Llama-3.2-1B-Instruct if absent |
## Related
- Merged + Q4_K_M GGUF (run-ready): edbuildingstuff/unhinged-horoscopes
- Reference Android app (Flutter + llamadart): Unhinged Horoscopes – Google Play / horoscope.ertas.ai (bundle id `ai.ertas.horoscope`)
- Fine-tuning platform: Ertas.AI
## License and credits
- Adapter weights: Apache-2.0 (downstream use must also comply with Meta's Llama 3.2 community licence)
- Training dataset: MIT
- Fine-tuned with Ertas.AI, the managed fine-tuning platform that ran this LoRA on pre-configured GPUs end-to-end
- Built by Edward Yang (edbuildingstuff) as a reference POC for Ertas Product A: build your own on-device AI model and ship it inside your app. App live at horoscope.ertas.ai / Google Play.