PixelModel 🖼️

A neural network where the weights are the image.

🧪 Dataset vs Outputs

Ground truth dataset images compared with generated outputs.

[Image grid: dataset image and generated output shown side by side for six prompts: Red, Green, Blue, White, Yellow, Dark]

What is this?

model.png is not a picture of anything — it is the model. Every pixel's RGB values encode neural network weights:

R channel — weight magnitude
B channel — weight sign (≥128 = positive)
G channel — bias values
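The channel layout above can be sketched as a small decoder. This is only an illustration: the README does not state how 8-bit values are scaled to real numbers, so `max_abs` and the linear mapping used for the G channel are assumptions, and `decode_pixels` is a hypothetical name rather than a function from model.py.

```python
import numpy as np

def decode_pixels(rgb: np.ndarray, max_abs: float = 1.0):
    """Split an (H, W, 3) uint8 array into signed weights and bias values.

    max_abs is an assumed scale: the magnitude represented by a channel value of 255.
    """
    r = rgb[..., 0].astype(np.float32)
    g = rgb[..., 1].astype(np.float32)
    b = rgb[..., 2]

    magnitude = r / 255.0 * max_abs                           # R channel: weight magnitude
    sign = np.where(b >= 128, 1.0, -1.0).astype(np.float32)   # B channel: >= 128 means positive
    weights = sign * magnitude

    # Assumption: G maps linearly from [0, 255] to [-max_abs, +max_abs].
    biases = (g / 255.0 * 2.0 - 1.0) * max_abs
    return weights, biases

pixels = np.array([[[255, 0, 128], [255, 255, 127]]], dtype=np.uint8)
weights, biases = decode_pixels(pixels)
```

With these assumed scales, the first pixel decodes to a weight of +1 and the second to a weight of -1, because only the B channel differs across the 128 threshold.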

At inference, the pixels are parsed into three weight matrices (W1, W2, W3) that form a tiny MLP. The prompt is embedded into a 32-dimensional vector, and a single forward pass generates a 32×32 image. Training optimizes the pixel values directly via gradient descent, so the PNG itself becomes the model.

📁 Files

model.png       ← THE MODEL (64×3200 px)
main.py         ← inference
train.py        ← training
model.py        ← architecture
dataset/        ← training data
  cat.png
  cat.txt       ← prompt: "a cat"
  ...
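The dataset layout pairs each image with a text file of the same name. A minimal sketch of collecting those pairs follows; `find_pairs` is a hypothetical helper, not necessarily how train.py reads the folder, and it only gathers paths and prompts without decoding the images.

```python
from pathlib import Path

def find_pairs(dataset_dir):
    """Return (image_path, prompt) tuples for every PNG with a matching .txt file."""
    pairs = []
    for image_path in sorted(Path(dataset_dir).glob("*.png")):
        prompt_path = image_path.with_suffix(".txt")
        if prompt_path.exists():
            # The prompt file is assumed to hold a single short string such as "a cat".
            prompt = prompt_path.read_text(encoding="utf-8").strip()
            pairs.append((image_path, prompt))
    return pairs
```

Images without a matching prompt file are skipped in this sketch rather than raising an error.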

⚙️ Usage

Train

python train.py
python train.py --epochs 500 --lr 0.05

Generate

python main.py "red"
python main.py "a cat" --out cat_out.png --scale 8

--scale 8 upscales 32×32 → 256×256 using nearest-neighbour interpolation.
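Nearest-neighbour upscaling by an integer factor amounts to repeating every pixel along both axes. The sketch below shows the idea with NumPy; `upscale_nearest` is an illustrative name, and main.py may well use an image library instead.

```python
import numpy as np

def upscale_nearest(image: np.ndarray, scale: int) -> np.ndarray:
    """Repeat every pixel `scale` times along height and width."""
    return np.repeat(np.repeat(image, scale, axis=0), scale, axis=1)

small = np.zeros((32, 32, 3), dtype=np.uint8)
large = upscale_nearest(small, 8)
print(large.shape)  # (256, 256, 3)
```

Because no values are blended, the upscaled image keeps the hard pixel edges of the 32×32 output.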

🧠 Architecture

prompt string
  → char-level embedding → 32-dim vector
  → W1 (64×32)  → tanh
  → W2 (64×64)  → tanh
  → W3 (3072×64) → sigmoid
  → reshape → 32×32×3 image

All weights live inside model.png. Opening the PNG is literally opening the neural network.
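The pipeline above can be sketched as a plain NumPy forward pass. Several details are assumptions: the README does not describe the char-level embedding, so `embed_prompt` is a hypothetical stand-in (scaled character codes, zero-padded), and the biases are assumed to be one per output unit. Random weights are used here in place of values decoded from model.png.

```python
import numpy as np

EMBED_DIM, HIDDEN, OUT = 32, 64, 32 * 32 * 3  # OUT is 3072 values

def embed_prompt(prompt: str, dim: int = EMBED_DIM) -> np.ndarray:
    """Hypothetical char-level embedding: scaled character codes, zero-padded to `dim`."""
    codes = [ord(c) / 255.0 for c in prompt[:dim]]
    vec = np.zeros(dim, dtype=np.float32)
    vec[:len(codes)] = codes
    return vec

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(prompt, W1, b1, W2, b2, W3, b3):
    x = embed_prompt(prompt)       # shape (32,)
    h1 = np.tanh(W1 @ x + b1)      # shape (64,)
    h2 = np.tanh(W2 @ h1 + b2)     # shape (64,)
    y = sigmoid(W3 @ h2 + b3)      # shape (3072,), values in [0, 1]
    return y.reshape(32, 32, 3)

# Random placeholder weights with the shapes listed in the architecture.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(HIDDEN, EMBED_DIM)), np.zeros(HIDDEN)
W2, b2 = rng.normal(size=(HIDDEN, HIDDEN)), np.zeros(HIDDEN)
W3, b3 = rng.normal(size=(OUT, HIDDEN)), np.zeros(OUT)

image = forward("red", W1, b1, W2, b2, W3, b3)
print(image.shape)  # (32, 32, 3)
```

Multiplying the result by 255 and casting to 8-bit integers would give pixel data that can be saved as an image.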

📊 Dataset Tips

6–20 image-prompt pairs are enough
Simple targets converge fastest (solid colors, gradients, shapes)
200–500 epochs are typically sufficient
Loss below 0.001 is good for simple datasets
Model capacity is fixed (~600K implicit parameters)
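The README does not spell out how the ~600K figure is derived. One plausible reading is that it counts every channel value in the 64×3200 image, while the three matrices listed under Architecture hold fewer actual weights. The arithmetic below checks both counts; the bias count assumes one bias per output unit, which is not confirmed by the source.

```python
# (outputs, inputs) for W1, W2, W3 as listed under Architecture
layers = [(64, 32), (64, 64), (3072, 64)]

weights = sum(rows * cols for rows, cols in layers)
biases = sum(rows for rows, _ in layers)   # assumption: one bias per output unit
channel_values = 64 * 3200 * 3             # pixels in model.png times RGB channels

print(weights)         # 202752
print(biases)          # 3200
print(channel_values)  # 614400
```

Under this reading, adding more training images does not add capacity: the number of values the PNG can hold stays the same.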

It's a toy. It's not useful. But it's cool that it works.

Seton Labs · Coordinate · Evaluate · Upgrade
