Update BF16 weights + code to modelv2 shards (region LN + finetune support)

#32 opened by err805 (moondream org)

Summary
This PR updates moondream/moondream3-preview to the new BF16 codepath, adds LN to the region head, enables finetune adapters (LoRA), and fixes spatial‑ref handling so that spatial refs supplied as inputs are no longer re-encoded during answer generation.

New weights

  • Added modelv2-00001-of-00004.safetensors through modelv2-00004-of-00004.safetensors
  • Updated model.safetensors.index.json to point to modelv2-* as the new default
  • Legacy model-0000x-of-00004.safetensors shards are retained for compatibility with hard‑coded URLs
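As a minimal sketch of how the updated index redirects loading to the new shards: a sharded safetensors checkpoint ships a `model.safetensors.index.json` whose `weight_map` maps each tensor name to the shard file that stores it. The tensor names below are hypothetical, for illustration only.

```python
import json

# Hypothetical minimal index in the standard sharded-safetensors format.
# "weight_map" maps each tensor name to the shard file that stores it.
index = {
    "metadata": {"total_size": 0},
    "weight_map": {
        "model.embed_tokens.weight": "modelv2-00001-of-00004.safetensors",
        "region.coord_decoder.weight": "modelv2-00004-of-00004.safetensors",
    },
}

def shard_for(tensor_name: str, index: dict) -> str:
    """Return the shard file that holds a given tensor."""
    return index["weight_map"][tensor_name]

# A loader only needs to open the distinct shard files named in the map.
shards = sorted(set(index["weight_map"].values()))
```

Because the legacy `model-0000x-of-00004.safetensors` files are untouched, anything fetching them by hard-coded URL keeps working, while loaders that follow the index pick up the `modelv2-*` shards.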

Region model update

  • Region head now applies LN before coord/size decoders (matches the new weights and backend parity)
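The change above can be sketched as follows. This is an illustrative NumPy sketch, not the actual implementation: the hidden size, decoder shapes, and the absence of learned LN scale/shift are all assumptions; the point is only that the same normalized activations now feed both the coord and size decoders.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize over the last dimension (no learned scale/shift, for brevity).
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
hidden = 64                                   # hypothetical region-head width
w_coord = rng.standard_normal((hidden, 2))    # hypothetical coord decoder
w_size = rng.standard_normal((hidden, 2))     # hypothetical size decoder

def region_head(h):
    h = layer_norm(h)          # new: LN applied before both decoders
    return h @ w_coord, h @ w_size

coords, sizes = region_head(rng.standard_normal((1, hidden)))
```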

Finetune / LoRA support

  • Adapters are resolved by finetune_id@step and fetched from the finetune endpoint
  • API-style model strings are supported: the model-name prefix is ignored and the trailing /<finetune_id>@<step> segment is parsed
  • Example request format (API):
    { "model": "moondream3-preview/01K...@80", "question": "...", "image_url": "..." }
  • Example model usage:
    model.query(image, question, settings={"adapter": "01K...@80"})
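The adapter-selection parsing described above could be sketched like this. The helper name and return shape are hypothetical (this is not the repo's actual function), and the finetune id used below is a made-up example:

```python
def parse_model_string(model: str):
    """Parse an API model string like 'moondream3-preview/<finetune_id>@<step>'.

    The leading model-name prefix is ignored; only the trailing
    '<finetune_id>@<step>' segment selects an adapter. Returns
    (finetune_id, step), or (None, None) when no adapter is given.
    """
    if "/" not in model:
        return None, None          # base model, no adapter requested
    adapter = model.rsplit("/", 1)[1]
    finetune_id, _, step = adapter.partition("@")
    return finetune_id, (int(step) if step else None)
```

Under these assumptions, "moondream3-preview/abc123@80" resolves to adapter "abc123" at checkpoint step 80, while a bare "moondream3-preview" selects no adapter.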

vikhyatk changed pull request status to merged
