Update BF16 weights + code to modelv2 shards (region LN + finetune support)
#32 opened by err805
Summary
This PR updates moondream/moondream3-preview to the new BF16 codepath, adds region‑head LN, enables finetune adapters (LoRA), and fixes spatial‑ref handling so spatial refs can be provided as inputs without re-encoding during answer generation.
New weights
- Added `modelv2-00001-of-00004.safetensors` … `modelv2-00004-of-00004.safetensors`
- Updated `model.safetensors.index.json` to point to `modelv2-*` as the new default
- Legacy `model-0000x-of-00004.safetensors` files are retained for hard‑coded URL compatibility
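The index update works because sharded safetensors checkpoints carry a `weight_map` that maps each tensor name to the shard file containing it; repointing the index is enough to make loaders pick up the `modelv2-*` shards. A minimal sketch of shard resolution (the helper name and the example tensor name are mine, not from this PR):

```python
import json

def shard_for_tensor(index: dict, tensor_name: str) -> str:
    """Return the shard filename holding `tensor_name`, given a parsed
    model.safetensors.index.json. Raises KeyError for unknown tensors."""
    # "weight_map" maps tensor names to shard files, e.g. a (hypothetical)
    # "lm_head.weight" -> "modelv2-00004-of-00004.safetensors"
    return index["weight_map"][tensor_name]

# Loading the real index would look like:
# with open("model.safetensors.index.json") as f:
#     index = json.load(f)
```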
Region model update
- Region head now applies LN before coord/size decoders (matches the new weights and backend parity)
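A minimal NumPy sketch of the shape of this change, i.e. a shared LayerNorm applied to the hidden state before both the coordinate and size decoders (function names, dimensions, and the plain matrix-multiply decoders are hypothetical simplifications; the real model uses learned LN and decoder weights):

```python
import numpy as np

def layer_norm(x: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Normalize the last axis to zero mean / unit variance (no learned affine)."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def region_head(hidden: np.ndarray, w_coord: np.ndarray, w_size: np.ndarray):
    """Apply LN once, then feed the same normalized hidden state to both
    the coord decoder and the size decoder -- the ordering this PR introduces."""
    h = layer_norm(hidden)
    return h @ w_coord, h @ w_size
```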
Finetune / LoRA support
- Adapters are resolved via `finetune_id@step` and fetched from the finetune endpoint
- API-style `model` strings are supported (the base-model prefix is ignored; `/<finetune_id>@<step>` is parsed)
- Example request format (API): `{ "model": "moondream3-preview/01K...@80", "question": "...", "image_url": "..." }`
- Example model usage: `model.query(image, question, settings={"adapter": "01K...@80"})`
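The resolution rules above can be sketched as a small parser: strip any base-model prefix before the `/`, then split the remainder into a finetune id and a step. The function name and the `01KABC` id below are hypothetical, and this is an illustrative sketch of the parsing logic, not the actual server code:

```python
import re

def parse_adapter(model_str: str):
    """Parse an API-style model string such as '<base>/<finetune_id>@<step>'
    into (finetune_id, step). Returns None for a plain base-model string."""
    # The base-model prefix (everything up to the last '/') is ignored.
    spec = model_str.rsplit("/", 1)[-1]
    m = re.fullmatch(r"(?P<fid>[^@/]+)@(?P<step>\d+)", spec)
    if m is None:
        return None  # no '@step' suffix: no adapter requested
    return m.group("fid"), int(m.group("step"))
```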
vikhyatk changed pull request status to merged