Halide Vision

Halide Vision is a MiniCPM-V 4.6 checkpoint fine-tuned for analog film-scan defect extraction. It is maintained by Lonelyguyse1 for Project Halide.

The model emits JSON defect proposals for dust, dirt, scratches, hair-like surface contamination, emulsion damage, chemical stains, and light leaks. The Project Halide runtime validates each response against the JSON schema, removes low-confidence and duplicate boxes, and falls back to tiled inspection when thin crack networks in a large scan are too fine to resolve at full-frame scale.
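The validation and filtering steps can be illustrated with a minimal sketch. The actual Project Halide schema is not published in this card, so the field names (`label`, `box`, `confidence`), the label strings, both thresholds, and the IoU-based duplicate test are all assumptions made for illustration, not the runtime's implementation.

```python
import json

# Hypothetical label set derived from the defect classes listed above.
ALLOWED_LABELS = {"dust", "dirt", "scratch", "hair",
                  "emulsion_damage", "chemical_stain", "light_leak"}


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def parse_proposals(raw):
    """Parse model output and keep only proposals matching the assumed schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(data, list):
        return []
    valid = []
    for item in data:
        if not isinstance(item, dict):
            continue
        label, box, score = item.get("label"), item.get("box"), item.get("confidence")
        if not isinstance(label, str) or label not in ALLOWED_LABELS:
            continue
        if not _is_number(score):
            continue
        if not (isinstance(box, list) and len(box) == 4 and all(map(_is_number, box))):
            continue
        x0, y0, x1, y1 = box
        if x1 <= x0 or y1 <= y0:  # reject empty or inverted boxes
            continue
        valid.append({"label": label, "box": [x0, y0, x1, y1], "confidence": float(score)})
    return valid


def iou(a, b):
    """Intersection over union of two [x0, y0, x1, y1] boxes."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_proposals(proposals, min_confidence=0.5, iou_threshold=0.5):
    """Drop low-confidence boxes, then same-label boxes that overlap a kept box."""
    kept = []
    for p in sorted(proposals, key=lambda p: p["confidence"], reverse=True):
        if p["confidence"] < min_confidence:
            continue
        if any(k["label"] == p["label"] and iou(k["box"], p["box"]) >= iou_threshold
               for k in kept):
            continue
        kept.append(p)
    return kept
```

Processing boxes in descending confidence order means that when two proposals overlap, the more confident one is the one that survives.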

Training Summary

  • Base model: openbmb/MiniCPM-V-4.6
  • Training method: LoRA fine-tuning with LLaMA-Factory, merged for inference
  • Curriculum: FilmDamageSimulator annotations, procedural film-defect positives, hard clean negatives, and a v7 crack curriculum
  • Held-out private negatives: used only for evaluation, not for training

Held-Out Smoke Result

Final v7 checkpoint with 960 px tiled fallback:

Sample      Expected surface condition           Result
negative1   Long scratches across portrait       8 defects
negative2   Abraded emulsion and dirt patches    9 defects
negative3   Severe emulsion damage and debris    6 defects
negative4   Near-clean hard negative             0 defects
negative5   Broad lifted crack network           45 defects
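
As a rough illustration of the 960 px tiled fallback, the sketch below splits a scan into tiles and maps tile-local boxes back to full-scan coordinates. It assumes that 960 px is the tile side length; the 96 px overlap, the function names, and the `[x0, y0, x1, y1]` box layout are hypothetical and are not taken from the Project Halide runtime.

```python
def tile_origins(length, tile=960, overlap=96):
    """Return start offsets so that tiles of size `tile` cover [0, length)."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be non-negative and smaller than tile")
    if length <= tile:
        return [0]
    step = tile - overlap
    origins = list(range(0, length - tile, step))
    origins.append(length - tile)  # final tile sits flush with the far edge
    return origins


def tile_boxes(width, height, tile=960, overlap=96):
    """Return (x0, y0, x1, y1) crop rectangles in row-major order."""
    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in tile_origins(height, tile, overlap)
            for x in tile_origins(width, tile, overlap)]


def to_full_frame(box, tile_box):
    """Shift a tile-local [x0, y0, x1, y1] box into full-scan coordinates."""
    ox, oy = tile_box[0], tile_box[1]
    return [box[0] + ox, box[1] + oy, box[2] + ox, box[3] + oy]
```

Because neighbouring tiles overlap, a defect near a tile border can be reported twice, so the merged full-frame boxes would still need a duplicate-removal pass.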

Runtime Notes

This model is intended to run inside Project Halide with GPU inference. The runtime refuses local CPU model inference and does not call cloud inference APIs.

The repo also includes a llama.cpp Q4_K_M GGUF artifact: minicpm-v-4.6-merged-v7-crack-curriculum-r1-ckpt625-q4_k_m.gguf.

Use the model as an inspection aid. It can over-box broad damage regions. Treat film metadata as context, not ground truth, unless it is verified by notes or edge marks.

Model size: 1B params (Safetensors, tensor type BF16)