# Uchen-Ume Binary Script Classifier
This model is a fine-tuned version of Meta's DINOv3-ViT-S/16 for binary classification of Tibetan scripts (Uchen vs. Ume). It serves as the "Router" stage for a hierarchical classification pipeline.
## Model Details

### Model Description
The model was developed to provide a high-reliability baseline for separating formal block scripts (Uchen) from cursive script families (Ume). By focusing on global page geometry rather than local character patches, it achieves high accuracy on whole-page manuscript scans.
- Project Name: The BDRC Etext Corpus
- Developed by: Dharmaduta
- Specifications provided by: Buddhist Digital Resource Center (BDRC)
- Funded by: Khyentse Foundation
- Model type: Vision Transformer (ViT)
- Language(s): Tibetan (Classical/Manuscript)
- Finetuned from model: facebook/dinov3-vits16-pretrain-lvd1689m
## Performance Summary

The model achieved near-perfect discrimination on the held-out test set. Per-class results are strong for both scripts; per the confusion matrix below, recall for cursive Ume exceeds 99%, while recall for formal Uchen is roughly 96%.
- Test Accuracy: 98.95%
- Macro F1-Score: 0.984
- AUC-ROC: 0.9988
- Best Training Configuration: Stage B (Partial Backbone Unfreezing)
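"Partial Backbone Unfreezing" means the classification head and only the last few transformer blocks of the DINOv3 backbone are trained, while earlier blocks stay frozen. The sketch below illustrates the selection logic; the parameter naming scheme and the choice of two unfrozen blocks are illustrative assumptions modeled on common Hugging Face ViT checkpoints, not the project's actual Stage B configuration.

```python
def select_trainable(param_names, n_unfrozen_blocks=2, total_blocks=12):
    """Pick the parameters that stay trainable under partial backbone
    unfreezing: the classification head plus the last N transformer
    blocks. Naming scheme is an illustrative assumption."""
    cutoff = total_blocks - n_unfrozen_blocks  # first unfrozen block index
    trainable = set()
    for name in param_names:
        if name.startswith("classifier"):
            trainable.add(name)  # the head is always trained
        elif name.startswith("backbone.encoder.layer."):
            block_idx = int(name.split(".")[3])
            if block_idx >= cutoff:
                trainable.add(name)  # last N blocks are unfrozen
        # patch embeddings and early blocks stay frozen
    return trainable
```

In PyTorch this set would then drive `p.requires_grad = name in trainable` over `model.named_parameters()`. Freezing early blocks preserves the pretrained low-level features while letting the later blocks adapt to manuscript page geometry.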
### Confusion Matrix
| Predicted \ Actual | Uchen | Ume |
|---|---|---|
| Uchen | 159 | 2 |
| Ume | 6 | 595 |
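The headline metrics can be re-derived from this confusion matrix as a sanity check (reading rows as predictions and columns as ground truth, as in the table):

```python
# Confusion matrix from the table above: rows = predicted, cols = actual.
tp_uchen, fp_uchen = 159, 2   # predicted Uchen row
fn_uchen, tn_uchen = 6, 595   # predicted Ume row

total = tp_uchen + fp_uchen + fn_uchen + tn_uchen
accuracy = (tp_uchen + tn_uchen) / total

def f1(tp, fp, fn):
    """Per-class F1 from true positives, false positives, false negatives."""
    return 2 * tp / (2 * tp + fp + fn)

f1_uchen = f1(tp_uchen, fp_uchen, fn_uchen)
f1_ume = f1(tn_uchen, fn_uchen, fp_uchen)  # for Ume: FP = 6, FN = 2
macro_f1 = (f1_uchen + f1_ume) / 2

print(f"accuracy = {accuracy:.4f}")  # 0.9895
print(f"macro F1 = {macro_f1:.3f}")  # 0.984
```

Both values match the reported Test Accuracy (98.95%) and Macro F1-Score (0.984).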
## Uses

### Direct Use
This model is intended to be used as a pre-processing filter or router within the BDRC Etext Corpus pipeline. It can automatically sort large digital archives into Uchen or Ume categories to be processed by specialized downstream OCR engines.
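As an illustration of the router role, a dispatch step like the following could sit between this classifier and the downstream engines. The engine identifiers here are hypothetical placeholders, not actual BDRC components:

```python
def route_page(predicted_label: str) -> str:
    """Map the classifier's output label to a downstream OCR engine.
    Engine names are hypothetical placeholders for illustration."""
    engines = {
        "Uchen": "ocr-uchen-engine",  # specialized block-script OCR (placeholder)
        "Ume": "ocr-ume-engine",      # specialized cursive-script OCR (placeholder)
    }
    try:
        return engines[predicted_label]
    except KeyError:
        raise ValueError(f"Unexpected label: {predicted_label!r}")
```

Raising on unexpected labels (rather than silently defaulting) keeps misconfigured label maps from routing whole archives to the wrong engine.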
### Out-of-Scope Use
- Classification of modern printed Tibetan fonts (untested).
- Recognition of non-Tibetan scripts (Sanskrit, Lantsa, etc.).
- Character-level recognition (OCR).
## Bias, Risks, and Limitations
The model was trained primarily on BDRC manuscript scans. It may struggle with:
- Extremely faint or damaged woodblock prints.
- Pages containing a roughly equal mix of both Uchen and Ume (Multi-script).
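Where these failure modes matter, one possible mitigation (not part of the released pipeline; the 0.90 threshold is an arbitrary illustrative value) is to abstain on low-confidence predictions and flag those pages for manual review:

```python
import math

def classify_with_abstain(logits, labels=("Uchen", "Ume"), threshold=0.90):
    """Softmax the two class logits and abstain below a confidence
    threshold. The 0.90 threshold is an illustrative assumption."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # subtract max for stability
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "review"  # ambiguous, possibly multi-script or damaged page
    return labels[best]
```

Pages routed to "review" would then be inspected by hand instead of being sent to a single-script OCR engine.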
## How to Get Started with the Model
```python
from transformers import AutoImageProcessor, AutoModelForImageClassification
import torch
from PIL import Image

# Load the processor and fine-tuned classifier from the Hub
processor = AutoImageProcessor.from_pretrained("openpecha/uchen-ume-classifier")
model = AutoModelForImageClassification.from_pretrained("openpecha/uchen-ume-classifier")

# Preprocess a single page scan
image = Image.open("manuscript_page.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

prediction = outputs.logits.argmax(-1).item()
print(f"Detected Script: {model.config.id2label[prediction]}")
```
## Model tree for openpecha/uchen-ume-classifier

Base model: facebook/dinov3-vits16-pretrain-lvd1689m