DocOracle-v1

DocOracle-v1 is a compact Hugging Face transformer router for synthetic business-document workflows. It classifies synthetic invoices, inbox requests, and RFQs into operational decisions used by an agentic back-office benchmark.

The model is part of PerimeterReasoner-AgentBench, a public-safe benchmark for local-first business-document agents. The benchmark combines deterministic extraction, relational memory, fake ERP actions, human-review routing, audit traces, exact metrics, and this optional learned routing layer.

Labels

DocOracle-v1 predicts one of four workflow labels:

Label	Meaning
`auto_approve`	Route a valid invoice to automatic approval.
`human_review`	Route a risky, incomplete, unsupported, unknown, or ambiguous case to human review.
`draft_created`	Route a resolved inbox/order request to fake ERP order-draft creation.
`draft_quote`	Route a resolved RFQ request to quote drafting.

Label Examples

`auto_approve`

The model should predict `auto_approve` when an invoice is complete, uses a supported currency, includes a purchase order, and is below the review threshold.

Invoice INV-000123
Vendor: Alpine Robotics AG
Total: 1250.00 EUR
VAT: 250.00 EUR
Due date: 2026-06-15
PO number: PO-1001
Payment terms: NET 30
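
The conditions listed for `auto_approve` can be read as a deterministic rule. The sketch below is a minimal illustration, not the benchmark's actual policy code. The `Invoice` dataclass, the `route_invoice` function, the 5000.00 threshold, and the set of supported currencies are all assumptions made for this example; the source does not state the real values.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional

# Hypothetical policy values: the real threshold and currency list are not
# given in this model card.
REVIEW_THRESHOLD = Decimal("5000.00")
SUPPORTED_CURRENCIES = {"EUR", "CHF", "USD"}


@dataclass
class Invoice:
    invoice_id: str
    vendor: Optional[str]
    total: Optional[Decimal]
    currency: Optional[str]
    po_number: Optional[str]


def route_invoice(inv: Invoice) -> str:
    """Return 'auto_approve' only if every condition holds, else 'human_review'."""
    # Completeness: the core fields must all be present and non-empty.
    if any(v is None or v == "" for v in (inv.vendor, inv.total, inv.currency)):
        return "human_review"
    if inv.currency not in SUPPORTED_CURRENCIES:
        return "human_review"
    # A purchase order reference is required for automatic approval.
    if not inv.po_number:
        return "human_review"
    if inv.total >= REVIEW_THRESHOLD:
        return "human_review"
    return "auto_approve"


invoice = Invoice("INV-000123", "Alpine Robotics AG", Decimal("1250.00"), "EUR", "PO-1001")
print(route_invoice(invoice))  # auto_approve
```

Any failed check falls through to `human_review`, which mirrors the card's description of that label as the route for incomplete, unsupported, or risky cases.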

`human_review`

The model should predict `human_review` when the case is risky, incomplete, unsupported, unknown, or ambiguous.

Invoice INV-000124
Vendor: Marinello & Co
Total: 8750.00 EUR
VAT: 1750.00 EUR
Due date: 2026-06-25
PO number: PO-1006
Payment terms: NET 45

Reason: the amount is above the automatic approval threshold.

Another example:

From: buyer@unknown.example
Subject: New order

Please send 20 boxes of our usual premium filters next Tuesday.

Reason: the sender/customer cannot be resolved from memory.

`draft_created`

The model should predict `draft_created` when an inbox/order request has enough information to create a fake ERP order draft.

From: purchasing@marinello.example
Subject: Repeat order

Please send 20 boxes of our usual premium filters next Tuesday.
Use our normal shipping method.

Reason: the customer, product alias, quantity, and shipping preference can be resolved.
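
The resolution step described in this reason can be sketched as a set of lookups against a small memory. Everything in the block is hypothetical: the dictionaries stand in for the benchmark's relational memory, and the customer name, SKU, shipping value, and `route_order_request` function are invented for illustration.

```python
from typing import Optional

# Hypothetical in-memory tables standing in for relational customer/product memory.
CUSTOMERS = {"purchasing@marinello.example": "Marinello & Co"}
PRODUCT_ALIASES = {("Marinello & Co", "usual premium filters"): "FILTER-PREMIUM"}
SHIPPING_PREFS = {"Marinello & Co": "standard ground"}


def route_order_request(sender: str, alias: str, quantity: Optional[int]) -> str:
    """Return 'draft_created' only when every reference resolves from memory."""
    customer = CUSTOMERS.get(sender)
    if customer is None:
        return "human_review"  # unknown sender
    if (customer, alias) not in PRODUCT_ALIASES:
        return "human_review"  # alias does not map to a product for this customer
    if quantity is None or quantity <= 0:
        return "human_review"  # missing or invalid quantity
    if customer not in SHIPPING_PREFS:
        return "human_review"  # no stored shipping preference
    return "draft_created"


print(route_order_request("purchasing@marinello.example", "usual premium filters", 20))
# draft_created
print(route_order_request("buyer@unknown.example", "usual premium filters", 20))
# human_review
```

The same request text routes differently depending on whether the sender is known, which is why the two inbox examples in this card carry different labels.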

`draft_quote`

The model should predict `draft_quote` when a customer is asking for pricing and the RFQ has enough information to prepare a quote draft.

Customer asks for a quote for 100 industrial sensors, delivery in Zurich, payment terms NET 45.

Reason: the product exists, the quantity is present, and the payment terms are supported.
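
As a rough illustration of how the three checks in this reason could be made deterministic, the sketch below pulls the quantity, product, and payment terms out of the RFQ text with regular expressions. The patterns, the `CATALOG` and `SUPPORTED_TERMS` sets, and the `route_rfq` function are assumptions for this example only and are tuned to the sentence shown above.

```python
import re

# Hypothetical reference data; the card does not list the real catalog or terms.
CATALOG = {"industrial sensors"}
SUPPORTED_TERMS = {"NET 30", "NET 45"}

QTY_PRODUCT = re.compile(r"quote for (\d+) ([a-z ]+?),", re.IGNORECASE)
TERMS = re.compile(r"payment terms (NET \d+)", re.IGNORECASE)


def route_rfq(text: str) -> str:
    """Return 'draft_quote' when product, quantity, and terms all check out."""
    qty_product = QTY_PRODUCT.search(text)
    terms = TERMS.search(text)
    if qty_product is None or terms is None:
        return "human_review"  # required fields could not be extracted
    quantity = int(qty_product.group(1))
    product = qty_product.group(2).strip().lower()
    if quantity <= 0 or product not in CATALOG:
        return "human_review"
    if terms.group(1).upper() not in SUPPORTED_TERMS:
        return "human_review"
    return "draft_quote"


rfq = ("Customer asks for a quote for 100 industrial sensors, "
       "delivery in Zurich, payment terms NET 45.")
print(route_rfq(rfq))  # draft_quote
```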

Technical Details

Architecture: BertForSequenceClassification
Base checkpoint: google/bert_uncased_L-2_H-128_A-2
Library: Hugging Face transformers
Checkpoint format: model.safetensors
Task: synthetic workflow routing / text classification
Training mode: full fine-tuning for the compact CPU-friendly checkpoint
Optional repo support: PEFT/LoRA training path and ModernBERT LoRA showcase path
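
For a sense of scale, the parameter count of this architecture can be worked out by hand. The calculation below assumes the usual configuration for `google/bert_uncased_L-2_H-128_A-2` (vocabulary of 30522, 512 positions, 2 token types, intermediate size 512) plus a four-way classification head. These values are assumptions taken from the standard BERT configuration and are not stated in this card.

```python
# Assumed configuration values (not stated in the model card).
VOCAB, MAX_POS, TYPE_VOCAB = 30522, 512, 2
HIDDEN, LAYERS, INTERMEDIATE, NUM_LABELS = 128, 2, 512, 4


def linear(n_in: int, n_out: int) -> int:
    """Parameters of a dense layer: weight matrix plus bias vector."""
    return n_in * n_out + n_out


layer_norm = 2 * HIDDEN  # scale and shift vectors

# Word, position, and token-type embeddings, followed by one LayerNorm.
embeddings = (VOCAB + MAX_POS + TYPE_VOCAB) * HIDDEN + layer_norm

# Self-attention (query, key, value, output), then the feed-forward block,
# each followed by a LayerNorm.
encoder_layer = (
    4 * linear(HIDDEN, HIDDEN) + layer_norm
    + linear(HIDDEN, INTERMEDIATE) + linear(INTERMEDIATE, HIDDEN) + layer_norm
)

pooler = linear(HIDDEN, HIDDEN)
classifier = linear(HIDDEN, NUM_LABELS)

total = embeddings + LAYERS * encoder_layer + pooler + classifier
print(total)  # 4386436
```

Under these assumptions the total is about 4.39M parameters, and roughly 90% of them sit in the word-embedding table rather than in the two encoder layers.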

Training Data

The model was trained only on synthetic examples generated by the benchmark. The data covers three task families:

invoice validation
inbox-to-ERP order drafting
RFQ triage

The generated split is balanced across the four workflow labels.

Split	Examples
Train	576
Test	144

No customer documents, real invoices, private prompts, credentials, or production code are included.
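
The split sizes are consistent with an 80/20 partition of 720 examples, with 180 examples per label. The sketch below shows one way such a balanced split could be produced. The `stratified_split` function, the seed, and the placeholder texts are hypothetical; the benchmark's own generator may work differently.

```python
import random
from collections import Counter, defaultdict

LABELS = ["auto_approve", "human_review", "draft_created", "draft_quote"]


def stratified_split(examples, test_fraction=0.2, seed=0):
    """Split (text, label) pairs so each label keeps the same test share."""
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))

    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        n_test = round(len(items) * test_fraction)
        test.extend(items[:n_test])
        train.extend(items[n_test:])
    return train, test


# Placeholder data: 720 examples, 180 per label.
examples = [(f"synthetic example {i}", LABELS[i % 4]) for i in range(720)]
train, test = stratified_split(examples)

print(len(train), len(test))  # 576 144
print(sorted(Counter(label for _, label in test).values()))  # [36, 36, 36, 36]
```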

Evaluation

Evaluation was run on the synthetic test split.

Metric	Value
Accuracy	0.9792
Macro F1	0.9791
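
For readers who want to reproduce these metrics, the sketch below computes accuracy and macro F1 in plain Python. The function names and the six-item toy data are illustrative and are not the benchmark's evaluation code or its predictions. Note that an accuracy of 0.9792 on 144 test examples corresponds to 141 correct predictions.

```python
LABELS = ["auto_approve", "human_review", "draft_created", "draft_quote"]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-label F1 scores."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); define it as 0 when the label never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data only: one auto_approve case is wrongly sent to human_review.
y_true = ["auto_approve", "auto_approve", "human_review",
          "human_review", "draft_created", "draft_quote"]
y_pred = ["auto_approve", "human_review", "human_review",
          "human_review", "draft_created", "draft_quote"]

print(accuracy(y_true, y_pred))
print(macro_f1(y_true, y_pred, LABELS))
print(round(141 / 144, 4))  # 0.9792
```

Because the test split is balanced, accuracy and macro F1 are expected to land close together, as they do in the table above.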

The deterministic rule-based benchmark remains the reference baseline. DocOracle-v1 is useful for comparing learned routing behavior against exact, reproducible rules.

Usage

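from transformers import pipeline

# Load the fine-tuned router. Replace YOUR_USERNAME with the Hub namespace
# that hosts the checkpoint.
classifier = pipeline(
    "text-classification",
    model="YOUR_USERNAME/doc-oracle-v1",
)

# A complete, low-value invoice that references a purchase order.
text = """Invoice INV-000123
Vendor: Alpine Robotics AG
Total: 1250.00 EUR
VAT: 250.00 EUR
Due date: 2026-06-15
PO number: PO-1001
Payment terms: NET 30"""

# The pipeline returns a list with one dict holding the top label and its score.
print(classifier(text))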

Expected label:

auto_approve
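
In an experiment, the pipeline output can be post-processed before it drives any action. The sketch below assumes the usual `text-classification` output shape (a list of dicts with `label` and `score` keys) and applies a confidence floor that falls back to `human_review`. The `decide` function and the 0.90 floor are hypothetical and not part of the benchmark, and the scores in the example are made up rather than real model outputs.

```python
def decide(predictions, min_score=0.90):
    """Pick the top label, but escalate low-confidence predictions.

    `predictions` is a list of {"label": str, "score": float} dicts, as
    returned by a text-classification pipeline for one input.
    """
    top = max(predictions, key=lambda p: p["score"])
    if top["score"] < min_score:
        return "human_review"
    return top["label"]


# Illustrative inputs with invented scores.
print(decide([{"label": "auto_approve", "score": 0.98}]))  # auto_approve
print(decide([{"label": "draft_quote", "score": 0.55}]))   # human_review
```

Falling back to `human_review` keeps uncertain cases on the conservative route, in line with how that label is defined in this card.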

Example Inputs

Invoice:

Invoice INV-000123
Vendor: Alpine Robotics AG
Total: 1250.00 EUR
VAT: 250.00 EUR
Due date: 2026-06-15
PO number: PO-1001
Payment terms: NET 30

Inbox request:

From: purchasing@marinello.example
Subject: Repeat order

Please send 20 boxes of our usual premium filters next Tuesday.
Use our normal shipping method.

RFQ:

Customer asks for a quote for 100 industrial sensors, delivery in Zurich, payment terms NET 45.

Intended Use

Use DocOracle-v1 for:

synthetic benchmark demos
workflow-routing experiments
comparing learned routing against deterministic business rules
interview or portfolio demonstrations of document AI and agent evaluation

It is not intended for real invoice approval, financial automation, or production business decisions.

Limitations

The model is trained only on synthetic data.
It does not understand real customer contracts, business policies, or ERP systems.
It should not be used for real approvals without validation on real data, monitoring, governance, and human-review controls.
The high metric values reflect the controlled synthetic benchmark, not real-world production performance.

Safety And Privacy

This model and its benchmark data are public-safe:

no customer data
no real invoices
no private prompts
no credentials
no LuxoAI production code
no private infrastructure details

Project Context

DocOracle-v1 is the learned router component of PerimeterReasoner-AgentBench. The broader benchmark includes:

deterministic invoice extraction and policy validation
synthetic email and RFQ tasks
relational customer/product memory
fake ERP state-machine actions
human-review escalation
audit traces
exact metrics and deterministic judge scoring
FastAPI and Docker surfaces
optional PEFT/LoRA and ModernBERT training paths

Safetensors

Model size: 4.39M params
Tensor type: F32