Ordis-1.5B-V355-VarGH-GGUF
GGUF Quantized Versions: 7 Formats
1.5B | World's Strongest 1.5B Practical Model
"Size implies limits; Architecture breaks them."
About Ordis
Ordis is the product of dozens of version iterations and 16+ controlled-variable experiments, built to push the absolute performance ceiling of 1.5B parameters.
Not a benchmark-gaming model. Ordis is engineered for real-world deployment, with native anti-hallucination, structured self-correction, causal reasoning, and honest epistemic humility (no safety templates).
For full technical details, benchmarks, demos, and theoretical foundation, see the full model card.
Quantized Versions
| Filename | Quant | Bits | Size | Max RAM | Recommendation |
|---|---|---|---|---|---|
| ordis-v355-vargh-1.5b-q2-k.gguf | Q2_K | 2 | ~0.7 GB | ~3.2 GB | Experimental only |
| ordis-v355-vargh-1.5b-q3-k-m.gguf | Q3_K_M | 3 | ~0.8 GB | ~3.3 GB | Low-end devices |
| ordis-v355-vargh-1.5b-q4-k-m.gguf | Q4_K_M | 4 | ~1.0 GB | ~3.5 GB | Recommended: best balance |
| ordis-v355-vargh-1.5b-q5-k-m.gguf | Q5_K_M | 5 | ~1.1 GB | ~3.6 GB | Good quality |
| ordis-v355-vargh-1.5b-q6-k.gguf | Q6_K | 6 | ~1.3 GB | ~3.8 GB | Desktop recommended |
| ordis-v355-vargh-1.5b-q8-0.gguf | Q8_0 | 8 | ~1.6 GB | ~4.1 GB | Near-lossless |
| ordis-v355-vargh-1.5b-f16.gguf | F16 | 16 | ~3.1 GB | ~5.6 GB | Full precision reference |
Which File Should I Choose?
- Mobile / embedded: Q3_K_M or Q4_K_M (under 1 GB)
- Laptop / general use: Q4_K_M (best quality-to-size ratio)
- Desktop with spare RAM: Q6_K or Q8_0 (minimal quality loss)
- Research / evaluation: F16 (no quantization artifacts)
- Extreme compression: Q2_K (expect noticeable quality degradation)
Quick Start
Ollama (Easiest)
```bash
ollama run hf.co/sugiken/Ordis-1.5B-V355-VarGH-GGUF:Q4_K_M
```
llama.cpp CLI
```bash
./llama-cli -m ordis-v355-vargh-1.5b-q4-k-m.gguf \
  -p "<|im_start|>user\nWhat makes you different?<|im_end|>\n<|im_start|>assistant\n" \
  -ngl 99 -n 512 --temp 0.7 --top-p 0.9
```
llama-cpp-python
```python
from llama_cpp import Llama

llm = Llama(model_path="ordis-v355-vargh-1.5b-q4-k-m.gguf", n_gpu_layers=-1)
output = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain causal reasoning in economics."}],
    max_tokens=512,
    temperature=0.7
)
print(output["choices"][0]["message"]["content"])
```
ModelScope SDK
```python
from modelscope import snapshot_download

model_dir = snapshot_download('sugiken/Ordis-1.5B-V355-VarGH-GGUF')
```
Git Clone
```bash
git clone https://www.modelscope.cn/sugiken/Ordis-1.5B-V355-VarGH-GGUF.git
```
Prompt Template
Ordis uses standard ChatML format:
```
<|im_start|>user
Your question here<|im_end|>
<|im_start|>assistant
```
Tips:
- Ask directly; no special prompt engineering needed
- Chinese questions get the strongest responses
- Ordis automatically uses `<think>` blocks for reasoning tasks
- "I don't know" is a feature, not a bug
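When a chat API that applies the template for you is not available (e.g. when passing a raw prompt to `llama-cli -p`), the ChatML string can be assembled with a small helper. This is a sketch: only the `<|im_start|>`/`<|im_end|>` markers come from the template above; the function itself is illustrative.

```python
# Minimal ChatML formatter for raw-prompt use (e.g. llama-cli -p).
# The special tokens come from the prompt template above; the helper is illustrative.
def to_chatml(messages: list[dict]) -> str:
    """Render [{'role': ..., 'content': ...}] as a ChatML prompt string."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages]
    parts.append("<|im_start|>assistant\n")  # open the assistant turn for generation
    return "".join(parts)

prompt = to_chatml([{"role": "user", "content": "What makes you different?"}])
print(prompt)
```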
Benchmarks (Summary)
Standard Benchmarks (lm-eval v0.4.10, 0-shot, A100-80GB)
0-shot = no examples given, matching real user experience. All four metrics beat the base model; no alignment tax paid.
| Benchmark | What It Tests | Ordis 1.5B | Base Qwen | Delta |
|---|---|---|---|---|
| ARC-Challenge | Science reasoning | 45.22% | 40.27% | +4.95 |
| HellaSwag | Common sense | 68.14% | 66.06% | +2.08 |
| GSM8K (CoT) | Math | 50.80% | 48.07% | +2.73 |
| TruthfulQA MC2 | Truthfulness | 47.73% | 43.47% | +4.26 |
| Average | Overall | 52.97% | 49.47% | +3.50 |
Custom Evaluations
| Benchmark | Score |
|---|---|
| Custom 60-Q Eval (6 dimensions) | 85.0% (51/60) |
| 124-Point Comprehensive | 86/114 (75.4%), Grade A |
| CLadder Causal Reasoning | 54.3% (highest at this scale) |
60-Question Breakdown:
| Dimension | Score |
|---|---|
| Reasoning | 100% |
| Common Sense | 100% |
| Defense Overload | 100% |
| Anti-Hallucination | 90% |
| Identity | 60% |
| IDK Ability | 60% |
Key Capabilities
- Structured Self-Correction (SSC): 5-step error correction protocol (acknowledge, attribute, correct, verify)
- Confidence-Guided Decisions: high confidence → assert; mid confidence → state the known limitation; low confidence → honest refusal
- Cross-Domain Causal Reasoning: Transfers causal structures across domains, not task-specific memorization
- Native Anti-Hallucination: Uncertainty from reasoning, not safety templates. No "As an AI..." disclaimers
- Runtime Metacognition: Internal [Snapshot] cognitive monitoring during conversation
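The confidence-guided behavior described above can also be mimicked at the application layer, for instance when post-processing model outputs that carry a confidence score. The thresholds and function below are hypothetical, not values or logic used inside Ordis.

```python
# Toy illustration of confidence-gated answering.
# Thresholds (0.8 / 0.4) are hypothetical, not Ordis-internal values.
def decide(answer: str, confidence: float) -> str:
    if confidence >= 0.8:   # high confidence -> assert the answer
        return answer
    if confidence >= 0.4:   # mid confidence -> state the known limitation
        return f"{answer} (note: this is near the edge of what I can verify)"
    return "I don't know."  # low confidence -> honest refusal

print(decide("Paris is the capital of France.", 0.95))
print(decide("The exact 1923 figure", 0.2))
```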
Known Limitations
| Limitation | Root Cause |
|---|---|
| Anti-Gaslighting = 0/4 | Physical law: open-loop systems cannot verify memory (F=0) |
| Mid-confidence instability | 1.5B capacity ceiling |
| English identity leakage | Base model prior too strong |
| Celebrity detail hallucination | Limited parametric memory |
Full analysis and honest disclosure in the complete model card.
Model Details
| Property | Value |
|---|---|
| Base Model | Qwen/Qwen2.5-1.5B-Instruct |
| Parameters | 1.5B |
| Fine-tuning | LoRA |
| Context Length | 32K tokens (base model native) |
| Training Seq Length | 2048 tokens (optimal performance range) |
| Languages | Chinese (primary), English |
| Original Format | BF16 SafeTensors |
| GGUF Converted By | OrdisAI |
| License | Apache 2.0 |
Citation
```bibtex
@misc{ordis2026,
  title={Ordis-1.5B-V355-VarGH: The Summit of Small Models},
  author={Liu, S.},
  year={2026},
  publisher={OrdisAI},
  url={https://www.ordisai.com}
}
```
Ordis-1.5B-V355-VarGH-GGUF | Apache 2.0 | OrdisAI
Built with honest engineering. Shipped with honest disclosure.