---
library_name: transformers
license: apache-2.0
license_link: LICENSE
pipeline_tag: image-text-to-text
base_model:
- Qwen/Qwen3.5-2B
tags:
- verus
- coding
- reasoning
- r1
language:
- en
---

# Verus-r1

> [!Note]
> This repository contains model weights and configuration files for **Verus-r1** in the Hugging Face Transformers format.
>
> Compatible with Hugging Face Transformers, vLLM, SGLang, and other major inference frameworks.
>
> Built for **coding**, **reasoning**, **debugging**, and concise general assistance.

## Verus-r1 Highlights

- **Coding-Focused**: Writes, fixes, explains, and reviews code.
- **Reasoning-Oriented**: Works through multi-step problems clearly.
- **Long Context**: Handles large prompts, files, and long conversations within a 262,144-token context window.
- **Instruction Following**: Responds in the format and style requested.
- **Efficient**: A compact 2B model for local or hosted inference.

## Model Overview

| Property | Value |
|---|---|
| Parameters | ~2B |
| Context Length | **262,144 tokens** |
| Architecture | Qwen3.5 |
| Chat Format | ChatML (`<\|im_start\|>` / `<\|im_end\|>`) |
| Dtype | bfloat16 |
| License | Apache 2.0 |

## Quickstart

### Installation

```bash
pip install "transformers>=4.52.0" accelerate torch
```

### Code Generation

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

MODEL_ID = "8F-ai/Verus-r1"

# Load the tokenizer and the bfloat16 weights, placing layers on available devices.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()

messages = [
    {
        "role": "system",
        "content": "You are Verus-r1, a reasoning coding assistant made by 8F-ai. You think through problems carefully before responding."
    },
    {
        "role": "user",
        "content": "Write a Python async context manager that manages a PostgreSQL connection pool using asyncpg."
    }
]

# Render the conversation with the model's chat template and open an assistant turn.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.inference_mode():
    generated_ids = model.generate(
        **inputs,
        max_new_tokens=2048,
        do_sample=True,  # temperature and top_p only take effect when sampling is enabled
        temperature=0.6,
        top_p=0.95,
    )

# Decode only the newly generated tokens, skipping the prompt.
output = tokenizer.decode(generated_ids[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
print(output)
```

### Quantized Inference (4-bit NF4, ~2 GB VRAM)

```python
# Requires the bitsandbytes package: pip install bitsandbytes
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

# 4-bit NF4 weights with double quantization; matmuls are computed in bfloat16.
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)

tokenizer = AutoTokenizer.from_pretrained("8F-ai/Verus-r1")
model = AutoModelForCausalLM.from_pretrained(
    "8F-ai/Verus-r1",
    quantization_config=quantization_config,
    device_map="auto",
)
```

## Intended Use Cases

| Use Case | Example |
|---|---|
| **Code Generation** | Write functions, classes, and scripts |
| **Debugging** | Fix bugs from code or error messages |
| **Code Review** | Explain code and suggest improvements |
| **Reasoning** | Break down multi-step problems |
| **Long Context** | Work with long prompts and files |
| **General Q&A** | Answer clearly and concisely |

## Limitations

- **English-Primary**: Fine-tuning was conducted predominantly on English-language code and documentation.

## Citation

```bibtex
@misc{verusr12026,
  title        = {Verus-r1: A Reasoning-Focused Coding Language Model with 262K Context},
  author       = {8F-ai},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/8F-ai/Verus-r1}},
  note         = {Apache 2.0 License}
}
```

## License

Verus-r1 is released under the **Apache License 2.0**. See [LICENSE](LICENSE) for full terms.

Derived from [Qwen/Qwen3.5-2B](https://huggingface.co/Qwen/Qwen3.5-2B) (Apache 2.0).

---
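
## Appendix: ChatML Prompt Layout

The Model Overview lists ChatML as the chat format. As a rough illustration of what `tokenizer.apply_chat_template` produces, the sketch below hand-renders a message list with the `<|im_start|>` / `<|im_end|>` delimiters. `to_chatml` is a hypothetical helper written for this example, not part of Transformers, and it assumes plain ChatML. The template shipped with the tokenizer may add further tokens (for example reasoning markers), so treat that template as authoritative and use this only to understand the layout.

```python
from typing import Dict, List

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def to_chatml(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Render role/content messages as a plain ChatML prompt string (illustrative only)."""
    # Each turn is: <|im_start|>{role}\n{content}<|im_end|>\n
    parts = [f"{IM_START}{m['role']}\n{m['content']}{IM_END}\n" for m in messages]
    if add_generation_prompt:
        # Open an assistant turn so the model continues from this point.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are Verus-r1, a reasoning coding assistant made by 8F-ai."},
    {"role": "user", "content": "Reverse a string in Python."},
]

prompt = to_chatml(messages)
print(prompt)
```

In practice, prefer `apply_chat_template` as shown in the Quickstart; a hand-rolled renderer like this is mainly useful for inspecting prompts or for debugging requests sent to a raw completion endpoint.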