Instructions to use N-Bot-Int/MiniMaid-L2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use N-Bot-Int/MiniMaid-L2 with PEFT:
    from peft import PeftModel
    from transformers import AutoModelForCausalLM

    base_model = AutoModelForCausalLM.from_pretrained("unsloth/llama-3.2-1b-instruct-unsloth-bnb-4bit")
    model = PeftModel.from_pretrained(base_model, "N-Bot-Int/MiniMaid-L2")
- Transformers
How to use N-Bot-Int/MiniMaid-L2 with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="N-Bot-Int/MiniMaid-L2")
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    pipe(messages)

    # Load model directly
    from transformers import AutoModel

    model = AutoModel.from_pretrained("N-Bot-Int/MiniMaid-L2", dtype="auto")
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use N-Bot-Int/MiniMaid-L2 with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "N-Bot-Int/MiniMaid-L2"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "N-Bot-Int/MiniMaid-L2",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker
docker model run hf.co/N-Bot-Int/MiniMaid-L2
- SGLang
How to use N-Bot-Int/MiniMaid-L2 with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "N-Bot-Int/MiniMaid-L2" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "N-Bot-Int/MiniMaid-L2",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "N-Bot-Int/MiniMaid-L2" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "N-Bot-Int/MiniMaid-L2",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
- Unsloth Studio
How to use N-Bot-Int/MiniMaid-L2 with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
    curl -fsSL https://unsloth.ai/install.sh | sh

    # Run unsloth studio
    unsloth studio -H 0.0.0.0 -p 8888

    # Then open http://localhost:8888 in your browser
    # Search for N-Bot-Int/MiniMaid-L2 to start chatting
Install Unsloth Studio (Windows)
    irm https://unsloth.ai/install.ps1 | iex

    # Run unsloth studio
    unsloth studio -H 0.0.0.0 -p 8888

    # Then open http://localhost:8888 in your browser
    # Search for N-Bot-Int/MiniMaid-L2 to start chatting
Using HuggingFace Spaces for Unsloth
    # No setup required
    # Open https://huggingface.co/spaces/unsloth/studio in your browser
    # Search for N-Bot-Int/MiniMaid-L2 to start chatting
Load model with FastModel
    # In a shell: install Unsloth
    pip install unsloth

    # In Python: load the model and tokenizer
    from unsloth import FastModel

    model, tokenizer = FastModel.from_pretrained(
        model_name="N-Bot-Int/MiniMaid-L2",
        max_seq_length=2048,
    )
- Docker Model Runner
How to use N-Bot-Int/MiniMaid-L2 with Docker Model Runner:
docker model run hf.co/N-Bot-Int/MiniMaid-L2
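The curl calls shown above for vLLM (port 8000) and SGLang (port 30000) can also be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `send_chat_request` are hypothetical and not part of any library, and the sketch assumes a server is already running locally and returns the usual OpenAI-style chat completion response.

```python
import json
import urllib.request


def build_chat_request(user_message,
                       base_url="http://localhost:8000",
                       model="N-Bot-Int/MiniMaid-L2"):
    """Build (but do not send) a POST request for the chat completions endpoint.

    Hypothetical helper: mirrors the JSON body used in the curl examples.
    Use base_url="http://localhost:30000" for the SGLang server.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the assistant's reply text.

    Assumes an OpenAI-compatible response: choices[0].message.content.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    # Requires a running vLLM server on port 8000.
    req = build_chat_request("What is the capital of France?")
    print(send_chat_request(req))
```

Building the request separately from sending it makes it easy to inspect the payload before a server is available.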
MiniMaid-L2
MiniMaid-L2 is a fine-tuned version of MiniMaid-L1, trained on a larger, higher-quality dataset to build its roleplaying capabilities. MiniMaid-L2 was also produced through knowledge distillation from a popular roleplaying model, NoroMaid-7B-DPO, which we used to shore up its weak points and make its roleplay more coherent.
MiniMaid-L2 outperforms its predecessor: it uses knowledge distillation to transfer knowledge from NoroMaid and is then fine-tuned, building on top of MiniMaid-L1 to produce a better model. It sacrifices a barely noticeable amount of token-generation speed in exchange for a strong model that is competitive with 3B alternatives!
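The card does not say which distillation recipe was used, so the following is only an orientation sketch of the classic soft-target formulation, where a student is trained to match a teacher's temperature-softened output distribution. The function names and the temperature value of 2.0 are assumptions for illustration, not details of the MiniMaid-L2 training run.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Illustrative only: one common form of knowledge distillation loss.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    kl = sum(
        p * math.log(p / q)
        for p, q in zip(teacher_probs, student_probs)
        if p > 0
    )
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]
    student = [2.0, 2.0, 0.5]
    print(distillation_loss(student, teacher))
```

The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge.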
MiniMaid-L1 Base-Model Card Procedure:
MiniMaid-L1 achieves good performance through a process of DPO combined with heavy fine-tuning. To prevent overfitting, we used high LR decay and introduced randomization techniques to keep the model from simply memorizing its training data. However, because training on Google Colab is difficult, the model might underperform or underfit on specific tasks, or overfit on knowledge it managed to latch on to. Please keep in mind that we did our best, and it will improve as we move onward!
MiniMaid-L2 is another instance of our smallest model yet! If you find any issues, please don't hesitate to email us at nexus.networkinteractives@gmail.com about any overfitting, or with suggestions for the future Model V3. Once again, feel free to modify the LoRA to your liking. However, please consider linking this page for credit, and if you expand its dataset, please handle it with care and ethical considerations.
MiniMaid-L2 is
- Developed by: N-Bot-Int
- License: apache-2.0
- Parent model: unsloth/llama-3.2-1b-instruct-unsloth-bnb-4bit
- Dataset Combined Using: Mosher-R1 (Proprietary Software)
MiniMaid-L1 Official Metric Score

Metrics made by ItsMeDevRoland, comparing:
- MiniMaid-L1 GGUF
- MiniMaid-L2 GGUF
All models were ranked with the same prompt, the same temperature, and the same hardware (Google Colab) to properly showcase the differences and strengths of the models.
See below for details!
🧵 MiniMaid-L2: Small Size, Big Bite — The Next-Gen Roleplay Assistant
She’s sharper, deeper, and more immersive. And this time? She doesn’t just hold her own — she wins.
MiniMaid-L2 builds on the scrappy L1 foundation and takes the lead over 3B giants like Hermes, Dolphin, and DeepSeek, with better consistency, longer outputs, and a massive boost to immersion.
- 💬 Roleplay Evaluation (v1)
- 🧠 Character Consistency: 0.84
- 🌊 Immersion: 0.47
- 🧮 Overall RP Score: 0.76
- ✏️ Length Score: 1.00
- L2 scored +0.25 higher overall than L1, while beating top-tier 3B models in every major RP metric.
📊 Efficient AND Smart
- Inference Time: 54.2s — still 3x faster than Hermes
- Tokens/sec: 6.88 — near-instant on consumer GPUs
- BLEU/ROUGE-L: Stronger n-gram overlap than any 3B rival
MiniMaid-L2 shows that distilled models can outperform much larger ones — when trained right, even 1B can be the boss.
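As a rough sanity check on the figures above, the reported inference time and throughput can be multiplied to estimate how many tokens were generated per run. This is a sketch that assumes both numbers describe the same benchmark run, which the card does not state explicitly; the function name is hypothetical.

```python
def estimate_generated_tokens(inference_time_s, tokens_per_second):
    """Estimate total generated tokens as time multiplied by throughput."""
    return inference_time_s * tokens_per_second


if __name__ == "__main__":
    # Values reported in the card: 54.2 s inference time, 6.88 tokens/sec.
    tokens = estimate_generated_tokens(54.2, 6.88)
    print(round(tokens))  # → 373
```

Under that assumption, each benchmark response is on the order of a few hundred tokens.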
- 🛠️ MiniMaid is Built For
- High-fidelity RP generation
- Lower-latency systems
- Custom, character-driven storytelling
🌱 L2 is the turning point — with upgraded conditioning, tighter personality anchoring, and narrative-aware outputs, she's evolving fast.
“MiniMaid-L2 doesn’t just punch above her weight — she’s taking belts. A tighter model, a stronger performer, and still tiny enough to run on a toaster. RP just got smarter.”
Notice
- For a good experience, please use temperature = 1.5, min_p = 0.1, and max_new_tokens = 128.
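To show what the recommended temperature and min_p values do to the next-token distribution, here is a minimal pure-Python sketch of temperature scaling followed by min-p filtering. It is an illustration under stated assumptions, not the implementation used by any particular inference library; the function names are hypothetical. max_new_tokens is not modeled here, since it only caps how many tokens are generated.

```python
import math
import random


def min_p_filter(logits, temperature=1.5, min_p=0.1):
    """Apply temperature, then drop tokens below min_p * (top probability)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep only tokens at least min_p times as likely as the best token.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]

    # Renormalize so the surviving probabilities sum to 1.
    norm = sum(kept)
    return [p / norm for p in kept]


def sample_token(logits, temperature=1.5, min_p=0.1, rng=random):
    """Sample one token index from the filtered distribution."""
    probs = min_p_filter(logits, temperature, min_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    example_logits = [5.0, 4.0, 0.0, -5.0]
    print(min_p_filter(example_logits))
    print(sample_token(example_logits))
```

A higher temperature flattens the distribution and adds variety, while min_p trims the unlikely tail so that the extra randomness does not produce incoherent tokens.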
Detail card:
Parameter
- 1 Billion Parameters
- (Please check with your GPU vendor whether your hardware can run 1B models)
Finetuning tool:
Unsloth AI
- This llama model was trained 2x faster with Unsloth and Huggingface's TRL library.

Fine-tuned Using:
Google Colab