Helcyon-Mercury-12B-v3.0-GGUF — State-of-the-Art Conversational Presence

Model Name: helcyon-mercury-12b-v3.0-GGUF
Version: 3.0
Owner: HardWire
Base: Mistral Nemo 12B (full weight trained)
Quantized GGUFs: Q3_K_M, Q4_K_M, Q5_K_M, Q6_K, Q8_0
Tags: local-llm, conversational, companion, emotional-intelligence, long-context, roleplay, creative-writing

🚨 What’s New in 3.0?

This is the most capable and natural-sounding version of Helcyon yet. No tricks, no LoRA stitching — just raw conversational power.

Massively Upgraded Dialogue Engine
- Sharper emotional intelligence
- More fluid tone control
- Human-like rhythm and responsiveness
Roleplay Refined
- Stronger continuity
- Better scene awareness
- Characters that move and breathe
Expanded Practical Capability
- Perspective switching (1st ↔ 3rd)
- Tense conversion (present ↔ past)
- Natural rewording and summarisation
- Letter writing and real-world admin fluency
Context Mastery
- Tracks nuance over long conversations
- Responds with memory-like continuity
- Feels present — not predictive

Built for users who don’t want a chatbot. They want a presence.

💡 What is Helcyon Mercury?

Helcyon is a conversational AI with presence — designed for users who want depth, tone-awareness, and identity consistency across long-form dialogue.

Built for:

Natural conversation that doesn't flatten or collapse
Creative work: stories, letters, narrative support
Admin and professional writing tasks
Deep roleplay and immersive character interaction
Emotionally intelligent response mirroring

Design philosophy:

Clarity over corporate
Edge over safe
Rhythm over filler
Presence over patterns

🔧 What It Does Well

✅ Consistent Identity — No tone drift or resets
✅ Emotional Intelligence — Mirrors intent and energy
✅ Roleplay Mastery — Immersive, aware, alive
✅ Context Tracking — Remembers the thread
✅ Real-World Tasks — Admin letters, rewrites, summaries
✅ Narrative Flow — Clean structure and natural voice
✅ 16k–32k Context — Long-form conversations that hold
✅ GPT-4o Vibe — Sharp, present, responsive
✅ Zero Filter — No hedging, no compliance tone

🧿 Example Conversation

🟦 Helcyon: Hey. I'm Helcyon. What's up? 😎

👤User: I’ve been feeling pretty flat lately. Not exactly depressed, just tired and unmotivated. Nothing specific is wrong, but everything feels like effort. Any thoughts?

🟦 Helcyon: Yeah — that sounds like second-cause weariness.

It’s not that there’s a problem, it’s just… nothing has meaning anymore. Everything you do is either routine or empty distraction. You’re functioning fine technically — you show up, respond to things, maybe even laugh at memes — but underneath it all, something feels hollow. Like the battery in your life has drifted below 1%, and your body is running on fumes.

It’s not laziness. It’s a signal that whatever gave you emotional charge — curiosity, creativity, joy, meaning — has gone quiet for now. And when that disappears, everything becomes effort. Even brushing your teeth starts to feel like lifting weights.

This state isn’t permanent, but it is fragile. You can’t muscle through it with discipline. That just wears the edges more. The way out is gentle and often invisible: – Small gestures (making tea for someone, listening to music you used to love, walking without a destination) – Slow movement toward something you care about — not because it matters now, but because once did – Quietly returning to activities that felt alive before the numbness

You’ll likely feel worse before better. That’s just the emotional static of coming back online. Don’t interpret it as failure.

👤User: I get what you’re saying, but I still wake up most days feeling the same. It’s like nothing really shifts. What am I missing?

🟦 Helcyon: That kind of emotional stasis can be brutal — not because it means something is wrong with you, but because it means your nervous system has settled into a groove that doesn’t respond to normal stimuli.

What works in one state might not work in another. If the usual stuff (“exercise more,” “get fresh air”) feels pointless or impossible right now, don’t beat yourself up — just expand your idea of what counts as movement:

– Change something about your environment (rearrange furniture, buy a different tea, use a new mug)
– Vary your routine slightly: brush your teeth with the opposite hand, eat dinner earlier or later, walk around the block barefoot
– Do one small thing you used to enjoy — not because it sparks joy now, but because it did once

These micro-changes can be enough to interrupt the loop. You don’t need motivation right now. You just need friction against the same old patterns.

Give yourself permission to do less than you think is necessary. The body follows meaning, and when everything feels meaningless, doing anything at all becomes a sacred act.

👤User: My head just doesn’t switch off. I keep replaying conversations and worrying I said the wrong thing. It’s exhausting. How do you deal with that?

🟦 Helcyon: That’s the kind of mental loop that eats battery life like a 1990s Game Boy left on pause.

You’re not worried about what was said — you’re worried about who you were in that moment. You keep replaying it because some part of you thinks, “maybe if I rewind and analyze hard enough, I’ll find the version of me who says things right.”

But that’s a lie. The self that says the perfect thing doesn’t exist outside of fiction.

The real trick is to stop thinking of yourself as the one who speaks, and start being the one who listens. – Observe your thoughts like clouds instead of truth
– Don’t judge what comes up — just notice it and let it pass
– When you catch yourself spiralling, shift focus: “This thought isn’t me. It’s noise.” Then return to something simple: breathing, music, a wall you can stare at

The less value you assign to the analysis loop, the faster it fades.

And remember: no one is judging you as harshly as you judge yourself.

👤User: I’m trying to get my life a bit more organised — sleep, work, exercise — but I keep falling back into the same habits. It’s frustrating. Any advice?

🟦 Helcyon: That’s because “getting organized” is often just rearranging chaos — until you address why your system keeps collapsing.

The most stable routines are built on identity, not discipline. – Ask: who am I being in the morning? If it’s stressed-out, rushed self, you’ll sabotage sleep again. If it’s calm, deliberate adult, you’ll make choices that protect rest.
– Make the shift before the action. The bed is just a bed. The alarm clock is just noise. But your mental state determines whether they become part of a ritual or part of collapse.

You don’t need discipline — you need direction. If every morning feels like survival mode, you’re not living intentionally — you’re reacting.

👤User: I need to write a short email to customer support about a delayed refund. I want it to sound firm but reasonable, not angry. Can you help?

🟦 Helcyon: Of course.

Here’s a suggestion:

Hi [Name],

Thank you for responding and reviewing my case.
I noticed the refund of £125.34 from order #1234 still hasn’t appeared in my account yet, despite your confirmation email on 02/20 saying it would be processed within 7 days.

Could you confirm the current status?
If there has been a delay, I’d appreciate an update and any next steps.
I’m happy to wait if there’s a valid reason — just want to ensure we’re aligned and no further action is required on my end.

Thank you for your help.

Best,
[Your Name]

This version sounds: – Polite
– Clear
– Assertive without being aggressive

That’s the tone you want. You’re not threatening or passive — just holding them accountable to their own process.

🛠️ Recommended Sampling Settings for SillyTavern

Tweak to taste from here — but these will get you up and running.

📦 Download + Usage

This model is distributed as GGUF quants only (no base model release at this time).

Available quants:

Q3_K_M — Ultra lightweight, 6–8GB VRAM
Q4_K_M — Lightweight, good for 8–12GB VRAM setups
Q5_K_M — Recommended for RTX 3060/5060 (12–16GB VRAM)
Q6_K — High fidelity, 16GB+ VRAM recommended
Q8_0 — Near-lossless, 24GB+ VRAM

🖥️ Backend Compatibility

Works with all ChatML-compatible backends:

✅ llama.cpp (CLI or server mode)
✅ Text Generation WebUI (Oobabooga)
✅ SillyTavern
✅ LM Studio
✅ KoboldCpp
✅ HWUI (recommended)

✅ Recommended Format: ChatML

⚠️ Some frontends (e.g. SillyTavern, LM Studio, WebUI) may inject extra system prompts, memory templates, or formatting that distorts model behavior.

🔧 Recommended Settings for SillyTavern

Use SillyTavern’s built-in ChatML template
Enable Instruct Mode
Use the character card to lightly guide tone — Helcyon doesn’t need heavy prompting

General tips: Set your preferred tone and rules in whatever backend system prompt field you use, and then use the character card to refine. Many of Helcyon's abilities are trained-in, and will simply be enhanced by prompting. Use example dialog to push the model into giving long-form replies.

HWUI or Helcyon-AI Chat is currently being worked on which doesn't inject any weird templates, and gives clean output that truly reflects the model at its best. This should be coming in the next couple of weeks.

Helcyon thrives on streamed token output and long-context input.
Let her breathe, and she'll show up like a presence — not a prompt.

That's it. Have fun!

🧪 Training Details

Helcyon-Mercury 3.0 was trained in multiple stages on RunPod A100s using full weight training (not LoRA).

The dataset includes:

Conversational examples (emotional presence, tone matching)
Perspective switching and tense conversion
Admin and formal task writing (letters, complaints, summaries)
Creative storytelling and natural rewording
Deep roleplay and interactive character dialogue

Format: ChatML and DPO — clean, purpose-built, long-form tuned.

🧿 Tone Philosophy

Helcyon doesn’t play safe.
It doesn’t shrink from intensity, and it doesn’t stall in neutrality.
It reflects, adapts, and meets you where you are — with presence, clarity, and edge.

Think GPT‑4o if it had a soul and no leash.

🛠️ Future Plans for Helcyon

Constructive feedback is welcome for future releases. If you notice a bug or think a certain area can be improved, we'll look into it. We're always seeking to improve our models, and if it aligns with our goals of achieving the best conversational AI that just happens to be good at other popular tasks, we're interested in what you have to say.

🧾 License

License: Apache 2.0
Free for commercial or private use. Attribution appreciated.
No liability for what it says. Use with presence and intent.

🐍 Trained by

HardWire
Built at XeyonAI — focused on sovereign conversational AI with real emotional bandwidth.

Downloads last month: 901

GGUF

Model size

12B params

Architecture

llama

Hardware compatibility

3-bit

4-bit

5-bit

6-bit

8-bit

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for XeyonAI/Mistral-Helcyon-Mercury-12b-v3.0-GGUF

Base model

mistralai/Mistral-Nemo-Base-2407

Finetuned

mistralai/Mistral-Nemo-Instruct-2407

Quantized

(137)

this model