# Hunyuan-PythonGOD-0.5B
Hunyuan-PythonGOD-0.5B is a Python-focused full fine-tune of tencent/Hunyuan-0.5B-Instruct, built for code generation, coding assistance, implementation tasks, and instruction-following for Python-heavy workflows.
This release is intended as a compact coding model that keeps the small footprint of the 0.5B Hunyuan base while shifting its behavior toward practical Python generation and code-oriented responses.
## Model Details

### Model Description

- Model name: gss1147/Hunyuan-PythonGOD-0.5B
- Base model: tencent/Hunyuan-0.5B-Instruct
- Architecture: causal decoder-only language model
- Model family tag: hunyuan_v1_dense
- Primary domain: Python coding / coding assistant
- Parameter count: ~0.5B
- Weights format: safetensors
- Tensor type in repo: F16
### Developed by

- Shared by: gss1147

### Finetuned from model

tencent/Hunyuan-0.5B-Instruct
## Intended Uses

### Direct Use
This model is intended for:
- Python function generation
- Python script writing
- debugging-oriented coding help
- implementation tasks
- code completion
- coding chat assistants
- lightweight local or cloud inference where a small coding model is preferred
### Downstream Use
Possible downstream uses include:
- code copilots
- coding bots
- Python tutoring helpers
- automation script generation
- benchmark experimentation for small code LLMs
### Out-of-Scope Use
This model is not designed for:
- safety-critical code deployment without human review
- medical, legal, or financial decision support
- secure production code without auditing
- autonomous execution pipelines without sandboxing
- guaranteed factual or bug-free code generation
## Training Details

### Training Objective
This model was trained as a full fine-tune, not as an adapter-only release.

Based on the reported training workflow and run logs, this release represents:
- full-parameter fine-tuning
- no LoRA
- no QLoRA
- no PEFT adapters in the final model
- standard exported Hugging Face model weights
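Because this is a full fine-tune, the exported checkpoint should contain only base-model weight names and no adapter tensors. A minimal sketch of how one might check a state dict for adapter weights (the key patterns below follow common PEFT/LoRA naming conventions and are assumptions, not taken from this repo):

```python
def has_adapter_weights(state_dict_keys):
    """Return True if any key looks like a PEFT/LoRA adapter tensor."""
    adapter_markers = ("lora_A", "lora_B", "adapter", "lora_embedding")
    return any(marker in key for key in state_dict_keys for marker in adapter_markers)

# A full fine-tune exports plain base-model weight names:
full_ft_keys = ["model.embed_tokens.weight", "model.layers.0.self_attn.q_proj.weight"]
# A LoRA checkpoint would instead carry adapter tensors:
lora_keys = ["base_model.model.layers.0.self_attn.q_proj.lora_A.weight"]

print(has_adapter_weights(full_ft_keys))  # False
print(has_adapter_weights(lora_keys))     # True
```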
### Training Data

This model was trained on the following datasets:

- WithinUsAI/Python_GOD_Coder_Omniforge_AI_12k
- WithinUsAI/Python_GOD_Coder_5k
- WithinUsAI/Legend_Python_CoderV.1

According to the training logs, the combined training corpus used:

- 11,760 rows from Python_GOD_Coder_Omniforge_AI_12k
- 5,000 rows from Python_GOD_Coder_5k
- 5,000 rows from Legend_Python_CoderV.1
Total rows: 21,760
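The per-dataset row counts above can be sanity-checked and turned into mixing proportions with a few lines of plain Python:

```python
# Row counts per dataset, as reported in the training logs.
row_counts = {
    "WithinUsAI/Python_GOD_Coder_Omniforge_AI_12k": 11_760,
    "WithinUsAI/Python_GOD_Coder_5k": 5_000,
    "WithinUsAI/Legend_Python_CoderV.1": 5_000,
}

total_rows = sum(row_counts.values())
print(total_rows)  # 21760

# Share of the corpus contributed by each dataset.
for name, n in row_counts.items():
    print(f"{name}: {n / total_rows:.1%}")
```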
### Training Procedure

According to the training setup, this model was trained with:
- dual-GPU Kaggle training
- DeepSpeed-assisted distributed training
- full model fine-tuning
- evaluation during training
- final-save upload flow to Hugging Face
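The actual DeepSpeed configuration is not published. Purely as an illustration, a minimal ZeRO stage-2 + fp16 config of the kind commonly paired with dual-GPU full fine-tuning might look like the following (this is NOT the config used for this run):

```python
import json

# Illustrative DeepSpeed config sketch, not the run's actual configuration.
deepspeed_config = {
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
    "zero_optimization": {
        "stage": 2,               # shard optimizer state + gradients across the 2 GPUs
        "overlap_comm": True,
    },
    "fp16": {"enabled": True},    # matches the F16 tensors shipped in this repo
}

print(json.dumps(deepspeed_config, indent=2))
```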
#### Sequence Length
- Practical fine-tuning sequence length: 4096 tokens
#### Context Window Note
If the base model family exposes larger context metadata in config fields, that should not be taken as proof that the full fine-tuning run itself was performed at that larger length. This release should be treated as fine-tuned at 4096 tokens unless revalidated separately.
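A practical consequence of the 4096-token fine-tuning length is that long prompts should be truncated or chunked before generation. A minimal sketch of left-truncating a tokenized prompt to fit the window (reserving room for new tokens is an assumption about how you might allocate the budget, not a requirement of this model):

```python
def truncate_for_generation(token_ids, context_len=4096, max_new_tokens=512):
    """Keep the most recent tokens so prompt + generation fits the window."""
    budget = context_len - max_new_tokens
    if len(token_ids) <= budget:
        return token_ids
    # Left-truncate: keep the tail, which usually carries the live instruction.
    return token_ids[-budget:]

ids = list(range(5000))  # stand-in for a long tokenized prompt
kept = truncate_for_generation(ids)
print(len(kept))  # 3584
```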
## Evaluation
Formal benchmark results are not finalized in this card.
Benchmark attempts were made on free public coding benchmarks such as:
- HumanEval+
- MBPP+
- BigCodeBench-style workflows
However, the evaluation harness encountered tool/runtime issues during some benchmark attempts, so this card does not yet claim official benchmark scores.
### Observed Training Behavior

According to the training run logs, the model showed:
- strong reduction in training loss over time
- strong reduction in eval loss over time
- stable continued learning well into the run
- increasingly code-specialized behavior relative to the base model
The recorded eval-loss progression included values around:
- ~0.2879 early in training
- ~0.1071
- ~0.0604
- ~0.0550
- ~0.0422
- ~0.0329
- ~0.0266
- ~0.0299
- ~0.0290
These are training/eval-run observations, not official public benchmark scores.
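The progression above can be summarized numerically; a small sketch computing the overall reduction and step-to-step deltas from the listed values:

```python
eval_losses = [0.2879, 0.1071, 0.0604, 0.0550, 0.0422, 0.0329, 0.0266, 0.0299, 0.0290]

overall_drop = 1 - eval_losses[-1] / eval_losses[0]
print(f"overall eval-loss reduction: {overall_drop:.1%}")  # ~89.9%

# Step-to-step deltas expose the late-run uptick (0.0266 -> 0.0299),
# a common sign that most of the learning happened earlier in the run.
deltas = [b - a for a, b in zip(eval_losses, eval_losses[1:])]
print(deltas[-2] > 0)  # True: eval loss ticked up near the end
```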
## How to Use

### Transformers
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "gss1147/Hunyuan-PythonGOD-0.5B"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = "Write a Python function that merges overlapping intervals."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=False,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
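Greedy decoding as above often returns prose around a fenced code block. A small post-processing sketch (the regex and fallback behavior are assumptions for illustration, not part of the model's API) for pulling the first Python block out of a generated response:

```python
import re

def extract_python_block(text):
    """Return the first fenced Python code block, or the raw text if none is found."""
    match = re.search(r"```(?:python)?\s*\n(.*?)```", text, re.DOTALL)
    return match.group(1).strip() if match else text.strip()

response = "Here is one approach:\n```python\ndef merge(intervals):\n    ...\n```\nHope that helps!"
print(extract_python_block(response))
```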