Control LLM: Controlled Evolution for Intelligence Retention in LLM
Paper: arXiv 2501.10979
Serve the model with SGLang using Docker:

```shell
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
    --model-path "ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct" \
    --host 0.0.0.0 \
    --port 30000
```

Call the server using curl (OpenAI-compatible API):

```shell
curl -X POST "http://localhost:30000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```

This model is fine-tuned from Llama-3.1-8B-Instruct for mathematical tasks on the OpenMath2 dataset.
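The curl call above can also be wrapped in a small Python helper. Below is a minimal sketch using only the standard library; the helper names and default base URL are illustrative, and it assumes the SGLang server started above is running on localhost:30000.

```python
import json
import urllib.request

MODEL = "ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct"

def build_payload(prompt: str, max_tokens: int = 512, temperature: float = 0.5) -> dict:
    # Mirrors the JSON body of the curl example above.
    return {
        "model": MODEL,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

def complete(prompt: str, base_url: str = "http://localhost:30000") -> str:
    # POST to the OpenAI-compatible completions endpoint and return the text.
    req = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]

# Usage (requires a running server):
# print(complete("Once upon a time,"))
```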
It is associated with the Control-LLM paper and the Control-LLM GitHub repository.
An overview of the evaluation results and findings: the table below summarizes performance on mathematical tasks and on the original capabilities.
| Model | Math-Hard | Math | GSM8K | Math-Avg | ARC | GPQA | MMLU | MMLU-Pro | Orig-Avg | Overall |
|---|---|---|---|---|---|---|---|---|---|---|
| Llama3.1-8B-Inst | 23.7 | 50.9 | 85.6 | 52.1 | 83.4 | 29.9 | 72.4 | 46.7 | 60.5 | 56.3 |
| Control LLM* | 36.0 | 61.7 | 89.7 | 62.5 | 82.5 | 30.8 | 71.6 | 45.4 | 57.6 | 60.0 |
The following plot illustrates how Control LLM mitigates catastrophic forgetting during training.
The plot below highlights the alignment results of the model trained with Control LLM.
Base model: meta-llama/Llama-3.1-8B
Install from pip and serve the model:

```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct" \
    --host 0.0.0.0 \
    --port 30000
```

The server can then be called with the same curl command shown above.
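A successful call to the `/v1/completions` endpoint returns JSON in the OpenAI completions schema. A minimal sketch of extracting the generated text from such a response follows; the sample response is illustrative, not actual model output.

```python
import json

# Illustrative response in the OpenAI completions schema; not actual model output.
sample = json.loads("""
{
  "id": "cmpl-123",
  "object": "text_completion",
  "model": "ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct",
  "choices": [{"index": 0, "text": " there was a mathematician.", "finish_reason": "stop"}],
  "usage": {"prompt_tokens": 5, "completion_tokens": 6, "total_tokens": 11}
}
""")

def extract_text(response: dict) -> str:
    # choices[0].text holds the generated continuation of the prompt.
    return response["choices"][0]["text"]

print(extract_text(sample))
```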