Control LLM: Controlled Evolution for Intelligence Retention in LLM
Paper • 2501.10979 • Published • 6
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "ControlLLM/Llama3.1-8B-OpenMath16-Instruct" \
--host 0.0.0.0 \
--port 30000# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "ControlLLM/Llama3.1-8B-OpenMath16-Instruct",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'This is a fine-tuned model of Llama-3.1-8B-Instruct for mathematical tasks on OpenMath2 dataset, as described in the paper Control LLM: Controlled Evolution for Intelligence Retention in LLM.
This model is associated with the paper: Control-LLM.
This model is associated with the github: Control-LLM.
Here is an overview of the evaluation results and findings:
The following plot illustrates benchmark result and catastrophic forgetting mitigation on the OpenMath2 dataset.
The plot below highlights the alignment comparison of the model trained with Control LLM and Full Parameter Tuning.
The table below summarizes evaluation results across mathematical tasks and original capabilities.
| Model | MH | M | G8K | M-Avg | ARC | GPQA | MLU | MLUP | O-Avg | Overall |
|---|---|---|---|---|---|---|---|---|---|---|
| Llama3.1-8B-Inst | 23.7 | 50.9 | 85.6 | 52.1 | 83.4 | 29.9 | 72.4 | 46.7 | 60.5 | 56.3 |
| OpenMath2-Llama3 | 38.4 | 64.1 | 90.3 | 64.3 | 45.8 | 1.3 | 4.5 | 19.5 | 12.9 | 38.6 |
| Full Tune | 38.5 | 63.7 | 90.2 | 63.9 | 58.2 | 1.1 | 7.3 | 23.5 | 16.5 | 40.1 |
| Partial Tune | 36.4 | 61.4 | 89.0 | 61.8 | 66.2 | 6.0 | 25.7 | 30.9 | 29.3 | 45.6 |
| Stack Exp. | 35.6 | 61.0 | 90.8 | 61.8 | 69.3 | 18.8 | 61.8 | 43.1 | 53.3 | 57.6 |
| Hybrid Exp. | 34.4 | 61.1 | 90.1 | 61.5 | 81.8 | 25.9 | 67.2 | 43.9 | 57.1 | 59.3 |
| Control LLM* | 38.1 | 62.7 | 90.4 | 63.2 | 79.7 | 25.2 | 68.1 | 43.6 | 57.2 | 60.2 |
Base model
meta-llama/Llama-3.1-8B
Install from pip and serve model
# Install SGLang from pip: pip install sglang# Start the SGLang server: python3 -m sglang.launch_server \ --model-path "ControlLLM/Llama3.1-8B-OpenMath16-Instruct" \ --host 0.0.0.0 \ --port 30000# Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "ControlLLM/Llama3.1-8B-OpenMath16-Instruct", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'