Paper: Orca-Math: Unlocking the potential of SLMs in Grade School Math (arXiv:2402.14830)
```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("kuotient/EEVE-Instruct-Math-10.8B")
model = AutoModelForCausalLM.from_pretrained("kuotient/EEVE-Instruct-Math-10.8B")

messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```

This model is part of the EEVE-Math project. It was produced by merging EEVE-Math and EEVE-Instruct with DARE-TIES. The project is a proof of concept showing that such a merge can retain most of the specialized EEVE-Math model's performance while keeping the usability of the Instruct model.
| Model | gsm8k-ko (pass@1) |
|---|---|
| EEVE(Base) | 0.4049 |
| EEVE-Math (epoch 1) | 0.508 |
| EEVE-Math (epoch 2) | 0.539 |
| EEVE-Instruct | 0.4511 |
| EEVE-Instruct + Math | 0.4845 |
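A quick back-of-the-envelope reading of the table above (my own arithmetic on the listed pass@1 numbers, not a reported metric):

```python
# gsm8k-ko pass@1 scores from the table above.
base, instruct = 0.4049, 0.4511
math_ft, merged = 0.5390, 0.4845  # EEVE-Math (epoch 2) and EEVE-Instruct + Math

math_gain = math_ft - instruct    # gain of the math fine-tune over Instruct
kept_gain = merged - instruct     # gain that survives the merge
print(f"retained math gain: {kept_gain / math_gain:.0%}")  # prints: retained math gain: 38%
```

So the merged model keeps roughly 38% of the math fine-tune's gsm8k-ko gain over Instruct while remaining usable as a chat model.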
This model was merged using the DARE-TIES merge method, with yanolja/EEVE-Korean-10.8B-v1.0 as the base (see the configuration below).
The following models were included in the merge:
- yanolja/EEVE-Korean-Instruct-10.8B-v1.0
- kuotient/EEVE-Math-10.8B
The following YAML configuration was used to produce this model:
```yaml
models:
  - model: yanolja/EEVE-Korean-10.8B-v1.0
    # no parameters necessary for base model
  - model: yanolja/EEVE-Korean-Instruct-10.8B-v1.0
    parameters:
      density: 0.53
      weight: 0.6
  - model: kuotient/EEVE-Math-10.8B
    parameters:
      density: 0.53
      weight: 0.4
merge_method: dare_ties
base_model: yanolja/EEVE-Korean-10.8B-v1.0
parameters:
  int8_mask: true
dtype: bfloat16
```
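For intuition, `dare_ties` forms each fine-tune's task vector (its delta from the base), randomly drops a fraction `1 - density` of each delta's entries and rescales the survivors (DARE), then keeps only entries whose sign agrees with the elected majority sign before adding the weighted sum back to the base (TIES). The toy NumPy sketch below illustrates that flow on plain arrays; it is a simplification written for illustration, not mergekit's actual implementation (which differs in per-tensor details and normalization):

```python
import numpy as np

def dare(delta, density, rng):
    """DARE: randomly drop (1 - density) of a task vector, rescale the rest."""
    keep = rng.random(delta.shape) < density
    return np.where(keep, delta / density, 0.0)  # rescaling preserves the expected value

def dare_ties_merge(base, finetuned, weights, density, rng):
    """Toy DARE-TIES: sparsify deltas, elect signs, add the weighted result to base."""
    deltas = [dare(ft - base, density, rng) for ft in finetuned]
    weighted = np.stack([w * d for w, d in zip(weights, deltas)])
    elected = np.sign(weighted.sum(axis=0))   # majority sign per parameter
    agree = np.sign(weighted) == elected      # drop sign-conflicting entries
    return base + np.where(agree, weighted, 0.0).sum(axis=0)

rng = np.random.default_rng(0)
base = rng.normal(size=8)                    # stand-in for one weight tensor
instruct = base + 0.1 * rng.normal(size=8)   # "Instruct" fine-tune
math = base + 0.1 * rng.normal(size=8)       # "Math" fine-tune

# Mirrors the config above: density 0.53, weights 0.6 / 0.4.
merged = dare_ties_merge(base, [instruct, math], [0.6, 0.4], 0.53, rng)
```

With `density: 1.0` and a single model at weight 1.0 the merge reduces to the fine-tuned weights themselves, which is a handy sanity check on the sketch.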
The model was evaluated on gsm8k-ko and kobest using the following fork of lm-evaluation-harness:
```shell
git clone https://github.com/kuotient/lm-evaluation-harness.git
cd lm-evaluation-harness
pip install -e .

lm_eval --model hf \
    --model_args pretrained=yanolja/EEVE-Korean-Instruct-10.8B-v1.0 \
    --tasks gsm8k-ko \
    --device cuda:0 \
    --batch_size auto:4
```
| Model | gsm8k-ko (pass@1) | boolq (acc) | copa (acc) | hellaswag (acc) | Overall |
|---|---|---|---|---|---|
| yanolja/EEVE-Korean-10.8B-v1.0 | 0.4049 | - | - | - | - |
| yanolja/EEVE-Korean-Instruct-10.8B-v1.0 | 0.4511 | 0.8668 | 0.7450 | 0.4940 | 0.6392 |
| EEVE-Math-10.8B | 0.5390 | 0.8027 | 0.7260 | 0.4760 | 0.6359 |
| EEVE-Instruct-Math-10.8B | 0.4845 | 0.8519 | 0.7410 | 0.4980 | 0.6439 |
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="kuotient/EEVE-Instruct-Math-10.8B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```