calculator_model_test

This model is a fine-tuned version of an unspecified base model (the name is missing from this card) on an unspecified dataset (recorded as None). It achieves the following results on the evaluation set:

  • Loss: 0.6320

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
  • mixed_precision_training: Native AMP
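The linear scheduler listed above can be illustrated with a small, dependency-free sketch. It assumes zero warmup steps (the card lists none) and takes the total of 240 optimizer steps from the final row of the training results; the function name `linear_lr` is hypothetical and is not part of the training code.

```python
def linear_lr(step, base_lr=1e-3, total_steps=240, warmup_steps=0):
    """Learning rate after `step` optimizer updates with linear warmup and decay.

    Assumptions (not stated in the card): warmup_steps=0, and total_steps=240
    taken from the last row of the training results.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 60, 120, 180, 240):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate starts at 0.001, is halved by step 120 (epoch 20), and reaches zero at step 240, which is consistent with the shrinking loss improvements in the last few epochs.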

Training results

Training Loss   Epoch   Step   Validation Loss
3.6895          1.0     6      3.1719
2.7383          2.0     12     2.2173
1.9854          3.0     18     1.7814
1.6898          4.0     24     1.5993
1.5871          5.0     30     1.5507
1.5619          6.0     36     1.5297
1.5216          7.0     42     1.5010
1.4893          8.0     48     1.4371
1.4330          9.0     54     1.3921
1.3753          10.0    60     1.4316
1.3430          11.0    66     1.3675
1.3109          12.0    72     1.3689
1.3134          13.0    78     1.2672
1.2309          14.0    84     1.1653
1.1557          15.0    90     1.1145
1.1064          16.0    96     1.1759
1.1231          17.0    102    1.0402
1.0341          18.0    108    0.9800
0.9828          19.0    114    0.9416
0.9655          20.0    120    0.8970
0.9205          21.0    126    0.9106
0.9365          22.0    132    0.9576
0.9380          23.0    138    0.8321
0.8842          24.0    144    0.8368
0.8462          25.0    150    0.8256
0.8407          26.0    156    0.7950
0.8244          27.0    162    0.7829
0.7980          28.0    168    0.7511
0.7729          29.0    174    0.7392
0.7615          30.0    180    0.7141
0.7358          31.0    186    0.7183
0.7271          32.0    192    0.6933
0.7232          33.0    198    0.6689
0.7081          34.0    204    0.6842
0.7174          35.0    210    0.6742
0.7104          36.0    216    0.6678
0.7049          37.0    222    0.6516
0.6813          38.0    228    0.6380
0.6815          39.0    234    0.6345
0.6729          40.0    240    0.6320
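The results table can be summarized with a short script. The sketch below copies the validation losses from the table, finds the best epoch, and lists the epochs where validation loss rose relative to the previous epoch. It also bounds the training-set size from the step counts, under assumptions the card does not confirm: a single device, no gradient accumulation, and the last partial batch kept.

```python
# Validation loss per epoch (epochs 1..40), copied from the training results.
val_loss = [
    3.1719, 2.2173, 1.7814, 1.5993, 1.5507, 1.5297, 1.5010, 1.4371,
    1.3921, 1.4316, 1.3675, 1.3689, 1.2672, 1.1653, 1.1145, 1.1759,
    1.0402, 0.9800, 0.9416, 0.8970, 0.9106, 0.9576, 0.8321, 0.8368,
    0.8256, 0.7950, 0.7829, 0.7511, 0.7392, 0.7141, 0.7183, 0.6933,
    0.6689, 0.6842, 0.6742, 0.6678, 0.6516, 0.6380, 0.6345, 0.6320,
]

# Epoch (1-indexed) with the lowest validation loss.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1

# Epochs whose validation loss is higher than the previous epoch's.
regressions = [
    epoch
    for epoch in range(2, len(val_loss) + 1)
    if val_loss[epoch - 1] > val_loss[epoch - 2]
]

# 240 steps over 40 epochs gives 6 optimizer steps per epoch.
steps_per_epoch = 240 // 40
batch_size = 512  # train_batch_size from the hyperparameter list

# Assumed: single device, no gradient accumulation, last partial batch kept.
min_examples = (steps_per_epoch - 1) * batch_size + 1
max_examples = steps_per_epoch * batch_size

if __name__ == "__main__":
    print("best epoch:", best_epoch, "loss:", val_loss[best_epoch - 1])
    print("epochs with rising validation loss:", regressions)
    print("training examples between", min_examples, "and", max_examples)
```

The lowest validation loss is reached at the final epoch, so the table gives no sign that training had plateaued by epoch 40. The rises at individual epochs are small relative to the overall downward trend.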

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2

Model size

  • Parameters: 7.8M
  • Tensor type: F32
  • Weights format: Safetensors