calculator_model_test_with_steps

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0782

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (fused Torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
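As a rough sketch of what these hyperparameters imply (assuming no warmup, which the card does not list): the linear scheduler decays the learning rate from 1e-3 to 0 over the 240 total optimizer steps shown in the results table below, and 6 steps per epoch at batch size 512 bounds the training set at 3072 examples.

```python
BASE_LR = 1e-3       # learning_rate from the card
TOTAL_STEPS = 240    # 40 epochs x 6 steps/epoch, per the results table

def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer updates under a linear
    decay to zero with no warmup (an assumption; warmup is not listed)."""
    return BASE_LR * max(0.0, 1.0 - step / TOTAL_STEPS)

# 6 steps per epoch at train_batch_size=512 implies the training set
# holds at most 6 * 512 examples.
train_examples_upper_bound = 6 * 512
```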

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.3166        | 1.0   | 6    | 1.7843          |
| 1.6117        | 2.0   | 12   | 1.4283          |
| 1.3630        | 3.0   | 18   | 1.2932          |
| 1.2278        | 4.0   | 24   | 1.1252          |
| 1.0568        | 5.0   | 30   | 0.9816          |
| 0.9235        | 6.0   | 36   | 0.8970          |
| 0.8544        | 7.0   | 42   | 0.8140          |
| 0.7859        | 8.0   | 48   | 0.7258          |
| 0.7176        | 9.0   | 54   | 0.6732          |
| 0.6560        | 10.0  | 60   | 0.6337          |
| 0.6114        | 11.0  | 66   | 0.5447          |
| 0.5682        | 12.0  | 72   | 0.4996          |
| 0.5219        | 13.0  | 78   | 0.5244          |
| 0.5037        | 14.0  | 84   | 0.4369          |
| 0.4572        | 15.0  | 90   | 0.4172          |
| 0.4312        | 16.0  | 96   | 0.3730          |
| 0.3958        | 17.0  | 102  | 0.3966          |
| 0.4204        | 18.0  | 108  | 0.3818          |
| 0.3960        | 19.0  | 114  | 0.3489          |
| 0.3695        | 20.0  | 120  | 0.3186          |
| 0.3458        | 21.0  | 126  | 0.3016          |
| 0.3137        | 22.0  | 132  | 0.2860          |
| 0.3133        | 23.0  | 138  | 0.2519          |
| 0.2819        | 24.0  | 144  | 0.2399          |
| 0.2685        | 25.0  | 150  | 0.2222          |
| 0.2570        | 26.0  | 156  | 0.1930          |
| 0.2389        | 27.0  | 162  | 0.2000          |
| 0.2348        | 28.0  | 168  | 0.1874          |
| 0.2193        | 29.0  | 174  | 0.1636          |
| 0.2052        | 30.0  | 180  | 0.1454          |
| 0.2008        | 31.0  | 186  | 0.1367          |
| 0.1890        | 32.0  | 192  | 0.1380          |
| 0.1886        | 33.0  | 198  | 0.1192          |
| 0.1737        | 34.0  | 204  | 0.1071          |
| 0.1589        | 35.0  | 210  | 0.1030          |
| 0.1528        | 36.0  | 216  | 0.0939          |
| 0.1415        | 37.0  | 222  | 0.0864          |
| 0.1417        | 38.0  | 228  | 0.0842          |
| 0.1282        | 39.0  | 234  | 0.0804          |
| 0.1318        | 40.0  | 240  | 0.0782          |
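A quick sanity check on the table above (using a handful of its logged epochs): validation loss falls from 1.7843 at epoch 1 to 0.0782 at epoch 40, a reduction of roughly 96%, with no sign of overfitting at the end of training.

```python
# (epoch -> validation loss) pairs copied from the results table above.
val_loss = {1: 1.7843, 10: 0.6337, 20: 0.3186, 30: 0.1454, 40: 0.0782}

def improvement(first_epoch: int, last_epoch: int) -> float:
    """Relative reduction in validation loss between two logged epochs."""
    start, end = val_loss[first_epoch], val_loss[last_epoch]
    return (start - end) / start

overall = improvement(1, 40)  # ~0.956, i.e. about a 96% reduction
```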

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model details

  • Model size: 7.78M parameters
  • Tensor type: F32 (Safetensors)
  • Downloads last month: 64