calculator_model_test

This model is a fine-tuned version of an unspecified base model, trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2400
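
Since the card does not name the base model, the task format, or where the checkpoint is hosted, the following is only a minimal loading sketch; the hub id, the seq2seq model class, and the "2+3"-style input format are all assumptions, not details from this card:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Hypothetical hub id; the card does not state where the checkpoint lives.
model_id = "your-username/calculator_model_test"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Assumption: a "calculator" model is sequence-to-sequence (e.g. "2+3" -> "5").
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("2+3", return_tensors="pt")  # assumed input format
outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```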

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 80
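
As a hedged illustration, these settings map onto transformers TrainingArguments roughly as follows; output_dir is a placeholder, the model/dataset wiring is omitted because the card does not specify it, and the batch sizes are taken as per-device values assuming single-device training:

```python
from transformers import TrainingArguments

# Sketch of the listed hyperparameters as TrainingArguments.
# Exact argument names may vary slightly across Transformers releases.
args = TrainingArguments(
    output_dir="calculator_model_test",  # placeholder
    learning_rate=1e-3,
    per_device_train_batch_size=512,     # assuming a single device
    per_device_eval_batch_size=512,
    seed=42,
    optim="adamw_torch_fused",           # fused AdamW
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=80,
)
```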

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 1.8223        | 1.0   | 6    | 2.6713          |
| 1.6974        | 2.0   | 12   | 1.2700          |
| 1.1555        | 3.0   | 18   | 1.0111          |
| 0.9334        | 4.0   | 24   | 0.9271          |
| 0.8302        | 5.0   | 30   | 0.8386          |
| 0.8144        | 6.0   | 36   | 0.6687          |
| 0.7410        | 7.0   | 42   | 0.8035          |
| 1.0508        | 8.0   | 48   | 0.7194          |
| 0.6932        | 9.0   | 54   | 0.6786          |
| 0.7005        | 10.0  | 60   | 0.6282          |
| 0.6896        | 11.0  | 66   | 0.7197          |
| 0.7646        | 12.0  | 72   | 1.0102          |
| 0.7867        | 13.0  | 78   | 0.7615          |
| 0.6609        | 14.0  | 84   | 0.5590          |
| 0.6228        | 15.0  | 90   | 0.5399          |
| 0.5934        | 16.0  | 96   | 0.6468          |
| 0.6700        | 17.0  | 102  | 0.9275          |
| 0.7554        | 18.0  | 108  | 0.5375          |
| 0.6135        | 19.0  | 114  | 0.4792          |
| 0.5655        | 20.0  | 120  | 0.5007          |
| 0.5590        | 21.0  | 126  | 0.4746          |
| 0.5327        | 22.0  | 132  | 0.5993          |
| 0.5752        | 23.0  | 138  | 0.4929          |
| 0.5441        | 24.0  | 144  | 0.5178          |
| 0.5788        | 25.0  | 150  | 0.6241          |
| 0.6247        | 26.0  | 156  | 0.4842          |
| 0.5505        | 27.0  | 162  | 0.4867          |
| 0.5455        | 28.0  | 168  | 0.4462          |
| 0.5289        | 29.0  | 174  | 0.5937          |
| 0.5987        | 30.0  | 180  | 0.6013          |
| 0.5828        | 31.0  | 186  | 0.5909          |
| 0.6133        | 32.0  | 192  | 0.4737          |
| 0.5151        | 33.0  | 198  | 0.5884          |
| 0.5772        | 34.0  | 204  | 0.4821          |
| 0.5179        | 35.0  | 210  | 0.4324          |
| 0.4850        | 36.0  | 216  | 0.4085          |
| 0.4696        | 37.0  | 222  | 0.4039          |
| 0.4619        | 38.0  | 228  | 0.5007          |
| 0.5363        | 39.0  | 234  | 0.5064          |
| 0.5657        | 40.0  | 240  | 0.4818          |
| 0.5086        | 41.0  | 246  | 0.4906          |
| 0.4999        | 42.0  | 252  | 0.5442          |
| 0.5849        | 43.0  | 258  | 0.3945          |
| 0.5278        | 44.0  | 264  | 0.4150          |
| 0.4555        | 45.0  | 270  | 0.3989          |
| 0.4777        | 46.0  | 276  | 0.4117          |
| 0.4824        | 47.0  | 282  | 0.3651          |
| 0.4818        | 48.0  | 288  | 0.3574          |
| 0.4731        | 49.0  | 294  | 0.3521          |
| 0.4505        | 50.0  | 300  | 0.3882          |
| 0.4570        | 51.0  | 306  | 0.3543          |
| 0.4322        | 52.0  | 312  | 0.3370          |
| 0.4381        | 53.0  | 318  | 0.3251          |
| 0.3960        | 54.0  | 324  | 0.3653          |
| 0.4062        | 55.0  | 330  | 0.3998          |
| 0.4386        | 56.0  | 336  | 0.3577          |
| 0.4498        | 57.0  | 342  | 0.3895          |
| 0.4408        | 58.0  | 348  | 0.3248          |
| 0.3978        | 59.0  | 354  | 0.3223          |
| 0.3909        | 60.0  | 360  | 0.3173          |
| 0.3818        | 61.0  | 366  | 0.2892          |
| 0.4036        | 62.0  | 372  | 0.2931          |
| 0.3909        | 63.0  | 378  | 0.2990          |
| 0.3724        | 64.0  | 384  | 0.2945          |
| 0.3845        | 65.0  | 390  | 0.3036          |
| 0.3685        | 66.0  | 396  | 0.3095          |
| 0.3648        | 67.0  | 402  | 0.3112          |
| 0.3925        | 68.0  | 408  | 0.2939          |
| 0.3693        | 69.0  | 414  | 0.2720          |
| 0.3661        | 70.0  | 420  | 0.2579          |
| 0.3563        | 71.0  | 426  | 0.2672          |
| 0.3550        | 72.0  | 432  | 0.2657          |
| 0.3509        | 73.0  | 438  | 0.2525          |
| 0.3293        | 74.0  | 444  | 0.2487          |
| 0.3554        | 75.0  | 450  | 0.2458          |
| 0.3365        | 76.0  | 456  | 0.2447          |
| 0.3331        | 77.0  | 462  | 0.2475          |
| 0.3538        | 78.0  | 468  | 0.2416          |
| 0.3264        | 79.0  | 474  | 0.2418          |
| 0.3456        | 80.0  | 480  | 0.2400          |

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
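
To compare a local environment against these versions, a small check sketch:

```python
import datasets
import tokenizers
import torch
import transformers

# Print installed versions to compare against the list above.
for name, mod in [
    ("Transformers", transformers),
    ("PyTorch", torch),
    ("Datasets", datasets),
    ("Tokenizers", tokenizers),
]:
    print(f"{name}: {mod.__version__}")
```
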
Model size: 7.8M parameters (tensor type F32, stored in Safetensors format)