calculator_model_test

This model is a fine-tuned version of an unspecified base model, trained on an unspecified dataset (the card records neither). It achieves the following results on the evaluation set:

  • Loss: 0.1301

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
  • mixed_precision_training: Native AMP
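
To make the linear schedule concrete, the sketch below computes the learning rate after a given number of optimizer steps. It assumes no warmup and a decay to zero at the final step, since the card lists no warmup setting, and it takes the total of 160 steps from the training results. The function name `linear_lr` and its `warmup_steps` parameter are hypothetical and are not part of any library.

```python
def linear_lr(step: int, base_lr: float = 0.001, total_steps: int = 160,
              warmup_steps: int = 0) -> float:
    """Learning rate after `step` optimizer steps under linear warmup and decay.

    Assumption: warmup_steps=0 and decay to zero at total_steps; the card
    does not state either value explicitly.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * (step / max(1, warmup_steps))
    # Linear decay from base_lr down to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


# Start, midpoint, and end of the 160-step run.
for step in (0, 80, 160):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at the configured 0.001, is halved at step 80, and reaches zero at step 160.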

Training results

Training Loss  Epoch  Step  Validation Loss
3.1429         1.0    4     2.2337
2.0240         2.0    8     1.7621
1.6595         3.0    12    1.4806
1.3784         4.0    16    1.1427
1.0363         5.0    20    0.8139
0.7381         6.0    24    0.6046
0.5807         7.0    28    0.5287
0.5117         8.0    32    0.4774
0.4641         9.0    36    0.4449
0.4274         10.0   40    0.4155
0.3970         11.0   44    0.3787
0.3664         12.0   48    0.3443
0.3391         13.0   52    0.3224
0.3196         14.0   56    0.3062
0.3033         15.0   60    0.2950
0.2938         16.0   64    0.2804
0.2757         17.0   68    0.2682
0.2641         18.0   72    0.2580
0.2551         19.0   76    0.2474
0.2441         20.0   80    0.2430
0.2358         21.0   84    0.2323
0.2284         22.0   88    0.2258
0.2149         23.0   92    0.2129
0.2098         24.0   96    0.2078
0.2005         25.0   100   0.1975
0.1898         26.0   104   0.1901
0.1845         27.0   108   0.1790
0.1771         28.0   112   0.1746
0.1708         29.0   116   0.1668
0.1657         30.0   120   0.1610
0.1600         31.0   124   0.1581
0.1559         32.0   128   0.1510
0.1498         33.0   132   0.1475
0.1451         34.0   136   0.1432
0.1426         35.0   140   0.1417
0.1373         36.0   144   0.1365
0.1334         37.0   148   0.1345
0.1335         38.0   152   0.1318
0.1309         39.0   156   0.1303
0.1308         40.0   160   0.1301
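
The step counts in the table also bound the size of the training set. The sketch below works this out, assuming a single device, no gradient accumulation, and a final partial batch that is kept rather than dropped. The card states none of these, and `train_size_bounds` is a hypothetical helper.

```python
def train_size_bounds(total_steps: int, num_epochs: int,
                      batch_size: int) -> tuple[int, int]:
    """Smallest and largest training-set sizes consistent with the step count.

    Assumptions (not stated in the card): one device, no gradient
    accumulation, and the last partial batch of each epoch is kept.
    """
    steps_per_epoch = total_steps // num_epochs
    # ceil(n / batch_size) == steps_per_epoch holds exactly when
    # (steps_per_epoch - 1) * batch_size < n <= steps_per_epoch * batch_size.
    smallest = (steps_per_epoch - 1) * batch_size + 1
    largest = steps_per_epoch * batch_size
    return smallest, largest


low, high = train_size_bounds(total_steps=160, num_epochs=40, batch_size=512)
print(low, high)  # 1537 2048
```

With 160 steps over 40 epochs there are 4 steps per epoch, so under these assumptions the training set holds between 1,537 and 2,048 examples.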

Framework versions

  • Transformers 5.0.0
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2

Model size: 7.8M params (Safetensors, tensor type F32)