calculator_model_test

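This model is a fine-tuned version of an unspecified base model; the training dataset is also unspecified (the card records it as None). On the evaluation set it reaches a validation loss of 0.1301 after the final epoch (epoch 40, step 160 in the training results table).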

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 512
eval_batch_size: 512
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 40
mixed_precision_training: Native AMP
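
The linear scheduler listed above can be illustrated with a small sketch. It assumes zero warmup steps (the card does not list any) and takes the 160 total steps from the training results table; the function name linear_lr and its parameters are hypothetical and not part of any library.

```python
def linear_lr(step, base_lr=1e-3, total_steps=160, warmup_steps=0):
    """Learning rate under a linear warmup-then-decay schedule.

    Assumptions (not stated in the card): warmup_steps=0, and the
    rate decays linearly to zero at total_steps.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the rate at the start, midpoint, and end of training.
    for step in (0, 80, 160):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 0.001, is about half that at step 80 (epoch 20), and reaches zero at step 160, which is consistent with the loss curve flattening in the last few epochs.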

Training Loss	Epoch	Step	Validation Loss
3.1429	1.0	4	2.2337
2.0240	2.0	8	1.7621
1.6595	3.0	12	1.4806
1.3784	4.0	16	1.1427
1.0363	5.0	20	0.8139
0.7381	6.0	24	0.6046
0.5807	7.0	28	0.5287
0.5117	8.0	32	0.4774
0.4641	9.0	36	0.4449
0.4274	10.0	40	0.4155
0.3970	11.0	44	0.3787
0.3664	12.0	48	0.3443
0.3391	13.0	52	0.3224
0.3196	14.0	56	0.3062
0.3033	15.0	60	0.2950
0.2938	16.0	64	0.2804
0.2757	17.0	68	0.2682
0.2641	18.0	72	0.2580
0.2551	19.0	76	0.2474
0.2441	20.0	80	0.2430
0.2358	21.0	84	0.2323
0.2284	22.0	88	0.2258
0.2149	23.0	92	0.2129
0.2098	24.0	96	0.2078
0.2005	25.0	100	0.1975
0.1898	26.0	104	0.1901
0.1845	27.0	108	0.1790
0.1771	28.0	112	0.1746
0.1708	29.0	116	0.1668
0.1657	30.0	120	0.1610
0.1600	31.0	124	0.1581
0.1559	32.0	128	0.1510
0.1498	33.0	132	0.1475
0.1451	34.0	136	0.1432
0.1426	35.0	140	0.1417
0.1373	36.0	144	0.1365
0.1334	37.0	148	0.1345
0.1335	38.0	152	0.1318
0.1309	39.0	156	0.1303
0.1308	40.0	160	0.1301
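
The table shows 4 optimizer steps per epoch at a train batch size of 512, which bounds the size of the (unspecified) training set. The sketch below works out that bound under assumptions the card does not confirm: a single device, no gradient accumulation, and a final partial batch that is kept rather than dropped. The helper name train_size_bounds is hypothetical.

```python
import math


def train_size_bounds(steps_per_epoch, batch_size):
    """Smallest and largest dataset sizes N with ceil(N / batch_size) == steps_per_epoch.

    Assumes one device, no gradient accumulation, and that the last
    partial batch is kept (none of this is stated in the card).
    """
    smallest = (steps_per_epoch - 1) * batch_size + 1
    largest = steps_per_epoch * batch_size
    return smallest, largest


if __name__ == "__main__":
    lo, hi = train_size_bounds(steps_per_epoch=4, batch_size=512)
    # Both ends of the range must round up to exactly 4 steps per epoch.
    assert math.ceil(lo / 512) == 4 and math.ceil(hi / 512) == 4
    print(lo, hi)
```

If those assumptions hold, the training set has between 1537 and 2048 examples; a different device count or accumulation setting would change the range.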

Model size: 7.8M params (Safetensors)
Tensor type: F32