calculator_model_test

This model is a fine-tuned version of a base model that this card does not name, trained on a dataset that this card does not name (recorded as None). It achieves the following results on the evaluation set:

Loss: 0.2400

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 512
eval_batch_size: 512
seed: 42
optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 80
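
As a rough illustration of what lr_scheduler_type: linear implies for this run, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (the card lists no warmup setting) and takes the total of 480 steps from the training results table (80 epochs at 6 steps each). The function name linear_lr and its parameters are hypothetical, not part of any library.

```python
def linear_lr(step: int,
              base_lr: float = 1e-3,
              total_steps: int = 480,
              warmup_steps: int = 0) -> float:
    """Learning rate after `step` optimizer updates: linear warmup, then linear decay to 0.

    warmup_steps=0 is an assumption; the model card does not report a warmup value.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the scheduled learning rate at a few epoch boundaries (6 steps per epoch).
for epoch in (0, 20, 40, 60, 80):
    print(f"epoch {epoch:2d}: lr = {linear_lr(epoch * 6):.6f}")
```

Under these assumptions the rate falls from 0.001 at the start to half that value at epoch 40 and reaches zero at the final step.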

Training results

Training Loss	Epoch	Step	Validation Loss
1.8223	1.0	6	2.6713
1.6974	2.0	12	1.2700
1.1555	3.0	18	1.0111
0.9334	4.0	24	0.9271
0.8302	5.0	30	0.8386
0.8144	6.0	36	0.6687
0.7410	7.0	42	0.8035
1.0508	8.0	48	0.7194
0.6932	9.0	54	0.6786
0.7005	10.0	60	0.6282
0.6896	11.0	66	0.7197
0.7646	12.0	72	1.0102
0.7867	13.0	78	0.7615
0.6609	14.0	84	0.5590
0.6228	15.0	90	0.5399
0.5934	16.0	96	0.6468
0.6700	17.0	102	0.9275
0.7554	18.0	108	0.5375
0.6135	19.0	114	0.4792
0.5655	20.0	120	0.5007
0.5590	21.0	126	0.4746
0.5327	22.0	132	0.5993
0.5752	23.0	138	0.4929
0.5441	24.0	144	0.5178
0.5788	25.0	150	0.6241
0.6247	26.0	156	0.4842
0.5505	27.0	162	0.4867
0.5455	28.0	168	0.4462
0.5289	29.0	174	0.5937
0.5987	30.0	180	0.6013
0.5828	31.0	186	0.5909
0.6133	32.0	192	0.4737
0.5151	33.0	198	0.5884
0.5772	34.0	204	0.4821
0.5179	35.0	210	0.4324
0.4850	36.0	216	0.4085
0.4696	37.0	222	0.4039
0.4619	38.0	228	0.5007
0.5363	39.0	234	0.5064
0.5657	40.0	240	0.4818
0.5086	41.0	246	0.4906
0.4999	42.0	252	0.5442
0.5849	43.0	258	0.3945
0.5278	44.0	264	0.4150
0.4555	45.0	270	0.3989
0.4777	46.0	276	0.4117
0.4824	47.0	282	0.3651
0.4818	48.0	288	0.3574
0.4731	49.0	294	0.3521
0.4505	50.0	300	0.3882
0.4570	51.0	306	0.3543
0.4322	52.0	312	0.3370
0.4381	53.0	318	0.3251
0.3960	54.0	324	0.3653
0.4062	55.0	330	0.3998
0.4386	56.0	336	0.3577
0.4498	57.0	342	0.3895
0.4408	58.0	348	0.3248
0.3978	59.0	354	0.3223
0.3909	60.0	360	0.3173
0.3818	61.0	366	0.2892
0.4036	62.0	372	0.2931
0.3909	63.0	378	0.2990
0.3724	64.0	384	0.2945
0.3845	65.0	390	0.3036
0.3685	66.0	396	0.3095
0.3648	67.0	402	0.3112
0.3925	68.0	408	0.2939
0.3693	69.0	414	0.2720
0.3661	70.0	420	0.2579
0.3563	71.0	426	0.2672
0.3550	72.0	432	0.2657
0.3509	73.0	438	0.2525
0.3293	74.0	444	0.2487
0.3554	75.0	450	0.2458
0.3365	76.0	456	0.2447
0.3331	77.0	462	0.2475
0.3538	78.0	468	0.2416
0.3264	79.0	474	0.2418
0.3456	80.0	480	0.2400
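
The table shows 6 optimizer steps per epoch at a train batch size of 512, which bounds the size of the training set. The sketch below works out that range, assuming a single device, no gradient accumulation, and a data loader that keeps the final partial batch; none of these are stated in the card. The function name train_set_size_bounds is hypothetical.

```python
def train_set_size_bounds(steps_per_epoch: int, batch_size: int) -> tuple[int, int]:
    """Smallest and largest training-set sizes that yield `steps_per_epoch` batches.

    Assumes one device, gradient_accumulation_steps=1, and that the last
    partial batch is kept (not dropped). These are assumptions, not card facts.
    """
    smallest = (steps_per_epoch - 1) * batch_size + 1  # last batch holds one example
    largest = steps_per_epoch * batch_size             # last batch is full
    return smallest, largest


low, high = train_set_size_bounds(steps_per_epoch=6, batch_size=512)
print(f"training set size is between {low} and {high} examples")
```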

Framework versions

Transformers 5.0.0
PyTorch 2.10.0+cu128
Datasets 4.0.0
Tokenizers 0.22.2

Safetensors

Model size: 7.8M params
Tensor type: F32
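
For a rough sense of scale, the sketch below estimates the storage needed for the weights alone from the reported parameter count and tensor type. The count of 7.8M is rounded, so the result is approximate; the helper weight_bytes and the BYTES_PER_PARAM table are hypothetical names used only for this illustration.

```python
# Bytes used by one parameter for a few common tensor types.
BYTES_PER_PARAM = {"F32": 4, "F16": 2, "BF16": 2}


def weight_bytes(num_params: int, tensor_type: str = "F32") -> int:
    """Approximate size of the raw weights, ignoring file headers and metadata."""
    return num_params * BYTES_PER_PARAM[tensor_type]


size = weight_bytes(7_800_000, "F32")  # 7.8M is the rounded count shown on the card
print(f"about {size / 1e6:.1f} MB of weights")
```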