calculator_model_test

This model is a fine-tuned version of an unspecified base model, trained on an unspecified dataset. On the evaluation set it reaches a final validation loss of 0.6320 after 40 epochs.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 512
eval_batch_size: 512
seed: 42
optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 40
mixed_precision_training: Native AMP
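
As a rough illustration, the hyperparameters listed above could be written as keyword arguments for `transformers.TrainingArguments`. This is a sketch, not the original training script. The mapping of `train_batch_size` to a per-device value, the choice of `fp16` for "Native AMP" (it could equally have been `bf16`), and the helper name `build_training_arguments` are all assumptions.

```python
# Hypothetical reconstruction of the configuration listed above.
# Keys follow transformers.TrainingArguments parameter names.
training_kwargs = {
    "learning_rate": 1e-3,
    # Assumption: a single device, so the reported batch size is per device.
    "per_device_train_batch_size": 512,
    "per_device_eval_batch_size": 512,
    "seed": 42,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 40,
    # Assumption: "Native AMP" is taken to mean fp16; the card does not say.
    "fp16": True,
}


def build_training_arguments(output_dir):
    """Hypothetical helper: build TrainingArguments from the dict above.

    The import is done lazily so the dict can be inspected without
    having transformers installed.
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **training_kwargs)
```

Calling `build_training_arguments("some_output_dir")` needs `transformers` and PyTorch installed, and `fp16=True` generally needs a GPU. The output directory name is a placeholder.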

Training Loss	Epoch	Step	Validation Loss
3.6895	1.0	6	3.1719
2.7383	2.0	12	2.2173
1.9854	3.0	18	1.7814
1.6898	4.0	24	1.5993
1.5871	5.0	30	1.5507
1.5619	6.0	36	1.5297
1.5216	7.0	42	1.5010
1.4893	8.0	48	1.4371
1.4330	9.0	54	1.3921
1.3753	10.0	60	1.4316
1.3430	11.0	66	1.3675
1.3109	12.0	72	1.3689
1.3134	13.0	78	1.2672
1.2309	14.0	84	1.1653
1.1557	15.0	90	1.1145
1.1064	16.0	96	1.1759
1.1231	17.0	102	1.0402
1.0341	18.0	108	0.9800
0.9828	19.0	114	0.9416
0.9655	20.0	120	0.8970
0.9205	21.0	126	0.9106
0.9365	22.0	132	0.9576
0.9380	23.0	138	0.8321
0.8842	24.0	144	0.8368
0.8462	25.0	150	0.8256
0.8407	26.0	156	0.7950
0.8244	27.0	162	0.7829
0.7980	28.0	168	0.7511
0.7729	29.0	174	0.7392
0.7615	30.0	180	0.7141
0.7358	31.0	186	0.7183
0.7271	32.0	192	0.6933
0.7232	33.0	198	0.6689
0.7081	34.0	204	0.6842
0.7174	35.0	210	0.6742
0.7104	36.0	216	0.6678
0.7049	37.0	222	0.6516
0.6813	38.0	228	0.6380
0.6815	39.0	234	0.6345
0.6729	40.0	240	0.6320
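
The Step column gives 6 optimizer steps per epoch and 240 steps in total. Together with the batch size of 512, that allows a rough estimate of the training set size and of the learning rate at any step. The sketch below assumes no gradient accumulation, a single device, a kept final partial batch, and zero warmup steps. None of these are stated in the card, and the function names are illustrative.

```python
import math

BASE_LR = 1e-3
BATCH_SIZE = 512
STEPS_PER_EPOCH = 6
NUM_EPOCHS = 40
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 240, matching the last table row


def train_set_size_bounds(steps_per_epoch, batch_size):
    """Smallest and largest dataset size n with ceil(n / batch_size) == steps_per_epoch.

    Assumes the last, possibly partial, batch is kept rather than dropped.
    """
    smallest = (steps_per_epoch - 1) * batch_size + 1
    largest = steps_per_epoch * batch_size
    return smallest, largest


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    low, high = train_set_size_bounds(STEPS_PER_EPOCH, BATCH_SIZE)
    print(f"training examples: between {low} and {high}")
    for epoch in (1, 20, 40):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d} (step {step:3d}): lr = {linear_lr(step):.6f}")
```

Under these assumptions the training set holds between 2561 and 3072 examples, and the learning rate falls to half its initial value at step 120 (epoch 20) and to zero at step 240.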

Model size: 7.8M params (Safetensors)
Tensor type: F32
