train_record_789_1768030958

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the record dataset. It achieves the following results on the evaluation set:

  • Loss: 1.5043
  • Num Input Tokens Seen: 928969632
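
Since this checkpoint was trained with PEFT (see framework versions below), it is an adapter rather than a full model. A minimal loading sketch, assuming the repo ids shown on this card; note that the base model is gated, so you must accept its license and authenticate (e.g. `huggingface-cli login`) before downloading:

```python
# Minimal sketch for loading this adapter on top of its base model.
# Repo ids are taken from this card; network access and Hugging Face
# authentication for the gated base model are assumed.

ADAPTER_ID = "rbelanec/train_record_789_1768030958"
BASE_ID = "meta-llama/Meta-Llama-3-8B-Instruct"

def load_model():
    # Imported inside the function so the sketch can be read and checked
    # without peft/transformers installed.
    from peft import AutoPeftModelForCausalLM
    from transformers import AutoTokenizer

    # AutoPeftModelForCausalLM reads the adapter config, downloads the
    # base model it references, and attaches the adapter weights on top.
    model = AutoPeftModelForCausalLM.from_pretrained(ADAPTER_ID)
    tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
    return model, tokenizer

if __name__ == "__main__":
    model, tokenizer = load_model()
    print(type(model).__name__)
```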

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.03
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 789
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
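
The cosine schedule with 10% warmup above can be sketched as a pure-Python function. Total steps (624840) and peak learning rate (0.03) come from this card; this is an illustration of the schedule's shape, not the exact Transformers scheduler implementation:

```python
import math

def cosine_lr_with_warmup(step, total_steps=624840, warmup_ratio=0.1, base_lr=0.03):
    """Learning rate under linear warmup followed by cosine decay.

    A sketch of the schedule named in the hyperparameters; values are
    taken from this card (total_steps = 20 epochs x 31242 steps/epoch).
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr over the first 10% of training.
        return base_lr * step / warmup_steps
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

The learning rate peaks at 0.03 when warmup ends (step 62484) and decays to zero by the final step.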

Training results

| Training Loss | Epoch | Step   | Validation Loss | Input Tokens Seen |
|---------------|-------|--------|-----------------|-------------------|
| 6.5735        | 1.0   | 31242  | 6.5507          | 46450496          |
| 1.8394        | 2.0   | 62484  | 2.2958          | 92891936          |
| 2.2052        | 3.0   | 93726  | 1.8496          | 139349440         |
| 2.4318        | 4.0   | 124968 | 1.7949          | 185796864         |
| 0.9519        | 5.0   | 156210 | 1.7236          | 232235744         |
| 1.292         | 6.0   | 187452 | 1.6322          | 278704192         |
| 1.8477        | 7.0   | 218694 | 1.6134          | 325156032         |
| 1.5435        | 8.0   | 249936 | 1.5799          | 371599168         |
| 1.2318        | 9.0   | 281178 | 1.5951          | 418050784         |
| 1.3676        | 10.0  | 312420 | 1.5493          | 464504128         |
| 0.9559        | 11.0  | 343662 | 1.5268          | 510961472         |
| 1.356         | 12.0  | 374904 | 1.5415          | 557400608         |
| 0.9902        | 13.0  | 406146 | 1.5126          | 603828768         |
| 1.3458        | 14.0  | 437388 | 1.5079          | 650269472         |
| 1.3163        | 15.0  | 468630 | 1.5085          | 696703648         |
| 0.7912        | 16.0  | 499872 | 1.5050          | 743153504         |
| 1.3663        | 17.0  | 531114 | 1.5050          | 789592640         |
| 0.9964        | 18.0  | 562356 | 1.5050          | 836057504         |
| 1.3085        | 19.0  | 593598 | 1.5043          | 882513984         |
| 1.4453        | 20.0  | 624840 | 1.5046          | 928969632         |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1

Model tree for rbelanec/train_record_789_1768030958
