| 2023-10-13 08:52:45,276 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:52:45,277 Model: "SequenceTagger( | |
| (embeddings): TransformerWordEmbeddings( | |
| (model): BertModel( | |
| (embeddings): BertEmbeddings( | |
| (word_embeddings): Embedding(32001, 768) | |
| (position_embeddings): Embedding(512, 768) | |
| (token_type_embeddings): Embedding(2, 768) | |
| (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) | |
| (dropout): Dropout(p=0.1, inplace=False) | |
| ) | |
| (encoder): BertEncoder( | |
| (layer): ModuleList( | |
| (0-11): 12 x BertLayer( | |
| (attention): BertAttention( | |
| (self): BertSelfAttention( | |
| (query): Linear(in_features=768, out_features=768, bias=True) | |
| (key): Linear(in_features=768, out_features=768, bias=True) | |
| (value): Linear(in_features=768, out_features=768, bias=True) | |
| (dropout): Dropout(p=0.1, inplace=False) | |
| ) | |
| (output): BertSelfOutput( | |
| (dense): Linear(in_features=768, out_features=768, bias=True) | |
| (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) | |
| (dropout): Dropout(p=0.1, inplace=False) | |
| ) | |
| ) | |
| (intermediate): BertIntermediate( | |
| (dense): Linear(in_features=768, out_features=3072, bias=True) | |
| (intermediate_act_fn): GELUActivation() | |
| ) | |
| (output): BertOutput( | |
| (dense): Linear(in_features=3072, out_features=768, bias=True) | |
| (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) | |
| (dropout): Dropout(p=0.1, inplace=False) | |
| ) | |
| ) | |
| ) | |
| ) | |
| (pooler): BertPooler( | |
| (dense): Linear(in_features=768, out_features=768, bias=True) | |
| (activation): Tanh() | |
| ) | |
| ) | |
| ) | |
| (locked_dropout): LockedDropout(p=0.5) | |
| (linear): Linear(in_features=768, out_features=25, bias=True) | |
| (loss_function): CrossEntropyLoss() | |
| )" | |
| 2023-10-13 08:52:45,277 ---------------------------------------------------------------------------------------------------- | |
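The dump above is the standard PyTorch module representation of the Flair SequenceTagger used in this run: a BERT backbone (12 layers, hidden size 768, 32001-token vocabulary) wrapped in TransformerWordEmbeddings, followed by locked dropout and a plain linear layer over 25 tags, with no CRF. As a rough, non-authoritative sketch, an equivalent embedding stack could be instantiated as follows; the backbone name and the pooling/layer settings are inferred from the training base path logged further down, not stated in the dump itself.

```python
from flair.embeddings import TransformerWordEmbeddings

# Non-authoritative reconstruction: the backbone name is read off the training
# base path below ("dbmdz/bert-base-historic-multilingual-cased"), and the
# "poolingfirst-layers-1" suffix suggests first-subtoken pooling over the
# last transformer layer only.
embeddings = TransformerWordEmbeddings(
    model="dbmdz/bert-base-historic-multilingual-cased",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# Printing the wrapped Hugging Face module gives a dump like the one above.
print(embeddings.model)
```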
| 2023-10-13 08:52:45,277 MultiCorpus: 1100 train + 206 dev + 240 test sentences | |
| - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator | |
| 2023-10-13 08:52:45,277 ---------------------------------------------------------------------------------------------------- | |
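The data is the German AJMC subset of HIPE-2022 (1100/206/240 train/dev/test sentences), loaded through Flair's dataset wrapper. The sketch below is an assumption-laden reconstruction: the constructor arguments, including the document-separator option suggested by the cached path, may differ between Flair versions.

```python
from flair.datasets import NER_HIPE_2022

# Assumed constructor arguments: the cached path above ends in
# ".../ajmc/de/with_doc_seperator", which points to the German AJMC subset
# with document separators enabled; check the signature of your Flair version.
corpus = NER_HIPE_2022(
    dataset_name="ajmc",
    language="de",
    add_document_separator=True,
)
print(corpus)  # expected: 1100 train + 206 dev + 240 test sentences

# The 25-tag BIOES dictionary used by the tagger (see the final log lines)
# is built from the corpus:
label_dictionary = corpus.make_label_dictionary(label_type="ner")
```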
| 2023-10-13 08:52:45,277 Train: 1100 sentences | |
| 2023-10-13 08:52:45,277 (train_with_dev=False, train_with_test=False) | |
| 2023-10-13 08:52:45,277 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:52:45,277 Training Params: | |
| 2023-10-13 08:52:45,277 - learning_rate: "5e-05" | |
| 2023-10-13 08:52:45,277 - mini_batch_size: "4" | |
| 2023-10-13 08:52:45,277 - max_epochs: "10" | |
| 2023-10-13 08:52:45,277 - shuffle: "True" | |
| 2023-10-13 08:52:45,277 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:52:45,277 Plugins: | |
| 2023-10-13 08:52:45,277 - LinearScheduler | warmup_fraction: '0.1' | |
| 2023-10-13 08:52:45,277 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:52:45,277 Final evaluation on model from best epoch (best-model.pt) | |
| 2023-10-13 08:52:45,277 - metric: "('micro avg', 'f1-score')" | |
| 2023-10-13 08:52:45,277 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:52:45,278 Computation: | |
| 2023-10-13 08:52:45,278 - compute on device: cuda:0 | |
| 2023-10-13 08:52:45,278 - embedding storage: none | |
| 2023-10-13 08:52:45,278 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:52:45,278 Model training base path: "hmbench-ajmc/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5" | |
| 2023-10-13 08:52:45,278 ---------------------------------------------------------------------------------------------------- | |
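Taken together, the configuration above (batch size 4, learning rate 5e-05, 10 epochs, a linear schedule with 10% warmup, model selection on dev micro-F1, no embedding storage) corresponds to a standard Flair fine-tuning run. The following sketch shows how such a run could be launched; it is a reconstruction under the assumptions already noted, not the exact training script used here.

```python
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Corpus and label dictionary, as sketched above (arguments are assumptions).
corpus = NER_HIPE_2022(dataset_name="ajmc", language="de", add_document_separator=True)
label_dictionary = corpus.make_label_dictionary(label_type="ner")

# Embeddings and tagger matching the module dump at the top of the log.
embeddings = TransformerWordEmbeddings(
    model="dbmdz/bert-base-historic-multilingual-cased",  # inferred from the base path
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)
tagger = SequenceTagger(
    hidden_size=256,                  # required by the constructor, unused without an RNN
    embeddings=embeddings,
    tag_dictionary=label_dictionary,
    tag_type="ner",
    use_crf=False,                    # plain linear layer + CrossEntropyLoss, as in the dump
    use_rnn=False,
    reproject_embeddings=False,
)

# Fine-tuning with the logged hyperparameters. fine_tune() applies AdamW with a
# linear LR schedule and ~10% warmup by default, which matches the
# "LinearScheduler | warmup_fraction: 0.1" plugin; model selection on
# ("micro avg", "f1-score") over the dev split and embedding storage "none"
# are likewise Flair defaults.
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-ajmc/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5",
    learning_rate=5e-05,
    mini_batch_size=4,
    max_epochs=10,
)
```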
| 2023-10-13 08:52:45,278 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:52:46,561 epoch 1 - iter 27/275 - loss 3.10895019 - time (sec): 1.28 - samples/sec: 1644.12 - lr: 0.000005 - momentum: 0.000000 | |
| 2023-10-13 08:52:47,828 epoch 1 - iter 54/275 - loss 2.47767572 - time (sec): 2.55 - samples/sec: 1604.37 - lr: 0.000010 - momentum: 0.000000 | |
| 2023-10-13 08:52:49,052 epoch 1 - iter 81/275 - loss 1.94992029 - time (sec): 3.77 - samples/sec: 1650.47 - lr: 0.000015 - momentum: 0.000000 | |
| 2023-10-13 08:52:50,213 epoch 1 - iter 108/275 - loss 1.63188107 - time (sec): 4.93 - samples/sec: 1757.27 - lr: 0.000019 - momentum: 0.000000 | |
| 2023-10-13 08:52:51,415 epoch 1 - iter 135/275 - loss 1.40107997 - time (sec): 6.14 - samples/sec: 1789.74 - lr: 0.000024 - momentum: 0.000000 | |
| 2023-10-13 08:52:52,598 epoch 1 - iter 162/275 - loss 1.24551142 - time (sec): 7.32 - samples/sec: 1806.96 - lr: 0.000029 - momentum: 0.000000 | |
| 2023-10-13 08:52:53,771 epoch 1 - iter 189/275 - loss 1.10653093 - time (sec): 8.49 - samples/sec: 1841.74 - lr: 0.000034 - momentum: 0.000000 | |
| 2023-10-13 08:52:54,943 epoch 1 - iter 216/275 - loss 1.00884801 - time (sec): 9.66 - samples/sec: 1844.13 - lr: 0.000039 - momentum: 0.000000 | |
| 2023-10-13 08:52:56,117 epoch 1 - iter 243/275 - loss 0.92202368 - time (sec): 10.84 - samples/sec: 1855.11 - lr: 0.000044 - momentum: 0.000000 | |
| 2023-10-13 08:52:57,301 epoch 1 - iter 270/275 - loss 0.85863643 - time (sec): 12.02 - samples/sec: 1859.60 - lr: 0.000049 - momentum: 0.000000 | |
| 2023-10-13 08:52:57,528 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:52:57,528 EPOCH 1 done: loss 0.8490 - lr: 0.000049 | |
| 2023-10-13 08:52:58,240 DEV : loss 0.19745545089244843 - f1-score (micro avg) 0.7379 | |
| 2023-10-13 08:52:58,244 saving best model | |
| 2023-10-13 08:52:58,671 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:52:59,849 epoch 2 - iter 27/275 - loss 0.22311487 - time (sec): 1.18 - samples/sec: 1913.64 - lr: 0.000049 - momentum: 0.000000 | |
| 2023-10-13 08:53:01,051 epoch 2 - iter 54/275 - loss 0.17238447 - time (sec): 2.38 - samples/sec: 1871.75 - lr: 0.000049 - momentum: 0.000000 | |
| 2023-10-13 08:53:02,259 epoch 2 - iter 81/275 - loss 0.16139449 - time (sec): 3.59 - samples/sec: 1912.00 - lr: 0.000048 - momentum: 0.000000 | |
| 2023-10-13 08:53:03,469 epoch 2 - iter 108/275 - loss 0.16624118 - time (sec): 4.80 - samples/sec: 1958.01 - lr: 0.000048 - momentum: 0.000000 | |
| 2023-10-13 08:53:04,666 epoch 2 - iter 135/275 - loss 0.16488607 - time (sec): 5.99 - samples/sec: 1954.84 - lr: 0.000047 - momentum: 0.000000 | |
| 2023-10-13 08:53:05,862 epoch 2 - iter 162/275 - loss 0.16176460 - time (sec): 7.19 - samples/sec: 1914.18 - lr: 0.000047 - momentum: 0.000000 | |
| 2023-10-13 08:53:07,057 epoch 2 - iter 189/275 - loss 0.16462723 - time (sec): 8.38 - samples/sec: 1877.36 - lr: 0.000046 - momentum: 0.000000 | |
| 2023-10-13 08:53:08,276 epoch 2 - iter 216/275 - loss 0.16210098 - time (sec): 9.60 - samples/sec: 1884.46 - lr: 0.000046 - momentum: 0.000000 | |
| 2023-10-13 08:53:09,482 epoch 2 - iter 243/275 - loss 0.16108364 - time (sec): 10.81 - samples/sec: 1881.24 - lr: 0.000045 - momentum: 0.000000 | |
| 2023-10-13 08:53:10,668 epoch 2 - iter 270/275 - loss 0.15993556 - time (sec): 12.00 - samples/sec: 1865.96 - lr: 0.000045 - momentum: 0.000000 | |
| 2023-10-13 08:53:10,892 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:53:10,892 EPOCH 2 done: loss 0.1598 - lr: 0.000045 | |
| 2023-10-13 08:53:11,579 DEV : loss 0.14459964632987976 - f1-score (micro avg) 0.8109 | |
| 2023-10-13 08:53:11,584 saving best model | |
| 2023-10-13 08:53:12,108 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:53:13,264 epoch 3 - iter 27/275 - loss 0.09522301 - time (sec): 1.15 - samples/sec: 1917.77 - lr: 0.000044 - momentum: 0.000000 | |
| 2023-10-13 08:53:14,420 epoch 3 - iter 54/275 - loss 0.08734941 - time (sec): 2.31 - samples/sec: 1979.32 - lr: 0.000043 - momentum: 0.000000 | |
| 2023-10-13 08:53:15,560 epoch 3 - iter 81/275 - loss 0.09154588 - time (sec): 3.45 - samples/sec: 1938.01 - lr: 0.000043 - momentum: 0.000000 | |
| 2023-10-13 08:53:16,799 epoch 3 - iter 108/275 - loss 0.08968582 - time (sec): 4.68 - samples/sec: 1872.66 - lr: 0.000042 - momentum: 0.000000 | |
| 2023-10-13 08:53:17,963 epoch 3 - iter 135/275 - loss 0.09075184 - time (sec): 5.85 - samples/sec: 1896.47 - lr: 0.000042 - momentum: 0.000000 | |
| 2023-10-13 08:53:19,117 epoch 3 - iter 162/275 - loss 0.09846142 - time (sec): 7.00 - samples/sec: 1892.98 - lr: 0.000041 - momentum: 0.000000 | |
| 2023-10-13 08:53:20,267 epoch 3 - iter 189/275 - loss 0.10253618 - time (sec): 8.15 - samples/sec: 1929.53 - lr: 0.000041 - momentum: 0.000000 | |
| 2023-10-13 08:53:21,425 epoch 3 - iter 216/275 - loss 0.09906066 - time (sec): 9.31 - samples/sec: 1945.94 - lr: 0.000040 - momentum: 0.000000 | |
| 2023-10-13 08:53:22,606 epoch 3 - iter 243/275 - loss 0.10227532 - time (sec): 10.49 - samples/sec: 1927.28 - lr: 0.000040 - momentum: 0.000000 | |
| 2023-10-13 08:53:23,828 epoch 3 - iter 270/275 - loss 0.10650752 - time (sec): 11.71 - samples/sec: 1911.09 - lr: 0.000039 - momentum: 0.000000 | |
| 2023-10-13 08:53:24,050 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:53:24,050 EPOCH 3 done: loss 0.1053 - lr: 0.000039 | |
| 2023-10-13 08:53:24,694 DEV : loss 0.17806529998779297 - f1-score (micro avg) 0.8447 | |
| 2023-10-13 08:53:24,699 saving best model | |
| 2023-10-13 08:53:25,181 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:53:26,360 epoch 4 - iter 27/275 - loss 0.07435083 - time (sec): 1.18 - samples/sec: 2025.78 - lr: 0.000038 - momentum: 0.000000 | |
| 2023-10-13 08:53:27,553 epoch 4 - iter 54/275 - loss 0.08263282 - time (sec): 2.37 - samples/sec: 2018.70 - lr: 0.000038 - momentum: 0.000000 | |
| 2023-10-13 08:53:28,776 epoch 4 - iter 81/275 - loss 0.07225731 - time (sec): 3.59 - samples/sec: 1928.70 - lr: 0.000037 - momentum: 0.000000 | |
| 2023-10-13 08:53:29,987 epoch 4 - iter 108/275 - loss 0.08213242 - time (sec): 4.80 - samples/sec: 1851.95 - lr: 0.000037 - momentum: 0.000000 | |
| 2023-10-13 08:53:31,207 epoch 4 - iter 135/275 - loss 0.08166561 - time (sec): 6.02 - samples/sec: 1820.54 - lr: 0.000036 - momentum: 0.000000 | |
| 2023-10-13 08:53:32,391 epoch 4 - iter 162/275 - loss 0.07334733 - time (sec): 7.21 - samples/sec: 1839.61 - lr: 0.000036 - momentum: 0.000000 | |
| 2023-10-13 08:53:33,607 epoch 4 - iter 189/275 - loss 0.08209898 - time (sec): 8.42 - samples/sec: 1825.26 - lr: 0.000035 - momentum: 0.000000 | |
| 2023-10-13 08:53:34,815 epoch 4 - iter 216/275 - loss 0.08301624 - time (sec): 9.63 - samples/sec: 1851.02 - lr: 0.000035 - momentum: 0.000000 | |
| 2023-10-13 08:53:36,012 epoch 4 - iter 243/275 - loss 0.07839979 - time (sec): 10.83 - samples/sec: 1841.24 - lr: 0.000034 - momentum: 0.000000 | |
| 2023-10-13 08:53:37,191 epoch 4 - iter 270/275 - loss 0.08347042 - time (sec): 12.01 - samples/sec: 1857.03 - lr: 0.000034 - momentum: 0.000000 | |
| 2023-10-13 08:53:37,412 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:53:37,412 EPOCH 4 done: loss 0.0835 - lr: 0.000034 | |
| 2023-10-13 08:53:38,073 DEV : loss 0.1688060760498047 - f1-score (micro avg) 0.8544 | |
| 2023-10-13 08:53:38,078 saving best model | |
| 2023-10-13 08:53:38,531 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:53:39,736 epoch 5 - iter 27/275 - loss 0.05444421 - time (sec): 1.20 - samples/sec: 1977.90 - lr: 0.000033 - momentum: 0.000000 | |
| 2023-10-13 08:53:41,015 epoch 5 - iter 54/275 - loss 0.07833915 - time (sec): 2.48 - samples/sec: 1844.63 - lr: 0.000032 - momentum: 0.000000 | |
| 2023-10-13 08:53:42,197 epoch 5 - iter 81/275 - loss 0.08049245 - time (sec): 3.66 - samples/sec: 1810.30 - lr: 0.000032 - momentum: 0.000000 | |
| 2023-10-13 08:53:43,399 epoch 5 - iter 108/275 - loss 0.08594507 - time (sec): 4.86 - samples/sec: 1864.34 - lr: 0.000031 - momentum: 0.000000 | |
| 2023-10-13 08:53:44,612 epoch 5 - iter 135/275 - loss 0.07420887 - time (sec): 6.08 - samples/sec: 1880.16 - lr: 0.000031 - momentum: 0.000000 | |
| 2023-10-13 08:53:45,782 epoch 5 - iter 162/275 - loss 0.07220218 - time (sec): 7.25 - samples/sec: 1852.54 - lr: 0.000030 - momentum: 0.000000 | |
| 2023-10-13 08:53:47,168 epoch 5 - iter 189/275 - loss 0.07102045 - time (sec): 8.63 - samples/sec: 1824.87 - lr: 0.000030 - momentum: 0.000000 | |
| 2023-10-13 08:53:48,555 epoch 5 - iter 216/275 - loss 0.07369192 - time (sec): 10.02 - samples/sec: 1792.57 - lr: 0.000029 - momentum: 0.000000 | |
| 2023-10-13 08:53:49,760 epoch 5 - iter 243/275 - loss 0.07396676 - time (sec): 11.23 - samples/sec: 1806.95 - lr: 0.000029 - momentum: 0.000000 | |
| 2023-10-13 08:53:50,954 epoch 5 - iter 270/275 - loss 0.06944926 - time (sec): 12.42 - samples/sec: 1804.19 - lr: 0.000028 - momentum: 0.000000 | |
| 2023-10-13 08:53:51,168 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:53:51,168 EPOCH 5 done: loss 0.0688 - lr: 0.000028 | |
| 2023-10-13 08:53:51,846 DEV : loss 0.16338446736335754 - f1-score (micro avg) 0.8747 | |
| 2023-10-13 08:53:51,850 saving best model | |
| 2023-10-13 08:53:52,335 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:53:53,550 epoch 6 - iter 27/275 - loss 0.04940648 - time (sec): 1.21 - samples/sec: 1858.47 - lr: 0.000027 - momentum: 0.000000 | |
| 2023-10-13 08:53:54,796 epoch 6 - iter 54/275 - loss 0.06085884 - time (sec): 2.46 - samples/sec: 1732.55 - lr: 0.000027 - momentum: 0.000000 | |
| 2023-10-13 08:53:56,072 epoch 6 - iter 81/275 - loss 0.06532074 - time (sec): 3.74 - samples/sec: 1699.76 - lr: 0.000026 - momentum: 0.000000 | |
| 2023-10-13 08:53:57,296 epoch 6 - iter 108/275 - loss 0.05344166 - time (sec): 4.96 - samples/sec: 1744.96 - lr: 0.000026 - momentum: 0.000000 | |
| 2023-10-13 08:53:58,546 epoch 6 - iter 135/275 - loss 0.05623617 - time (sec): 6.21 - samples/sec: 1728.58 - lr: 0.000025 - momentum: 0.000000 | |
| 2023-10-13 08:53:59,733 epoch 6 - iter 162/275 - loss 0.05367976 - time (sec): 7.40 - samples/sec: 1733.95 - lr: 0.000025 - momentum: 0.000000 | |
| 2023-10-13 08:54:00,954 epoch 6 - iter 189/275 - loss 0.05240200 - time (sec): 8.62 - samples/sec: 1760.60 - lr: 0.000024 - momentum: 0.000000 | |
| 2023-10-13 08:54:02,179 epoch 6 - iter 216/275 - loss 0.04827172 - time (sec): 9.84 - samples/sec: 1780.91 - lr: 0.000024 - momentum: 0.000000 | |
| 2023-10-13 08:54:03,392 epoch 6 - iter 243/275 - loss 0.04464126 - time (sec): 11.06 - samples/sec: 1811.84 - lr: 0.000023 - momentum: 0.000000 | |
| 2023-10-13 08:54:04,679 epoch 6 - iter 270/275 - loss 0.04294188 - time (sec): 12.34 - samples/sec: 1820.90 - lr: 0.000022 - momentum: 0.000000 | |
| 2023-10-13 08:54:04,901 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:54:04,902 EPOCH 6 done: loss 0.0432 - lr: 0.000022 | |
| 2023-10-13 08:54:05,579 DEV : loss 0.17971542477607727 - f1-score (micro avg) 0.8616 | |
| 2023-10-13 08:54:05,584 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:54:06,863 epoch 7 - iter 27/275 - loss 0.02639286 - time (sec): 1.28 - samples/sec: 1746.64 - lr: 0.000022 - momentum: 0.000000 | |
| 2023-10-13 08:54:08,118 epoch 7 - iter 54/275 - loss 0.02728147 - time (sec): 2.53 - samples/sec: 1812.88 - lr: 0.000021 - momentum: 0.000000 | |
| 2023-10-13 08:54:09,328 epoch 7 - iter 81/275 - loss 0.02961491 - time (sec): 3.74 - samples/sec: 1753.35 - lr: 0.000021 - momentum: 0.000000 | |
| 2023-10-13 08:54:10,561 epoch 7 - iter 108/275 - loss 0.03212972 - time (sec): 4.98 - samples/sec: 1816.74 - lr: 0.000020 - momentum: 0.000000 | |
| 2023-10-13 08:54:11,782 epoch 7 - iter 135/275 - loss 0.03119713 - time (sec): 6.20 - samples/sec: 1787.67 - lr: 0.000020 - momentum: 0.000000 | |
| 2023-10-13 08:54:12,951 epoch 7 - iter 162/275 - loss 0.02884273 - time (sec): 7.37 - samples/sec: 1785.48 - lr: 0.000019 - momentum: 0.000000 | |
| 2023-10-13 08:54:14,131 epoch 7 - iter 189/275 - loss 0.02833670 - time (sec): 8.55 - samples/sec: 1799.21 - lr: 0.000019 - momentum: 0.000000 | |
| 2023-10-13 08:54:15,316 epoch 7 - iter 216/275 - loss 0.03330063 - time (sec): 9.73 - samples/sec: 1785.65 - lr: 0.000018 - momentum: 0.000000 | |
| 2023-10-13 08:54:16,508 epoch 7 - iter 243/275 - loss 0.03494813 - time (sec): 10.92 - samples/sec: 1815.85 - lr: 0.000017 - momentum: 0.000000 | |
| 2023-10-13 08:54:17,686 epoch 7 - iter 270/275 - loss 0.03425456 - time (sec): 12.10 - samples/sec: 1839.39 - lr: 0.000017 - momentum: 0.000000 | |
| 2023-10-13 08:54:17,922 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:54:17,922 EPOCH 7 done: loss 0.0343 - lr: 0.000017 | |
| 2023-10-13 08:54:18,576 DEV : loss 0.17147016525268555 - f1-score (micro avg) 0.8725 | |
| 2023-10-13 08:54:18,581 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:54:19,773 epoch 8 - iter 27/275 - loss 0.04311433 - time (sec): 1.19 - samples/sec: 1887.91 - lr: 0.000016 - momentum: 0.000000 | |
| 2023-10-13 08:54:20,957 epoch 8 - iter 54/275 - loss 0.03521015 - time (sec): 2.38 - samples/sec: 1963.71 - lr: 0.000016 - momentum: 0.000000 | |
| 2023-10-13 08:54:22,147 epoch 8 - iter 81/275 - loss 0.02980486 - time (sec): 3.57 - samples/sec: 1959.96 - lr: 0.000015 - momentum: 0.000000 | |
| 2023-10-13 08:54:23,316 epoch 8 - iter 108/275 - loss 0.02603309 - time (sec): 4.73 - samples/sec: 1924.60 - lr: 0.000015 - momentum: 0.000000 | |
| 2023-10-13 08:54:24,491 epoch 8 - iter 135/275 - loss 0.03205212 - time (sec): 5.91 - samples/sec: 1884.41 - lr: 0.000014 - momentum: 0.000000 | |
| 2023-10-13 08:54:25,682 epoch 8 - iter 162/275 - loss 0.02823796 - time (sec): 7.10 - samples/sec: 1876.39 - lr: 0.000014 - momentum: 0.000000 | |
| 2023-10-13 08:54:26,856 epoch 8 - iter 189/275 - loss 0.02653460 - time (sec): 8.27 - samples/sec: 1844.64 - lr: 0.000013 - momentum: 0.000000 | |
| 2023-10-13 08:54:28,051 epoch 8 - iter 216/275 - loss 0.02663013 - time (sec): 9.47 - samples/sec: 1863.29 - lr: 0.000012 - momentum: 0.000000 | |
| 2023-10-13 08:54:29,243 epoch 8 - iter 243/275 - loss 0.02604939 - time (sec): 10.66 - samples/sec: 1873.52 - lr: 0.000012 - momentum: 0.000000 | |
| 2023-10-13 08:54:30,423 epoch 8 - iter 270/275 - loss 0.02376052 - time (sec): 11.84 - samples/sec: 1885.98 - lr: 0.000011 - momentum: 0.000000 | |
| 2023-10-13 08:54:30,645 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:54:30,645 EPOCH 8 done: loss 0.0251 - lr: 0.000011 | |
| 2023-10-13 08:54:31,302 DEV : loss 0.1549442708492279 - f1-score (micro avg) 0.8806 | |
| 2023-10-13 08:54:31,308 saving best model | |
| 2023-10-13 08:54:31,765 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:54:32,910 epoch 9 - iter 27/275 - loss 0.00512435 - time (sec): 1.14 - samples/sec: 2101.39 - lr: 0.000011 - momentum: 0.000000 | |
| 2023-10-13 08:54:34,063 epoch 9 - iter 54/275 - loss 0.01456194 - time (sec): 2.29 - samples/sec: 2017.01 - lr: 0.000010 - momentum: 0.000000 | |
| 2023-10-13 08:54:35,333 epoch 9 - iter 81/275 - loss 0.01353969 - time (sec): 3.56 - samples/sec: 1912.47 - lr: 0.000010 - momentum: 0.000000 | |
| 2023-10-13 08:54:36,492 epoch 9 - iter 108/275 - loss 0.02093541 - time (sec): 4.72 - samples/sec: 1961.28 - lr: 0.000009 - momentum: 0.000000 | |
| 2023-10-13 08:54:37,647 epoch 9 - iter 135/275 - loss 0.02290426 - time (sec): 5.88 - samples/sec: 1945.14 - lr: 0.000009 - momentum: 0.000000 | |
| 2023-10-13 08:54:38,900 epoch 9 - iter 162/275 - loss 0.02097056 - time (sec): 7.13 - samples/sec: 1926.88 - lr: 0.000008 - momentum: 0.000000 | |
| 2023-10-13 08:54:40,128 epoch 9 - iter 189/275 - loss 0.02272476 - time (sec): 8.36 - samples/sec: 1904.27 - lr: 0.000007 - momentum: 0.000000 | |
| 2023-10-13 08:54:41,322 epoch 9 - iter 216/275 - loss 0.02094148 - time (sec): 9.55 - samples/sec: 1895.67 - lr: 0.000007 - momentum: 0.000000 | |
| 2023-10-13 08:54:42,523 epoch 9 - iter 243/275 - loss 0.02067294 - time (sec): 10.75 - samples/sec: 1896.18 - lr: 0.000006 - momentum: 0.000000 | |
| 2023-10-13 08:54:43,702 epoch 9 - iter 270/275 - loss 0.02004411 - time (sec): 11.93 - samples/sec: 1872.05 - lr: 0.000006 - momentum: 0.000000 | |
| 2023-10-13 08:54:43,925 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:54:43,925 EPOCH 9 done: loss 0.0201 - lr: 0.000006 | |
| 2023-10-13 08:54:44,602 DEV : loss 0.1626124531030655 - f1-score (micro avg) 0.886 | |
| 2023-10-13 08:54:44,607 saving best model | |
| 2023-10-13 08:54:45,064 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:54:46,221 epoch 10 - iter 27/275 - loss 0.02174473 - time (sec): 1.15 - samples/sec: 1899.99 - lr: 0.000005 - momentum: 0.000000 | |
| 2023-10-13 08:54:47,383 epoch 10 - iter 54/275 - loss 0.02938685 - time (sec): 2.32 - samples/sec: 1956.41 - lr: 0.000005 - momentum: 0.000000 | |
| 2023-10-13 08:54:48,721 epoch 10 - iter 81/275 - loss 0.02089692 - time (sec): 3.66 - samples/sec: 1859.63 - lr: 0.000004 - momentum: 0.000000 | |
| 2023-10-13 08:54:49,918 epoch 10 - iter 108/275 - loss 0.01667753 - time (sec): 4.85 - samples/sec: 1875.28 - lr: 0.000004 - momentum: 0.000000 | |
| 2023-10-13 08:54:51,178 epoch 10 - iter 135/275 - loss 0.01710559 - time (sec): 6.11 - samples/sec: 1860.81 - lr: 0.000003 - momentum: 0.000000 | |
| 2023-10-13 08:54:52,393 epoch 10 - iter 162/275 - loss 0.01457507 - time (sec): 7.33 - samples/sec: 1871.74 - lr: 0.000002 - momentum: 0.000000 | |
| 2023-10-13 08:54:53,560 epoch 10 - iter 189/275 - loss 0.01657367 - time (sec): 8.49 - samples/sec: 1869.14 - lr: 0.000002 - momentum: 0.000000 | |
| 2023-10-13 08:54:54,777 epoch 10 - iter 216/275 - loss 0.01468909 - time (sec): 9.71 - samples/sec: 1871.33 - lr: 0.000001 - momentum: 0.000000 | |
| 2023-10-13 08:54:55,985 epoch 10 - iter 243/275 - loss 0.01482658 - time (sec): 10.92 - samples/sec: 1871.96 - lr: 0.000001 - momentum: 0.000000 | |
| 2023-10-13 08:54:57,149 epoch 10 - iter 270/275 - loss 0.01473182 - time (sec): 12.08 - samples/sec: 1850.45 - lr: 0.000000 - momentum: 0.000000 | |
| 2023-10-13 08:54:57,371 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:54:57,371 EPOCH 10 done: loss 0.0147 - lr: 0.000000 | |
| 2023-10-13 08:54:58,144 DEV : loss 0.1649598479270935 - f1-score (micro avg) 0.8828 | |
| 2023-10-13 08:54:58,760 ---------------------------------------------------------------------------------------------------- | |
| 2023-10-13 08:54:58,761 Loading model from best epoch ... | |
| 2023-10-13 08:55:00,476 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date | |
| 2023-10-13 08:55:01,140 | |
| Results: | |
| - F-score (micro) 0.9048 | |
| - F-score (macro) 0.8089 | |
| - Accuracy 0.8402 | |
| By class: | |
|               precision    recall  f1-score   support | |
|        scope     0.8933    0.9034    0.8983       176 | |
|         pers     0.9597    0.9297    0.9444       128 | |
|         work     0.8462    0.8919    0.8684        74 | |
|          loc     0.5000    1.0000    0.6667         2 | |
|       object     1.0000    0.5000    0.6667         2 | |
|    micro avg     0.9013    0.9084    0.9048       382 | |
|    macro avg     0.8398    0.8450    0.8089       382 | |
| weighted avg     0.9049    0.9084    0.9056       382 | |
| 2023-10-13 08:55:01,140 ---------------------------------------------------------------------------------------------------- | |
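The checkpoint selected on the dev split (micro-F1 0.886 after epoch 9) reaches a test micro-F1 of 0.9048. Below is a short usage sketch for the saved model; the checkpoint path is assembled from the logged base path and the input sentence is an arbitrary placeholder.

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# Checkpoint path assembled from the logged base path; adjust as needed.
tagger = SequenceTagger.load(
    "hmbench-ajmc/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5/best-model.pt"
)

# Arbitrary placeholder sentence; the model predicts BIOES tags over the six
# AJMC entity types (scope, pers, work, loc, object, date).
sentence = Sentence("Vgl. Wilamowitz , Einleitung in die griechische Tragödie , S. 44 .")
tagger.predict(sentence)

# Decoded spans with label value and confidence score.
for span in sentence.get_spans("ner"):
    print(span.text, span.get_label("ner").value, round(span.get_label("ner").score, 2))
```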