readme: include number of training epochs
README.md CHANGED

@@ -22,12 +22,13 @@ Preliminary Historic Multilingual and Monolingual ByT5 Models. Following languag
 
 More details can be found in [our GitHub repository](https://github.com/stefan-it/hmByT5).
 
-
 # Pretraining
 
 We use the official JAX/FLAX example in Hugging Face Transformers to pretrain a ByT5 model on a single v3-8 TPU.
 Details about the training can be found [here](https://github.com/stefan-it/hmByT5/tree/main/hmbyt5-flax).
 
+The model was trained for 0.5 epoch.
+
 # Evaluation on Downstream Tasks (NER)
 
 We evaluated the hmByT5 model on downstream tasks:
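Since the README points to the JAX/FLAX pretraining setup but does not show usage, here is a minimal sketch of how a ByT5-style checkpoint like this one can be loaded with Hugging Face Transformers. It assumes the checkpoint works with the standard T5/ByT5 classes; the model id below is a placeholder (`google/byt5-small`), not the hmByT5 repository id, so substitute the actual checkpoint name.

```python
# Minimal sketch, assuming the checkpoint loads with the standard T5/ByT5 classes.
from transformers import AutoTokenizer, T5ForConditionalGeneration

# Placeholder id -- replace with the actual hmByT5 model repository id.
model_id = "google/byt5-small"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id)

# ByT5 operates directly on UTF-8 bytes, so no subword vocabulary is involved.
inputs = tokenizer("Some historic text to encode.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that a checkpoint pretrained for only 0.5 epoch is an intermediate artifact; the sketch above only illustrates loading and running it, not expected generation quality.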