Update README.md
README.md CHANGED

@@ -9,7 +9,7 @@ language:
<img alt="OLMo Logo" src="https://cdn-uploads.huggingface.co/production/uploads/65316953791d5a2611426c20/nC44-uxMD6J6H3OHxRtVU.png" width="242px" style="margin-left:'auto' margin-right:'auto' display:'block'">


-# Model Card for Olmo 3 Instruct
+# Model Card for Olmo 3 7B Instruct

We introduce Olmo 3, a new family of 7B and 32B models in both Instruct and Think variants. Long chain-of-thought thinking improves performance on reasoning tasks like math and coding.

@@ -78,19 +78,6 @@ out = list_repo_refs("allenai/Olmo-3-7B-Instruct")
branches = [b.name for b in out.branches]
```

-### Fine-tuning
-Model fine-tuning can be done from the final checkpoint (the `main` revision of this model) or many intermediate checkpoints. Two recipes for tuning are available.
-1. Fine-tune with the OLMo-core repository:
-```bash
-torchrun --nproc-per-node=8 ./src/scripts/official/MODEL.py run01
-```
-You can override most configuration options from the command-line. For example, to override the learning rate you could launch the script like this:
-
-```bash
-torchrun --nproc-per-node=8 ./src/scripts/train/MODEL.py run01 --train_module.optim.lr=6e-3
-```
-For more documentation, see the [GitHub readme](https://github.com/allenai/OLMo-core).
-
### Model Description

- **Developed by:** Allen Institute for AI (Ai2)
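
The `list_repo_refs` context lines in the hunk above enumerate the repository's revision branches, and the removed fine-tuning text notes that tuning can start from `main` or from any intermediate checkpoint. A minimal sketch of how a branch name would be used, assuming the standard `huggingface_hub` and `transformers` APIs (the branch names printed are whatever the repo actually contains; nothing here is taken from the diff itself):

```python
from huggingface_hub import list_repo_refs
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "allenai/Olmo-3-7B-Instruct"

# Enumerate available revisions: "main" plus any intermediate-checkpoint branches.
out = list_repo_refs(repo)
branches = [b.name for b in out.branches]
print(branches)

# Load a specific revision; "main" is the final checkpoint. To load an
# intermediate checkpoint instead, pass one of the names from `branches`.
model = AutoModelForCausalLM.from_pretrained(repo, revision="main")
tokenizer = AutoTokenizer.from_pretrained(repo, revision="main")
```

The same `revision` mechanism is what the removed fine-tuning recipe relied on when starting from an intermediate checkpoint rather than `main`.
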
@@ -144,6 +131,9 @@ For more documentation, see the [GitHub readme](https://github.com/allenai/OLMo-core).
## Bias, Risks, and Limitations
Like any base language model or fine-tuned model without safety filtering, these models can easily be prompted by users to generate harmful and sensitive content. Such content may also be produced unintentionally, especially in cases involving bias, so we recommend that users consider the risks when applying this technology. Additionally, many statements from OLMo or any LLM are often inaccurate, so facts should be verified.

+## License
+This model is licensed under Apache 2.0. It is intended for research and educational use in accordance with [Ai2's Responsible Use Guidelines](https://allenai.org/responsible-use).
+

## Citation
A technical manuscript is forthcoming!
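
Since the card this diff edits describes an Instruct model, a basic chat-style generation sketch rounds out the checkpoint-loading example above. This is not part of the diff: it assumes the repo ships a chat template and that a recent `transformers` release is installed (chat-message input to the `text-generation` pipeline is standard behavior; the prompt and generation settings are illustrative):

```python
from transformers import pipeline

# Build a text-generation pipeline for the instruct model.
olmo = pipeline("text-generation", model="allenai/Olmo-3-7B-Instruct")

# Chat-format input; the pipeline applies the model's chat template.
messages = [{"role": "user", "content": "Briefly explain chain-of-thought prompting."}]
result = olmo(messages, max_new_tokens=128)

# For chat input, generated_text holds the conversation including the reply.
print(result[0]["generated_text"])
```
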