Article Fine-Tuning Your First Large Language Model (LLM) with PyTorch and Hugging Face Feb 11, 2025 • 93
Resolving Discrepancies in Compute-Optimal Scaling of Language Models Paper • 2406.19146 • Published Jun 27, 2024 • 1
Article Welcome GPT OSS, the new open-source model family from OpenAI! Aug 5, 2025 • 508
Scaling Laws for Robust Comparison of Open Foundation Language-Vision Models and Datasets Paper • 2506.04598 • Published Jun 5, 2025 • 7
Reproducible scaling laws for contrastive language-image learning Paper • 2212.07143 • Published Dec 14, 2022 • 2
OpenCulture Collection A multilingual dataset of public domain books and newspapers. • 27 items • Updated Nov 6, 2024 • 131
Text-to-Image Base Models Collection All text-to-image open source base models, with their respective license • 28 items • Updated May 10, 2024 • 27
OpenCLIP DataComp Collection OpenCLIP models trained on DataComp (https://huggingface.co/papers/2304.14108). • 6 items • Updated Oct 20, 2025 • 6
OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents Paper • 2306.16527 • Published Jun 21, 2023 • 46