Article GGML and llama.cpp join HF to ensure the long-term progress of Local AI +4 1 day ago • 176
ColBERT-Zero 🐶 Collection First large-scale fully pre-trained ColBERT model using only public data, outperforming GTE-ModernColBERT and GTE-ModernBERT • 10 items • Updated 1 day ago • 12
Article ColBERT-Zero: To Pre-train Or Not To Pre-train ColBERT models? 1 day ago • 10
jina-embeddings-v5-text: Task-Targeted Embedding Distillation Paper • 2602.15547 • Published 4 days ago • 20
jina-embeddings-v5-text Collection Our 5th-gen embeddings: two lightweight multilingual models with SOTA performance in retrieval, matching, clustering, and classification. • 23 items • Updated 2 days ago • 26
Article LateOn-Code & ColGrep: LightOn unveils state-of-the-art code retrieval models and code search tooling 9 days ago • 44
LateOn-Code 💻 Collection State-of-the-art late interaction code retrieval models • 6 items • Updated 1 day ago • 13
Article Community Evals: Because we're done trusting black-box leaderboards over the community +5 17 days ago • 72
Article Nano-BEIR: A Multilingual Information Retrieval Benchmark with Quality-Enhanced Queries Dec 22, 2025 • 9
Article Introducing Daggr: Chain apps programmatically, inspect visually +3 23 days ago • 100
Article Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective 25 days ago • 56
Article Tokenization in Transformers v5: Simpler, Clearer, and More Modular +4 Dec 18, 2025 • 120
Article RexRerankers: SOTA Rankers for Product Discovery and AI Assistants 28 days ago • 44
Embedding Models Collection Run or fine-tune embedding models with Unsloth. • 14 items • Updated 5 days ago • 3