NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 6 items • Updated 6 days ago • 106
PaCoRe Collection Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning • 3 items • Updated 20 days ago • 8
Article Tensor Parallelism (TP) in Transformers: 5 Minutes to Understand 26 days ago • 63
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length Paper • 2512.04677 • Published 26 days ago • 167
DR Tulu Collection Models and data associated with DR Tulu, http://allenai-web/papers/drtulu • 5 items • Updated Nov 25 • 31
ColModernVBERT Collection Resources for ColModernVBERT – the document retrieval–optimized variant of ModernVBERT • 5 items • Updated Oct 3 • 7
Qianfan-VL Collection Qianfan-VL model series. The models are mainly domain-enhanced vision-language models, targeting enterprise-level multimodal understanding scenarios. • 4 items • Updated Sep 24 • 19
MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer Paper • 2509.16197 • Published Sep 19 • 56
Tiny Language Model Datasets Collection Collection of synthetic datasets that can be used in pretraining of any Tiny Language Model • 14 items • Updated Sep 21 • 29