arxiv:2512.05103
Shang-Wen Daniel Li
swdanielli
AI & ML interests
Large foundation models, vision-language multimodal learning, pretraining, and self-supervised training
Recent Activity
upvoted a collection about 1 hour ago: Pixio
authored a paper 10 days ago: TV2TV: A Unified Framework for Interleaved Language and Video Generation
liked a model 15 days ago: facebook/metaclip-2-worldwide-s16-384