Active Intelligence in Video Avatars via Closed-loop World Modeling Paper • 2512.20615 • Published 10 days ago • 8
EgoX: Egocentric Video Generation from a Single Exocentric Video Paper • 2512.08269 • Published 25 days ago • 115
Alchemist: Unlocking Efficiency in Text-to-Image Model Training via Meta-Gradient Data Selection Paper • 2512.16905 • Published 15 days ago • 30
IC-Effect: Precise and Efficient Video Effects Editing via In-Context Learning Paper • 2512.15635 • Published 16 days ago • 19
Tool-Augmented Spatiotemporal Reasoning for Streamlining Video Question Answering Task Paper • 2512.10359 • Published 23 days ago • 3
Self-Improving VLM Judges Without Human Annotations Paper • 2512.05145 • Published about 1 month ago • 18
DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle Paper • 2512.04324 • Published 30 days ago • 149
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer Paper • 2511.22699 • Published Nov 27, 2025 • 219
DualVLA: Building a Generalizable Embodied Agent via Partial Decoupling of Reasoning and Action Paper • 2511.22134 • Published Nov 27, 2025 • 21
DualVLA: Building a Generalizable Embodied Agent via Partial Decoupling of Reasoning and Action Paper • 2511.22134 • Published Nov 27, 2025 • 21
DualVLA: Building a Generalizable Embodied Agent via Partial Decoupling of Reasoning and Action Paper • 2511.22134 • Published Nov 27, 2025 • 21 • 2
VisMem: Latent Vision Memory Unlocks Potential of Vision-Language Models Paper • 2511.11007 • Published Nov 14, 2025 • 15
Reasoning via Video: The First Evaluation of Video Models' Reasoning Abilities through Maze-Solving Tasks Paper • 2511.15065 • Published Nov 19, 2025 • 74
Seedream 4.0: Toward Next-generation Multimodal Image Generation Paper • 2509.20427 • Published Sep 24, 2025 • 82
UniVid: Unifying Vision Tasks with Pre-trained Video Generation Models Paper • 2509.21760 • Published Sep 26, 2025 • 14