Adaptive Preference Optimization with Uncertainty-aware Utility Anchor Paper • 2509.10515 • Published Sep 3, 2025
UltraVoice: Scaling Fine-Grained Style-Controlled Speech Conversations for Spoken Dialogue Models Paper • 2510.22588 • Published Oct 26, 2025 • 1
IMTalker: Efficient Audio-driven Talking Face Generation with Implicit Motion Transfer Paper • 2511.22167 • Published Nov 27, 2025
Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning Paper • 2512.07461 • Published 27 days ago • 74
Make an Offer They Can't Refuse: Grounding Bayesian Persuasion in Real-World Dialogues without Pre-Commitment Paper • 2510.13387 • Published Oct 15, 2025
Evaluating Generalization Capabilities of LLM-Based Agents in Mixed-Motive Scenarios Using Concordia Paper • 2512.03318 • Published Dec 3, 2025
OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs Paper • 2510.10689 • Published Oct 12, 2025 • 46
V-HUB: A Visual-Centric Humor Understanding Benchmark for Video LLMs Paper • 2509.25773 • Published Sep 30, 2025
Omni-Captioner: Data Pipeline, Models, and Benchmark for Omni Detailed Perception Paper • 2510.12720 • Published Oct 14, 2025 • 2
Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space Paper • 2505.13308 • Published May 19, 2025 • 27
OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts Paper • 2503.22952 • Published Mar 29, 2025 • 17
From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens Paper • 2502.18890 • Published Feb 26, 2025 • 30
VSTAR: A Video-grounded Dialogue Dataset for Situated Semantic Understanding with Scene and Topic Transitions Paper • 2305.18756 • Published May 30, 2023
Collaborative Reasoning on Multi-Modal Semantic Graphs for Video-Grounded Dialogue Generation Paper • 2210.12460 • Published Oct 22, 2022