akashsri99's Collections: My papers
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Paper
• 2501.12948
• Published
• 441
Training Language Models to Self-Correct via Reinforcement Learning
Paper
• 2409.12917
• Published
• 140
StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation
Paper
• 2409.12576
• Published
• 16
Transformer Explainer: Interactive Learning of Text-Generative Models
Paper
• 2408.04619
• Published
• 175
Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models
Paper
• 2505.04921
• Published
• 186
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
Paper
• 2505.24726
• Published
• 277
Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models
Paper
• 2506.06395
• Published
• 133
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Paper
• 2505.24863
• Published
• 97
Time Blindness: Why Video-Language Models Can't See What Humans Can?
Paper
• 2505.24867
• Published
• 82
ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning
Paper
• 2506.09513
• Published
• 101
Seedance 1.0: Exploring the Boundaries of Video Generation Models
Paper
• 2506.09113
• Published
• 107
REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards
Paper
• 2505.24760
• Published
• 74
Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models
Paper
• 2506.05176
• Published
• 79
Video World Models with Long-term Spatial Memory
Paper
• 2506.05284
• Published
• 55
LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning
Paper
• 2506.18841
• Published
• 56
Decoupled Planning and Execution: A Hierarchical Reasoning Framework for Deep Search
Paper
• 2507.02652
• Published
• 26
MemOS: A Memory OS for AI System
Paper
• 2507.03724
• Published
• 159
WithAnyone: Towards Controllable and ID Consistent Image Generation
Paper
• 2510.14975
• Published
• 85