Evolving Deeper LLM Thinking
Paper • 2501.09891 • Published • 115

ProcessBench: Identifying Process Errors in Mathematical Reasoning
Paper • 2412.06559 • Published • 85

AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling
Paper • 2412.15084 • Published • 13

The Lessons of Developing Process Reward Models in Mathematical Reasoning
Paper • 2501.07301 • Published • 99

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
Paper • 2501.04519 • Published • 287

BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning
Paper • 2501.03226 • Published • 43

System-2 Mathematical Reasoning via Enriched Instruction Tuning
Paper • 2412.16964 • Published • 2

URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics
Paper • 2501.04686 • Published • 53

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Paper • 2402.03300 • Published • 138

Reasoning Language Models: A Blueprint
Paper • 2501.11223 • Published • 32

Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement
Paper • 2501.12273 • Published • 14

Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models
Paper • 2501.09686 • Published • 41

Offline Reinforcement Learning for LLM Multi-Step Reasoning
Paper • 2412.16145 • Published • 38

MiniMax-01: Scaling Foundation Models with Lightning Attention
Paper • 2501.08313 • Published • 300

Tensor Product Attention Is All You Need
Paper • 2501.06425 • Published • 90

Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback
Paper • 2501.10799 • Published • 15

CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction
Paper • 2502.07316 • Published • 50

Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving
Paper • 2502.07640 • Published • 9

The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding
Paper • 2502.08946 • Published • 191

Logical Reasoning in Large Language Models: A Survey
Paper • 2502.09100 • Published • 24

SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
Paper • 2502.18449 • Published • 75

BFS-Prover: Scalable Best-First Tree Search for LLM-based Automatic Theorem Proving
Paper • 2502.03438 • Published • 2

START: Self-taught Reasoner with Tools
Paper • 2503.04625 • Published • 113

Group Sequence Policy Optimization
Paper • 2507.18071 • Published • 316