EntroPIC: Towards Stable Long-Term Training of LLMs via Entropy Stabilization with Proportional-Integral Control Paper • 2511.15248 • Published Nov 19, 2025
S$^3$c-Math: Spontaneous Step-level Self-correction Makes Large Language Models Better Mathematical Reasoners Paper • 2409.01524 • Published Sep 3, 2024
VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models Paper • 2505.15801 • Published May 21, 2025
MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task Paper • 2502.11684 • Published Feb 17, 2025
Safe: Enhancing Mathematical Reasoning in Large Language Models via Retrospective Step-aware Formal Verification Paper • 2506.04592 • Published Jun 5, 2025
SALT4Decompile: Inferring Source-level Abstract Logic Tree for LLM-Based Binary Decompilation Paper • 2509.14646 • Published Sep 18, 2025
Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners Paper • 2509.26226 • Published Sep 30, 2025
On Predictability of Reinforcement Learning Dynamics for Large Language Models Paper • 2510.00553 • Published Oct 1, 2025
Can We Verify Step by Step for Incorrect Answer Detection? Paper • 2402.10528 • Published Feb 16, 2024
UGPhysics: A Comprehensive Benchmark for Undergraduate Physics Reasoning with Large Language Models Paper • 2502.00334 • Published Feb 1, 2025
UGMathBench: A Diverse and Dynamic Benchmark for Undergraduate-Level Mathematical Reasoning with Large Language Models Paper • 2501.13766 • Published Jan 23, 2025
Advancing Multimodal Reasoning Capabilities of Multimodal Large Language Models via Visual Perception Reward Paper • 2506.07218 • Published Jun 8, 2025
GPAS: Accelerating Convergence of LLM Pretraining via Gradient-Preserving Activation Scaling Paper • 2506.22049 • Published Jun 27, 2025
Double-Checker: Enhancing Reasoning of Slow-Thinking LLMs via Self-Critical Fine-Tuning Paper • 2506.21285 • Published Jun 26, 2025
Teaching LLMs According to Their Aptitude: Adaptive Reasoning for Mathematical Problem Solving Paper • 2502.12022 • Published Feb 17, 2025