Maxwell Yao

MaxwellJryao
AI & ML interests

None yet

Recent Activity

upvoted a paper about 8 hours ago
PRL: Process Reward Learning Improves LLMs' Reasoning Ability and Broadens the Reasoning Boundary
upvoted a paper 8 days ago
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization
upvoted a paper 3 months ago
GAR: Generative Adversarial Reinforcement Learning for Formal Theorem Proving

Organizations

Post-training-Data-Flywheel

authored 2 papers 9 months ago

Rethinking Diverse Human Preference Learning through Principal Component Analysis

Paper • 2502.13131 • Published Feb 18, 2025 • 37

A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce

Paper • 2504.11343 • Published Apr 15, 2025 • 19