Banghua Zhu

banghua

https://banghua.me

AI & ML interests

Foundation models, reinforcement learning, statistics, information theory

Recent Activity

liked a dataset 21 days ago

nvidia/Nemotron-RL-math-OpenMathReasoning

updated a dataset about 1 month ago

nvidia/Nemotron-RL-math-OpenMathReasoning

published a dataset about 1 month ago

nvidia/Nemotron-RL-instruction_following-structured_outputs

authored 2 papers over 1 year ago

From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline

Paper • 2406.11939 • Published Jun 17, 2024 • 8

The Effective Horizon Explains Deep RL Performance in Stochastic Environments

Paper • 2312.08369 • Published Dec 13, 2023

authored a paper almost 2 years ago

Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference

Paper • 2403.04132 • Published Mar 7, 2024 • 40

authored a paper about 2 years ago

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Paper • 2311.03285 • Published Nov 6, 2023 • 31

authored 7 papers over 2 years ago

Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism

Paper • 2103.12021 • Published Mar 22, 2021

Doubly Robust Self-Training

Paper • 2306.00265 • Published Jun 1, 2023 • 1

On Optimal Caching and Model Multiplexing for Large Model Inference

Paper • 2306.02003 • Published Jun 3, 2023 • 1

Fine-Tuning Language Models with Advantage-Induced Policy Alignment

Paper • 2306.02231 • Published Jun 4, 2023 • 2

Principled Reinforcement Learning with Human Feedback from Pairwise or $K$-wise Comparisons

Paper • 2301.11270 • Published Jan 26, 2023 • 2

Online Learning in Stackelberg Games with an Omniscient Follower

Paper • 2301.11518 • Published Jan 27, 2023 • 1

Jump-Start Reinforcement Learning

Paper • 2204.02372 • Published Apr 5, 2022 • 1