Pushi

zpschang

https://zpschang.github.io/

AI & ML interests

Embodied AI, Reinforcement Learning

upvoted a paper 2 months ago

Diffusion Transformers with Representation Autoencoders

Paper • 2510.11690 • Published Oct 13, 2025 • 165

upvoted a paper 3 months ago

Qwen3 Technical Report

Paper • 2505.09388 • Published May 14, 2025 • 320

upvoted a paper 4 months ago

PIG-Nav: Key Insights for Pretrained Image Goal Navigation Models

Paper • 2507.17220 • Published Jul 23, 2025 • 1

upvoted a paper 5 months ago

villa-X: Enhancing Latent Action Modeling in Vision-Language-Action Models

Paper • 2507.23682 • Published Jul 31, 2025 • 23

upvoted 4 papers 10 months ago

Large Language Diffusion Models

Paper • 2502.09992 • Published Feb 14, 2025 • 123

Learning Getting-Up Policies for Real-World Humanoid Robots

Paper • 2502.12152 • Published Feb 17, 2025 • 42

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published Feb 20, 2025 • 156

Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity

Paper • 2502.13063 • Published Feb 18, 2025 • 73

upvoted 4 papers about 1 year ago

IGOR: Image-GOal Representations are the Atomic Control Units for Foundation Models in Embodied AI

Paper • 2411.00785 • Published Oct 17, 2024 • 8

Distributional Reinforcement Learning for Multi-Dimensional Reward Functions

Paper • 2110.13578 • Published Oct 26, 2021 • 1

Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation

Paper • 2410.05363 • Published Oct 7, 2024 • 45

ROCKET-1: Master Open-World Interaction with Visual-Temporal Context Prompting

Paper • 2410.17856 • Published Oct 23, 2024 • 52

upvoted 3 papers almost 2 years ago

SceneScript: Reconstructing Scenes With An Autoregressive Structured Language Model

Paper • 2403.13064 • Published Mar 19, 2024 • 31

Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Paper • 2401.10891 • Published Jan 19, 2024 • 62

SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities

Paper • 2401.12168 • Published Jan 22, 2024 • 29

upvoted 4 papers about 2 years ago

LLM in a flash: Efficient Large Language Model Inference with Limited Memory

Paper • 2312.11514 • Published Dec 12, 2023 • 260

M3DBench: Let's Instruct Large Models with Multi-modal 3D Prompts

Paper • 2312.10763 • Published Dec 17, 2023 • 19

An Embodied Generalist Agent in 3D World

Paper • 2311.12871 • Published Nov 18, 2023 • 8

Holodeck: Language Guided Generation of 3D Embodied AI Environments

Paper • 2312.09067 • Published Dec 14, 2023 • 15