Biography
I am a first-year PhD student (Fall 2025) at Peking University, advised by Prof. Ning Ding and Prof. Bowen Zhou. I am also fortunate to work closely with Dr. Ganqu Cui and Prof. Yu Cheng. Before that, I graduated from the University of Electronic Science and Technology of China (UESTC) with the Most Outstanding Students Award.
My research interests lie in building Large Reasoning Models for both the digital and physical worlds with scalable and generalizable Reinforcement Learning methods.
Please feel free to contact me if you’re interested in relevant research or would like to discuss potential collaborations!
News
- [09/2025] Released a blog post on how RL helps LLMs compose learned skills, a physics benchmark with systematic and up-to-date coverage of real-world physics competitions for (M)LLMs, a survey on RL for large reasoning models, and SimpleVLA-RL, all focusing on how to develop scalable training methods for advanced intelligence. Enjoy!
- [05/2025] Excited to release our work on scalable RL for reasoning: The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models! It ranked #1 on Hugging Face Daily Papers!
- [12/2024] I won the Most Outstanding Students Award of UESTC (the highest honor for UESTC students, top 10 among all undergraduates).
- [10/2024] I won the UESTC-LuZhouLaoJiao Scholarship (10K RMB). I also won the first-class scholarship for the third time.
- [10/2024] I was selected as an Outstanding Graduate of UESTC and an Outstanding Graduate of Sichuan Province.
- [09/2024] Two papers were accepted by NeurIPS 2024.
- [05/2024] GEOM, the first lossless graph condensation approach, was accepted by ICML 2024!
- [05/2024] Our workshop, The First Dataset Distillation Challenge, was accepted at ECCV 2024 as a half-day workshop!
- [10/2023] I won the UESTC-Huameng Scholarship (10K RMB). I also won the first-class scholarship for the second time.
- [09/2023] One paper was accepted by NeurIPS 2023.
Publications
* indicates equal contribution unless otherwise specified.
- The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models
  Ganqu Cui*, Yuchen Zhang*, Jiacheng Chen*, Lifan Yuan, Zhi Wang, Yuxin Zuo, Haozhan Li, Yuchen Fan, Huayu Chen, Weize Chen, Zhiyuan Liu, Hao Peng, Lei Bai, Wanli Ouyang, Yu Cheng, Bowen Zhou, Ning Ding
  arXiv 2025
- Process Reinforcement through Implicit Rewards
  Ganqu Cui*, Lifan Yuan*, Zefan Wang*, Hanbin Wang*, Yuchen Zhang*, Jiacheng Chen*, Wendi Li*, Bingxiang He*, Yuchen Fan*, Tianyu Yu*, Qixin Xu, Weize Chen, Jiarui Yuan, Huayu Chen, Kaiyan Zhang, Xingtai Lv, Shuo Wang, Yuan Yao, Xu Han, Hao Peng, Yu Cheng, Zhiyuan Liu, Maosong Sun, Bowen Zhou, Ning Ding
  (* Core Contributor) arXiv 2025
- TTRL: Test-Time Reinforcement Learning
  Yuxin Zuo, Kaiyan Zhang, Li Sheng, Shang Qu, Ganqu Cui, Xuekai Zhu, Haozhan Li, Yuchen Zhang, Xinwei Long, Ermo Hua, Biqing Qi, Youbang Sun, Zhiyuan Ma, Lifan Yuan, Ning Ding, Bowen Zhou
  NeurIPS 2025
- Navigating Complexity: Toward Lossless Graph Condensation via Expanding Window Matching
  Yuchen Zhang, Tianle Zhang, Kai Wang, Ziyao Guo, Yuxuan Liang, Xavier Bresson, Wei Jin, Yang You
  ICML 2024
- GDeR: Safeguarding Efficiency, Balancing, and Robustness via Prototypical Graph Pruning
  Guibin Zhang, Haonan Dong, Yuchen Zhang, Zhixun Li, Dingshuo Chen, Kai Wang, Tianlong Chen, Yuxuan Liang, Dawei Cheng, Kun Wang
  NeurIPS 2024
- Rethinking Human Evaluation Protocol for Text-to-Video Models: Enhancing Reliability, Reproducibility, and Practicality
  Tianle Zhang, Langtian Ma, Yuchen Yan, Yuchen Zhang, Kai Wang, Yue Yang, Ziyao Guo, Wenqi Shao, Yang You, Yu Qiao, Ping Luo, Kaipeng Zhang
  NeurIPS 2024
- Enhancing Knowledge Transfer for Task Incremental Learning with Data-free Subnetwork
  Qiang Gao, Xiaojun Shan, Yuchen Zhang, Fan Zhou
  NeurIPS 2023
Honors & Awards
- The Most Outstanding Students Award of UESTC (Top 0.1%)
- UESTC-Huameng Scholarship (Top 3%)
- UESTC-LuZhouLaoJiao Scholarship (Top 3%)
- Outstanding Graduate of UESTC, Outstanding Graduate of Sichuan Province (Top 3%)
- First-class Scholarship, 2022, 2023, 2024
- National Second Prize | 16th Chinese Collegiate Computing Competition in 2023 (Top 2%)
- Provincial First Prize | 16th Chinese Collegiate Computing Competition in 2023
- Provincial Second Prize | 13th National E-commerce Innovation, Creativity and Entrepreneurship Challenge
- Provincial Second Prize | 8th C4-Network Technology Challenge
- Provincial Second Prize | 9th C4-Network Technology Challenge
- Provincial Second Prize | 17th China Collegiate Computing Competition
Invited Talks
Professional Activities
Conference Reviewer: NeurIPS 2024, 2025; ICLR 2025, 2026; ICML 2025; ACM MM 2024; WWW 2024; AISTATS 2025, 2026; ICASSP 2025; ICCV 2025; AAAI 2026
Organizer: The First Dataset Distillation Challenge @ ECCV 2024