Shunyu Liu (刘顺宇)

Research Scientist

Alibaba-NTU Global e-Sustainability CorpLab (ANGEL)
Nanyang Technological University
Singapore

Email: shunyu.liu.cs at gmail dot com

[Google Scholar] [GitHub]

Biography

I am currently a research scientist at Nanyang Technological University, working with Prof. Dacheng Tao. Before that, I received my Ph.D. in Computer Science and Technology from Zhejiang University, advised by Prof. Mingli Song and Prof. Chun Chen, and my B.Eng. in Software Engineering from Sun Yat-sen University.

My research interests include multi-agent learning, reinforcement learning, and agentic large language models. My work has been applied to autonomous power system control and to other decision-making domains. The long-term goal of my research is to develop efficient, generalizable, and practical agents. In tandem, I aim to facilitate intelligent interaction among multiple agents, empowering them to tackle complex decision-making challenges in both virtual and real worlds.

Please feel free to contact me if you are interested in my research :)

News

[Jan 2026] Two papers were accepted by ICLR 2026.
[Dec 2025] One paper was accepted by AAMAS 2026.
[Nov 2025] One paper was accepted by AAAI 2026.
[Sep 2025] Several papers were accepted by NeurIPS 2025.
[Jun 2025] One paper was accepted by ICCV 2025.
[May 2025] Two papers were accepted by ACL 2025.
[Apr 2025] Two papers were accepted by IJCAI 2025.
[Jan 2025] One paper was accepted by ICLR 2025.
[Jan 2025] One paper was accepted by WWW 2025.
[Jan 2025] One paper was accepted by IEEE Transactions on Intelligent Transportation Systems.


Surveys

A Survey on Agentic Multimodal Large Language Models
Huanjin Yao, Ruifei Zhang, Jiaxing Huang^✉, Jingyi Zhang, Yibo Wang, Bo Fang, Ruolin Zhu, Yongcheng Jing, Shunyu Liu, Guanbin Li, Dacheng Tao
arXiv preprint arXiv:2510.10991, 2025
[arXiv] [Code]

A Survey of Direct Preference Optimization
Shunyu Liu, Wenkai Fang, Zetian Hu, Junjie Zhang, Yang Zhou, Kongcheng Zhang, Rongcheng Tu, Ting-En Lin, Fei Huang, Mingli Song, Yongbin Li, Dacheng Tao^✉
arXiv preprint arXiv:2503.11701, 2025
[arXiv] [Code]

A Survey on Explainable Reinforcement Learning: Concepts, Algorithms, Challenges
Yunpeng Qing, Shunyu Liu, Jie Song, Yang Zhou, Kaixuan Chen, Huiqiong Wang^✉, Mingli Song
arXiv preprint arXiv:2211.06665, 2022
[arXiv] [Code]

Selected Publications

^* denotes equal contribution, and ^✉ denotes the corresponding author.

2026

Incentivizing LLM Reasoning via Reinforcement Learning with Functional Monte Carlo Tree Search
Kongcheng Zhang, Qi Yao, Baisheng Lai, Jiaxing Huang, Wenkai Fang, Dacheng Tao, Mingli Song, Shunyu Liu^✉
International Conference on Learning Representations (ICLR), 2026
[Paper] [arXiv] [Code]

A Simple "Motivation" Can Enhance Reinforcement Finetuning of Large Reasoning Models
Junjie Zhang, Guozheng Ma, Shunyu Liu, Haoyu Wang, Jiaxing Huang, Ting-En Lin, Fei Huang, Yongbin Li^✉, Dacheng Tao^✉
International Conference on Learning Representations (ICLR), 2026
[Paper] [arXiv]

Parallelized Planning-Acting for Multi-Agent LLM Systems in Minecraft
Yaoru Li, Shunyu Liu^✉, Tongya Zheng, Li Sun, Mingli Song
International Joint Conference on Autonomous Agents and Multi-agent Systems (AAMAS), 2026, Oral
[arXiv] [Code]

Dual-branch Spatial-Temporal Self-supervised Representation for Enhanced Road Network Learning
Qinghong Guo, Yu Wang, Ji Cao, Tongya Zheng^✉, Junshu Dai, Bingde Hu, Shunyu Liu, Canghong Jin
AAAI Conference on Artificial Intelligence (AAAI) Artificial Intelligence for Social Impact Track, 2026
[arXiv] [Code]

2025

Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning
Kongcheng Zhang, Qi Yao, Shunyu Liu^✉, Yingjie Wang, Baisheng Lai, Jieping Ye, Mingli Song, Dacheng Tao^✉
Advances in Neural Information Processing Systems (NeurIPS), 2025
[arXiv] [Code]

SeRL: Self-Play Reinforcement Learning for Large Language Models with Limited Data
Wenkai Fang, Shunyu Liu^✉, Yang Zhou, Kongcheng Zhang, Tongya Zheng, Kaixuan Chen, Mingli Song, Dacheng Tao
Advances in Neural Information Processing Systems (NeurIPS), 2025
[arXiv] [Code]

Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search
Huanjin Yao^*, Jiaxing Huang^*✉, Wenhao Wu, Jingyi Zhang, Yibo Wang, Shunyu Liu, Yingjie Wang, Yuxin Song, Haocheng Feng, Li Shen, Dacheng Tao^✉
Advances in Neural Information Processing Systems (NeurIPS), 2025, Spotlight
[arXiv] [Code]

Tree of Preferences for Diversified Recommendation
Hanyang Yuan^*, Ning Tang^*, Tongya Zheng^✉, Jiarong Xu^✉, Xintong Hu, Renhong Huang, Shunyu Liu, Jiacong Hu, Jiawei Chen, Mingli Song
Advances in Neural Information Processing Systems (NeurIPS), 2025
[arXiv] [Code]

SPAZER: Spatial-Semantic Progressive Reasoning Agent for Zero-shot 3D Visual Grounding
Zhao Jin, Rong-Cheng Tu^✉, Jingyi Liao, Wenhao Sun, Xiao Luo, Shunyu Liu, Dacheng Tao^✉
Advances in Neural Information Processing Systems (NeurIPS), 2025
[arXiv]

VORTA: Efficient Video Diffusion via Routing Sparse Attention
Wenhao Sun, Rong-Cheng Tu^✉, Yifu Ding, Jingyi Liao, Zhao Jin, Shunyu Liu, Dacheng Tao^✉
Advances in Neural Information Processing Systems (NeurIPS), 2025
[arXiv] [Code]

R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
Jingyi Zhang, Jiaxing Huang^✉, Huanjin Yao, Shunyu Liu, Xikun Zhang, Shijian Lu, Dacheng Tao
International Conference on Computer Vision (ICCV), 2025
[Paper] [arXiv] [Code]

Dynamic Parallel Tree Search for Efficient LLM Reasoning
Yifu Ding, Wentao Jiang, Shunyu Liu, Yongcheng Jing, Jinyang Guo, Yingjie Wang, Jing Zhang, Zengmao Wang^✉, Ziwei Liu, Bo Du, Xianglong Liu^✉, Dacheng Tao^✉
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
[Paper] [arXiv] [Code]

Supervised Optimism Correction: Be Confident When LLMs Are Sure
Junjie Zhang^*, Rushuai Yang^*, Shunyu Liu, Ting-En Lin, Fei Huang, Yi Chen, Yongbin Li^✉, Dacheng Tao^✉
Annual Meeting of the Association for Computational Linguistics (ACL) Findings, 2025
[Paper] [arXiv]

CADP: Towards Better Centralized Learning for Decentralized Execution in MARL
Yihe Zhou, Shunyu Liu^✉, Yunpeng Qing, Tongya Zheng, Kaixuan Chen, Jie Song, Mingli Song
International Joint Conference on Artificial Intelligence (IJCAI), 2025
[Paper] [arXiv] [Code]

Odyssey: Empowering Minecraft Agents with Open-World Skills
Shunyu Liu^*, Yaoru Li^*, Kongcheng Zhang^*, Zhenyu Cui^*, Wenkai Fang^*, Yuxuan Zheng, Tongya Zheng, Mingli Song^✉
International Joint Conference on Artificial Intelligence (IJCAI), 2025
[Paper] [arXiv] [Code]

From GNNs to Trees: Multi-Granular Interpretability for Graph Neural Networks
Jie Yang, Yuwen Wang, Kaixuan Chen, Tongya Zheng, Yihe Zhou, Zhenbang Xiao, Ji Cao, Mingli Song, Shunyu Liu^✉
International Conference on Learning Representations (ICLR), 2025
[Paper] [arXiv] [Code]

Disentangled Condensation for Large-scale Graphs
Zhenbang Xiao, Yu Wang, Shunyu Liu, Bingde Hu, Huiqiong Wang^✉, Mingli Song, Tongya Zheng
International World Wide Web Conference (WWW), 2025
[Paper] [arXiv] [Code]

Curricular Subgoals for Inverse Reinforcement Learning
Shunyu Liu^*, Yunpeng Qing^*, Shuqi Xu, Hongyan Wu, Jiangtao Zhang, Jingyuan Cong, Tianhao Chen, Yunfu Liu, Mingli Song^✉
IEEE Transactions on Intelligent Transportation Systems (TITS), 2025
[Paper] [arXiv] [Code]

Holistic Semantic Representation for Navigational Trajectory Generation
Ji Cao, Tongya Zheng^✉, Qinghong Guo, Yu Wang, Junshu Dai, Shunyu Liu, Jie Yang, Jie Song, Mingli Song
AAAI Conference on Artificial Intelligence (AAAI), 2025
[Paper] [arXiv] [Code]

Disentangled Table-Graph Representation for Interpretable Transmission Line Fault Location
Na Yu^*, Yutong Deng^*, Shunyu Liu, Kaixuan Chen^✉, Tongya Zheng, Mingli Song
AAAI Conference on Artificial Intelligence (AAAI), 2025
[Paper]

Cooperative Policy Agreement: Learning Diverse Policy for Offline MARL
Yihe Zhou, Yuxuan Zheng, Yue Hu, Kaixuan Chen, Tongya Zheng, Jie Song, Mingli Song, Shunyu Liu^✉
AAAI Conference on Artificial Intelligence (AAAI), 2025
[Paper]

Agent-Aware Training for Agent-Agnostic Action Advising in Deep Reinforcement Learning
Yaoquan Wei, Shunyu Liu^✉, Jie Song, Tongya Zheng, Kaixuan Chen, Yong Wang, Mingli Song
AAAI Conference on Artificial Intelligence (AAAI), 2025
[Paper] [arXiv]

Powerformer: A Section-adaptive Transformer for Power Flow Adjustment
Kaixuan Chen^*, Wei Luo^*, Shunyu Liu^✉, Yaoquan Wei, Yihe Zhou, Yunpeng Qing, Quan Zhang, Jie Song, Mingli Song
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) Applied Data Science Track, 2025
[Paper] [arXiv] [Code]

2024

A2PO: Towards Effective Offline Reinforcement Learning from an Advantage-aware Perspective
Yunpeng Qing, Shunyu Liu^✉, Jingyuan Cong, Kaixuan Chen, Yihe Zhou, Mingli Song
Advances in Neural Information Processing Systems (NeurIPS), 2024
[Paper] [arXiv] [Code]

Spatiotemporal-Augmented Graph Neural Networks for Human Mobility Simulation
Yu Wang, Tongya Zheng^✉, Shunyu Liu, Kaixuan Chen, Zunlei Feng, Yunzhi Hao, Mingli Song
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2024
[Paper] [arXiv] [Code]

Temporal Prototype-Aware Learning for Active Voltage Control on Power Distribution Networks
Feiyang Xu^*, Shunyu Liu^*✉, Yunpeng Qing, Yihe Zhou, Yuwen Wang, Mingli Song
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2024
[Paper] [arXiv] [Code]

Unveiling Global Interactive Patterns across Graphs: Towards Interpretable Graph Neural Networks
Yuwen Wang, Shunyu Liu^✉, Tongya Zheng, Kaixuan Chen, Mingli Song
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2024
[Paper] [arXiv] [Code]

Interaction Pattern Disentangling for Multi-Agent Reinforcement Learning
Shunyu Liu, Jie Song^✉, Yihe Zhou, Na Yu, Kaixuan Chen, Zunlei Feng, Mingli Song
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)^#, 2024
^# Top-tier Journal in Artificial Intelligence.
[Paper] [arXiv] [Code]

Improving Adversarial Robustness via Feature Pattern Consistency Constraint
Jiacong Hu, Jingwen Ye, Zunlei Feng^✉, Jiazhen Yang, Shunyu Liu, Xiaotian Yu, Lingxiang Jia, Mingli Song
International Joint Conference on Artificial Intelligence (IJCAI), 2024
[Paper] [arXiv]

COLA: Cross-city Mobility Transformer for Human Trajectory Simulation
Yu Wang, Tongya Zheng^✉, Yuxuan Liang, Shunyu Liu, Mingli Song
International World Wide Web Conference (WWW), 2024
[Paper] [arXiv] [Code]

Transmission Interface Power Flow Adjustment: A Deep Reinforcement Learning Approach based on Multi-task Attribution Map
Shunyu Liu^*, Wei Luo^*, Yanzhen Zhou, Kaixuan Chen, Quan Zhang, Huating Xu, Qinglai Guo, Mingli Song^✉
IEEE Transactions on Power Systems (TPWRS)^#, 2024
^# Top-tier Journal in Power Systems.
[Paper] [arXiv] [Code]

2023

Lookaround Optimizer: k steps around, 1 step average
Jiangtao Zhang, Shunyu Liu, Jie Song^✉, Tongtian Zhu, Zhengqi Xu, Mingli Song
Advances in Neural Information Processing Systems (NeurIPS), 2023
[Paper] [arXiv] [Code]

Ask-AC: An Initiative Advisor-in-the-Loop Actor-Critic Framework
Shunyu Liu, Kaixuan Chen, Na Yu, Jie Song, Zunlei Feng, Mingli Song^✉
IEEE Transactions on Systems, Man and Cybernetics: Systems (TSMC), 2023
[Paper] [arXiv] [Code]

Improving Expressivity of GNNs with Subgraph-specific Factor Embedded Normalization
Kaixuan Chen^*, Shunyu Liu^*, Tongtian Zhu^*, Ji Qiao, Yun Su, Yingjie Tian, Tongya Zheng, Haofei Zhang, Zunlei Feng, Jingwen Ye^✉, Mingli Song
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2023
[Paper] [arXiv] [Code]

Contrastive Identity-Aware Learning for Multi-Agent Value Decomposition
Shunyu Liu^*, Yihe Zhou^*, Jie Song^✉, Tongya Zheng, Kaixuan Chen, Tongtian Zhu, Zunlei Feng, Mingli Song
AAAI Conference on Artificial Intelligence (AAAI), 2023, Oral
[Paper] [arXiv] [Code]

Distribution Knowledge Embedding for Graph Pooling
Kaixuan Chen, Jie Song, Shunyu Liu, Na Yu, Zunlei Feng, Gengshi Han, Mingli Song^✉
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2023
[Paper] [arXiv] [Code]

Honors

Awards

First Prize of the Science and Technology Progress Award, China Electric Power Research Institute, 2024
Outstanding Graduate, Zhejiang University, 2024
Chen Tianzhou Scholarship, Zhejiang University, 2023
Outstanding Graduate Student Award, Zhejiang University, 2023
Outstanding Graduate Student Award, Zhejiang University, 2021
Outstanding Ph.D. Student Scholarship, Zhejiang University, 2021
Outstanding Graduate Student Award, Zhejiang University, 2020
National Scholarship, Ministry of Education of the People's Republic of China, 2018
Outstanding Undergraduate Student Scholarship, Sun Yat-sen University, 2018
Outstanding Undergraduate Student Scholarship, Sun Yat-sen University, 2017
Outstanding Undergraduate Student Scholarship, Sun Yat-sen University, 2016

Competition

Second Prize in "China Innovation and Entrepreneurship Competition Innovation Challenge (Zhejiang)", 2024
Third Prize in "Power System Dispatching AI Application Competition", China Southern Power Grid, 2023
Third Prize in "Power System Dispatching AI Application Competition", China Southern Power Grid, 2022
Ranked 5th in "L2RPN Challenge - Energies of the Future and Carbon Neutrality", IEEE WCCI, 2022
Excellence Award in "Advanced Power System AI Application Competition", State Grid Corporation of China, 2022
Meritorious Winner in "Mathematical Contest in Modeling", COMAP, 2018

Academic Services

Journal Reviewer

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
IEEE Transactions on Knowledge and Data Engineering (TKDE)
IEEE Transactions on Systems, Man and Cybernetics: Systems (TSMC)
IEEE Transactions on Intelligent Transportation Systems (TITS)
Information Sciences (INS)
Autonomous Agents and Multi-Agent Systems (AAMAS)
International Journal of Electrical Power and Energy Systems (IJEPES)

Conference Reviewer

ICLR 2026, ICML 2026, CVPR 2026, ACL 2026, KDD 2026, AAAI 2026, ECCV 2026
NeurIPS 2025, ICML 2025, ICLR 2025, ICCV 2025, ACL 2025, KDD 2025, IJCAI 2025
NeurIPS 2024, KDD 2024
ECAI 2023

Last updated in Sep 2025. Webpage template borrowed from Prof. Sida Peng.