About Me

I have been an Assistant Professor (Presidential Young Fellow) at the School of Data Science (SDS), The Chinese University of Hong Kong, Shenzhen (CUHKSZ) since Aug 2024. We have many openings for PhD/MPhil students for Fall 2026 admission, and we are also looking for Research Assistants and visiting students/interns.

I work on both the theory and applications of AI/machine learning. On the practical side, I'm mostly interested in large language models (LLMs), including LLM-based agents, personalization of LLMs, LLM routing, LLM-based social simulation, prompt optimization, and RLHF/DPO. On the theoretical side, I'm mainly interested in the theoretical study of multi-armed bandits.

My research has been supported by multiple grants, including:

  • NSFC Young Scientist Fund (国家自然科学基金青年基金(C类))
  • Guangdong Provincial NSF Excellent Young Scientists Fund (广东省自然科学基金优秀青年项目)
  • Shenzhen NSF General Program (深圳市自然科学基金面上项目)
  • Collaboration project with Huawei on LLM Agents (华为大模型智能体合作项目)

Experience: From Jan 2024 to Jun 2024, I worked as a Postdoctoral Associate at the MIT Laboratory for Information and Decision Systems (LIDS), advised by Prof. Patrick Jaillet. From 2021 to 2023, I was a Postdoctoral Fellow at the National University of Singapore (NUS) with Prof. Bryan Kian Hsiang Low. I obtained my Ph.D. in Computer Science from NUS (2017-2021), co-advised by Prof. Bryan Kian Hsiang Low (NUS) and Prof. Patrick Jaillet (MIT). My Ph.D. study was supported by the Singapore-MIT Alliance for Research and Technology (SMART) Graduate Fellowship. In 2015, I obtained my undergraduate degree in Electrical Engineering from NUS with first class honors.

Recent Preprints

* denotes equal contribution, denotes corresponding author, 🎓 denotes my students.

  1. T-POP: Test-Time Personalization with Online Preference Feedback.

    Zikun Qu🎓, Min Zhang, Mingze Kong🎓, Xiang Li, Zhiwei Shang🎓, Zhiyong Wang, Yikun Ban, Shuang Qiu, Yao Shu, Zhongxiang Dai

    Preprint, 2025. [arXiv]

  2. FedPOB: Sample-Efficient Federated Prompt Optimization via Bandits.

    Pingchen Lu*🎓, Zhi Hong*🎓, Zhiwei Shang🎓, Zhiyong Wang, Yikun Ban, Yao Shu, Min Zhang, Shuang Qiu, Zhongxiang Dai

    Preprint, 2025. [arXiv]

  3. Meta-Prompt Optimization for LLM-Based Sequential Decision Making.

    Mingze Kong🎓, Zhiyong Wang, Yao Shu, Zhongxiang Dai

    ICLR 2025 Workshop on Reasoning and Planning for Large Language Models. [arXiv]

  4. Large Language Model-Enhanced Multi-Armed Bandits.

    Jiahang Sun*🎓, Zhiyong Wang*, Runhan Yang*🎓, Chenjun Xiao, John C.S. Lui, Zhongxiang Dai

    ICLR 2025 Workshop on Reasoning and Planning for Large Language Models. [arXiv]

  5. Active Human Feedback Collection via Neural Contextual Dueling Bandits.

    Arun Verma, Xiaoqiang Lin, Zhongxiang Dai, Daniela Rus, Bryan Kian Hsiang Low

    ICLR 2025 Workshop on Bidirectional Human-AI Alignment. [arXiv]

  6. ActiveDPO: Active Direct Preference Optimization for Sample-Efficient Alignment.

    Xiaoqiang Lin, Arun Verma, Zhongxiang Dai, Daniela Rus, See-Kiong Ng, Bryan Kian Hsiang Low

    Preprint, 2025. [arXiv]

  7. Prompt Optimization with Human Feedback.

    Xiaoqiang Lin, Zhongxiang Dai, Arun Verma, See-Kiong Ng, Patrick Jaillet and Kian Hsiang Low.

    ICML 2024 Workshop on Models of Human Feedback for AI Alignment. (Selected as Oral) [arXiv]

  8. Federated Zeroth-Order Optimization using Trajectory-Informed Surrogate Gradients.

    Yao Shu, Xiaoqiang Lin, Zhongxiang Dai and Kian Hsiang Low

    ICML 2024 Workshop on Differentiable Almost Everything. [arXiv]

Publications

* denotes equal contribution, denotes corresponding author, 🎓 denotes my students.

  1. Federated Linear Dueling Bandits.

    Xuhan Huang🎓, Yan Hu, Zhiyan Li🎓, Zhiyong Wang, Zhongxiang Dai

    AAAI 2026. [arXiv]

  2. Convergence Rates of Constrained Expected Improvement.

    Haowei Wang, Jingyi Wang, Zhongxiang Dai, Nai-Yuan Chiang, Szu Hui Ng, Cosmin G. Petra.

    NeurIPS 2025 (Spotlight). [arXiv]

  3. Adaptive Sample Scheduling for Direct Preference Optimization.

    Zixuan Huang, Yikun Ban, Lean Fu, Xiaojie Li, Zhongxiang Dai, Jianxin Li, Deqing Wang

    NeurIPS 2025. [arXiv]

  4. Source Attribution for Large Language Model-Generated Data.

    Jingtan Wang*, Xinyang Lu*, Zitong Zhao*, Zhongxiang Dai, Chuan-Sheng Foo, See-Kiong Ng and Kian Hsiang Low

    ACL Findings 2025 [arXiv]

  5. Online Clustering of Dueling Bandits.

    Zhiyong Wang, Jiahang Sun🎓, Mingze Kong🎓, Jize Xie, Qinghua Hu, John C.S. Lui, Zhongxiang Dai

    ICML 2025 [arXiv]

  6. Refining Adaptive Zeroth-Order Optimization at Ease.

    Yao Shu, Qixin Zhang, Kun He, Zhongxiang Dai

    ICML 2025 [arXiv]

  7. Adjusted Expected Improvement for Cumulative Regret Minimization in Noisy Bayesian Optimization.

    Shouri Hu, Haowei Wang, Zhongxiang Dai, Kian Hsiang Low and Szu Hui Ng.

    Journal of Machine Learning Research (JMLR), 2025 [Paper]

  8. Neural Dueling Bandits: Principled Preference-Based Optimization with Non-Linear Reward Function.

    Arun Verma*, Zhongxiang Dai*, Xiaoqiang Lin, Patrick Jaillet and Kian Hsiang Low.

    ICLR 2025 [arXiv]

  9. Prompt Optimization with EASE? Efficient Ordering-aware Automated Selection of Exemplars.

    Zhaoxuan Wu*, Xiaoqiang Lin*, Zhongxiang Dai, Wenyang Hu, Yao Shu, See-Kiong Ng, Patrick Jaillet and Kian Hsiang Low.

    NeurIPS 2024 [arXiv]

  10. Localized Zeroth-Order Prompt Optimization.

    Wenyang Hu*, Yao Shu*, Zongmin Yu, Zhaoxuan Wu, Xiaoqiang Lin, Zhongxiang Dai, See-Kiong Ng and Kian Hsiang Low.

    NeurIPS 2024 (Spotlight) [arXiv]

  11. Data-Centric AI in the Age of Large Language Models.

    Xinyi Xu, et al.

    EMNLP Findings 2024 [arXiv]

  12. Use Your INSTINCT: INSTruction optimization usIng Neural bandits Coupled with Transformers.

    Xiaoqiang Lin*, Zhaoxuan Wu*, Zhongxiang Dai, Wenyang Hu, Yao Shu, See-Kiong Ng, Patrick Jaillet and Kian Hsiang Low.

    ICML 2024 [arXiv][Code][Project]

  13. Robustifying and Boosting Training-Free Neural Architecture Search.

    Zhenfeng He, Yao Shu, Zhongxiang Dai, Bryan Kian Hsiang Low.

    ICLR 2024 [arXiv]

  14. Quantum Bayesian Optimization.

    Zhongxiang Dai*, Gregory Kang Ruey Lau*, Arun Verma, Yao Shu, Kian Hsiang Low and Patrick Jaillet.

    NeurIPS 2023 [Paper][Code]

  15. Batch Bayesian Optimization For Replicable Experimental Design.

    Zhongxiang Dai, Quoc Phong Nguyen, Sebastian Shenghong Tay, Daisuke Urano, Richalynn Leong, Kian Hsiang Low and Patrick Jaillet.

    NeurIPS 2023 [Paper]

  16. Exploiting Correlated Auxiliary Feedback in Parameterized Bandits.

    Arun Verma, Zhongxiang Dai, Yao Shu and Kian Hsiang Low.

    NeurIPS 2023 [Paper]

  17. Training-Free Neural Active Learning with Initialization-Robustness Guarantees.

    Apivich Hemachandra, Zhongxiang Dai, Jasraj Singh, See-Kiong Ng and Kian Hsiang Low.

    ICML 2023 [Paper]

  18. Federated Neural Bandits.

    Zhongxiang Dai, Yao Shu, Arun Verma, Flint Xiaofeng Fan, Kian Hsiang Low and Patrick Jaillet.

    ICLR 2023 [Paper]

  19. Zeroth-Order Optimization with Trajectory-Informed Derivative Estimation.

    Yao Shu*, Zhongxiang Dai*, Weicong Sng, Arun Verma, Patrick Jaillet and Kian Hsiang Low.

    ICLR 2023 [Paper]

  20. Recursive Reasoning-Based Training-Time Adversarial Machine Learning.

    Yizhou Chen, Zhongxiang Dai, Haibin Yu, Kian Hsiang Low and Teck-Hua Ho.

    Artificial Intelligence Journal, 2023 [Paper]

  21. Sample-Then-Optimize Batch Neural Thompson Sampling.

    Zhongxiang Dai, Yao Shu, Kian Hsiang Low and Patrick Jaillet.

    NeurIPS 2022 [arXiv][Code]

  22. Unifying and Boosting Gradient-Based Training-Free Neural Architecture Search.

    Yao Shu, Zhongxiang Dai, Zhaoxuan Wu and Kian Hsiang Low.

    NeurIPS 2022 [arXiv]

  23. Bayesian Optimization under Stochastic Delayed Feedback.

    Arun Verma*, Zhongxiang Dai* and Kian Hsiang Low.

    ICML 2022 [Paper]

  24. On Provably Robust Meta-Bayesian Optimization.

    Zhongxiang Dai, Yizhou Chen, Haibin Yu, Kian Hsiang Low and Patrick Jaillet.

    UAI 2022 [Paper]

  25. Neural Ensemble Search via Bayesian Sampling.

    Yao Shu, Yizhou Chen, Zhongxiang Dai and Kian Hsiang Low.

    UAI 2022 [Paper]

  26. NASI: Label- and Data-agnostic Neural Architecture Search at Initialization.

    Yao Shu, Shaofeng Cai, Zhongxiang Dai, Beng Chin Ooi and Kian Hsiang Low.

    ICLR 2022 [arXiv]

  27. Differentially Private Federated Bayesian Optimization with Distributed Exploration.

    Zhongxiang Dai, Kian Hsiang Low and Patrick Jaillet.

    NeurIPS 2021 [Paper][Code]

  28. Optimizing Conditional Value-At-Risk of Black-Box Functions.

    Quoc Phong Nguyen, Zhongxiang Dai, Kian Hsiang Low and Patrick Jaillet.

    NeurIPS 2021 [Paper][Code]

  29. Fault-Tolerant Federated Reinforcement Learning with Theoretical Guarantee.

    Xiaofeng Fan, Yining Ma, Zhongxiang Dai, Wei Jing, Cheston Tan and Kian Hsiang Low.

    NeurIPS 2021 [Paper][Code]

  30. Value-at-Risk Optimization with Gaussian Processes.

    Quoc Phong Nguyen, Zhongxiang Dai, Kian Hsiang Low and Patrick Jaillet.

    ICML 2021 [Paper][Code]

  31. Federated Bayesian Optimization via Thompson Sampling.

    Zhongxiang Dai, Kian Hsiang Low and Patrick Jaillet.

    NeurIPS 2020 [Paper][Code]

  32. R2-B2: Recursive Reasoning-Based Bayesian Optimization for No-Regret Learning in Games.

    Zhongxiang Dai, Yizhou Chen, Kian Hsiang Low, Patrick Jaillet and Teck-Hua Ho.

    ICML 2020 [Paper][Code][Video]

  33. Private Outsourced Bayesian Optimization.

    Dmitrii Kharkovskii, Zhongxiang Dai and Kian Hsiang Low.

    ICML 2020 [Paper][Code][Video]

  34. Bayesian Optimization Meets Bayesian Optimal Stopping.

    Zhongxiang Dai, Haibin Yu, Kian Hsiang Low, and Patrick Jaillet.

    ICML 2019 [Paper][Code]

  35. Bayesian Optimization with Binary Auxiliary Information.

    Yehong Zhang, Zhongxiang Dai, and Kian Hsiang Low.

    UAI 2019 (Plenary Talk) [Paper][Code]

  36. Implicit Posterior Variational Inference for Deep Gaussian Processes.

    Haibin Yu*, Yizhou Chen*, Zhongxiang Dai, Kian Hsiang Low, and Patrick Jaillet.

    NeurIPS 2019 (Spotlight) [Paper][Code]

Awards and Honors

  • National High-Level Young Talent (国家级高层次青年人才)
  • Presidential Young Fellow, The Chinese University of Hong Kong, Shenzhen, 2024
  • Dean's Graduate Research Excellence Award, NUS, School of Computing, 2021
  • Research Achievement Award, NUS, School of Computing, 2019 & 2020 (Twice)
  • Singapore-MIT Alliance for Research and Technology (SMART) Graduate Fellowship, 2017
  • ST Electronics Prize (Top student in cohort), NUS, 2012 & 2013 (Twice)
  • Dean’s List, NUS, 2011-2015 (Five times)

Professional Service

  • Area Chair: NeurIPS 2025, ICLR 2025 & 2026
  • Conference Reviewer: NeurIPS, ICML, ICLR, UAI, AISTATS, AAAI, CoRL, CVPR, ICCV, AAMAS, IROS, ICRA
  • Journal Reviewer: IEEE TPAMI, Operations Research, SIAM Journal on Optimization, Automatica, TMLR, Neural Networks, IEEE RA-L