About Me

I have been an Assistant Professor and Presidential Young Fellow at the School of Data Science (SDS), The Chinese University of Hong Kong, Shenzhen (CUHKSZ) since Aug 2024. I am looking for PhD students, RAs, and visiting students/interns. Please feel free to reach out if you're interested in working with me.

I work on both the theory and practice of AI/machine learning. On the practical side, I'm mostly interested in large language models (LLMs), including LLM-based agents, personalization of LLMs, LLM routing, prompt optimization, and RLHF/DPO, all of which can be studied from the perspective of multi-armed bandits (MAB) and Bayesian optimization (BO). On the theoretical side, I'm mainly interested in the analysis of MAB and BO algorithms.
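
To make the bandit perspective concrete, here is a minimal sketch (not taken from any specific paper of mine) of LLM routing framed as a multi-armed bandit: each candidate model is an arm, user feedback is the reward, and a classical UCB rule decides which model answers each query. The model names and quality scores below are purely illustrative.

```python
import math
import random

# Minimal sketch: LLM routing as a multi-armed bandit.
# Each "arm" is a candidate model; rewards are hypothetical quality
# scores in [0, 1] (e.g., thumbs-up feedback or an evaluator score).

MODELS = ["model-a", "model-b", "model-c"]  # hypothetical model names

def ucb_route(counts, sums, t, c=1.0):
    """Pick the model with the highest UCB index at round t."""
    for i, n in enumerate(counts):
        if n == 0:
            return i  # try every model at least once
    return max(
        range(len(counts)),
        key=lambda i: sums[i] / counts[i] + c * math.sqrt(math.log(t) / counts[i]),
    )

counts = [0] * len(MODELS)
sums = [0.0] * len(MODELS)
true_quality = [0.6, 0.8, 0.5]  # unknown to the router; for simulation only

for t in range(1, 501):
    i = ucb_route(counts, sums, t)
    reward = 1.0 if random.random() < true_quality[i] else 0.0  # simulated feedback
    counts[i] += 1
    sums[i] += reward

print({m: c for m, c in zip(MODELS, counts)})  # most queries go to the best model
```

Prompt optimization and RLHF-style preference learning admit analogous formulations, with prompts or response pairs playing the role of arms.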

From Jan 2024 to Jun 2024, I worked as a Postdoctoral Associate at the MIT Laboratory for Information and Decision Systems (LIDS), advised by Prof. Patrick Jaillet. From 2021 to 2023, I was a Postdoctoral Fellow at the National University of Singapore (NUS) with Prof. Bryan Kian Hsiang Low. I obtained my Ph.D. in Computer Science from NUS (2017-2021), co-advised by Prof. Bryan Kian Hsiang Low (NUS) and Prof. Patrick Jaillet (MIT). My Ph.D. study was supported by the Singapore-MIT Alliance for Research and Technology (SMART) Graduate Fellowship. In 2015, I obtained my undergraduate degree in Electrical Engineering from NUS with first-class honors.

News

  • Sep 2025 Two papers accepted to NeurIPS 2025!
  • Aug 2025 Invited to serve as an Area Chair for ICLR 2026!
  • May 2025 Our paper on LLM source attribution & watermarking accepted to ACL 2025 Findings!
  • May 2025 Two papers accepted to ICML 2025!
  • Apr 2025 Invited to serve as an Area Chair for NeurIPS 2025!
  • Apr 2025 Our paper on Bayesian optimization accepted to Journal of Machine Learning Research (JMLR)!

Selected Workshop Papers & Preprints

* denotes equal contribution, 🎓 denotes my students.

  1. T-POP: Test-Time Personalization with Online Preference Feedback.

    Zikun Qu🎓, Min Zhang, Mingze Kong🎓, Xiang Li, Zhiwei Shang🎓, Zhiyong Wang, Yikun Ban, Shuang Qiu, Yao Shu, Zhongxiang Dai

    Preprint, 2025. [arXiv]

  2. FedPOB: Sample-Efficient Federated Prompt Optimization via Bandits.

    Pingchen Lu*🎓, Zhi Hong*🎓, Zhiwei Shang🎓, Zhiyong Wang, Yikun Ban, Yao Shu, Min Zhang, Shuang Qiu, Zhongxiang Dai

    Preprint, 2025. [arXiv]

  3. Meta-Prompt Optimization for LLM-Based Sequential Decision Making.

    Mingze Kong🎓, Zhiyong Wang, Yao Shu, Zhongxiang Dai

    ICLR 2025 Workshop on Reasoning and Planning for Large Language Models. [arXiv]

  4. Large Language Model-Enhanced Multi-Armed Bandits.

    Jiahang Sun*🎓, Zhiyong Wang*, Runhan Yang*🎓, Chenjun Xiao, John C.S. Lui, Zhongxiang Dai

    ICLR 2025 Workshop on Reasoning and Planning for Large Language Models. [arXiv]

  5. Federated Linear Dueling Bandits.

    Xuhan Huang🎓, Yan Hu, Zhiyan Li🎓, Zhiyong Wang, Benyou Wang, Zhongxiang Dai

    Preprint, 2025. [arXiv]

  6. Active Human Feedback Collection via Neural Contextual Dueling Bandits.

    Arun Verma, Xiaoqiang Lin, Zhongxiang Dai, Daniela Rus, Bryan Kian Hsiang Low

    ICLR 2025 Workshop on Bidirectional Human-AI Alignment. [arXiv]

  7. ActiveDPO: Active Direct Preference Optimization for Sample-Efficient Alignment.

    Xiaoqiang Lin, Arun Verma, Zhongxiang Dai, Daniela Rus, See-Kiong Ng, Bryan Kian Hsiang Low

    Preprint, 2025. [arXiv]

  8. Prompt Optimization with Human Feedback.

    Xiaoqiang Lin, Zhongxiang Dai, Arun Verma, See-Kiong Ng, Patrick Jaillet and Kian Hsiang Low.

    ICML 2024 Workshop on Models of Human Feedback for AI Alignment. (Selected as Oral) [arXiv]

  9. Federated Zeroth-Order Optimization using Trajectory-Informed Surrogate Gradients.

    Yao Shu, Xiaoqiang Lin, Zhongxiang Dai and Kian Hsiang Low

    ICML 2024 Workshop on Differentiable Almost Everything. [arXiv]

Publications

* denotes equal contribution, 🎓 denotes my students.

  1. Convergence Rates of Constrained Expected Improvement.

    Haowei Wang, Jingyi Wang, Zhongxiang Dai, Nai-Yuan Chiang, Szu Hui Ng, Cosmin G. Petra.

    NeurIPS 2025 (Spotlight). [arXiv]

  2. Adaptive Sample Scheduling for Direct Preference Optimization.

    Zixuan Huang, Yikun Ban, Lean Fu, Xiaojie Li, Zhongxiang Dai, Jianxin Li, Deqing Wang

    NeurIPS 2025. [arXiv]

  3. Source Attribution for Large Language Model-Generated Data.

    Jingtan Wang*, Xinyang Lu*, Zitong Zhao*, Zhongxiang Dai, Chuan-Sheng Foo, See-Kiong Ng and Kian Hsiang Low

    ACL 2025 Findings [arXiv]

  4. Online Clustering of Dueling Bandits.

    Zhiyong Wang, Jiahang Sun🎓, Mingze Kong🎓, Jize Xie, Qinghua Hu, John C.S. Lui, Zhongxiang Dai

    ICML 2025 [arXiv]

  5. Refining Adaptive Zeroth-Order Optimization at Ease.

    Yao Shu, Qixin Zhang, Kun He, Zhongxiang Dai

    ICML 2025 [arXiv]

  6. Adjusted Expected Improvement for Cumulative Regret Minimization in Noisy Bayesian Optimization.

    Shouri Hu, Haowei Wang, Zhongxiang Dai, Kian Hsiang Low and Szu Hui Ng.

    Journal of Machine Learning Research (JMLR), 2025 [Paper]

  7. Neural Dueling Bandits: Principled Preference-Based Optimization with Non-Linear Reward Function.

    Arun Verma*, Zhongxiang Dai*, Xiaoqiang Lin, Patrick Jaillet and Kian Hsiang Low.

    ICLR 2025 [arXiv]

  8. Prompt Optimization with EASE? Efficient Ordering-aware Automated Selection of Exemplars.

    Zhaoxuan Wu*, Xiaoqiang Lin*, Zhongxiang Dai, Wenyang Hu, Yao Shu, See-Kiong Ng, Patrick Jaillet and Kian Hsiang Low.

    NeurIPS 2024 [arXiv]

  9. Localized Zeroth-Order Prompt Optimization.

    Wenyang Hu*, Yao Shu*, Zongmin Yu, Zhaoxuan Wu, Xiaoqiang Lin, Zhongxiang Dai, See-Kiong Ng and Kian Hsiang Low.

    NeurIPS 2024 (Spotlight) [arXiv]

  10. Data-Centric AI in the Age of Large Language Models.

    Xinyi Xu, et al.

    EMNLP Findings 2024 [arXiv]

  11. Use Your INSTINCT: INSTruction optimization usIng Neural bandits Coupled with Transformers.

    Xiaoqiang Lin*, Zhaoxuan Wu*, Zhongxiang Dai, Wenyang Hu, Yao Shu, See-Kiong Ng, Patrick Jaillet and Kian Hsiang Low.

    ICML 2024 [arXiv][Code][Project]

  12. Robustifying and Boosting Training-Free Neural Architecture Search.

    Zhenfeng He, Yao Shu, Zhongxiang Dai, Bryan Kian Hsiang Low.

    ICLR 2024 [arXiv]

  13. Quantum Bayesian Optimization.

    Zhongxiang Dai*, Gregory Kang Ruey Lau*, Arun Verma, Yao Shu, Kian Hsiang Low and Patrick Jaillet.

    NeurIPS 2023 [Paper][Code]

  14. Batch Bayesian Optimization For Replicable Experimental Design.

    Zhongxiang Dai, Quoc Phong Nguyen, Sebastian Shenghong Tay, Daisuke Urano, Richalynn Leong, Kian Hsiang Low and Patrick Jaillet.

    NeurIPS 2023 [Paper]

  15. Exploiting Correlated Auxiliary Feedback in Parameterized Bandits.

    Arun Verma, Zhongxiang Dai, Yao Shu and Kian Hsiang Low.

    NeurIPS 2023 [Paper]

  16. Training-Free Neural Active Learning with Initialization-Robustness Guarantees.

    Apivich Hemachandra, Zhongxiang Dai, Jasraj Singh, See-Kiong Ng and Kian Hsiang Low.

    ICML 2023 [Paper]

  17. Federated Neural Bandits.

    Zhongxiang Dai, Yao Shu, Arun Verma, Flint Xiaofeng Fan, Kian Hsiang Low and Patrick Jaillet.

    ICLR 2023 [Paper]

  18. Zeroth-Order Optimization with Trajectory-Informed Derivative Estimation.

    Yao Shu*, Zhongxiang Dai*, Weicong Sng, Arun Verma, Patrick Jaillet and Kian Hsiang Low.

    ICLR 2023 [Paper]

  19. Recursive Reasoning-Based Training-Time Adversarial Machine Learning.

    Yizhou Chen, Zhongxiang Dai, Haibin Yu, Kian Hsiang Low and Teck-Hua Ho.

    Artificial Intelligence Journal, 2023 [Paper]

  20. Sample-Then-Optimize Batch Neural Thompson Sampling.

    Zhongxiang Dai, Yao Shu, Kian Hsiang Low and Patrick Jaillet.

    NeurIPS 2022 [arXiv][Code]

  21. Unifying and Boosting Gradient-Based Training-Free Neural Architecture Search.

    Yao Shu, Zhongxiang Dai, Zhaoxuan Wu and Kian Hsiang Low.

    NeurIPS 2022 [arXiv]

  22. Bayesian Optimization under Stochastic Delayed Feedback.

    Arun Verma*, Zhongxiang Dai* and Kian Hsiang Low.

    ICML 2022 [Paper]

  23. On Provably Robust Meta-Bayesian Optimization.

    Zhongxiang Dai, Yizhou Chen, Haibin Yu, Kian Hsiang Low and Patrick Jaillet.

    UAI 2022 [Paper]

  24. Neural Ensemble Search via Bayesian Sampling.

    Yao Shu, Yizhou Chen, Zhongxiang Dai and Kian Hsiang Low.

    UAI 2022 [Paper]

  25. NASI: Label- and Data-agnostic Neural Architecture Search at Initialization.

    Yao Shu, Shaofeng Cai, Zhongxiang Dai, Beng Chin Ooi and Kian Hsiang Low.

    ICLR 2022 [arXiv]

  26. Differentially Private Federated Bayesian Optimization with Distributed Exploration.

    Zhongxiang Dai, Kian Hsiang Low and Patrick Jaillet.

    NeurIPS 2021 [Paper][Code]

  27. Optimizing Conditional Value-At-Risk of Black-Box Functions.

    Quoc Phong Nguyen, Zhongxiang Dai, Kian Hsiang Low and Patrick Jaillet.

    NeurIPS 2021 [Paper][Code]

  28. Fault-Tolerant Federated Reinforcement Learning with Theoretical Guarantee.

    Xiaofeng Fan, Yining Ma, Zhongxiang Dai, Wei Jing, Cheston Tan and Kian Hsiang Low.

    NeurIPS 2021 [Paper][Code]

  29. Value-at-Risk Optimization with Gaussian Processes.

    Quoc Phong Nguyen, Zhongxiang Dai, Kian Hsiang Low and Patrick Jaillet.

    ICML 2021 [Paper][Code]

  30. Federated Bayesian Optimization via Thompson Sampling.

    Zhongxiang Dai, Kian Hsiang Low and Patrick Jaillet.

    NeurIPS 2020 [Paper][Code]

  31. R2-B2: Recursive Reasoning-Based Bayesian Optimization for No-Regret Learning in Games.

    Zhongxiang Dai, Yizhou Chen, Kian Hsiang Low, Patrick Jaillet and Teck-Hua Ho.

    ICML 2020 [Paper][Code][Video]

  32. Private Outsourced Bayesian Optimization.

    Dmitrii Kharkovskii, Zhongxiang Dai and Kian Hsiang Low.

    ICML 2020 [Paper][Code][Video]

  33. Bayesian Optimization Meets Bayesian Optimal Stopping.

    Zhongxiang Dai, Haibin Yu, Kian Hsiang Low, and Patrick Jaillet.

    ICML 2019 [Paper][Code]

  34. Bayesian Optimization with Binary Auxiliary Information.

    Yehong Zhang, Zhongxiang Dai, and Kian Hsiang Low.

    UAI 2019 (Plenary Talk) [Paper][Code]

  35. Implicit Posterior Variational Inference for Deep Gaussian Processes.

    Haibin Yu*, Yizhou Chen*, Zhongxiang Dai, Kian Hsiang Low, and Patrick Jaillet.

    NeurIPS 2019 (Spotlight) [Paper][Code]

Awards and Honors

  • Presidential Young Fellow, The Chinese University of Hong Kong, Shenzhen, 2024
  • Dean's Graduate Research Excellence Award, NUS, School of Computing, 2021
  • Research Achievement Award, NUS, School of Computing, 2019 & 2020 (Twice)
  • Singapore-MIT Alliance for Research and Technology (SMART) Graduate Fellowship, 2017
  • ST Electronics Prize (Top student in cohort), NUS, 2012 & 2013 (Twice)
  • Dean’s List, NUS, 2011-2015 (Five times)

Professional Service

  • Area Chair: NeurIPS 2025, ICLR 2025 & 2026
  • Conference Reviewer: NeurIPS, ICML, ICLR, UAI, AISTATS, AAAI, CoRL, CVPR, ICCV, AAMAS, IROS, ICRA.
  • Journal Reviewer: IEEE TPAMI, Operations Research, SIAM Journal on Optimization, Automatica, TMLR, Neural Networks, IEEE RA-L.