Kaixuan Ji

Ph.D. Student in Computer Science, University of California, Los Angeles

Google Scholar | Email

Publications

2025

  1. Reinforcement Learning from Human Feedback with Active Queries
    Kaixuan Ji*, Jiafan He*, Quanquan Gu, TMLR 2025, Featured Certification

  2. Self-play Preference Optimization for Language Model Alignment
    Yue Wu*, Zhiqing Sun*, Huizhuo Yuan*, Kaixuan Ji, Yiming Yang, Quanquan Gu, ICLR 2025

2024

  1. Self-play Fine-tuning of Diffusion Models for Text-to-image Generation
    Huizhuo Yuan*, Zixiang Chen*, Kaixuan Ji*, Quanquan Gu, NeurIPS 2024

  2. Self-play Fine-tuning Converts Weak Language Models to Strong Language Models
    Zixiang Chen*, Yihe Deng*, Huizhuo Yuan*, Kaixuan Ji, Quanquan Gu, ICML 2024

  3. Horizon-free Reinforcement Learning in Adversarial Linear Mixture MDPs
    Kaixuan Ji*, Qingyue Zhao*, Jiafan He, Weitong Zhang, Quanquan Gu, ICLR 2024

2023 and Before

  1. Parameter-efficient Prompt Tuning Makes Generalized and Calibrated Neural Text Retrievers
    Weng Lam Tam*, Xiao Liu*, Kaixuan Ji, Lilong Xue, Xingjian Zhang, Yuxiao Dong, Jiahua Liu, Maodi Hu, Jie Tang, Findings of EMNLP 2023

  2. P-Tuning: Prompt Tuning Can Be Comparable to Fine-tuning Across Scales and Tasks
    Xiao Liu*, Kaixuan Ji*, Yicheng Fu*, Weng Tam, Zhengxiao Du, Zhilin Yang, Jie Tang, ACL 2022

Preprints

  1. Mastering the Task of Open Information Extraction with Large Language Models and Consistent Reasoning Environment
    Ji Qi*, Kaixuan Ji*, Xiaozhi Wang, Jifan Yu, Kaisheng Zeng, Lei Hou, Juanzi Li, Bin Xu, arXiv:2310.10590

  2. Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization
    Kaixuan Ji*, Guanlin Liu*, Ning Dai, Qingping Yang, Renjie Zheng, Zheng Wu, Chen Dun, Quanquan Gu, Lin Yan, arXiv:2410.09302

  3. Towards a Sharp Analysis of Offline Policy Learning for f-Divergence-Regularized Contextual Bandits
    Qingyue Zhao*, Kaixuan Ji*, Heyang Zhao*, Tong Zhang, Quanquan Gu, arXiv:2502.06051