First Author Publications
- HetRL: Efficient Reinforcement Learning for LLMs in Heterogeneous Environments
Yongjun He*, Shuai Zhang*, Jiading Gai, Xiyuan Zhang, Boran Han, Bernie Wang, Huzefa Rangwala, George Karypis
MLSys 2026
- Resource Multiplexing in Tuning and Serving Large Language Models
Yongjun He, Haofeng Yang, Yao Lu, Ana Klimović, Gustavo Alonso
USENIX ATC 2025 [code] [talk at ATC 2025] [short version at EuroMLSys@EuroSys 24]
- MLKV: Efficiently Scaling up Large Embedding Model Training with Disk-based Key-Value Storage
Yongjun He, Roger Waleffe, Zhichao Han, Johnu George, Binhang Yuan, Zitao Zhang, Yinan Shan, Yang Zhao, Debojyoti Dutta, Theodoros Rekatsinas, Ce Zhang
ICDE 2025 Industry Track [code]
- Decentralized Training of Foundation Models in Heterogeneous Environments
Binhang Yuan*, Yongjun He*, Jared Quincy Davis, Tianyi Zhang, Tri Dao, Beidi Chen, Percy Liang, Christopher Ré, Ce Zhang
NeurIPS 2022 (Oral, 186/9600 = 1.9%) [code] [Together AI blog post]
- CoroBase: Coroutine-Oriented Main-Memory Database Engine
Yongjun He, Jiacheng Lu, Tianzheng Wang
VLDB 2021 [code] [talk at VLDB 2021]
(* denotes equal contribution.)
Co-Authored Publications
- Benchtemp: A General Benchmark for Evaluating Temporal Graph Neural Networks
Qiang Huang, Xin Wang, Susie Xi Rao, Zhichao Han, Zitao Zhang, Yongjun He, Quanqing Xu, Yang Zhao, Zhigao Zheng, Jiawei Jiang
ICDE 2024
- Auto-FP: An Experimental Study of Automated Feature Preprocessing for Tabular Data
Danrui Qi, Jinglin Peng, Yongjun He, Jiannan Wang
EDBT 2024 Experiments & Analysis Track
- Fine-tuning Language Models over Slow Networks using Activation Quantization with Guarantees
Jue Wang, Binhang Yuan, Luka Rimanic, Yongjun He, Tri Dao, Beidi Chen, Christopher Ré, Ce Zhang
NeurIPS 2022
- Persia: An Open, Hybrid System Scaling Deep Learning-based Recommenders up to 100 Trillion Parameters
Xiangru Lian, Binhang Yuan, Xuefeng Zhu, Yulong Wang, Yongjun He, Honghuan Wu, Lei Sun, Haodong Lyu, Chengjun Liu, Xing Dong, Yiqiao Liao, Mingnan Luo, Congfei Zhang, Jingru Xie, Haonan Li, Lei Chen, Renjie Huang, Jianying Lin, Chengchun Shu, Xuezhong Qiu, Zhishan Liu, Dongying Kong, Lei Yuan, Hai Yu, Sen Yang, Ce Zhang, Ji Liu
SIGKDD 2022 Applied Data Science Track
- Taurus Database: How to be Fast, Available, and Frugal in the Cloud
Alex Depoutovitch, Chong Chen, Jin Chen, Paul Larson, Shu Lin, Jack Ng, Wenlin Cui, Qiang Liu, Wei Huang, Yong Xiao, Yongjun He
SIGMOD 2020 Industry Track
- Deeper: A Data Enrichment System Powered by Deep Web
Pei Wang, Yongjun He, Ryan Shea, Jiannan Wang, Eugene Wu
SIGMOD 2018 Demo Track
