Publications

Learning guarantee of reward modeling using deep neural networks

Published in KDD 2026, 2025

We derive architecture-dependent, non-asymptotic regret bounds for reward modeling with deep neural networks. The bounds can be sharpened under a simple margin condition, that is, when human preferences are more clear-cut than you might think.

Recommended citation: Luo, Y., Ge, Y., Han, R., & Shen, G. (2025). Learning Guarantee of Reward Modeling Using Deep Neural Networks. arXiv preprint arXiv:2505.06601.

Adaptive debiased lasso in high-dimensional GLMs with streaming data

Submitted, 2024

Real-time learning from streaming data! We propose Adaptive Debiased Lasso (ADL), a one-pass, time- and memory-efficient method for high-dimensional GLMs that updates coefficients and standard errors as each new batch of data arrives, using stochastic gradients with online debiasing.

Recommended citation: Han, R., Luo, L., Luo, Y., Lin, Y., & Huang, J. (2024). Adaptive debiased lasso in high-dimensional GLMs with streaming data. To submit.