Page Not Found
Page not found. Your pixels are in another canvas.
A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.
About me
This is a page not in the main menu.
Published:
In this episode, we want to investigate a simple working neural network (NN) anatomically, without the use of any deep learning package. You might find it very interesting to design every single element essential to forward and backward propagation. Every component can be put together like LEGO bricks!
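As a taste of what assembling those pieces looks like, here is a minimal sketch of a one-hidden-layer network trained with hand-written forward and backward passes using only NumPy. The variable names and the toy XOR task are illustrative, not the post's actual code:

```python
import numpy as np

# One hidden-layer network: forward and backward passes by hand.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    h = np.tanh(x @ W1 + b1)          # hidden activations
    y = sigmoid(h @ W2 + b2)          # output probability
    return h, y

def backward(x, h, y, t, lr=0.1):
    # Gradients of binary cross-entropy, chained layer by layer.
    dy = y - t                        # dL/d(output pre-activation)
    dW2 = h.T @ dy
    db2 = dy.sum(axis=0)
    dh = (dy @ W2.T) * (1.0 - h**2)   # tanh derivative
    dW1 = x.T @ dh
    db1 = dh.sum(axis=0)
    for p, g in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        p -= lr * g                   # plain gradient-descent update

# Tiny XOR-style demo
x = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
t = np.array([[0.], [1.], [1.], [0.]])
for _ in range(2000):
    h, y = forward(x)
    backward(x, h, y, t)
```

Each backward step simply chains derivatives from the output back to the weights, which is exactly the anatomy the post dissects.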
Published:
To have $\LaTeX$ equations on this website, which is run on Github Pages using Jekyll and AcademicPages (a fork of Minimal Mistakes), I had to make a few guesses, so here are the secrets.
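Once the setup works, the payoff is that ordinary MathJax-style delimiters render inside posts; for instance (a generic illustration, not an excerpt from the post):

```latex
% Inline math in a post: $e^{i\pi} + 1 = 0$
% Display math on its own lines:
$$
\hat{\theta} = \arg\min_{\theta} \frac{1}{n} \sum_{i=1}^{n} \ell\bigl(y_i, f_\theta(x_i)\bigr)
$$
```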
Concert, Jockey Club Auditorium, the Hong Kong Polytechnic University, 2023
Featured Music:
Short description of portfolio item number 1
Short description of portfolio item number 2
Submitted, 2024
Online statistical inference refers to an inferential method that updates model parameters as data arrive sequentially. This approach has broad applications in network security, quantitative finance, and recommendation systems. In contrast to offline techniques, where the entire dataset is used for training, online learning algorithms are expected to offer computational efficiency while still delivering statistical results comparable to their offline counterparts.
Recommended citation: Han, R., Luo, L., Luo, Y., Lin, Y., & Huang, J. (2024). Adaptive debiased SGD in high-dimensional GLMs with streaming data. Submitted.
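To illustrate the online-versus-offline contrast described in the abstract, here is a generic one-pass SGD update for a logistic GLM on streaming data. This is a hedged sketch of the general idea, not the adaptive debiased procedure proposed in the paper:

```python
import numpy as np

def online_sgd_logistic(stream, dim, lr=0.01):
    """One-pass SGD for logistic regression on a data stream.

    A generic illustration of online updating; the paper's
    adaptive debiasing for high-dimensional GLMs is not shown.
    """
    theta = np.zeros(dim)
    for step, (x, y) in enumerate(stream, start=1):
        p = 1.0 / (1.0 + np.exp(-x @ theta))  # predicted probability
        grad = (p - y) * x                    # per-observation gradient
        theta -= (lr / np.sqrt(step)) * grad  # decaying step size
    return theta
```

Each update costs O(dim) and touches every observation exactly once, so the estimate stays current as data arrive without refitting on the full history.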
Submitted, 2025
In this work, we study the learning theory of reward modeling with pairwise comparison data using deep neural networks. We establish a novel non-asymptotic regret bound for deep reward estimators in a non-parametric setting, which depends explicitly on the network architecture. Furthermore, to underscore the critical importance of clear human beliefs, we introduce a margin-type condition that assumes the conditional winning probability of the optimal action in pairwise comparisons is significantly distanced from 1/2. This condition enables a sharper regret bound, which substantiates the empirical efficiency of Reinforcement Learning from Human Feedback and highlights the role of clear human beliefs in its success. Notably, this improvement stems from high-quality pairwise comparison data implied by the margin-type condition, is independent of the specific estimators used, and thus applies to various learning algorithms and models.
Recommended citation: Luo, Y., Ge, Y., Han, R. & Shen, G. (2025). Learning Guarantee of Reward Modeling Using Deep Neural Networks. arXiv preprint arXiv:2505.06601.
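For concreteness, the pairwise-comparison setup can be written down with a standard Bradley-Terry-style logistic loss. The following is a generic sketch (the function name is illustrative), not the estimators or margin analysis from the paper:

```python
import numpy as np

def pairwise_logistic_loss(r_winner, r_loser):
    """Negative log-likelihood that the preferred response wins.

    r_winner, r_loser: reward scores assigned by the model to the
    two compared responses. The winning probability is
    sigmoid(r_winner - r_loser); a margin-type condition asks that
    this probability stay well away from 1/2.
    """
    margin = r_winner - r_loser
    return np.log1p(np.exp(-margin))  # -log(sigmoid(margin))
```

When comparisons are clear-cut, the margin is large and the loss is nearly flat, which loosely reflects the abstract's point that high-quality comparison data drive the sharper regret bound.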
Published:
This is a description of your talk, which is a markdown file that can be all markdown-ified like any other post. Yay markdown!
Published:
This is a description of your conference proceedings talk; note the different value in the type field. You can put anything in this field.
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.