
An encyclopedia of GLM algorithms

Linear and generalized linear models are still load-bearing statistical tools. Their algorithms are scattered across papers, packages, and systems. This project puts them in one coordinate system.

The one question every card answers

Given a design matrix \(X\), a response vector \(y\), and a model specification (family, link \(g\), loss, and tuning parameters where relevant), what exactly does the algorithm return as \(\widehat\beta\)?

Anyone can parrot \(\widehat\beta = (X^\top X)^{-1} X^\top y\), but things get much trickier once the situation becomes irregular, for example when \(X^\top X\) is singular. In statistics, each method can essentially be viewed as a mapping \((X, y, \text{model}) \mapsto \widehat\beta\), and it is often written against the shared template

\[ \widehat\beta\in\arg\min_\beta \mathcal L(\beta; X, y, g)+\lambda P(\beta), \]

or as some other procedure or update rule. Here, estimators defined across decades of literature can be read, compared, and, in later phases, benchmarked on equal footing.
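As a worked instance of the template, take the Gaussian family with identity link and squared-error loss. The scaling used here (a factor \(\tfrac12\) on both loss and penalty) is an assumption chosen to give a clean closed form; individual cards may follow a different convention, such as \(\tfrac{1}{2n}\) on the loss.

\[ \mathcal L(\beta; X, y, g) = \tfrac12 \lVert y - X\beta \rVert_2^2, \qquad P(\beta) = \tfrac12 \lVert \beta \rVert_2^2 \quad\Longrightarrow\quad \widehat\beta = (X^\top X + \lambda I)^{-1} X^\top y . \]

Setting \(\lambda = 0\) recovers the OLS formula whenever \(X^\top X\) is invertible. Swapping in \(P(\beta) = \lVert \beta \rVert_1\) gives the Lasso, which has no closed form in general and is instead characterized through the \(\arg\min\) or an update rule.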

Explore

  •   Notation & framework


    The shared mathematical language: exponential family, link functions, loss, score and Fisher information, penalties, and the optimization primitives every card builds on.

    Read the framework

  •   Algorithm card


    The template that keeps the catalogue honest: metadata, source-backed mathematics, and a rule against undocumented reinterpretations.

    View the algorithm card format

  •   Algorithm catalogue


    The growing collection: OLS, Ridge, Lasso, Elastic Net, SCAD, MCP, LARS, FISTA, debiased Lasso, SGD, and many more.

    Open the catalogue

  •   The Arena


    Every algorithm is run across all datasets, each with multiple hyperparameter configurations. Interactive t-SNE / UMAP embeddings and coefficient heatmaps reveal how methods cluster and differ.

    Explore the Arena

Why this project exists

Linear models are old in the way linear algebra is old: mature, not obsolete. They sit underneath econometric studies, experimental analysis, clinical and credit-risk scores, fraud systems, demand and traffic models, industrial monitoring, and scientific baselines. Sometimes they are the final model. Often they are the baseline. Almost always they are the thing you want to understand before trusting something larger.

The algorithms around them are less tidy. The same statistical object can appear under different notation; the same penalty can be fit by several solvers; likelihood, regularization, inference, and streaming updates often live in different literatures.

This project puts the solver zoo in one coordinate system. Each algorithm card states the estimator, assumptions, objective or update rule, hyperparameters, and output contract using shared notation and source-faithful mathematics.
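One way to picture what a card records is as a small typed record. The sketch below is only an illustration: the class name `AlgorithmCard`, its field names, and the `missing_fields` helper are hypothetical and are not the project's actual card format. The fields simply mirror the items listed above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class AlgorithmCard:
    """Hypothetical record mirroring the items a card is said to state."""
    name: str
    estimator: str            # LaTeX for the estimator definition
    assumptions: List[str]
    objective_or_update: str  # LaTeX for the objective or the update rule
    hyperparameters: Dict[str, str] = field(default_factory=dict)
    output_contract: str = r"(X, y, g) \mapsto \widehat\beta"
    sources: List[str] = field(default_factory=list)  # citations; empty in this sketch

    def missing_fields(self) -> List[str]:
        """Return the names of required text fields that are still blank."""
        required = ("name", "estimator", "objective_or_update", "output_contract")
        return [f for f in required if not getattr(self, f).strip()]


# Illustrative entry only; the text is not copied from the catalogue.
ridge_card = AlgorithmCard(
    name="Ridge",
    estimator=r"\widehat\beta \in \arg\min_\beta \mathcal L(\beta; X, y, g) + \lambda P(\beta)",
    assumptions=["illustrative entry, not taken from a real card"],
    objective_or_update=r"P(\beta) = \tfrac12 \lVert \beta \rVert_2^2",
    hyperparameters={"lambda": "penalty strength, lambda >= 0"},
)
```

A check such as `ridge_card.missing_fields()` is one possible way to enforce that no card ships with a blank estimator or objective.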

Project construction

  • I · Documentation   current


    Faithful algorithm cards — exact mathematics only. Each defines the estimator, the algorithm, its hyperparameters, and the precise \((X,y,g)\mapsto\widehat\beta\) contract.

  • II · Implementation & Arena   current


    Python solvers behind one interface, fit(X, y, link, **config) -> beta_hat, mirroring each method's official construction. Run across 16 datasets with 50+ configs per solver. Explore the Arena →

  • III · Deep analysis   planned


    Structured comparison of solver families: convergence rates, sensitivity to regularisation, agreement on held-out data, and statistical inference properties.
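The phase II interface, `fit(X, y, link, **config) -> beta_hat`, can be sketched in a few lines. This is a minimal, dependency-free illustration and not the project's implementation: the `SOLVERS` registry, the `solver` and `lam` config keys, and the helper functions are assumptions made for the example. It covers only the identity link with a ridge penalty scaled as \(\tfrac12\lambda\lVert\beta\rVert_2^2\).

```python
from typing import Callable, Dict, List

Matrix = List[List[float]]
Vector = List[float]


def _solve(A: Matrix, b: Vector) -> Vector:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [A | b], copied so the inputs are not modified.
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: the estimator is not unique")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def _ridge(X: Matrix, y: Vector, lam: float = 0.0) -> Vector:
    """Minimise 0.5*||y - X b||^2 + 0.5*lam*||b||^2; lam = 0 gives OLS.

    Note: every coefficient is penalised, including any intercept column.
    """
    p = len(X[0])
    gram = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
             for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return _solve(gram, xty)


# Hypothetical registry: solver name -> callable returning beta_hat.
SOLVERS: Dict[str, Callable[..., Vector]] = {"ridge": _ridge}


def fit(X: Matrix, y: Vector, link: str, **config) -> Vector:
    """Dispatch to a solver; the config keys here are assumptions."""
    if link != "identity":
        raise NotImplementedError("this sketch covers the identity link only")
    solver = SOLVERS[config.pop("solver", "ridge")]
    return solver(X, y, **config)


# Tiny example: y = 1 + 2*x exactly; the first column is the intercept.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [1.0, 3.0, 5.0]
beta_ols = fit(X, y, "identity", solver="ridge", lam=0.0)
beta_ridge = fit(X, y, "identity", solver="ridge", lam=1.0)
```

With `lam=0.0` the sketch recovers the exact line, and with `lam=1.0` the slope shrinks toward zero. Keeping every solver behind the same signature is what would let different methods be run and compared on the same inputs.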

Contributing an algorithm card

If a source defines an algorithm precisely, it can become an algorithm card. See algorithm card for the template. We welcome contributions of new algorithm cards, datasets, or hyperparameter tuning suggestions. Feel free to reach out at chattelion.luo@connect.polyu.hk.